Parsing the Command Line the Easy Way: Introducing plac, the Easiest Argument Parser in the Python World
==========================================================================================================

:Author: Michele Simionato
:E-mail: michele.simionato@gmail.com
:Requires: Python 2.3+
:Download page: http://pypi.python.org/pypi/plac
:Project page: http://micheles.googlecode.com/hg/plac/doc/plac.html
:Installation: ``easy_install plac``
:License: BSD license

.. contents::

The importance of scaling down
------------------------------------------------

There is no shortage of command line argument parsers in the Python
world. The standard library alone contains three different modules:
getopt_ (from the stone age),
optparse_ (from Python 2.3) and argparse_ (from Python 2.7).  All of
them are quite powerful, and argparse_ in particular is an
industrial-strength solution; unfortunately, all of them have a non-zero
learning curve and a certain verbosity. They do not scale down well enough,
at least in my opinion.

It should not be necessary to stress the importance of `scaling down`_;
nevertheless most people are obsessed with features and concerned with
the possibility of scaling up, whereas I think that we should be even
more concerned with the issue of scaling down. This is an old meme in
the computing world: programs should address the common cases simply;
simple things should be kept simple, while at the same time keeping
difficult things possible. plac_ adheres as much as possible to this
philosophy: it is designed to handle the simple cases well, while
retaining the ability to handle complex cases by relying on the
underlying power of argparse_.

Technically plac_ is just a simple wrapper over argparse_ which hides
most of its complexity behind a declarative interface: the argument
parser is inferred rather than written down imperatively.  Still, plac_ is
surprisingly scalable upwards, even without using the underlying
argparse_. I have been using Python for 8 years and in my experience
it is extremely unlikely that you will ever need to go beyond the
features provided by the declarative interface of plac_: they should
be more than enough for 99.9% of the use cases.

plac_ targets especially unsophisticated users,
programmers, sys-admins, scientists and in general people writing
throw-away scripts for themselves, who choose the command line
interface because it is quick and simple. Such users are not
interested in features, they are interested in a small learning curve:
they just want to be able to write a simple command line tool from a
simple specification, not to build a command line parser by
hand. Unfortunately, the modules in the standard library force them
to go the hard way. They are designed to implement power user tools
and they have a non-trivial learning curve. By contrast, plac_
is designed to be simple to use and extremely concise, as the examples
below will show.

Scripts with required arguments
---------------------------------------------

Let me start with the simplest possible thing: a script that takes a
single argument and does something to it.  It cannot get simpler
than that, unless you consider a script without command line
arguments, where there is nothing to parse. Still, it is an
*extremely common* use case: I need to write scripts like that nearly
every day, I have written hundreds of them in the last few years and I have
never been happy. Here is a typical example of code I have been
writing by hand for years:

.. include:: example1.py
   :literal:

As you see the whole ``if __name__ == '__main__'`` block (nine lines)
is essentially boilerplate that should not exist.  Actually I think
the language should recognize the main function and pass to it the
command line arguments automatically; unfortunately this is unlikely to
happen. I have been writing boilerplate like this in hundreds of
scripts for years, and every time I *hate* it. The purpose of using a
scripting language is convenience and trivial things should be
trivial. Unfortunately the standard library does not help for this
incredibly common use case. Using getopt_ and optparse_ does not help,
since they are intended to manage options and not positional
arguments; the argparse_ module helps a bit and it is able to reduce
the boilerplate from nine lines to six lines:

.. include:: example2.py
   :literal:
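
As a rough illustration of the argparse_ approach, here is a sketch in
Python 3 syntax. It is not the contents of ``example2.py``: the ``run``
helper taking an explicit argument list is an assumption, introduced
only so that the code can be tried from an interactive session::

  import argparse

  def main(dsn):
      "Do something with the database"
      return 'connecting to %s' % dsn

  def run(argv):
      # the parser has to be instantiated and populated by hand
      parser = argparse.ArgumentParser()
      parser.add_argument('dsn')
      arg = parser.parse_args(argv)
      return main(arg.dsn)

  # in a real script argv would be sys.argv[1:]
  print(run(['mydb']))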

However saving three lines does not justify introducing the external
dependency: most people will not switch to Python 2.7, which at the time of
this writing is just about to be released, for many years. 
Moreover, it just feels too complex to instantiate a class and to
define a parser by hand for such a trivial task.

The plac_ module is designed to handle such use cases well, and it is able
to reduce the original nine lines of boilerplate to two lines. With the
plac_ module all you need to write is

.. include:: example3.py
   :literal:

The plac_ module provides for free (actually the work is done by the
underlying argparse_ module) a nice usage message::

 $ python example3.py -h
 usage: example3.py [-h] dsn
 
 positional arguments:
   dsn
 
 optional arguments:
   -h, --help  show this help message and exit

This is only the tip of the iceberg: plac_ is able to do much more than that.

Scripts with default arguments
--------------------------------------------------

The need to have suitable defaults for command line arguments is quite
common. For instance I have encountered this use case at work hundreds
of times:

.. include:: example4.py
   :literal:

Here I want to perform a query on a database table, extracting
today's data: it makes sense for ``today`` to be a default argument.
If there is a most used table (in this example a table called ``'product'``)
it also makes sense to make it a default argument. Parsing the
command line arguments by hand takes 8 ugly lines of boilerplate
(using argparse_ would require about the same number of lines).
With plac_ the entire ``__main__`` block reduces to the usual two lines::

  if __name__ == '__main__':
      import plac; plac.call(main)

In other words, six lines of boilerplate have been removed, and we get
the usage message for free::

 usage: example5.py [-h] dsn [table] [today]
 
 positional arguments:
   dsn
   table
   today
 
 optional arguments:
   -h, --help  show this help message and exit

plac_ transparently handles even the case when you want to pass a
variable number of arguments. Here is an example, a script running
a series of SQL scripts on a database:

.. include:: example6.py
   :literal:

Using plac_, you can just replace the ``__main__`` block with the
usual two lines (I have defined an Emacs keybinding for them)
and then you get the following nice usage message::

 usage: example7.py [-h] dsn [scripts [scripts ...]]
 
 positional arguments:
   dsn
   scripts

 optional arguments:
   -h, --help  show this help message and exit

The examples here should have made clear that *plac is able to figure out
the command line arguments parser to use from the signature of the main
function*. This is the whole idea behind plac_: if the intent is clear,
let the machine take care of the details.
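
The inference can be sketched with the standard library alone. The
following is a simplified illustration in Python 3 syntax, not the actual
plac_ implementation; ``parser_from_signature`` and ``call_from_argv`` are
hypothetical names. Arguments without a default become required
positionals, arguments with a default become optional positionals, and
starred arguments collect whatever is left::

  import argparse
  import inspect

  def parser_from_signature(func):
      "Build an ArgumentParser from the signature of func (sketch)"
      parser = argparse.ArgumentParser(description=func.__doc__)
      for name, param in inspect.signature(func).parameters.items():
          if param.kind is param.VAR_POSITIONAL:
              # *args: zero or more trailing arguments
              parser.add_argument(name, nargs='*')
          elif param.default is param.empty:
              # no default: a required positional argument
              parser.add_argument(name)
          else:
              # a default: the argument may be omitted
              parser.add_argument(name, nargs='?', default=param.default)
      return parser

  def call_from_argv(func, argv):
      "Parse argv and call func with the values, in signature order"
      ns = parser_from_signature(func).parse_args(argv)
      args = []
      for name, param in inspect.signature(func).parameters.items():
          if param.kind is param.VAR_POSITIONAL:
              args.extend(getattr(ns, name))
          else:
              args.append(getattr(ns, name))
      return func(*args)

  def main(dsn, table='product', *scripts):
      return dsn, table, scripts

  print(call_from_argv(main, ['db']))
  print(call_from_argv(main, ['db', 'orders', 'a.sql', 'b.sql']))

The sketch ignores annotations, options and flags; it only shows that the
signature carries enough information to build the parser.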

Scripts with options
---------------------------------------

It is surprising how few command line scripts with options I have
written over the years (probably fewer than a hundred), compared to the
number of scripts with positional arguments I have written (certainly more
than a thousand).  Still, this use case cannot be neglected.
The standard library modules (all of them) are quite verbose when it
comes to specifying the options and frankly I have never used them
directly. Instead, I have always relied on an old recipe of mine, the
optionparse_ recipe, which provides a convenient wrapper over
optparse_. Alternatively, in the simplest cases, I have just
performed the parsing by hand.

plac_ is inspired by the optionparse_ recipe, in the sense that it
frees the programmer from the burden of writing the parser, but it is
less of a hack: instead of extracting the parser from the docstring of
the module, it extracts it from the signature of the ``main``
function.

The idea comes from the `function annotations` concept, a new
feature of Python 3. An example is worth a thousand words, so here
it is:

.. include:: example8.py
   :literal:

As you see, the argument ``command`` has been annotated with the tuple
``("SQL query", 'option', 'c')``: the first string is the help string
which will appear in the usage message, the second string tells plac_
that ``command`` is an option and the third string says that it can be
abbreviated with the letter ``c``. Of course, the long option format
(``--command=``) comes from the argument name.  The resulting usage
message is the following::

 usage: example8.py [-h] [-c COMMAND] dsn
 
 positional arguments:
   dsn

 optional arguments:
   -h, --help            show this help message and exit
   -c COMMAND, --command COMMAND
                         SQL query

Here are two examples of usage::

 $ python3 example8.py -c"select * from table" dsn
 executing select * from table on dsn

 $ python3 example8.py --command="select * from table" dsn
 executing select * from table on dsn
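
To make the mapping concrete, here is a minimal sketch (Python 3, standard
library only) of how an annotation tuple could be turned into an argparse_
option. It is an illustration under simplifying assumptions, not the code
of plac_: ``parser_from_annotations`` is a hypothetical helper which only
understands three-element ``'option'`` tuples and treats everything else
as a plain positional argument::

  import argparse
  import inspect

  def main(command: ("SQL query", 'option', 'c'), dsn):
      if command:
          return 'executing %s on %s' % (command, dsn)

  def parser_from_annotations(func):
      parser = argparse.ArgumentParser()
      for name, param in inspect.signature(func).parameters.items():
          ann = param.annotation
          if ann is not param.empty and ann[1] == 'option':
              help_text, kind, abbrev = ann
              # short form from the abbreviation, long form from the name
              parser.add_argument('-' + abbrev, '--' + name, help=help_text)
          else:
              parser.add_argument(name)
      return parser

  parser = parser_from_annotations(main)
  ns = parser.parse_args(['-c', 'select * from table', 'dsn'])
  print(main(ns.command, ns.dsn))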

Notice that if the option is not passed, the variable ``command``
will get the value ``None``. It is possible to specify a non-trivial
default for an option. Here is an example:

.. include:: example8_.py
   :literal:

Now if you do not pass the ``command`` option, the
default query will be executed::

 $ python example8_.py dsn
 executing 'select * from table' on dsn

Positional arguments can be annotated too::

 def main(command: ("SQL query", 'option', 'c'),
          dsn: ("Database dsn", 'positional', None)):
     ...

Of course explicit is better than implicit, and special cases are not
special enough to break the rules, but sometimes practicality beats
purity, so plac_ is smart enough to convert help messages into tuples;
in other words, you can just write ``"Database dsn"`` instead of
``("Database dsn", 'positional', None)``. In both cases the usage
message will show a nice help string on the right hand side of the
``dsn`` positional argument.

I should also note that varargs (starred arguments) can be annotated too;
here is an example::

  def main(dsn: "Database dsn", *scripts: "SQL scripts"):
      ...

This is a valid signature for plac_, which will recognize the help strings
for both ``dsn`` and ``scripts``::

 positional arguments:
   dsn                          Database dsn
   scripts                      SQL scripts

Scripts with flags
--------------------

plac_ also recognizes flags, i.e. boolean options which are
``True`` if they are passed to the command line and ``False`` 
if they are absent. Here is an example:

.. include:: example9.py
   :literal:

::

 $ python3 example9.py -h
 usage: example9.py [-h] [-v] dsn
 
 positional arguments:
   dsn            connection string
 
 optional arguments:
   -h, --help     show this help message and exit
   -v, --verbose  prints more info

::

 $ python3 example9.py -v dsn
 connecting to dsn

Notice that it is an error to specify a default for a flag: the
default value for a flag is always ``False``. If you feel the need to
implement non-boolean flags, you should use an option with two
choices, as explained in the "more features" section.
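
For readers who know argparse_, the same behaviour can be obtained there
with a ``store_true`` action. The sketch below (Python 3) shows one way to
do it by hand; it is not necessarily how plac_ implements flags
internally::

  import argparse

  parser = argparse.ArgumentParser()
  # a flag takes no value: present means True, absent means False
  parser.add_argument('-v', '--verbose', action='store_true',
                      help='prints more info')
  parser.add_argument('dsn', help='connection string')

  print(parser.parse_args(['dsn']).verbose)
  print(parser.parse_args(['-v', 'dsn']).verbose)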

For consistency with the way the usage message is printed, I suggest
you follow the Flag-Option-Required-Default (FORD) convention: in
the ``main`` function write first the flag arguments, then the option
arguments, then the required arguments and finally the default
arguments. This is just a convention and you are not forced to use it,
except for the default arguments (including the varargs) which must
stay at the end, as required by the Python syntax.

plac for Python 2.X users
--------------------------------------------------

I do not use Python 3. At work we are just starting to think about
migrating to Python 2.6. It will take years before we
think of migrating to Python 3. I am pretty sure most Pythonistas
are in the same situation. Therefore plac_ provides a way to work
with function annotations even in Python 2.X (including Python 2.3).
There is no magic involved; you just need to add the annotations
by hand. For instance the annotated function declaration

::

  def main(dsn: "Database dsn", *scripts: "SQL scripts"):
      ...

is equivalent to the following code::

  def main(dsn, *scripts):
      ...
  main.__annotations__ = dict(
      dsn="Database dsn",
      scripts="SQL scripts")

One should be careful to match the keys of the annotation dictionary
with the names of the arguments in the annotated function; for lazy
people with Python 2.4 available the simplest way is to use the
``plac.annotations`` decorator that performs the check for you::

  @plac.annotations(
      dsn="Database dsn",
      scripts="SQL scripts")
  def main(dsn, *scripts):
      ...
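
A decorator of this kind is easy to write. The following sketch is an
illustration of the idea, not the source of ``plac.annotations``; it is
written in Python 3 syntax for convenience (Python 2 code would inspect
the signature with ``inspect.getargspec`` instead), and both the
``annotations`` name and the choice of ``NameError`` are assumptions of
the sketch::

  import inspect

  def annotations(**ann):
      "Attach ann as __annotations__ after checking the argument names"
      def decorator(func):
          # parameter names, including the name of the starred argument
          argnames = set(inspect.signature(func).parameters)
          for name in ann:
              if name not in argnames:
                  raise NameError(
                      'Annotating non-existing argument: %s' % name)
          func.__annotations__ = ann
          return func
      return decorator

  @annotations(dsn="Database dsn", scripts="SQL scripts")
  def main(dsn, *scripts):
      return dsn, scripts

  print(main.__annotations__)

A misspelled key, such as ``dns`` instead of ``dsn``, is then reported
when the function is defined rather than silently ignored.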

In the rest of this article I will assume that you are using Python 2.X with
``X >= 4`` and I will use the ``plac.annotations`` decorator. Notice however
that the tests for plac_ run even on Python 2.3.

More features
--------------------------------------------------

Even if one of the goals of plac is to have a learning curve of
*minutes*, compared to the learning curve of *hours* of
argparse_, it does not mean that I have removed all the features of
argparse_. Actually a lot of argparse_ power persists in plac_.  Until
now, I have only shown simple annotations, but in general an
annotation is a 6-tuple of the form

  ``(help, kind, abbrev, type, choices, metavar)``

where ``help`` is the help message, ``kind`` is a string in the set {
``"flag"``, ``"option"``, ``"positional"``}, ``abbrev`` is a
one-character string, ``type`` is a callable taking a string in input,
``choices`` is a discrete sequence of values and ``metavar`` is a string.

``type`` is used to automagically convert the command line arguments
from the string type to any Python type; by default there is no
conversion and ``type=None``.

``choices`` is used to restrict the set of valid
values; by default there is no restriction i.e. ``choices=None``.

``metavar`` is used to change the argument name in the usage message
(and only there); by default the metavar is ``None``: this means that
the name in the usage message is the same as the argument name,
unless the argument has a default, in which case it is
equal to the stringified form of the default.

Here is an example showing many of the features (taken from the
argparse_ documentation):

.. include:: example10.py
   :literal:

Here is the usage::

 usage: example10.py [-h] {add,mul} [n [n ...]]
 
 A script to add and multiply numbers

 positional arguments:
   {add,mul}   The name of an operator
   n           A number

 optional arguments:
   -h, --help  show this help message and exit

Notice that the docstring of the ``main`` function has been automatically added
to the usage message. Here are a couple of examples of use::

 $ python example10.py add 1 2 3 4
 10.0
 $ python example10.py mul 1 2 3 4
 24.0
 $ python example10.py ad 1 2 3 4 # a misspelling error
 usage: example10.py [-h] {add,mul} [n [n ...]]
 example10.py: error: argument operator: invalid choice: 'ad' (choose from 'add', 'mul')
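
For comparison, the same interface can be written directly against
argparse_, where ``type``, ``choices`` and ``metavar`` are keyword
arguments of ``add_argument``. This is a sketch in Python 3, not the
contents of ``example10.py``; ``build_parser`` and ``compute`` are
hypothetical names and at least one number is assumed to be passed::

  import argparse
  import functools
  import operator

  OPERATORS = {'add': operator.add, 'mul': operator.mul}

  def build_parser():
      p = argparse.ArgumentParser(
          description='A script to add and multiply numbers')
      # choices: only the listed values are accepted
      p.add_argument('operator', choices=sorted(OPERATORS),
                     help='The name of an operator')
      # type: each string is converted to float; metavar: shown as "n"
      p.add_argument('numbers', nargs='*', type=float, metavar='n',
                     help='A number')
      return p

  def compute(argv):
      ns = build_parser().parse_args(argv)
      # fold the operator over the numbers (fails on an empty list)
      return functools.reduce(OPERATORS[ns.operator], ns.numbers)

  print(compute(['add', '1', '2', '3', '4']))
  print(compute(['mul', '1', '2', '3', '4']))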

A more realistic example
---------------------------------------

Here is a more realistic script using most of the features of plac_ to
run SQL queries on a database by relying on SQLAlchemy_. Notice the usage
of the ``type`` feature to automagically convert a SQLAlchemy connection
string into a SqlSoup_ object:

.. include:: dbcli.py
   :literal:

Here is the usage message::

 $ python dbcli.py -h
 usage: dbcli.py [-h] [-H] [-c SQL] [-d |] db [scripts [scripts ...]]

 A script to run queries and SQL scripts on a database
 
 positional arguments:
   db                    Connection string
   scripts               SQL scripts
 
 optional arguments:
   -h, --help            show this help message and exit
   -H, --header          Header
   -c SQL, --sqlcmd SQL  SQL command
   -d |, --delimiter |   Column separator

Advanced usage
----------------------------------------------------

plac_ relies on argparse_ for all of the heavy lifting and it is
possible to leverage argparse_ features directly or indirectly.

For instance, you can make an argument invisible in the usage message
simply by using ``'==SUPPRESS=='`` as help string (or
``argparse.SUPPRESS``). Similarly, you can use argparse.FileType_
directly.

It is also possible to pass options to the underlying
``argparse.ArgumentParser`` object (currently it accepts the default
arguments ``description``, ``epilog``, ``prog``, ``usage``,
``add_help``, ``argument_default``, ``parents``, ``prefix_chars``,
``fromfile_prefix_chars``, ``conflict_handler``, ``formatter_class``).
It is enough to set such attributes on the ``main`` function.  For
instance

::

  def main(...):
      pass

  main.add_help = False

disables the recognition of the help flag ``-h, --help``. This is not
particularly elegant, but I assume the typical user of plac_ will be
happy with the defaults and would not want to change them; still it is
possible if she wants to. For instance, by setting the ``description``
attribute, it is possible to add a comment to the usage message (by
default the docstring of the ``main`` function is used as
description). It is also possible to change the option prefix; for
instance if your script must run under Windows and you want to use "/"
as option prefix you can add the lines::

  main.prefix_chars='-/'
  main.short_prefix = '/'

The recognition of the ``short_prefix`` attribute is a plac_
extension; there is also a companion ``long_prefix`` attribute with
default value of ``"--"``. ``prefix_chars`` is an argparse_ feature.
Interested readers should read the documentation of argparse_ to
understand the meaning of the other options. If there is a set of
options that you use very often, you may consider writing a decorator
adding such options to the ``main`` function for you. For simplicity,
plac_ does not perform any magic of that kind.
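
Since ``prefix_chars`` is an argparse_ feature, its effect can be tried
without plac_. The sketch below (Python 3) registers a short option
introduced by ``/`` and a long option introduced by ``--``; it does not
involve the ``short_prefix`` and ``long_prefix`` attributes, which are
plac_ extensions::

  import argparse

  # both '-' and '/' are accepted as option prefixes
  parser = argparse.ArgumentParser(prefix_chars='-/')
  parser.add_argument('/v', '--verbose', action='store_true')
  parser.add_argument('dsn')

  ns = parser.parse_args(['/v', 'mydb'])
  print(ns.verbose, ns.dsn)

Keep in mind that with such a setting a positional argument starting
with ``/`` (an absolute Unix path, say) is parsed as an option.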

It is possible to access directly the underlying ArgumentParser_ object, by
invoking the ``plac.parser_from`` utility function:

>>> import plac
>>> def main(arg):
...     pass
... 
>>> print plac.parser_from(main)
ArgumentParser(prog='', usage=None, description=None, version=None, 
formatter_class=<class 'argparse.HelpFormatter'>, conflict_handler='error',
add_help=True)

I use ``plac.parser_from`` in the unit tests of the module, but regular
users should never need to use it.

Custom annotation objects
------------------------------------------------------

Internally plac_ uses an ``Annotation`` class to convert the tuples
in the function signature into annotation objects, i.e. objects with
six attributes ``help, kind, short, type, choices, metavar``.

Advanced users can implement their own annotation objects.
For instance, here is an example of how you could implement annotations for
positional arguments:

.. include:: annotations.py
   :literal:

You can use such annotations objects as follows:

.. include:: example11.py
   :literal:

Here is the usage message you get::

 usage: example11.py [-h] i n [rest [rest ...]]
 
 positional arguments:
   i           This is an int
   n           This is a float
   rest        Other arguments
 
 optional arguments:
   -h, --help  show this help message and exit

You can go on and define ``Option`` and ``Flag`` classes, if you like.
Using custom annotation objects you could do advanced things like extracting the
annotations from a configuration file or from a database, but I expect such
use cases to be quite rare: the default mechanism should work 
pretty well for most users.
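
As a sketch of what such a family of classes might look like (Python 3;
an illustration, not the contents of ``annotations.py``), one can put the
common attributes in a base class and let each subclass fix the ``kind``.
The attribute names follow the tuple layout given in the "More features"
section; in particular, calling the abbreviation ``abbrev`` is an
assumption of this sketch::

  class Annotation(object):
      "Holds the six attributes: help, kind, abbrev, type, choices, metavar"
      kind = 'positional'

      def __init__(self, help='', abbrev=None, type=None,
                   choices=None, metavar=None):
          self.help = help
          self.abbrev = abbrev
          self.type = type
          self.choices = choices
          self.metavar = metavar

  class Positional(Annotation):
      kind = 'positional'

  class Option(Annotation):
      kind = 'option'

  class Flag(Annotation):
      kind = 'flag'

  verbose = Flag('prints more info', 'v')
  command = Option('SQL query', 'c')
  number = Positional('This is an int', type=int)
  print(verbose.kind, command.kind, number.kind)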

plac vs argparse
---------------------------------------------

plac_ is opinionated and by design it does not try to make available
all of the features of argparse_ in an easy way.  In particular you
should be aware of the following limitations/differences (the
following assumes knowledge of argparse_):

- plac_ automatically defines both a long and short form for each option,
  just like optparse_. argparse_ allows you to define only a long form,
  or only a short form, if you like. However, since I have always been
  happy with the behavior of optparse_, which I feel is pretty much
  consistent, I have decided not to support this feature.

- plac does not support the destination concept: the destination
  coincides with the name of the argument, always. This restriction
  has some drawbacks. For instance, suppose you want to define a long
  option called ``--yield``. In this case the destination would be ``yield``,
  which is a Python keyword, and since you cannot introduce an
  argument with that name in a function definition, it is impossible
  to implement it. Your choices are to change the name of the long
  option, or to use argparse_ with a suitable destination.

- plac_ does not support "required options". As the argparse_
  documentation puts it: *Required options are generally considered bad
  form - normal users expect options to be optional. You should avoid
  the use of required options whenever possible.*

- plac_ supports only regular boolean flags. argparse_ has the ability to 
  define generalized two-value flags with values different from ``True`` 
  and ``False``. An earlier version of plac_ had this feature too, but 
  since you can use options with two choices instead, and in any case
  the conversion from ``{True, False}`` to any couple of values
  can be trivially implemented with a ternary operator
  (``value1 if flag else value2``), I have removed it (KISS rules!).

- plac_ does not support ``nargs`` options directly (it uses them internally,
  though, to implement flag recognition). The reason is that all the use
  cases of interest to me are covered by plac_ and I did not feel the need
  to increase the learning curve by adding direct support for ``nargs``.

- plac_ does not support subparsers directly. For the moment, this
  looks like a feature too advanced for the goals of plac_.

- plac_ does not support actions directly. This also
  looks like a feature too advanced for the goals of plac_. Notice however
  that the ability to define your own annotation objects may mitigate the
  need for custom actions.
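
Two of the points above can be illustrated with plain argparse_ (a
Python 3 sketch; the option names and the ``yield_`` destination are made
up for the example). The first argument shows the destination concept,
which lets a ``--yield`` option store its value under a legal Python
name; the last line shows the ternary operator turning a boolean flag
into a two-valued setting::

  import argparse

  parser = argparse.ArgumentParser()
  # 'yield' is a keyword, so the value is stored as ns.yield_
  parser.add_argument('--yield', dest='yield_', type=float, default=0.0)
  parser.add_argument('-f', '--fast', action='store_true')

  ns = parser.parse_args(['--yield', '2.5', '-f'])
  # a regular boolean flag converted to any couple of values
  mode = 'quick' if ns.fast else 'thorough'
  print(ns.yield_, mode)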

I should stress again that if you want to access all of the argparse_ features
from plac_ you can use ``plac.parser_from`` and you will get
the underlying ArgumentParser_ object. The full power of argparse_
is then available to you: you can use ``add_argument``, ``add_subparsers()``,
etc. In other words, while some features are not supported directly,
*all* features are supported indirectly.

The future
-------------------------------

Currently plac is below 100 lines of code, not counting blanks, comments
and docstrings. I do not plan to extend it much in the future. The idea is
to keep the module short: it is and it should remain a little wrapper over
argparse_. Actually I have thought about contributing the code back to
argparse_ if plac_ becomes successful and gains a reasonable number of
users. For the moment it should be considered experimental: after all 
I wrote it in three days, including the tests, the documentation and the
time to learn argparse_.

Trivia: the story behind the name
-----------------------------------------

The plac_ project started very humbly: I just wanted to make my old
optionparse_ recipe easy_installable, and to publish it on PyPI.
The original name of plac_ was optionparser and the idea behind it was
to build an OptionParser_ object from the docstring of the module.
However, before doing that, I decided to check out the argparse_ module,
since I knew it was going into Python 2.7 and Python 2.7 was coming out.
Soon enough I realized two things:

1. the single greatest idea of argparse_ was unifying the positional arguments
   and the options in a single namespace object;
2. parsing the docstring was so old-fashioned, considering the existence
   of function annotations in Python 3.

Putting together these two observations with the original idea of inferring the
parser I decided to build an ArgumentParser_ object from function
annotations. The ``optionparser`` name was ruled out, since I was
now using argparse_; a name like ``argparse_plus`` was also ruled out,
since the typical usage was completely different from the argparse_ usage.

I searched PyPI and the name clap (Command Line Arguments Parser)
was not taken, so I renamed everything to clap. After two days
a Clap_ module appeared on PyPI <expletives deleted>!

Having little imagination, I decided to rename everything again to plac,
an anagram of clap: since it is not an existing English word, I hope nobody
will steal it from me!

That's all, I hope you will enjoy working with plac!

.. _argparse: http://argparse.googlecode.com
.. _optparse: http://docs.python.org/library/optparse.html
.. _getopt: http://docs.python.org/library/getopt.html
.. _optionparse: http://code.activestate.com/recipes/278844-parsing-the-command-line/
.. _plac: http://pypi.python.org/pypi/plac
.. _scaling down: http://www.welton.it/articles/scalable_systems
.. _ArgumentParser: http://argparse.googlecode.com/svn/tags/r11/doc/ArgumentParser.html
.. _argparse.FileType: http://argparse.googlecode.com/svn/tags/r11/doc/other-utilities.html?highlight=filetype#FileType
.. _Clap: http://pypi.python.org/pypi/Clap/0.7
.. _OptionParser: http://docs.python.org/library/optparse.html?highlight=optionparser#optparse.OptionParser
.. _SQLAlchemy: http://www.sqlalchemy.org/
.. _SqlSoup: http://www.sqlalchemy.org/docs/reference/ext/sqlsoup.html