Parsing the Command Line the Easy Way: Introducing plac, the Easiest Argument Parser in the Python World
==============================================================================================================

:Author: Michele Simionato
:E-mail: michele.simionato@gmail.com
:Requires: Python 2.3+
:Download page: http://pypi.python.org/pypi/plac
:Installation: ``easy_install plac``
:License: BSD license

.. contents::

The importance of scaling down
------------------------------------------------

There is no shortage of command-line argument parsers in the Python
world. The standard library alone contains three different modules:
getopt_ (from the stone age),
optparse_ (from Python 2.3) and argparse_ (from Python 2.7).  All of
them are quite powerful, and argparse_ in particular is an
industrial-strength solution; unfortunately, all of them feature a
non-zero learning curve and a certain verbosity.

An ex-coworker of mine, David Welton, once wrote a nice article about
the importance of `scaling down`_: most people are concerned with the
possibility of scaling up, but we should also be concerned with the
issue of scaling down. This is an old meme in the computing world:
programs should address the common cases simply; simple things should
be kept simple, while at the same time keeping difficult things
possible. plac_ adheres as much as possible to this philosophy: it
is designed to handle the simple cases well, while retaining the
ability to handle complex cases by relying on the underlying power of
argparse_.

Technically plac_ is just a simple wrapper over argparse_, hiding most
of its complexity while retaining most of its power. The complexity is
removed by using a declarative interface instead of an imperative one.
Still, plac_ is surprisingly scalable upwards, even without
using the underlying argparse_. I have been using Python for
8 years and in my experience it is extremely unlikely that you will
ever need to go beyond the features provided by the declarative interface
of plac_: they should be more than enough for 99.9% of the typical use cases. 

plac_ targets programmers, sysadmins, scientists and in
general people writing throw-away scripts for themselves, who choose
a command line interface because it is quick and simple. Such
users are not interested in features; they are interested in a small
learning curve: they just want to be able to write a simple command
line tool from a simple specification, not to build a command line
parser by hand. Unfortunately, the modules in the standard library
force them to go the hard way. They are designed to implement power
user tools for programmers or system administrators, and they have a
non-trivial learning curve.

Scripts with required positional arguments
---------------------------------------------

Let me start with the simplest possible thing: a script that takes a
single argument and does something to it.  It cannot get more trivial
than that (discarding the possibility of a script without command line
arguments, where there is nothing to parse); nevertheless it is an
*extremely common* use case: I need to write scripts like that nearly
every day, I have written hundreds of them in the last few years and I
have never been happy. Here is a typical example of code I have been
writing by hand for years:

 .. include:: example1.py
    :literal:

As you see the whole ``if __name__ == '__main__'`` block (nine lines) is
essentially boilerplate that should not exist.  Actually I think the
Python language should recognize the main function and pass the
command line arguments to it behind the scenes; unfortunately this is
unlikely to happen. I have been writing boilerplate like this in hundreds of
scripts for years, and every time I *hate* it. The purpose of using a
scripting language is convenience and trivial things should be
trivial. Unfortunately the standard library does not help with
this use case, which may be trivial, but is still incredibly
common. Using getopt_ and optparse_ does not help, since they are
intended to manage options and not positional arguments; the argparse_
module helps a bit and it is able to reduce the boilerplate from nine
lines to six lines:

 .. include:: example2.py
    :literal:

However, saving three lines does not justify introducing the external
dependency: most people will not switch to Python 2.7, which at the time of
this writing is just about to be released, for many years.
Moreover, it just feels too complex to instantiate a class and to
define a parser by hand for such a trivial task.

The plac_ module is designed to handle such use cases well, and it is able
to reduce the original nine lines of boilerplate to two lines. With the
plac_ module all you need to write is

 .. include:: example3.py
    :literal:

The plac_ module provides for free (actually the work is done by the
underlying argparse_ module) a nice usage message::

 $ python example3.py -h
 usage: example3.py [-h] dsn
 
 positional arguments:
   dsn
 
 optional arguments:
   -h, --help  show this help message and exit

This is only the tip of the iceberg: plac_ is able to do much more than that.

Scripts with default arguments
--------------------------------------------------

I have encountered this use case at work hundreds of times:

 .. include:: example4.py
    :literal:

With plac_ the entire ``__main__`` block reduces to the usual two lines::

  if __name__ == '__main__':
      import plac; plac.call(main)

In other words, six lines of boilerplate have been removed, and I have
the usage message for free::

 usage: example4_.py [-h] dsn [table] [today]
 
 positional arguments:
   dsn
   table
   today
 
 optional arguments:
   -h, --help  show this help message and exit

plac_ also handles transparently the case where you want to pass a
variable number of arguments. Here is an example, a script that runs
a series of SQL scripts on a database:

 .. include:: example6.py
    :literal:

Using plac_, you can just replace the ``__main__`` block with the
usual two lines (I have defined an Emacs keybinding for them)
and you get the following usage message::

 usage: example7.py [-h] dsn [scripts [scripts ...]]
 
 positional arguments:
   dsn
   scripts

 optional arguments:
   -h, --help  show this help message and exit

The examples here should have made clear that *plac is able to figure out
the command line arguments parser to use from the signature of the main
function*. This is the whole idea behind plac_: if my intent is clear,
let the machine take care of the details.
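
As a rough illustration of the idea, here is a minimal sketch of how a
parser could be inferred from a signature using only the standard
library. It is not the actual plac_ implementation: the helper name
``parser_from_signature`` is hypothetical, annotations are ignored, and
the code assumes Python 3.3+ for ``inspect.signature``::

  import argparse
  import inspect


  def parser_from_signature(func):
      """Build an ArgumentParser by inspecting the signature of func."""
      parser = argparse.ArgumentParser(description=func.__doc__)
      for name, param in inspect.signature(func).parameters.items():
          if param.kind is param.VAR_POSITIONAL:
              # *args: zero or more trailing positional arguments
              parser.add_argument(name, nargs='*')
          elif param.default is param.empty:
              # no default: a required positional argument
              parser.add_argument(name)
          else:
              # default given: an optional positional argument
              parser.add_argument(name, nargs='?', default=param.default)
      return parser


  def main(dsn, table='product', *scripts):
      return dsn, table, scripts


  parser = parser_from_signature(main)
  ns = parser.parse_args(['mydb', 'orders', 'a.sql', 'b.sql'])
  result = main(ns.dsn, ns.table, *ns.scripts)

In this sketch arguments without a default become required positionals,
arguments with a default become optional positionals, and a starred
argument collects whatever is left on the command line.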

Options and flags
---------------------------------------

It is surprising how few command line scripts with options I have
written over the years (probably fewer than a hundred), compared to the
number of scripts with positional arguments I have written (certainly
more than a thousand of them).  Still, this use case is quite common
and cannot be neglected.  The standard library modules (all of them)
are quite verbose when it comes to specifying the options and frankly
I have never used them directly. Instead, I have always relied on an
old recipe of mine, the optionparse_ recipe, which provides a
convenient wrapper over optparse_. Alternatively, in the simplest
cases, I have just performed the parsing by hand, instead of manually
building a suitable OptionParser_.

plac_ is inspired by the optionparse_ recipe, in the sense that it
frees the programmer from the burden of writing the parser, but it is
less of a hack: instead of extracting the parser from the docstring of
the module, it extracts it from the signature of the ``main``
function.

The idea comes from the `function annotations` concept, a new
feature of Python 3. An example is worth a thousand words, so here
it is:

 .. include:: example8.py
    :literal:

As you see, the argument ``command`` has been annotated with the
tuple ``("SQL query", 'option', 'c')``: the first string is the
help string which will appear in the usage message, whereas the
second and third strings tell plac_ that ``command`` is an option and that
it can be abbreviated with the letter ``c``. Of course, the long option 
format (``--command=``) comes from the argument name.
The resulting usage message is the following::

 $ python3 example8.py -h
 usage: example8.py [-h] [-c COMMAND] dsn
 
 positional arguments:
   dsn

 optional arguments:
   -h, --help            show this help message and exit
   -c COMMAND, --command COMMAND
                         SQL query

Here are two examples of usage::

 $ python3 example8.py -c"select * from table" dsn
 executing select * from table on dsn

 $ python3 example8.py --command="select * from table" dsn
 executing select * from table on dsn

Notice that if the option is not passed, the variable ``command``
will get the value ``None``. It is possible to specify a non-trivial
default for an option. Here is an example:

 .. include:: example8_.py
    :literal:

Now if you do not pass the ``command`` option, the
default query will be executed::

 $ python article/example8_.py dsn
 executing 'select * from table' on dsn

Positional arguments can be annotated too::

 def main(command: ("SQL query", 'option', 'c'),
          dsn: ("Database dsn", 'positional', None)):
     ...

Of course explicit is better than implicit, and special cases are not
special enough to break the rules, but sometimes practicality beats
purity, so plac_ is smart enough to convert help messages into tuples;
in other words, you can just write ``"Database dsn"`` instead of
``("Database dsn", 'positional', None)``. In both cases the usage
message will show a nice help string on the right hand side of the
``dsn`` positional argument. Varargs (starred arguments) can also be
annotated::

  def main(dsn: "Database dsn", *scripts: "SQL scripts"):
      ...

is a valid signature for plac_, which will recognize the help strings
for both ``dsn`` and ``scripts``::

 positional arguments:
   dsn                          Database dsn
   scripts                      SQL scripts
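
The shorthand described above (a bare help string standing for a full
tuple) can be pictured with a small sketch. The function ``as_tuple`` is
a hypothetical helper written for illustration, not part of plac_, and
it only covers the three fields introduced so far::

  def as_tuple(annotation):
      """Expand a bare help string into a (help, kind, abbrev) tuple."""
      if isinstance(annotation, str):
          # a plain string is taken as the help of a positional argument
          return (annotation, 'positional', None)
      return tuple(annotation)


  short = as_tuple("Database dsn")
  full = as_tuple(("SQL query", 'option', 'c'))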

plac_ also recognizes flags, i.e. boolean options which are
``True`` if they are passed to the command line and ``False`` 
if they are absent. Here is an example::

 $ python3 example9.py -v dsn
 connecting to dsn

::

 $ python3 example9.py -h
 usage: example9.py [-h] [-v] dsn
 
 positional arguments:
   dsn            connection string
 
 optional arguments:
   -h, --help     show this help message and exit
   -v, --verbose  prints more info

Notice that it is an error to specify a default for flags: the
default value for a flag is always ``False``. If you feel the need to
implement non-boolean flags, you should use an option with two
choices, as explained in the "More features" section.
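
For readers who know argparse_, the following sketch shows roughly what
an option and a flag amount to when written by hand. It is plain
argparse code assembled for comparison, under the assumption that a flag
behaves like a ``store_true`` action; it does not show what plac_
generates internally::

  import argparse

  parser = argparse.ArgumentParser()
  # a flag: True when passed on the command line, False otherwise
  parser.add_argument('-v', '--verbose', action='store_true',
                      help='prints more info')
  # an option: takes a value, and is None when it is not passed
  parser.add_argument('-c', '--command', help='SQL query')
  # a required positional argument
  parser.add_argument('dsn', help='connection string')

  ns = parser.parse_args(['-v', '-c', 'select * from table', 'dsn'])
  defaults = parser.parse_args(['dsn'])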

For consistency with the way the usage message is printed, I suggest
you follow the Flag-Option-Required-Default (FORD) convention: in
the ``main`` function write first the flag arguments, then the option
arguments, then the required arguments and finally the default
arguments. This is just a convention and you are not forced to use it,
except for the default arguments (including the varargs) which must
stay at the end, as required by the Python syntax.

plac for Python 2.X users
--------------------------------------------------

I do not use Python 3. At work we are just starting to think about
migrating to Python 2.6. It will take years before we even
think of migrating to Python 3. I am pretty sure most Pythonistas
are in the same situation. Therefore plac_ provides a way to work
with function annotations even in Python 2.X (including Python 2.3).
There is no magic involved; you just need to add the annotations
by hand. For instance

::

  def main(dsn: "Database dsn", *scripts: "SQL scripts"):
      ...

becomes::

  def main(dsn, *scripts):
      ...
  main.__annotations__ = dict(
      dsn="Database dsn",
      scripts="SQL scripts")

One should be careful to match the keys of the annotations dictionary
with the names of the arguments in the annotated function; for lazy
people with Python 2.4 available the simplest way is to use the
``plac.annotations`` decorator, which performs the check for you.

::

  import plac

  @plac.annotations(
      dsn="Database dsn",
      scripts="SQL scripts")
  def main(dsn, *scripts):
      ...

In the rest of this article I will assume that you are using Python 2.X with
``X >= 4`` and I will use the ``plac.annotations`` decorator. Notice however
that the tests for plac_ are supposed to run even with Python 2.3.
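
To give an idea of what such a decorator has to do, here is a minimal
sketch of a checking decorator. It is an illustration written under my
own assumptions, not the source of ``plac.annotations``: in particular
the choice of ``NameError`` and the error message are hypothetical::

  import inspect

  # getfullargspec on Python 3, getargspec on Python 2
  getspec = getattr(inspect, 'getfullargspec', None) or inspect.getargspec


  def annotations(**ann):
      """Attach ann to the decorated function as __annotations__,
      after checking that every key is the name of an argument."""
      def decorator(func):
          spec = getspec(func)
          argnames = list(spec.args)
          if spec.varargs:
              argnames.append(spec.varargs)
          for name in ann:
              if name not in argnames:
                  raise NameError(
                      'Annotating non-existing argument: %s' % name)
          func.__annotations__ = ann
          return func
      return decorator


  @annotations(dsn="Database dsn", scripts="SQL scripts")
  def main(dsn, *scripts):
      pass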

More features
--------------------------------------------------

One of the goals of plac is to have a learning curve of *minutes*, compared
to the learning curve of *hours* of argparse_. That does not mean
that I have removed all the features of argparse_. Actually
a lot of argparse_ power persists in plac_.
Until now, I have only shown simple annotations, but in general
an annotation is a 6-tuple of the form

  ``(help, kind, abbrev, type, choices, metavar)``

where ``help`` is the help message, ``kind`` is one of {"flag",
"option", "positional"}, ``abbrev`` is a one-character string,
``type`` is a callable taking a string as input, ``choices`` is a sequence
of values and ``metavar`` is a string.

``type`` is used to automagically convert the arguments from string
to any Python type; by default there is no conversion i.e. ``type=None``.

``choices`` is used to restrict the set of valid
values; by default there is no restriction i.e. ``choices=None``.

``metavar`` is used to change the argument name in the usage message
(and only there); by default the metavar is equal to the name of the
argument, unless the argument has a default, in which case it is
equal to the stringified form of the default.

Here is an example showing many of the features (shamelessly stolen
from the argparse_ documentation):

 .. include:: example10.py
    :literal:

Here is the usage for the script::

 usage: example10.py [-h] {add,mul} [n [n ...]]
 
 A script to add and multiply numbers

 positional arguments:
   {add,mul}   The name of an operator
   n           A number

 optional arguments:
   -h, --help  show this help message and exit

Notice that the docstring of the ``main`` function has been automatically added
to the usage message. Here are a couple of examples of use::

 $ python example10.py add 1 2 3 4
 10.0
 $ python example10.py mul 1 2 3 4
 24.0
 $ python example10.py ad 1 2 3 4 # a misspelling
 usage: example10.py [-h] {add,mul} [n [n ...]]
 example10.py: error: argument operator: invalid choice: 'ad' (choose from 'add', 'mul')
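
For comparison, the ``type``, ``choices`` and ``metavar`` fields have
direct counterparts among the keyword arguments of ``add_argument`` in
argparse_. The following sketch is not the content of ``example10.py``;
it is a hand-written approximation of a similar interface in plain
argparse, and the argument name ``numbers`` is chosen only for
illustration::

  import argparse

  parser = argparse.ArgumentParser(
      description='A script to add and multiply numbers')
  # choices restricts the values accepted for the argument
  parser.add_argument('operator', choices=['add', 'mul'],
                      help='The name of an operator')
  # type converts each string; metavar only changes the displayed name
  parser.add_argument('numbers', type=float, nargs='*', metavar='n',
                      help='A number')

  ns = parser.parse_args(['add', '1', '2', '3', '4'])
  total = sum(ns.numbers)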

A somewhat realistic example
---------------------------------------

Here is a more realistic script using most of the features of plac_ to
run SQL queries on a database by relying on SQLAlchemy_. Notice the usage
of the ``type`` feature to automagically convert a SQLAlchemy connection
string into a SqlSoup_ object:

 .. include:: dbcli.py
    :literal:

Here is the usage message::

 $ python article/dbcli.py -h
 usage: dbcli.py [-h] [-H] [-c SQL] [-d |] db [scripts [scripts ...]]

 A script to run queries and SQL scripts on a database
 
 positional arguments:
   db                    Connection string
   scripts               SQL scripts
 
 optional arguments:
   -h, --help            show this help message and exit
   -H, --header          Header
   -c SQL, --sqlcmd SQL  SQL command
   -d |, --delimiter |   Column separator

A few notes about the underlying implementation
----------------------------------------------------

plac_ relies on argparse_ for all of the heavy lifting and it is
possible to leverage argparse_ features directly or indirectly.

For instance, you can hide an argument from the usage message
simply by using ``'==SUPPRESS=='`` as help string (or
``argparse.SUPPRESS``). Similarly, you can use argparse.FileType_
directly.

It is also possible to pass options to the underlying
``argparse.ArgumentParser`` object (currently it accepts the default
arguments ``description``, ``epilog``, ``prog``, ``usage``,
``add_help``, ``argument_default``, ``parents``, ``prefix_chars``,
``fromfile_prefix_chars``, ``conflict_handler``, ``formatter_class``).
It is enough to set such attributes on the ``main`` function.  For
instance

::

  def main(...):
      pass

  main.add_help = False

disables the recognition of the help flag ``-h, --help``. This is not
particularly elegant, but I assume the typical user of plac_ will be
happy with the defaults and would not want to change them; still it is
possible if she wants to. For instance, by setting the ``description``
attribute, it is possible to add a comment to the usage message (by
default the docstring of the ``main`` function is used as
description). It is also possible to change the option prefix; for
instance if your script must run under Windows and you want to use "/"
as option prefix you can add the lines::

  main.prefix_chars='-/'
  main.short_prefix = '/'

The recognition of the ``short_prefix`` attribute is a plac_
extension; there is also a companion ``long_prefix`` attribute with
default value of ``--``. ``prefix_chars`` is an argparse_ feature.
Interested readers should read the documentation of argparse_ to
understand the meaning of the other options. If there is a set of
options that you use very often, you may consider writing a decorator
adding such options to the ``main`` function for you. For simplicity,
plac_ does not perform any magic of that kind.
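
The mechanism of reading parser options from function attributes can be
sketched as follows. This is only an illustration of the idea with a
hypothetical helper ``parser_options``; how plac_ actually collects the
attributes is not shown in this document::

  import argparse

  # the ArgumentParser options listed above
  PARSER_OPTIONS = (
      'description', 'epilog', 'prog', 'usage', 'add_help',
      'argument_default', 'parents', 'prefix_chars',
      'fromfile_prefix_chars', 'conflict_handler', 'formatter_class')


  def parser_options(func):
      """Collect the ArgumentParser options set as attributes of func."""
      return dict((name, getattr(func, name))
                  for name in PARSER_OPTIONS if hasattr(func, name))


  def main(arg):
      pass

  main.add_help = False
  main.prog = 'demo'

  options = parser_options(main)
  parser = argparse.ArgumentParser(**options)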

It is possible to access directly the underlying ArgumentParser_ object, by
invoking the ``plac.parser_from`` utility function:

>>> import plac
>>> def main(arg):
...     pass
... 
>>> print plac.parser_from(main)
ArgumentParser(prog='', usage=None, description=None, version=None, 
formatter_class=<class 'argparse.HelpFormatter'>, conflict_handler='error',
add_help=True)

I use ``plac.parser_from`` in the unit tests of the module, but regular
users should never need to use it.

Custom annotation objects
------------------------------------------------------

Internally plac_ uses an ``Annotation`` class to convert the tuples
in the function signature into annotation objects, i.e. objects with
six attributes ``help, kind, short, type, choices, metavar``.

Advanced users can implement their own annotation objects.
For instance, here is an example of how you could implement annotations for
positional arguments:

 .. include:: annotations.py
    :literal:

You can use such annotation objects as follows:

 .. include:: example11.py
    :literal:

Here is the usage message you get::

 usage: example11.py [-h] i n [rest [rest ...]]
 
 positional arguments:
   i           This is an int
   n           This is a float
   rest        Other arguments
 
 optional arguments:
   -h, --help  show this help message and exit

You can go on and define ``Option`` and ``Flag`` classes, if you like.
Using custom annotation objects you could do advanced things like extracting the
annotations from a configuration file or from a database, but I expect such
use cases to be quite rare: the default mechanism should work 
pretty well for most users.

plac vs argparse
---------------------------------------------

plac_ is opinionated and by design it does not try to make available
all of the features of argparse_.  In particular you should be aware
of the following limitations/differences (the following assumes knowledge
of argparse_):

- plac_ automatically defines both a long and short form for each option,
  just like optparse_. argparse_ allows you to define only a long form,
  or only a short form, if you like. However, since I have always been
  happy with the behavior of optparse_, which I feel is pretty much
  consistent, I have decided not to support this feature.

- plac does not support the destination concept: the destination
  coincides with the name of the argument, always. This restriction
  has some drawbacks. For instance, suppose you want to define a long
  option called ``--yield``. In this case the destination would be ``yield``,
  which is a Python keyword, and since you cannot introduce an
  argument with that name in a function definition, it is impossible
  to implement it. Your choices are to change the name of the long
  option, or to use argparse_ with a suitable destination.

- plac_ does not support "required options". As the argparse_
  documentation puts it: *Required options are generally considered bad
  form - normal users expect options to be optional. You should avoid
  the use of required options whenever possible.*

- plac_ supports only regular boolean flags. argparse_ has the ability to 
  define generalized two-value flags with values different from ``True`` 
  and ``False``. An earlier version of plac_ had this feature too, but 
  since you can use options with two choices instead, and in any case
  the conversion from ``{True, False}`` to any couple of values
  can be trivially implemented with a ternary operator
  (``value1 if flag else value2``), I have removed it (KISS rules!).

- plac_ does not support ``nargs`` options directly (it uses them internally,
  though, to implement flag recognition). The reason is that all the use
  cases of interest to me are covered by plac_ and I did not feel the need
  to increase the learning curve by adding direct support for ``nargs``.

- plac_ does not support subparsers directly. For the moment, this
  looks like a feature too advanced for the goals of plac_.

- plac_ does not support actions directly. This also
  looks like a feature too advanced for the goals of plac_. Notice however
  that the ability to define your own annotation objects may mitigate the
  need for custom actions.

I should stress again that if you want to access all of the argparse_ features
from plac_ you can use ``plac.parser_from`` and you will get
the underlying ArgumentParser_ object. The full power of argparse_
is then available to you: you can use ``add_argument``, ``add_subparsers``,
etc. In other words, while some features are not supported directly,
*all* features are supported indirectly.
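
Two of the workarounds mentioned in the list above can be made concrete
with plain argparse_ code. The sketch below is independent of plac_; the
names ``yield_``, ``--quiet`` and ``loglevel`` are made up for the
example::

  import argparse

  parser = argparse.ArgumentParser()
  # "yield" is a keyword, so store the value under a different name
  parser.add_argument('--yield', dest='yield_', type=float)
  # a regular boolean flag
  parser.add_argument('-q', '--quiet', action='store_true')

  ns = parser.parse_args(['--yield', '2.5', '-q'])
  # turn the boolean into any pair of values with a conditional expression
  loglevel = 'WARNING' if ns.quiet else 'INFO'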

The future
-------------------------------

Currently plac is below 100 lines of code, not counting blanks, comments
and docstrings. I do not plan to extend it much in the future. The idea is
to keep the module short: it is and it should remain a little wrapper over
argparse_. Actually I have thought about contributing the code back to
argparse_ if plac_ becomes successful and gains a reasonable number of
users. For the moment it should be considered experimental: after all 
I wrote it in three days, including the tests, the documentation and the
time to learn argparse_.

Trivia: the story behind the name
-----------------------------------------

The plac_ project started very humbly: I just wanted to make
my old optionparse_ recipe easy_installable, and to publish it on PyPI.
The original name of plac_ was optionparser and the idea behind it was
to build an OptionParser_ object from the docstring of the module.
However, before doing that, I decided to check out the argparse_ module,
since I knew it was going into Python 2.7 and Python 2.7 was coming out.
Soon enough I realized two things:

1. the single greatest idea of argparse_ was unifying the positional arguments
   and the options in a single namespace object;
2. parsing the docstring was so old-fashioned, considering the existence
   of function annotations in Python 3.

Putting together these two observations with the original idea of inferring the
parser, I decided to build an ArgumentParser_ object from function
annotations. The ``optionparser`` name was ruled out, since I was
using argparse_; a name like ``argparse_plus`` was also ruled out,
since the typical usage was completely different from the argparse_ usage.

I searched PyPI and the name clap (Command Line Arguments Parser)
was not taken, so I renamed everything to clap. Two days later
a Clap_ module appeared on PyPI! <expletives deleted>

Having little imagination, I decided to rename everything again to plac,
an anagram of clap: since it is not an existing English word, I hope nobody
will steal it from me!

.. _argparse: http://argparse.googlecode.com
.. _optparse: http://docs.python.org/library/optparse.html
.. _getopt: http://docs.python.org/library/getopt.html
.. _optionparse: http://code.activestate.com/recipes/278844-parsing-the-command-line/
.. _plac: http://pypi.python.org/pypi/plac
.. _scaling down: http://www.welton.it/articles/scalable_systems
.. _ArgumentParser: http://argparse.googlecode.com/svn/tags/r11/doc/ArgumentParser.html
.. _argparse.FileType: http://argparse.googlecode.com/svn/tags/r11/doc/other-utilities.html?highlight=filetype#FileType
.. _Clap: http://pypi.python.org/pypi/Clap/0.7
.. _OptionParser: http://docs.python.org/library/optparse.html?highlight=optionparser#optparse.OptionParser
.. _SQLAlchemy: http://www.sqlalchemy.org/
.. _SqlSoup: http://www.sqlalchemy.org/docs/reference/ext/sqlsoup.html