summaryrefslogtreecommitdiff
path: root/doc/src/advanced.rst
blob: 28c4be9f9c018fd612aeb550466fa1423a74444b (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
More advanced topics
====================

.. sectionauthor:: Daniele Varrazzo <daniele.varrazzo@gmail.com>

.. testsetup:: *

    import re
    import select

    cur.execute("CREATE TABLE atable (apoint point)")
    conn.commit()

    def wait(conn):
        while True:
            state = conn.poll()
            if state == psycopg2.extensions.POLL_OK:
                break
            elif state == psycopg2.extensions.POLL_WRITE:
                select.select([], [conn.fileno()], [])
            elif state == psycopg2.extensions.POLL_READ:
                select.select([conn.fileno()], [], [])
            else:
                raise psycopg2.OperationalError("poll() returned %s" % state)

    aconn = psycopg2.connect(database='test', async=1)
    wait(aconn)
    acurs = aconn.cursor()


.. index::
    double: Subclassing; Cursor
    double: Subclassing; Connection

.. _subclassing-connection:
.. _subclassing-cursor:

Connection and cursor factories
-------------------------------

Psycopg exposes two new-style classes that can be sub-classed and expanded to
adapt them to the needs of the programmer: `psycopg2.extensions.cursor`
and `psycopg2.extensions.connection`.  The `connection` class is
usually sub-classed only to provide an easy way to create customized cursors
but other uses are possible. `cursor` is much more interesting, because
it is the class where query building, execution and result type-casting into
Python variables happens.

The `~psycopg2.extras` module contains several examples of :ref:`connection
and cursor subclasses <cursor-subclasses>`.

.. note::

    If you only need a customized cursor class, since Psycopg 2.5 you can use
    the `~connection.cursor_factory` parameter of a regular connection instead
    of creating a new `!connection` subclass.


.. index::
    single: Example; Cursor subclass

An example of cursor subclass performing logging is::

    import psycopg2
    import psycopg2.extensions
    import logging

    class LoggingCursor(psycopg2.extensions.cursor):
        def execute(self, sql, args=None):
            logger = logging.getLogger('sql_debug')
            logger.info(self.mogrify(sql, args))

            try:
                psycopg2.extensions.cursor.execute(self, sql, args)
            except Exception, exc:
                logger.error("%s: %s" % (exc.__class__.__name__, exc))
                raise

    conn = psycopg2.connect(DSN)
    cur = conn.cursor(cursor_factory=LoggingCursor)
    cur.execute("INSERT INTO mytable VALUES (%s, %s, %s);",
                 (10, 20, 30))



.. index::
    single: Objects; Creating new adapters
    single: Adaptation; Creating new adapters
    single: Data types; Creating new adapters

.. _adapting-new-types:

Adapting new Python types to SQL syntax
---------------------------------------

Any Python class or type can be adapted to an SQL string.  Adaptation mechanism
is similar to the Object Adaptation proposed in the :pep:`246` and is exposed
by the `psycopg2.extensions.adapt()` function.

The `~cursor.execute()` method adapts its arguments to the
`~psycopg2.extensions.ISQLQuote` protocol.  Objects that conform to this
protocol expose a `!getquoted()` method returning the SQL representation
of the object as a string (the method must return `!bytes` in Python 3).
Optionally the conform object may expose a
`~psycopg2.extensions.ISQLQuote.prepare()` method.

There are two basic ways to have a Python object adapted to SQL:

- the object itself is conform, or knows how to make itself conform. Such
  object must expose a `__conform__()` method that will be called with the
  protocol object as argument. The object can check that the protocol is
  `!ISQLQuote`, in which case it can return `!self` (if the object also
  implements `!getquoted()`) or a suitable wrapper object. This option is
  viable if you are the author of the object and if the object is specifically
  designed for the database (i.e. having Psycopg as a dependency and polluting
  its interface with the required methods doesn't bother you). For a simple
  example you can take a look at the source code for the
  `psycopg2.extras.Inet` object.

- If implementing the `!ISQLQuote` interface directly in the object is not an
  option (maybe because the object to adapt comes from a third party library),
  you can use an *adaptation function*, taking the object to be adapted as
  argument and returning a conforming object.  The adapter must be
  registered via the `~psycopg2.extensions.register_adapter()` function.  A
  simple example wrapper is `!psycopg2.extras.UUID_adapter` used by the
  `~psycopg2.extras.register_uuid()` function.

A convenient object to write adapters is the `~psycopg2.extensions.AsIs`
wrapper, whose `!getquoted()` result is simply the `!str()`\ ing conversion of
the wrapped object.

.. index::
    single: Example; Types adaptation

Example: mapping of a `!Point` class into the |point|_ PostgreSQL
geometric type:

.. doctest::

    >>> from psycopg2.extensions import adapt, register_adapter, AsIs

    >>> class Point(object):
    ...    def __init__(self, x, y):
    ...        self.x = x
    ...        self.y = y

    >>> def adapt_point(point):
    ...     x = adapt(point.x).getquoted()
    ...     y = adapt(point.y).getquoted()
    ...     return AsIs("'(%s, %s)'" % (x, y))

    >>> register_adapter(Point, adapt_point)

    >>> cur.execute("INSERT INTO atable (apoint) VALUES (%s)",
    ...             (Point(1.23, 4.56),))


.. |point| replace:: :sql:`point`
.. _point: https://www.postgresql.org/docs/current/static/datatype-geometric.html#DATATYPE-GEOMETRIC

The above function call results in the SQL command::

    INSERT INTO atable (apoint) VALUES ('(1.23, 4.56)');



.. index:: Type casting

.. _type-casting-from-sql-to-python:

Type casting of SQL types into Python objects
---------------------------------------------

PostgreSQL objects read from the database can be adapted to Python objects
through an user-defined adapting function.  An adapter function takes two
arguments: the object string representation as returned by PostgreSQL and the
cursor currently being read, and should return a new Python object.  For
example, the following function parses the PostgreSQL :sql:`point`
representation into the previously defined `!Point` class:

    >>> def cast_point(value, cur):
    ...    if value is None:
    ...        return None
    ...
    ...    # Convert from (f1, f2) syntax using a regular expression.
    ...    m = re.match(r"\(([^)]+),([^)]+)\)", value)
    ...    if m:
    ...        return Point(float(m.group(1)), float(m.group(2)))
    ...    else:
    ...        raise InterfaceError("bad point representation: %r" % value)


In order to create a mapping from a PostgreSQL type (either standard or
user-defined), its OID must be known. It can be retrieved either by the second
column of the `cursor.description`:

    >>> cur.execute("SELECT NULL::point")
    >>> point_oid = cur.description[0][1]
    >>> point_oid
    600

or by querying the system catalog for the type name and namespace (the
namespace for system objects is :sql:`pg_catalog`):

    >>> cur.execute("""
    ...    SELECT pg_type.oid
    ...      FROM pg_type JOIN pg_namespace
    ...             ON typnamespace = pg_namespace.oid
    ...     WHERE typname = %(typename)s
    ...       AND nspname = %(namespace)s""",
    ...    {'typename': 'point', 'namespace': 'pg_catalog'})
    >>> point_oid = cur.fetchone()[0]
    >>> point_oid
    600

After you know the object OID, you can create and register the new type:

    >>> POINT = psycopg2.extensions.new_type((point_oid,), "POINT", cast_point)
    >>> psycopg2.extensions.register_type(POINT)

The `~psycopg2.extensions.new_type()` function binds the object OIDs
(more than one can be specified) to the adapter function.
`~psycopg2.extensions.register_type()` completes the spell.  Conversion
is automatically performed when a column whose type is a registered OID is
read:

    >>> cur.execute("SELECT '(10.2,20.3)'::point")
    >>> point = cur.fetchone()[0]
    >>> print type(point), point.x, point.y
    <class 'Point'> 10.2 20.3

A typecaster created by `!new_type()` can be also used with
`~psycopg2.extensions.new_array_type()` to create a typecaster converting a
PostgreSQL array into a Python list.


.. index::
    pair: Asynchronous; Notifications
    pair: LISTEN; SQL command
    pair: NOTIFY; SQL command

.. _async-notify:

Asynchronous notifications
--------------------------

Psycopg allows asynchronous interaction with other database sessions using the
facilities offered by PostgreSQL commands |LISTEN|_ and |NOTIFY|_. Please
refer to the PostgreSQL documentation for examples about how to use this form of
communication.

Notifications are instances of the `~psycopg2.extensions.Notify` object made
available upon reception in the `connection.notifies` list. Notifications can
be sent from Python code simply executing a :sql:`NOTIFY` command in an
`~cursor.execute()` call.

Because of the way sessions interact with notifications (see |NOTIFY|_
documentation), you should keep the connection in `~connection.autocommit`
mode if you wish to receive or send notifications in a timely manner.

.. |LISTEN| replace:: :sql:`LISTEN`
.. _LISTEN: https://www.postgresql.org/docs/current/static/sql-listen.html
.. |NOTIFY| replace:: :sql:`NOTIFY`
.. _NOTIFY: https://www.postgresql.org/docs/current/static/sql-notify.html

Notifications are received after every query execution. If the user is
interested in receiving notifications but not in performing any query, the
`~connection.poll()` method can be used to check for new messages without
wasting resources.

A simple application could poll the connection from time to time to check if
something new has arrived. A better strategy is to use some I/O completion
function such as :py:func:`~select.select` to sleep until awakened by the kernel when there is
some data to read on the connection, thereby using no CPU unless there is
something to read::

    import select
    import psycopg2
    import psycopg2.extensions

    conn = psycopg2.connect(DSN)
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

    curs = conn.cursor()
    curs.execute("LISTEN test;")

    print "Waiting for notifications on channel 'test'"
    while True:
        if select.select([conn],[],[],5) == ([],[],[]):
            print "Timeout"
        else:
            conn.poll()
            while conn.notifies:
                notify = conn.notifies.pop(0)
                print "Got NOTIFY:", notify.pid, notify.channel, notify.payload

Running the script and executing a command such as :sql:`NOTIFY test, 'hello'`
in a separate :program:`psql` shell, the output may look similar to:

.. code-block:: none

    Waiting for notifications on channel 'test'
    Timeout
    Timeout
    Got NOTIFY: 6535 test hello
    Timeout
    ...

Note that the payload is only available from PostgreSQL 9.0: notifications
received from a previous version server will have the
`~psycopg2.extensions.Notify.payload` attribute set to the empty string.

.. versionchanged:: 2.3
    Added `~psycopg2.extensions.Notify` object and handling notification
    payload.

.. versionchanged:: 2.7
    The `~connection.notifies` attribute is writable: it is possible to
    replace it with any object exposing an `!append()` method. An useful
    example would be to use a `~collections.deque` object.


.. index::
    double: Asynchronous; Connection

.. _async-support:

Asynchronous support
--------------------

.. versionadded:: 2.2

Psycopg can issue asynchronous queries to a PostgreSQL database. An asynchronous
communication style is established passing the parameter *async*\=1 to the
`~psycopg2.connect()` function: the returned connection will work in
*asynchronous mode*.

In asynchronous mode, a Psycopg connection will rely on the caller to poll the
socket file descriptor, checking if it is ready to accept data or if a query
result has been transferred and is ready to be read on the client. The caller
can use the method `~connection.fileno()` to get the connection file
descriptor and `~connection.poll()` to make communication proceed according to
the current connection state.

The following is an example loop using methods `!fileno()` and `!poll()`
together with the Python :py:func:`~select.select` function in order to carry on
asynchronous operations with Psycopg::

    def wait(conn):
        while True:
            state = conn.poll()
            if state == psycopg2.extensions.POLL_OK:
                break
            elif state == psycopg2.extensions.POLL_WRITE:
                select.select([], [conn.fileno()], [])
            elif state == psycopg2.extensions.POLL_READ:
                select.select([conn.fileno()], [], [])
            else:
                raise psycopg2.OperationalError("poll() returned %s" % state)

The above loop of course would block an entire application: in a real
asynchronous framework, `!select()` would be called on many file descriptors
waiting for any of them to be ready.  Nonetheless the function can be used to
connect to a PostgreSQL server only using nonblocking commands and the
connection obtained can be used to perform further nonblocking queries.  After
`!poll()` has returned `~psycopg2.extensions.POLL_OK`, and thus `!wait()` has
returned, the connection can be safely used:

    >>> aconn = psycopg2.connect(database='test', async=1)
    >>> wait(aconn)
    >>> acurs = aconn.cursor()

Note that there are a few other requirements to be met in order to have a
completely non-blocking connection attempt: see the libpq documentation for
|PQconnectStart|_.

.. |PQconnectStart| replace:: `!PQconnectStart()`
.. _PQconnectStart: https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-PQCONNECTSTARTPARAMS

The same loop should be also used to perform nonblocking queries: after
sending a query via `~cursor.execute()` or `~cursor.callproc()`, call
`!poll()` on the connection available from `cursor.connection` until it
returns `!POLL_OK`, at which point the query has been completely sent to the
server and, if it produced data, the results have been transferred to the
client and available using the regular cursor methods:

    >>> acurs.execute("SELECT pg_sleep(5); SELECT 42;")
    >>> wait(acurs.connection)
    >>> acurs.fetchone()[0]
    42

When an asynchronous query is being executed, `connection.isexecuting()` returns
`!True`. Two cursors can't execute concurrent queries on the same asynchronous
connection.

There are several limitations in using asynchronous connections: the
connection is always in `~connection.autocommit` mode and it is not
possible to change it. So a
transaction is not implicitly started at the first query and is not possible
to use methods `~connection.commit()` and `~connection.rollback()`: you can
manually control transactions using `~cursor.execute()` to send database
commands such as :sql:`BEGIN`, :sql:`COMMIT` and :sql:`ROLLBACK`. Similarly
`~connection.set_session()` can't be used but it is still possible to invoke the
:sql:`SET` command with the proper :sql:`default_transaction_...` parameter.

With asynchronous connections it is also not possible to use
`~connection.set_client_encoding()`, `~cursor.executemany()`, :ref:`large
objects <large-objects>`, :ref:`named cursors <server-side-cursors>`.

:ref:`COPY commands <copy>` are not supported either in asynchronous mode, but
this will be probably implemented in a future release.




.. index::
    single: Greenlet
    single: Coroutine
    single: Eventlet
    single: gevent
    single: Wait callback

.. _green-support:

Support for coroutine libraries
-------------------------------

.. versionadded:: 2.2

Psycopg can be used together with coroutine_\-based libraries and participate
in cooperative multithreading.

Coroutine-based libraries (such as Eventlet_ or gevent_) can usually patch the
Python standard library in order to enable a coroutine switch in the presence of
blocking I/O: the process is usually referred as making the system *green*, in
reference to the `green threads`_.

Because Psycopg is a C extension module, it is not possible for coroutine
libraries to patch it: Psycopg instead enables cooperative multithreading by
allowing the registration of a *wait callback* using the
`psycopg2.extensions.set_wait_callback()` function. When a wait callback is
registered, Psycopg will use `libpq non-blocking calls`__ instead of the regular
blocking ones, and will delegate to the callback the responsibility to wait
for the socket to become readable or writable.

Working this way, the caller does not have the complete freedom to schedule the
socket check whenever they want as with an :ref:`asynchronous connection
<async-support>`, but has the advantage of maintaining a complete |DBAPI|
semantics: from the point of view of the end user, all Psycopg functions and
objects will work transparently in the coroutine environment (blocking the
calling green thread and giving other green threads the possibility to be
scheduled), allowing non modified code and third party libraries (such as
SQLAlchemy_) to be used in coroutine-based programs.

.. warning::
    Psycopg connections are not *green thread safe* and can't be used
    concurrently by different green threads. Trying to execute more than one
    command at time using one cursor per thread will result in an error (or a
    deadlock on versions before 2.4.2).

    Therefore, programmers are advised to either avoid sharing connections
    between coroutines or to use a library-friendly lock to synchronize shared
    connections, e.g. for pooling.

Coroutine libraries authors should provide a callback implementation (and
possibly a method to register it) to make Psycopg as green as they want. An
example callback (using `!select()` to block) is provided as
`psycopg2.extras.wait_select()`: it boils down to something similar to::

    def wait_select(conn):
        while True:
            state = conn.poll()
            if state == extensions.POLL_OK:
                break
            elif state == extensions.POLL_READ:
                select.select([conn.fileno()], [], [])
            elif state == extensions.POLL_WRITE:
                select.select([], [conn.fileno()], [])
            else:
                raise OperationalError("bad state from poll: %s" % state)

Providing callback functions for the single coroutine libraries is out of
psycopg2 scope, as the callback can be tied to the libraries' implementation
details. You can check the `psycogreen`_ project for further informations and
resources about the topic.

.. _coroutine: https://en.wikipedia.org/wiki/Coroutine
.. _greenlet: https://pypi.org/project/greenlet/
.. _green threads: https://en.wikipedia.org/wiki/Green_threads
.. _Eventlet: https://eventlet.net/
.. _gevent: http://www.gevent.org/
.. _SQLAlchemy: https://www.sqlalchemy.org/
.. _psycogreen: https://github.com/psycopg/psycogreen/
.. __: https://www.postgresql.org/docs/current/static/libpq-async.html

.. warning::

    :ref:`COPY commands <copy>` are currently not supported when a wait callback
    is registered, but they will be probably implemented in a future release.

    :ref:`Large objects <large-objects>` are not supported either: they are
    not compatible with asynchronous connections.


.. testcode::
    :hide:

    aconn.close()
    conn.rollback()
    cur.execute("DROP TABLE atable")
    conn.commit()
    cur.close()
    conn.close()



.. index::
    single: Replication

.. _replication-support:

Replication protocol support
----------------------------

.. versionadded:: 2.7

Modern PostgreSQL servers (version 9.0 and above) support replication.  The
replication protocol is built on top of the client-server protocol and can be
operated using ``libpq``, as such it can be also operated by ``psycopg2``.
The replication protocol can be operated on both synchronous and
:ref:`asynchronous <async-support>` connections.

Server version 9.4 adds a new feature called *Logical Replication*.

.. seealso::

   - PostgreSQL `Streaming Replication Protocol`__

   .. __: https://www.postgresql.org/docs/current/static/protocol-replication.html


Logical replication Quick-Start
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You must be using PostgreSQL server version 9.4 or above to run this quick
start.

Make sure that replication connections are permitted for user ``postgres`` in
``pg_hba.conf`` and reload the server configuration.  You also need to set
``wal_level=logical`` and ``max_wal_senders``, ``max_replication_slots`` to
value greater than zero in ``postgresql.conf`` (these changes require a server
restart).  Create a database ``psycopg2_test``.

Then run the following code to quickly try the replication support out.  This
is not production code -- it's only intended as a simple demo of logical
replication::

  from __future__ import print_function
  import sys
  import psycopg2
  import psycopg2.extras

  conn = psycopg2.connect('dbname=psycopg2_test user=postgres',
     connection_factory=psycopg2.extras.LogicalReplicationConnection)
  cur = conn.cursor()
  try:
      # test_decoding produces textual output
      cur.start_replication(slot_name='pytest', decode=True)
  except psycopg2.ProgrammingError:
      cur.create_replication_slot('pytest', output_plugin='test_decoding')
      cur.start_replication(slot_name='pytest', decode=True)

  class DemoConsumer(object):
      def __call__(self, msg):
          print(msg.payload)
          msg.cursor.send_feedback(flush_lsn=msg.data_start)

  democonsumer = DemoConsumer()

  print("Starting streaming, press Control-C to end...", file=sys.stderr)
  try:
     cur.consume_stream(democonsumer)
  except KeyboardInterrupt:
     cur.close()
     conn.close()
     print("The slot 'pytest' still exists. Drop it with "
        "SELECT pg_drop_replication_slot('pytest'); if no longer needed.",
        file=sys.stderr)
     print("WARNING: Transaction logs will accumulate in pg_xlog "
        "until the slot is dropped.", file=sys.stderr)


You can now make changes to the ``psycopg2_test`` database using a normal
psycopg2 session, ``psql``, etc. and see the logical decoding stream printed
by this demo client.

This will continue running until terminated with ``Control-C``.

For the details see :ref:`replication-objects`.