<!-- doc/src/sgml/pgaudit.sgml -->

<sect1 id="pgaudit" xreflabel="pgaudit">
  <title>pg_audit</title>

  <indexterm zone="pgaudit">
    <primary>pg_audit</primary>
  </indexterm>

  <para>
    The <filename>pg_audit</filename> extension provides detailed session
    and/or object audit logging via the standard logging facility.  The goal
    is to provide the tools needed to produce audit logs required to pass any
    government, financial, or ISO certification audit.
  </para>

  <para>
    An audit is an official inspection of an individual's or organization's
    accounts, typically by an independent body.  The information gathered by
    <filename>pg_audit</filename> is properly called an audit trail or audit
    log.  The term audit log is used in this documentation.
  </para>

  <sect2>
    <title>Why <literal>pg_audit</>?</title>

    <para>
      Basic statement logging can be provided by the standard logging facility
      using <literal>log_statement = all</>.  This is acceptable for monitoring
      and other usages but does not provide the level of detail generally
      required for an audit.  It is not enough to have a list of all the
      operations performed against the database. It must also be possible to
      find particular statements that are of interest to an auditor.  The
      standard logging facility shows what the user requested, while
      <literal>pg_audit</> focuses on the details of what happened while
      the database was satisfying the request.
    </para>

    <para>
      For example, an auditor may want to verify that a particular table was
      created inside a documented maintenance window.  This might seem like a
      simple job for grep, but what if you are presented with something like
      this (intentionally obfuscated) example:
    </para>

    <programlisting>
DO $$
BEGIN
    EXECUTE 'CREATE TABLE import' || 'ant_table (id INT)';
END $$;
    </programlisting>

    <para>
      Standard logging will give you this:
    </para>

    <programlisting>
LOG:  statement: DO $$
BEGIN
    EXECUTE 'CREATE TABLE import' || 'ant_table (id INT)';
END $$;
    </programlisting>

    <para>
      It appears that finding the table of interest may require some knowledge
      of the code in cases where tables are created dynamically.  This is not
      ideal since it would be preferable to just search on the table name.
      This is where <literal>pg_audit</> comes in.  For the same input,
      it will produce this output in the log:
    </para>

    <programlisting>
AUDIT: SESSION,33,1,FUNCTION,DO,,,"DO $$
BEGIN
    EXECUTE 'CREATE TABLE import' || 'ant_table (id INT)';
END $$;"
AUDIT: SESSION,33,2,DDL,CREATE TABLE,TABLE,public.important_table,CREATE TABLE important_table (id INT)
    </programlisting>

    <para>
      Not only is the <literal>DO</> block logged, but substatement 2 contains
      the full text of the <literal>CREATE TABLE</> with the statement type,
      object type, and fully-qualified name to make searches easy.
    </para>

    <para>
      When logging <literal>SELECT</> and <literal>DML</> statements,
      <literal>pg_audit</> can be configured to log a separate entry for each
      relation referenced in a statement.  No parsing is required to find all
      statements that touch a particular table.  In fact, the goal is that the
      statement text is provided primarily for deep forensics and should not be
      required for an audit.
    </para>
  </sect2>

  <sect2>
    <title>Usage Considerations</title>

    <para>
      Depending on settings, it is possible for <literal>pg_audit</literal> to
      generate an enormous volume of logging.  Be careful to determine
      exactly what needs to be audit logged in your environment to avoid
      logging too much.
    </para>

    <para>
      For example, when working in an OLAP environment it would probably not be
      wise to audit log inserts into a large fact table.  The size of the log
      file will likely be many times the actual data size of the inserts because
      the log file is expressed as text.  Since logs are generally stored on
      the same volume as the OS, this may lead to disk space being exhausted
      very quickly.  In cases where it is not possible to limit audit logging to
      certain tables, be sure to assess the performance impact while testing
      and allocate plenty of space on the log volume.  This may also be true for
      OLTP environments.  Even if the insert volume is not as high, the
      performance impact of audit logging may still noticeably affect latency.
    </para>

    <para>
      To limit the number of relations audit logged for <literal>SELECT</>
      and <literal>DML</> statements, consider using object audit logging
      (see <xref linkend="pgaudit-object-audit-logging">).  Object audit logging
      lets you select which relations are logged, which reduces the overall
      log volume.  However, new relations are not logged until they are
      explicitly added to object audit logging.  A programmatic
      solution where specified tables are excluded from logging and all others
      are included may be a good option in this case.
    </para>
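
    <para>
      One way such a programmatic solution might look is sketched below.  It
      assumes that <literal>pg_audit.role</> is set to a role named
      <literal>auditor</>, that all tables of interest live in the
      <literal>public</> schema, and that <literal>fact_sales</> is the table
      to exclude; all three names are hypothetical.  The block must be run
      again whenever new tables are created.
    </para>

    <programlisting>
DO $$
DECLARE
    rel record;
BEGIN
    -- Visit every table in the schema except those on the exclusion list.
    FOR rel IN
        SELECT schemaname, tablename
          FROM pg_catalog.pg_tables
         WHERE schemaname = 'public'
           AND tablename NOT IN ('fact_sales')
    LOOP
        -- Granting a privilege to the audit role enables object audit
        -- logging of the matching command on that table.
        EXECUTE format('GRANT SELECT, INSERT, UPDATE, DELETE ON %I.%I TO auditor',
                       rel.schemaname, rel.tablename);
    END LOOP;
END $$;
    </programlisting>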
  </sect2>

  <sect2>
    <title>Settings</title>

    <para>
      Settings may be modified only by a superuser. Allowing normal users to
      change their settings would defeat the point of an audit log.
    </para>

    <para>
      Settings can be specified globally (in
      <filename>postgresql.conf</filename> or using
      <literal>ALTER SYSTEM ... SET</>), at the database level (using
      <literal>ALTER DATABASE ... SET</literal>), or at the role level (using
      <literal>ALTER ROLE ... SET</literal>).  Note that settings are not
      inherited through normal role inheritance and <literal>SET ROLE</> will
      not alter a user's <literal>pg_audit</> settings.  This is a limitation
      of the roles system and not inherent to <literal>pg_audit</>.
    </para>
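
    <para>
      As a minimal sketch of the three levels, the statements below set
      values globally, for one database, and for one role.  The database
      <literal>appdb</> and the role <literal>app_user</> are hypothetical
      names used only for illustration, and the chosen classes are arbitrary.
    </para>

    <programlisting>
-- Global: written to postgresql.auto.conf and applied on configuration reload.
ALTER SYSTEM SET pg_audit.log = 'ddl, role';
SELECT pg_reload_conf();

-- Database level (appdb is a hypothetical database name).
ALTER DATABASE appdb SET pg_audit.log = 'read, write';

-- Role level (app_user is a hypothetical role name).
ALTER ROLE app_user SET pg_audit.log_relation = on;
    </programlisting>

    <para>
      Database-level and role-level values take effect in sessions started
      after the change.
    </para>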

    <para>
      The <literal>pg_audit</> extension must be loaded in
      <xref linkend="guc-shared-preload-libraries">.  Otherwise, an error
      will be raised at load time and no audit logging will occur.
    </para>
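
    <para>
      For illustration, the library could be preloaded as shown below.  This
      sketch assumes that the library is named <literal>pg_audit</>, matching
      the extension name.  Note that the statement replaces the current list,
      so any libraries that are already preloaded must be included in the
      value.
    </para>

    <programlisting>
-- Assumption: the library file is named pg_audit.
-- This overwrites the existing list of preloaded libraries.
ALTER SYSTEM SET shared_preload_libraries = 'pg_audit';
-- A server restart is required before the change takes effect.
    </programlisting>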

    <variablelist>
      <varlistentry id="guc-pgaudit-log" xreflabel="pg_audit.log">
        <term><varname>pg_audit.log</varname> (<type>string</type>)
          <indexterm>
            <primary><varname>pg_audit.log</> configuration parameter</primary>
          </indexterm>
        </term>
        <listitem>
          <para>
            Specifies which classes of statements will be logged by session
            audit logging.  Possible values are:
          </para>

          <itemizedlist>
            <listitem>
              <para>
                <literal>READ</literal> - <literal>SELECT</literal> and
                <literal>COPY</literal> when the source is a relation or a
                query.
              </para>
            </listitem>
            <listitem>
              <para>
                <literal>WRITE</literal> - <literal>INSERT</literal>,
                <literal>UPDATE</literal>, <literal>DELETE</literal>,
                <literal>TRUNCATE</literal>, and <literal>COPY</literal> when the
                destination is a relation.
              </para>
            </listitem>
            <listitem>
              <para>
                <literal>FUNCTION</literal> - Function calls and
                <literal>DO</literal> blocks.
              </para>
            </listitem>
            <listitem>
              <para>
                <literal>ROLE</literal> - Statements related to roles and
                privileges: <literal>GRANT</literal>,
                <literal>REVOKE</literal>,
                <literal>CREATE/ALTER/DROP ROLE</literal>.
              </para>
            </listitem>
            <listitem>
              <para>
                <literal>DDL</literal> - All <literal>DDL</> that is not included
                in the <literal>ROLE</> class.
              </para>
            </listitem>
            <listitem>
              <para>
                <literal>MISC</literal> - Miscellaneous commands, e.g.
                <literal>DISCARD</literal>, <literal>FETCH</literal>,
                <literal>CHECKPOINT</literal>, <literal>VACUUM</literal>.
              </para>
            </listitem>
          </itemizedlist>

          <para>
            Multiple classes can be provided using a comma-separated list and
            classes can be subtracted by prefacing the class with a
            <literal>-</> sign (see <xref linkend="pgaudit-session-audit-logging">).
            The default is <literal>none</>.
          </para>
        </listitem>
      </varlistentry>

      <varlistentry id="guc-pgaudit-log-catalog" xreflabel="pg_audit.log_catalog">
        <term><varname>pg_audit.log_catalog</varname> (<type>boolean</type>)
          <indexterm>
            <primary><varname>pg_audit.log_catalog</> configuration parameter</primary>
          </indexterm>
        </term>
        <listitem>
          <para>
            Specifies that session logging should be enabled in the case where all
            relations in a statement are in pg_catalog.  Disabling this setting
            will reduce noise in the log from tools like psql and PgAdmin that query
            the catalog heavily. The default is <literal>on</>.
          </para>
        </listitem>
      </varlistentry>

      <varlistentry id="guc-pgaudit-log-level" xreflabel="pg_audit.log_level">
        <term><varname>pg_audit.log_level</varname> (<type>enum</type>)
          <indexterm>
            <primary><varname>pg_audit.log_level</> configuration parameter</primary>
          </indexterm>
        </term>
        <listitem>
          <para>
            Specifies the log level that will be used for log entries (see
            <xref linkend="RUNTIME-CONFIG-SEVERITY-LEVELS"> for valid levels).
            This setting is used for regression testing and may also be useful
            to end users for testing or other purposes.  It is not intended to
            be used in a production environment as it may leak which statements
            are being logged to the user. The default is <literal>log</>.
          </para>
        </listitem>
      </varlistentry>

      <varlistentry id="guc-pgaudit-log-parameter" xreflabel="pg_audit.log_parameter">
        <term><varname>pg_audit.log_parameter</varname> (<type>boolean</type>)
          <indexterm>
            <primary><varname>pg_audit.log_parameter</> configuration parameter</primary>
          </indexterm>
        </term>
        <listitem>
          <para>
            Specifies that audit logging should include the parameters that
            were passed with the statement.  When parameters are present they will
            be included in CSV format after the statement text. The default is
            <literal>off</>.
          </para>
        </listitem>
      </varlistentry>

      <varlistentry id="guc-pgaudit-log-relation" xreflabel="pg_audit.log_relation">
        <term><varname>pg_audit.log_relation</varname> (<type>boolean</type>)
          <indexterm>
            <primary><varname>pg_audit.log_relation</> configuration parameter</primary>
          </indexterm>
        </term>
        <listitem>
          <para>
            Specifies whether session audit logging should create a separate
            log entry for each relation (<literal>TABLE</>, <literal>VIEW</>,
            etc.) referenced in a <literal>SELECT</> or <literal>DML</>
            statement.  This is a useful shortcut for exhaustive logging
            without using object audit logging.  The default is
            <literal>off</>.
          </para>
        </listitem>
      </varlistentry>

      <varlistentry id="guc-pgaudit-log-statement-once" xreflabel="pg_audit.log_statement_once">
        <term><varname>pg_audit.log_statement_once</varname> (<type>boolean</type>)
          <indexterm>
            <primary><varname>pg_audit.log_statement_once</> configuration parameter</primary>
          </indexterm>
        </term>
        <listitem>
          <para>
            Specifies whether logging will include the statement text and
            parameters with the first log entry for a statement/substatement
            combination or with every entry.  Enabling this setting will
            result in less verbose logging but may make it more difficult to
            determine the statement that generated a log entry, though the
            statement/substatement pair along with the process id should suffice
            to identify the statement text logged with a previous entry.  The
            default is <literal>off</>.
          </para>
        </listitem>
      </varlistentry>

      <varlistentry id="guc-pgaudit-role" xreflabel="pg_audit.role">
        <term><varname>pg_audit.role</varname> (<type>string</type>)
          <indexterm>
            <primary><varname>pg_audit.role</> configuration parameter</primary>
          </indexterm>
        </term>
        <listitem>
          <para>
            Specifies the master role to use for object audit logging.  Multiple
            audit roles can be defined by granting them to the master role.
            This allows multiple groups to be in charge of different aspects
            of audit logging.  There is no default.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
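
    <para>
      The settings above can be combined.  The following sketch, with values
      chosen only for illustration, logs every class except
      <literal>MISC</>, skips statements that touch only
      <literal>pg_catalog</>, includes statement parameters, and logs the
      statement text only with the first entry for each
      statement/substatement combination.
    </para>

    <programlisting>
set pg_audit.log = 'all, -misc';
set pg_audit.log_catalog = off;
set pg_audit.log_parameter = on;
set pg_audit.log_statement_once = on;
    </programlisting>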
  </sect2>

  <sect2 id="pgaudit-session-audit-logging">
    <title>Session Audit Logging</title>

    <para>
      Session audit logging provides detailed logs of all statements executed
      by a user in the backend.
    </para>

    <sect3>
      <title>Configuration</title>

      <para>
        Session logging is enabled with the <xref linkend="guc-pgaudit-log">
        setting.

        Enable session logging for all <literal>DML</> and <literal>DDL</> and
        log all relations in <literal>DML</> statements:
          <programlisting>
set pg_audit.log = 'write, ddl';
set pg_audit.log_relation = on;
          </programlisting>
      </para>

      <para>
        Enable session logging for all commands except <literal>MISC</> and
        raise audit log messages as <literal>NOTICE</>:
          <programlisting>
set pg_audit.log = 'all, -misc';
set pg_audit.log_level = notice;
          </programlisting>
      </para>
    </sect3>

    <sect3>
      <title>Example</title>

      <para>
        In this example session audit logging is used for logging
        <literal>DDL</> and <literal>SELECT</> statements.  Note that the
        insert statement is not logged since the <literal>WRITE</> class
        is not enabled.
      </para>

      <para>
        SQL:
      </para>
      <programlisting>
set pg_audit.log = 'read, ddl';

create table account
(
    id int,
    name text,
    password text,
    description text
);

insert into account (id, name, password, description)
             values (1, 'user1', 'HASH1', 'blah, blah');

select *
    from account;
      </programlisting>

      <para>
        Log Output:
      </para>

      <programlisting>
AUDIT: SESSION,1,1,DDL,CREATE TABLE,TABLE,public.account,create table account
(
    id int,
    name text,
    password text,
    description text
);
AUDIT: SESSION,2,1,READ,SELECT,,,select *
    from account
      </programlisting>
    </sect3>
  </sect2>

  <sect2 id="pgaudit-object-audit-logging">
    <title>Object Audit Logging</title>

    <para>
      Object audit logging logs statements that affect a particular relation.
      Only <literal>SELECT</>, <literal>INSERT</>, <literal>UPDATE</> and
      <literal>DELETE</> commands are supported.  <literal>TRUNCATE</> is not
      included in object audit logging.
    </para>

    <para>
      Object audit logging is intended to be a finer-grained replacement for
      <literal>pg_audit.log = 'read, write'</literal>.  As such, it may not
      make sense to use them in conjunction but one possible scenario would
      be to use session logging to capture each statement and then supplement
      that with object logging to get more detail about specific relations.
    </para>

    <sect3>
      <title>Configuration</title>

      <para>
        Object-level audit logging is implemented via the roles system.  The
        <xref linkend="guc-pgaudit-role"> setting defines the role that
        will be used for audit logging.  A relation (<literal>TABLE</>,
        <literal>VIEW</>, etc.) will be audit logged when the audit role has
        permissions for the command executed or inherits the permissions from
        another role.  This allows you to effectively have multiple audit roles
        even though there is a single master role in any context.
      </para>

      <para>
      Set <xref linkend="guc-pgaudit-role"> to <literal>auditor</> and
      grant <literal>SELECT</> and <literal>DELETE</> privileges on the
      <literal>account</> table.  Any <literal>SELECT</> or
      <literal>DELETE</> statements on <literal>account</> will now be
      logged:
      </para>

      <programlisting>
set pg_audit.role = 'auditor';

grant select, delete
   on public.account
   to auditor;
      </programlisting>
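
      <para>
        Because privileges inherited from other roles are also considered,
        separate audit roles can be granted to the master role.  In the
        sketch below, the role names <literal>audit_finance</> and
        <literal>audit_security</> are hypothetical, and the
        <literal>auditor</> role is assumed not to exist yet.  Each group
        manages the grants on its own role.
      </para>

      <programlisting>
create role auditor;
create role audit_finance;
create role audit_security;

-- The master role becomes a member of both audit roles and inherits
-- their privileges.
grant audit_finance, audit_security
   to auditor;

set pg_audit.role = 'auditor';

-- SELECT statements on account are now logged through audit_security.
grant select
   on public.account
   to audit_security;
      </programlisting>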
    </sect3>

    <sect3>
      <title>Example</title>

      <para>
        In this example object audit logging is used to illustrate how a
        granular approach may be taken towards logging of <literal>SELECT</>
        and <literal>DML</> statements.  Note that logging on the
        <literal>account</> table is controlled by column-level permissions,
        while logging on <literal>account_role_map</> is table-level.
      </para>

      <para>
        SQL:
      </para>

        <programlisting>
set pg_audit.role = 'auditor';

create table account
(
    id int,
    name text,
    password text,
    description text
);

grant select (password)
   on public.account
   to auditor;

select id, name
  from account;

select password
  from account;

grant update (name, password)
   on public.account
   to auditor;

update account
   set description = 'yada, yada';

update account
   set password = 'HASH2';

create table account_role_map
(
    account_id int,
    role_id int
);

grant select
   on public.account_role_map
   to auditor;

select account.password,
       account_role_map.role_id
  from account
       inner join account_role_map
            on account.id = account_role_map.account_id;
        </programlisting>

      <para>
        Log Output:
      </para>

      <programlisting>
AUDIT: OBJECT,1,1,READ,SELECT,TABLE,public.account,select password
  from account
AUDIT: OBJECT,2,1,WRITE,UPDATE,TABLE,public.account,update account
   set password = 'HASH2'
AUDIT: OBJECT,3,1,READ,SELECT,TABLE,public.account,select account.password,
       account_role_map.role_id
  from account
       inner join account_role_map
            on account.id = account_role_map.account_id
AUDIT: OBJECT,3,1,READ,SELECT,TABLE,public.account_role_map,select account.password,
       account_role_map.role_id
  from account
       inner join account_role_map
            on account.id = account_role_map.account_id
      </programlisting>
    </sect3>
  </sect2>

  <sect2>
    <title>Format</title>

    <para>
      Audit entries are written to the standard logging facility and contain
      the following columns in comma-separated format:

      <note>
        <para>
          Output is compliant CSV format only if the log line prefix portion
          of each log entry is removed.
        </para>
      </note>

      <itemizedlist>
        <listitem>
          <para>
            <literal>AUDIT_TYPE</> - <literal>SESSION</> or
            <literal>OBJECT</>.
          </para>
        </listitem>
        <listitem>
          <para>
            <literal>STATEMENT_ID</> - Unique statement ID for this session.
            Each statement ID represents a backend call.  Statement IDs are
            sequential even if some statements are not logged.  There may be
            multiple entries for a statement ID when more than one relation
            is logged.
          </para>
        </listitem>
        <listitem>
          <para>
            <literal>SUBSTATEMENT_ID</> - Sequential ID for each
            substatement within the main statement.  For example, calling
            a function from a query.  Substatement IDs are continuous
            even if some substatements are not logged.  There may be multiple
            entries for a substatement ID when more than one relation is logged.
          </para>
        </listitem>
        <listitem>
          <para>
            <literal>CLASS</> - e.g. <literal>READ</>,
            <literal>ROLE</> (see <xref linkend="guc-pgaudit-log">).
          </para>
        </listitem>
        <listitem>
          <para>
            <literal>COMMAND</> - e.g. <literal>ALTER TABLE</>,
            <literal>SELECT</>.
          </para>
        </listitem>
        <listitem>
          <para>
            <literal>OBJECT_TYPE</> - <literal>TABLE</>,
            <literal>INDEX</>, <literal>VIEW</>, etc.
            Available for <literal>SELECT</>, <literal>DML</> and most
            <literal>DDL</> statements.
          </para>
        </listitem>
        <listitem>
          <para>
            <literal>OBJECT_NAME</> - The fully-qualified object name
            (e.g. public.account).  Available for <literal>SELECT</>,
            <literal>DML</> and most <literal>DDL</> statements.
          </para>
        </listitem>
        <listitem>
          <para>
            <literal>STATEMENT</> - Statement executed on the backend.
          </para>
        </listitem>
      </itemizedlist>
    </para>

    <para>
      Use <xref linkend="guc-log-line-prefix"> to add any other fields that
      are needed to satisfy your audit log requirements.  A typical log line
      prefix might be <literal>'%m %u %d: '</> which would provide the date/time,
      user name, and database name for each audit log entry.
    </para>
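
    <para>
      As a brief sketch, the prefix mentioned above could be applied globally
      as follows.  The prefix value is only an example and should be adapted
      to local audit requirements.
    </para>

    <programlisting>
-- %m = timestamp with milliseconds, %u = user name, %d = database name.
ALTER SYSTEM SET log_line_prefix = '%m %u %d: ';
SELECT pg_reload_conf();
    </programlisting>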
  </sect2>

  <sect2>
    <title>Caveats</title>

    <itemizedlist>
      <listitem>
        <para>
          Object renames are logged under the name they were renamed to.
          For example, renaming a table will produce the following result:
        </para>

        <programlisting>
ALTER TABLE test RENAME TO test2;

AUDIT: SESSION,36,1,DDL,ALTER TABLE,TABLE,public.test2,ALTER TABLE test RENAME TO test2
        </programlisting>
      </listitem>

      <listitem>
        <para>
          It is possible to have a command logged more than once.  For example,
          when a table is created with a primary key specified at creation time
          the index for the primary key will be logged independently and another
          audit log will be made for the index under the create entry.  The
          multiple entries will however be contained within one statement ID.
        </para>
      </listitem>

      <listitem>
        <para>
          Autovacuum and Autoanalyze are not logged.
        </para>
      </listitem>

      <listitem>
        <para>
          Statements that are executed after a transaction enters an aborted state
          will not be audit logged.  However, the statement that caused the error
          and any subsequent statements executed in the aborted transaction will
          be logged as ERRORs by the standard logging facility.
        </para>
      </listitem>
    </itemizedlist>
  </sect2>

  <sect2>
    <title>Authors</title>

    <para>
      Abhijit Menon-Sen <email>ams@2ndQuadrant.com</email>, Ian Barwick <email>ian@2ndQuadrant.com</email>, and David Steele <email>david@pgmasters.net</email>.
    </para>
  </sect2>
</sect1>