summaryrefslogtreecommitdiff
path: root/doc/src/sgml/advanced.sgml
blob: 640b5562b0c7bb446c77226a56d3eb45faa452e4 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/advanced.sgml,v 1.37 2003/09/30 03:22:32 tgl Exp $
-->

 <chapter id="tutorial-advanced">
  <title>Advanced Features</title>

  <sect1 id="tutorial-advanced-intro">
   <title>Introduction</title>

   <para>
    In the previous chapter we have covered the basics of using
    <acronym>SQL</acronym> to store and access your data in
    <productname>PostgreSQL</productname>.  We will now discuss some
    more advanced features of <acronym>SQL</acronym> that simplify
    management and prevent loss or corruption of your data.  Finally,
    we will look at some <productname>PostgreSQL</productname>
    extensions.
   </para>

   <para>
    This chapter will on occasion refer to examples found in <xref
    linkend="tutorial-sql"> to change or improve them, so it will be
    of advantage if you have read that chapter.  Some examples from
    this chapter can also be found in
    <filename>advanced.sql</filename> in the tutorial directory.  This
    file also contains some example data to load, which is not
    repeated here.  (Refer to <xref linkend="tutorial-sql-intro"> for
    how to use the file.)
   </para>
  </sect1>


  <sect1 id="tutorial-views">
   <title>Views</title>

   <indexterm zone="tutorial-views">
    <primary>view</primary>
   </indexterm>

   <para>
    Refer back to the queries in <xref linkend="tutorial-join">.
    Suppose the combined listing of weather records and city location
    is of particular interest to your application, but you do not want
    to type the query each time you need it.  You can create a
    <firstterm>view</firstterm> over the query, which gives a name to
    the query that you can refer to like an ordinary table.

<programlisting>
CREATE VIEW myview AS
    SELECT city, temp_lo, temp_hi, prcp, date, location
        FROM weather, cities
        WHERE city = name;

SELECT * FROM myview;
</programlisting>
   </para>

   <para>
    Making liberal use of views is a key aspect of good SQL database
    design.  Views allow you to encapsulate the details of the
    structure of your tables, which may change as your application
    evolves, behind consistent interfaces.
   </para>

   <para>
    Views can be used in almost any place a real table can be used.
    Building views upon other views is not uncommon.
   </para>
  </sect1>


  <sect1 id="tutorial-fk">
   <title>Foreign Keys</title>

   <indexterm zone="tutorial-fk">
    <primary>foreign key</primary>
   </indexterm>

   <indexterm zone="tutorial-fk">
    <primary>referential integrity</primary>
   </indexterm>

   <para>
    Recall the <classname>weather</classname> and
    <classname>cities</classname> tables from <xref
    linkend="tutorial-sql">.  Consider the following problem:  You
    want to make sure that no one can insert rows in the
    <classname>weather</classname> table that do not have a matching
    entry in the <classname>cities</classname> table.  This is called
    maintaining the <firstterm>referential integrity</firstterm> of
    your data.  In simplistic database systems this would be
    implemented (if at all) by first looking at the
    <classname>cities</classname> table to check if a matching record
    exists, and then inserting or rejecting the new
    <classname>weather</classname> records.  This approach has a
    number of problems and is very inconvenient, so
    <productname>PostgreSQL</productname> can do this for you.
   </para>

   <para>
    The new declaration of the tables would look like this:

<programlisting>
CREATE TABLE cities (
 	city     varchar(80) primary key,
	 location point
);

CREATE TABLE weather (
	city      varchar(80) references cities,
	temp_lo   int,
	temp_hi   int,
	prcp      real,
	date      date
);
</programlisting>

    Now try inserting an invalid record:

<programlisting>
INSERT INTO weather VALUES ('Berkeley', 45, 53, 0.0, '1994-11-28');
</programlisting>

<screen>
ERROR:  insert or update on table "weather" violates foreign key constraint "$1"
DETAIL:  Key (city)=(Berkeley) is not present in table "cities".
</screen>
   </para>

   <para>
    The behavior of foreign keys can be finely tuned to your
    application.  We will not go beyond this simple example in this
    tutorial, but just refer you to <xref linkend="ddl">
    for more information.  Making correct use of
    foreign keys will definitely improve the quality of your database
    applications, so you are strongly encouraged to learn about them.
   </para>
  </sect1>


  <sect1 id="tutorial-transactions">
   <title>Transactions</title>

   <indexterm zone="tutorial-transactions">
    <primary>transaction</primary>
   </indexterm>

   <para>
    <firstterm>Transactions</> are a fundamental concept of all database
    systems.  The essential point of a transaction is that it bundles
    multiple steps into a single, all-or-nothing operation.  The intermediate
    states between the steps are not visible to other concurrent transactions,
    and if some failure occurs that prevents the transaction from completing,
    then none of the steps affect the database at all.
   </para>

   <para>
    For example, consider a bank database that contains balances for various
    customer accounts, as well as total deposit balances for branches.
    Suppose that we want to record a payment of $100.00 from Alice's account
    to Bob's account.  Simplifying outrageously, the SQL commands for this
    might look like

<programlisting>
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
UPDATE branches SET balance = balance - 100.00
    WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Alice');
UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Bob';
UPDATE branches SET balance = balance + 100.00
    WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Bob');
</programlisting>
   </para>

   <para>
    The details of these commands are not important here; the important
    point is that there are several separate updates involved to accomplish
    this rather simple operation.  Our bank's officers will want to be
    assured that either all these updates happen, or none of them happen.
    It would certainly not do for a system failure to result in Bob
    receiving $100.00 that was not debited from Alice.  Nor would Alice long
    remain a happy customer if she was debited without Bob being credited.
    We need a guarantee that if something goes wrong partway through the
    operation, none of the steps executed so far will take effect.  Grouping
    the updates into a <firstterm>transaction</> gives us this guarantee.
    A transaction is said to be <firstterm>atomic</>: from the point of
    view of other transactions, it either happens completely or not at all.
   </para>

   <para>
    We also want a
    guarantee that once a transaction is completed and acknowledged by
    the database system, it has indeed been permanently recorded
    and won't be lost even if a crash ensues shortly thereafter.
    For example, if we are recording a cash withdrawal by Bob,
    we do not want any chance that the debit to his account will
    disappear in a crash just as he walks out the bank door.
    A transactional database guarantees that all the updates made by
    a transaction are logged in permanent storage (i.e., on disk) before
    the transaction is reported complete.
   </para>

   <para>
    Another important property of transactional databases is closely
    related to the notion of atomic updates: when multiple transactions
    are running concurrently, each one should not be able to see the
    incomplete changes made by others.  For example, if one transaction
    is busy totalling all the branch balances, it would not do for it
    to include the debit from Alice's branch but not the credit to
    Bob's branch, nor vice versa.  So transactions must be all-or-nothing
    not only in terms of their permanent effect on the database, but
    also in terms of their visibility as they happen.  The updates made
    so far by an open transaction are invisible to other transactions
    until the transaction completes, whereupon all the updates become
    visible simultaneously.
   </para>

   <para>
    In <productname>PostgreSQL</>, a transaction is set up by surrounding
    the SQL commands of the transaction with
    <command>BEGIN</> and <command>COMMIT</> commands.  So our banking
    transaction would actually look like

<programlisting>
BEGIN;
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
-- etc etc
COMMIT;
</programlisting>
   </para>

   <para>
    If, partway through the transaction, we decide we do not want to
    commit (perhaps we just noticed that Alice's balance went negative),
    we can issue the command <command>ROLLBACK</> instead of
    <command>COMMIT</>, and all our updates so far will be canceled.
   </para>

   <para>
    <productname>PostgreSQL</> actually treats every SQL statement as being
    executed within a transaction.  If you do not issue a <command>BEGIN</>
    command, 
    then each individual statement has an implicit <command>BEGIN</> and
    (if successful) <command>COMMIT</> wrapped around it.  A group of
    statements surrounded by <command>BEGIN</> and <command>COMMIT</>
    is sometimes called a <firstterm>transaction block</>.
   </para>

   <note>
    <para>
     Some client libraries issue <command>BEGIN</> and <command>COMMIT</>
     commands automatically, so that you may get the effect of transaction
     blocks without asking.  Check the documentation for the interface
     you are using.
    </para>
   </note>
  </sect1>


  <sect1 id="tutorial-inheritance">
   <title>Inheritance</title>

   <indexterm zone="tutorial-inheritance">
    <primary>inheritance</primary>
   </indexterm>

   <para>
    Inheritance is a concept from object-oriented databases.  It opens
    up interesting new possibilities of database design.
   </para>

   <para>
    Let's create two tables:  A table <classname>cities</classname>
    and a table <classname>capitals</classname>.  Naturally, capitals
    are also cities, so you want some way to show the capitals
    implicitly when you list all cities.  If you're really clever you
    might invent some scheme like this:

<programlisting>
CREATE TABLE capitals (
  name       text,
  population real,
  altitude   int,    -- (in ft)
  state      char(2)
);

CREATE TABLE non_capitals (
  name       text,
  population real,
  altitude   int     -- (in ft)
);

CREATE VIEW cities AS
  SELECT name, population, altitude FROM capitals
    UNION
  SELECT name, population, altitude FROM non_capitals;
</programlisting>

    This works OK as far as querying goes, but it gets ugly when you
    need to update several rows, to name one thing.
   </para>

   <para>
    A better solution is this:

<programlisting>
CREATE TABLE cities (
  name       text,
  population real,
  altitude   int     -- (in ft)
);

CREATE TABLE capitals (
  state      char(2)
) INHERITS (cities);
</programlisting>
   </para>

   <para>
    In this case, a row of <classname>capitals</classname>
    <firstterm>inherits</firstterm> all columns (<structfield>name</>,
    <structfield>population</>, and <structfield>altitude</>) from its
    <firstterm>parent</firstterm>, <classname>cities</classname>.  The
    type of the column <structfield>name</structfield> is
    <type>text</type>, a native <productname>PostgreSQL</productname>
    type for variable length character strings.  State capitals have
    an extra column, state, that shows their state.  In
    <productname>PostgreSQL</productname>, a table can inherit from
    zero or more other tables.
   </para>

   <para>
    For example, the  following  query finds the  names  of  all  cities,
    including  state capitals, that are located at an altitude 
    over 500 ft.:

<programlisting>
SELECT name, altitude
  FROM cities
  WHERE altitude &gt; 500;
</programlisting>

    which returns:

<screen>
   name    | altitude
-----------+----------
 Las Vegas |     2174
 Mariposa  |     1953
 Madison   |      845
(3 rows)
</screen>
   </para>

   <para>
    On the other hand, the  following  query  finds
    all  the cities that are not state capitals and
    are situated at an altitude of 500 ft. or higher:

<programlisting>
SELECT name, altitude
    FROM ONLY cities
    WHERE altitude &gt; 500;
</programlisting>

<screen>
   name    | altitude
-----------+----------
 Las Vegas |     2174
 Mariposa  |     1953
(2 rows)
</screen>
   </para>

   <para>
    Here the <literal>ONLY</literal> before <literal>cities</literal>
    indicates that the query should be run over only the
    <classname>cities</classname> table, and not tables below
    <classname>cities</classname> in the inheritance hierarchy.  Many
    of the commands that we have already discussed --
    <command>SELECT</command>, <command>UPDATE</command>, and
    <command>DELETE</command> -- support this <literal>ONLY</literal>
    notation.
   </para>
  </sect1>


  <sect1 id="tutorial-conclusion">
   <title>Conclusion</title>
 
   <para>
    <productname>PostgreSQL</productname> has many features not
    touched upon in this tutorial introduction, which has been
    oriented toward newer users of <acronym>SQL</acronym>.  These
    features are discussed in more detail in the remainder of this
    book.
   </para>

   <para>
    If you feel you need more introductory material, please visit the
    <ulink url="http://www.postgresql.org">PostgreSQL web
    site</ulink> for links to more resources.
   </para>
  </sect1>
 </chapter>

<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->