
=head1 NAME

DB_File - Perl5 access to Berkeley DB

=head1 SYNOPSIS

 use DB_File ;
  
 [$X =] tie %hash,  DB_File, $filename [, $flags, $mode, $DB_HASH] ;
 [$X =] tie %hash,  DB_File, $filename, $flags, $mode, $DB_BTREE ;
 [$X =] tie @array, DB_File, $filename, $flags, $mode, $DB_RECNO ;
   
 $status = $X->del($key [, $flags]) ;
 $status = $X->put($key, $value [, $flags]) ;
 $status = $X->get($key, $value [, $flags]) ;
 $status = $X->seq($key, $value [, $flags]) ;
 $status = $X->sync([$flags]) ;
 $status = $X->fd ;
    
 untie %hash ;
 untie @array ;

=head1 DESCRIPTION

B<DB_File> is a module which allows Perl programs to make use of 
the facilities provided by Berkeley DB.  If you intend to use this
module you should really have a copy of the Berkeley DB manual
page at hand. The interface defined here
mirrors the Berkeley DB interface closely.

Berkeley DB is a C library which provides a consistent interface to a number of 
database formats. 
B<DB_File> provides an interface to all three of the database types currently
supported by Berkeley DB.

The file types are:

=over 5

=item DB_HASH

This database type allows arbitrary key/data pairs to be stored in data files.
This is equivalent to the functionality provided by 
other hashing packages like DBM, NDBM, ODBM, GDBM, and SDBM.
Remember though, the files created using DB_HASH are 
not compatible with any of the other packages mentioned.

A default hashing algorithm, which will be adequate for most applications, 
is built into Berkeley DB.  
If you do need to use your own hashing algorithm it is possible to write your
own in Perl and have B<DB_File> use it instead.

=item DB_BTREE

The btree format allows arbitrary key/data pairs to be stored in a sorted, 
balanced tree.

As with the DB_HASH format, it is possible to provide a user-defined Perl routine
to perform the comparison of keys. By default, though, the keys are stored 
in lexical order.

=item DB_RECNO

DB_RECNO allows both fixed-length and variable-length flat text files to be 
manipulated using 
the same key/value pair interface as in DB_HASH and DB_BTREE. 
In this case the key will consist of a record (line) number. 

=back

=head2 How does DB_File interface to Berkeley DB?

B<DB_File> allows access to Berkeley DB files using the tie() mechanism
in Perl 5 (for full details, see L<perlfunc/tie()>).
This facility allows B<DB_File> to access Berkeley DB files using
either an associative array (for DB_HASH & DB_BTREE file types) or an
ordinary array (for the DB_RECNO file type).

In addition to the tie() interface, it is also possible to use most of the
functions provided in the Berkeley DB API.

=head2 Differences with Berkeley DB

Berkeley DB uses the function dbopen() to open or create a 
database. Below is the C prototype for dbopen().

      DB*
      dbopen (const char * file, int flags, int mode, 
              DBTYPE type, const void * openinfo)

The parameter C<type> is an enumeration which specifies which of the three
interface methods (DB_HASH, DB_BTREE or DB_RECNO) is to be used.
Depending on which of these is actually chosen, the final parameter,
C<openinfo>, points to a data structure which allows tailoring of the
specific interface method.

This interface is handled 
slightly differently in B<DB_File>. Here is an equivalent call using
B<DB_File>.

        tie %array, DB_File, $filename, $flags, $mode, $DB_HASH ;

The C<filename>, C<flags> and C<mode> parameters are the direct equivalent 
of their dbopen() counterparts. The final parameter $DB_HASH
performs the function of both the C<type> and C<openinfo>
parameters in dbopen().

In the example above $DB_HASH is actually a reference to a hash object.
B<DB_File> has three of these pre-defined references.
Apart from $DB_HASH, there are also $DB_BTREE and $DB_RECNO.

The keys allowed in each of these pre-defined references are limited to the names
used in the equivalent C structure.
So, for example, the $DB_HASH reference will only allow keys called C<bsize>,
C<cachesize>, C<ffactor>, C<hash>, C<lorder> and C<nelem>. 

To change one of these elements, just assign to it like this

	$DB_HASH->{cachesize} = 10000 ;
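
As a rough sketch of the restriction on element names, the fragment below
sets a valid element and then tries an invalid one. The name
C<nosuchelement> is made up for illustration, and the sketch assumes that the
installed B<DB_File> raises an exception when an unknown element is assigned,
which is how recent releases behave.

    use DB_File ;

    # cachesize is one of the element names that $DB_HASH accepts.
    $DB_HASH->{cachesize} = 10000 ;
    print "cachesize is $DB_HASH->{cachesize}\n" ;

    # "nosuchelement" is a made-up name, used only for illustration.
    # Assumption: DB_File croaks when an unknown element is assigned.
    eval { $DB_HASH->{nosuchelement} = 1 } ;
    my $error = $@ ;
    print "rejected: $error" if $error ;

Wrapping the assignment in C<eval> keeps a mistyped element name from
terminating the program.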


=head2 RECNO


In order to make RECNO more compatible with Perl the array offset for all
RECNO arrays begins at 0 rather than 1 as in Berkeley DB.
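
A minimal sketch of the zero-based numbering follows. It assumes a RECNO
database that is not backed by a named file (an C<undef> filename), and the
variable names are illustrative only.

    use DB_File ;
    use Fcntl ;

    my @lines ;
    tie @lines, 'DB_File', undef, O_RDWR|O_CREAT, 0640, $DB_RECNO
        or die "Cannot create RECNO database: $!\n" ;

    $lines[0] = "first record" ;
    $lines[1] = "second record" ;

    # Element 0 is the first record, as with any Perl array.
    my $first  = $lines[0] ;
    my $second = $lines[1] ;
    print "record 0: $first\n" ;
    print "record 1: $second\n" ;

    untie @lines ;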


=head2 In Memory Databases

Berkeley DB allows the creation of in-memory databases by using NULL (that is, a 
C<(char *)0> in C) in 
place of the filename. 
B<DB_File> uses C<undef> instead of NULL to provide this functionality.
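
For instance, a tied hash can be used as a scratch database along these
lines. This is only a sketch; the hash name C<%cache> and the stored data are
arbitrary.

    use DB_File ;
    use Fcntl ;

    my %cache ;
    tie %cache, 'DB_File', undef, O_RDWR|O_CREAT, 0640, $DB_HASH
        or die "Cannot create in-memory database: $!\n" ;

    $cache{"apple"} = "orange" ;
    my $fruit = $cache{"apple"} ;
    print "apple => $fruit\n" ;

    # No named database file is created; the data is gone after the untie.
    untie %cache ;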


=head2 Using the Berkeley DB Interface Directly

As well as accessing Berkeley DB using a tied hash or array, it is also
possible to make direct use of most of the functions defined in the Berkeley DB
documentation.


To do this you need to save the return value from the tie.

	$db = tie %hash, DB_File, "filename" ;

Once you have done that, you can access the Berkeley DB API functions directly.

	$db->put($key, $value, R_NOOVERWRITE) ;

All the functions defined in L<dbopen(3)> are available except
for close() and dbopen() itself.  
The B<DB_File> interface to these functions has been implemented to mirror
the way Berkeley DB works. In particular note that all the functions return
only a status value. Whenever a Berkeley DB function returns data via one of
its parameters, the B<DB_File> equivalent does exactly the same.

All the constants defined in L<dbopen> are also available.

Below is a list of the functions available.

=over 5

=item get

Same as in C<recno> except that the flags parameter is optional. 
Remember that the value
associated with the key you request is returned in the $value parameter.

=item put

As usual the flags parameter is optional. 

If you use either the R_IAFTER or
R_IBEFORE flags, the key parameter will be set to the record number of the
inserted key/value pair.

=item del

The flags parameter is optional.

=item fd

As in I<recno>.

=item seq

The flags parameter is optional.

Both the key and value parameters will be set.

=item sync

The flags parameter is optional.

=back
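
The functions listed above might be combined as in the following sketch.
It assumes a BTREE database that is not backed by a named file, and it
relies on the status conventions of dbopen(3): 0 for success, and 1 when a
key is not found or when R_NOOVERWRITE refuses to replace an existing key.
All variable names are illustrative.

    use DB_File ;
    use Fcntl ;

    my %h ;
    my $db = tie %h, 'DB_File', undef, O_RDWR|O_CREAT, 0640, $DB_BTREE
        or die "Cannot create BTREE database: $!\n" ;

    $db->put("banana", "yellow") ;
    $db->put("apple", "red") ;

    # With R_NOOVERWRITE an existing key is left alone and the
    # status is non-zero.
    my $exists = $db->put("apple", "green", R_NOOVERWRITE) ;

    # get hands the data back through its second parameter.
    my $colour ;
    $db->get("apple", $colour) ;
    print "apple is $colour\n" ;

    # seq sets both key and value; walk the tree from the first record.
    my ($key, $value) = ("", "") ;
    my @seen ;
    for (my $status = $db->seq($key, $value, R_FIRST) ;
         $status == 0 ;
         $status = $db->seq($key, $value, R_NEXT))
      { push @seen, "$key=$value" }
    print "@seen\n" ;

    # del removes a key; a later get reports that it is missing.
    $db->del("banana") ;
    my $gone ;
    my $missing = $db->get("banana", $gone) ;

    # Drop the object reference before untying the hash.
    undef $db ;
    untie %h ;

Because every function returns only a status, the data always arrives in the
variables passed as parameters, never as the return value.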

=head1 EXAMPLES

It is always a lot easier to understand something when you see a real example.
So here are a few.

=head2 Using HASH

	use DB_File ;
	use Fcntl ;
	
	tie %h,  DB_File, "hashed", O_RDWR|O_CREAT, 0640, $DB_HASH ;
	
	# Add a key/value pair to the file
	$h{"apple"} = "orange" ;
	
	# Check for existence of a key
	print "Exists\n" if exists $h{"banana"} ;
	
	# Delete 
	delete $h{"apple"} ;
	
	untie %h ;

=head2 Using BTREE

Here is a sample of code which uses BTREE. Just to make life more interesting
the default comparison function will not be used. Instead a Perl sub, C<Compare()>,
will be used to do a case insensitive comparison.

        use DB_File ;
        use Fcntl ;

        # Case insensitive comparison of two keys
        sub Compare
        {
            my ($key1, $key2) = @_ ;

            "\L$key1" cmp "\L$key2" ;
        }

        # Order the keys with Compare() instead of the default function
        $DB_BTREE->{compare} = \&Compare ;

        tie %h,  DB_File, "tree", O_RDWR|O_CREAT, 0640, $DB_BTREE ;

        # Add key/value pairs to the file
        $h{'Wall'}  = 'Larry' ;
        $h{'Smith'} = 'John' ;
        $h{'mouse'} = 'mickey' ;
        $h{'duck'}  = 'donald' ;

        # Delete
        delete $h{"duck"} ;

        # Cycle through the keys printing them in order.
        # Note it is not necessary to sort the keys as
        # the btree will have kept them in order automatically.
        foreach (keys %h)
          { print "$_\n" }

        untie %h ;

Here is the output from the code above.

	mouse
	Smith
	Wall


=head2 Using RECNO

	use DB_File ;
	use Fcntl ;
	
	$DB_RECNO->{psize} = 3000 ;
	
	tie @h,  DB_File, "text", O_RDWR|O_CREAT, 0640, $DB_RECNO ;
	
	# Add a key/value pair to the file
	$h[0] = "orange" ;
	
	# Check for existence of a key
	print "Exists\n" if $h[1] ;
	
	untie @h ;


=head1 WARNINGS

If you happen to find any other functions defined in the source for this module 
that have not been mentioned in this document -- beware. 
I may drop them at a moment's notice.

If you cannot find any, then either you didn't look very hard or the moment has
passed and I have dropped them.

=head1 BUGS

Some older versions of Berkeley DB had problems with fixed length records
using the RECNO file format. The newest version at the time of writing 
was 1.85 - this seems to have fixed the problems with RECNO.

I am sure there are bugs in the code. If you do find any, or can suggest any
enhancements, I would welcome your comments.

=head1 AVAILABILITY

Berkeley DB is available from the host C<ftp.cs.berkeley.edu> as the
file C</ucb/4bsd/db.tar.gz>.  It is I<not> under the GPL.

=head1 SEE ALSO

L<perl(1)>, L<dbopen(3)>, L<hash(3)>, L<recno(3)>, L<btree(3)> 

Berkeley DB is available from F<ftp.cs.berkeley.edu> in the directory F</ucb/4bsd>.

=head1 AUTHOR

The DB_File interface was written by 
Paul Marquess <pmarquess@bfsec.bt.co.uk>.
Questions about the DB system itself may be addressed to
Keith Bostic  <bostic@cs.berkeley.edu>.