Chapter 11. Berkeley DB Transactional Data Store Applications

There are a few different issues to consider when tuning the performance of Berkeley DB transactional applications. First, you should review Access method tuning, as the tuning issues for access method applications are applicable to transactional applications as well. The following are additional tuning issues for Berkeley DB transactional applications:

access method

Highly concurrent applications should use the Queue access method where possible, as it provides finer-grained locking than the other access methods. Otherwise, applications usually see better concurrency when using the Btree access method than when using either the Hash or Recno access methods.

record numbers

Using record numbers outside of the Queue access method will often slow down concurrent applications, because record numbers limit the degree of concurrency available in the database. This applies both to the Recno access method and to the Btree access method when retrieval by record number is configured.

Btree database size

When using the Btree access method, applications supporting concurrent access may see excessive numbers of deadlocks in small databases. There are two different approaches to resolving this problem. First, as the Btree access method uses page-level locking, decreasing the database page size can result in fewer lock conflicts. Second, in the case of databases that are cyclically growing and shrinking, turning off reverse splits (with DB_REVSPLITOFF) can leave the database with enough pages that there will be fewer lock conflicts.
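The effect of page size on lock granularity can be estimated with simple arithmetic. The sketch below is not part of the Berkeley DB API: `approx_leaf_pages` is a hypothetical helper that ignores page headers, fill factor, and internal pages, so it gives only an order-of-magnitude view of how many separately lockable pages a small database occupies.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Rough number of leaf pages needed to hold data_bytes of key/data items
 * at a given page size. This is an assumption-laden estimate: it ignores
 * page headers, fill factor, and internal pages.
 */
size_t
approx_leaf_pages(size_t data_bytes, size_t page_size)
{
	assert(page_size > 0);
	/* Integer division, rounded up. */
	return (data_bytes + page_size - 1) / page_size;
}
```

Under this model, 64KB of data fits on about 8 pages at an 8KB page size but spreads over about 128 pages at a 512-byte page size, so concurrent threads are less likely to contend for the same page lock.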

read locks

Performing all read operations outside of transactions or at a reduced degree of isolation (see Degrees of isolation) can often significantly increase application throughput. In addition, limiting the lifetime of non-transactional cursors will reduce the length of time locks are held, thereby improving concurrency.

DB_DIRECT_DB, DB_LOG_DIRECT

On some systems, avoiding caching in the operating system can improve write throughput and allow the creation of larger Berkeley DB caches.

DB_READ_UNCOMMITTED, DB_READ_COMMITTED

Consider decreasing the level of isolation of transactions using the DB_READ_UNCOMMITTED or DB_READ_COMMITTED flags for transactions or cursors, or the DB_READ_UNCOMMITTED flag on individual read operations. The DB_READ_COMMITTED flag will release read locks on cursors as soon as the data page is no longer referenced. This is also called degree 2 isolation. This will tend to block write operations for shorter periods for applications that do not need to have repeatable reads for cursor operations.

The DB_READ_UNCOMMITTED flag will allow read operations to potentially return data which has been modified but not yet committed, and can significantly increase application throughput in applications that do not require data be guaranteed to be permanent in the database. This is also called degree 1 isolation, or dirty reads.

DB_RMW

If there are many deadlocks, consider using the DB_RMW flag to immediately acquire write locks when reading data items that will subsequently be modified. Although this flag may increase contention (because write locks are held longer than they would otherwise be), it may decrease the number of deadlocks that occur.

DB_TXN_WRITE_NOSYNC, DB_TXN_NOSYNC

By default, transactional commit in Berkeley DB implies durability; that is, all committed operations will be present in the database after recovery from any application or system failure. For applications not requiring that level of certainty, specifying the DB_TXN_NOSYNC flag will often provide a significant performance improvement. In this case, the database will still be fully recoverable, but some number of committed transactions might be lost after application or system failure.

access databases in order

When modifying multiple databases in a single transaction, always access physical files, and databases within physical files, in the same order where possible. In addition, avoid returning to a physical file or database; that is, avoid accessing a database, moving on to another database, and then returning to the first database. This can significantly reduce the chance of deadlock between threads of control.
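One way to enforce a consistent access order is to sort the set of databases a transaction will touch before opening or modifying any of them. The following is a minimal sketch, not a Berkeley DB facility: `struct db_ref` and `db_ref_sort` are hypothetical names, and ordering by file name and then by database name is just one possible canonical order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor for a database the transaction will touch. */
struct db_ref {
	const char *file;	/* physical file name */
	const char *database;	/* database name within that file */
};

/* Order by physical file first, then by database within the file. */
static int
db_ref_cmp(const void *a, const void *b)
{
	const struct db_ref *x = a;
	const struct db_ref *y = b;
	int c = strcmp(x->file, y->file);

	return c != 0 ? c : strcmp(x->database, y->database);
}

/* Sort the working set so every thread visits databases in the same order. */
void
db_ref_sort(struct db_ref *refs, size_t n)
{
	if (n < 2)
		return;		/* nothing to reorder */
	assert(refs != NULL);
	qsort(refs, n, sizeof(refs[0]), db_ref_cmp);
}
```

If every thread of control sorts its working set this way and then walks it front to back, no two threads acquire locks on the same pair of databases in opposite orders, and each database is visited once rather than revisited.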

large key/data items

Transactional protections in Berkeley DB are guaranteed by before and after physical image logging. This means applications modifying large key/data items also write large log records, and, in the case of the default transaction commit, threads of control must wait until those log records have been flushed to disk. Applications supporting concurrent access should try to keep key/data items small wherever possible.

mutex selection

During configuration, Berkeley DB selects a mutex implementation for the architecture. Berkeley DB normally prefers blocking-mutex implementations over non-blocking ones. For example, Berkeley DB will select POSIX pthread mutex interfaces rather than assembly-code test-and-set spin mutexes because pthread mutexes are usually more efficient and less likely to waste CPU cycles spinning without getting any work accomplished.

For some applications and systems (generally highly concurrent applications on large multiprocessor systems), Berkeley DB makes the wrong choice. In some cases, better performance can be achieved by configuring with the --with-mutex argument and selecting a different mutex implementation than the one selected by Berkeley DB. When a test-and-set spin mutex implementation is selected, it may be useful to tune the number of spins made before yielding the processor and sleeping. This may be particularly beneficial for systems containing several hyperthreaded processor cores. For more information, see the DB_ENV->mutex_set_tas_spins() method.

Finally, Berkeley DB may put multiple mutexes on individual cache lines. When tuning Berkeley DB for large multiprocessor systems, it may be useful to tune mutex alignment using the DB_ENV->mutex_set_align() method.

--enable-posix-mutexes

By default, the Berkeley DB library will only select the POSIX pthread mutex implementation if it supports mutexes shared between multiple processes. If your application does not share its database environment between processes and your system's POSIX mutex support was not selected because it did not support inter-process mutexes, you may be able to increase performance and transactional throughput by configuring with the --enable-posix-mutexes argument.

log buffer size

Berkeley DB internally maintains a buffer of log writes. By default, the buffer is written to disk at transaction commit or whenever it is filled. If it is consistently being filled before transaction commit, it will be written multiple times per transaction, costing application performance. In these cases, increasing the size of the log buffer can increase application throughput.
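The cost described above can be approximated with a simplified model. The sketch below is an illustration only: `log_writes_per_txn` is a hypothetical helper, and it assumes a single active transaction, a buffer that is empty when the transaction begins, one write each time the buffer fills, and one final write at commit for any remaining bytes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Estimated number of log buffer writes caused by one transaction that
 * generates log_bytes of log records, under the simplified model stated
 * in the text (single transaction, buffer initially empty).
 */
size_t
log_writes_per_txn(size_t log_bytes, size_t buf_size)
{
	assert(buf_size > 0);
	/* One write per full buffer, plus one at commit for any remainder. */
	return (log_bytes + buf_size - 1) / buf_size;
}
```

Under these assumptions, a transaction that generates 100KB of log records causes four writes with a 32KB buffer but only one with a 128KB buffer, which is the kind of saving a larger log buffer is meant to provide.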

log file location

If the database environment's log files are on the same disk as the databases, the disk arms will have to seek back-and-forth between the two. Placing the log files and the databases on different disk arms can often increase application throughput.

trickle write

In some applications, the cache is sufficiently active and dirty that readers frequently need to write a dirty page in order to have space in which to read a new page from the backing database file. You can use the db_stat utility (or the statistics returned by the DB_ENV->memp_stat() method) to see how often this is happening in your application's cache. In this case, using a separate thread of control and the DB_ENV->memp_trickle() method to trickle-write pages can often increase the overall throughput of the application.