diff options
Diffstat (limited to 'src/third_party/wiredtiger/src/docs/wtperf.dox')
-rw-r--r-- | src/third_party/wiredtiger/src/docs/wtperf.dox | 269 |
1 files changed, 269 insertions, 0 deletions
diff --git a/src/third_party/wiredtiger/src/docs/wtperf.dox b/src/third_party/wiredtiger/src/docs/wtperf.dox new file mode 100644 index 00000000000..64e25978dd8 --- /dev/null +++ b/src/third_party/wiredtiger/src/docs/wtperf.dox @@ -0,0 +1,269 @@ +/*! @page wtperf Simulating workloads with wtperf + +The WiredTiger distribution includes a tool that can be used to simulate +workloads in WiredTiger, in the directory \c bench/wtperf. + +The \c wtperf utility generally has two phases, the populate phase which +creates a database and then populates an object in that database, and a +workload phase, that does some set of operations on the object. + +For example, the following configuration uses a single thread to +populate a file object with 500,000 records in a 500MB cache. The +workload phase consists of 8 threads running for two minutes, all +reading from the file. + +@code +conn_config="cache_size=500MB" +table_config="type=file" +icount=500000 +run_time=120 +populate_threads=1 +threads=((count=8,reads=1)) +@endcode + +In most cases, where the workload is the only interesting phase, the +populate phase can be performed once and the workload phase run +repeatedly (for more information, see the wtperf \c create configuration +variable). + +The \c conn_config configuration supports setting any WiredTiger +connection configuration value. This is commonly used to configure +statistics with regular reports, to obtain more information from the +run: + +@code +conn_config="cache_size=20G,statistics=(fast,clear),statistics_log=(wait=600)" +report_interval=5 +@endcode + +Note quoting must be used when passing values to Wiredtiger +configuration, as opposed to configuring the \c wtperf utility itself. + +The \c table_config configuration supports setting any WiredTiger object +creation configuration value, for example, the above test can be +converted to using an LSM store instead of a B+tree store, with +additional LSM configuration, by changing \c conn_config to: + +@code +table_config="lsm=(chunk_size=5MB),type=lsm,os_cache_dirty_max=16MB" +@endcode + +More complex workloads can be configured by creating more threads doing +inserts and updates as well as reads. For example, to configure two +inserting threads two threads doing a mixture of inserts, reads and +updates: + +@code +threads=((count=2,inserts=1),(count=2,inserts=1,reads=1,updates=1)) +@endcode + +Example \c wtperf configuration files can be found in the +\c bench/wtperf/runners/ directory. + +There are also a number of command line arguments that can be passed +to \c wtperf: +@par <code>-C config</code> +Specify configuration strings for the ::wiredtiger_open function. +This argument is additive to the \c conn_config parameter in the +configuration file. +@par <code>-h directory</code> +Specify a database home directory. The default is \c ./WT_TEST. +@par <code>-m monitor_directory</code> +Specify a directory for all monitoring related files. +The default is the database home directory. +@par <code>-O config_file</code> +Specify the configuration file to run. +@par <code>-o config</code> +Specify configuration strings for the \c wtperf program. +This argument will override settings in the configuration file. +@par <code>-T config</code> +Specify configuration strings for the WT_SESSION::create function. +This argument is additive to the \c table_config parameter in the +configuration file. + +@section monitor Monitoring wtperf + +Like all WiredTiger applications, the \c wtperf command can be configured +with statistics logging, and the resulting output displayed using the +\c wtstats visualization tool. For more information, see @ref wtstats. + +In addition to statistics logging, \c wtperf can monitor performance and +operation latency times. Monitoring is enabled using the \c sample_interval +configuration. For example to record information every 10 seconds, set the +following on the command line or add it to the \c wtperf configuration file: + +@code +sample_interval=10 +@endcode + +Enabling monitoring causes \c wtperf to create a file \c monitor in the +database home directory (or another directory as specified using the +\c -m option to \c wtperf). + +The same visualization tool, \c wtstats, can be used to view a combined +chart with both the \c monitor output and the statistics logging output +at the same time. + +The following example shows how to run the \c medium-btree.wtperf configuration +with monitoring enabled, and then generate a graph. + +@code +# Change into the WiredTiger directory. +cd wiredtiger + +# Configure and build WiredTiger if not already built. +./configure && make + +# Remove and re-create the run directory. +rm -rf WTPERF_RUN && mkdir WTPERF_RUN + +# Run the medium-btree.wtperf workload, sampling performance every 5 seconds. +bench/wtperf/wtperf \ + -h WTPERF_RUN \ + -o sample_interval=5 \ + -O bench/wtperf/runners/medium-btree.wtperf + +# Use the visualization tool to create HTML graph output; the output file is +# named wtstats.html. +python tools/wtstats/wtstats.py WTPERF_RUN/monitor + +# Possible alternatives if statistics logging also enabled: +# python tools/wtstats/wtstats.py WTPERF_RUN/monitor WTPERF_RUN/WiredTigerStat* +# python tools/wtstats/wtstats.py WTPERF_RUN +@endcode + +The python command creates a file named \c wtstats.html in the current +working directory. You can open the generated HTML document in your browser +and see the generated statistics. + +@section config Wtperf configuration options + +The following is a list of the currently available \c wtperf +configuration options: + +\if START_AUTO_GENERATED_WTPERF_CONFIGURATION +DO NOT EDIT: THIS PART OF THE FILE IS GENERATED BY dist/s_docs. +\endif + +@par async_threads (unsigned int, default=0) +number of async worker threads +@par checkpoint_interval (unsigned int, default=120) +checkpoint every interval seconds during the workload phase. +@par checkpoint_stress_rate (unsigned int, default=0) +checkpoint every rate operations during the populate phase in the +populate thread(s), 0 to disable +@par checkpoint_threads (unsigned int, default=0) +number of checkpoint threads +@par conn_config (string, default=create) +connection configuration string +@par compact (boolean, default=false) +post-populate compact for LSM merging activity +@par compression (string, default=none) +compression extension. Allowed configuration values are: 'none', +'bzip', 'lz4', 'snappy', 'zlib' +@par create (boolean, default=true) +do population phase; false to use existing database +@par database_count (unsigned int, default=1) +number of WiredTiger databases to use. Each database will execute the +workload using a separate home directory and complete set of worker +threads +@par drop_tables (unsigned int, default=0) +Whether to drop all tables at the end of the run, and report time +taken to do the drop. +@par icount (unsigned int, default=5000) +number of records to initially populate. If multiple tables are +configured the count is spread evenly across all tables. +@par idle_table_cycle (unsigned int, default=0) +Enable regular create and drop of idle tables, value is the maximum +number of seconds a create or drop is allowed before flagging an +error. Default 0 which means disabled. +@par index (boolean, default=false) +Whether to create an index on the value field. +@par insert_rmw (boolean, default=false) +execute a read prior to each insert in workload phase +@par key_sz (unsigned int, default=20) +key size +@par log_partial (boolean, default=false) +perform partial logging on first table only. +@par min_throughput (unsigned int, default=0) +notify if any throughput measured is less than this amount. Aborts or +prints warning based on min_throughput_fatal setting. Requires +sample_interval to be configured +@par min_throughput_fatal (boolean, default=false) +print warning (false) or abort (true) of min_throughput failure. +@par max_latency (unsigned int, default=0) +notify if any latency measured exceeds this number of +milliseconds.Aborts or prints warning based on min_throughput_fatal +setting. Requires sample_interval to be configured +@par max_latency_fatal (boolean, default=false) +print warning (false) or abort (true) of max_latency failure. +@par pareto (unsigned int, default=0) +use pareto distribution for random numbers. Zero to disable, otherwise +a percentage indicating how aggressive the distribution should be. +@par populate_ops_per_txn (unsigned int, default=0) +number of operations to group into each transaction in the populate +phase, zero for auto-commit +@par populate_threads (unsigned int, default=1) +number of populate threads, 1 for bulk load +@par random_range (unsigned int, default=0) +if non zero choose a value from within this range as the key for +insert operations +@par random_value (boolean, default=false) +generate random content for the value +@par read_range (unsigned int, default=0) +scan a range of keys after each search +@par reopen_connection (boolean, default=true) +close and reopen the connection between populate and workload phases +@par report_interval (unsigned int, default=2) +output throughput information every interval seconds, 0 to disable +@par run_ops (unsigned int, default=0) +total read, insert and update workload operations +@par run_time (unsigned int, default=0) +total workload seconds +@par sample_interval (unsigned int, default=0) +performance logging every interval seconds, 0 to disable +@par sample_rate (unsigned int, default=50) +how often the latency of operations is measured. One for every +operation,two for every second operation, three for every third +operation etc. +@par sess_config (string, default=) +session configuration string +@par table_config (string, default=key_format=S,value_format=S,type=lsm,exclusive=true,allocation_size=4kb,internal_page_max=64kb,leaf_page_max=4kb,split_pct=100) +table configuration string +@par table_count (unsigned int, default=1) +number of tables to run operations over. Keys are divided evenly over +the tables. Cursors are held open on all tables. Default 1, maximum +99999. +@par table_count_idle (unsigned int, default=0) +number of tables to create, that won't be populated. Default 0. +@par threads (string, default=) +workload configuration: each 'count' entry is the total number of +threads, and the 'insert', 'read' and 'update' entries are the ratios +of insert, read and update operations done by each worker thread; If a +throttle value is provided each thread will do a maximum of that +number of operations per second; multiple workload configurations may +be specified per threads configuration; for example, a more complex +threads configuration might be +'threads=((count=2,reads=1)(count=8,reads=1,inserts=2,updates=1))' +which would create 2 threads doing nothing but reads and 8 threads +each doing 50% inserts and 25% reads and updates. Allowed +configuration values are 'count', 'throttle', 'reads', 'inserts', +'updates', 'truncate', 'truncate_pct' and 'truncate_count'. There are +also behavior modifiers, supported modifiers are 'ops_per_txn' +@par transaction_config (string, default=) +transaction configuration string, relevant when populate_opts_per_txn +is nonzero +@par table_name (string, default=test) +table name +@par value_sz (unsigned int, default=100) +value size +@par verbose (unsigned int, default=1) +verbosity +@par warmup (unsigned int, default=0) +How long to run the workload phase before starting measurements + +\if STOP_AUTO_GENERATED_WTPERF_CONFIGURATION +DO NOT EDIT: THIS PART OF THE FILE IS GENERATED BY dist/s_docs. +\endif + +*/ |