author | Lloyd Hilaiel <lloyd@hilaiel.com> | 2011-11-28 12:43:54 -0700 |
---|---|---|
committer | Lloyd Hilaiel <lloyd@hilaiel.com> | 2011-11-28 12:43:54 -0700 |
commit | 52c559dafce0e9f5b9b54e1dfe893bd3bd9d4873 (patch) | |
tree | 63cc234f7a9f6df9af39a559666ca4a5bd901db0 | |
parent | e2d080612f566205d16a65fdad05618f92774d4c (diff) | |
augment in-tree performance benchmark to assess serialization performance as well as parsing performance. (provides a means to assess issue #59)
-rw-r--r-- | ChangeLog | 104 |
-rw-r--r-- | perf/perftest.c | 205 |
-rw-r--r-- | src/api/yajl_common.h | 4 |
-rw-r--r-- | src/api/yajl_gen.h | 4 |
-rw-r--r-- | src/api/yajl_tree.h | 2 |
5 files changed, 236 insertions, 83 deletions
@@ -1,83 +1,87 @@
+2.1.0
+    * lloyd update in-tree synthetic benchmark (perftest) to assess gen
+      throughput as well as parse time.
+
 2.0.3
     * John Stamp generation of a pkgconfig file at build time.

 2.0.2
-    * lth fix typos in yajl_tree.h macros YAJL_IS_INTEGER and YAJL_IS_DOUBLE,
+    * lloyd fix typos in yajl_tree.h macros YAJL_IS_INTEGER and YAJL_IS_DOUBLE,
       contributed by Artem S Vybornov.
-    * lth add #ifdef __cplusplus wrappers to yajl_tree to allow proper
+    * lloyd add #ifdef __cplusplus wrappers to yajl_tree to allow proper
       usage from many populer C++ compilers.

 2.0.1
-    * lth generator flag to allow client to specify they want
+    * lloyd generator flag to allow client to specify they want
       escaped solidi '/'. issue #28
-    * lth crash fix when yajl_parse() is never called. issue #27
+    * lloyd crash fix when yajl_parse() is never called. issue #27

 2.0.0
-    * lth YAJL is now ISC licensed: http://en.wikipedia.org/wiki/ISC_license
-    * lth 20-35% (osx and linux respectively) parsing performance
+    * lloyd YAJL is now ISC licensed: http://en.wikipedia.org/wiki/ISC_license
+    * lloyd 20-35% (osx and linux respectively) parsing performance
       improvement attained by tweaking string scanning (idea: @michaelrhanson).
-    * Florian Forster & lth - yajl_tree interface introduced as a higher level
+    * Florian Forster & lloyd - yajl_tree interface introduced as a higher level
       interface to the parser (eats JSON, poops a memory representation)
-    * lth require a C99 compiler
-    * lth integers are now represented with long long (64bit+) on all platforms.
-    * lth size_t now used throughout to represent buffer lengths, so you can
+    * lloyd require a C99 compiler
+    * lloyd integers are now represented with long long (64bit+) on all platforms.
+    * lloyd size_t now used throughout to represent buffer lengths, so you can
       safely manage buffers greater than 4GB.
     * gno semantic improvements to yajl's API regarding partial value parsing
       and trailing garbage
-    * lth new configuration mechanism for yajl, see yajl_config() and
+    * lloyd new configuration mechanism for yajl, see yajl_config() and
       yajl_gen_config()
     * gno more allocation checking in more places
     * gno remove usage of strtol, replace with custom implementation that
       cares not about your locale.
-    * lth yajl_parse_complete renamed to yajl_complete_parse.
-    * lth add a switch to validate utf8 strings as they are generated.
-    * lth tests are a lot quieter in their output.
-    * lth addition of a little in tree performance benchmark, `perftest` in
+    * lloyd yajl_parse_complete renamed to yajl_complete_parse.
+    * lloyd add a switch to validate utf8 strings as they are generated.
+    * lloyd tests are a lot quieter in their output.
+    * lloyd addition of a little in tree performance benchmark, `perftest` in
       perf/perftest.c

 1.0.12
     * Conrad Irwin - Parse null bytes correctly
     * Mirek Rusin - fix LLVM warnings
     * gno - Don't generate numbers for keys. closes #13
-    * lth - various win32 fixes, including build documentation improvements
+    * lloyd - various win32 fixes, including build documentation improvements
     * John Stamp - Don't export private symbols.
     * John Stamp - Install yajl_version.h, not the template.
     * John Stamp - Don't use -fPIC for static lib. Cmake will automatically add
       it for the shared.
-    * lth 0 fix paths embedded in dylib upon installation on osx. closes #11
+    * lloyd 0 fix paths embedded in dylib upon installation on osx. closes #11

 1.0.11
-    * lth remove -Wno-missing-field-initializers for greater gcc compat (3.4.6)
+    * lloyd remove -Wno-missing-field-initializers for greater gcc compat (3.4.6)

 1.0.10
     * Brian Maher - yajl is now buildable without a c++ compiler present
     * Brian Maher - fix header installation on OSX with cmake 2.8.0 installed
-    * lth & vitali - allow builder to specify alternate lib directory
+    * lloyd & vitali - allow builder to specify alternate lib directory
       for installation (i.e. lib64)
     * Vitali Lovich - yajl version number now programatically accessible
-    * lth - prevent cmake from embedding rpaths in binaries. Static linking
+    * lloyd - prevent cmake from embedding rpaths in binaries. Static linking
       makes this unneccesary.

 1.0.9
-    * lth - fix inverted logic causing yajl_gen_double() to always fail on
+    * lloyd - fix inverted logic causing yajl_gen_double() to always fail on
       win32 (thanks to Fredrik Kihlander for the report)

 1.0.8
     * Randall E. Barker - move dllexport defnitions so dlls with proper
       exports can again be generated on windows
-    * lth - add yajl_get_bytes_consumed() which allows the client to
+    * lloyd - add yajl_get_bytes_consumed() which allows the client to
       determine the offset as an error, as well as determine how many
       bytes of an input buffer were consumed.
-    * lth - fixes to keep "error offset" up to date (like when the
+    * lloyd - fixes to keep "error offset" up to date (like when the
       client callback returns 0)
     * Brian Maher - allow client to specify a printing function in generation

 1.0.7
-    * lth fix win32 build (isinf and isnan)
+    * lloyd fix win32 build (isinf and isnan)

 1.0.6
-    * lth fix several compiler warnings
-    * lth fix generation of invalid json from yajl_gen_double
+    * lloyd fix several compiler warnings
+    * lloyd fix generation of invalid json from yajl_gen_double
       (NaN is not JSON)
     * jstamp support for combining short options in tools
     * jstamp exit properly on errors from tools
@@ -85,30 +89,30 @@
     * max fix configure --prefix

 1.0.5
-    * lth several performance improvements related to function
+    * lloyd several performance improvements related to function
       inlinin'

 1.0.4
-    * lth fix broken utf8 validation for three & four byte represenations.
+    * lloyd fix broken utf8 validation for three & four byte represenations.
       thanks to http://github.com/brianmario and
       http://github.com/technoweenie

 1.0.3
-    * lth fix syntax error in cplusplus extern "C" statements for wider
+    * lloyd fix syntax error in cplusplus extern "C" statements for wider
       compiler support

 1.0.2
-    * lth update doxygen documentation with new sample code, passing NULL
+    * lloyd update doxygen documentation with new sample code, passing NULL
       for allocation functions added in 1.0.0

 1.0.1
-    * lth resolve crash in json_reformatter due to incorrectly ordered
+    * lloyd resolve crash in json_reformatter due to incorrectly ordered
       parameters.

 1.0.0
-    * lth add 'make install' rules, thaks to Andrei Soroker for the
+    * lloyd add 'make install' rules, thaks to Andrei Soroker for the
       contribution.
-    * lth client may override allocation routines at generator or parser
+    * lloyd client may override allocation routines at generator or parser
       allocation time
     * tjw add yajl_parse_complete routine to allow client to explicitly
       specify end-of-input, solving the "lonely number" case, where
@@ -116,42 +120,42 @@
       end.
     * tjw many new test cases
     * tjw cleanup of code for symmetry and ease of reading
-    * lth integration of patches from Robert Varga which cleanup
+    * lloyd integration of patches from Robert Varga which cleanup
       compilation warnings on 64 bit linux

 0.4.0
-    * lth buffer overflow bug in yajl_gen_double s/%lf/%g/ - thanks to
+    * lloyd buffer overflow bug in yajl_gen_double s/%lf/%g/ - thanks to
       Eric Bergstrome
-    * lth yajl_number callback to allow passthrough of arbitrary precision
+    * lloyd yajl_number callback to allow passthrough of arbitrary precision
       numbers to client. Thanks to Hatem Nassrat.
-    * lth yajl_integer now deals in long, instead of long long. This
+    * lloyd yajl_integer now deals in long, instead of long long. This
       combined with yajl_number improves compiler compatibility while
       maintaining precision.
-    * lth better ./configure && make experience (still requires cmake and
+    * lloyd better ./configure && make experience (still requires cmake and
       ruby)
-    * lth fix handling of special characters hex 0F and 1F in yajl_encode
+    * lloyd fix handling of special characters hex 0F and 1F in yajl_encode
       (thanks to Robert Geiger)
-    * lth allow leading zeros in exponents (thanks to Hatem Nassrat)
+    * lloyd allow leading zeros in exponents (thanks to Hatem Nassrat)

 0.3.0
-    * lth doxygen documentation (html & man) generated as part of the
+    * lloyd doxygen documentation (html & man) generated as part of the
       build
-    * lth many documentation updates.
-    * lth fix to work with older versions of cmake (don't use LOOSE_LOOP
+    * lloyd many documentation updates.
+    * lloyd fix to work with older versions of cmake (don't use LOOSE_LOOP
       constructs)
-    * lth work around different behavior of freebsd 4 scanf. initialize
+    * lloyd work around different behavior of freebsd 4 scanf. initialize
       parameter to scanf to zero.
-    * lth all tests run 32x with ranging buffer sizes to stress stream
+    * lloyd all tests run 32x with ranging buffer sizes to stress stream
       parsing
-    * lth yajl_test accepts -b option to allow read buffer size to be
+    * lloyd yajl_test accepts -b option to allow read buffer size to be
       set
-    * lth option to validate UTF8 added to parser (argument in
+    * lloyd option to validate UTF8 added to parser (argument in
       yajl_parser_cfg)
-    * lth fix buffer overrun when chunk ends inside \u escaped text
-    * lth support client cancelation
+    * lloyd fix buffer overrun when chunk ends inside \u escaped text
+    * lloyd support client cancelation

 0.2.2
-    * lth on windows build debug with C7 symbols and no pdb files.
+    * lloyd on windows build debug with C7 symbols and no pdb files.

 0.2.1
     * fix yajl_reformat and yajl_verify to work on arbitrarily sized
diff --git a/perf/perftest.c b/perf/perftest.c
index 2d30984..2e09431 100644
--- a/perf/perftest.c
+++ b/perf/perftest.c
@@ -15,6 +15,8 @@
  */

 #include <yajl/yajl_parse.h>
+#include <yajl/yajl_tree.h>
+#include <yajl/yajl_gen.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
@@ -44,27 +46,60 @@ static double mygettime(void) {
 }
 #endif

-#define PARSE_TIME_SECS 3
+#define TEST_TIME_SECS 3
+
+/* if sample documents have been parsed `times` times, what
+ * throughput does this represent? print to stdout */
+static void
+print_throughput(long long times, double starttime)
+{
+    double throughput;
+    double now;
+    const char * all_units[] = { "B/s", "KB/s", "MB/s", (char *) 0 };
+    const char ** units = all_units;
+    int i, avg_doc_size = 0;
+
+    now = mygettime();
+
+    for (i = 0; i < num_docs(); i++) avg_doc_size += doc_size(i);
+    avg_doc_size /= num_docs();
+
+    throughput = (times * avg_doc_size) / (now - starttime);
+
+    while (*(units + 1) && throughput > 1024) {
+        throughput /= 1024;
+        units++;
+    }
+
+    printf("%g %s", throughput, *units);
+}
+

 static int
-run(int validate_utf8)
+parse(int validate_utf8)
 {
-    long long times = 0;
+    long long times = 0;
     double starttime;

     starttime = mygettime();

-    /* allocate a parser */
+    printf("Parsing speed (with%s UTF8 validation): ",
+           validate_utf8 ? "" : "out");
+    fflush(stdout);
+
     for (;;) {
         int i;
         {
+            /* we'll run this test for no less than 3 seconds. */
             double now = mygettime();
-            if (now - starttime >= PARSE_TIME_SECS) break;
+            if (now - starttime >= TEST_TIME_SECS) break;
         }

+        /* parse through 100 documents at a time before kicking out and
+         * checking if time has elapsed */
         for (i = 0; i < 100; i++) {
             yajl_handle hand = yajl_alloc(NULL, NULL, NULL);
-            yajl_status stat;
+            yajl_status stat;
             const char ** d;

             yajl_config(hand, yajl_dont_validate_strings, validate_utf8 ? 0 : 1);
@@ -73,7 +108,7 @@ run(int validate_utf8)
                 stat = yajl_parse(hand, (unsigned char *) *d, strlen(*d));
                 if (stat != yajl_status_ok) break;
             }
-
+
             stat = yajl_complete_parse(hand);

             if (stat != yajl_status_ok) {
@@ -90,32 +125,145 @@ run(int validate_utf8)
         }
     }

-    /* parsed doc 'times' times */
-    {
-        double throughput;
-        double now;
-        const char * all_units[] = { "B/s", "KB/s", "MB/s", (char *) 0 };
-        const char ** units = all_units;
-        int i, avg_doc_size = 0;
+    print_throughput(times, starttime);
+    printf("\n");
+
+    return 0;
+}
+
+static int
+genRecurse(yajl_gen yg, yajl_val v)
+{
+    switch (v->type) {
+        case yajl_t_string:
+            yajl_gen_string(yg, (unsigned char *) v->u.string, strlen(v->u.string));
+            break;
+        case yajl_t_number:
+            yajl_gen_number(yg, v->u.number.r, strlen(v->u.number.r));
+            break;
+        case yajl_t_object:
+            yajl_gen_map_open(yg);
+            {
+                int i;
+                for (i=0; i < v->u.object.len; i++) {
+                    const unsigned char * key = (unsigned char *) v->u.object.keys[i];
+                    yajl_gen_string(yg, key, strlen((char *) key));
+                    genRecurse(yg, v->u.object.values[i]);
+                }
+            }
+            yajl_gen_map_close(yg);
+            break;
+        case yajl_t_array:
+            yajl_gen_array_open(yg);
+            {
+                int i;
+                for (i=0; i < v->u.array.len; i++) {
+                    genRecurse(yg, v->u.array.values[i]);
+                }
+            }
+            yajl_gen_array_close(yg);
+            break;
+        case yajl_t_true:
+            yajl_gen_bool(yg, 1);
+            break;
+        case yajl_t_false:
+            yajl_gen_bool(yg, 0);
+            break;
+        case yajl_t_null:
+            yajl_gen_null(yg);
+            break;
+        default:
+            break;
+    }
+
+    return 0;
+}
+
+static void
+noopYAJLPrintFunc(void * v, const char * c, size_t s)
+{
+    return;
+}
+
+static int
+doGen(yajl_val n)
+{
+    yajl_gen yg;
+
+    yg = yajl_gen_alloc(NULL);
+    yajl_gen_config(yg, yajl_gen_print_callback, noopYAJLPrintFunc, NULL);
+    genRecurse(yg, n);
+    yajl_gen_free(yg);
+
+    return 0;
+}
+

-        now = mygettime();
+static int
+gen(void)
+{
+    long long times = 0;
+    double starttime;
+    yajl_val * forest;
+    int i;
+    char ebuf[256];
+
+    starttime = mygettime();
+
+    printf("Stringify speed: ");
+
+    /* first we'll parse all three documents into a trees */
+    forest = (yajl_val *) calloc(sizeof(yajl_val *), num_docs());
+
+    for (i=0; i<num_docs(); i++) {
+        char * buf;
+        const char ** doc;
+        int j;
+        int used = 0;
+
+        buf = (char *) calloc(1, doc_size(i));
+        doc = get_doc(i);
+        for (j=0; doc[j] != NULL; j++) {
+            memcpy(buf + used, doc[j], strlen(doc[j]));
+            used += strlen(doc[j]);
+        }
+        buf[used] = 0;
+        forest[i] = yajl_tree_parse(buf, ebuf, sizeof(ebuf));
+        free(buf);
+    }

-        for (i = 0; i < num_docs(); i++) avg_doc_size += doc_size(i);
-        avg_doc_size /= num_docs();
+    /* now we'll start stringifying these memory representations of
+     * documents */
+    for (;;) {
+        int i;
+        {
+            /* we'll run this test for no less than 3 seconds. */
+            double now = mygettime();
+            if (now - starttime >= TEST_TIME_SECS) break;
+        }

-        throughput = (times * avg_doc_size) / (now - starttime);
-
-        while (*(units + 1) && throughput > 1024) {
-            throughput /= 1024;
-            units++;
+        /* parse through 100 documents at a time before kicking out and
+         * checking if time has elapsed */
+        for (i = 0; i < 100; i++) {
+            doGen(forest[times % num_docs()]);
+            times++;
         }
-
-        printf("Parsing speed: %g %s\n", throughput, *units);
     }
+
+    /* free up the parsed documents when we're done */
+    for (i=0; i<num_docs(); i++) {
+        yajl_tree_free(forest[i]);
+    }
+    free(forest);
+
+    print_throughput(times, starttime);
+    printf("\n");
+

     return 0;
 }

+
 int
 main(void)
 {
@@ -124,11 +272,12 @@ main(void)
     printf("-- speed tests determine parsing throughput given %d different sample documents --\n",
            num_docs());

-    printf("With UTF8 validation:\n");
-    rv = run(1);
+    rv = parse(1);
+    if (rv != 0) return rv;
+    rv = parse(0);
     if (rv != 0) return rv;
-    printf("Without UTF8 validation:\n");
-    rv = run(0);
+    rv = gen();
+

     return rv;
 }
diff --git a/src/api/yajl_common.h b/src/api/yajl_common.h
index b208fd7..b8a7c24 100644
--- a/src/api/yajl_common.h
+++ b/src/api/yajl_common.h
@@ -21,7 +21,7 @@

 #ifdef __cplusplus
 extern "C" {
-#endif
+#endif

 #define YAJL_MAX_DEPTH 128

@@ -40,7 +40,7 @@ extern "C" {
 #  else
 #    define YAJL_API
 #  endif
-#endif
+#endif

 /** pointer to a malloc function, supporting client overriding memory
  *  allocation routines */
diff --git a/src/api/yajl_gen.h b/src/api/yajl_gen.h
index 52fa99f..6d4a8ad 100644
--- a/src/api/yajl_gen.h
+++ b/src/api/yajl_gen.h
@@ -43,7 +43,7 @@ extern "C" {
          *  state */
         yajl_gen_in_error_state,
         /** A complete JSON document has been generated */
-        yajl_gen_generation_complete,
+        yajl_gen_generation_complete,
         /** yajl_gen_double was passed an invalid floating point value
          *  (infinity or NaN). */
         yajl_gen_invalid_number,
@@ -152,6 +152,6 @@ extern "C" {

 #ifdef __cplusplus
 }
-#endif
+#endif

 #endif
diff --git a/src/api/yajl_tree.h b/src/api/yajl_tree.h
index 729c579..d69d7a3 100644
--- a/src/api/yajl_tree.h
+++ b/src/api/yajl_tree.h
@@ -137,7 +137,7 @@ YAJL_API void yajl_tree_free (yajl_val v);
  * \param type the yajl_type of the object you seek, or yajl_t_any if any will do.
  *
  * \returns a pointer to the found value, or NULL if we came up empty.
- *
+ *
  * Future Ideas: it'd be nice to move path to a string and implement support for
  * a teeny tiny micro language here, so you can extract array elements, do things
  * like .first and .last, even .length. Inspiration from JSONPath and css selectors?
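
The two patterns that perftest.c relies on in this commit can be sketched without yajl: a time-boxed loop that does work in fixed batches between clock checks, and the scaling of a bytes-per-second figure into B/s, KB/s or MB/s. The block below is a rough illustration, not the code from the commit. The names `now_secs`, `run_for`, `count_call` and `scale_throughput` are hypothetical, and the standard `clock()` (CPU time) is assumed as a stand-in for the benchmark's own `mygettime()`.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical timer: seconds of CPU time used so far, via standard clock().
 * The benchmark itself uses its own mygettime() instead. */
static double
now_secs(void)
{
    return (double) clock() / CLOCKS_PER_SEC;
}

/* Hypothetical driver: call work(ctx) in batches of `batch` calls, checking
 * the clock only between batches, until at least `secs` seconds have passed.
 * Returns the total number of calls made. */
static long long
run_for(double secs, int batch, void (*work)(void *), void * ctx)
{
    long long times = 0;
    double start = now_secs();

    while (now_secs() - start < secs) {
        int i;
        for (i = 0; i < batch; i++) {
            work(ctx);
            times++;
        }
    }
    return times;
}

/* Example workload for run_for(): bump a long long counter. */
static void
count_call(void * ctx)
{
    (*(long long *) ctx)++;
}

/* Scale a bytes-per-second value in place, dividing by 1024 while it exceeds
 * 1024 and a larger unit is available. Returns the matching unit label. */
static const char *
scale_throughput(double * throughput)
{
    static const char * units[] = { "B/s", "KB/s", "MB/s" };
    size_t u = 0;

    while (u + 1 < sizeof(units) / sizeof(units[0]) && *throughput > 1024) {
        *throughput /= 1024;
        u++;
    }
    return units[u];
}
```

Under these assumptions, a report would take the count returned by `run_for`, multiply it by an average document size, divide by the elapsed seconds, and pass the result through `scale_throughput` before printing. Checking the clock once per batch rather than once per call keeps timer overhead out of the measured work.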