From 391edb01f8122b4e229a2d7c78529a4e080abc1a Mon Sep 17 00:00:00 2001
From: Aaron Patterson
Date: Mon, 19 Oct 2015 10:53:16 -0700
Subject: resize strings after parsing

The parser uses `rb_str_buf_new` to allocate new strings. `rb_str_buf_new`
[has a minimum capacity of 128 bytes and never produces an embedded string](https://github.com/ruby/ruby/blob/9949407fd90c1c5bfe332141c75db995a9b867aa/string.c#L1119-L1135).
As a result, applications that parse JSON allocate extra memory for every
short string they parse.

For a real-world example, we can use the mime-types gem. The mime-types gem
stores all mime types inside a JSON file and parses them when you require the
gem.

Here is a sample program:

```ruby
require 'objspace'
require 'mime-types'

GC.start
GC.start

p ObjectSpace.memsize_of_all String
```

The example program loads the mime-types gem and outputs the total space used
by all strings. Here are the results of the program before and after this
patch:

**Before**

```
[aaron@TC json (memuse)]$ ruby test.rb
5497494
[aaron@TC json (memuse)]$
```

**After**

```
[aaron@TC json (memuse)]$ ruby -I lib:ext test.rb
3335862
[aaron@TC json (memuse)]$
```

This change results in a ~40% reduction in memory use for strings when loading
the mime-types gem (5497494 bytes down to 3335862 bytes).

Thanks @matthewd for finding the problem, and @nobu for the patch!
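
The size gap described above can be reproduced in plain Ruby, without the C extension. The sketch below is an analogy, not the parser's code: it assumes `String.new(capacity: 128)` (available since Ruby 2.4) as a stand-in for a string allocated with spare buffer room, and compares it with a right-sized string holding the same three bytes. Exact byte counts depend on the Ruby version and platform, so only the relative difference is meaningful.

```ruby
require 'objspace'

# Assumption: String.new(capacity:) stands in for an over-allocated parser
# buffer; the 128 mirrors the minimum mentioned in the commit message.
buffered = String.new(capacity: 128)
buffered << "abc"

# A short string built at its final size, with no reserved spare room.
compact = String.new("abc")

buffered_size = ObjectSpace.memsize_of(buffered)
compact_size  = ObjectSpace.memsize_of(compact)

puts "buffered: #{buffered_size} bytes"
puts "compact:  #{compact_size} bytes"
puts "same content: #{buffered == compact}"
```

On CRuby the first number should be noticeably larger than the second even though both strings compare equal. That difference is the per-string overhead that calling `rb_str_resize` with the final length is meant to remove.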
---
 ext/json/ext/parser/parser.rl | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ext/json/ext/parser/parser.rl b/ext/json/ext/parser/parser.rl
index f3933cb..157b001 100644
--- a/ext/json/ext/parser/parser.rl
+++ b/ext/json/ext/parser/parser.rl
@@ -527,6 +527,8 @@ static char *JSON_parse_string(JSON_Parser *json, char *p, char *pe, VALUE *resu
     if (json->symbolize_names && json->parsing_name) {
       *result = rb_str_intern(*result);
+    } else {
+      rb_str_resize(*result, RSTRING_LEN(*result));
     }
     if (cs >= JSON_string_first_final) {
         return p + 1;
--
cgit v1.2.1

From 4aae95f41b6b972245d15e52c46dbd5f278ff2c2 Mon Sep 17 00:00:00 2001
From: Aaron Patterson
Date: Thu, 3 Mar 2016 14:49:15 -0800
Subject: regenerate parser

---
 ext/json/ext/parser/parser.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ext/json/ext/parser/parser.c b/ext/json/ext/parser/parser.c
index 5b2e61c..ea95e71 100644
--- a/ext/json/ext/parser/parser.c
+++ b/ext/json/ext/parser/parser.c
@@ -1632,6 +1632,8 @@ case 7:
     if (json->symbolize_names && json->parsing_name) {
       *result = rb_str_intern(*result);
+    } else {
+      rb_str_resize(*result, RSTRING_LEN(*result));
     }
     if (cs >= JSON_string_first_final) {
         return p + 1;
--
cgit v1.2.1