author | Carlos Garnacho <carlosg@gnome.org> | 2022-08-15 20:46:34 +0200 |
---|---|---|
committer | Carlos Garnacho <carlosg@gnome.org> | 2022-08-30 17:38:57 +0200 |
commit | 17bf28377f45b2f134b05891ebf36512655f98e7 (patch) | |
tree | 419d1f70d78f27f908e86eca1a9f1e181f9fc671 /tests | |
parent | f8203968b8cc59af09dc72897947871ff860f4ec (diff) | |
download | tracker-17bf28377f45b2f134b05891ebf36512655f98e7.tar.gz |
core: Refactor buffering of database updates
Currently, our caching of triples uses a number of nested structures:
- In the buffer there is a struct for each graph
- In the graph struct there is a set of changed resources
- In the resource struct there is a set of modified tables
- In the table struct there is a set of modified properties
- In the property struct there is a list of values
This incurs a maintenance cost that is higher than desired: adding and
removing elements here becomes a fair chunk of the time spent in updates,
since a number of allocations and list/hashtable updates are performed
for batches that deal with a fair amount of different resources
(i.e. most of them).
In order to improve this, use two arrays to buffer this data:
- A "properties" array, that keeps individual predicate/object pairs. This
is used to store the values of properties being inserted or deleted for
single-valued and multivalued properties. This struct is "linked" with
(i.e. references) other elements in the array, so that e.g. class updates
may reference multiple properties/values being updated.
- An "update log" array, containing structs that are an event_type/graph/
subject tuple, plus optionally a link to one of the properties in the
previous array; all other properties are fetched by iterating through
the linked properties. These log entries are valid for class table updates
(i.e. single-valued properties) or multi-valued property tables.
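
The two arrays described above could be laid out roughly as follows. This is a minimal sketch in plain C, not the code from this commit: the names `PropertyEntry`, `LogEntry`, `UpdateBuffer`, `BUFFER_CAPACITY` and the `buffer_*` helpers are hypothetical, and the linking is assumed to be done through array indices.

```c
#include <assert.h>
#include <stddef.h>

#define BUFFER_CAPACITY 64 /* hypothetical fixed buffer size */
#define NO_LINK (-1)

typedef enum { EVENT_INSERT, EVENT_DELETE } EventType;

/* One predicate/object pair. `prev` links to another element of the
 * same array, so a log entry can reach every value it covers. */
typedef struct {
	const char *predicate;
	const char *object;
	int prev; /* index of the previously linked property, or NO_LINK */
} PropertyEntry;

/* One event_type/graph/subject tuple, optionally linked to a property. */
typedef struct {
	EventType type;
	const char *graph;
	const char *subject;
	int last_property; /* most recently linked property, or NO_LINK */
} LogEntry;

/* Both arrays are allocated once and reused across updates. */
typedef struct {
	PropertyEntry properties[BUFFER_CAPACITY];
	size_t n_properties;
	LogEntry log[BUFFER_CAPACITY];
	size_t n_log;
} UpdateBuffer;

/* Reusing the buffer only resets the lengths; nothing is freed. */
static void
buffer_reset (UpdateBuffer *buf)
{
	buf->n_properties = 0;
	buf->n_log = 0;
}

/* Appends a log entry in O(1) and returns its index. */
static int
buffer_log_event (UpdateBuffer *buf, EventType type,
                  const char *graph, const char *subject)
{
	assert (buf->n_log < BUFFER_CAPACITY);
	LogEntry *entry = &buf->log[buf->n_log];

	entry->type = type;
	entry->graph = graph;
	entry->subject = subject;
	entry->last_property = NO_LINK;
	return (int) buf->n_log++;
}

/* Appends a property in O(1) and links it to the given log entry. */
static void
buffer_add_property (UpdateBuffer *buf, int log_idx,
                     const char *predicate, const char *object)
{
	assert (buf->n_properties < BUFFER_CAPACITY);
	PropertyEntry *prop = &buf->properties[buf->n_properties];

	prop->predicate = predicate;
	prop->object = object;
	prop->prev = buf->log[log_idx].last_property;
	buf->log[log_idx].last_property = (int) buf->n_properties++;
}

/* Walks the linked properties of a log entry and counts them. */
static size_t
buffer_count_properties (const UpdateBuffer *buf, int log_idx)
{
	size_t n = 0;

	for (int i = buf->log[log_idx].last_property; i != NO_LINK;
	     i = buf->properties[i].prev)
		n++;
	return n;
}
```

In this sketch an update is recorded with one `buffer_log_event()` call followed by one `buffer_add_property()` call per value, and flushing would walk the chain the same way `buffer_count_properties()` does.
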
These arrays make allocating the buffer a one-time operation (buffer size
is fixed, and the arrays are reused during the processing of a TrackerBatch)
and insertions into the log largely O(1) as opposed to a number of
array/hashtable lookups and inserts.
But we still want to coalesce updates to the same class table (e.g. changes
to several single-valued properties in the same table). For that there is an
additional hashtable set that uses these log entries as keys themselves,
with special hash/equal functions; lookups for prior events modifying the
same TrackerClass are also quite fast.
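
The coalescing step can be illustrated with a small open-addressing set whose slots hold indices into the log array, so the log entries act as the keys. This is a sketch under assumptions, not the implementation in this commit: `CoalescingLog`, `log_entry_hash()`, `log_entry_equal()` and `coalescing_log_record()` are hypothetical names, a plain string stands in for the TrackerClass, and the key is assumed to be graph/subject/class.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOG_CAPACITY 64
#define SET_SLOTS 128 /* power of two, larger than LOG_CAPACITY */
#define EMPTY_SLOT (-1)

typedef struct {
	const char *graph;
	const char *subject;
	const char *class_table; /* hypothetical stand-in for TrackerClass */
	int n_changes;           /* updates coalesced into this entry */
} LogEntry;

typedef struct {
	LogEntry log[LOG_CAPACITY];
	size_t n_log;
	int slots[SET_SLOTS]; /* indices into `log`, or EMPTY_SLOT */
} CoalescingLog;

/* FNV-1a style mixing of one string into a running hash. */
static uint32_t
hash_str (uint32_t h, const char *s)
{
	for (; *s; s++) {
		h ^= (unsigned char) *s;
		h *= 16777619u;
	}
	return h;
}

/* "Special" hash: only the fields that identify a class table row. */
static uint32_t
log_entry_hash (const LogEntry *e)
{
	uint32_t h = 2166136261u;

	h = hash_str (h, e->graph);
	h = hash_str (h, e->subject);
	return hash_str (h, e->class_table);
}

/* "Special" equal: ignores n_changes, compares the identifying fields. */
static int
log_entry_equal (const LogEntry *a, const LogEntry *b)
{
	return strcmp (a->graph, b->graph) == 0 &&
	       strcmp (a->subject, b->subject) == 0 &&
	       strcmp (a->class_table, b->class_table) == 0;
}

static void
coalescing_log_init (CoalescingLog *cl)
{
	cl->n_log = 0;
	for (size_t i = 0; i < SET_SLOTS; i++)
		cl->slots[i] = EMPTY_SLOT;
}

/* Returns the index of the log entry covering this update, appending
 * a new entry only if no prior entry matches. */
static int
coalescing_log_record (CoalescingLog *cl, const char *graph,
                       const char *subject, const char *class_table)
{
	LogEntry key = { graph, subject, class_table, 0 };
	uint32_t slot = log_entry_hash (&key) & (SET_SLOTS - 1);

	/* Linear probing; terminates because the set is never full. */
	while (cl->slots[slot] != EMPTY_SLOT) {
		LogEntry *prior = &cl->log[cl->slots[slot]];

		if (log_entry_equal (prior, &key)) {
			prior->n_changes++;
			return cl->slots[slot];
		}
		slot = (slot + 1) & (SET_SLOTS - 1);
	}

	assert (cl->n_log < LOG_CAPACITY);
	key.n_changes = 1;
	cl->log[cl->n_log] = key;
	cl->slots[slot] = (int) cl->n_log;
	return (int) cl->n_log++;
}
```

Storing indices rather than copies keeps the set small and means a hit hands back the existing log entry directly, which is what allows a second change to the same subject and class table to be merged instead of appended.
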
Overall, this makes the maintenance of this buffer less expensive in the
big picture. Even though there are still some remnants of the previous
caching for graphs and resources, this plays less of a role.
Since this changes the ordering of updates, some tests that rely on implicit
ordering (DESCRIBE ones) had to be adapted for this change.
Diffstat (limited to 'tests')
-rw-r--r-- | tests/core/describe/describe-multiple.out | 10 |
-rw-r--r-- | tests/core/tracker-sparql-test.c | 5 |
2 files changed, 8 insertions, 7 deletions
diff --git a/tests/core/describe/describe-multiple.out b/tests/core/describe/describe-multiple.out
index 8edc0fd45..dadbfbb86 100644
--- a/tests/core/describe/describe-multiple.out
+++ b/tests/core/describe/describe-multiple.out
@@ -1,8 +1,8 @@
 "b" "http://example/relation" "z"
 "c" "http://example/relation" "x"
 "d" "http://example/relation" "z"
-"z" "http://example/title" "titleZ"
 "x" "http://example/title" "titleX"
+"z" "http://example/title" "titleZ"
 "b" "http://example/number" "73"
 "c" "http://example/number" "113"
 "b" "http://example/date" "2001-01-01T00:00:01Z"
@@ -10,13 +10,13 @@
 "b" "http://example/name" "nameB"
 "c" "http://example/name" "nameC"
 "d" "http://example/name" "nameD"
-"z" "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" "http://www.w3.org/2000/01/rdf-schema#Resource"
-"z" "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" "http://example/B"
-"x" "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" "http://www.w3.org/2000/01/rdf-schema#Resource"
-"x" "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" "http://example/B"
 "b" "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" "http://www.w3.org/2000/01/rdf-schema#Resource"
 "b" "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" "http://example/A"
 "c" "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" "http://www.w3.org/2000/01/rdf-schema#Resource"
 "c" "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" "http://example/A"
 "d" "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" "http://www.w3.org/2000/01/rdf-schema#Resource"
 "d" "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" "http://example/A"
+"x" "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" "http://www.w3.org/2000/01/rdf-schema#Resource"
+"x" "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" "http://example/B"
+"z" "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" "http://www.w3.org/2000/01/rdf-schema#Resource"
+"z" "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" "http://example/B"
diff --git a/tests/core/tracker-sparql-test.c b/tests/core/tracker-sparql-test.c
index b06a26ffa..0da6acafa 100644
--- a/tests/core/tracker-sparql-test.c
+++ b/tests/core/tracker-sparql-test.c
@@ -468,13 +468,14 @@ check_result (TrackerSparqlCursor *cursor,
 		gchar *diff;
 
 		quoted_results = g_shell_quote (test_results->str);
-		command_line = g_strdup_printf ("echo -n %s | diff -u %s -", quoted_results, results_filename);
+		command_line = g_strdup_printf ("echo -n %s | diff -uZ %s -", quoted_results, results_filename);
 		quoted_command_line = g_shell_quote (command_line);
 		shell = g_strdup_printf ("sh -c %s", quoted_command_line);
 		g_spawn_command_line_sync (shell, &diff, NULL, NULL, &error);
 		g_assert_no_error (error);
 
-		g_error ("%s", diff);
+		if (diff && *diff)
+			g_error ("%s", diff);
 
 		g_free (quoted_results);
 		g_free (command_line);