core: handle FTS search terms individually
See merge request https://gitlab.gnome.org/GNOME/tracker/-/merge_requests/585
Test matching of disjoint terms, and usage of explicit quotes
to make exact matches.
We specify G_TYPE_INT64 but pass an integer of unspecified type that ends
up being 32-bit. This wreaks havoc in GLib's GValue varargs collection code.
Specify the right type in our tests, so that we don't hit this bug.
Closes: https://gitlab.gnome.org/GNOME/tracker/-/issues/397
Make unicode library a module
Closes #396
See merge request https://gitlab.gnome.org/GNOME/tracker/-/merge_requests/581
This object is only created to be handed to the parser; create it from
there instead, and detach TrackerLanguage from the rest of the code.
This is somewhat pointless boilerplate; get rid of it.
The HAVE_LIBICU/LIBUNISTRING defines should not be needed for
anything now.
This function was pushing our guarantees a bit too hard. While that works
on direct connections, remote connections may have their order inverted
by the asynchronicity involved in writing updates through pipes, so we
may end up with "heisenbugs" like:
Tracker:ERROR:../tests/libtracker-sparql/tracker-batch-test.c:189:assert_photo: assertion failed (tracker_sparql_cursor_get_integer (cursor, 1) == horizontal_res): (234 == 123)
The one guarantee we can make is: after execute_finish() returns, the
batch was inserted. So adapt the existing test and add a new one to
probe that.
Take better care of memory handling, to put actual library leaks in
the spotlight.
The Python community mostly uses the standard code style defined by the
`black` tool these days; let's do the same.
Python convention is to use `_` in file names so they can be imported,
and to prefix all test modules with `test_`.
All config files are now in the `config/` subdir and all data files are
now in the `data/` subdir.
This debug flag forces an FTS integrity check after every set of FTS
updates, and raises an error if the integrity check fails. This is a
more proactive (and expensive) approach to finding FTS index corruption.
To make this helpful right away, toggle this flag on for our own test
suite, so that CI may catch any remaining or newly appearing issues.
Test async ops, update APIs, and check that statements return the
expected types.
Graphs are currently the last component in our iteration through
tables. Move them to the top of the loop; while querying things in one
order or the other does not affect performance, this provides better
natural clustering of data for the serialization formats that handle
graphs (e.g. TriG).
Currently, the virtual table queries all properties individually,
even single-valued ones that live together with other single-valued
properties in the class table.
Since we no longer have to deal with value-to-string conversions
for the upper layers, it is now easier to query these single-valued
properties all together in a single query to the whole class table.
Avoid the string conversion performed by the tracker_triples table,
and rely on the additional object type hints we get along with it.
This coincidentally also fixes isBlank() for these objects, when the
type is a non-literal (e.g. a link to an rdfs:Resource).
This function should return true/false, as per
https://www.w3.org/TR/sparql11-query/#func-isBlank; we were
returning true/unbound.
This is only a partial fix though, as "SELECT isBlank(?o) { ?s ?p ?o }"
would still return an unbound value for non-literal properties.
This had broken indentation and some extra whitespace.
The GVDB repository recently gained a minimal meson.build file that
allows building it as a subproject without additional hassles
(e.g. shipping supporting files at /subprojects/repofiles/).
Drop our internal copy of GVDB in favor of a subproject built
through Meson.
Since we're jumping across many years of updates, there have been GVDB
API changes that we need to adapt to: GvdbTable is no longer a
refcounted object, and gvdb_table_walk() is no longer offered to
iterate across values.
These largely affect our own set of GVDB tests though; the test
for gvdb_table_walk() was dropped, and so is the ref/unref one
(it basically does the same as gvdb/flat_strings, after dropping
the refcounting). The remaining tests stay useful, and should
ideally move into the GVDB repository, so that it can run as a
separate suite here.
Do not rely on our loose interpretation of IRIs to make tests work.
This prefix is used in some of our base SPARQL API (e.g. fts:match
and the misc functions to deal with results), but the prefix is
currently defined on the Nepomuk ontology side.
This forces applications that want to use FTS with custom ontologies
to either declare the prefix themselves, or amend their queries with
`PREFIX` syntax to add the missing stock prefix.
Move this prefix definition to the base ontology, so all databases
inherit the builtin fts: prefix and it can be used right away in
FTS queries.
The 0x20 character should also be escaped as per the SPARQL reference,
and it correctly is when setting a TrackerResource IRI. However, the
fast-path check for the presence of characters that should be escaped
is missing it, so IRIs whose only invalid character is 0x20 could be
let through as valid.
Since 0x20 (whitespace) is possibly the most ubiquitous character that
should be escaped, this is a bit of an oversight.
Fixes: 33031007c ("libtracker-sparql: Escape illegal characters in IRIREF...")
Currently, all IRIREFs going through SPARQL updates are validated to ensure
their characters are in the expected set (https://www.w3.org/TR/sparql11-query/#rIRIREF),
while TrackerResource is pretty liberal about the characters used in a
TrackerResource identifier or IRI reference.
This disagreement has 2 possible outcomes:
- If a resource containing illegal characters is inserted via print_sparql_update(),
print_rdf() or the like, errors will show up when handling the SPARQL update.
- If the resource is directly inserted via TrackerBatch or update_resource(), the
validation step will be bypassed, ending up with an IRI that contains illegal
characters as per the SPARQL grammar.
In order to make TrackerResource friendly to e.g. sloppy IRI composition, and to
avoid these ugly situations when an illegal char sneaks in, make it escape IRIs as
defined by IRIREF in the SPARQL grammar. This way, every method of insertion
will succeed and be as correct as possible with the given input.
Also, add tests for this behavior, to ensure we escape what should be escaped.
Currently, our caching of triples has a number of nested structures:
- In the buffer there is a struct for each graph
- In the graph struct there is a set of changed resources
- In the resource struct there is a set of modified tables
- In the table struct there is a set of modified properties
- In the property struct there is a list of values
This incurs a maintenance cost that is higher than desired; adding and
removing elements here becomes a fair chunk of the time spent in updates,
since a number of allocations and list/hashtable updates are performed
for batches that deal with a fair amount of different resources
(i.e. most of them).
In order to improve this, use two arrays to buffer this data:
- A "properties" array, which keeps individual predicate/object pairs. This
is used to store the values of properties being inserted or deleted, for
both single-valued and multi-valued properties. Each struct is "linked" with
(i.e. references) other elements in the array, so that e.g. class updates
may reference multiple properties/values being updated.
- An "update log" array, containing structs that are an event_type/graph/
subject tuple, plus optionally a link to one of the properties in the
previous array; all other properties are fetched by iterating through
the linked properties. These log entries are valid for class table updates
(i.e. single-valued properties) or multi-valued property tables.
These arrays make allocating the buffer a one-time operation (the buffer
size is fixed, and the arrays are reused during the processing of a
TrackerBatch) and make insertions into the log largely O(1), as opposed
to a number of array/hashtable lookups and inserts.
We still want to coalesce updates to the same class table (e.g. changes to
several single-valued properties in the same table); for that, there is an
additional hashtable set that uses these log entries themselves as keys,
with special hash/equal functions, so lookups for prior events modifying
the same TrackerClass are also quite fast.
Overall, this makes the maintenance of this buffer less expensive in the
big picture. Even though there are still some remnants of the previous
caching for graphs and resources, they play less of a role.
Since this changes the ordering of updates, some tests that rely on implicit
ordering (the DESCRIBE ones) had to be adapted for this change.
For nrl:modified, we had some double handling, since the modifier was
pushed as a property change but also dealt with directly when flushing
the buffer.
Since these are essentially properties (even though changed automatically),
it makes sense to stick to the former, also for nrl:added. Do that, and
drop the special handling of these 2 properties when flushing the buffer.
This commit also changes the expected output of a serialization test:
since nrl:added is now added on more graphs than the first one that sees
a resource, the spacing changes slightly after the test filters out
nrl:added/nrl:modified.
When checking the result of a function that has both a return value and
a GError out argument, checking the error first and then the return
value often gives better clues.
The order of the returned resultset was implicit and up to SQLite, and
the order for this test changed starting with SQLite 3.39.0.
Make the order explicit, so that SQLite implementation details don't
leak up here.
Closes: https://gitlab.gnome.org/GNOME/tracker/-/issues/370
These tests take some RDF, deserialize it into a local connection,
serialize it again, and deserialize that into a D-Bus connection.
Then queries are run on both connections to verify that the data matches.
We already wrap this with a prefix that adds location/line/col, so this
information was inconsistently duplicated.
This test is currently ineffective (it finds an error handling a duplicate
@prefix, but doesn't quite catch it or test anything worthwhile),
and it will actually break with the changes to come (arguably for the
better, since the error is now actually propagated).
Comment out this test for now; we do need to handle @prefix possibly
overwriting a previous value, and we need to test that behavior properly.
Avoid making deserializers depend on it just because error propagation
was fixed here. This will be improved in the future.
libtracker-sparql: Fix handling of partial FTS deletion
Closes #361
See merge request GNOME/tracker!510