author | Carlos Garnacho <carlosg@gnome.org> | 2021-05-27 18:40:04 +0200 |
---|---|---|
committer | Carlos Garnacho <carlosg@gnome.org> | 2021-08-26 14:04:23 +0200 |
commit | dd78ca9e52d316beb1679ba5fe33f7a8f23e3cc1 (patch) | |
tree | 82153e9fdef95b9ce1376f50715fced04227a899 /docs/reference | |
parent | 13b35ed91cbb409a8725c898caa3463d5115a92f (diff) | |
download | tracker-dd78ca9e52d316beb1679ba5fe33f7a8f23e3cc1.tar.gz |
docs: Port "performance" docs to markdown
Diffstat (limited to 'docs/reference')
-rw-r--r-- | docs/reference/libtracker-sparql/meson.build | 1 |
-rw-r--r-- | docs/reference/libtracker-sparql/performance.md | 103 |
-rw-r--r-- | docs/reference/libtracker-sparql/performance.xml | 142 |
3 files changed, 104 insertions, 142 deletions
diff --git a/docs/reference/libtracker-sparql/meson.build b/docs/reference/libtracker-sparql/meson.build
index 8e9f5b149..b528ac231 100644
--- a/docs/reference/libtracker-sparql/meson.build
+++ b/docs/reference/libtracker-sparql/meson.build
@@ -1,6 +1,7 @@
 content = [
   'overview.md',
   'limits.md',
+  'performance.md',
   'tutorial.md',
 ]
 
diff --git a/docs/reference/libtracker-sparql/performance.md b/docs/reference/libtracker-sparql/performance.md
new file mode 100644
index 000000000..7537657cb
--- /dev/null
+++ b/docs/reference/libtracker-sparql/performance.md
@@ -0,0 +1,103 @@
+Title: Performance dos and donts
+Slug: performance-dos-donts
+
+SPARQL is a very powerful query language. As it should be
+suspected, this means there are areas where performance is
+sacrificed for versatility.
+
+These are some tips to get the best of SPARQL as implemented
+by Tracker.
+
+## Avoid queries with unrestricted predicates
+
+Queries with unrestricted predicates are those like:
+
+```SPARQL
+SELECT ?p { <a> ?p 42 }
+```
+
+They involve lookups across all possible triples of
+an object, which roughly translates to a traversal
+through all tables and columns.
+
+The most pathological case is:
+
+```SPARQL
+SELECT ?s ?p ?o { ?s ?p ?o }
+```
+
+Which does retrieve every triple existing in the store.
+
+Queries with unrestricted predicates are most useful to
+introspect resources, or the triple store in its entirety.
+Production code should do this in rare occasions.
+
+## Avoid the negated property path
+
+The `!` negation operator in property paths negate the
+match. For example:
+
+```SPARQL
+SELECT ?s ?o { ?s !nie:url ?o }
+```
+
+This query looks for every other property that is not
+`nie:url`. The same reasoning than unrestricted predicates
+apply, since that specific query is equivalent to:
+
+```SPARQL
+SELECT ?s ?o {
+  ?s ?p ?o .
+  FILTER (?p != nie:url)
+}
+```
+
+## Specify graphs wherever possible
+
+Queries on the union graph, or with unrestricted graphs, for
+example:
+
+```SPARQL
+SELECT ?u { ?u a rdfs:Resource }
+SELECT ?g ?u { GRAPH ?g { ?u a rdfs:Resource }}
+```
+
+Will traverse across all graphs. Query complexity will increment
+linearly with the amount of graphs. Production code should rarely
+need to introspect graphs, and should strive to being aware of
+the graph(s) involved. The fastest case is accessing one graph.
+
+The graph(s) may be specified through
+`WITH / FROM / FROM NAMED / GRAPH` and other
+SPARQL syntax for graphs. For example:
+
+```SPARQL
+WITH <G> SELECT ?u { ?u a rdfs:Resource }
+WITH <G> SELECT ?g ?u { GRAPH ?g { ?u a rdfs:Resource }}
+```
+
+## Avoid substring matching
+
+Matching for regexp/glob/substrings defeats any index text fields
+could have. For example:
+
+```SPARQL
+SELECT ?u {
+  ?u nie:title ?title .
+  FILTER (CONTAINS (?title, "sideshow"))
+}
+```
+
+Will traverse all title strings looking for the substring. It is
+encouraged to use fulltext search for finding matches within strings
+where possible, for example:
+
+```SPARQL
+SELECT ?u { ?u fts:match "sideshow" }
+```
+
+## Use TrackerSparqlStatement
+
+Using [class@Tracker.SparqlStatement] allows to parse and compile
+a query once, and reuse it many times. Its usage
+is recommended wherever possible.
diff --git a/docs/reference/libtracker-sparql/performance.xml b/docs/reference/libtracker-sparql/performance.xml
deleted file mode 100644
index 0fd93455e..000000000
--- a/docs/reference/libtracker-sparql/performance.xml
+++ /dev/null
@@ -1,142 +0,0 @@
-<?xml version='1.0' encoding="ISO-8859-1"?>
-<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
-               "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
-<!ENTITY % local.common.attrib "xmlns:xi CDATA #FIXED 'http://www.w3.org/2001/XInclude'">
-]>
-
-<part id="tracker-performance">
-  <title>Performance dos and donts</title>
-  <partintro>
-    <para>
-      SPARQL is a very powerful query language. As it should be
-      suspected, this means there are areas where performance is
-      sacrificed for versatility.
-    </para>
-    <para>
-      These are some tips to get the best of SPARQL as implemented
-      by Tracker.
-    </para>
-  </partintro>
-
-  <chapter id="tracker-perf-unrestricted-predicates">
-    <title>Avoid queries with unrestricted predicates</title>
-    <para>
-      Queries with unrestricted predicates are those like:
-      <informalexample>
-        <programlisting language="SPARQL">
-          SELECT ?p { <a> ?p 42 }
-        </programlisting>
-      </informalexample>
-    </para>
-    <para>
-      They involve lookups across all possible triples of
-      an object, which roughly translates to a traversal
-      through all tables and columns.
-    </para>
-    <para>
-      The most pathological case is:
-      <informalexample>
-        <programlisting language="SPARQL">
-          SELECT ?s ?p ?o { ?s ?p ?o }
-        </programlisting>
-      </informalexample>
-    </para>
-    <para>
-      Which does retrieve every triple existing in the store.
-    </para>
-    <para>
-      Queries with unrestricted predicates are most useful to
-      introspect resources, or the triple store in its entirety.
-      Production code should do this in rare occasions.
-    </para>
-  </chapter>
-
-  <chapter id="tracker-perf-negated-property-path">
-    <title>Avoid the negated property path</title>
-    <para>
-      The <systemitem>!</systemitem> negation operator in property
-      paths negate the match. For example:
-      <informalexample>
-        <programlisting language="SPARQL">
-          SELECT ?s ?o { ?s !nie:url ?o }
-        </programlisting>
-      </informalexample>
-    </para>
-    <para>
-      This query looks for every other property that is not
-      <systemitem>nie:url</systemitem>. The same reasoning than
-      unrestricted predicates apply, since that specific query is
-      equivalent to:
-      <informalexample>
-        <programlisting language="SPARQL">
-          SELECT ?s ?o { ?s ?p ?o .
-                         FILTER (?p != nie:url) }
-        </programlisting>
-      </informalexample>
-    </para>
-  </chapter>
-
-  <chapter id="tracker-perf-graphs">
-    <title>Specify graphs wherever possible</title>
-    <para>
-      Queries on the union graph, or with unrestricted graphs, for
-      example:
-      <informalexample>
-        <programlisting language="SPARQL">
-          SELECT ?u { ?u a rdfs:Resource }
-          SELECT ?g ?u { GRAPH ?g { ?u a rdfs:Resource }}
-        </programlisting>
-      </informalexample>
-
-      Will traverse across all graphs. Query complexity will increment
-      linearly with the amount of graphs. Production code should rarely
-      need to introspect graphs, and should strive to being aware of
-      the graph(s) involved. The fastest case is accessing one graph.
-    </para>
-    <para>
-      The graph(s) may be specified through
-      <systemitem>WITH/FROM/FROM NAMED/GRAPH</systemitem> and other
-      SPARQL syntax for graphs. For example:
-      <informalexample>
-        <programlisting language="SPARQL">
-          WITH <G> SELECT ?u { ?u a rdfs:Resource }
-          WITH <G> SELECT ?g ?u { GRAPH ?g { ?u a rdfs:Resource }}
-        </programlisting>
-      </informalexample>
-    </para>
-  </chapter>
-
-  <chapter id="tracker-perf-avoid-contains">
-    <title>Avoid substring matching</title>
-    <para>
-      Matching for regexp/glob/substrings defeats any index text fields
-      could have. For example:
-
-      <informalexample>
-        <programlisting language="SPARQL">
-          SELECT ?u { ?u nie:title ?title .
-                      FILTER (CONTAINS (?title, "sideshow")) }
-        </programlisting>
-      </informalexample>
-
-      Will traverse all title strings looking for the substring. It is
-      encouraged to use fulltext search for finding matches within strings
-      where possible, for example:
-
-      <informalexample>
-        <programlisting language="SPARQL">
-          SELECT ?u { ?u fts:match "sideshow" }
-        </programlisting>
-      </informalexample>
-    </para>
-  </chapter>
-
-  <chapter id="tracker-perf-use-statements">
-    <title>Use TrackerSparqlStatement</title>
-    <para>
-      Using <type><link linkend="TrackerSparqlStatement">TrackerSparqlStatement</link></type>
-      allows to parse and compile a query once, and reuse it many times. Its usage
-      is recommended wherever possible.
-    </para>
-  </chapter>
-</part>
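
Reviewer note on the "Avoid substring matching" section of the ported document: the claim that substring matching defeats any index can be checked outside of Tracker with plain SQLite, the database engine Tracker builds on. The following is only a rough sketch under stated assumptions. The `resource` table, its `resource_title` index and the sample titles are hypothetical and are not Tracker's real schema, and `query_plan` is a helper written for this note. It asks SQLite for its query plan for an exact match and for a substring match on the same indexed column.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable steps of SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The textual description is the last column of each plan row.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")

# Hypothetical stand-in table; Tracker's actual database layout differs.
conn.execute("CREATE TABLE resource (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE INDEX resource_title ON resource (title)")
conn.executemany(
    "INSERT INTO resource (title) VALUES (?)",
    [("sideshow bob",), ("krusty",), ("sideshow mel",)],
)

# Exact match: the planner can seek directly into the index.
exact_plan = query_plan(
    conn, "SELECT id FROM resource WHERE title = ?", ("krusty",)
)

# Substring match: every title has to be inspected, index or not.
substring_plan = query_plan(
    conn, "SELECT id FROM resource WHERE instr(title, ?) > 0", ("sideshow",)
)

print("exact:    ", exact_plan)
print("substring:", substring_plan)
```

The exact wording of the plan varies between SQLite versions, but the exact match should be reported as a `SEARCH` step that uses the index, while the substring match should be reported as a `SCAN` over all rows. This is the same cost the document attributes to `FILTER (CONTAINS (...))`, and the reason it recommends `fts:match` instead.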