Diffstat (limited to 'doc/developers/performance-use-case-analysis.txt')
 doc/developers/performance-use-case-analysis.txt | 113 +++++++++++++++++++++++
 1 file changed, 113 insertions(+), 0 deletions(-)
diff --git a/doc/developers/performance-use-case-analysis.txt b/doc/developers/performance-use-case-analysis.txt
new file mode 100644
index 0000000..9e6d8c7
--- /dev/null
+++ b/doc/developers/performance-use-case-analysis.txt
@@ -0,0 +1,113 @@
+.. This document describes _how_ to do use case analyses and what we want
+.. to get out of them; for the specific cases see the files referenced by
+.. performance-roadmap.txt
+
+Analysing a specific use case
+=============================
+
+The analysis of a use case needs to provide as outputs:
+ * The functional requirements that the use case has to satisfy.
+ * The file level operations and access patterns that will give the best
+ performance.
+ * A low friction API which will allow the use case to be implemented.
+ * The release of bzr (and thus the supported features) for which the analysis
+ was performed. The feature set of bzr defines the access patterns and data
+   required to implement any use case. When we add features, their design
+   changes the requirements for the parts of the system they alter, so we need
+ to re-analyse use cases when bzr's feature set changes. If future plans are
+ considered in the analysis with the intention of avoiding rework, these
+ should also be mentioned.
+
+Performing the analysis
+=======================
+
+The analysis needs to be able to define the characteristics of the
+involved disk storage and APIs. That means we need to examine the data
+required for the operation, in what order it is required, on both the
+read and write sides, and how that needs to be presented to be
+consistent with our layering.
+
+As a quick example: 'annotation of a file requires the file id to be looked
+up from the tree, the basis revision id from the tree, and then the text of
+that fileid-revisionid pair along with the creating revision id
+allocated to each line, and the dotted revision number of each of those
+revision ids.' All three of our key domain objects are involved here,
+but we haven't defined any characteristics of the API or disk facilities
+yet. We could then do that by saying something like 'the file-id lookup
+should degrade gracefully as trees become huge. Lookup of the tree basis id
+should be constant time. Retrieval of the annotated text should be roughly
+constant for any text of the same size regardless of the number of
+revisions contributing to its content. Mapping of the revision ids to
+dotted revnos could be done as the text is retrieved, but it's completely
+fine to post-process the annotated text to obtain dotted-revnos.'
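+
+As a rough illustration of that data flow, the sketch below walks the same
+steps in Python. The method names used here (``path2id``, ``basis_revision``,
+``annotated_text``, ``dotted_revno``) are hypothetical placeholders chosen for
+readability, not the current bzrlib API::
+
+    def annotate(tree, branch, repository, path):
+        # Tree level: map the path to a file id (should degrade gracefully
+        # as trees become huge) and find the basis revision (constant time).
+        file_id = tree.path2id(path)
+        basis_id = tree.basis_revision()
+
+        # Repository level: one retrieval of the annotated text; roughly
+        # constant cost for a given text size regardless of history length.
+        # 'annotated' is a sequence of (creating_revision_id, line) pairs.
+        annotated = repository.annotated_text(file_id, basis_id)
+
+        # Branch level: map each distinct revision id to a dotted revno.
+        # Doing this as a post-processing step is acceptable.
+        revnos = {}
+        for revision_id, line in annotated:
+            if revision_id not in revnos:
+                revnos[revision_id] = branch.dotted_revno(revision_id)
+        return [(revnos[revision_id], line)
+                for revision_id, line in annotated]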
+
+What factors should be considered?
+==================================
+
+Obviously, those that will make for an extremely fast system :). There
+are many possible factors, but the ones I think are most interesting to
+design with are:
+
+- baseline overhead:
+
+ - The time to get bzr ready to begin the use case.
+
+- scaling: how does performance change when any of the following aspects
+  of the system are ratcheted massively up or down (a measurement sketch
+  follows this list):
+
+ - number of files/dirs/symlinks/subtrees in a tree (both working and
+ revision trees)
+ - size of any particular file
+ - number of elements within a single directory
+ - length of symlinks
+ - number of changes to any file over time
+ (subordinately also the number of merges of the file)
+ - number of commits in the ancestry of a branch
+ (subordinately also the number of merges)
+ - number of revisions in a repository
+ - number of fileids in a repository
+ - number of ghosts in a given graph (revision or per-file)
+ - number of branches in a repository
+ - number of concurrent readers for a tree/branch/repository
+  - number of concurrent writers, for objects that support concurrent writes.
+ - latency to perform file operations (e.g. slow disks, network file systems,
+ our VFS layer and FTP/SFTP/etc)
+ - bandwidth to the disk storage
+ - latency to perform semantic operations (hpss specific)
+ - bandwidth when performing semantic operations.
+
+- locality of reference: If an operation requires data that is located
+ within a small region at any point, we often get better performance
+ than with an implementation of the same operation that requires the
+ same amount of data but with a lower locality of reference. It's
+ fairly tricky to add locality of reference after the fact, so I think
+  it's worth considering up front.
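+
+One way to make the scaling knobs concrete (the measurement sketch promised
+in the scaling item above) is to time a single operation while ratcheting one
+factor. The sketch is illustrative only: ``make_tree`` and ``operation`` stand
+in for whatever setup and command the analysis actually targets::
+
+    import time
+
+    def scaling_profile(make_tree, operation, sizes=(10, 100, 1000, 10000)):
+        """Time `operation` against trees of increasing size.
+
+        `make_tree(n)` builds a tree containing n files; `operation(tree)`
+        runs the use case under analysis. Comparing the timings shows
+        whether the operation is constant, linear or worse in this knob.
+        """
+        results = []
+        for n in sizes:
+            tree = make_tree(n)
+            start = time.time()
+            operation(tree)
+            results.append((n, time.time() - start))
+        return results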
+
+Using these factors, we can add to the annotate example that it's
+reasonable to do two 'semantic' round trips to the local tree, one to
+the branch object, and two to the repository. In file-operation level
+measurements, in an ideal world there would be no more than one round
+trip for each semantic operation. What there must not be is one round
+trip per revision involved in the revisionid->dotted number mapping, nor
+one per revision id attributed to a line in the text.
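+
+One lightweight way to keep such a budget honest is to count round trips
+while running the use case. The wrapper below is only a sketch: ``medium``
+and its ``send_request`` method are hypothetical stand-ins for whatever
+object actually issues semantic requests::
+
+    class RoundTripCounter(object):
+        """Count the requests sent through a (hypothetical) request medium."""
+
+        def __init__(self, medium):
+            self._medium = medium
+            self.count = 0
+
+        def send_request(self, *args, **kwargs):
+            # One call == one semantic round trip.
+            self.count += 1
+            return self._medium.send_request(*args, **kwargs)
+
+        def __getattr__(self, name):
+            # Everything else passes straight through to the real medium.
+            return getattr(self._medium, name)
+
+A benchmark can then run annotate through a counted medium and assert that
+``counter.count`` stays within the budget from the analysis (five semantic
+round trips in the example above).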
+
+Not all the items mentioned above are created equal. The analysis should
+include the parameters considered and the common-case values for each; the
+optimisation should be around the common cases, not around the exceptions.
+
+For instance, we have a smart server now, so we can treat file-level
+operations as relatively low latency and use that as the common case. At this
+point we intend to preserve the performance of dumb-protocol networking, but
+to focus on improving network performance via the smart server, and thus
+escape the latency considerations of file-level operations performed over the
+network.
+
+Many performance problems only become visible when changing the scaling knobs
+upwards to large trees. On small trees it's our baseline performance that drives
+incremental improvements; on large trees it's the amount of processing per item
+that drives performance. A significant goal therefore is to keep the amount of
+data to be processed under control. Ideally we can scale in a sublinear fashion
+for all operations, but we MUST NOT scale even linearly for operations that
+invoke a latency multiplier. For example, reading a file on disk requires
+finding the inode for the file, then the block with the data, and returning
+the contents. Because filesystems group data by directory, we pay a massive
+price when reading files if we do not group the reads of files within the
+same directory.
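+
+To illustrate the directory-grouping point, the sketch below orders a batch
+of reads so that files sharing a directory are read together. It is an
+example of the access-pattern choice only, not existing bzr code::
+
+    import os
+
+    def read_all(paths):
+        """Read many files, grouping the reads by containing directory.
+
+        Sorting by directory keeps consecutive reads within the same region
+        of the disk, which is usually far cheaper than reading the same
+        files in an arbitrary order.
+        """
+        contents = {}
+        for path in sorted(paths, key=os.path.dirname):
+            with open(path, 'rb') as f:
+                contents[path] = f.read()
+        return contents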