author    Donovan Baarda <abo@minkirri.apana.org.au>  2021-09-14 21:58:10 +1000
committer Donovan Baarda <abo@minkirri.apana.org.au>  2021-09-14 21:58:10 +1000
commit    4929960f7cb1c4387f90e6c251267bc835e43998 (patch)
tree      6dcdeb4672b1f128e438079202d90f51cdc2f024
parent    5c2f3a4ca805ab2687468559da27f473999013b1 (diff)
download  librsync-4929960f7cb1c4387f90e6c251267bc835e43998.tar.gz
Update scoop.c documentation to reflect the new scoop buffering behaviour.
Update the docstring so it correctly describes how data is processed directly from the input stream when it holds sufficient data.
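As an illustration of the behaviour this commit documents, the following is a minimal C sketch of a readahead that serves a request directly from the caller's buffer when it holds enough contiguous data, and otherwise consumes the whole input into an internal buffer and accumulates until enough has arrived. The names (scoop_t, scoop_readahead) are hypothetical and are not librsync's actual API; the real implementation lives in src/scoop.c.

#include <stdlib.h>
#include <string.h>

typedef struct {
    char *buf;     /* internal accumulation buffer */
    size_t alloc;  /* allocated size of buf */
    size_t avail;  /* bytes currently accumulated in buf */
} scoop_t;

/* Return a pointer to at least len contiguous bytes, or NULL if more input
 * is needed. Consumes from (*in, *in_len); the caller owns that buffer. */
static const char *scoop_readahead(scoop_t *s, size_t len,
                                   const char **in, size_t *in_len)
{
    if (s->avail == 0 && *in_len >= len) {
        /* Fast path: the caller's buffer already holds enough contiguous
         * data, so use it directly and leave the "tail" unconsumed. */
        const char *p = *in;
        *in += len;
        *in_len -= len;
        return p;
    }
    /* Slow path: the input was too small, so consume all of it into the
     * internal buffer and keep accumulating until we have len bytes. */
    if (s->alloc < len) {
        char *p = realloc(s->buf, len);
        if (!p)
            return NULL;  /* allocation failed; real code would report an error */
        s->buf = p;
        s->alloc = len;
    }
    size_t want = len - s->avail;
    size_t take = want < *in_len ? want : *in_len;
    memcpy(s->buf + s->avail, *in, take);
    s->avail += take;
    *in += take;
    *in_len -= take;
    return s->avail >= len ? s->buf : NULL;
}

The fast path copies nothing when the caller supplies large enough buffers; the slow path only grows the internal buffer to the requested readahead size.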
-rw-r--r--  src/scoop.c  25
1 file changed, 12 insertions, 13 deletions
diff --git a/src/scoop.c b/src/scoop.c
index cb3be98..6ac3999 100644
--- a/src/scoop.c
+++ b/src/scoop.c
@@ -28,23 +28,22 @@
/** \file scoop.c
* This file deals with readahead from caller-supplied buffers.
*
- * Many functions require a certain minimum amount of input to do their
- * processing. For example, to calculate a strong checksum of a block we need
- * at least a block of input.
+ * Many functions require a certain minimum amount of contiguous input data to
+ * do their processing. For example, to calculate a strong checksum of a block
+ * we need at least a block of input.
*
* Since we put the buffers completely under the control of the caller, we
* can't count on ever getting this much data all in one go. We can't simply
* wait, because the caller might have a smaller buffer than we require and so
- * we'll never get it. For the same reason we must always accept all the data
- * we're given.
- *
- * So, stream input data that's required for readahead is put into a special
- * buffer, from which the caller can then read. It's essentially like an
- * internal pipe, which on any given read request may or may not be able to
- * actually supply the data.
- *
- * As a future optimization, we might try to take data directly from the input
- * buffer if there's already enough there.
+ * we'll never get it.
+ *
+ * Stream input data is used directly if there is enough of it to satisfy a
+ * readahead request, otherwise it is copied and accumulated into an internal
+ * buffer until there is enough. This means that for large input buffers we
+ * can leave a "tail" of unprocessed data in the input buffer, and we only
+ * consume all the data and start accumulating into the internal buffer when
+ * the input buffer is too small. Provided the input buffers always contain
+ * enough data, we avoid copying into the internal buffer at all.
*
* \todo We probably know a maximum amount of data that can be scooped up, so
* we could just avoid dynamic allocation. However that can't be fixed at