summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
...
* PFDEBUG DECODE added.antirez2014-04-161-0/+35
| | | | | | Provides a human readable description of the opcodes composing a run-length encoded HLL (sparse encoding). The command is only useful for debugging / development tasks.
* PFDEBUG added, PFGETREG removed.antirez2014-04-163-9/+25
| | | | | PFDEBUG will be the interface to do debugging tasks with a key containing an HLL object.
* hllSparseToDense API changed to take ref to object.antirez2014-04-161-6/+10
| | | | | | The new API takes directly the object doing everything needed to turn it into a dense representation, including setting the new representation as object->ptr.
* hllSparseAdd() sanity check for span != 0 added.antirez2014-04-161-0/+3
|
* Fix hllSparseAdd() new sequence replacement when next is NULL.antirez2014-04-161-4/+2
| | | | | sdsIncrLen() must be called anyway even if we are replacing the last oppcode of the sparse representation.
* Fix seqlen computation in hllSparseAdd().antirez2014-04-161-1/+1
|
* Abstract hllSparseAdd() / hllDenseAdd() via hllAdd().antirez2014-04-161-4/+19
|
* hllSparseSum(): multiply 1 * runlen for zero entries.antirez2014-04-161-2/+2
|
* Macro HLL_SPARSE_XZERO_LEN fixed.antirez2014-04-161-1/+1
|
* Fix HLL sparse object creation #2.antirez2014-04-161-2/+2
| | | | Two vars initialized to wrong values in createHLLObject().
* Increment pointer while iterating sparse HLL object.antirez2014-04-161-0/+6
|
* Fix HLL sparse object creation.antirez2014-04-161-2/+2
| | | | | The function didn't considered the fact that each XZERO opcode is two bytes.
* Create HyperLogLog objects with sparse encoding.antirez2014-04-161-10/+28
|
* HyperLogLog sparse to dense conversion function.antirez2014-04-161-3/+44
|
* HyperLogLog sparse representation initial implementation.antirez2014-04-161-9/+269
| | | | | | | | | Code never tested, but the basic layout is shaped in this commit. Also missing: 1) Sparse -> Dense conversion function. 2) New HLL object creation using the sparse representation. 3) Implementation of PFMERGE for the sparse representation.
* hllCount() refactored to support multiple representations.antirez2014-04-161-34/+62
|
* hllAdd() refactored into two functions.antirez2014-04-161-24/+35
| | | | Also dense representation access macro renamed accordingly.
* HyperLogLog refactoring to support different encodings.antirez2014-04-161-99/+135
| | | | | | | Metadata are now placed at the start of the representation as an header. There is a proper structure to access the representation. Still work to do in order to truly abstract the implementation from the representation, commands still work assuming dense representation.
* HyperLogLog sparse representation slightly modified.antirez2014-04-161-39/+43
| | | | | | | After running a few simulations with different alternative encodings, it was found that the VAL opcode performs better using 5 bits for the value and 2 bits for the run length, at least for cardinalities in the range of interest.
* HyperLogLog sparse representation description and macros.antirez2014-04-161-3/+104
|
* Add casting to match printf format.antirez2014-04-161-5/+9
| | | | | | | adjustOpenFilesLimit() and clusterUpdateSlotsWithConfig() that were assuming uint64_t is the same as unsigned long long, which is true probably for all the systems out there that we target, but still GCC emitted a warning since technically they are two different types.
* ZRANGEBYLEX and ZREVRANGEBYLEX implementation.antirez2014-04-163-2/+456
|
* PFCOUNT: always unshare/decode the object.antirez2014-04-161-3/+1
| | | | | | This will be a non-op most of the times since the object will be unshared / decoded, however it is more technically correct to start this way since the object may be decoded even in the read-only code path.
* Changed HyperLogLog hash seed to a non-zero value.antirez2014-04-161-1/+1
| | | | | | | | | | | | | | | Using a seed of zero has the side effect of having the empty string hashing to what is a very special case in the context of HyperLogLog: a very long run of zeroes. This did not influenced the correctness of the result with 16k registers because of the harmonic mean, but still it is inconvenient that a so obvious value maps to a so special hash. The seed 0xadc83b19 is used instead, which is the first 64 bits of the SHA1 of the empty string. Reference: issue #1657.
* Initial HyperLogLog tests.antirez2014-04-162-0/+69
|
* Return "WRONGTYPE" error on PF* type mismatch.antirez2014-04-161-1/+3
|
* Fix PFADD infinite loop.antirez2014-04-161-6/+3
| | | | | | | | We need to guarantee that the last bit is 1, otherwise an element may hash to just zeroes with probability 1/(2^64) and trigger an infinite loop. See issue #1657.
* Make hll-gnuplot-graph.rb callable from cli.antirez2014-04-161-4/+5
|
* Remove HyperLogLog type checking duplicated code.antirez2014-04-161-45/+21
|
* PFGETREG added for testing purposes.antirez2014-04-163-2/+42
| | | | | The new command allows to get a dump of the registers stored into an HyperLogLog data structure for testing / debugging purposes.
* PFCOUNT: unshare the object when cached cardinality is modified.antirez2014-04-161-0/+3
|
* PFSELFTEST improved to test the approximation error.antirez2014-04-161-6/+35
|
* hll-gnuplot-graph.rb improved with new filter.antirez2014-04-161-13/+22
| | | | | | The function to generate graphs is also more flexible as now includes step and max value. The step of the samples generation function is no longer limited to min step of 1000.
* HyperLogLog: added magic / version.antirez2014-04-161-25/+45
| | | | | | | This will allow future changes like compressed representations. Currently the magic is not checked for performance reasons but this may change in the future, for example if we add new types encoded in strings that may have the same size of HyperLogLogs.
* Fixed pfadd/pfcount commands emitting hll* events instead of pf* eventsRaymond Myers2014-04-161-2/+2
|
* Change HLL* to PF* in error messagesRaymond Myers2014-04-161-3/+3
|
* Include redis.h before other stuff in hyperloglog.c.antirez2014-04-161-1/+2
| | | | | Otherwise fmacros.h is included later and this may break compilation on different systems.
* HyperLogLog API prefix modified from "P" to "PF".antirez2014-04-165-19/+19
| | | | Using both the initials of Philippe Flajolet instead of just "P".
* Makefile.dep updated with hyperloglog.o deps.antirez2014-04-161-0/+4
|
* HyperLogLog: make API use the P prefix in honor of Philippe Flajolet.antirez2014-04-165-19/+19
|
* HLLMERGE fixed by adding a... missing loop!antirez2014-04-161-5/+7
|
* hll-gnuplot-graph.rb: added new filter "all".antirez2014-04-161-5/+14
|
* HyperLogLog apply bias correction using a polynomial.antirez2014-04-161-11/+18
| | | | | | | | | | | | Better results can be achieved by compensating for the bias of the raw approximation just after 2.5m (when LINEARCOUNTING is no longer used) by using a polynomial that approximates the bias at a given cardinality. The curve used was found using this web page: http://www.xuru.org/rt/PR.asp That performs polynomial regression given a set of values.
* HLLMERGE implemented.antirez2014-04-163-1/+67
| | | | | Merge N HLL data structures by selecting the max value for every M[i] register among the set of HLLs.
* HLLCOUNT is technically a write commandantirez2014-04-161-0/+6
| | | | | When we update the cached value, we need to propagate the command and signal the key as modified for WATCH.
* HLLADD: propagate write when only variable name is given.antirez2014-04-161-0/+1
| | | | | | | | | | | | | The following form is given: HLLADD myhll No element is provided in the above case so if 'myhll' var does not exist the result is to just create an empty HLL structure, and no update will be performed on the registers. In this case, the DB should still be set dirty and the command propagated.
* hll-gnuplot-graph.rb: Use |error| when filter is :maxantirez2014-04-161-0/+1
|
* Ignore txt files inside utils/hyperloglog.antirez2014-04-161-0/+1
| | | | Those are generated to trace graphs using gnuplot.
* HyperLogLog: use LINEARCOUNTING up to 3m.antirez2014-04-161-3/+11
| | | | | | The HyperLogLog original paper suggests using LINEARCOUNTING for cardinalities < 2.5m, however for P=14 the median / max error curves show that a value of '3' is the best pick for m = 16384.
* hll-gnuplot-graph.rb added to plot HyperLogLog error graphs.antirez2014-04-161-0/+68
|