<feed xmlns='http://www.w3.org/2005/Atom'>
<title>delta/git.git/hashmap.h, branch ps/docs-diffcore</title>
<subtitle>github.com: git/git.git
</subtitle>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/git.git/'/>
<entry>
<title>hashmap: add string interning API</title>
<updated>2014-07-07T20:56:38+00:00</updated>
<author>
<name>Karsten Blees</name>
<email>karsten.blees@gmail.com</email>
</author>
<published>2014-07-02T22:22:54+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/git.git/commit/?id=7b64d42d22206d9995a8f0cb3b515e623cac4702'/>
<id>7b64d42d22206d9995a8f0cb3b515e623cac4702</id>
<content type='text'>
Interning short strings with high probability of duplicates can reduce the
memory footprint and speed up comparisons.

Add strintern() and memintern() APIs that use a hashmap to manage the pool
of unique, interned strings.

Note: strintern(getenv()) could be used to sanitize git's use of getenv(),
in case we ever encounter a platform where a call to getenv() invalidates
previous getenv() results (which is allowed by POSIX).

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Interning short strings with high probability of duplicates can reduce the
memory footprint and speed up comparisons.

Add strintern() and memintern() APIs that use a hashmap to manage the pool
of unique, interned strings.

Note: strintern(getenv()) could be used to sanitize git's use of getenv(),
in case we ever encounter a platform where a call to getenv() invalidates
previous getenv() results (which is allowed by POSIX).

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>hashmap: add simplified hashmap_get_from_hash() API</title>
<updated>2014-07-07T20:56:35+00:00</updated>
<author>
<name>Karsten Blees</name>
<email>karsten.blees@gmail.com</email>
</author>
<published>2014-07-02T22:22:11+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/git.git/commit/?id=ab73a9d119240b0b908ccb9edd19b8e536ce29b9'/>
<id>ab73a9d119240b0b908ccb9edd19b8e536ce29b9</id>
<content type='text'>
Hashmap entries are typically looked up by just a key. The hashmap_get()
API expects an initialized entry structure instead, to support compound
keys. This flexibility is currently only needed by find_dir_entry() in
name-hash.c (and compat/win32/fscache.c in the msysgit fork). All other
(currently five) call sites of hashmap_get() have to set up a near emtpy
entry structure, resulting in duplicate code like this:

  struct hashmap_entry keyentry;
  hashmap_entry_init(&amp;keyentry, hash(key));
  return hashmap_get(map, &amp;keyentry, key);

Add a hashmap_get_from_hash() API that allows hashmap lookups by just
specifying the key and its hash code, i.e.:

  return hashmap_get_from_hash(map, hash(key), key);

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Hashmap entries are typically looked up by just a key. The hashmap_get()
API expects an initialized entry structure instead, to support compound
keys. This flexibility is currently only needed by find_dir_entry() in
name-hash.c (and compat/win32/fscache.c in the msysgit fork). All other
(currently five) call sites of hashmap_get() have to set up a near emtpy
entry structure, resulting in duplicate code like this:

  struct hashmap_entry keyentry;
  hashmap_entry_init(&amp;keyentry, hash(key));
  return hashmap_get(map, &amp;keyentry, key);

Add a hashmap_get_from_hash() API that allows hashmap lookups by just
specifying the key and its hash code, i.e.:

  return hashmap_get_from_hash(map, hash(key), key);

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>hashmap: factor out getting a hash code from a SHA1</title>
<updated>2014-07-07T20:56:24+00:00</updated>
<author>
<name>Karsten Blees</name>
<email>karsten.blees@gmail.com</email>
</author>
<published>2014-07-02T22:20:20+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/git.git/commit/?id=039dc71a7cb824300e242f8abc0fcb19dac93641'/>
<id>039dc71a7cb824300e242f8abc0fcb19dac93641</id>
<content type='text'>
Copying the first bytes of a SHA1 is duplicated in six places,
however, the implications (the actual value would depend on the
endianness of the platform) is documented only once.

Add a properly documented API for this.

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Copying the first bytes of a SHA1 is duplicated in six places,
however, the implications (the actual value would depend on the
endianness of the platform) is documented only once.

Add a properly documented API for this.

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>hashmap.h: use 'unsigned int' for hash-codes everywhere</title>
<updated>2014-02-24T23:26:30+00:00</updated>
<author>
<name>Karsten Blees</name>
<email>karsten.blees@gmail.com</email>
</author>
<published>2013-12-18T13:41:27+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/git.git/commit/?id=b6aad994737458177ddf68939719f90e7909f656'/>
<id>b6aad994737458177ddf68939719f90e7909f656</id>
<content type='text'>
Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>add a hashtable implementation that supports O(1) removal</title>
<updated>2013-11-18T21:03:51+00:00</updated>
<author>
<name>Karsten Blees</name>
<email>karsten.blees@gmail.com</email>
</author>
<published>2013-11-14T19:17:54+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/git.git/commit/?id=6a364ced497e407ab3ffb2554d4ef2c78f801832'/>
<id>6a364ced497e407ab3ffb2554d4ef2c78f801832</id>
<content type='text'>
The existing hashtable implementation (in hash.[ch]) uses open addressing
(i.e. resolve hash collisions by distributing entries across the table).
Thus, removal is difficult to implement with less than O(n) complexity.
Resolving collisions of entries with identical hashes (e.g. via chaining)
is left to the client code.

Add a hashtable implementation that supports O(1) removal and is slightly
easier to use due to builtin entry chaining.

Supports all basic operations init, free, get, add, remove and iteration.

Also includes ready-to-use hash functions based on the public domain FNV-1
algorithm (http://www.isthe.com/chongo/tech/comp/fnv).

The per-entry data structure (hashmap_entry) is piggybacked in front of
the client's data structure to save memory. See test-hashmap.c for usage
examples.

The hashtable is resized by a factor of four when 80% full. With these
settings, average memory consumption is about 2/3 of hash.[ch], and
insertion is about twice as fast due to less frequent resizing.

Lookups are also slightly faster, because entries are strictly confined to
their bucket (i.e. no data of other buckets needs to be traversed).

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The existing hashtable implementation (in hash.[ch]) uses open addressing
(i.e. resolve hash collisions by distributing entries across the table).
Thus, removal is difficult to implement with less than O(n) complexity.
Resolving collisions of entries with identical hashes (e.g. via chaining)
is left to the client code.

Add a hashtable implementation that supports O(1) removal and is slightly
easier to use due to builtin entry chaining.

Supports all basic operations init, free, get, add, remove and iteration.

Also includes ready-to-use hash functions based on the public domain FNV-1
algorithm (http://www.isthe.com/chongo/tech/comp/fnv).

The per-entry data structure (hashmap_entry) is piggybacked in front of
the client's data structure to save memory. See test-hashmap.c for usage
examples.

The hashtable is resized by a factor of four when 80% full. With these
settings, average memory consumption is about 2/3 of hash.[ch], and
insertion is about twice as fast due to less frequent resizing.

Lookups are also slightly faster, because entries are strictly confined to
their bucket (i.e. no data of other buckets needs to be traversed).

Signed-off-by: Karsten Blees &lt;blees@dcon.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
