<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/diff-delta.c, branch v1.6.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/</subtitle>
<id>https://www.git.shady.money/git/atom?h=v1.6.2</id>
<link rel='self' href='https://www.git.shady.money/git/atom?h=v1.6.2'/>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/'/>
<updated>2007-12-18T23:22:28Z</updated>
<entry>
<title>fix style of a few comments in diff-delta.c</title>
<updated>2007-12-18T23:22:28Z</updated>
<author>
<name>Nicolas Pitre</name>
<email>nico@cam.org</email>
</author>
<published>2007-12-18T15:15:39Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=ce85b053d827e2f7c2ee2683cc09393e4768cc22'/>
<id>urn:sha1:ce85b053d827e2f7c2ee2683cc09393e4768cc22</id>
<content type='text'>
Signed-off-by: Nicolas Pitre &lt;nico@cam.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Fix segfault in diff-delta.c when FLEX_ARRAY is 1</title>
<updated>2007-12-18T05:59:26Z</updated>
<author>
<name>Pierre Habouzit</name>
<email>madcoder@debian.org</email>
</author>
<published>2007-12-18T01:39:57Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=f9c5a80cdf2265f2df7712fad9f1fb7ef68b4768'/>
<id>urn:sha1:f9c5a80cdf2265f2df7712fad9f1fb7ef68b4768</id>
<content type='text'>
In other words, don't do pointer arithmetic on structs that have a
FLEX_ARRAY member, or you'll end up believing your array is one cell off
its real address.
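
<![CDATA[
A minimal sketch of the pitfall, assuming a hypothetical struct
bucket_table (not the struct actually used in diff-delta.c) and
FLEX_ARRAY forced to 1.  In that configuration sizeof() already counts
one array cell, so "table + 1" lands one cell past the real start of
the array:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: emulate a compiler that needs a one-element trailing array. */
#define FLEX_ARRAY 1

/* Hypothetical struct, for illustration only. */
struct bucket_table {
	unsigned long count;
	void *slots[FLEX_ARRAY];
};

/* Wrong when FLEX_ARRAY is 1: "one past the struct" skips the first slot. */
static void **slots_by_pointer_arithmetic(struct bucket_table *table)
{
	return (void **)(table + 1);
}

/* Right: take the address of the member itself. */
static void **slots_by_member(struct bucket_table *table)
{
	return table->slots;
}

/* Number of cells by which the pointer-arithmetic version is off. */
static ptrdiff_t slot_skew(struct bucket_table *table)
{
	assert(table != NULL);
	return slots_by_pointer_arithmetic(table) - slots_by_member(table);
}
```

With a true C99 flexible array member (empty brackets) the two pointers
would normally coincide, which is presumably why the problem only
showed up when FLEX_ARRAY was 1.
]]>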

Signed-off-by: Pierre Habouzit &lt;madcoder@debian.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diff-delta.c: Rationalize culling of hash buckets</title>
<updated>2007-09-10T00:16:49Z</updated>
<author>
<name>David Kastrup</name>
<email>dak@gnu.org</email>
</author>
<published>2007-09-08T21:25:55Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=02e665ce491296245f474dafdc02d47a6c8afa86'/>
<id>urn:sha1:02e665ce491296245f474dafdc02d47a6c8afa86</id>
<content type='text'>
The previous hash bucket culling resulted in a somewhat unpredictable
number of hash bucket entries in the order of magnitude of HASH_LIMIT.

Replace this with a Bresenham-like algorithm leaving us with exactly
HASH_LIMIT entries by uniform culling.
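
<![CDATA[
The idea can be sketched as follows.  This is an illustration on a
plain array with hypothetical names (cull_uniform, items), not the code
from this commit: an error accumulator in the style of Bresenham's line
algorithm decides which entries survive, so exactly "limit" entries are
kept and they are spread evenly over the input.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Keep exactly `limit` of the `count` entries in `items`, spread
 * uniformly, and compact the survivors to the front of the array.
 * Returns the number of entries kept.
 */
static size_t cull_uniform(int *items, size_t count, size_t limit)
{
	size_t kept = 0;
	size_t err = 0;	/* Bresenham-style error accumulator */
	size_t i;

	assert(limit > 0);
	if (count <= limit)
		return count;	/* nothing to cull */

	for (i = 0; i < count; i++) {
		err += limit;
		if (err >= count) {
			/* The accumulator overflowed: this entry survives. */
			err -= count;
			items[kept++] = items[i];
		}
	}
	return kept;
}
```

Over the whole loop the accumulator gains limit * count and loses count
per survivor, so it ends at zero after exactly "limit" survivors.  With
10 entries and a limit of 4, for example, the entries at indices 2, 4,
7 and 9 are kept.
]]>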

Signed-off-by: David Kastrup &lt;dak@gnu.org&gt;
Acked-by: Nicolas Pitre &lt;nico@cam.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diff-delta.c: pack the index structure</title>
<updated>2007-09-10T00:16:49Z</updated>
<author>
<name>David Kastrup</name>
<email>dak@gnu.org</email>
</author>
<published>2007-09-08T21:17:44Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=d2100860fd67dec6474157697888caaa0a0f51d0'/>
<id>urn:sha1:d2100860fd67dec6474157697888caaa0a0f51d0</id>
<content type='text'>
In normal use cases, the performance wins are not overly impressive:
we get something like 5-10% due to the slightly better locality of
memory accesses using the packed structure.

However, since the data structure for index entries saves 33% of
memory on 32-bit platforms and 40% on 64-bit platforms, the behavior
when memory gets limited should be nicer.

This is a rather well-contained change.  One obvious improvement would
be sorting the elements in one bucket according to their hash, then
using binary probing to find the elements with the right hash value.

As it stands, the output should be strictly the same as previously
unless one uses the option for limiting the amount of used memory, in
which case the created packs might be better.

Signed-off-by: David Kastrup &lt;dak@gnu.org&gt;
Acked-by: Nicolas Pitre &lt;nico@cam.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diff-delta.c: Fix broken skip calculation.</title>
<updated>2007-08-23T07:04:10Z</updated>
<author>
<name>David Kastrup</name>
<email>dak@gnu.org</email>
</author>
<published>2007-08-23T05:51:45Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=b1d884a9e3968db1fff91c2d066d871a3b8b013c'/>
<id>urn:sha1:b1d884a9e3968db1fff91c2d066d871a3b8b013c</id>
<content type='text'>
A particularly bad case was HASH_LIMIT &lt;= hash_count[i] &lt; 2*HASH_LIMIT:
in that case, only a single hash survived.  For larger cases,
2*HASH_LIMIT was the actual limiting value after pruning.

Signed-off-by: David Kastrup &lt;dak@gnu.org&gt;
Acked-by: Nicolas Pitre &lt;nico@cam.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Support fetching the memory usage of a delta index</title>
<updated>2007-07-12T21:32:35Z</updated>
<author>
<name>Brian Downing</name>
<email>bdowning@lavos.net</email>
</author>
<published>2007-07-12T12:55:48Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=11779e79078c9da604753e570d02134c8d4bae6a'/>
<id>urn:sha1:11779e79078c9da604753e570d02134c8d4bae6a</id>
<content type='text'>
Delta indices, at least on 64-bit platforms, tend to be larger than
the actual uncompressed data.  As such, keeping track of this storage
is important if you want to successfully limit the memory size of your
pack window.

Squirrel away the total allocation size inside the delta_index struct,
and add an accessor "sizeof_delta_index" to access it.
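
<![CDATA[
A rough sketch of the pattern, using a hypothetical struct
example_index rather than the real delta_index layout (the field name
memsize and the constructor are assumptions made for illustration):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical index type; only the idea of a stored size matters here. */
struct example_index {
	unsigned long memsize;	/* total bytes allocated for this index */
	unsigned int nr_entries;
	unsigned int entries[];	/* C99 flexible array member */
};

/* Allocate an index and record how much memory it occupies. */
static struct example_index *example_index_new(unsigned int nr_entries)
{
	unsigned long memsize = sizeof(struct example_index) +
				nr_entries * sizeof(unsigned int);
	struct example_index *index = malloc(memsize);

	if (!index)
		return NULL;
	index->memsize = memsize;
	index->nr_entries = nr_entries;
	return index;
}

/* Accessor in the spirit of sizeof_delta_index: no index costs nothing. */
static unsigned long sizeof_example_index(const struct example_index *index)
{
	return index ? index->memsize : 0;
}
```

A caller that enforces a memory limit can add this value to its running
total when an index is created and subtract it when the index is freed.
]]>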

Signed-off-by: Brian Downing &lt;bdowning@lavos.net&gt;
Acked-by: Nicolas Pitre &lt;nico@cam.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diff-delta: use realloc instead of xrealloc</title>
<updated>2007-05-31T07:15:18Z</updated>
<author>
<name>Martin Koegler</name>
<email>mkoegler@auto.tuwien.ac.at</email>
</author>
<published>2007-05-29T19:08:35Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=b75c6c6de1e8f801edb142b59e7809a166a63adc'/>
<id>urn:sha1:b75c6c6de1e8f801edb142b59e7809a166a63adc</id>
<content type='text'>
Commit 83572c1a914d3f7a8dd66d954c11bbc665b7b923 changed many
realloc calls to xrealloc. This change was made in diff-delta.c too,
although the code there can handle an out-of-memory failure.

This patch reverts this change in diff-delta.c.

Signed-off-by: Martin Koegler &lt;mkoegler@auto.tuwien.ac.at&gt;
Signed-off-by: Junio C Hamano &lt;junkio@cox.net&gt;
</content>
</entry>
<entry>
<title>update diff-delta.c copyright</title>
<updated>2007-05-27T03:28:13Z</updated>
<author>
<name>Nicolas Pitre</name>
<email>nico@cam.org</email>
</author>
<published>2007-05-26T02:16:27Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=366b53c170ed4ac6a497757da1cbd0e316e48767'/>
<id>urn:sha1:366b53c170ed4ac6a497757da1cbd0e316e48767</id>
<content type='text'>
There is actually nothing left from the original LibXDiff code I used
over 2 years ago, and even the GIT implementation has diverged quite a
bit from LibXDiff's at this point.  Let's update the copyright notice
to better reflect that fact.

Signed-off-by: Nicolas Pitre &lt;nico@cam.org&gt;
Signed-off-by: Junio C Hamano &lt;junkio@cox.net&gt;
</content>
</entry>
<entry>
<title>improve delta long block matching with big files</title>
<updated>2007-05-27T03:28:13Z</updated>
<author>
<name>Nicolas Pitre</name>
<email>nico@cam.org</email>
</author>
<published>2007-05-26T01:38:58Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=843366961cf14aad6490fbeb30f7b98f37f8833a'/>
<id>urn:sha1:843366961cf14aad6490fbeb30f7b98f37f8833a</id>
<content type='text'>
Martin Koegler noted that create_delta() performs a new hash lookup
after every block copy encoding, and each block copy is currently
limited to 64KB.

In case of larger identical blocks, the next hash lookup would normally
point to the next 64KB block in the reference buffer and multiple block
copy operations will be consecutively encoded.

It is however possible that the reference buffer is sparsely indexed if
hash buckets have been trimmed down in create_delta_index() when hashing
of the reference buffer isn't well balanced.  In that case the hash
lookup following a block copy might fail to match anything and the fact
that the reference buffer still matches beyond the previous 64KB block
will be missed.

Let's rework the code so that buffer comparison isn't limited to 64KB
anymore.  The match size should be as large as possible up front, and
only then should multiple block copies be encoded to cover it all.
Also, fewer hash lookups will be performed in the end.
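
<![CDATA[
The second half of that idea, covering one long match with several
copy operations, might look like the sketch below.  Only the 64KB
per-copy limit comes from the description above; struct copy_op and
split_match are hypothetical names, and the real encoder writes delta
opcodes rather than filling an array.

```c
#include <assert.h>
#include <stddef.h>

/* Per-copy limit taken from the text: 64KB. */
#define MAX_COPY_SIZE 0x10000

struct copy_op {
	size_t offset;	/* where to copy from in the reference buffer */
	size_t size;	/* bytes to copy, at most MAX_COPY_SIZE */
};

/*
 * Cover one match of `len` bytes starting at `offset` with as few copy
 * operations as possible.  Writes at most `cap` operations into `ops`
 * and returns the number written.
 */
static size_t split_match(size_t offset, size_t len,
			  struct copy_op *ops, size_t cap)
{
	size_t n = 0;

	assert(ops != NULL || cap == 0);
	while (len > 0 && n < cap) {
		size_t chunk = len < MAX_COPY_SIZE ? len : MAX_COPY_SIZE;

		ops[n].offset = offset;
		ops[n].size = chunk;
		n++;
		offset += chunk;
		len -= chunk;
	}
	return n;
}
```

Because the full match length is known before anything is encoded, no
hash lookup is needed between consecutive chunks of the same match.
]]>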

According to Martin, this patch should reduce his 92MB pack down to 75MB
with the dataset he has.

Tests performed on the Linux kernel repo show a slightly smaller pack and
a slightly faster repack.

Signed-off-by: Nicolas Pitre &lt;nico@cam.org&gt;
Signed-off-by: Junio C Hamano &lt;junkio@cox.net&gt;
</content>
</entry>
<entry>
<title>simplify inclusion of system header files.</title>
<updated>2006-12-20T17:51:35Z</updated>
<author>
<name>Junio C Hamano</name>
<email>junkio@cox.net</email>
</author>
<published>2006-12-19T22:34:12Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=85023577a8f4b540aa64aa37f6f44578c0c305a3'/>
<id>urn:sha1:85023577a8f4b540aa64aa37f6f44578c0c305a3</id>
<content type='text'>
This is a mechanical clean-up of the way *.c files include
system header files.

 (1) sources under compat/, platform sha-1 implementations, and
     xdelta code are exempt from the following rules;

 (2) the first #include must be "git-compat-util.h" or one of
     our own header files that includes it first (e.g. config.h,
     builtin.h, pkt-line.h);

 (3) system headers that are included in "git-compat-util.h"
     need not be included in individual C source files.

 (4) "git-compat-util.h" does not have to include subsystem
     specific header files (e.g. expat.h).

Signed-off-by: Junio C Hamano &lt;junkio@cox.net&gt;
</content>
</entry>
</feed>
