<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/pack-objects.h, branch v2.24.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://www.git.shady.money/git/atom?h=v2.24.0</id>
<link rel='self' href='https://www.git.shady.money/git/atom?h=v2.24.0'/>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/'/>
<updated>2019-09-06T18:03:42Z</updated>
<entry>
<title>pack-objects: drop packlist index_pos optimization</title>
<updated>2019-09-06T18:03:42Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-09-06T01:36:05Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=3a37876b5dca4c18bda67bcdead9c1d79a59933d'/>
<id>urn:sha1:3a37876b5dca4c18bda67bcdead9c1d79a59933d</id>
<content type='text'>
Once upon a time, the code to add an object to our packing list in
pack-objects all lived in a single function. It computed the position
within the hash table once, then used it to check if the object was
already present, and if not, to add it.

Later, in 2834bc27c1 (pack-objects: refactor the packing list,
2013-10-24), this was split into two functions: packlist_find() and
packlist_alloc(). We ended up with an "index_pos" variable that gets
passed through several functions to make it from one to the other.

The resulting code is rather confusing to follow. The "index_pos"
variable is sometimes undefined, if we don't yet have a hash table. This
works out in practice because in that case packlist_alloc() won't use it
at all, since it will have to create/grow the hash table. But it's hard
to verify that, and it does cause gcc 9.2.1's -Wmaybe-uninitialized to
complain when compiled with "-flto -O3" (rightfully, since we do pass
the uninitialized value as a function parameter, even if nobody ends up
using it).

All of this is to save computing the hash index again when we're
inserting into the hash table, which I found doesn't make a measurable
difference in the program runtime (which is not surprising, since we're
doing all kinds of other heavyweight things for each object).

Let's just drop this index_pos variable entirely, simplifying the code
(and pleasing the compiler).

We might be better still refactoring this custom hash table to use one
of our existing implementations (an oidmap, or a kh_oid_map). I stopped
short of that here, but this would be the likely first step towards that
anyway.
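
To illustrate the shape of the result, here is a minimal sketch, assuming a
toy open-addressing table keyed by strings. It is not the real C code: the
class and method names are hypothetical stand-ins for packlist_find() and
packlist_alloc(). The point is that alloc() recomputes the slot itself
rather than receiving one from find().

```python
class PackingList:
    """Toy packing list; every name here is hypothetical."""

    def __init__(self):
        self.objects = []   # entries in insertion order
        self.index = []     # slot holds position + 1, and 0 means empty

    def _locate(self, oid):
        # Linear probing. Only valid once the table has been created.
        size = len(self.index)
        slot = hash(oid) % size
        while self.index[slot] != 0:
            if self.objects[self.index[slot] - 1] == oid:
                return slot, True
            slot = (slot + 1) % size
        return slot, False

    def find(self, oid):
        # Returns a position or None; no slot leaks out to the caller.
        if not self.index:
            return None
        slot, found = self._locate(oid)
        return self.index[slot] - 1 if found else None

    def alloc(self, oid):
        self.objects.append(oid)
        if len(self.objects) * 4 >= len(self.index) * 3:
            # Create or grow the table; this re-inserts every entry,
            # including the one just appended.
            self._rehash()
        else:
            # Recompute the slot here instead of trusting a passed-in value.
            slot, found = self._locate(oid)
            assert not found
            self.index[slot] = len(self.objects)
        return len(self.objects) - 1

    def _rehash(self):
        self.index = [0] * max(8, len(self.index) * 2)
        for pos, oid in enumerate(self.objects):
            slot, _ = self._locate(oid)
            self.index[slot] = pos + 1
```

A caller would check find() first and call alloc() only when it returns
None, at the cost of hashing the key a second time on insertion.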

Reported-by: Stephan Beyer &lt;s-beyer@gmx.net&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-objects: use object_id in packlist_alloc()</title>
<updated>2019-09-06T18:03:39Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-09-05T22:52:25Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=f1cbd033e201a18c7175bc6509b48d6243e79739'/>
<id>urn:sha1:f1cbd033e201a18c7175bc6509b48d6243e79739</id>
<content type='text'>
The only caller of packlist_alloc() already has a "struct object_id",
and we immediately copy the hash they pass us into our own object_id.
Let's avoid the unnecessary round-trip to a raw sha1 pointer.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-objects: convert packlist_find() to use object_id</title>
<updated>2019-06-20T16:54:58Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-06-20T07:41:03Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=3df28caefb2193fb7bbc87a427a620d96d508c8d'/>
<id>urn:sha1:3df28caefb2193fb7bbc87a427a620d96d508c8d</id>
<content type='text'>
We take a raw hash pointer, but most of our callers have a "struct
object_id" already. Let's switch to taking the full struct, which will
let us continue removing uses of raw sha1 buffers.

There are two callers that do need special attention:

  - in rebuild_existing_bitmaps(), we need to switch to
    nth_packed_object_oid(). This incurs an extra hash copy over
    pointing straight to the mmap'd sha1, but it shouldn't be measurable
    compared to the rest of the operation.

  - in can_reuse_delta() we already spent the effort to copy the sha1
    into a "struct object_id", but now we just have to do so a little
    earlier in the function (we can't easily convert that function's
    callers because they may be pointing at mmap'd REF_DELTA blocks).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-objects: drop unused parameter from oe_map_new_pack()</title>
<updated>2019-02-14T23:26:15Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-02-14T05:50:32Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=c409d108b857799ae699654d2fc33b063c9aef9d'/>
<id>urn:sha1:c409d108b857799ae699654d2fc33b063c9aef9d</id>
<content type='text'>
Since 43fa44fa3b (pack-objects: move in_pack out of struct object_entry,
2018-04-14), we store the source pack for each object as a small index
rather than as a pointer. When we see a new pack that has no allocated
index, we fall back to generating an array of pointers by calling
oe_map_new_pack().

Perhaps counter-intuitively, that function does not need to actually see
our new index-less pack. It only allocates and populates the array with
the existing packs, after which oe_set_in_pack() actually adds the new
pack to the array.

Let's drop the unused "struct packed_git" argument to oe_map_new_pack()
to avoid confusion.
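
As a rough illustration of why the argument is unnecessary, here is a
minimal sketch, assuming string pack names and a simplified data layout.
The class, its fields, and its methods are hypothetical and only loosely
mirror oe_map_new_pack() and oe_set_in_pack().

```python
class PackingData:
    """Toy model of the index-or-reference scheme; names are hypothetical."""

    def __init__(self, indexed_packs):
        self.by_idx = list(indexed_packs)  # packs that have a small index
        self.in_pack_idx = []              # per-object small index
        self.in_pack = None                # per-object reference, fallback only

    def add_object(self):
        self.in_pack_idx.append(None)
        if self.in_pack is not None:
            self.in_pack.append(None)
        return len(self.in_pack_idx) - 1

    def map_new_pack(self):
        # Takes no pack argument: it only converts what is already recorded.
        self.in_pack = [None if i is None else self.by_idx[i]
                        for i in self.in_pack_idx]

    def set_in_pack(self, obj, pack):
        if self.in_pack is None and pack not in self.by_idx:
            self.map_new_pack()
        if self.in_pack is not None:
            # The index-less pack enters the array here, not in map_new_pack().
            self.in_pack[obj] = pack
        else:
            self.in_pack_idx[obj] = self.by_idx.index(pack)

    def get_in_pack(self, obj):
        if self.in_pack is not None:
            return self.in_pack[obj]
        i = self.in_pack_idx[obj]
        return None if i is None else self.by_idx[i]
```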

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ph/pack-objects-mutex-fix'</title>
<updated>2019-02-05T22:26:16Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-02-05T22:26:16Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=d243a323a545da68b87149e885f2e440f0b13725'/>
<id>urn:sha1:d243a323a545da68b87149e885f2e440f0b13725</id>
<content type='text'>
"git pack-objects" incorrectly used an uninitialized mutex, which has
been corrected.

* ph/pack-objects-mutex-fix:
  pack-objects: merge read_lock and lock in packing_data struct
  pack-objects: move read mutex to packing_data struct
</content>
</entry>
<entry>
<title>pack-objects: merge read_lock and lock in packing_data struct</title>
<updated>2019-01-28T19:22:12Z</updated>
<author>
<name>Patrick Hogg</name>
<email>phogg@novamoon.net</email>
</author>
<published>2019-01-25T00:22:05Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=edb673cf1001eeff140370c41139aaa06e67cea0'/>
<id>urn:sha1:edb673cf1001eeff140370c41139aaa06e67cea0</id>
<content type='text'>
Rename the packing_data lock to odb_lock and upgrade it to a recursive
mutex to make it suitable for current read_lock usages. Additionally,
remove the superfluous #ifndef NO_PTHREADS guard around mutex
initialization in prepare_packing_data, as the mutex functions
themselves are already protected.
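
To show why a recursive mutex matters here, the following is a minimal
sketch, assuming Python's threading.RLock as a stand-in for a recursive
pthread mutex. The class, field, and method names are hypothetical and do
not correspond to the real C functions.

```python
import threading


class PackingData:
    """Toy holder for one shared lock; names are hypothetical."""

    def __init__(self):
        # Created up front, so serial and threaded callers can both take it.
        self.lock = threading.RLock()
        self.sizes = {}

    def get_size_slow(self, name):
        with self.lock:
            return self.sizes.get(name, 0)

    def total_size(self, names):
        # Holds the lock while calling a helper that takes it again. With a
        # plain threading.Lock this nested acquisition would deadlock.
        with self.lock:
            return sum(self.get_size_slow(n) for n in names)
```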

Signed-off-by: Patrick Hogg &lt;phogg@novamoon.net&gt;
Helped-by: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-objects: move read mutex to packing_data struct</title>
<updated>2019-01-28T19:22:06Z</updated>
<author>
<name>Patrick Hogg</name>
<email>phogg@novamoon.net</email>
</author>
<published>2019-01-25T00:22:03Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=459307b139c9a859ca0b6ca5276cf9be3d2b8e3e'/>
<id>urn:sha1:459307b139c9a859ca0b6ca5276cf9be3d2b8e3e</id>
<content type='text'>
ac77d0c37 ("pack-objects: shrink size field in struct object_entry",
2018-04-14) added an extra usage of read_lock/read_unlock in the newly
introduced oe_get_size_slow for thread safety in parallel calls to
try_delta(). Unfortunately oe_get_size_slow is also used in serial
code, some of which is called before the first invocation of
ll_find_deltas. As such the read mutex is not guaranteed to be
initialized.

Resolve this by moving the read mutex to packing_data and initializing
it in prepare_packing_data, which is called from cmd_pack_objects.

Signed-off-by: Patrick Hogg &lt;phogg@novamoon.net&gt;
Reviewed-by: Duy Nguyen &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'nd/the-index'</title>
<updated>2019-01-04T21:33:33Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-01-04T21:33:33Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=cde555480b95c4311819dc1f7a38cc856a9aed23'/>
<id>urn:sha1:cde555480b95c4311819dc1f7a38cc856a9aed23</id>
<content type='text'>
More codepaths become aware of working with in-core repository
instance other than the default "the_repository".

* nd/the-index: (22 commits)
  rebase-interactive.c: remove the_repository references
  rerere.c: remove the_repository references
  pack-*.c: remove the_repository references
  pack-check.c: remove the_repository references
  notes-cache.c: remove the_repository references
  line-log.c: remove the_repository reference
  diff-lib.c: remove the_repository references
  delta-islands.c: remove the_repository references
  cache-tree.c: remove the_repository references
  bundle.c: remove the_repository references
  branch.c: remove the_repository reference
  bisect.c: remove the_repository reference
  blame.c: remove implicit dependency the_repository
  sequencer.c: remove implicit dependency on the_repository
  sequencer.c: remove implicit dependency on the_index
  transport.c: remove implicit dependency on the_index
  notes-merge.c: remove implicit dependency the_repository
  notes-merge.c: remove implicit dependency on the_index
  list-objects.c: reduce the_repository references
  list-objects-filter.c: remove implicit dependency on the_index
  ...
</content>
</entry>
<entry>
<title>Merge branch 'cc/delta-islands'</title>
<updated>2018-11-21T11:39:02Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-11-21T11:39:02Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=7fab474656cdb5517d5b627602a54776e485ddbc'/>
<id>urn:sha1:7fab474656cdb5517d5b627602a54776e485ddbc</id>
<content type='text'>
A few issues in the implementation of the "delta-islands" feature have
been corrected.

* cc/delta-islands:
  pack-objects: fix off-by-one in delta-island tree-depth computation
  pack-objects: zero-initialize tree_depth/layer arrays
  pack-objects: fix tree_depth and layer invariants
</content>
</entry>
<entry>
<title>pack-objects: zero-initialize tree_depth/layer arrays</title>
<updated>2018-11-21T04:50:27Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2018-11-20T09:48:57Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=e159b8107190aa53b27f9f106e5874597106eb88'/>
<id>urn:sha1:e159b8107190aa53b27f9f106e5874597106eb88</id>
<content type='text'>
Commit 108f530385 (pack-objects: move tree_depth into 'struct
packing_data', 2018-08-16) started maintaining a tree_depth array that
matches the "objects" array. We extend the array when:

  1. The objects array is extended, in which case we use realloc to
     extend the tree_depth array.

  2. A caller asks to store a tree_depth for object N, and this is the
     first such request; we create the array from scratch and store the
     value for N.

In the latter case, though, we use regular xmalloc(), and the depth
values for any objects besides N are undefined. This happens to not
trigger a bug with the current code, but the reasons are quite subtle:

 - we never ask about the depth for any object with index i &lt; N. This is
   because we store the depth immediately for all trees and blobs. So
   any such "i" must be a non-tree, and therefore we will never need to
   care about its depth (in fact, we really only care about the depth of
   trees).

 - there are no objects at this point with index i &gt; N, because we
   always fill in the depth for a tree immediately after its object
   entry is created (we may still allocate uninitialized depth entries,
   but they'll be initialized by packlist_alloc() when it initializes
   the entry in the "objects" array).

So it works, but only by chance. To be defensive, let's zero the array,
which matches the "unset" values which would be handed out by
oe_tree_depth() already.
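
The intended invariant can be sketched as follows. This is only an
illustration under stated assumptions: Python lists cannot hold
uninitialized memory, so the sketch shows the zero-filled behavior, not the
original bug, and the class and method names are hypothetical.

```python
class PackingData:
    """Toy model of a lazily created side array; names are hypothetical."""

    def __init__(self, nr_alloc):
        self.nr_alloc = nr_alloc   # capacity of the "objects" array
        self.tree_depth = None     # side array, created on first store

    def tree_depth_of(self, n):
        # "Unset" reads as 0 whether or not the array exists yet.
        if self.tree_depth is None:
            return 0
        return self.tree_depth[n]

    def set_tree_depth(self, n, depth):
        if self.tree_depth is None:
            # Zero-filled on creation, so entries other than n stay "unset".
            self.tree_depth = [0] * self.nr_alloc
        self.tree_depth[n] = depth

    def grow(self, nr_alloc):
        # Case 1 above: the objects array is extended, so extend the side
        # array to match, again with zeros.
        if self.tree_depth is not None:
            self.tree_depth.extend([0] * (nr_alloc - len(self.tree_depth)))
        self.nr_alloc = nr_alloc
```

With this layout, reading the depth of any entry gives the same answer
before and after the array is first created.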

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
