<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/commit-graph.c, branch v2.19.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://www.git.shady.money/git/atom?h=v2.19.0</id>
<link rel='self' href='https://www.git.shady.money/git/atom?h=v2.19.0'/>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/'/>
<updated>2018-08-20T18:33:52Z</updated>
<entry>
<title>Merge branch 'jk/for-each-object-iteration'</title>
<updated>2018-08-20T18:33:52Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-08-20T18:33:52Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=0c54cdaf6580f121919048633e85772d60b8fb17'/>
<id>urn:sha1:0c54cdaf6580f121919048633e85772d60b8fb17</id>
<content type='text'>
The API to iterate over all objects learned to optionally list
objects in the order they appear in packfiles, which helps locality
of access if the caller reads these objects as they are
enumerated.

* jk/for-each-object-iteration:
  for_each_*_object: move declarations to object-store.h
  cat-file: use a single strbuf for all output
  cat-file: split batch "buf" into two variables
  cat-file: use oidset check-and-insert
  cat-file: support "unordered" output for --batch-all-objects
  cat-file: rename batch_{loose,packed}_object callbacks
  t1006: test cat-file --batch-all-objects with duplicates
  for_each_packed_object: support iterating in pack-order
  for_each_*_object: give more comprehensive docstrings
  for_each_*_object: take flag arguments as enum
  for_each_*_object: store flag definitions in a single location
</content>
</entry>
<entry>
<title>Merge branch 'nd/i18n'</title>
<updated>2018-08-15T22:08:23Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-08-15T22:08:23Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=4bea8485e395769951c6b9eddfba1081ea7ef47f'/>
<id>urn:sha1:4bea8485e395769951c6b9eddfba1081ea7ef47f</id>
<content type='text'>
Many more strings are prepared for l10n.

* nd/i18n: (23 commits)
  transport-helper.c: mark more strings for translation
  transport.c: mark more strings for translation
  sha1-file.c: mark more strings for translation
  sequencer.c: mark more strings for translation
  replace-object.c: mark more strings for translation
  refspec.c: mark more strings for translation
  refs.c: mark more strings for translation
  pkt-line.c: mark more strings for translation
  object.c: mark more strings for translation
  exec-cmd.c: mark more strings for translation
  environment.c: mark more strings for translation
  dir.c: mark more strings for translation
  convert.c: mark more strings for translation
  connect.c: mark more strings for translation
  config.c: mark more strings for translation
  commit-graph.c: mark more strings for translation
  builtin/replace.c: mark more strings for translation
  builtin/pack-objects.c: mark more strings for translation
  builtin/grep.c: mark strings for translation
  builtin/config.c: mark more strings for translation
  ...
</content>
</entry>
<entry>
<title>for_each_packed_object: support iterating in pack-order</title>
<updated>2018-08-13T20:48:28Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2018-08-10T23:15:49Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=736eb88fdc8a2dea4302114d2f74b580d0f83cfe'/>
<id>urn:sha1:736eb88fdc8a2dea4302114d2f74b580d0f83cfe</id>
<content type='text'>
We currently iterate over objects within a pack in .idx
order, which uses the object hashes. That means that it
is effectively random with respect to the location of the
object within the pack. If you're going to access the actual
object data, there are two reasons to move linearly through
the pack itself:

  1. It improves the locality of access in the packfile. In
     the cold-cache case, this may mean fewer disk seeks, or
     better usage of disk cache.

  2. We store related deltas together in the packfile, which
     means that the delta base cache can operate much more
     efficiently if we visit all of those related deltas in
     sequence, as the earlier items are likely to still be
     in the cache. If we instead visit the objects in
     random order, our cache entries are much more likely to
     have been evicted by unrelated deltas in the meantime.

So if you're going to access the object contents, pack order
is generally going to end up more efficient.
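
The second point can be illustrated with a toy simulation. This is
only a sketch under stated assumptions, not how git implements its
delta base cache: the LRU policy, the cache size of 16, the family
layout, and every name below are hypothetical.

```python
import random
from collections import OrderedDict


def count_cache_misses(visit_order, base_of, cache_size):
    """Count delta-base cache misses for a given visiting order.

    Toy model: every object is a delta against base_of[obj], and
    reading an object requires its base to be in a small LRU cache.
    """
    cache = OrderedDict()
    misses = 0
    for obj in visit_order:
        base = base_of[obj]
        if base in cache:
            cache.move_to_end(base)        # refresh the LRU position
        else:
            misses += 1                    # base must be reconstructed
            cache[base] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return misses


# Hypothetical pack: 200 delta families of 50 objects each, stored
# family by family, so pack order keeps related deltas adjacent.
pack_order = list(range(200 * 50))
base_of = {obj: obj // 50 for obj in pack_order}

# Stand-in for .idx order: sorting by hash is effectively a shuffle
# with respect to position in the pack.
idx_order = pack_order[:]
random.Random(0).shuffle(idx_order)

pack_misses = count_cache_misses(pack_order, base_of, cache_size=16)
idx_misses = count_cache_misses(idx_order, base_of, cache_size=16)
print("pack order misses:", pack_misses)
print("idx order misses:", idx_misses)
```

In this model pack order misses once per family, while the shuffled
order keeps evicting bases before they are needed again.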

But if you're simply generating a list of object names, or
if you're going to end up sorting the result anyway, you're
better off just using the .idx order, as finding the pack
order means generating the in-memory pack-revindex.
According to the numbers in 8b8dfd5132 (pack-revindex:
radix-sort the revindex, 2013-07-11), that takes about 200ms
for linux.git, and 20ms for git.git (those numbers are a few
years old but are still a good ballpark).

That makes it a good optimization for some cases (we can
save tens of seconds in git.git by having good locality of
delta access, for a 20ms cost), but a bad one for others
(e.g., right now "cat-file --batch-all-objects
--batch-check=%(objectname)" takes 170ms in git.git, so adding
20ms to that is noticeable).

Hence this patch makes it an optional flag. You can't
actually do any interesting timings yet, as it's not plumbed
through to any user-facing tools like cat-file. That will
come in a later patch.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph.c: mark more strings for translation</title>
<updated>2018-07-23T18:19:09Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2018-07-21T07:49:26Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=4f5b532d18ad61d8b96d65b4f8d2321cf154d066'/>
<id>urn:sha1:4f5b532d18ad61d8b96d65b4f8d2321cf154d066</id>
<content type='text'>
Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: add repo arg to graph readers</title>
<updated>2018-07-17T22:47:48Z</updated>
<author>
<name>Jonathan Tan</name>
<email>jonathantanmy@google.com</email>
</author>
<published>2018-07-11T22:42:42Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=dade47c06cf849b0ca180a8e6383b55ea6f75812'/>
<id>urn:sha1:dade47c06cf849b0ca180a8e6383b55ea6f75812</id>
<content type='text'>
Add a struct repository argument to the functions in commit-graph.h that
read the commit graph. (This commit does not affect functions that write
commit graphs.)

Because the commit graph functions can now read the commit graph of any
repository, the global variable core_commit_graph has been removed.
Instead, the config option core.commitGraph is now read the first
time an attempt is made to parse a commit in a repository using its
commit graph.
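
The lazy, per-repository read described above might be modelled as
follows. This is a Python sketch of the pattern only, not the C code
in commit-graph.c: the Repository class, its fields, and the default
of False for a missing option are assumptions made for illustration.

```python
class Repository:
    """Hypothetical stand-in for struct repository."""

    def __init__(self, config):
        self.config = config          # e.g. {"core.commitGraph": True}
        self.config_reads = 0         # instrumentation for this sketch
        self._graph_enabled = None    # None means "not read yet"

    def commit_graph_enabled(self):
        # Read core.commitGraph only on the first request, then cache
        # the answer on this repository rather than in a global.
        if self._graph_enabled is None:
            self.config_reads += 1
            value = self.config.get("core.commitGraph", False)
            self._graph_enabled = bool(value)
        return self._graph_enabled


repo_a = Repository({"core.commitGraph": True})
repo_b = Repository({})

# Each repository answers from its own setting, and the option is
# read once per repository no matter how many commits are parsed.
results = [
    repo_a.commit_graph_enabled(),
    repo_a.commit_graph_enabled(),
    repo_b.commit_graph_enabled(),
]
print(results, repo_a.config_reads, repo_b.config_reads)
```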

This commit includes a test that exercises the functionality on an
arbitrary repository that is not the_repository.

Signed-off-by: Jonathan Tan &lt;jonathantanmy@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: store graph in struct object_store</title>
<updated>2018-07-17T22:47:48Z</updated>
<author>
<name>Jonathan Tan</name>
<email>jonathantanmy@google.com</email>
</author>
<published>2018-07-11T22:42:41Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=8527750626f8a1b0fe641a5163760be054cc1d64'/>
<id>urn:sha1:8527750626f8a1b0fe641a5163760be054cc1d64</id>
<content type='text'>
Instead of storing commit graphs in static variables, store them in
struct object_store. The signatures of existing functions do not
change: they all still support only the_repository, and support for
other instances of struct repository will be added in a subsequent
commit.

Signed-off-by: Jonathan Tan &lt;jonathantanmy@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: add free_commit_graph</title>
<updated>2018-07-17T22:47:48Z</updated>
<author>
<name>Jonathan Tan</name>
<email>jonathantanmy@google.com</email>
</author>
<published>2018-07-11T22:42:40Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=c3756d5b7fc6e163032296aa6c10fad2589273dc'/>
<id>urn:sha1:c3756d5b7fc6e163032296aa6c10fad2589273dc</id>
<content type='text'>
Signed-off-by: Jonathan Tan &lt;jonathantanmy@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: refactor preparing commit graph</title>
<updated>2018-07-17T22:47:48Z</updated>
<author>
<name>Jonathan Tan</name>
<email>jonathantanmy@google.com</email>
</author>
<published>2018-07-11T22:42:37Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=5faf357b4314fcc7976f75c7f3ba205d9eba8e77'/>
<id>urn:sha1:5faf357b4314fcc7976f75c7f3ba205d9eba8e77</id>
<content type='text'>
Two functions in the code (1) check if the repository is configured for
commit graphs, (2) call prepare_commit_graph(), and (3) check if the
graph exists. Move (1) and (3) into prepare_commit_graph(), reducing
duplication of code.
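
The shape of this refactoring can be sketched in Python. Only the
name prepare_commit_graph comes from the message above; the Repo
class, its fields, and the two caller functions are hypothetical,
and the real code is C.

```python
class Repo:
    """Hypothetical stand-in for a repository and its object store."""

    def __init__(self, enabled, graph_on_disk):
        self.enabled = enabled              # models core.commitGraph
        self.graph_on_disk = graph_on_disk  # set of commit ids, or None
        self.graph = None
        self.load_attempted = False


def prepare_commit_graph(repo):
    """Return True only if a commit graph is usable for repo."""
    if not repo.enabled:                    # step (1), formerly in callers
        return False
    if not repo.load_attempted:             # step (2): load at most once
        repo.load_attempted = True
        repo.graph = repo.graph_on_disk
    return repo.graph is not None           # step (3), formerly in callers


def commit_is_in_graph(repo, oid):
    # Hypothetical caller: a single check replaces three.
    return prepare_commit_graph(repo) and oid in repo.graph


def graph_size(repo):
    # Second hypothetical caller sharing the same guard.
    return len(repo.graph) if prepare_commit_graph(repo) else 0


with_graph = Repo(True, {"c1", "c2"})
disabled = Repo(False, {"c1"})
missing = Repo(True, None)
print(commit_is_in_graph(with_graph, "c1"), graph_size(disabled), graph_size(missing))
```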

Signed-off-by: Jonathan Tan &lt;jonathantanmy@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ds/commit-graph-fsck' into jt/commit-graph-per-object-store</title>
<updated>2018-07-17T22:46:19Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-07-17T22:46:19Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=8295296458bfa5e371dccae0a0e0a4b9a56f9b40'/>
<id>urn:sha1:8295296458bfa5e371dccae0a0e0a4b9a56f9b40</id>
<content type='text'>
* ds/commit-graph-fsck: (23 commits)
  coccinelle: update commit.cocci
  commit-graph: update design document
  gc: automatically write commit-graph files
  commit-graph: add '--reachable' option
  commit-graph: use string-list API for input
  fsck: verify commit-graph
  commit-graph: verify contents match checksum
  commit-graph: test for corrupted octopus edge
  commit-graph: verify commit date
  commit-graph: verify generation number
  commit-graph: verify parent list
  commit-graph: verify root tree OIDs
  commit-graph: verify objects exist
  commit-graph: verify corrupt OID fanout and lookup
  commit-graph: verify required chunks are present
  commit-graph: verify catches corrupt signature
  commit-graph: add 'verify' subcommand
  commit-graph: load a root tree from specific graph
  commit: force commit to parse from object database
  commit-graph: parse commit from chosen graph
  ...
</content>
</entry>
<entry>
<title>commit: add repository argument to lookup_commit</title>
<updated>2018-06-29T17:43:39Z</updated>
<author>
<name>Stefan Beller</name>
<email>sbeller@google.com</email>
</author>
<published>2018-06-29T01:21:59Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=c1f5eb49620d4f287af28509621a364e3888cfe7'/>
<id>urn:sha1:c1f5eb49620d4f287af28509621a364e3888cfe7</id>
<content type='text'>
Add a repository argument to allow callers of lookup_commit to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller &lt;sbeller@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
