<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/commit.c, branch v2.19.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://www.git.shady.money/git/atom?h=v2.19.0</id>
<link rel='self' href='https://www.git.shady.money/git/atom?h=v2.19.0'/>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/'/>
<updated>2018-09-04T21:31:39Z</updated>
<entry>
<title>Merge branch 'ds/commit-graph-lockfile-fix'</title>
<updated>2018-09-04T21:31:39Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-09-04T21:31:39Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=0a866db570e520ce7b08d1eceefdeaa9d63b6704'/>
<id>urn:sha1:0a866db570e520ce7b08d1eceefdeaa9d63b6704</id>
<content type='text'>
"git merge-base" in 2.19-rc1 has a performance regression when the
(experimental) commit-graph feature is in use, which has been
mitigated.

* ds/commit-graph-lockfile-fix:
  commit: don't use generation numbers if not needed
</content>
</entry>
<entry>
<title>commit: don't use generation numbers if not needed</title>
<updated>2018-08-30T18:17:57Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2018-08-30T12:58:09Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=091f4cf3586957c3fd99d4c4c59c569d009137ad'/>
<id>urn:sha1:091f4cf3586957c3fd99d4c4c59c569d009137ad</id>
<content type='text'>
In 3afc679b "commit: use generations in paint_down_to_common()",
the queue in paint_down_to_common() was changed to use a priority
order based on generation number before commit date. This served
two purposes:

 1. When generation numbers are present, the walk guarantees
    correct topological relationships, regardless of clock skew in
    commit dates.

 2. It enables short-circuiting the walk when the min_generation
    parameter is added in d7c1ec3e "commit: add short-circuit to
    paint_down_to_common()". This short-circuit saves commands
    like 'git branch --contains' from needing to walk to a merge
    base when we know the result is false.

The commit message for 3afc679b includes the following sentence:

    This change does not affect the number of commits that are
    walked during the execution of paint_down_to_common(), only
    the order that those commits are inspected.

This statement is incorrect. Because it changes the order in which
the commits are inspected, it changes the order they are added to
the queue, and hence can change the number of loops before
queue_has_nonstale() returns false and the walk stops.

This change makes a concrete difference depending on the topology
of the commit graph. For instance, computing the merge-base between
consecutive versions of the Linux kernel is unaffected for versions
after v4.9, but 'git merge-base v4.8 v4.9' presents a performance
regression:

    v2.18.0: 0.122s
v2.19.0-rc1: 0.547s
       HEAD: 0.127s

To determine that this was simply an ordering issue, I inserted
a counter within the while loop of paint_down_to_common() and
found that the loop runs 167,468 times in v2.18.0 and 635,579
times in v2.19.0-rc1.

The topology of this case can be described in a simplified way
here:

  v4.9
   |  \
   |   \
  v4.8  \
   | \   \
   |  \   |
  ...  A  B
   |  /  /
   | /  /
   |/__/
   C

Here, the "..." means "a very long line of commits". By generation
number, A and B have generation one more than C. However, A and B
have commit date higher than most of the commits reachable from
v4.8. When the walk reaches v4.8, we realize that it has PARENT1
and PARENT2 flags, so everything it can reach is marked as STALE,
including A. B has only the PARENT1 flag, so is not STALE.

When paint_down_to_common() is run using
compare_commits_by_commit_date, A and B are removed from the queue
early and C is inserted into the queue. At this point, C and the
rest of the queue entries are marked as STALE. The loop then
terminates.

When paint_down_to_common() is run using
compare_commits_by_gen_then_commit_date, B is removed from the
queue only after the many commits reachable from v4.8 are explored.
This causes the loop to run longer. The reason for this regression
is simple: the queue order is intended to not explore a commit
until everything that _could_ reach that commit is explored. From
the information gathered by the original ordering, we have no
guarantee that there is not a commit D reachable from v4.8 that
can also reach B. We gained absolute correctness in exchange for
a performance regression.
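
The effect of the two orderings can be reproduced with a small
simulation. The Python sketch below is not Git's implementation: the
graph builder, the commit dates, the chain length, and the use of
string sets in place of flag bits are all illustrative assumptions
that mirror the simplified topology above.

```python
import heapq
from itertools import count


def build_graph(n):
    """Return {name: (parents, date, generation)} for the sketch topology.

    Assumed layout: a chain of n commits from C up to v4.8, plus side
    commits A and B whose dates are newer than most of the chain.
    """
    graph = {"C": ([], 0, 1)}
    prev = "C"
    for i in range(1, n + 1):
        name = f"chain{i}"
        graph[name] = ([prev], 10 * i, i + 1)
        prev = name
    graph["A"] = (["C"], 10 * n - 5, 2)
    graph["B"] = (["C"], 10 * n - 4, 2)
    graph["v4.8"] = ([prev, "A"], 10 * n + 1, n + 2)
    graph["v4.9"] = (["v4.8", "B"], 10 * n + 2, n + 3)
    return graph


def paint_down(graph, one, two, use_generation):
    """Walk from both tips; return (loop count, merge bases found)."""

    def key(name):
        _, date, gen = graph[name]
        # heapq is a min-heap, so negate to pop the largest value first.
        return (-gen, -date) if use_generation else (-date,)

    flags = {name: set() for name in graph}
    flags[one].add("PARENT1")
    flags[two].add("PARENT2")
    tick = count()  # tie-breaker so heap entries never compare names
    heap = []
    for name in (one, two):
        heapq.heappush(heap, (key(name), next(tick), name))

    loops = 0
    bases = []
    # Keep going while any queued commit is not yet marked STALE.
    while any("STALE" not in flags[name] for _, _, name in heap):
        _, _, name = heapq.heappop(heap)
        loops += 1
        paint = set(flags[name])
        if {"PARENT1", "PARENT2"}.issubset(paint) and "STALE" not in paint:
            if name not in bases:
                bases.append(name)
            paint.add("STALE")  # everything below a base is stale
        for parent in graph[name][0]:
            if paint.issubset(flags[parent]):
                continue  # nothing new to propagate
            flags[parent].update(paint)
            heapq.heappush(heap, (key(parent), next(tick), parent))
    return loops, bases


if __name__ == "__main__":
    graph = build_graph(1000)
    for use_generation in (False, True):
        loops, bases = paint_down(graph, "v4.9", "v4.8", use_generation)
        print("generation order" if use_generation else "date order",
              loops, bases)
```

With these assumed dates, the date-ordered walk stops after a handful
of iterations, while the generation-ordered walk has to drain the long
chain before it reaches A and B. Both report v4.8 as the merge base.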

The performance regression is probably the worse option, since
these incorrect results in paint_down_to_common() are rare. The
topology required for the performance regression is less rare,
but still requires multiple merge commits where the parents differ
greatly in generation number. In our example above, the commit A
is as important as the commit B to demonstrate the problem, since
otherwise the commit C will sit in the queue as non-stale just as
long in both orders.

The solution provided uses the min_generation parameter to decide
if we should use generation numbers in our ordering. When
min_generation is equal to zero, it means that the caller has no
known cutoff for the walk, so we should rely on our commit-date
heuristic as before; this is the case with merge_bases_many().
When min_generation is non-zero, then the caller knows a valuable
cutoff for the short-circuit mechanism; this is the case with
remove_redundant() and in_merge_bases_many().
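
The selection rule described above can be sketched in a few lines of
Python. This is an illustration only: the Commit tuple, the helper
names, and the sample values are hypothetical and are not taken from
commit.c.

```python
from collections import namedtuple

# Hypothetical stand-in for a commit; Git's struct commit is far richer.
Commit = namedtuple("Commit", "name date generation")


def queue_sort_key(min_generation):
    """Pick the queue ordering from the caller's generation cutoff."""
    if min_generation:
        # A real cutoff exists: order by generation, then by date.
        return lambda c: (c.generation, c.date)
    # No cutoff (min_generation == 0): fall back to the date heuristic.
    return lambda c: (c.date,)


def visit_order(commits, min_generation):
    """Names in the order a max-priority queue would hand them out."""
    key = queue_sort_key(min_generation)
    return [c.name for c in sorted(commits, key=key, reverse=True)]


commits = [
    Commit("old-but-deep", date=100, generation=50),
    Commit("new-but-shallow", date=900, generation=2),
]
print(visit_order(commits, 0))  # date order: the newer commit comes first
print(visit_order(commits, 7))  # generation order: the deeper commit first
```

The point of the sketch is that the ordering is chosen once, up front,
from information the caller already has, so callers without a cutoff
keep the behavior they had before generation numbers were introduced.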

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'js/larger-timestamps'</title>
<updated>2018-08-27T21:33:44Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-08-27T21:33:44Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=1392c5d28935a3aa25367df52adf4ca1e3c5724e'/>
<id>urn:sha1:1392c5d28935a3aa25367df52adf4ca1e3c5724e</id>
<content type='text'>
Portability fix.

* js/larger-timestamps:
  commit: use timestamp_t for author_date_slab
</content>
</entry>
<entry>
<title>commit: use timestamp_t for author_date_slab</title>
<updated>2018-08-21T21:08:18Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2018-08-21T20:54:12Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=1820703045f8974bc5320d08a3611f4e29c83bf9'/>
<id>urn:sha1:1820703045f8974bc5320d08a3611f4e29c83bf9</id>
<content type='text'>
The author_date_slab is used to store the author date of a commit
when walking with the --author-date-order flag in rev-list or log.
This was added as an 'unsigned long' in

	81c6b38b "log: --author-date-order"

Since 'unsigned long' is ambiguous in its bit-ness across platforms
(64-bit on Linux, 32-bit on Windows, for example), most references
to the author dates in commit.c were converted to timestamp_t in

	dddbad72 "timestamp_t: a new data type for timestamps"

However, the slab definition was missed, leading to a mismatch in
the data types on Windows. This would not reveal itself as a bug
unless someone authors a commit after February 2106, but commits
can store anything as their author date.
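
The February 2106 limit follows from the range of a 32-bit unsigned
count of seconds since the Unix epoch. The short Python sketch below
checks that arithmetic; the helper names are made up for illustration,
and the modulo only mimics what storing a wider value in a 32-bit
unsigned field would do.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
UINT32_RANGE = 2**32  # number of distinct values in a 32-bit unsigned field


def last_32bit_date():
    """Latest moment representable as 32-bit unsigned seconds since epoch."""
    return EPOCH + timedelta(seconds=UINT32_RANGE - 1)


def store_in_32bit_slot(timestamp):
    """Mimic truncation when a wider timestamp lands in a 32-bit slot."""
    return timestamp % UINT32_RANGE


print(last_32bit_date().isoformat())
# A date just past the limit wraps around to shortly after the epoch.
print(store_in_32bit_slot(UINT32_RANGE + 5))
```

Any author date beyond that moment would be silently misread from a
32-bit slab entry, which is why the slab needs the same type as the
rest of the timestamp handling.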

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jt/commit-graph-per-object-store'</title>
<updated>2018-08-02T22:30:47Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-08-02T22:30:47Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=78a72ad4f8fa91adc876b2fc4b18fd370e43136d'/>
<id>urn:sha1:78a72ad4f8fa91adc876b2fc4b18fd370e43136d</id>
<content type='text'>
The singleton commit-graph in-core instance is made per in-core
repository instance.

* jt/commit-graph-per-object-store:
  commit-graph: add repo arg to graph readers
  commit-graph: store graph in struct object_store
  commit-graph: add free_commit_graph
  commit-graph: add missing forward declaration
  object-store: add missing include
  commit-graph: refactor preparing commit graph
</content>
</entry>
<entry>
<title>Merge branch 'sb/object-store-lookup'</title>
<updated>2018-08-02T22:30:42Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-08-02T22:30:42Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=3a2a1dc17077a27ad1a89db27cb1b4b374f3b0ff'/>
<id>urn:sha1:3a2a1dc17077a27ad1a89db27cb1b4b374f3b0ff</id>
<content type='text'>
lookup_commit_reference() and friends have been updated to find
in-core object for a specific in-core repository instance.

* sb/object-store-lookup: (32 commits)
  commit.c: allow lookup_commit_reference to handle arbitrary repositories
  commit.c: allow lookup_commit_reference_gently to handle arbitrary repositories
  tag.c: allow deref_tag to handle arbitrary repositories
  object.c: allow parse_object to handle arbitrary repositories
  object.c: allow parse_object_buffer to handle arbitrary repositories
  commit.c: allow get_cached_commit_buffer to handle arbitrary repositories
  commit.c: allow set_commit_buffer to handle arbitrary repositories
  commit.c: migrate the commit buffer to the parsed object store
  commit-slabs: remove realloc counter outside of slab struct
  commit.c: allow parse_commit_buffer to handle arbitrary repositories
  tag: allow parse_tag_buffer to handle arbitrary repositories
  tag: allow lookup_tag to handle arbitrary repositories
  commit: allow lookup_commit to handle arbitrary repositories
  tree: allow lookup_tree to handle arbitrary repositories
  blob: allow lookup_blob to handle arbitrary repositories
  object: allow lookup_object to handle arbitrary repositories
  object: allow object_as_type to handle arbitrary repositories
  tag: add repository argument to deref_tag
  tag: add repository argument to parse_tag_buffer
  tag: add repository argument to lookup_tag
  ...
</content>
</entry>
<entry>
<title>Merge branch 'ds/commit-graph-fsck'</title>
<updated>2018-08-02T22:30:40Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-08-02T22:30:39Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=b006f01ab5b6aa912f2c577c4af441564c6c78a4'/>
<id>urn:sha1:b006f01ab5b6aa912f2c577c4af441564c6c78a4</id>
<content type='text'>
"git fsck" learns to make sure the optional commit-graph file is in
a sane state.

* ds/commit-graph-fsck: (23 commits)
  coccinelle: update commit.cocci
  commit-graph: update design document
  gc: automatically write commit-graph files
  commit-graph: add '--reachable' option
  commit-graph: use string-list API for input
  fsck: verify commit-graph
  commit-graph: verify contents match checksum
  commit-graph: test for corrupted octopus edge
  commit-graph: verify commit date
  commit-graph: verify generation number
  commit-graph: verify parent list
  commit-graph: verify root tree OIDs
  commit-graph: verify objects exist
  commit-graph: verify corrupt OID fanout and lookup
  commit-graph: verify required chunks are present
  commit-graph: verify catches corrupt signature
  commit-graph: add 'verify' subcommand
  commit-graph: load a root tree from specific graph
  commit: force commit to parse from object database
  commit-graph: parse commit from chosen graph
  ...
</content>
</entry>
<entry>
<title>Merge branch 'bc/object-id'</title>
<updated>2018-08-02T22:30:39Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-08-02T22:30:39Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=37aac3e408fa2348983e964f8bda2de581f2c44e'/>
<id>urn:sha1:37aac3e408fa2348983e964f8bda2de581f2c44e</id>
<content type='text'>
Conversion from uchar[40] to struct object_id continues.

* bc/object-id:
  pretty: switch hard-coded constants to the_hash_algo
  sha1-file: convert constants to uses of the_hash_algo
  log-tree: switch GIT_SHA1_HEXSZ to the_hash_algo-&gt;hexsz
  diff: switch GIT_SHA1_HEXSZ to use the_hash_algo
  builtin/merge-recursive: make hash independent
  builtin/merge: switch to use the_hash_algo
  builtin/fmt-merge-msg: make hash independent
  builtin/update-index: simplify parsing of cacheinfo
  builtin/update-index: convert to using the_hash_algo
  refs/files-backend: use the_hash_algo for writing refs
  sha1-name: use the_hash_algo when parsing object names
  strbuf: allocate space with GIT_MAX_HEXSZ
  commit: express tree entry constants in terms of the_hash_algo
  hex: switch to using the_hash_algo
  tree-walk: replace hard-coded constants with the_hash_algo
  cache: update object ID functions for the_hash_algo
</content>
</entry>
<entry>
<title>Merge branch 'sb/object-store-grafts'</title>
<updated>2018-07-18T19:20:28Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-07-18T19:20:27Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=00624d608cc69bd62801c93e74d1ea7a7ddd6598'/>
<id>urn:sha1:00624d608cc69bd62801c93e74d1ea7a7ddd6598</id>
<content type='text'>
The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.

* sb/object-store-grafts:
  commit: allow lookup_commit_graft to handle arbitrary repositories
  commit: allow prepare_commit_graft to handle arbitrary repositories
  shallow: migrate shallow information into the object parser
  path.c: migrate global git_path_* to take a repository argument
  cache: convert get_graft_file to handle arbitrary repositories
  commit: convert read_graft_file to handle arbitrary repositories
  commit: convert register_commit_graft to handle arbitrary repositories
  commit: convert commit_graft_pos() to handle arbitrary repositories
  shallow: add repository argument to is_repository_shallow
  shallow: add repository argument to check_shallow_file_for_update
  shallow: add repository argument to register_shallow
  shallow: add repository argument to set_alternate_shallow_file
  commit: add repository argument to lookup_commit_graft
  commit: add repository argument to prepare_commit_graft
  commit: add repository argument to read_graft_file
  commit: add repository argument to register_commit_graft
  commit: add repository argument to commit_graft_pos
  object: move grafts to object parser
  object-store: move object access functions to object-store.h
</content>
</entry>
<entry>
<title>commit-graph: add repo arg to graph readers</title>
<updated>2018-07-17T22:47:48Z</updated>
<author>
<name>Jonathan Tan</name>
<email>jonathantanmy@google.com</email>
</author>
<published>2018-07-11T22:42:42Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=dade47c06cf849b0ca180a8e6383b55ea6f75812'/>
<id>urn:sha1:dade47c06cf849b0ca180a8e6383b55ea6f75812</id>
<content type='text'>
Add a struct repository argument to the functions in commit-graph.h that
read the commit graph. (This commit does not affect functions that write
commit graphs.)

Because the commit graph functions can now read the commit graph of any
repository, the global variable core_commit_graph has been removed.
Instead, the config option core.commitGraph is now read the first
time a repository attempts to parse a commit using its commit
graph.

This commit includes a test that exercises the functionality on an
arbitrary repository that is not the_repository.

Signed-off-by: Jonathan Tan &lt;jonathantanmy@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
