<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/commit-graph.h, branch v2.21.2</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://www.git.shady.money/git/atom?h=v2.21.2</id>
<link rel='self' href='https://www.git.shady.money/git/atom?h=v2.21.2'/>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/'/>
<updated>2019-02-05T22:26:14Z</updated>
<entry>
<title>Merge branch 'ab/commit-graph-write-progress'</title>
<updated>2019-02-05T22:26:14Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2019-02-05T22:26:14Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=e5eac5735609722a07401a279a1508ace22fc9bc'/>
<id>urn:sha1:e5eac5735609722a07401a279a1508ace22fc9bc</id>
<content type='text'>
The codepath to show progress meter while writing out commit-graph
file has been improved.

* ab/commit-graph-write-progress:
  commit-graph write: emit a percentage for all progress
  commit-graph write: add itermediate progress
  commit-graph write: remove empty line for readability
  commit-graph write: add more descriptive progress output
  commit-graph write: show progress for object search
  commit-graph write: more descriptive "writing out" output
  commit-graph write: add "Writing out" progress output
  commit-graph: don't call write_graph_chunk_extra_edges() unnecessarily
  commit-graph: rename "large edges" to "extra edges"
</content>
</entry>
<entry>
<title>commit-graph: rename "large edges" to "extra edges"</title>
<updated>2019-01-22T19:33:46Z</updated>
<author>
<name>SZEDER Gábor</name>
<email>szeder.dev@gmail.com</email>
</author>
<published>2019-01-19T20:21:13Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=5af7417bd869d935715a56b2ccdee04d3a79a328'/>
<id>urn:sha1:5af7417bd869d935715a56b2ccdee04d3a79a328</id>
<content type='text'>
The optional 'Large Edge List' chunk of the commit graph file stores
parent information for commits with more than two parents, and the
names of most of the macros, variables, struct fields, and functions
related to this chunk contain the term "large edges", e.g.
write_graph_chunk_large_edges().  However, it's not a really great
term, as the edges to the second and subsequent parents stored in this
chunk are not any larger than the edges to the first and second
parents stored in the "main" 'Commit Data' chunk.  It's the number of
edges, IOW number of parents, that is larger compared to non-merge and
"regular" two-parent merge commits.  And indeed, two functions in
'commit-graph.c' have a local variable called 'num_extra_edges' that
refer to the same thing, and this "extra edges" term is much better at
describing these edges.

So let's rename all these references to "large edges" in macro,
variable, function, etc. names to "extra edges".  There is a
GRAPH_OCTOPUS_EDGES_NEEDED macro as well; for the sake of consistency
rename it to GRAPH_EXTRA_EDGES_NEEDED.

We can do so safely without causing any incompatibility issues,
because the term "large edges" doesn't come up in the file format
itself in any form (the chunk's magic is {'E', 'D', 'G', 'E'}, there
is no 'L' in there), but only in the specification text.  The string
"large edges", however, does come up in the output of 'git
commit-graph read' and in tests looking at its input, but that command
is explicitly documented as debugging aid, so we can change its output
and the affected tests safely.

Signed-off-by: SZEDER Gábor &lt;szeder.dev@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph, fuzz: add fuzzer for commit-graph</title>
<updated>2019-01-16T04:31:49Z</updated>
<author>
<name>Josh Steadmon</name>
<email>steadmon@google.com</email>
</author>
<published>2019-01-15T22:25:50Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=aa658574bfcbe03f5703458ac10be1ef3f5f5472'/>
<id>urn:sha1:aa658574bfcbe03f5703458ac10be1ef3f5f5472</id>
<content type='text'>
Break load_commit_graph_one() into a new function, parse_commit_graph().
The latter function operates on arbitrary buffers, which makes it
suitable as a fuzzing target. Since parse_commit_graph() is only called
by load_commit_graph_one() (and the fuzzer described below), we omit
error messages that would be duplicated by the caller.

Adds fuzz-commit-graph.c, which provides a fuzzing entry point
compatible with libFuzzer (and possibly other fuzzing engines).

Signed-off-by: Josh Steadmon &lt;steadmon@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ds/commit-graph-with-grafts'</title>
<updated>2018-10-16T07:15:59Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-10-16T07:15:59Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=6d8f8ebb74d21b51cfbf427a436094134af36ee2'/>
<id>urn:sha1:6d8f8ebb74d21b51cfbf427a436094134af36ee2</id>
<content type='text'>
The recently introduced commit-graph auxiliary data is incompatible
with mechanisms such as replace &amp; grafts that "breaks" immutable
nature of the object reference relationship.  Disable optimizations
based on its use (and updating existing commit-graph) when these
incompatible features are in use in the repository.

* ds/commit-graph-with-grafts:
  commit-graph: close_commit_graph before shallow walk
  commit-graph: not compatible with uninitialized repo
  commit-graph: not compatible with grafts
  commit-graph: not compatible with replace objects
  test-repository: properly init repo
  commit-graph: update design document
  refs.c: upgrade for_each_replace_ref to be a each_repo_ref_fn callback
  refs.c: migrate internal ref iteration to pass thru repository argument
</content>
</entry>
<entry>
<title>Merge branch 'ab/commit-graph-progress'</title>
<updated>2018-10-16T07:15:58Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-10-16T07:15:58Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=36d767d02e12c1ff991511b7bd8c1d023340f2ca'/>
<id>urn:sha1:36d767d02e12c1ff991511b7bd8c1d023340f2ca</id>
<content type='text'>
Generation of (experimental) commit-graph files have so far been
fairly silent, even though it takes noticeable amount of time in a
meaningfully large repository.  The users will now see progress
output.

* ab/commit-graph-progress:
  gc: fix regression in 7b0f229222 impacting --quiet
  commit-graph verify: add progress output
  commit-graph write: add progress output
</content>
</entry>
<entry>
<title>Merge branch 'ds/commit-graph-tests'</title>
<updated>2018-09-17T20:53:58Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-09-17T20:53:58Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=06880cff3892185ae71839636e2d2c688413eccb'/>
<id>urn:sha1:06880cff3892185ae71839636e2d2c688413eccb</id>
<content type='text'>
We can now optionally run tests with commit-graph enabled.

* ds/commit-graph-tests:
  commit-graph: define GIT_TEST_COMMIT_GRAPH
</content>
</entry>
<entry>
<title>Merge branch 'ds/reachable'</title>
<updated>2018-09-17T20:53:52Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-09-17T20:53:52Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=1b7a91da71d42759dfb83fa3a17be54ad01f0132'/>
<id>urn:sha1:1b7a91da71d42759dfb83fa3a17be54ad01f0132</id>
<content type='text'>
The code for computing history reachability has been shuffled,
obtained a bunch of new tests to cover them, and then being
improved.

* ds/reachable:
  commit-reach: correct accidental #include of C file
  commit-reach: use can_all_from_reach
  commit-reach: make can_all_from_reach... linear
  commit-reach: replace ref_newer logic
  test-reach: test commit_contains
  test-reach: test can_all_from_reach_with_flags
  test-reach: test reduce_heads
  test-reach: test get_merge_bases_many
  test-reach: test is_descendant_of
  test-reach: test in_merge_bases
  test-reach: create new test tool for ref_newer
  commit-reach: move can_all_from_reach_with_flags
  upload-pack: generalize commit date cutoff
  upload-pack: refactor ok_to_give_up()
  upload-pack: make reachable() more generic
  commit-reach: move commit_contains from ref-filter
  commit-reach: move ref_newer from remote.c
  commit.h: remove method declarations
  commit-reach: move walk methods from commit.c
</content>
</entry>
<entry>
<title>commit-graph write: add progress output</title>
<updated>2018-09-17T17:12:30Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2018-09-17T15:33:35Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=7b0f2292224cec841de78acc911e4c378ea79faa'/>
<id>urn:sha1:7b0f2292224cec841de78acc911e4c378ea79faa</id>
<content type='text'>
Before this change the "commit-graph write" command didn't report any
progress. On my machine this command takes more than 10 seconds to
write the graph for linux.git, and around 1m30s on the
2015-04-03-1M-git.git[1] test repository (a test case for a large
monorepository).

Furthermore, since the gc.writeCommitGraph setting was added in
d5d5d7b641 ("gc: automatically write commit-graph files", 2018-06-27),
there was no indication at all from a "git gc" run that anything was
different. This why one of the progress bars being added here uses
start_progress() instead of start_delayed_progress(), so that it's
guaranteed to be seen. E.g. on my tiny 867 commit dotfiles.git
repository:

    $ git -c gc.writeCommitGraph=true gc
    Enumerating objects: 2821, done.
    [...]
    Computing commit graph generation numbers: 100% (867/867), done.

On larger repositories, such as linux.git the delayed progress bar(s)
will kick in, and we'll show what's going on instead of, as was
previously happening, printing nothing while we write the graph:

    $ git -c gc.writeCommitGraph=true gc
    [...]
    Annotating commits in commit graph: 1565573, done.
    Computing commit graph generation numbers: 100% (782484/782484), done.

Note that here we don't show "Finding commits for commit graph", this
is because under "git gc" we seed the search with the commit
references in the repository, and that set is too small to show any
progress, but would e.g. on a smaller repo such as git.git with
--stdin-commits:

    $ git rev-list --all | git -c gc.writeCommitGraph=true write --stdin-commits
    Finding commits for commit graph: 100% (162576/162576), done.
    Computing commit graph generation numbers: 100% (162576/162576), done.

With --stdin-packs we don't show any estimation of how much is left to
do. This is because we might be processing more than one pack. We
could be less lazy here and show progress, either by detecting that
we're only processing one pack, or by first looping over the packs to
discover how many commits they have. I don't see the point in doing
that work. So instead we get (on 2015-04-03-1M-git.git):

    $ echo pack-&lt;HASH&gt;.idx | git -c gc.writeCommitGraph=true --exec-path=$PWD commit-graph write --stdin-packs
    Finding commits for commit graph: 13064614, done.
    Annotating commits in commit graph: 3001341, done.
    Computing commit graph generation numbers: 100% (1000447/1000447), done.

No GC mode uses --stdin-packs. It's what they use at Microsoft to
manually compute the generation numbers for their collection of large
packs which are never coalesced.

The reason we need a "report_progress" variable passed down from "git
gc" is so that we don't report this output when we're running in the
process "git gc --auto" detaches from the terminal.

Since we write the commit graph from the "git gc" process itself (as
opposed to what we do with say the "git repack" phase), we'd end up
writing the output to .git/gc.log and reporting it to the user next
time as part of the "The last gc run reported the following[...]"
error, see 329e6e8794 ("gc: save log from daemonized gc --auto and
print it next time", 2015-09-19).

So we must keep track of whether or not we're running in that
demonized mode, and if so print no progress.

See [2] and subsequent replies for a discussion of an approach not
taken in compute_generation_numbers(). I.e. we're saying "Computing
commit graph generation numbers", even though on an established
history we're mostly skipping over all the work we did in the
past. This is similar to the white lie we tell in the "Writing
objects" phase (not all are objects being written).

Always showing progress is considered more important than
accuracy. I.e. on a repository like 2015-04-03-1M-git.git we'd hang
for 6 seconds with no output on the second "git gc" if no changes were
made to any objects in the interim if we'd take the approach in [2].

1. https://github.com/avar/2015-04-03-1M-git

2. &lt;c6960252-c095-fb2b-e0bc-b1e6bb261614@gmail.com&gt;
   (https://public-inbox.org/git/c6960252-c095-fb2b-e0bc-b1e6bb261614@gmail.com/)

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: define GIT_TEST_COMMIT_GRAPH</title>
<updated>2018-08-29T17:44:31Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2018-08-29T12:49:04Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=859fdc0c3cf9ad8cdd5eacaa24aee481bc1e7bc1'/>
<id>urn:sha1:859fdc0c3cf9ad8cdd5eacaa24aee481bc1e7bc1</id>
<content type='text'>
The commit-graph feature is tested in isolation by
t5318-commit-graph.sh and t6600-test-reach.sh, but there are many
more interesting scenarios involving commit walks. Many of these
scenarios are covered by the existing test suite, but we need to
maintain coverage when the optional commit-graph structure is not
present.

To allow running the full test suite with the commit-graph present,
add a new test environment variable, GIT_TEST_COMMIT_GRAPH. Similar
to GIT_TEST_SPLIT_INDEX, this variable makes every Git command try
to load the commit-graph when parsing commits, and writes the
commit-graph file after every 'git commit' command.

There are a few tests that rely on commits not existing in
pack-files to trigger important events, so manually set
GIT_TEST_COMMIT_GRAPH to false for the necessary commands.

There is one test in t6024-recursive-merge.sh that relies on the
merge-base algorithm picking one of two ambiguous merge-bases, and
the commit-graph feature changes which merge-base is picked.

Helped-by: Eric Sunshine &lt;sunshine@sunshineco.com&gt;
Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: close_commit_graph before shallow walk</title>
<updated>2018-08-21T17:22:51Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2018-08-20T18:24:34Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=829a321569d8e8f2c582aef9f0c990df976ab842'/>
<id>urn:sha1:829a321569d8e8f2c582aef9f0c990df976ab842</id>
<content type='text'>
Call close_commit_graph() when about to start a rev-list walk that
includes shallow commits. This is necessary in code paths that "fake"
shallow commits for the sake of fetch. Specifically, test 351 in
t5500-fetch-pack.sh runs

	git fetch --shallow-exclude one origin

with a file-based transfer. When the "remote" has a commit-graph, we do
not prevent the commit-graph from being loaded, but then the commits are
intended to be dynamically transferred into shallow commits during
get_shallow_commits_by_rev_list(). By closing the commit-graph before
this call, we prevent this interaction.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
