<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/commit-graph.h, branch v2.27.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://www.git.shady.money/git/atom?h=v2.27.0</id>
<link rel='self' href='https://www.git.shady.money/git/atom?h=v2.27.0'/>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/'/>
<updated>2020-05-01T20:39:54Z</updated>
<entry>
<title>Merge branch 'ds/blame-on-bloom'</title>
<updated>2020-05-01T20:39:54Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-05-01T20:39:54Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=6d56d4c7dcd667d28aec28498591723c6febea1c'/>
<id>urn:sha1:6d56d4c7dcd667d28aec28498591723c6febea1c</id>
<content type='text'>
"git blame" learns to take advantage of the "changed-paths" Bloom
filter stored in the commit-graph file.

* ds/blame-on-bloom:
  test-bloom: check that we have expected arguments
  test-bloom: fix some whitespace issues
  blame: drop unused parameter from maybe_changed_path
  blame: use changed-path Bloom filters
  tests: write commit-graph with Bloom filters
  revision: complicated pathspecs disable filters
</content>
</entry>
<entry>
<title>Merge branch 'gs/commit-graph-path-filter'</title>
<updated>2020-05-01T20:39:53Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-05-01T20:39:53Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=9b6606f43d55bbf33b9924d16e02e60e1c09660a'/>
<id>urn:sha1:9b6606f43d55bbf33b9924d16e02e60e1c09660a</id>
<content type='text'>
Introduce an extension to the commit-graph to make it efficient to
check for the paths that were modified at each commit using Bloom
filters.

* gs/commit-graph-path-filter:
  bloom: ignore renames when computing changed paths
  commit-graph: add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag
  t4216: add end to end tests for git log with Bloom filters
  revision.c: add trace2 stats around Bloom filter usage
  revision.c: use Bloom filters to speed up path based revision walks
  commit-graph: add --changed-paths option to write subcommand
  commit-graph: reuse existing Bloom filters during write
  commit-graph: write Bloom filters to commit graph file
  commit-graph: examine commits by generation number
  commit-graph: examine changed-path objects in pack order
  commit-graph: compute Bloom filters for changed paths
  diff: halt tree-diff early after max_changes
  bloom.c: core Bloom filter implementation for changed paths.
  bloom.c: introduce core Bloom filter constructs
  bloom.c: add the murmur3 hash implementation
  commit-graph: define and use MAX_NUM_CHUNKS
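
As a rough illustration of the changed-path idea summarized above, the
sketch below builds one small Bloom filter per commit and queries it for
a path. It is not Git's implementation: SHA-256 stands in for murmur3,
and the sizing constants and all function names are assumptions made for
this example.

```python
import hashlib

BITS_PER_ENTRY = 10   # assumed sizing: bits reserved per changed path
NUM_HASHES = 7        # assumed number of probes per path


def _probe_positions(path, num_bits):
    """Derive NUM_HASHES bit positions by double hashing (SHA-256 stand-in)."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    h0 = int.from_bytes(digest[:4], "big")
    h1 = int.from_bytes(digest[4:8], "big")
    return [(h0 + i * h1) % num_bits for i in range(NUM_HASHES)]


def build_filter(changed_paths):
    """Return a list of 0/1 flags summarizing one commit's changed paths."""
    num_bits = max(8, len(changed_paths) * BITS_PER_ENTRY)
    bits = [0] * num_bits
    for path in changed_paths:
        for pos in _probe_positions(path, num_bits):
            bits[pos] = 1
    return bits


def maybe_changed(bits, path):
    """False means 'definitely unchanged'; True means 'possibly changed'."""
    return all(bits[pos] == 1 for pos in _probe_positions(path, len(bits)))


commit_filter = build_filter(["Makefile", "bloom.c", "commit-graph.c"])
print(maybe_changed(commit_filter, "bloom.c"))  # prints True
```

A "False" answer lets a path-limited walk skip the tree diff for that
commit; a "True" answer still has to be confirmed with a real diff,
because Bloom filters can report false positives.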
</content>
</entry>
<entry>
<title>commit-graph: close descriptors after mmap</title>
<updated>2020-04-25T05:25:50Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-04-23T21:41:13Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=c8828530b7797f5ab584c84dc2b86d3c14b39c8d'/>
<id>urn:sha1:c8828530b7797f5ab584c84dc2b86d3c14b39c8d</id>
<content type='text'>
We don't ever refer to the descriptor after mmap-ing it. And keeping it
open means we can run out of descriptors in degenerate cases (e.g.,
thousands of split chain files). Let's close it as soon as possible.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>tests: write commit-graph with Bloom filters</title>
<updated>2020-04-16T22:38:04Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2020-04-16T20:14:03Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=b23ea9790d30e80c7c79a77aab2e9d150a189463'/>
<id>urn:sha1:b23ea9790d30e80c7c79a77aab2e9d150a189463</id>
<content type='text'>
The GIT_TEST_COMMIT_GRAPH environment variable updates the commit-
graph file whenever "git commit" is run, ensuring that we always
have an updated commit-graph throughout the test suite. The
GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS environment variable was
introduced to write the changed-path Bloom filters whenever "git
commit-graph write" is run. However, the GIT_TEST_COMMIT_GRAPH
trick doesn't launch a separate process and instead writes the
commit-graph directly.

To expand the number of tests that have commits in the commit-graph
file, add a helper method that computes the commit-graph and place
that helper inside "git commit" and "git merge".

In the helper method, check GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS
to ensure we are writing changed-path Bloom filters whenever
possible.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph.h: replace 'commit_hex' with 'commits'</title>
<updated>2020-04-15T16:20:30Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2020-04-14T04:04:25Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=6830c360777468434184f60023e2562348c9dacc'/>
<id>urn:sha1:6830c360777468434184f60023e2562348c9dacc</id>
<content type='text'>
The 'write_commit_graph()' function takes in either a string list of
pack indices, or a string list of hexadecimal commit OIDs. These
correspond to the '--stdin-packs' and '--stdin-commits' modes of
'git commit-graph write'.

Using a string_list of hexadecimal commit IDs is not the most efficient
use of memory, since we can instead use the 'struct oidset', which is
better suited to this case.

This has another benefit, which will become apparent in the following
commit: we are about to disambiguate the kinds of errors we
produce with '--stdin-commits' into "non-hex input" and "hex-input, but
referring to a non-commit object". By having 'write_commit_graph' take
in a 'struct oidset *' of commits, we place the burden on the caller (in
this case, the builtin) to handle the first case, and the commit-graph
machinery can handle the second case.
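
The division of labor described above can be sketched as follows. This
is an illustrative model only: a Python set of raw 20-byte IDs stands in
for 'struct oidset', and 'parse_hex_oids', 'split_by_type' and the
'object_types' lookup table are hypothetical names, not Git functions.

```python
import string

HEX_OID_LEN = 40  # a SHA-1 object name is 40 hex digits


def parse_hex_oids(lines):
    """Caller-side step: reject non-hex input, return a set of raw OIDs."""
    oids = set()
    for line in lines:
        text = line.strip()
        is_hex = all(ch in string.hexdigits for ch in text)
        if len(text) != HEX_OID_LEN or not is_hex:
            raise ValueError("non-hex input: " + repr(text))
        oids.add(bytes.fromhex(text))  # 20 bytes instead of 40 characters
    return oids


def split_by_type(oids, object_types):
    """Graph-side step: separate commits from valid hex that is no commit."""
    commits = {oid for oid in oids if object_types.get(oid) == "commit"}
    return commits, oids - commits


tip = "6d56d4c7dcd667d28aec28498591723c6febea1c"
oids = parse_hex_oids([tip, tip + "\n"])  # duplicates collapse in the set
commits, others = split_by_type(oids, {bytes.fromhex(tip): "commit"})
print(len(oids), len(commits), len(others))  # prints 1 1 0
```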

Suggested-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>builtin/commit-graph.c: introduce split strategy 'replace'</title>
<updated>2020-04-15T16:20:28Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2020-04-14T04:04:17Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=8a6ac287b2ba5f75bb2d9409dd97e9b501daf253'/>
<id>urn:sha1:8a6ac287b2ba5f75bb2d9409dd97e9b501daf253</id>
<content type='text'>
When using split commit-graphs, it is sometimes useful to completely
replace the commit-graph chain with a new base.

For example, consider a scenario in which a repository builds a new
commit-graph incremental for each push. Occasionally (say, after some
fixed number of pushes), they may wish to rebuild the commit-graph chain
with all reachable commits.

They can do so with

  $ git commit-graph write --reachable

but this removes the chain entirely and replaces it with a single
commit-graph in 'objects/info/commit-graph'. Unfortunately, this means
that the next push will have to move this commit-graph into the first
layer of a new chain, and then write its new commits on top.

Avoid such copying entirely by allowing the caller to specify that they
wish to replace the entirety of their commit-graph chain, while also
specifying that the new commit-graph should become the basis of a fresh,
length-one chain.

This addresses the above situation by making it possible for the caller
to instead write:

  $ git commit-graph write --reachable --split=replace

which writes a new length-one chain to 'objects/info/commit-graphs',
making the commit-graph incremental generated by the subsequent push
relatively cheap by avoiding the aforementioned copy.

In order to do this, remove an assumption in 'write_commit_graph_file'
that chains are always at least two incrementals long.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>builtin/commit-graph.c: introduce split strategy 'no-merge'</title>
<updated>2020-04-15T16:20:27Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2020-04-14T04:04:12Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=fdbde82fe523374b3c5d1f0f01f3c43dcaca9465'/>
<id>urn:sha1:fdbde82fe523374b3c5d1f0f01f3c43dcaca9465</id>
<content type='text'>
In the previous commit, we laid the groundwork for supporting different
splitting strategies. In this commit, we introduce the first splitting
strategy: 'no-merge'.

Passing '--split=no-merge' is useful for callers which wish to write a
new incremental commit-graph, but do not want to spend effort condensing
the incremental chain [1]. Previously, this was possible by passing
'--size-multiple=0', but this is no longer the case following 63020f175f
(commit-graph: prefer default size_mult when given zero, 2020-01-02).

When '--split=no-merge' is given, the commit-graph machinery will never
condense an existing chain, and it will always write a new incremental.

[1]: This might be the case when, for example, a server administrator
who runs some program after each push wants each job to take time
proportional to the size of the push, without a "jump" whenever the
commit-graph machinery decides to trigger a merge.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>builtin/commit-graph.c: support for '--split[=&lt;strategy&gt;]'</title>
<updated>2020-04-15T16:20:26Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2020-04-14T04:04:08Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=4f027355f6b6b5b2ba69e23ff50cb7313d13dd70'/>
<id>urn:sha1:4f027355f6b6b5b2ba69e23ff50cb7313d13dd70</id>
<content type='text'>
With '--split', the commit-graph machinery writes new commits in another
incremental commit-graph which is part of the existing chain, and
optionally decides to condense the chain into a single commit-graph.
This is done to ensure that the asymptotic behavior of looking up a
commit in an incremental chain is not dominated by the number of
incrementals in that chain. It can be controlled by the '--max-commits'
and '--size-multiple' options.

In the next two commits, we will introduce additional splitting
strategies that can exert additional control over:

  - when a split commit-graph is and isn't written, and

  - when the existing commit-graph chain is discarded completely and
    replaced with another graph

To prepare for this, make '--split' take an optional strategy (as in
'--split[=&lt;strategy&gt;]'), and add a new enum to describe which strategy
is being used. For now, no strategies are given, and the only enumerated
value is 'COMMIT_GRAPH_SPLIT_UNSPECIFIED', indicating the absence of a
strategy.
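
A minimal sketch of an option that takes an optional strategy, under
the state described above where no named strategies exist yet. The
'SplitStrategy' enum mirrors 'COMMIT_GRAPH_SPLIT_UNSPECIFIED'; the
'parse_split' helper and its error text are assumptions for this
example, not the builtin's actual code.

```python
from enum import Enum


class SplitStrategy(Enum):
    UNSPECIFIED = "unspecified"  # mirrors COMMIT_GRAPH_SPLIT_UNSPECIFIED


def parse_split(arg):
    """Parse '--split' or '--split=VALUE'; return (is_split, strategy)."""
    if arg == "--split":
        return True, SplitStrategy.UNSPECIFIED
    prefix = "--split="
    if arg.startswith(prefix):
        value = arg[len(prefix):]
        # No named strategies are defined at this point in the series,
        # so any explicit value is rejected.
        raise ValueError("unrecognized --split argument: " + value)
    # Some other option: splitting was not requested.
    return False, SplitStrategy.UNSPECIFIED


print(parse_split("--split"))
```

Keeping the strategy in its own enum, separate from the on/off split
flag, leaves room for later commits to add values without changing how
a bare '--split' behaves.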

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag</title>
<updated>2020-04-06T18:08:37Z</updated>
<author>
<name>Garima Singh</name>
<email>garima.singh@microsoft.com</email>
</author>
<published>2020-04-06T16:59:55Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=d5b873c832d832e44523d1d2a9d29afe2b84c84f'/>
<id>urn:sha1:d5b873c832d832e44523d1d2a9d29afe2b84c84f</id>
<content type='text'>
Add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag to the test setup suite
in order to toggle writing Bloom filters when running any of the git tests.
If set to true, we will compute and write Bloom filters every time a test
calls `git commit-graph write`, as if the `--changed-paths` option was
passed in.

The test suite passes when GIT_TEST_COMMIT_GRAPH and
GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS are enabled.
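
The toggle can be pictured with the sketch below: the test variable is
treated as if '--changed-paths' had been passed. The helper names
'env_bool' and 'write_flags' and the set of accepted truthy spellings
are assumptions for illustration only.

```python
import os


def env_bool(name, default=False):
    """Read a boolean-like environment variable (assumed truthy spellings)."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in ("1", "true", "yes", "on")


def write_flags(changed_paths_requested):
    """Return the set of write flags for one 'commit-graph write' run."""
    flags = set()
    from_env = env_bool("GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS")
    if changed_paths_requested or from_env:
        flags.add("changed-paths")
    return flags


os.environ["GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS"] = "true"
print(write_flags(False))  # prints {'changed-paths'}
```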

Helped-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Garima Singh &lt;garima.singh@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: write Bloom filters to commit graph file</title>
<updated>2020-04-06T18:08:37Z</updated>
<author>
<name>Garima Singh</name>
<email>garima.singh@microsoft.com</email>
</author>
<published>2020-04-06T16:59:49Z</published>
<link rel='alternate' type='text/html' href='https://www.git.shady.money/git/commit/?id=76ffbca71a9c89d1e530f734e16a70b3924f4bea'/>
<id>urn:sha1:76ffbca71a9c89d1e530f734e16a70b3924f4bea</id>
<content type='text'>
Update the technical documentation for commit-graph-format with
the formats for the Bloom filter index (BIDX) and Bloom filter
data (BDAT) chunks. Write the computed Bloom filter information
to the commit graph file using this format.
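
A sketch of how the two chunks relate, assuming the layout in the
commit-graph format documentation: BIDX holds one cumulative 4-byte
big-endian byte count per commit, and BDAT starts with three 32-bit
header fields followed by the concatenated filters. The function names
and the default header values are assumptions for this example.

```python
import struct

BDAT_HEADER_LEN = 12  # three unsigned 32-bit integers


def encode_bloom_chunks(filters, hash_version=1, num_hashes=7,
                        bits_per_entry=10):
    """Return (bidx, bdat) payloads for per-commit filters given in order."""
    index = b""
    total = 0
    for data in filters:
        total += len(data)
        index += struct.pack("!I", total)  # cumulative size, inclusive
    header = struct.pack("!III", hash_version, num_hashes, bits_per_entry)
    return index, header + b"".join(filters)


def filter_for(bidx, bdat, i):
    """Slice the i-th commit's filter back out of the BDAT payload."""
    start = 0 if i == 0 else struct.unpack_from("!I", bidx, 4 * (i - 1))[0]
    end = struct.unpack_from("!I", bidx, 4 * i)[0]
    return bdat[BDAT_HEADER_LEN + start:BDAT_HEADER_LEN + end]


bidx, bdat = encode_bloom_chunks([b"\x01\x02", b"", b"\xff"])
print(bidx.hex())  # prints 000000020000000200000003
```

Storing cumulative sizes lets a reader locate any commit's filter with
two index lookups, and an empty filter costs only its index entry.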

Helped-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Garima Singh &lt;garima.singh@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
