path: root/builtin
Age | Commit message | Author | Files | Lines
2025-04-17 | global: use designated initializers for options | Patrick Steinhardt | 20 | -132/+368
While we expose macros for most of our different option types understood by the "parse-options" subsystem, not every combination of fields has one, as that would otherwise quickly lead to an explosion of macros. Instead, we just initialize structures manually for those variants of fields that don't have a macro.

Callsites that open-code these structure initializations don't use designated initializers though and instead just provide values for each of the fields that they want to initialize. This has three significant downsides:

  - Callsites need to specify all values up to the last field that they care about. This often includes fields that should simply be left at their default zero-initialized state, which adds distraction.

  - Any reader not deeply familiar with the layout of the structure has a hard time figuring out what the respective initializers mean.

  - Reordering or introducing new fields in the middle of the structure is impossible without adapting all callsites.

Convert all sites to instead use designated initializers, which we have started using in our codebase quite a while ago. This allows us to skip any default-initialized fields, gives the reader context by specifying the field names and allows us to reorder or introduce new fields where we want to.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
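The difference between the two initialization styles can be sketched as follows. This is only an illustration: `struct demo_option`, its fields, and the `DEMO_*` constants are hypothetical and do not reproduce the real layout of the option structure in "parse-options".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option structure; NOT the real parse-options layout. */
struct demo_option {
	int type;
	int short_name;
	const char *long_name;
	void *value;
	const char *argh;
	const char *help;
	int flags;
};

enum { DEMO_OPT_STRING = 1, DEMO_FLAG_HIDDEN = 1 << 0 };

/*
 * Positional style: every field up to `flags` has to be spelled out,
 * including the ones that merely keep their zero default.
 */
static struct demo_option positional_style(const char **dest)
{
	struct demo_option opt = {
		DEMO_OPT_STRING, 0, "format", dest, NULL, "output format",
		DEMO_FLAG_HIDDEN
	};
	assert(dest != NULL);
	return opt;
}

/*
 * Designated style: only the interesting fields are named; the rest
 * (`short_name`, `argh`) are implicitly zero-initialized.
 */
static struct demo_option designated_style(const char **dest)
{
	struct demo_option opt = {
		.type = DEMO_OPT_STRING,
		.long_name = "format",
		.value = dest,
		.help = "output format",
		.flags = DEMO_FLAG_HIDDEN,
	};
	assert(dest != NULL);
	return opt;
}
```

Both helpers produce the same structure, but only the designated variant keeps working unchanged if a field is later inserted in the middle of `struct demo_option`.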
2025-03-29 | Merge branch 'en/random-cleanups' | Junio C Hamano | 1 | -1/+0
Miscellaneous code clean-ups.

* en/random-cleanups:
  merge-ort: remove extraneous word in comment
  merge-ort: fix accidental strset<->strintmap
  t7615: be more explicit about diff algorithm used
  t6423: fix a comment that accidentally reversed two commits
  stash: remove merge-recursive.h include
2025-03-29 | Merge branch 'tb/multi-cruft-pack-refresh-fix' | Junio C Hamano | 1 | -17/+101
Certain "cruft" objects would have never been refreshed when there are multiple cruft packs in the repository, which has been corrected.

* tb/multi-cruft-pack-refresh-fix:
  builtin/pack-objects.c: freshen objects from existing cruft packs
2025-03-29 | Merge branch 'jk/fetch-ref-prefix-cleanup' | Junio C Hamano | 1 | -27/+19
In protocol v2 where the refs advertisement is constrained, we try to tell the server side not to limit the advertisement when there is no specific need to, which has been the source of confusion and recent bugs. Revamp the logic to simplify.

* jk/fetch-ref-prefix-cleanup:
  fetch: use ref prefix list to skip ls-refs
  fetch: avoid ls-refs only to ask for HEAD symref update
  fetch: stop protecting additions to ref-prefix list
  fetch: ask server to advertise HEAD for config-less fetch
  refspec_ref_prefixes(): clean up refspec_item logic
  t5516: beef up exact-oid ref prefixes test
  t5516: drop NEEDSWORK about v2 reachability behavior
  t5516: prefer "oid" to "sha1" in some test titles
  t5702: fix typo in test name
2025-03-29 | Merge branch 'en/merge-ort-prepare-to-remove-recursive' | Junio C Hamano | 1 | -2/+3
First step of deprecating and removing merge-recursive.

* en/merge-ort-prepare-to-remove-recursive:
  am: switch from merge_recursive_generic() to merge_ort_generic()
  merge-ort: fix merge.directoryRenames=false
  t3650: document bug when directory renames are turned off
  merge-ort: support having merge verbosity be set to 0
  merge-ort: allow rename detection to be disabled
  merge-ort: add new merge_ort_generic() function
2025-03-29 | Merge branch 'ps/refname-avail-check-optim' | Junio C Hamano | 2 | -5/+12
The code paths to check whether a refname X is available (by seeing if another ref X/Y exists, etc.) have been optimized.

* ps/refname-avail-check-optim:
  refs: reuse iterators when determining refname availability
  refs/iterator: implement seeking for files iterators
  refs/iterator: implement seeking for packed-ref iterators
  refs/iterator: implement seeking for ref-cache iterators
  refs/iterator: implement seeking for reftable iterators
  refs/iterator: implement seeking for merged iterators
  refs/iterator: provide infrastructure to re-seek iterators
  refs/iterator: separate lifecycle from iteration
  refs: stop re-verifying common prefixes for availability
  refs/files: batch refname availability checks for initial transactions
  refs/files: batch refname availability checks for normal transactions
  refs/reftable: batch refname availability checks
  refs: introduce function to batch refname availability checks
  builtin/update-ref: skip ambiguity checks when parsing object IDs
  object-name: allow skipping ambiguity checks in `get_oid()` family
  object-name: introduce `repo_get_oid_with_flags()`
2025-03-29 | Merge branch 'cc/signed-fast-export-import' | Junio C Hamano | 2 | -50/+161
"git fast-export | git fast-import" learns to deal with commit and tag objects with embedded signatures a bit better.

* cc/signed-fast-export-import:
  fast-export, fast-import: add support for signed-commits
  fast-export: do not modify memory from get_commit_buffer
  git-fast-export.adoc: clarify why 'verbatim' may not be a good idea
  fast-export: rename --signed-tags='warn' to 'warn-verbatim'
  fast-export: fix missing whitespace after switch
  git-fast-import.adoc: add missing LF in the BNF
2025-03-26 | Merge branch 'ua/some-builtins-wo-the-repository' | Junio C Hamano | 7 | -61/+54
A handful of built-in command implementations have been rewritten to use the repository instance supplied by git.c:run_builtin(), its caller.

* ua/some-builtins-wo-the-repository:
  builtin/checkout-index: stop using `the_repository`
  builtin/for-each-ref: stop using `the_repository`
  builtin/ls-files: stop using `the_repository`
  builtin/pack-refs: stop using `the_repository`
  builtin/send-pack: stop using `the_repository`
  builtin/verify-commit: stop using `the_repository`
  builtin/verify-tag: stop using `the_repository`
  config: teach repo_config to allow `repo` to be NULL
2025-03-26 | Merge branch 'sj/ref-consistency-checks-more' | Junio C Hamano | 2 | -2/+33
"git fsck" becomes more careful when checking the refs.

* sj/ref-consistency-checks-more:
  builtin/fsck: add `git refs verify` child process
  packed-backend: check whether the "packed-refs" is sorted
  packed-backend: add "packed-refs" entry consistency check
  packed-backend: check whether the refname contains NUL characters
  packed-backend: add "packed-refs" header consistency check
  packed-backend: check if header starts with "# pack-refs with: "
  packed-backend: check whether the "packed-refs" is regular file
  builtin/refs: get worktrees without reading head information
  t0602: use subshell to ensure working directory unchanged
2025-03-26 | Merge branch 'jt/diff-pairs' | Junio C Hamano | 1 | -0/+207
A post-processing filter for "diff --raw" output has been introduced.

* jt/diff-pairs:
  builtin/diff-pairs: allow explicit diff queue flush
  builtin: introduce diff-pairs command
  diff: add option to skip resolving diff statuses
  diff: return diff_filepair from diff queue helpers
2025-03-18 | am: switch from merge_recursive_generic() to merge_ort_generic() | Elijah Newren | 1 | -2/+3
Switch from merge-recursive to merge-ort. Adjust the following testcases due to the switch:

  * t4151: This test left an untracked file in the way of the merge. merge-recursive could only sometimes tell when untracked files were in the way, and by the time it discovers others, it has already made too many changes to back out of the merge. So, instead of writing the results to e.g. 'file1' it would instead write them to 'file1~branch1'. This is confusing for users, because they might not notice 'file1~branch1' and accidentally add and commit 'file1'. In contrast, merge-ort correctly notices the file in the way before making any changes and aborts. Since this test didn't care about the file in the way, just remove it before calling git-am.

  * t4255: Usage of merge-ort allows us to change two known failures into successes.

  * t6427: As noted a few commits ago, the choice of conflict label for diff3 markers for the ancestor commit was previously handled by merge-recursive.c rather than by callers. Since that has now changed, `git am` needs to specify that label. Although the previous conflict label ("constructed merge base") was already somewhat slanted towards `git am`, let's use wording more along the lines of the related command-line flag from `git apply` and function involved to tie it more closely to `git am`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-17 | stash: remove merge-recursive.h include | Elijah Newren | 1 | -1/+0
stash was modified to use merge_ort_nonrecursive() instead of merge_recursive_generic() back in commit 874cf2a60444 (stash: apply stash using 'merge_ort_nonrecursive()', 2022-05-10). That makes the inclusion of merge-recursive.h unnecessary. In preparation for the removal of merge-recursive.h, remove the unnecessary include. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-13 | builtin/pack-objects.c: freshen objects from existing cruft packs | Taylor Blau | 1 | -17/+101
Once an object is written into a cruft pack, we can only freshen it by writing a new loose or packed copy of that object with a more recent mtime.

Prior to 61568efa95 (builtin/pack-objects.c: support `--max-pack-size` with `--cruft`, 2023-08-28), we typically had at most one cruft pack in a repository at any given time. So freshening unreachable objects was straightforward when already rewriting the cruft pack (and its *.mtimes file).

But 61568efa95 changes things: 'pack-objects' now supports writing multiple cruft packs when invoked with `--cruft` and the `--max-pack-size` flag. Cruft packs are rewritten until they reach some size threshold, at which point they are considered "frozen", and will only be modified in a pruning GC, or if the threshold itself is adjusted.

Prior to this patch, however, this process breaks down when we attempt to freshen an object packed in an earlier cruft pack, and that cruft pack is larger than the threshold and thus will survive the repack.

When this is the case, it is impossible to freshen objects in cruft pack(s) when those cruft packs are larger than the threshold. This is because we would avoid writing them in the new cruft pack entirely, for a couple of reasons.

  1. When enumerating packed objects via 'add_objects_in_unpacked_packs()' we pass the SKIP_IN_CORE_KEPT_PACKS flag, which is used to avoid looping over the packs we're going to retain (which are marked as kept in-core by 'read_cruft_objects()').

     This means that we will avoid enumerating additional packed copies of objects found in any cruft packs which are larger than the given size threshold. Thus there is no opportunity to call 'create_object_entry()' whatsoever.

  2. We likewise will discard the loose copy (if one exists) of any unreachable object packed in a cruft pack that is larger than the threshold. Here our call path is 'add_unreachable_loose_objects()', which uses the 'add_loose_object()' callback.

     That function will eventually land us in 'want_object_in_pack()' (via 'add_cruft_object_entry()'), and we'll discard the object as it appears in one of the packs which we marked as kept in-core.

This means in effect that it is impossible to freshen an unreachable object once it appears in a cruft pack larger than the given threshold.

Instead, we should pack an additional copy of an unreachable object we want to freshen even if it appears in a cruft pack, provided that the cruft copy has an mtime which is before the mtime of the copy we are trying to pack/freshen. This is sub-optimal in the sense that it requires keeping an additional copy of unreachable objects upon freshening, but we don't have a better alternative without the ability to make in-place modifications to existing *.mtimes files.

In order to implement this, we have to adjust the behavior of 'want_found_object()'. When 'pack-objects' is told that we're *not* going to retain any cruft packs (i.e. the set of packs marked as kept in-core does not contain a cruft pack), the behavior is unchanged.

But when there *is* at least one cruft pack that we're holding onto, it is no longer sufficient to reject a copy of an object found in that cruft pack for that reason alone. In this case, we only want to reject a candidate object when copies of that object either:

  - exist in a non-cruft pack that we are retaining, regardless of that pack's mtime, or

  - exist in a cruft pack with an mtime at least as recent as the copy we are debating whether or not to pack, in which case freshening would be redundant.

To do this, keep track of whether or not we have any cruft packs in our in-core kept list with a new 'ignore_packed_keep_in_core_has_cruft' flag. When we end up in this new special case, we replace a call to 'has_object_kept_pack()' with a call to 'want_cruft_object_mtime()', and only reject objects when we have a copy in an existing cruft pack with at least as recent an mtime as our candidate (in which case "freshening" would be redundant).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
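The rejection rule for the case where at least one retained pack is a cruft pack can be sketched roughly as below. This is a simplified model under stated assumptions, not the code in builtin/pack-objects.c: `struct packed_copy` and `want_candidate()` are hypothetical names, and the real implementation looks objects up in packs rather than receiving a prepared array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical description of one already-packed copy of an object. */
struct packed_copy {
	bool in_kept_pack;  /* containing pack is marked as kept in-core */
	bool pack_is_cruft; /* containing pack is a cruft pack */
	time_t mtime;       /* mtime recorded for the object (cruft packs only) */
};

/*
 * Return true if a candidate copy with `candidate_mtime` should be written
 * to the new cruft pack, false if it should be rejected.
 */
static bool want_candidate(const struct packed_copy *copies, size_t nr,
			   time_t candidate_mtime)
{
	size_t i;

	assert(copies != NULL || nr == 0);
	for (i = 0; i < nr; i++) {
		if (!copies[i].in_kept_pack)
			continue; /* that pack gets rewritten; no reason to reject */
		if (!copies[i].pack_is_cruft)
			return false; /* retained non-cruft pack: always reject */
		if (copies[i].mtime >= candidate_mtime)
			return false; /* cruft copy is as fresh: redundant */
	}
	return true; /* only older cruft copies exist: pack a fresher copy */
}
```

A candidate with mtime 200 is accepted when the only retained copy sits in a cruft pack with mtime 100, and rejected when that copy has mtime 200 or later.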
2025-03-12 | refs/iterator: separate lifecycle from iteration | Patrick Steinhardt | 1 | -0/+2
The ref and reflog iterators have their lifecycle attached to iteration: once the iterator reaches its end, it is automatically released and the caller doesn't have to care about that anymore. When the iterator should be released before it has been exhausted, callers must explicitly abort the iterator via `ref_iterator_abort()`.

This lifecycle is somewhat unusual in the Git codebase and creates two problems:

  - Callsites need to be very careful about when exactly they call `ref_iterator_abort()`, as calling the function is only valid when the iterator itself still is. This leads to somewhat awkward calling patterns in some situations.

  - It is impossible to reuse iterators and re-seek them to a different prefix. This feature isn't supported by any iterator implementation except for the reftable iterators anyway, but if it was implemented it would allow us to optimize cases where we need to search for specific references repeatedly by reusing internal state.

Detangle the lifecycle from iteration so that we don't deallocate the iterator anymore once it is exhausted. Instead, callers are now expected to always call a newly introduced `ref_iterator_free()` function that deallocates the iterator and its internal state.

Note that the `dir_iterator` is somewhat special because it does not implement the `ref_iterator` interface, but is only used to implement other iterators. Consequently, we have to provide `dir_iterator_free()` instead of `dir_iterator_release()` as the allocated structure itself is managed by the `dir_iterator` interfaces, as well, and not freed by `ref_iterator_free()` like in all the other cases.

While at it, drop the return value of `ref_iterator_abort()`, which wasn't really required by any of the iterator implementations anyway. Furthermore, stop calling `base_ref_iterator_free()` in any of the backends, but instead call it in `ref_iterator_free()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
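The lifecycle described above, where exhausting an iterator no longer frees it and the caller always frees it explicitly, can be illustrated with a toy iterator. Everything here (`struct name_iterator` and its functions) is a hypothetical stand-in over a plain string array, not the `ref_iterator` interface itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { ITER_OK = 0, ITER_DONE = 1 };

/* Hypothetical iterator over a borrowed array of names. */
struct name_iterator {
	const char **names;  /* borrowed, not owned */
	size_t nr;
	size_t pos;
	const char *prefix;  /* borrowed, not owned */
	const char *current; /* valid after ITER_OK, NULL otherwise */
};

static struct name_iterator *name_iterator_begin(const char **names, size_t nr)
{
	struct name_iterator *it = calloc(1, sizeof(*it));

	if (!it)
		abort();
	it->names = names;
	it->nr = nr;
	it->prefix = "";
	return it;
}

/* Reposition the iterator; valid even after it has been exhausted. */
static void name_iterator_seek(struct name_iterator *it, const char *prefix)
{
	assert(it != NULL && prefix != NULL);
	it->prefix = prefix;
	it->pos = 0;
	it->current = NULL;
}

/* Advance to the next name matching the prefix. */
static int name_iterator_advance(struct name_iterator *it)
{
	size_t len;

	assert(it != NULL);
	len = strlen(it->prefix);
	while (it->pos < it->nr) {
		const char *name = it->names[it->pos++];

		if (!strncmp(name, it->prefix, len)) {
			it->current = name;
			return ITER_OK;
		}
	}
	it->current = NULL;
	return ITER_DONE; /* exhausted, but deliberately NOT freed */
}

/* The only place where the iterator is deallocated. */
static void name_iterator_free(struct name_iterator *it)
{
	free(it);
}
```

Because `name_iterator_advance()` never frees anything, a caller can seek to "refs/tags/", run to exhaustion, seek again to "refs/heads/", and finally call `name_iterator_free()` exactly once.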
2025-03-12 | builtin/update-ref: skip ambiguity checks when parsing object IDs | Patrick Steinhardt | 1 | -5/+10
Most of the commands in git-update-ref(1) accept an old and/or new object ID to update a specific reference to. These object IDs get parsed via `repo_get_oid()`, which not only handles plain object IDs, but also those that have a suffix like "~" or "^2". More surprisingly though, it even knows to resolve arbitrary revisions, despite the fact that its manpage does not mention this fact even once.

One consequence of this is that we also check for ambiguous references: when parsing a full object ID where the DWIM mechanism would also cause us to resolve it as a branch, we'd end up printing a warning. While this check makes sense to have in general, it is arguably less useful in the context of git-update-ref(1). This is due to multiple reasons:

  - The manpage is explicitly structured around object IDs. So if we see a fully blown object ID, the intent should be quite clear in general.

  - The command is part of our plumbing layer and not a tool that users would generally use in interactive workflows. As such, the warning will likely not be visible to anybody in the first place.

  - Users can and should use the fully-qualified refname in case there is any potential for ambiguity. And given that this command is part of our plumbing layer, one should always try to be as defensive as possible and use fully-qualified refnames.

Furthermore, this check can be quite expensive when updating lots of references via `--stdin`, because we try to read multiple references per object ID that we parse according to the DWIM rules. This effect can be seen both with the "files" and "reftable" backend.

The issue is not unique to git-update-ref(1), but was also an issue in git-cat-file(1), where it was addressed by disabling the ambiguity check in 25fba78d36b (cat-file: disable object/refname ambiguity check for batch mode, 2013-07-12).

Disable the warning in git-update-ref(1), which provides a significant speedup with both backends. The user-visible outcome is unchanged even when ambiguity exists, except that we don't show the warning anymore.

The following benchmark creates 10000 new references with 100000 preexisting refs with the "files" backend:

    Benchmark 1: update-ref: create many refs (refformat = files, preexisting = 100000, new = 10000, revision = HEAD~)
      Time (mean ± σ):     467.3 ms ±   5.1 ms    [User: 100.0 ms, System: 365.1 ms]
      Range (min … max):   461.9 ms … 479.3 ms    10 runs

    Benchmark 2: update-ref: create many refs (refformat = files, preexisting = 100000, new = 10000, revision = HEAD)
      Time (mean ± σ):     394.1 ms ±   5.8 ms    [User: 63.3 ms, System: 327.6 ms]
      Range (min … max):   384.9 ms … 405.7 ms    10 runs

    Summary
      update-ref: create many refs (refformat = files, preexisting = 100000, new = 10000, revision = HEAD) ran
        1.19 ± 0.02 times faster than update-ref: create many refs (refformat = files, preexisting = 100000, new = 10000, revision = HEAD~)

And with the "reftable" backend:

    Benchmark 1: update-ref: create many refs (refformat = reftable, preexisting = 100000, new = 10000, revision = HEAD~)
      Time (mean ± σ):     146.9 ms ±   2.2 ms    [User: 90.4 ms, System: 56.0 ms]
      Range (min … max):   142.7 ms … 150.8 ms    19 runs

    Benchmark 2: update-ref: create many refs (refformat = reftable, preexisting = 100000, new = 10000, revision = HEAD)
      Time (mean ± σ):      63.2 ms ±   1.1 ms    [User: 41.0 ms, System: 21.8 ms]
      Range (min … max):    61.1 ms …  66.6 ms    41 runs

    Summary
      update-ref: create many refs (refformat = reftable, preexisting = 100000, new = 10000, revision = HEAD) ran
        2.32 ± 0.05 times faster than update-ref: create many refs (refformat = reftable, preexisting = 100000, new = 10000, revision = HEAD~)

Note that the absolute improvement with both backends is roughly in the same ballpark, but the relative improvement for the "reftable" backend is more significant because writing the new table to disk is faster in the first place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10 | fast-export, fast-import: add support for signed-commits | Luke Shumaker | 2 | -20/+126
fast-export has a --signed-tags= option that controls how to handle tag signatures. However, there is no equivalent for commit signatures; it just silently strips the signature out of the commit (analogously to --signed-tags=strip). While signatures are generally problematic for fast-export/fast-import (because hashes are likely to change), if they're going to support tag signatures, there's no reason to not also support commit signatures. So, implement a --signed-commits= option that mirrors the --signed-tags= option. On the fast-export side, try to be as much like signed-tags as possible, in both implementation and in user-interface. This will change the default behavior to '--signed-commits=abort' from what is now '--signed-commits=strip'. In order to provide an escape hatch for users of third-party tools that call fast-export and do not yet know of the --signed-commits= option, add an environment variable 'FAST_EXPORT_SIGNED_COMMITS_NOABORT=1' that changes the default to '--signed-commits=warn-strip'. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10 | fast-export: do not modify memory from get_commit_buffer | Luke Shumaker | 1 | -28/+33
fast-export's helper function find_encoding() takes a `const char *`, but modifies that memory despite the `const`. Ultimately, this memory came from get_commit_buffer(), and you're not supposed to modify the memory that you get from get_commit_buffer(). So, get rid of find_encoding() in favor of commit.h:find_commit_header(), which gives back a string length, rather than mutating the memory to insert a '\0' terminator. Because find_commit_header() detects the "\n\n" string that separates the headers and the commit message, move the call to be above the `message = strstr(..., "\n\n")` call. This helps readability, and allows for the value of `encoding` to be used for a better value of "..." so that the same memory doesn't need to be checked twice. Introduce a `commit_buffer_cursor` variable to avoid writing an awkward `encoding ? encoding + encoding_len : committer_end` expression. Signed-off-by: Luke Shumaker <lukeshu@datawire.io> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
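The idea of returning a pointer plus a length, instead of writing a '\0' into a buffer that is supposed to be read-only, can be sketched like this. `demo_find_header()` is a hypothetical helper written for illustration only; it is loosely modeled on the behavior described above and is not the implementation of `find_commit_header()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up header `key` in the header section of `buf` (everything before
 * the first empty line). On success, return a pointer to the value inside
 * `buf` and store its length in `*out_len`; `buf` is never modified.
 * Return NULL if the header is not present.
 */
static const char *demo_find_header(const char *buf, const char *key,
				    size_t *out_len)
{
	size_t key_len;
	const char *line = buf;

	assert(buf != NULL && key != NULL && out_len != NULL);
	key_len = strlen(key);

	/* An empty line ("\n\n" in the buffer) ends the headers. */
	while (*line && *line != '\n') {
		const char *eol = strchr(line, '\n');

		if (!eol)
			eol = line + strlen(line);
		if (!strncmp(line, key, key_len) && line[key_len] == ' ') {
			*out_len = (size_t)(eol - line) - key_len - 1;
			return line + key_len + 1;
		}
		line = *eol ? eol + 1 : eol;
	}
	return NULL;
}
```

A caller that needs a NUL-terminated string copies `*out_len` bytes into its own buffer; the commit buffer itself stays untouched, and text after the blank line (the commit message) is never mistaken for a header.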
2025-03-10 | fast-export: rename --signed-tags='warn' to 'warn-verbatim' | Luke Shumaker | 1 | -4/+4
The --signed-tags= option takes one of five arguments specifying how to handle signed tags during export. Among these arguments, 'strip' is to 'warn-strip' as 'verbatim' is to 'warn' (the unmentioned argument is 'abort', which stops the fast-export process entirely). That is, signatures are either stripped or copied verbatim while exporting, with or without a warning.

Match the pattern and rename 'warn' to 'warn-verbatim' to make it clear that it instructs fast-export to copy signatures verbatim.

To maintain backwards compatibility, 'warn' is still recognized as a deprecated synonym of 'warn-verbatim'.

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
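A parser that accepts the five arguments and keeps 'warn' as a deprecated synonym might look like the sketch below. The enum and function names are hypothetical and chosen for this illustration; they are not taken from builtin/fast-export.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mode enum for illustration. */
enum demo_signed_mode {
	DEMO_VERBATIM,
	DEMO_WARN_VERBATIM,
	DEMO_STRIP,
	DEMO_WARN_STRIP,
	DEMO_ABORT
};

/* Return 0 and set `*mode` on success, -1 for an unknown argument. */
static int demo_parse_signed_mode(const char *arg, enum demo_signed_mode *mode)
{
	assert(arg != NULL && mode != NULL);

	if (!strcmp(arg, "abort"))
		*mode = DEMO_ABORT;
	else if (!strcmp(arg, "verbatim"))
		*mode = DEMO_VERBATIM;
	else if (!strcmp(arg, "warn-verbatim") ||
		 !strcmp(arg, "warn")) /* deprecated synonym, kept for compatibility */
		*mode = DEMO_WARN_VERBATIM;
	else if (!strcmp(arg, "strip"))
		*mode = DEMO_STRIP;
	else if (!strcmp(arg, "warn-strip"))
		*mode = DEMO_WARN_STRIP;
	else
		return -1;
	return 0;
}
```

Mapping both spellings to the same enum value means the rest of the code only ever sees the new name, so the old one can be dropped later by deleting a single comparison.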
2025-03-10 | fast-export: fix missing whitespace after switch | Christian Couder | 1 | -4/+4
"Documentation/CodingGuidelines" says that there should be whitespaces around operators like 'if', 'switch', 'for', etc. Let's fix this in "builtin/fast-export.c". Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10 | fetch: use ref prefix list to skip ls-refs | Jeff King | 1 | -20/+7
In git-fetch we have an optimization to avoid issuing an ls-refs command to the server if we don't care about the value of any refs (e.g., because we are fetching exact object ids), saving a round-trip to the server. This comes from e70a3030e7 (fetch: do not list refs if fetching only hashes, 2018-09-27).

It uses an explicit flag "must_list_refs" to decide when we need to do so. That was needed back then, because the list of ref-prefixes was not always complete. If it was empty, it did not necessarily mean that we were not interested in any refs. But that is no longer the case; an empty list of prefixes means that we truly do not care about any refs.

And so rather than an explicit flag, we can just check whether we are interested in any ref prefixes. This simplifies the code slightly, as there is now a single source of truth for the decision.

It also fixes a bug in / optimizes a very unlikely case, which is:

    git fetch $remote ^foo $oid

I.e., a negative refspec combined with an exact oid fetch. This is somewhat nonsense, in that there are no positive refspecs mentioning refs to countermand with the negative one. But we should be able to do this without issuing an ls-refs command (excluding "foo" from the empty set will obviously still be the empty set).

However, the current code does not do so. The negative refspec is not counted as a noop in un-setting the must_list_refs flag (hardly the fault of e70a3030e7, as negative refspecs did not appear until much later). But by using the prefix list as a source of truth, this naturally just works; the negative refspec does not add a prefix to ask about, and hence does not trigger the ls-refs call.

This is esoteric enough that I didn't bother adding a test. The real value here is in the code simplification.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
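The "single source of truth" idea can be sketched as follows: derive the ls-refs decision from the number of ref prefixes instead of maintaining a separate flag. This is a deliberately simplified model; `struct demo_refspec` and both functions are hypothetical, and it only considers prefixes contributed by refspecs (other features, such as tag following, may add prefixes of their own in the real code).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified refspec; not Git's refspec_item. */
struct demo_refspec {
	const char *src;
	bool negative;  /* e.g. "^foo" */
	bool exact_oid; /* source is a full object ID */
};

/* Count how many ref prefixes the refspecs would ask the server about. */
static size_t demo_count_ref_prefixes(const struct demo_refspec *specs,
				      size_t nr)
{
	size_t i, count = 0;

	assert(specs != NULL || nr == 0);
	for (i = 0; i < nr; i++) {
		/* Neither kind needs to see any advertised ref. */
		if (specs[i].negative || specs[i].exact_oid)
			continue;
		count++;
	}
	return count;
}

/* ls-refs is needed only if there is at least one prefix to ask about. */
static bool demo_need_ls_refs(const struct demo_refspec *specs, size_t nr)
{
	return demo_count_ref_prefixes(specs, nr) > 0;
}
```

With this shape, the `^foo $oid` combination from the message needs no special handling: neither refspec contributes a prefix, so the count stays at zero and no ls-refs request is made.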
2025-03-10 | fetch: avoid ls-refs only to ask for HEAD symref update | Jeff King | 1 | -3/+2
When we fetch from a configured remote, we may try to update the local refs/remotes/<origin>/HEAD, and so we ask the server to advertise its HEAD to us. But if we aren't otherwise asking about any refs at all, then we know this HEAD update can never happen! To consider a new value for HEAD, the set_head() function uses guess_remote_head(). And even if it sees an explicit symref value for HEAD, it will only report that as a match if we also saw that remote ref advertised, and it mapped to a local tracking ref via get_fetch_map(). In other words, a fetch like this: git fetch origin $exact_oid:refs/heads/foo can never update HEAD, because we will never have fetched (nor even see the advertisement for) the ref that HEAD points to. Currently the command above will still call ls-refs to ask about the HEAD, even though it is pointless. This patch teaches it to skip the ls-refs call entirely in this case, which avoids a round-trip to the server. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10 | fetch: stop protecting additions to ref-prefix list | Jeff King | 1 | -6/+4
When using the ref-prefix feature of protocol v2, a client which sends no prefixes at all will get the full advertisement. And so the code in git-fetch was historically loose about setting up that list based on our refspecs. There were cases where we needed to know about some refs, so we just didn't add anything to the ref-prefix list. And hence further code, like that for tag-following and updating origin/HEAD, had to be careful about adding to an empty list. E.g., see the bug fixed by bd52d9a058 (fetch: fix following tags when fetching specific OID, 2025-03-07). But the previous commit removed the last such case, and now we know an empty ref-prefix list (at least inside git-fetch's do_fetch() function) means that we really don't need to see any refs. So we can drop those extra conditionals. This simplifies the code a little. But it also means that some cases can now use ref prefixes when they would not otherwise. As the test shows, fetching an exact oid into a local ref can now avoid enumerating all of the refs. The refspec itself doesn't need to know about any remote refs, and the tag auto-following can just ask about refs/tags/. The same is true for asking about HEAD to update the local origin/HEAD. I didn't add a test for that yet, though, as we can optimize it even further. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10 | fetch: ask server to advertise HEAD for config-less fetch | Jeff King | 1 | -0/+8
If we're not given any refspecs (either on the command line or via config) and we have no branch merge config, then we fetch the remote HEAD into our local FETCH_HEAD. In that case we do not send any ref-prefix option to the server at all, and we see the full advertisement.

But this is sub-optimal. We only care about HEAD, so we can just ask for that, and ignore all of the other refs. The new test demonstrates a case where we see fewer refs (in this case only one less, but in theory we could be ignoring millions of them).

This also removes the only case where we care about seeing some refs from the other side, but don't add anything to the ref_prefixes list. Cleaning this up means one less maintenance burden. Before this patch, any code which wanted to add to the list had to make sure the list was not empty, since an empty list meant "ask for everything". Now it really means "we are not interested in any refs". This should let us optimize a few more cases in subsequent patches.

Note that we'll add "HEAD" to the list of prefixes, and later code for updating "refs/remotes/<remote>/HEAD" may likewise do so. In theory this could cause duplicates in the list, but in practice these can't both trigger. We hit our new case only if there are no refspecs, and the "<remote>/HEAD" feature is enabled only when we are fetching from a remote with configured refspecs. We could be defensive with a flag, but it didn't seem worth it to me (the absolute worst case is a useless redundant ref-prefix line sent to the server).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10 | Merge branch 'tb/fetch-follow-tags-fix' | Junio C Hamano | 1 | -1/+3
* tb/fetch-follow-tags-fix:
  fetch: fix following tags when fetching specific OID
2025-03-07 | builtin/checkout-index: stop using `the_repository` | Usman Akinyemi | 1 | -22/+21
Remove the_repository global variable in favor of the repository argument that gets passed in "builtin/checkout-index.c". When `-h` is passed to the command outside a Git repository, the `run_builtin()` will call the `cmd_checkout_index()` function with `repo` set to NULL and then early in the function, `show_usage_with_options_if_asked()` call will give the options help and exit. Pass an instance of "struct index_state" available in the calling context to both `checkout_all()` and `checkout_file()` to remove their dependency on the global `the_repository` variable. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07 | builtin/for-each-ref: stop using `the_repository` | Usman Akinyemi | 1 | -3/+2
Remove the_repository global variable in favor of the repository argument that gets passed in "builtin/for-each-ref.c". When `-h` is passed to the command outside a Git repository, the `run_builtin()` will call the `cmd_for_each_ref()` function with `repo` set to NULL and then early in the function, `parse_options()` call will give the options help and exit. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07 | builtin/ls-files: stop using `the_repository` | Usman Akinyemi | 1 | -16/+16
Remove the_repository global variable in favor of the repository argument that gets passed in "builtin/ls-files.c". When `-h` is passed to the command outside a Git repository, the `run_builtin()` will call the `cmd_ls_files()` function with `repo` set to NULL and then early in the function, `show_usage_with_options_if_asked()` call will give the options help and exit. Pass the repository available in the calling context to both `expand_objectsize()` and `show_ru_info()` to remove their dependency on the global `the_repository` variable. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07 | builtin/pack-refs: stop using `the_repository` | Usman Akinyemi | 1 | -5/+3
Remove the_repository global variable in favor of the repository argument that gets passed in "builtin/pack-refs.c". When `-h` is passed to the command outside a Git repository, the `run_builtin()` will call the `cmd_pack_refs()` function with `repo` set to NULL and then early in the function, `parse_options()` call will give the options help and exit. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07builtin/send-pack: stop using `the_repository`Usman Akinyemi1-4/+3
Remove the_repository global variable in favor of the repository argument that gets passed in "builtin/send-pack.c". When `-h` is passed to the command outside a Git repository, the `run_builtin()` will call the `cmd_send_pack()` function with `repo` set to NULL and then early in the function, `parse_options()` call will give the options help and exit. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07builtin/verify-commit: stop using `the_repository`Usman Akinyemi1-7/+6
Remove the_repository global variable in favor of the repository argument that gets passed in "builtin/verify-commit.c". When `-h` is passed to the command outside a Git repository, the `run_builtin()` will call the `cmd_verify_commit()` function with `repo` set to NULL and then early in the function, `parse_options()` call will give the options help and exit. Pass the repository available in the calling context to `verify_commit()` to remove its dependency on the global `the_repository` variable. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07builtin/verify-tag: stop using `the_repository`Usman Akinyemi1-4/+3
Remove the_repository global variable in favor of the repository argument that gets passed in "builtin/verify-tag.c". When `-h` is passed to the command outside a Git repository, the `run_builtin()` will call the `cmd_verify_tag()` function with `repo` set to NULL and then early in the function, `parse_options()` call will give the options help and exit. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07fetch: fix following tags when fetching specific OIDTaylor Blau1-1/+3
Commit 3f763ddf28 (fetch: set remote/HEAD if it does not exist, 2024-11-22) unconditionally adds "HEAD" to the list of ref prefixes we send to the server. This breaks a core assumption that the list of prefixes we send to the server is complete. We must either send all prefixes we care about, or none at all (in the latter case the server then advertises everything). The tag following code is careful to only add "refs/tags/" to the list of prefixes if there are already entries in the prefix list. But because the new code from 3f763ddf28 runs after the tag code, and because it unconditionally adds to the prefix list, we may end up with a prefix list that _should_ have "refs/tags/" in it, but doesn't. When that is the case, the server does not advertise any tags, and our auto-following breaks because we never learned about any tags in the first place. Fix this by only adding "HEAD" to the ref prefixes when we know that we are already limiting the advertisement. In either case we'll learn about HEAD (either through the limited advertisement, or implicitly through a full advertisement). Reported-by: Igor Todorovski <itodorov@ca.ibm.com> Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
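The "all prefixes or none" rule from this message can be sketched as below. This is an illustration under stated assumptions, not the code in "builtin/fetch.c": `struct prefix_list`, `prefix_list_add()` and `add_if_limiting()` are hypothetical names, and the list is a fixed-size array to keep the example short. An empty list stands for "the server advertises everything", so a prefix is only appended when the list is already limiting the advertisement.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PREFIXES 8

/* Hypothetical, fixed-size stand-in for the list of ref prefixes. */
struct prefix_list {
	const char *items[MAX_PREFIXES];
	size_t nr;
};

/* Unconditionally append a prefix (the behavior that caused the bug). */
void prefix_list_add(struct prefix_list *list, const char *prefix)
{
	assert(list->nr < MAX_PREFIXES);
	list->items[list->nr++] = prefix;
}

/*
 * Append a prefix only when the list is already non-empty. An empty
 * list means "no limit", so adding a single entry to it would turn a
 * full advertisement into one that hides everything else.
 */
void add_if_limiting(struct prefix_list *list, const char *prefix)
{
	if (list->nr)
		prefix_list_add(list, prefix);
}
```

With this helper, the order in which "refs/tags/" and "HEAD" are considered no longer matters: neither of them can be the entry that turns an unlimited request into a limited one.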
2025-03-05Merge branch 'kn/ref-migrate-skip-reflog'Junio C Hamano1-1/+1
Usage string of "git refs" has been corrected. * kn/ref-migrate-skip-reflog: refs: show --no-reflog in the help text
2025-03-05Merge branch 'ps/path-sans-the-repository'Junio C Hamano19-58/+106
The path.[ch] API takes an explicit repository parameter passed throughout the callchain, instead of relying on the_repository singleton instance. * ps/path-sans-the-repository: path: adjust last remaining users of `the_repository` environment: move access to "core.sharedRepository" into repo settings environment: move access to "core.hooksPath" into repo settings repo-settings: introduce function to clear struct path: drop `git_path()` in favor of `repo_git_path()` rerere: let `rerere_path()` write paths into a caller-provided buffer path: drop `git_common_path()` in favor of `repo_common_path()` worktree: return allocated string from `get_worktree_git_dir()` path: drop `git_path_buf()` in favor of `repo_git_path_replace()` path: drop `git_pathdup()` in favor of `repo_git_path()` path: drop unused `strbuf_git_path()` function path: refactor `repo_submodule_path()` family of functions submodule: refactor `submodule_to_gitdir()` to accept a repo path: refactor `repo_worktree_path()` family of functions path: refactor `repo_git_path()` family of functions path: refactor `repo_common_path()` family of functions
2025-03-03refs: show --no-reflog in the help textJunio C Hamano1-1/+1
We forgot that we must keep the documentation and help text in sync. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03builtin/diff-pairs: allow explicit diff queue flushJustin Tobler1-0/+14
The diffs queued from git-diff-pairs(1) are flushed when stdin is closed. To enable greater flexibility, allow control over when the diff queue is flushed by writing a single NUL byte on stdin between input file pairs. Diff output between flushes is separated by a single NUL byte. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
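The flush protocol can be illustrated with a small tokenizer. This is a rough sketch and not the parser in "builtin/diff-pairs.c": `count_batches()` is a hypothetical helper, and it assumes that a lone NUL shows up as an empty NUL-terminated token, that every non-empty token belongs to the pending batch, and that the end of the buffer plays the role of stdin being closed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count how many non-empty batches a NUL-delimited input contains.
 * An empty token (a lone NUL byte) flushes the pending batch; the end
 * of the buffer flushes whatever is still pending, like closing stdin.
 */
size_t count_batches(const char *buf, size_t len)
{
	size_t pos = 0, pending = 0, batches = 0;

	assert(buf != NULL || len == 0);

	while (pos < len) {
		const char *nul = memchr(buf + pos, '\0', len - pos);
		size_t token_len = nul ? (size_t)(nul - (buf + pos)) : len - pos;

		if (!token_len) {
			/* Explicit flush request. */
			if (pending)
				batches++;
			pending = 0;
		} else {
			pending++;
		}
		pos += token_len + 1; /* skip the token and its terminator */
	}

	if (pending)
		batches++; /* implicit flush at end of input */
	return batches;
}
```

Feeding the seven bytes `a\0b\0\0c\0` through this function yields two batches: `a` and `b` are flushed by the lone NUL, and `c` is flushed when the input ends.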
2025-03-03builtin: introduce diff-pairs commandJustin Tobler1-0/+193
Through git-diff(1), a single diff can be generated from a pair of blob revisions directly. Unfortunately, there is not a mechanism to compute batches of specific file pair diffs in a single process. Such a feature is particularly useful on the server-side where diffing between a large set of changes is not feasible all at once due to timeout concerns. To facilitate this, introduce git-diff-pairs(1) which acts as a backend passing its NUL-terminated raw diff format input from stdin through diff machinery to produce various forms of output such as patch or raw. The raw format was originally designed as an interchange format and represents the contents of the diff_queued_diff list making it possible to break the diff pipeline into separate stages. For example, git-diff-tree(1) can be used as a frontend to compute file pairs to queue and feed its raw output to git-diff-pairs(1) to compute patches. With this, batches of diffs can be progressively generated without having to recompute renames or retrieve object context. Something like the following: `git diff-tree -r -z -M $old $new | git diff-pairs -p -z` should generate the same output as `git diff-tree -p -M`. Furthermore, each line of raw diff formatted input can also be individually fed to a separate git-diff-pairs(1) process and still produce the same output. Based-on-patch-by: Jeff King <peff@peff.net> Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28path: adjust last remaining users of `the_repository`Patrick Steinhardt1-1/+1
With the preceding refactorings we now only have a couple of implicit users of `the_repository` left in the "path" subsystem, all of which depend on global state via `calc_shared_perm()`. Make the dependency on `the_repository` explicit by passing the repo as a parameter instead and adjust callers accordingly. Note that this change bubbles up into a couple of subsystems that were previously declared as free from `the_repository`. Instead of marking all of them as `the_repository`-dependent again, we instead use the repository that is available in the calling context. There are three exceptions though with "copy.c", "pack-write.c" and "tempfile.c". Adjusting these would require us to adapt callsites all over the place, so this is left for a future iteration. Mark "path.c" as free from `the_repository`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28environment: move access to "core.sharedRepository" into repo settingsPatrick Steinhardt2-7/+7
As with the preceding commit, we track "core.sharedRepository" via a pair of global variables. Move them into `struct repo_settings` so that we can instead track them per-repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28path: drop `git_path()` in favor of `repo_git_path()`Patrick Steinhardt7-18/+43
Remove `git_path()` in favor of the `repo_git_path()` family of functions, which makes the implicit dependency on `the_repository` go away. Note that `git_path()` returned a string allocated via `get_pathname()`, which uses a rotating set of statically allocated buffers. Consequently, callers didn't have to free the returned string. The same isn't true for `repo_common_path()`, so we also have to add logic to free the returned strings. This refactoring also allows us to remove `repo_common_pathv()` as well as `get_pathname()` from the public interface. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
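The ownership difference described here can be sketched as follows. This is an illustration only: `rotating_path()` and `allocated_path()` are hypothetical helpers, not Git's `get_pathname()` or `repo_git_path()`, and the buffer count and size are arbitrary. The first returns a pointer into a rotating set of static buffers that the caller must not free and that is silently overwritten after enough later calls; the second returns memory the caller owns and has to free.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NR_BUFFERS 4
#define BUFFER_LEN 256

/*
 * Format "dir/name" into one of NR_BUFFERS static buffers, used in
 * rotation. The result needs no free(), but it only stays valid until
 * the same buffer comes around again.
 */
const char *rotating_path(const char *dir, const char *name)
{
	static char buffers[NR_BUFFERS][BUFFER_LEN];
	static unsigned int next;
	char *buf = buffers[next++ % NR_BUFFERS];

	assert(dir && name);
	snprintf(buf, BUFFER_LEN, "%s/%s", dir, name);
	return buf;
}

/*
 * Format "dir/name" into freshly allocated memory. The result stays
 * valid until the caller frees it. Returns NULL if allocation fails.
 */
char *allocated_path(const char *dir, const char *name)
{
	size_t len;
	char *buf;

	assert(dir && name);
	len = strlen(dir) + strlen(name) + 2; /* '/' and terminating NUL */
	buf = malloc(len);
	if (!buf)
		return NULL;
	snprintf(buf, len, "%s/%s", dir, name);
	return buf;
}
```

A caller that switches from the first style to the second gains a stable string but takes on the `free()` call, which is the kind of extra logic the message refers to.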
2025-02-28rerere: let `rerere_path()` write paths into a caller-provided bufferPatrick Steinhardt1-4/+7
Same as with `get_worktree_git_dir()` a couple of commits ago, the `rerere_path()` function returns paths that need not be free'd by the caller because `git_path()` internally uses `get_pathname()`. Refactor the function to instead accept a caller-provided buffer that the path will be written into, passing on ownership to the caller. This refactoring prepares us for the removal of `git_path()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-27Merge branch 'kn/ref-migrate-skip-reflog'Junio C Hamano1-0/+3
"git refs migrate" can optionally be told not to migrate the reflog. * kn/ref-migrate-skip-reflog: builtin/refs: add '--no-reflog' flag to drop reflogs
2025-02-27Merge branch 'ua/os-version-capability'Junio C Hamano1-11/+2
The value of "uname -s" is by default sent over the wire as a part of the "version" capability. * ua/os-version-capability: agent: advertise OS name via agent capability t5701: add setup test to remove side-effect dependency version: extend get_uname_info() to hide system details version: refactor get_uname_info() version: refactor redact_non_printables() version: replace manual ASCII checks with isprint() for clarity
2025-02-27builtin/fsck: add `git refs verify` child processshejialuo1-1/+32
By now, we have implemented the ref consistency checks for both "files-backend" and "packed-backend". Although some of the checks are redundant, they cause no trouble. So, let's integrate them into the "git-fsck(1)" command to get feedback from users. Calling "git refs verify" from "git-fsck(1)" also makes sure that the newly added checks don't break. Introduce a new function "fsck_refs" that initializes and runs a child process to execute the "git refs verify" command. To give the user feedback, create a progress meter whose total is 1, because it is hard to know up front how many loose refs we will check. We might improve this later. Then, introduce an option that allows the user to disable checking ref database consistency. Run this function first in "git-fsck(1)", because we don't want the existing code of "git-fsck(1)", which implicitly checks the consistency of refs, to die before it. Last, update the test to exercise the code. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-27builtin/refs: get worktrees without reading head informationshejialuo1-1/+1
In "packed-backend.c", there are some functions such as "create_snapshot" and "next_record" which check the correctness of the content of the "packed-refs" file. When anything is bad, the program will die. This may seem irrelevant to the new checks, because we are going to read and parse the raw "packed-refs" file without creating the snapshot and without using the ref iterator to check the consistency. However, when using "get_worktrees" in "builtin/refs", we parse the "HEAD" information. If the referent of "HEAD" is inside "packed-refs", we call the "create_snapshot" function to parse "packed-refs" and get the information. Regardless of whether the entry for "HEAD" in "packed-refs" is correct, "create_snapshot" calls "verify_buffer_safe" to check whether the last line of the file ends with a newline. If not, the program will die. Although this behavior is harmless, it short-circuits the program. When users execute "git refs verify" or "git fsck", we should avoid reading the head information, which may trigger a read in the packed backend whose stricter checks make the program die. Instead, we should continue to check the other parts of the "packed-refs" file completely. Fortunately, in 465a22b338 (worktree: skip reading HEAD when repairing worktrees, 2023-12-29), we have introduced a function "get_worktrees_internal" which allows us to get worktrees without reading head information. Create a new exposed function "get_worktrees_without_reading_head", then replace "get_worktrees" in "builtin/refs" with the newly created function. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-26Merge branch 'jk/check-mailmap-wo-name-fix'Junio C Hamano1-1/+1
"git check-mailmap" segfault fix. * jk/check-mailmap-wo-name-fix: mailmap: fix check-mailmap with full mailmap line
2025-02-25Merge branch 'pw/merge-tree-stdin-deadlock-fix'Junio C Hamano1-6/+5
"git merge-tree --stdin" has been improved (including a workaround for a deadlock). * pw/merge-tree-stdin-deadlock-fix: merge-tree: fix link formatting in html docs merge-tree: improve docs for --stdin merge-tree: only use basic merge config merge-tree: remove redundant code merge-tree --stdin: flush stdout to avoid deadlock
2025-02-21mailmap: fix check-mailmap with full mailmap lineJacob Keller1-1/+1
A coworker recently reported a crash to me when using the recently added sendemail mailmap support: `3724814 Segmentation fault (core dumped) git check-mailmap "bugs@company.xx"`. This appears to happen because of the NULL pointer name passed into map_user(). Fix this by passing "" instead of NULL so that we have a valid pointer. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-21Merge branch 'ua/update-server-info-sans-the-repository'Junio C Hamano1-4/+4
Code clean-up. * ua/update-server-info-sans-the-repository: builtin/update-server-info: remove the_repository global variable
2025-02-21builtin/refs: add '--no-reflog' flag to drop reflogsKarthik Nayak1-0/+3
The "git refs migrate" subcommand converts the backend used for ref storage. It always migrates reflog data as well as refs. Introduce an option to exclude reflogs from migration, allowing them to be discarded when they are unnecessary. This is particularly useful in server-side repositories, where reflogs are typically not expected. However, some repositories may still have them due to historical reasons, such as bugs, misconfigurations, or administrative decisions to enable reflogs for debugging. In such repositories, it would be optimal to drop reflogs during the migration. To address this, introduce the '--no-reflog' flag, which prevents reflog migration. When this flag is used, reflogs from the original reference backend are not migrated. Since only the new reference backend remains in the repository, all previous reflogs are permanently discarded. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>