aboutsummaryrefslogtreecommitdiffstats
path: root/builtin (follow)
AgeCommit message (Collapse)AuthorFilesLines
2023-10-09repack: free existing_cruft array after useJeff King1-0/+1
We allocate an array of packed_git pointers so that we can sort the list of cruft packs, but we never free the array, causing a small leak. Note that we don't need to free the packed_git structs themselves; they're owned by the repository object. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05builtin/repack.c: avoid making cruft packs preferredTaylor Blau1-1/+46
When doing a `--geometric` repack, we make sure that the preferred pack (if writing a MIDX) is the largest pack that we *didn't* repack. That has the effect of keeping the preferred pack in sync with the pack containing a majority of the repository's reachable objects. But if the repository happens to double in size, we'll repack everything. Here we don't specify any `--preferred-pack`, and instead let the MIDX code choose. In the past, that worked fine, since there would only be one pack to choose from: the one we just wrote. But it's no longer necessarily the case that there is one pack to choose from. It's possible that the repository also has a cruft pack, too. If the cruft pack happens to come earlier in lexical order (and has an earlier mtime than any non-cruft pack), we'll pick that pack as preferred. This makes it impossible to reuse chunks of the reachable pack verbatim from pack-objects, so is sub-optimal. Luckily, this is a somewhat rare circumstance to be in, since we would have to repack the entire repository during a `--geometric` repack, and the cruft pack would have to sort ahead of the pack we just created. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05builtin/repack.c: implement support for `--max-cruft-size`Taylor Blau2-11/+130
Cruft packs are an alternative mechanism for storing a collection of unreachable objects whose mtimes are recent enough to avoid being pruned out of the repository. When cruft packs were first introduced back in b757353676 (builtin/pack-objects.c: --cruft without expiration, 2022-05-20) and a7d493833f (builtin/pack-objects.c: --cruft with expiration, 2022-05-20), the recommended workflow consisted of: - Repacking periodically, either by packing anything loose in the repository (via `git repack -d`) or producing a geometric sequence of packs (via `git repack --geometric=<d> -d`). - Every so often, splitting the repository into two packs, one cruft to store the unreachable objects, and another non-cruft pack to store the reachable objects. Repositories may (out of band with the above) choose periodically to prune out some unreachable objects which have aged out of the grace period by generating a pack with `--cruft-expiration=<approxidate>`. This allowed repositories to maintain relatively few packs on average, and quarantine unreachable objects together in a cruft pack, avoiding the pitfalls of holding unreachable objects as loose while they age out (for more, see some of the details in 3d89a8c118 (Documentation/technical: add cruft-packs.txt, 2022-05-20)). This all works, but can be costly from an I/O-perspective when frequently repacking a repository that has many unreachable objects. This problem is exacerbated when those unreachable objects are rarely (if every) pruned. Since there is at most one cruft pack in the above scheme, each time we update the cruft pack it must be rewritten from scratch. Because much of the pack is reused, this is a relatively inexpensive operation from a CPU-perspective, but is very costly in terms of I/O since we end up rewriting basically the same pack (plus any new unreachable objects that have entered the repository since the last time a cruft pack was generated). At the time, we decided against implementing more robust support for multiple cruft packs. This patch implements that support which we were lacking. Introduce a new option `--max-cruft-size` which allows repositories to accumulate cruft packs up to a given size, after which point a new generation of cruft packs can accumulate until it reaches the maximum size, and so on. To generate a new cruft pack, the process works like so: - Sort a list of any existing cruft packs in ascending order of pack size. - Starting from the beginning of the list, group cruft packs together while the accumulated size is smaller than the maximum specified pack size. - Combine the objects in these cruft packs together into a new cruft pack, along with any other unreachable objects which have since entered the repository. Once a cruft pack grows beyond the size specified via `--max-cruft-size` the pack is effectively frozen. This limits the I/O churn up to a quadratic function of the value specified by the `--max-cruft-size` option, instead of behaving quadratically in the number of total unreachable objects. When pruning unreachable objects, we bypass the new code paths which combine small cruft packs together, and instead start from scratch, passing in the appropriate `--max-pack-size` down to `pack-objects`, putting it in charge of keeping the resulting set of cruft packs sized correctly. This may seem like further I/O churn, but in practice it isn't so bad. We could prune old cruft packs for whom all or most objects are removed, and then generate a new cruft pack with just the remaining set of objects. But this additional complexity buys us relatively little, because most objects end up being pruned anyway, so the I/O churn is well contained. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-05builtin/repack.c: parse `--max-pack-size` with OPT_MAGNITUDETaylor Blau1-3/+3
The repack builtin takes a `--max-pack-size` command-line argument which it uses to feed into any of the pack-objects children that it may spawn when generating a new pack. This option is parsed with OPT_STRING, meaning that we'll accept anything as input, punting on more fine-grained validation until we get down into pack-objects. This is fine, but it's wasteful to spend an entire sub-process just to figure out that one of its option is bogus. Instead, parse the value of `--max-pack-size` with OPT_MAGNITUDE in 'git repack', and then pass the known-good result down to pack-objects. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-04Merge branch 'jk/commit-graph-verify-fix'Junio C Hamano1-7/+24
Various fixes to "git commit-graph verify". * jk/commit-graph-verify-fix: commit-graph: report incomplete chains during verification commit-graph: tighten chain size check commit-graph: detect read errors when verifying graph chain t5324: harmonize sha1/sha256 graph chain corruption commit-graph: check mixed generation validation when loading chain file commit-graph: factor out chain opening function
2023-10-03submodule--helper: return error from set-url when modifying failedJan Alexander Steffens (heftig)1-5/+7
set-branch will return an error when setting the config fails so I don't see why set-url shouldn't. Also skip the sync in this case. Signed-off-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03submodule--helper: use submodule_from_path in set-{url,branch}Jan Alexander Steffens (heftig)1-4/+18
The commands need a path to a submodule but treated it as the name when modifying the .gitmodules file, leading to confusion when a submodule's name does not match its path. Because calling submodule_from_path initializes the submodule cache, we need to manually trigger a reread before syncing, as the cache is missing the config change we just made. Signed-off-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03commit-graph: clear oidset after finishing writeJeff King1-0/+1
In graph_write() we store commits in an oidset, but never clean it up, leaking the contents. We should clear it in the cleanup section. The oidset comes from 6830c36077 (commit-graph.h: replace 'commit_hex' with 'commits', 2020-04-13), but it was just replacing a string_list that was also leaked. Curiously, we fixed the leak of some adjacent variables in commit fa8953cb40 (builtin/commit-graph.c: extract 'read_one_commit()', 2020-05-18), but the oidset wasn't included for some reason. In combination with the preceding commits, this lets us mark t5324 as leak-free. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-03merge: free result of repo_get_merge_bases()Jeff King1-1/+4
We call repo_get_merge_bases(), which allocates a commit_list, but never free the result, causing a leak. The obvious solution is to free it, but we need to look at the contents of the first item to decide whether to leave the loop. One option is to free it in both code paths. But since the commit that the list points to is longer-lived than the list itself, we can just dereference it immediately, free the list, and then continue with the existing logic. This is about the same amount of code, but keeps the list management all in one place. This lets us mark a number of merge-related test scripts as leak-free. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02builtin/ls-tree: let the oid determine the output algorithmEric W. Biederman1-1/+4
Update cmd_ls_tree to call get_oid_with_context and pass GET_OID_HASH_ANY instead of calling the simpler repo_get_oid. This implments in ls-tree the behavior that asking to display a sha1 hash displays the corrresponding sha1 encoded object and asking to display a sha256 hash displayes the corresponding sha256 encoded object. This is useful for testing the conversion of an object to an equivlanet object encoded with a different hash function. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02tree-walk: init_tree_desc take an oid to get the hash algorithmEric W. Biederman9-18/+24
To make it possible for git ls-tree to display the tree encoded in the hash algorithm of the oid specified to git ls-tree, update init_tree_desc to take as a parameter the oid of the tree object. Update all callers of init_tree_desc and init_tree_desc_gently to pass the oid of the tree object. Use the oid of the tree object to discover the hash algorithm of the oid and store that hash algorithm in struct tree_desc. Use the hash algorithm in decode_tree_entry and update_tree_entry_internal to handle reading a tree object encoded in a hash algorithm that differs from the repositories hash algorithm. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02builtin/cat-file: let the oid determine the output algorithmEric W. Biederman1-3/+9
Use GET_OID_HASH_ANY when calling get_oid_with_context. This implements the semi-obvious behaviour that specifying a sha1 oid shows the output for a sha1 encoded object, and specifying a sha256 oid shows the output for a sha256 encoded object. This is useful for testing the the conversion of an object to an equivalent object encoded with a different hash function. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02rev-parse: add an --output-object-format parameterEric W. Biederman1-0/+23
The new --output-object-format parameter returns the oid in the specified format. This is a generally useful plumbing facility. It is useful for writing test cases and for directly querying the translation maps. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02object: factor out parse_mode out of fast-import and tree-walk into in object.hEric W. Biederman1-16/+2
builtin/fast-import.c and tree-walk.c have almost identical version of get_mode. The two functions started out the same but have diverged slightly. The version in fast-import changed mode to a uint16_t to save memory. The version in tree-walk started erroring if no mode was present. As far as I can tell both of these changes are valid for both of the callers, so add the both changes and place the common parsing helper in object.h Rename the helper from get_mode to parse_mode so it does not conflict with another helper named get_mode in diff-no-index.c This will be used shortly in a new helper decode_tree_entry_raw which is used to compute cmpatibility objects as part of the sha256 transition. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02tag: sign both hashesEric W. Biederman1-4/+41
When we write a tag the object oid is specific to the hash algorithm. This matters when a tag is signed. The hash transition plan calls for signatures on both the sha1 form and the sha256 form of the object, and for both of those signatures to live in the tag object. To generate tag object with multiple signatures, first compute the unsigned form of the tag, and then if the tag is being signed compute the unsigned form of the tag with the compatibilityr hash. Then compute compute the signatures of both buffers. Once the signatures are computed add them to both buffers. This allows computing the compatibility hash in do_sign, saving write_object_file the expense of recomputing the compatibility tag just to compute it's hash. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02object-names: support input of oids in any supported hashEric W. Biederman1-1/+1
Support short oids encoded in any algorithm, while ensuring enough of the oid is specified to disambiguate between all of the oids in the repository encoded in any algorithm. By default have the code continue to only accept oids specified in the storage hash algorithm of the repository, but when something is ambiguous display all of the possible oids from any accepted oid encoding. A new flag is added GET_OID_HASH_ANY that when supplied causes the code to accept oids specified in any hash algorithm, and to return the oids that were resolved. This implements the functionality that allows both SHA-1 and SHA-256 object names, from the "Object names on the command line" section of the hash function transition document. Care is taken in get_short_oid so that when the result is ambiguous the output remains the same if GIT_OID_HASH_ANY was not supplied. If GET_OID_HASH_ANY was supplied objects of any hash algorithm that match the prefix are displayed. This required updating repo_for_each_abbrev to give it a parameter so that it knows to look at all hash algorithms. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02gc: add `gc.repackFilterTo` config optionChristian Couder1-0/+4
A previous commit implemented the `gc.repackFilter` config option to specify a filter that should be used by `git gc` when performing repacks. Another previous commit has implemented `git repack --filter-to=<dir>` to specify the location of the packfile containing filtered out objects when using a filter. Let's implement the `gc.repackFilterTo` config option to specify that location in the config when `gc.repackFilter` is used. Now when `git gc` will perform a repack with a <dir> configured through this option and not empty, the repack process will be passed a corresponding `--filter-to=<dir>` argument. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02repack: implement `--filter-to` for storing filtered out objectsChristian Couder1-1/+9
A previous commit has implemented `git repack --filter=<filter-spec>` to allow users to filter out some objects from the main pack and move them into a new different pack. It would be nice if this new different pack could be created in a different directory than the regular pack. This would make it possible to move large blobs into a pack on a different kind of storage, for example cheaper storage. Even in a different directory, this pack can be accessible if, for example, the Git alternates mechanism is used to point to it. In fact not using the Git alternates mechanism can corrupt a repo as the generated pack containing the filtered objects might not be accessible from the repo any more. So setting up the Git alternates mechanism should be done before using this feature if the user wants the repo to be fully usable while this feature is used. In some cases, like when a repo has just been cloned or when there is no other activity in the repo, it's Ok to setup the Git alternates mechanism afterwards though. It's also Ok to just inspect the generated packfile containing the filtered objects and then just move it into the '.git/objects/pack/' directory manually. That's why it's not necessary for this command to check that the Git alternates mechanism has been already setup. While at it, as an example to show that `--filter` and `--filter-to` work well with other options, let's also add a test to check that these options work well with `--max-pack-size`. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02gc: add `gc.repackFilter` config optionChristian Couder1-0/+6
A previous commit has implemented `git repack --filter=<filter-spec>` to allow users to filter out some objects from the main pack and move them into a new different pack. Users might want to perform such a cleanup regularly at the same time as they perform other repacks and cleanups, so as part of `git gc`. Let's allow them to configure a <filter-spec> for that purpose using a new gc.repackFilter config option. Now when `git gc` will perform a repack with a <filter-spec> configured through this option and not empty, the repack process will be passed a corresponding `--filter=<filter-spec>` argument. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02repack: add `--filter=<filter-spec>` optionChristian Couder1-0/+70
This new option puts the objects specified by `<filter-spec>` into a separate packfile. This could be useful if, for example, some blobs take up a lot of precious space on fast storage while they are rarely accessed. It could make sense to move them into a separate cheaper, though slower, storage. It's possible to find which new packfile contains the filtered out objects using one of the following: - `git verify-pack -v ...`, - `test-tool find-pack ...`, which a previous commit added, - `--filter-to=<dir>`, which a following commit will add to specify where the pack containing the filtered out objects will be. This feature is implemented by running `git pack-objects` twice in a row. The first command is run with `--filter=<filter-spec>`, using the specified filter. It packs objects while omitting the objects specified by the filter. Then another `git pack-objects` command is launched using `--stdin-packs`. We pass it all the previously existing packs into its stdin, so that it will pack all the objects in the previously existing packs. But we also pass into its stdin, the pack created by the previous `git pack-objects --filter=<filter-spec>` command as well as the kept packs, all prefixed with '^', so that the objects in these packs will be omitted from the resulting pack. The result is that only the objects filtered out by the first `git pack-objects` command are in the pack resulting from the second `git pack-objects` command. As the interactions with kept packs are a bit tricky, a few related tests are added. Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02repack: refactor finding pack prefixChristian Couder1-6/+12
Create a new find_pack_prefix() to refactor code that handles finding the pack prefix from the packtmp and packdir global variables, as we are going to need this feature again in following commit. Signed-off-by: Christian Couder <chriscool@tuxfamily.org Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02repack: refactor finishing pack-objects commandChristian Couder1-37/+33
Create a new finish_pack_objects_cmd() to refactor duplicated code that handles reading the packfile names from the output of a `git pack-objects` command and putting it into a string_list, as well as calling finish_command(). While at it, beautify a code comment a bit in the new function. Signed-off-by: Christian Couder <chriscool@tuxfamily.org Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02pack-objects: allow `--filter` without `--stdout`Christian Couder1-6/+2
9535ce7337 (pack-objects: add list-objects filtering, 2017-11-21) taught `git pack-objects` to use `--filter`, but required the use of `--stdout` since a partial clone mechanism was not yet in place to handle missing objects. Since then, changes like 9e27beaa23 (promisor-remote: implement promisor_remote_get_direct(), 2019-06-25) and others added support to dynamically fetch objects that were missing. Even without a promisor remote, filtering out objects can also be useful if we can put the filtered out objects in a separate pack, and in this case it also makes sense for pack-objects to write the packfile directly to an actual file rather than on stdout. Remove the `--stdout` requirement when using `--filter`, so that in a follow-up commit, repack can pass `--filter` to pack-objects to omit certain objects from the resulting packfile. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02Merge branch 'jc/unresolve-removal'Junio C Hamano2-90/+23
"checkout --merge -- path" and "update-index --unresolve path" did not resurrect conflicted state that was resolved to remove path, but now they do. * jc/unresolve-removal: checkout: allow "checkout -m path" to unmerge removed paths checkout/restore: add basic tests for --merge checkout/restore: refuse unmerging paths unless checking out of the index update-index: remove stale fallback code for "--unresolve" update-index: use unmerge_index_entry() to support removal resolve-undo: allow resurrecting conflicted state that resolved to deletion update-index: do not read HEAD and MERGE_HEAD unconditionally
2023-09-29diff --stat: set the width defaults in a helper functionDragan Simic4-13/+5
Extract the commonly used initialization of the --stat-width=<width>, --stat-name-width=<width> and --stat-graph-with=<width> parameters to their internal default values into a helper function, to avoid repeating the same initialization code in a few places. Add a couple of tests to additionally cover existing configuration options diff.statNameWidth=<width> and diff.statGraphWidth=<width> when used by git-merge to generate --stat outputs. This closes the gap that existed previously in the --stat tests, and reduces the chances for having any regressions introduced by this commit. While there, perform a small bunch of minor wording tweaks in the improved unit test, to improve its test-level consistency a bit. Signed-off-by: Dragan Simic <dsimic@manjaro.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-29Merge branch 'ob/am-msgfix'Junio C Hamano1-1/+2
The parameters to generate an error message have been corrected. * ob/am-msgfix: am: fix error message in parse_opt_show_current_patch()
2023-09-29Merge branch 'ds/stat-name-width-configuration'Junio C Hamano4-0/+4
"git diff" learned diff.statNameWidth configuration variable, to give the default width for the name part in the "--stat" output. * ds/stat-name-width-configuration: diff --stat: add config option to limit filename width
2023-09-29Merge branch 'jk/fsmonitor-unused-parameter'Junio C Hamano1-4/+6
Unused parameters in fsmonitor related code paths have been marked as such. * jk/fsmonitor-unused-parameter: run-command: mark unused parameters in start_bg_wait callbacks fsmonitor: mark unused hashmap callback parameters fsmonitor/darwin: mark unused parameters in system callback fsmonitor: mark unused parameters in stub functions fsmonitor/win32: mark unused parameter in fsm_os__incompatible() fsmonitor: mark some maybe-unused parameters fsmonitor/win32: drop unused parameters fsmonitor: prefer repo_git_path() to git_pathdup()
2023-09-28commit-graph: report incomplete chains during verificationJeff King1-1/+9
The load_commit_graph_chain_fd_st() function will stop loading chains when it sees an error. But if it has loaded any graph slice at all, it will return it. This is a good thing for normal use (we use what data we can, and this is just an optimization). But it's a bad thing for "commit-graph verify", which should be careful about finding any irregularities. We do complain to stderr with a warning(), but the verify command still exits with a successful return code. The new tests here cover corruption of both the base and tip slices of the chain. The corruption of the base file already works (it is the first file we look at, so when we see the error we return NULL). The "tip" case is what is fixed by this patch (it complains to stderr but still returns the base slice). Likewise the existing tests for corruption of the commit-graph-chain file itself need to be updated. We already exited non-zero correctly for the "base" case, but the "tip" case can now do so, too. Note that this also causes us to adjust a test later in the file that similarly corrupts a tip (though confusingly the test script calls this "base"). It checks stderr but erroneously expects the whole "verify" command to exit with a successful code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-28commit-graph: detect read errors when verifying graph chainJeff King1-7/+16
Because it's OK to not have a graph file at all, the graph_verify() function needs to tell the difference between a missing file and a real error. So when loading a traditional graph file, we call open_commit_graph() separately from load_commit_graph_chain_fd_st(), and don't complain if the first one fails with ENOENT. When the function learned about chain files in 3da4b609bb (commit-graph: verify chains with --shallow mode, 2019-06-18), we couldn't be as careful, since the only way to load a chain was with read_commit_graph_one(), which did both the open/load as a single unit. So we'll miss errors in chain files we load, thinking instead that there was just no chain file at all. Note that we do still report some of these problems to stderr, as the loading function calls error() and warning(). But we'd exit with a successful exit code, which is wrong. We can fix that by using the recently split open/load functions for chains. That lets us treat the chain file just like a single file with respect to error handling here. An existing test (from 3da4b609bb) shows off the problem; we were expecting "commit-graph verify" to report success, but that makes no sense. We did not even verify the contents of the graph data, because we couldn't load it! I don't think this was an intentional exception, but rather just the test covering what happened to occur. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-25merge-tree: add -X strategy optionTang Yuyi1-3/+15
Add merge strategy option to produce more customizable merge result such as automatically resolving conflicts. Signed-off-by: Tang Yuyi <winglovet@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-22Merge branch 'tb/repack-existing-packs-cleanup'Junio C Hamano1-113/+180
The code to keep track of existing packs in the repository while repacking has been refactored. * tb/repack-existing-packs-cleanup: builtin/repack.c: extract common cruft pack loop builtin/repack.c: avoid directly inspecting "util" builtin/repack.c: store existing cruft packs separately builtin/repack.c: extract `has_existing_non_kept_packs()` builtin/repack.c: extract redundant pack cleanup for existing packs builtin/repack.c: extract redundant pack cleanup for --geometric builtin/repack.c: extract marking packs for deletion builtin/repack.c: extract structure to store existing packs
2023-09-21am: fix error message in parse_opt_show_current_patch()Oswald Buddenhagen1-1/+2
The argument order was incorrect. This was introduced by 246cac8505 (i18n: turn even more messages into "cannot be used together" ones, 2022-01-05). Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-20Merge branch 'jc/update-index-show-index-version'Junio C Hamano1-6/+13
"git update-index" learns "--show-index-version" to inspect the index format version used by the on-disk index file. * jc/update-index-show-index-version: test-tool: retire "index-version" update-index: add --show-index-version update-index doc: v4 is OK with JGit and libgit2
2023-09-20Merge branch 'js/systemd-timers-wsl-fix'Junio C Hamano1-1/+1
Update "git maintainance" timers' implementation based on systemd timers to work with WSL. * js/systemd-timers-wsl-fix: maintenance(systemd): support the Windows Subsystem for Linux
2023-09-18run-command: mark unused parameters in start_bg_wait callbacksJeff King1-1/+2
The start_bg_command() function takes a callback to tell when the background-ed process is "ready". The callback receives the child_process struct as well as an extra void pointer. But curiously, neither of the two users of this interface look at either parameter! This makes some sense. The only non-test user of the API is fsmonitor, which uses fsmonitor_ipc__get_state() to connect to a single global fsmonitor daemon (i.e., the one we just started!). So we could just drop these parameters entirely. But it seems like a pretty reasonable interface for the "wait" callback to have access to the details of the spawned process, and to have room for passing extra data through a void pointer. So let's leave these in place but mark the unused ones so that -Wunused-parameter does not complain. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18fsmonitor: mark unused hashmap callback parametersJeff King1-2/+3
Like many hashmap comparison functions, our cookies_cmp() does not look at its extra void data parameter. This should have been annotated in 02c3c59e62 (hashmap: mark unused callback parameters, 2022-08-19), but this new case was added around the same time (plus fsmonitor is not built at all on Linux, so it is easy to miss there). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18fsmonitor: mark some maybe-unused parametersJeff King1-1/+1
There's a bit of conditionally-compiled code in fsmonitor, so some function parameters may be unused depending on the build options: - in fsmonitor--daemon.c's try_to_run_foreground_daemon(), we take a detach_console argument, but it's only used on Windows. This seems intentional (and not mistakenly missing other platforms) based on the discussion in c284e27ba7 (fsmonitor--daemon: implement 'start' command, 2022-03-25), which introduced it. - in fsmonitor-setting.c's check_for_incompatible(), we pass the "ipc" flag down to the system-specific fsm_os__incompatible() helper. But we can only do so if our platform has such a helper. In both cases we can mark the argument as MAYBE_UNUSED. That annotates it enough to suppress the compiler's -Wunused-parameter warning, but without making it impossible to use the variable, as a regular UNUSED annotation would. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18Merge branch 'rs/grep-no-no-or'Junio C Hamano1-1/+1
"git grep -e A --no-or -e B" is accepted, even though the negation of "or" did not mean anything, which has been tightened. * rs/grep-no-no-or: grep: reject --no-or
2023-09-18diff --stat: add config option to limit filename widthDragan Simic4-0/+4
Add new configuration option diff.statNameWidth=<width> that is equivalent to the command-line option --stat-name-width=<width>, but it is ignored by format-patch. This follows the logic established by the already existing configuration option diff.statGraphWidth=<width>. Limiting the widths of names and graphs in the --stat output makes sense for interactive work on wide terminals with many columns, hence the support for these configuration options. They don't affect format-patch because it already adheres to the traditional 80-column standard. Update the documentation and add more tests to cover new configuration option diff.statNameWidth=<width>. While there, perform a few minor code and whitespace cleanups here and there, as spotted. Signed-off-by: Dragan Simic <dsimic@manjaro.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-14Merge branch 'rs/name-rev-use-opt-hidden-bool'Junio C Hamano1-6/+2
Simplify use of parse-options API a bit. * rs/name-rev-use-opt-hidden-bool: name-rev: use OPT_HIDDEN_BOOL for --peel-tag
2023-09-14Merge branch 'rs/grep-parseopt-simplify'Junio C Hamano1-3/+2
Simplify use of parse-options API a bit. * rs/grep-parseopt-simplify: grep: use OPT_INTEGER_F for --max-depth
2023-09-13builtin/repack.c: extract common cruft pack loopTaylor Blau1-13/+18
When generating the list of packs to store in a MIDX (when given the `--write-midx` option), we include any cruft packs both during --geometric and non-geometric repacks. But the rules for when we do and don't have to check whether any of those cruft packs were queued for deletion differ slightly between the two cases. But the two can be unified, provided there is a little bit of extra detail added in the comment to clarify when it is safe to avoid checking for any pending deletions (and why it is OK to do so even when not required). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13builtin/repack.c: avoid directly inspecting "util"Taylor Blau1-4/+14
The `->util` field corresponding to each string_list_item is used to track the existence of some pack at the beginning of a repack operation was originally intended to be used as a bitfield. This bitfield tracked: - (1 << 0): whether or not the pack should be deleted - (1 << 1): whether or not the pack is cruft The previous commit removed the use of the second bit, but a future patch (from a different series than this one) will introduce a new use of it. So we could stop treating the util pointer as a bitfield and instead start treating it as if it were a boolean. But this would require some backtracking when that later patch is applied. Instead, let's avoid touching the ->util field directly, and instead introduce convenience functions like: - pack_mark_for_deletion() - pack_is_marked_for_deletion() Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jeff King <peff@peff.net> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13builtin/repack.c: store existing cruft packs separatelyTaylor Blau1-16/+23
When repacking with the `--write-midx` option, we invoke the function `midx_included_packs()` in order to produce the list of packs we want to include in the resulting MIDX. This list is comprised of: - existing .keep packs - any pack(s) which were written earlier in the same process - any unchanged packs when doing a `--geometric` repack - any cruft packs Prior to this patch, we stored pre-existing cruft and non-cruft packs together (provided those packs are non-kept). This meant we needed an additional bit to indicate which non-kept pack(s) were cruft versus those that aren't. But alternatively we can store cruft packs in a separate list, avoiding the need for this extra bit, and simplifying the code below. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13builtin/repack.c: extract `has_existing_non_kept_packs()`Taylor Blau1-1/+7
When there is: - at least one pre-existing packfile (which is not marked as kept), - repacking with the `-d` flag, and - not doing a cruft repack , then we pass a handful of additional options to the inner `pack-objects` process, like `--unpack-unreachable`, `--keep-unreachable`, and `--pack-loose-unreachable`, in addition to marking any packs we just wrote for promisor remotes as kept in-core (with `--keep-pack`, as opposed to the presence of a ".keep" file on disk). Because we store both cruft and non-cruft packs together in the same `existing.non_kept_packs` list, it suffices to check its `nr` member to see if it is zero or not. But a following change will store cruft- and non-cruft packs separately, meaning this check would break as a result. Prepare for this by extracting this part of the check into a new helper function called `has_existing_non_kept_packs()`. This patch does not introduce any functional changes, but prepares us to make a more isolated change in a subsequent patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13builtin/repack.c: extract redundant pack cleanup for existing packsTaylor Blau1-17/+28
To remove redundant packs at the end of a repacking operation, Git uses its `remove_redundant_pack()` function in a loop over the set of pre-existing, non-kept packs. In a later commit, we will split this list into two, one for pre-existing cruft pack(s), and another for non-cruft pack(s). Prepare for this by factoring out the routine to loop over and delete redundant packs into its own function. Instead of calling `remove_redundant_pack()` directly, we now will call `remove_redundant_existing_packs()`, which itself dispatches a call to `remove_redundant_packs_1()`. Note that the geometric repacking code will still call `remove_redundant_pack()` directly, but see the previous commit for more details. Having `remove_redundant_packs_1()` exist as a separate function may seem like overkill in this patch. However, a later patch will call `remove_redundant_packs_1()` once over two separate lists, so this refactoring sets us up for that. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13builtin/repack.c: extract redundant pack cleanup for --geometricTaylor Blau1-23/+29
To reduce the complexity of the already quite-long `cmd_repack()` implementation, extract out the parts responsible for deleting redundant packs from a geometric repack out into its own sub-routine. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13builtin/repack.c: extract marking packs for deletionTaylor Blau1-18/+32
At the end of a repack (when given `-d`), Git attempts to remove any packs which have been made "redundant" as a result of the repacking operation. For example, an all-into-one (`-A` or `-a`) repack makes every pre-existing pack which is not marked as kept redundant. Geometric repacks (with `--geometric=<n>`) make any packs which were rolled up redundant, and so on. But before deleting the set of packs we think are redundant, we first check to see whether or not we just wrote a pack which is identical to any one of the packs we were going to delete. When this is the case, Git must avoid deleting that pack, since it matches a pack we just wrote (so deleting it may cause the repository to become corrupt). Right now we only process the list of non-kept packs in a single pass. But a future change will split the existing non-kept packs further into two lists: one for cruft packs, and another for non-cruft packs. Factor out this routine to prepare for calling it twice on two separate lists in a future patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13builtin/repack.c: extract structure to store existing packsTaylor Blau1-41/+49
The repack machinery needs to keep track of which packfiles were present in the repository at the beginning of a repack, segmented by whether or not each pack is marked as kept. The names of these packs are stored in two `string_list`s, corresponding to kept- and non-kept packs, respectively. As a consequence, many functions within the repack code need to take both `string_list`s as arguments, leading to code like this: ret = write_cruft_pack(&cruft_po_args, packtmp, pack_prefix, cruft_expiration, &names, &existing_nonkept_packs, /* <- */ &existing_kept_packs); /* <- */ Wrap up this pair of `string_list`s into a single structure that stores both. This saves us from having to pass both string lists separately, and prepares for adding additional fields to this structure. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>