path: root/builtin
Age | Commit message | Author | Files | Lines
2025-05-16 | pack-objects: update usage to match docs | Derrick Stolee | 1 | -2/+8
The t0450 test script verifies that builtin usage matches the synopsis in the documentation. Adjust the builtin to match and then remove 'git pack-objects' from the exception list. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16 | pack-objects: add --path-walk option | Derrick Stolee | 1 | -8/+140
In order to more easily compute delta bases among objects that appear at the exact same path, add a --path-walk option to 'git pack-objects'. This option will use the path-walk API instead of the object walk given by the revision machinery. Since objects will be provided in batches representing a common path, those objects can be tested for delta bases immediately instead of waiting for a sort of the full object list by name-hash. This has multiple benefits, including avoiding collisions by name-hash.

The objects marked as UNINTERESTING are included in these batches, so we are guaranteeing some locality to find good delta bases. After the individual passes are done on a per-path basis, the default name-hash is used to find other opportunistic delta bases that did not match exactly by the full path name.

The current implementation performs delta calculations while walking objects, which is not ideal for a few reasons. First, this will cause the "Enumerating objects" phase to be much longer than usual. Second, it does not take advantage of threading during the path-scoped delta calculations. Even with this lack of threading, the path-walk option is sometimes faster than the usual approach. Future changes will refactor this code to allow for threading, but that complexity is deferred until later to keep this patch as simple as possible.

This new walk is incompatible with some features and is ignored by others:

* Object filters are not currently integrated with the path-walk API, such as sparse-checkout or tree depth. A blobless packfile could be integrated easily, but that is deferred for later.
* Server-focused features such as delta islands, shallow packs, and using a bitmap index are incompatible with the path-walk API.
* The path walk API is only compatible with the --revs option, not taking object lists or pack lists over stdin. These alternative ways to specify the objects currently ignore the --path-walk option without even a warning.

Future changes will create performance tests that demonstrate the power of this approach.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16 | pack-objects: extract should_attempt_deltas() | Derrick Stolee | 1 | -24/+32
This will be helpful in a future change, which will reuse this logic. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16 | reset: integrate sparse index with --patch | Derrick Stolee | 1 | -3/+3
Similar to the previous change for 'git add -p', the reset builtin checked for integration with the sparse index after possibly redirecting its logic toward the interactive logic. This means that the builtin would expand the sparse index to a full one upon read. Move this check earlier within cmd_reset() to improve performance here. Add tests to guarantee that we are not universally expanding the index. Add behavior tests to check that we are doing the same operations as a full index. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16 | git add: make -p/-i aware of sparse index | Derrick Stolee | 1 | -3/+4
It is slow to expand a sparse index in-memory due to parsing of trees. We aim to minimize that performance cost when possible. 'git add -p' uses 'git apply' child processes to modify the index, but still there are some expansions that occur. It turns out that control flows out of cmd_add() in the interactive cases before the lines that confirm that the builtin is integrated with the sparse index. Moving that integration point earlier in cmd_add() allows 'git add -i' and 'git add -p' to operate without expanding a sparse index to a full one. Add test cases that confirm that these interactive add options work with the sparse index. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
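The ordering problem described in this message can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `struct sketch_repo`, `sketch_read_index()` and the two `sketch_cmd_add_*()` functions are hypothetical stand-ins, not Git's actual code. They only show why the sparse-index opt-in has to happen before control leaves for the interactive path.

```c
/* Hypothetical stand-in for the settings a builtin can adjust. */
struct sketch_repo {
	int command_requires_full_index; /* 1 = expand a sparse index on read */
	int index_expanded;              /* set when a read had to expand */
};

/* Model of an index read: expands unless the builtin has opted in. */
void sketch_read_index(struct sketch_repo *r)
{
	if (r->command_requires_full_index)
		r->index_expanded = 1;
}

/* Before: the interactive path leaves before the opt-in is reached. */
void sketch_cmd_add_before(struct sketch_repo *r, int interactive)
{
	if (interactive) {
		sketch_read_index(r); /* still requires a full index here */
		return;
	}
	r->command_requires_full_index = 0;
	sketch_read_index(r);
}

/* After: opt in first, so both paths read the index without expanding. */
void sketch_cmd_add_after(struct sketch_repo *r, int interactive)
{
	r->command_requires_full_index = 0;
	if (interactive) {
		sketch_read_index(r);
		return;
	}
	sketch_read_index(r);
}
```

In this model, only the "before" variant expands the index when the interactive flag is set.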
2025-05-16 | apply: integrate with the sparse index | Derrick Stolee | 1 | -1/+6
The sparse index allows storing directory entries in the index, marked with the skip-worktree bit and pointing to a tree object. This may be an unexpected data shape for some implementation areas, so we are rolling it out incrementally on a builtin-per-builtin basis. This change enables the sparse index for 'git apply'. The main motivation for this change is that 'git apply' is used as a child process of 'git add -p' and expanding the sparse index for each of those child processes can lead to significant performance issues. The good news is that the actual index manipulation code used by 'git apply' is already integrated with the sparse index, so the only product change is to mark the builtin as allowing the sparse index so it isn't inflated on read. The more involved part of this change is around adding tests that verify how 'git apply' behaves in a sparse-checkout environment and whether or not the index expands in certain operations. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16 | hash-object: handle --literally with OPT_NEGBIT | Jeff King | 1 | -16/+11
Since we recently removed the hash_literally() function, the hash-object --literally option has been simplified to just removing the INDEX_FORMAT_CHECK flag. Rather than pass it around as a separate bool, we can just have the option parser remove the bit from the set of flags directly. This simplifies the helper functions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
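The idea of letting the option parser clear a bit directly, rather than carrying a separate bool, can be sketched as follows. This is a minimal illustration, not the patch itself: the hand-rolled loop stands in for Git's parse-options machinery, and the `SKETCH_*` constants and `sketch_parse_flags()` are hypothetical names with placeholder values.

```c
#include <string.h>

/* Placeholder bits for this sketch only; not Git's real INDEX_* values. */
#define SKETCH_FORMAT_CHECK (1u << 0)
#define SKETCH_WRITE_OBJECT (1u << 1)

/*
 * Start with the format check enabled. "-w" sets a bit, while
 * "--literally" behaves like a negated bit option and clears one.
 */
unsigned sketch_parse_flags(int argc, const char **argv)
{
	unsigned flags = SKETCH_FORMAT_CHECK;

	for (int i = 0; i < argc; i++) {
		if (!strcmp(argv[i], "-w"))
			flags |= SKETCH_WRITE_OBJECT;
		else if (!strcmp(argv[i], "--literally"))
			flags &= ~SKETCH_FORMAT_CHECK;
	}
	return flags;
}
```

With the flags word updated in place, helper functions only need to receive `flags`, not an extra "literally" parameter.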
2025-05-16 | hash-object: merge HASH_* and INDEX_* flags | Jeff King | 1 | -17/+6
The hash-object command has its own custom flag bits that it sets based on command-line options. But since we dropped hash_literally() in the previous commit, the only thing we do with those flag bits is convert them directly into "index_flags" to pass to index_fd(). This extra layer of indirection makes the code harder to read and reason about. Let's just use the INDEX_* flags directly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16 | hash-object: stop allowing unknown types | Jeff King | 1 | -24/+5
When passed the "--literally" option, hash-object will allow any arbitrary string for its "-t" type option. Such objects are only useful for testing or debugging, as they cannot be used in the normal way (e.g., you cannot fetch their contents!). Let's drop this feature, which will eventually let us simplify the object-writing code. This is technically backwards incompatible, but since such objects were never really functional, it seems unlikely that anybody will notice. We will retain the --literally flag, as it also instructs hash-object not to worry about other format issues (e.g., type-specific things that fsck would complain about). The documentation does not need to be updated, as it was always vague about which checks we're loosening (it uses only the phrase "any garbage"). The code change is a bit hard to verify from just the patch text. We can drop our local hash_literally() helper, but it was really just wrapping write_object_file_literally(). We now replace that with calling index_fd(), as we do for the non-literal code path, but dropping the INDEX_FORMAT_CHECK flag. This ends up being the same semantically as what the _literally() code path was doing (modulo handling unknown types, which is our goal). We'll be able to clean up these code paths a bit more in subsequent patches. The existing test is flipped to show that we now reject the unknown type. The additional "extra-long type" test is now redundant, as we bail early upon seeing a bogus type. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16 | fsck: stop using object_info->type_name strbuf | Jeff King | 1 | -11/+2
When fsck-ing a loose object, we use object_info's type_name strbuf to record the parsed object type as a string. For most objects this is redundant with the object_type enum, but it does let us report the string when we encounter an object with an unknown type (for which there is no matching enum value).

There are a few downsides, though:

1. The code to report these cases is not actually robust. Since we did not pass a strbuf to unpack_loose_header(), we only retrieved types from headers up to 32 bytes. In longer cases, we'd simply say "object corrupt or missing".
2. This is the last caller that uses object_info's type_name strbuf support. It would be nice to refactor it so that we can simplify that code.
3. Likewise, we'll check the hash of the object using its unknown type (again, as long as that type is short enough). That depends on the hash_object_file_literally() code, which we'd eventually like to get rid of.

So we can simplify things by bailing immediately in read_loose_object() when we encounter an unknown type. This has a few user-visible effects:

a. Instead of producing a single line of error output like this:

     error: 26ed13ce3564fbbb44e35bde42c7da717ea004a6: object is of unknown type 'bogus': .git/objects/26/ed13ce3564fbbb44e35bde42c7da717ea004a6

   we'll now issue two lines (the first from read_loose_object() when we see the unparsable header, and the second from the fsck code, since we couldn't read the object):

     error: unable to parse type from header 'bogus 4' of .git/objects/26/ed13ce3564fbbb44e35bde42c7da717ea004a6
     error: 26ed13ce3564fbbb44e35bde42c7da717ea004a6: object corrupt or missing: .git/objects/26/ed13ce3564fbbb44e35bde42c7da717ea004a6

   This is a little more verbose, but this sort of error should be rare (such objects are almost impossible to work with, and cannot be transferred between repositories as they are not representable in packfiles). And as a bonus, reporting the broken header in full could help with debugging other cases (e.g., a header like "blob xyzzy\0" would fail in parsing the size, but previously we'd not have showed the offending bytes).

b. An object with an unknown type will be reported as corrupt, without actually doing a hash check. Again, I think this is unlikely to matter in practice since such objects are totally unusable.

We'll update one fsck test to match the new error strings. And we can remove another test that covered the case of an object with an unknown type _and_ a hash corruption. Since we'll skip the hash check now in this case, the test is no longer interesting.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16 | cat-file: use type enum instead of buffer for -t option | Jeff King | 1 | -9/+4
Now that we no longer support OBJECT_INFO_ALLOW_UNKNOWN_TYPE, there is no need to pass a strbuf into oid_object_info_extended() to record the type. The regular object_type enum is sufficient to capture all of the types we will allow. This simplifies the code a bit, and will eventually let us drop object_info's type_name strbuf support. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16 | cat-file: make --allow-unknown-type a noop | Jeff King | 1 | -13/+5
The cat-file command has some minor support for handling objects with "unknown" types. I.e., strings that are not "blob", "commit", "tree", or "tag". In theory this could be used for debugging or experimenting with extensions to Git. But in practice this support is not very useful:

1. You can get the type and size of such objects, but nothing else. Not even the contents!
2. Only loose objects are supported, since packfiles use numeric ids for the types, rather than strings.
3. Likewise you cannot ever transfer objects between repositories, because they cannot be represented in the packfiles used for the on-the-wire protocol.

The support for these unknown types complicates the object-parsing code, and has led to bugs such as b748ddb7a4 (unpack_loose_header(): fix infinite loop on broken zlib input, 2025-02-25). So let's drop it.

The first step is to remove the user-facing parts, which are accessible only via cat-file. This is technically backwards-incompatible, but given the limitations listed above, these objects couldn't possibly be useful in any workflow.

However, we can't just rip out the option entirely. That would hurt a caller who ran:

    git cat-file -t --allow-unknown-type <oid>

and fed it normal, well-formed objects. There --allow-unknown-type was doing nothing, but we wouldn't want to start bailing with an error. So to protect any such callers, we'll retain --allow-unknown-type as a noop.

The code change is fairly small (but we'll be able to clean up more code in follow-on patches). The test updates drop any use of the option. We still retain tests that feed the broken objects to cat-file without --allow-unknown-type, as we should continue to confirm that those objects are rejected. Note that in one spot we can drop a layer of loop, re-indenting the body; viewing the diff with "-w" helps there.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15 | Merge branch 'ps/maintenance-missing-tasks' | Junio C Hamano | 1 | -31/+117
Make the repository clean-up tasks that "gc" can do available to the "git maintenance" front-end.

* ps/maintenance-missing-tasks:
  builtin/maintenance: introduce "rerere-gc" task
  builtin/gc: move rerere garbage collection into separate function
  builtin/maintenance: introduce "worktree-prune" task
  builtin/gc: move pruning of worktrees into a separate function
  builtin/gc: remove global variables where it is trivial to do
  builtin/gc: fix indentation of `cmd_gc()` parameters
2025-05-15 | fetch: avoid unnecessary work when there is no current branch | Johannes Schindelin | 1 | -1/+1
As pointed out by CodeQL, `branch_get()` may return `NULL`, in which case `branch_has_merge_config()` would return early, but we can even avoid enumerating the refs prefixes in that case, saving even more CPU cycles. Technically, we should enclose these two statements in an `if (branch) {...}` block, but the indentation is already quite deep, therefore I refrained from doing that. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15 | fetch: carefully clear local variable's address after use | Johannes Schindelin | 1 | -0/+1
As pointed out by CodeQL, it is a potentially dangerous practice to store local variables' addresses in non-local structs. Yet this is exactly what happens with the `acked_commits` attribute that is used in `cmd_fetch()`: The pointer to a local variable is assigned to it. Now, it is Git's convention that `cmd_*()` functions essentially return only just before the process exits, therefore there is little danger that this attribute is used after the code flow returns from that function. However, code in `cmd_*()` functions is often so useful that it gets lifted into a library function, at which point this issue could become a real problem. Let's make sure to clear the `acked_commits` attribute out after it was used, and before the function returns (at which point the address would go stale). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
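The pattern at issue can be shown in isolation. The following is a minimal sketch, not the code in `cmd_fetch()`: `struct sketch_list`, `struct sketch_options` and `sketch_run_fetch()` are hypothetical stand-ins. Only the idea of resetting the pointer before the local variable goes out of scope is taken from the message above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the list and the longer-lived options struct. */
struct sketch_list {
	int count;
};

struct sketch_options {
	struct sketch_list *acked_commits;
};

int sketch_run_fetch(struct sketch_options *opts)
{
	struct sketch_list acked = { 0 }; /* lives only for this call */
	int seen;

	opts->acked_commits = &acked;     /* local address stored in outer struct */
	acked.count = 3;                  /* stand-in for the real work */
	seen = opts->acked_commits->count;

	/* Clear the pointer so it cannot dangle once `acked` is gone. */
	opts->acked_commits = NULL;
	return seen;
}
```

After the call returns, the options struct no longer refers to stack memory, which keeps the function safe to reuse from library code.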
2025-05-15 | commit: simplify code | Johannes Schindelin | 1 | -1/+1
The difference of two unsigned integers is defined to be unsigned, and therefore it is misleading to check whether it is greater than zero (instead, the more natural way would be to check whether the difference is zero or not). Let's instead avoid the subtraction altogether, and compare the two operands directly, which makes the code more obvious as a side effect. Pointed out by CodeQL's rule with the ID `cpp/unsigned-difference-expression-compared-zero`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
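A short example makes the pitfall concrete. This is an illustrative sketch with hypothetical function names, not the code touched by the commit: because unsigned subtraction wraps around instead of going negative, `a - b > 0` is true whenever the operands differ, which is not what a reader expects.

```c
#include <stddef.h>

/* Misleading: for unsigned operands this is really "a != b". */
int sketch_greater_by_subtraction(size_t a, size_t b)
{
	return a - b > 0;
}

/* Clear: compare the operands directly. */
int sketch_greater_by_comparison(size_t a, size_t b)
{
	return a > b;
}
```

For `a = 1` and `b = 2` the subtraction wraps to a large positive value, so the first function reports "greater" while the second correctly does not.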
2025-05-14 | replay: replace the_repository with repo parameter passed to cmd_replay () | Elijah Newren | 1 | -30/+35
Replace the_repository everywhere with repo, feed repo from cmd_replay() to all the other functions in the file that need it, and remove the UNUSED annotation on repo. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12 | whatchanged: remove when built with WITH_BREAKING_CHANGES | Junio C Hamano | 1 | -0/+6
As we made "git whatchanged" require "--i-still-use-this" and asked the users to report if they still want to use it, the logical next step is to allow us to build Git without "whatchanged" to prepare for its eventual removal. If we were to follow the pattern established in 8ccc75c2 (remote: announce removal of "branches/" and "remotes/", 2025-01-22), we could do this together with the documentation update that officially lists the command as slated for removal in the BreakingChanges document, but let's keep the changes separate in case we want to proceed a bit more slowly. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12 | whatchanged: require --i-still-use-this | Junio C Hamano | 1 | -0/+13
The documentation of "git whatchanged" is pretty explicit that the command was retained for historical reasons to help those whose fingers cannot be retrained. Let's see if they still are finding it hard to type "git log --raw" instead of "git whatchanged" by marking the command as "nominated for removal", and require "--i-still-use-this" on the command line. Adjust the tests so that the option is passed when we invoke the command. In addition, we test that the command fails when "--i-still-use-this" is not given. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12 | Merge branch 'ds/fix-thin-fix' | Junio C Hamano | 1 | -26/+32
"git index-pack --fix-thin" used to abort to prevent a cycle in delta chains from forming in a corner case even when there is no such cycle.

* ds/fix-thin-fix:
  index-pack: allow revisiting REF_DELTA chains
  t5309: create failing test for 'git index-pack'
  test-tool: add pack-deltas helper
2025-05-12 | Merge branch 'ps/object-store-cleanup' | Junio C Hamano | 11 | -23/+26
Further code clean-up in the object-store layer.

* ps/object-store-cleanup:
  object-store: drop `repo_has_object_file()`
  treewide: convert users of `repo_has_object_file()` to `has_object()`
  object-store: allow fetching objects via `has_object()`
  object-store: move function declarations to their respective subsystems
  object-store: move and rename `odb_pack_keep()`
  object-store: drop `loose_object_path()`
  object-store: move `struct packed_git` into "packfile.h"
2025-05-12 | you-still-use-that??: help deprecating commands for removal | Junio C Hamano | 1 | -8/+2
Commands slated for removal like "git pack-redundant" now require an explicit "--i-still-use-this" option to run. This is to discourage casual use and surface their pending deprecation to users. The warning message is long, so factor it into a helper function you_still_use_that() to simplify reuse by other commands. Also add a missing test to ensure this enforcement works for "pack-redundant". Helped-by: Elijah Newren <newren@gmail.com> [en: log message] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12 | oidmap: rename oidmap_free() to oidmap_clear() | Jeff King | 1 | -1/+1
This function does not free the oidmap struct itself; it just drops all items from the map (using hashmap_clear_() internally). It should be called oidmap_clear(), per CodingGuidelines. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
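The naming distinction can be sketched with a toy container. This is not the oidmap code: `struct sketch_map` and `sketch_map_clear()` are hypothetical. The point is only that a "clear" function drops the contents and leaves the struct itself alive and reusable, whereas a "free" function would also release the struct.

```c
#include <stdlib.h>

/* Hypothetical container: owns `items`, but not its own storage. */
struct sketch_map {
	int *items;
	size_t nr;
};

/* Drops all items; the struct itself stays valid and can be refilled. */
void sketch_map_clear(struct sketch_map *map)
{
	free(map->items);
	map->items = NULL;
	map->nr = 0;
}
```

A caller holding a `struct sketch_map` on the stack can therefore call `sketch_map_clear()` and keep using the variable afterwards.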
2025-05-12 | builtin/am: fix memory leak in `split_mail_stgit_series` | Lidong Yan | 1 | -1/+3
In builtin/am.c:split_mail_stgit_series, if `fopen` fails, the `series_dir_buf` allocated by `xstrdup` leaks. Add a `free` in the `!fp` branch to prevent the leak. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
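The shape of the fix can be sketched independently of builtin/am.c. This is a simplified illustration, not the patched function: `sketch_read_series()` is a hypothetical name, and a plain `malloc()` plus `memcpy()` stands in for `xstrdup()`.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 0 on success, -1 if the copy or the open fails. */
int sketch_read_series(const char *series_dir, const char *series_file)
{
	size_t len = strlen(series_dir) + 1;
	char *dir_copy = malloc(len); /* stand-in for xstrdup() */
	FILE *fp;

	if (!dir_copy)
		return -1;
	memcpy(dir_copy, series_dir, len);

	fp = fopen(series_file, "r");
	if (!fp) {
		free(dir_copy); /* the fix: release the copy on the error path */
		return -1;
	}

	/* ... a real implementation would parse the series file here ... */
	fclose(fp);
	free(dir_copy);
	return 0;
}
```

Every exit after the allocation now releases `dir_copy`, so the early return no longer leaks.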
2025-05-08 | Merge branch 'ps/mv-contradiction-fix' | Junio C Hamano | 1 | -3/+61
"git mv a a/b dst" would ask to move the directory 'a' itself, as well as its contents, into a single destination directory, which is a contradictory request that is impossible to satisfy. This case is now detected and the command errors out.

* ps/mv-contradiction-fix:
  builtin/mv: convert assert(3p) into `BUG()`
  builtin/mv: bail out when trying to move child and its parent
2025-05-07 | builtin/maintenance: introduce "rerere-gc" task | Patrick Steinhardt | 1 | -0/+37
While git-gc(1) knows to garbage collect the rerere cache, git-maintenance(1) does not yet have a task for this cleanup. Introduce a new "rerere-gc" task to plug this gap. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-07 | builtin/gc: move rerere garbage collection into separate function | Patrick Steinhardt | 1 | -5/+11
In a subsequent commit we are going to introduce a new "rerere-gc" task for git-maintenance(1). To prepare for this, refactor the code that spawns `git rerere gc` into a separate function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-07 | builtin/maintenance: introduce "worktree-prune" task | Patrick Steinhardt | 1 | -0/+45
While git-gc(1) knows to prune stale worktrees, git-maintenance(1) does not yet have a task for this cleanup. Introduce a new "worktree-prune" task to plug this gap. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-07 | builtin/gc: move pruning of worktrees into a separate function | Patrick Steinhardt | 1 | -10/+15
In a subsequent commit we will introduce a new "worktree-prune" task for git-maintenance(1). To prepare for this, refactor the code that spawns `git worktree prune` into a separate function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-07 | builtin/gc: remove global variables where it is trivial to do | Patrick Steinhardt | 1 | -19/+12
We use a couple of global variables to assemble command line arguments for subprocesses we execute in git-gc(1). All of these variables except the one for git-repack(1) are only used in a single place though, so they don't really add anything but confusion. Remove those variables. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-07 | builtin/gc: fix indentation of `cmd_gc()` parameters | Patrick Steinhardt | 1 | -3/+3
The parameters of `cmd_gc()` aren't indented properly. Fix this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-30 | builtin/mv: convert assert(3p) into `BUG()` | Patrick Steinhardt | 1 | -1/+2
The use of asserts is discouraged in our codebase because they lead to different behaviour depending on how Git is built. If one is unsure enough about whether a condition always holds to add an assert, then that assert should probably trigger regardless of how Git is being built. Drop the call to assert(3p) in git-mv(1) and instead use `BUG()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
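The difference between the two styles can be sketched as follows. This is an illustration only: `SKETCH_BUG()` and `sketch_checked_position()` are hypothetical and much simpler than Git's `BUG()`. The point is that the check is ordinary code, so it is not compiled away when the build defines `NDEBUG` the way an `assert()` would be.

```c
#include <stdio.h>
#include <stdlib.h>

/* Always-on check: reports the location and aborts in every build. */
#define SKETCH_BUG(msg) \
	do { \
		fprintf(stderr, "BUG: %s:%d: %s\n", __FILE__, __LINE__, msg); \
		abort(); \
	} while (0)

/* Returns `pos` unchanged, or aborts if the lookup result is invalid. */
int sketch_checked_position(int pos)
{
	if (pos < 0)
		SKETCH_BUG("entry not found in index");
	return pos;
}
```

A caller can then rely on the returned position being non-negative regardless of compiler flags.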
2025-04-30 | builtin/mv: bail out when trying to move child and its parent | Patrick Steinhardt | 1 | -2/+59
We have a known issue in git-mv(1) where moving both a child and any of its parents causes an assert to trigger because the child cannot be found anymore in the index. We have added a test for this in commit 0fcd473fdd3 (t7001: add failure test which triggers assertion, 2024-10-22) without addressing the issue, which is why the test itself is marked as `test_expect_failure`. The behaviour of that test relies on a call to assert(3p) though, which may or may not be compiled into the resulting binary depending on whether or not we pass `-DNDEBUG`. When these asserts are compiled into Git this may cause our CI to hang on Windows though, because asserts may cause a modal window to be shown. While we could work around the issue by converting this into a call to `BUG()`, let's rather address the root cause of the issue by bailing out in case we see that both a child and any of its parents are being moved in the same command. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
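One way to detect the contradictory request up front is to check whether any source path lies inside another source path. The sketch below is an assumption about how such a check could look, not the logic added to builtin/mv.c: both function names are hypothetical, and paths are assumed to be normalized without trailing slashes.

```c
#include <stddef.h>
#include <string.h>

/* True if `child` lives strictly below the directory `parent`. */
int sketch_is_inside(const char *parent, const char *child)
{
	size_t len = strlen(parent);

	return len > 0 && !strncmp(parent, child, len) && child[len] == '/';
}

/* True if the source list names both a path and one of its parents. */
int sketch_moves_parent_and_child(const char **sources, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		for (size_t j = 0; j < nr; j++)
			if (i != j && sketch_is_inside(sources[i], sources[j]))
				return 1;
	return 0;
}
```

The quadratic loop keeps the sketch short; sorting the sources first would let adjacent entries be compared instead.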
2025-04-29 | Merge branch 'jh/gc-launchctl-schedule-fix' | Junio C Hamano | 1 | -2/+2
Fix for scheduled maintenance tasks on platforms using launchctl.

* jh/gc-launchctl-schedule-fix:
  maintenance: fix launchctl calendar intervals
2025-04-29 | Merge branch 'az/tighten-string-array-constness' | Junio C Hamano | 6 | -11/+11
Code clean-up.

* az/tighten-string-array-constness:
  global: mark usage strings and string tables const
2025-04-29 | Merge branch 'ua/call-repo-config-with-possibly-null-repository' | Junio C Hamano | 2 | -4/+2
Since repo_config() can be called with repo set to NULL these days, a command that is marked as RUN_SETUP in the builtin command table does not have to check repo against NULL before making the call.

* ua/call-repo-config-with-possibly-null-repository:
  builtin/difftool: remove unnecessary if statement
  builtin/add: remove unnecessary if statement
2025-04-29 | treewide: convert users of `repo_has_object_file()` to `has_object()` | Patrick Steinhardt | 8 | -19/+21
As the comment of `repo_has_object_file()` and its `_with_flags()` variant tells us, these functions are considered to be deprecated in favor of `has_object()`. There are a couple of slight benefits in favor of the replacement:

- The new function has a short-and-sweet name.
- More explicit defaults: `has_object()` doesn't fetch missing objects via promisor remotes, and neither does it reload packfiles if an object wasn't found by default. This ensures that it becomes immediately obvious when a simple object existence check may result in expensive actions.

Most importantly though, it is confusing that we have two sets of functions that ultimately do the same thing, but with different defaults.

Start sunsetting `repo_has_object_file()` and its `_with_flags()` sibling by replacing all callsites with `has_object()`:

- `repo_has_object_file(...)` is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT)` is equivalent to `has_object(..., 0)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_SKIP_FETCH_OBJECT)` is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)` is equivalent to `has_object(..., HAS_OBJECT_FETCH_PROMISOR)`.

The replacements should be functionally equivalent.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
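The four equivalences listed in this message follow one rule: each old flag switches off a behaviour that the new API only performs when asked. The sketch below encodes that rule. It is illustrative only: the `SKETCH_*` constants use placeholder bit values rather than the ones in Git's headers, and `sketch_to_has_object_flags()` is a hypothetical helper, not a function in Git.

```c
/* Placeholder bit values for this sketch only; Git defines the real ones. */
#define SKETCH_OBJECT_INFO_SKIP_FETCH_OBJECT (1u << 0)
#define SKETCH_OBJECT_INFO_QUICK             (1u << 1)
#define SKETCH_HAS_OBJECT_RECHECK_PACKED     (1u << 0)
#define SKETCH_HAS_OBJECT_FETCH_PROMISOR     (1u << 1)

/*
 * Translate old-style opt-out flags into new-style opt-in flags:
 * no QUICK means "recheck packed", no SKIP_FETCH means "fetch promisor".
 */
unsigned sketch_to_has_object_flags(unsigned legacy)
{
	unsigned flags = 0;

	if (!(legacy & SKETCH_OBJECT_INFO_QUICK))
		flags |= SKETCH_HAS_OBJECT_RECHECK_PACKED;
	if (!(legacy & SKETCH_OBJECT_INFO_SKIP_FETCH_OBJECT))
		flags |= SKETCH_HAS_OBJECT_FETCH_PROMISOR;
	return flags;
}
```

Passing `0` models the plain `repo_has_object_file(...)` call, which maps to both new flags being set.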
2025-04-29 | object-store: move function declarations to their respective subsystems | Patrick Steinhardt | 2 | -2/+2
We carry declarations for a couple of functions in "object-store.h" that are not defined in "object-store.c", but in a different subsystem. Move these declarations to the respective headers whose matching code files carry the corresponding definition. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-29 | object-store: move and rename `odb_pack_keep()` | Patrick Steinhardt | 2 | -2/+3
The function `odb_pack_keep()` creates a file at the passed-in path. If this fails, then the function re-tries by first creating any potentially missing leading directories and then trying to create the file once again. As such, this function doesn't host any kind of logic that is specific to the object store, but is rather a generic helper function. Rename the function to `safe_create_file_with_leading_directories()` and move it into "path.c". While at it, refactor it so that it loses its dependency on `the_repository`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-28 | index-pack: allow revisiting REF_DELTA chains | Derrick Stolee | 1 | -26/+32
As detailed in the previous changes to t5309-pack-delta-cycles.sh, the logic within 'git index-pack' to analyze an incoming thin packfile with REF_DELTAs is suspect. The algorithm is overly cautious around delta cycles, and that leads in fact to failing even when there is no cycle. This change adjusts the algorithm to no longer fail in these cases. In fact, these cycle cases will no longer fail but more importantly the valid cases will no longer fail, either. The resulting packfile from the --fix-thin operation will not have cycles either since REF_DELTAs are forbidden from the on-disk format and OFS_DELTAs are impossible to write as a cycle. The crux of the matter is how the algorithm works when the REF_DELTAs point to base objects that exist in the local repository. When reading the thin packfile, the object IDs for the delta objects are unknown so we do not have the delta chain structure automatically. Instead, we need to start somewhere by selecting a delta whose base is inside our current object database. Consider the case where the packfile has two REF_DELTA objects, A and B, and the delta chain looks like "A depends on B" and "B depends on C" for some third object C, where C is already in the current repository. The algorithm _should_ start with all objects that depend on C, finding B, and then moving on to all objects depending on B, finding A. However, if the repository also already has object B, then the delta chain can be analyzed in a different order. The deltas with base B can be analyzed first, finding A, and then the deltas with base C are analyzed, finding B. The algorithm currently continues to look for objects that depend on B, finding A again. This fails due to A's 'real_type' member already being overwritten from OBJ_REF_DELTA to the correct object type. This scenario is possible in a typical 'git fetch' where the client does not advertise B as a 'have' but requests A as a 'want' (and C is noticed as a common object based on other 'have's). 
The reason this isn't typically seen is that most Git servers use OFS_DELTAs to represent deltas within a packfile. However, if a server uses only REF_DELTAs, then this kind of issue can occur. There is nothing in the explicit packfile format that states this use of inter-pack REF_DELTA is incorrect, only that REF_DELTAs should not be used in the on-disk representation to avoid cycles. This die() was introduced in ab791dd138 (index-pack: fix race condition with duplicate bases, 2014-08-29). Several refactors have adjusted the error message and the surrounding logic, but this issue has existed for a longer time as that was only a conversion from an assert(). The tests in t5309 originated in 3b910d0c5e (add tests for indexing packs with delta cycles, 2013-08-23) and b2ef3d9ebb (test index-pack on packs with recoverable delta cycles, 2013-08-23). These changes make note that the current behavior of handling "resolvable" cycles is mostly a documentation-only test, not that this behavior is the best way for Git to handle the situation. The fix here is somewhat complicated due to the amount of state being adjusted by the loop within threaded_second_pass(). Instead of trying to resume the start of the loop while adjusting the necessary context, I chose to scan the REF_DELTAs depending on the current 'parent' and skip any that have already been processed. This necessarily leaves us in a state where 'child' and 'child_obj' could be left as NULL and that must be handled later. There is also some careful handling around skipping REF_DELTAs when there are also OFS_DELTAs depending on that parent. There may be value in extending 'test-tool pack-deltas' to allow writing OFS_DELTAs in order to exercise this logic across the delta types. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
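The A, B, C scenario from this message can be replayed with a toy model. This is a heavily simplified sketch, not the loop in threaded_second_pass(): `struct sketch_obj`, the integer ids and `sketch_resolve_children()` are hypothetical. It only shows the chosen strategy of scanning the deltas that depend on the current parent and skipping any that were already processed, instead of failing on them.

```c
#include <stddef.h>

enum sketch_type {
	SKETCH_REF_DELTA, /* not yet resolved */
	SKETCH_RESOLVED   /* real type already filled in */
};

/* Hypothetical delta record: `id` depends on `base_id`. */
struct sketch_obj {
	int id;
	int base_id;
	enum sketch_type real_type;
};

/*
 * Resolve every delta whose base is `parent_id`. Deltas that were
 * resolved during an earlier visit are skipped rather than treated
 * as an error. Returns the number of newly resolved objects.
 */
int sketch_resolve_children(struct sketch_obj *objs, size_t nr, int parent_id)
{
	int resolved = 0;

	for (size_t i = 0; i < nr; i++) {
		if (objs[i].base_id != parent_id)
			continue;
		if (objs[i].real_type != SKETCH_REF_DELTA)
			continue; /* already processed on a previous pass */
		objs[i].real_type = SKETCH_RESOLVED;
		resolved++;
	}
	return resolved;
}
```

Visiting base B first resolves A, visiting base C then resolves B, and the follow-up visit to B finds A already resolved and simply moves on.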
2025-04-24 | Merge branch 'rj/build-tweaks' | Junio C Hamano | 1 | -2/+7
Various build tweaks, including CSPRNG selection on some platforms.

* rj/build-tweaks:
  config.mak.uname: set CSPRNG_METHOD to getrandom on Linux
  config.mak.uname: add arc4random to the cygwin build
  config.mak.uname: add sysinfo() configuration for cygwin
  builtin/gc.c: correct RAM calculation when using sysinfo
  config.mak.uname: add clock_gettime() to the cygwin build
  config.mak.uname: add HAVE_GETDELIM to the cygwin section
  config.mak.uname: only set NO_REGEX on cygwin for v1.7
  config.mak.uname: add a note about NO_STRLCPY for Linux
  Makefile: remove NEEDS_LIBRT build variable
  meson.build: set default help format to html on windows
  meson.build: only set build variables for non-default values
  Makefile: only set some BASIC_CFLAGS when RUNTIME_PREFIX is set
  meson.build: remove -DCURL_DISABLE_TYPECHECK
2025-04-24  Merge branch 'ps/parse-options-integers'  Junio C Hamano  25  -151/+392
Update the parse-options API to catch mistakes where the address of an integral variable of the wrong type or size is passed.

* ps/parse-options-integers:
  parse-options: detect mismatches in integer signedness
  parse-options: introduce precision handling for `OPTION_UNSIGNED`
  parse-options: introduce precision handling for `OPTION_INTEGER`
  parse-options: rename `OPT_MAGNITUDE()` to `OPT_UNSIGNED()`
  parse-options: support unit factors in `OPT_INTEGER()`
  global: use designated initializers for options
  parse: fix off-by-one for minimum signed values
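To make the "precision handling" items in the list above more concrete, here is a minimal sketch of the general technique: record the size of the destination variable next to its address, then range-check and store according to that size. The names `struct int_option`, `INT_OPTION()` and `store_signed()` are hypothetical and are not the parse-options API; the sketch also covers only signed destinations and does not attempt the signedness detection mentioned in the first item.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option record: destination pointer plus its width in bytes. */
struct int_option {
	void *value;
	size_t precision;
};

/* Capture sizeof(*(v)) at the definition site so the width cannot drift. */
#define INT_OPTION(v) { (v), sizeof(*(v)) }

/*
 * Store `parsed` into the destination using the recorded precision.
 * Returns 0 on success, -1 if the value does not fit or the width is
 * not one of the supported sizes. The destination is left untouched
 * on failure.
 */
int store_signed(const struct int_option *opt, intmax_t parsed)
{
	assert(opt && opt->value);

	switch (opt->precision) {
	case sizeof(int8_t):
		if (parsed < INT8_MIN || parsed > INT8_MAX)
			return -1;
		*(int8_t *)opt->value = (int8_t)parsed;
		return 0;
	case sizeof(int16_t):
		if (parsed < INT16_MIN || parsed > INT16_MAX)
			return -1;
		*(int16_t *)opt->value = (int16_t)parsed;
		return 0;
	case sizeof(int32_t):
		if (parsed < INT32_MIN || parsed > INT32_MAX)
			return -1;
		*(int32_t *)opt->value = (int32_t)parsed;
		return 0;
	case sizeof(int64_t):
		if (parsed < INT64_MIN || parsed > INT64_MAX)
			return -1;
		*(int64_t *)opt->value = (int64_t)parsed;
		return 0;
	default:
		return -1; /* unsupported width */
	}
}
```

Because the width travels with the pointer, writing through a `void *` can no longer silently overrun a narrow variable or leave the upper bytes of a wide one uninitialized, which is the class of mistake the merge description refers to.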
2025-04-24  Merge branch 'ps/object-file-cleanup'  Junio C Hamano  48  -87/+110
Code clean-up.

* ps/object-file-cleanup:
  object-store: merge "object-store-ll.h" and "object-store.h"
  object-store: remove global array of cached objects
  object: split out functions relating to object store subsystem
  object-file: drop `index_blob_stream()`
  object-file: split up concerns of `HASH_*` flags
  object-file: split out functions relating to object store subsystem
  object-file: move `xmmap()` into "wrapper.c"
  object-file: move `git_open_cloexec()` to "compat/open.c"
  object-file: move `safe_create_leading_directories()` into "path.c"
  object-file: move `mkdir_in_gitdir()` into "path.c"
2025-04-24  Merge branch 'ps/object-file-cleanup' into ps/object-store-cleanup  Junio C Hamano  48  -87/+110
* ps/object-file-cleanup:
  object-store: merge "object-store-ll.h" and "object-store.h"
  object-store: remove global array of cached objects
  object: split out functions relating to object store subsystem
  object-file: drop `index_blob_stream()`
  object-file: split up concerns of `HASH_*` flags
  object-file: split out functions relating to object store subsystem
  object-file: move `xmmap()` into "wrapper.c"
  object-file: move `git_open_cloexec()` to "compat/open.c"
  object-file: move `safe_create_leading_directories()` into "path.c"
  object-file: move `mkdir_in_gitdir()` into "path.c"
2025-04-23  Merge branch 'ja/doc-reset-mv-rm-markup-updates'  Junio C Hamano  1  -1/+2
Doc mark-up updates.

* ja/doc-reset-mv-rm-markup-updates:
  doc: add markup for characters in Guidelines
  doc: fix asciidoctor synopsis processing of triple-dots
  doc: convert git-mv to new documentation format
  doc: move synopsis git-mv commands in the synopsis section
  doc: convert git-rm to new documentation format
  doc: fix synopsis analysis logic
  doc: convert git-reset to new documentation format
2025-04-23  maintenance: fix launchctl calendar intervals  Josh Heinrichs  1  -2/+2
When using the launchctl scheduler, the weekly job runs daily, and the daily job runs on the first six days of each month. This appears to be due to specifying "Day" in the calendar intervals, which according to launchd.plist(5) is for specifying days of the month rather than days of the week. The behaviour of running a job on the 0th day is undocumented, but in my testing appears to be the same as not specifying "Day" in the calendar interval, in which case the job will run daily.

Use "Weekday" in the calendar intervals, which is the correct way to schedule jobs to run on specific days of the week.

Signed-off-by: Josh Heinrichs <joshiheinrichs@gmail.com>
Acked-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
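The difference between "Day" and "Weekday" is easiest to see in the generated calendar intervals. The following is a minimal sketch of emitting such entries, not the code in builtin/gc.c: the function names `append_interval()` and `format_intervals()`, the fixed-size buffer interface, and the choice of hour 0 are assumptions made for illustration. The mapping of the weekly job to weekday 0 and the daily job to weekdays 1 through 6 is inferred from the symptoms described in the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum schedule { SCHEDULE_DAILY, SCHEDULE_WEEKLY };

/*
 * Append one StartCalendarInterval <dict> to buf, starting at offset
 * len. "Weekday" selects a day of the week (0 is Sunday), whereas
 * "Day" would select a day of the month. Returns the new length, or
 * -1 if the buffer is too small.
 */
static int append_interval(char *buf, size_t size, size_t len,
			   int weekday, int minute)
{
	int n = snprintf(buf + len, size - len,
			 "<dict>"
			 "<key>Weekday</key><integer>%d</integer>"
			 "<key>Hour</key><integer>0</integer>"
			 "<key>Minute</key><integer>%d</integer>"
			 "</dict>\n",
			 weekday, minute);

	if (n < 0 || (size_t)n >= size - len)
		return -1;
	return (int)(len + (size_t)n);
}

/*
 * Weekly: a single entry for weekday 0.
 * Daily: one entry for each of weekdays 1 through 6.
 */
int format_intervals(char *buf, size_t size, enum schedule s, int minute)
{
	int len = 0;

	assert(buf);
	if (size == 0)
		return -1;
	buf[0] = '\0';

	if (s == SCHEDULE_WEEKLY)
		return append_interval(buf, size, 0, 0, minute);

	for (int day = 1; day <= 6; day++) {
		len = append_interval(buf, size, (size_t)len, day, minute);
		if (len < 0)
			return -1;
	}
	return len;
}
```

Replacing the "Weekday" key with "Day" in this sketch would reproduce the reported bug: the six daily entries would match only the first six days of each month, and the weekly entry would ask for day 0 of the month.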
2025-04-21  global: mark usage strings and string tables const  Ahelenia Ziemiańska  6  -11/+11
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-20  builtin/difftool: remove unnecessary if statement  Usman Akinyemi  1  -2/+1
Since f29f1990b5 (config: teach repo_config to allow `repo` to be NULL, 2025-03-08), `repo_config()` accepts a NULL `repo`, so there is no need to check whether `repo` is NULL before calling `repo_config()`.

Suggested-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-20  builtin/add: remove unnecessary if statement  Usman Akinyemi  1  -2/+1
Since f29f1990b5 (config: teach repo_config to allow `repo` to be NULL, 2025-03-08), `repo_config()` accepts a NULL `repo`, so there is no need to check whether `repo` is NULL before calling `repo_config()`.

Suggested-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-17  Merge branch 'ua/update-update-server-info'  Junio C Hamano  1  -2/+2
Code simplification.

* ua/update-update-server-info:
  builtin/update-server-info: remove unnecessary if statement