summaryrefslogtreecommitdiffstats
path: root/refs
AgeCommit message (Collapse)AuthorLines
5 daysMerge branch 'ps/refs-for-each'Junio C Hamano-18/+21
Code refactoring around refs-for-each-* API functions. * ps/refs-for-each: refs: replace `refs_for_each_fullref_in()` refs: replace `refs_for_each_namespaced_ref()` refs: replace `refs_for_each_glob_ref()` refs: replace `refs_for_each_glob_ref_in()` refs: replace `refs_for_each_rawref_in()` refs: replace `refs_for_each_rawref()` refs: replace `refs_for_each_ref_in()` refs: improve verification for-each-ref options refs: generalize `refs_for_each_fullref_in_prefixes()` refs: generalize `refs_for_each_namespaced_ref()` refs: speed up `refs_for_each_glob_ref_in()` refs: introduce `refs_for_each_ref_ext` refs: rename `each_ref_fn` refs: rename `do_for_each_ref_flags` refs: move `do_for_each_ref_flags` further up refs: move `refs_head_ref_namespaced()` refs: remove unused `refs_for_each_include_root_ref()`
10 daysMerge branch 'kn/ref-location'Junio C Hamano-53/+51
Allow the directory in which reference backends store their data to be specified. * kn/ref-location: refs: add GIT_REFERENCE_BACKEND to specify reference backend refs: allow reference location in refstorage config refs: receive and use the reference storage payload refs: move out stub modification to generic layer refs: extract out `refs_create_refdir_stubs()` setup: don't modify repo in `create_reference_database()`
2026-02-25refs: receive and use the reference storage payloadKarthik Nayak-15/+46
An upcoming commit will add support for providing an URI via the 'extensions.refStorage' config. The URI will contain the reference backend and a corresponding payload. The payload can be then used for providing an alternate locations for the reference backend. To prepare for this, modify the existing backends to accept such an argument when initializing via the 'init()' function. Both the files and reftable backends will parse the information to be filesystem paths to store references. Given that no callers pass any payload yet this is essentially a no-op change for now. To enable this, provide a 'refs_compute_filesystem_location()' function which will parse the current 'gitdir' and the 'payload' to provide the final reference directory and common reference directory (if working in a linked worktree). The documentation and tests will be added alongside the extension of the config variable. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25refs: move out stub modification to generic layerKarthik Nayak-28/+5
When creating the reftable reference backend on disk, we create stubs to ensure that the directory can be recognized as a Git repository. This is done by calling `refs_create_refdir_stubs()`. Move this to the generic layer as this is needed for all backends excluding from the files backends. In an upcoming commit where we introduce alternate reference backend locations, we'll have to also create stubs in the $GIT_DIR irrespective of the backend being used. This commit builds the base to add that logic. Similarly, move the logic for deletion of stubs to the generic layer. The files backend recursively calls the remove function of the 'packed-backend', here skip calling the generic function since that would try to delete stubs. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25refs: extract out `refs_create_refdir_stubs()`Karthik Nayak-12/+2
For Git to recognize a directory as a Git directory, it requires the directory to contain: 1. 'HEAD' file 2. 'objects/' directory 3. 'refs/' directory Here, #1 and #3 are part of the reference storage mechanism, specifically the files backend. Since then, newer backends such as the reftable backend have moved to using their own path ('reftable/') for storing references. But to ensure Git still recognizes the directory as a Git directory, we create stubs. There are two locations where we create stubs: - In 'refs/reftable-backend.c' when creating the reftable backend. - In 'clone.c' before spawning transport helpers. In a following commit, we'll add another instance. So instead of repeating the code, let's extract out this code to `refs_create_refdir_stubs()` and use it. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: replace `refs_for_each_rawref()`Patrick Steinhardt-2/+5
Replace calls to `refs_for_each_rawref()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: rename `each_ref_fn`Patrick Steinhardt-1/+1
Similar to the preceding commit, rename `each_ref_fn` to better match our current best practices around how we name things. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: rename `do_for_each_ref_flags`Patrick Steinhardt-15/+15
The enum `do_for_each_ref_flags` and its individual values don't match to our current best practices when it comes to naming things. Rename it to `refs_for_each_flag`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-09Merge branch 'kn/ref-batch-output-error-reporting-fix'Junio C Hamano-13/+13
A handful of code paths that started using batched ref update API (after Git 2.51 or so) lost detailed error output, which have been corrected. * kn/ref-batch-output-error-reporting-fix: fetch: delay user information post committing of transaction receive-pack: utilize rejected ref error details fetch: utilize rejected ref error details update-ref: utilize rejected error details if available refs: add rejection detail to the callback function refs: skip to next ref when current ref is rejected
2026-01-25refs: skip to next ref when current ref is rejectedKarthik Nayak-13/+13
In `refs_verify_refnames_available()` we have two nested loops: the outer loop iterates over all references to check, while the inner loop checks for filesystem conflicts for a given ref by breaking down its path. With batched updates, when we detect a filesystem conflict, we mark the update as rejected and execute 'continue'. However, this only skips to the next iteration of the inner loop, not the outer loop as intended. This causes the same reference to be repeatedly rejected. Fix this by using a goto statement to skip to the next reference in the outer loop. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12refs/reftable: introduce generic checks for refsPatrick Steinhardt-6/+76
In a preceding commit we have extracted generic checks for both direct and symbolic refs that apply for all backends. Wire up those checks for the "reftable" backend. Note that this is done by iterating through all refs manually with the low-level reftable ref iterator. We explicitly don't want to use the higher-level iterator that is exposed to users of the reftable backend as that iterator may swallow for example broken refs. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12refs/reftable: fix consistency checks with worktreesPatrick Steinhardt-15/+14
The ref consistency checks are driven via `cmd_refs_verify()`. That function loops through all worktrees (including the main worktree) and then checks the ref store for each of them individually. It follows that the backend is expected to only verify refs that belong to the specified worktree. While the "files" backend handles this correctly, the "reftable" backend doesn't. In fact, it completely ignores the passed worktree and instead verifies refs of _all_ worktrees. The consequence is that we'll end up every ref store N times, where N is the number of worktrees. Or rather, that would be the case if we actually iterated through the worktree reftable stacks correctly. But we use `strmap_for_each_entry()` to iterate through the stacks, but the map is in fact not even properly populated. So instead of checking stacks N^2 times, we actually only end up checking the reftable stack of the main worktree. Fix this bug by only verifying the stack of the passed-in worktree and constructing the backends via `backend_for_worktree()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12refs/reftable: extract function to retrieve backend for worktreePatrick Steinhardt-27/+43
Pull out the logic to retrieve a backend for a given worktree. This function will be used in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12refs/reftable: adapt includes to become consistentPatrick Steinhardt-2/+2
Adapt the includes to be sorted and to use include paths that are relative to the "refs/" directory. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12refs/files: introduce function to perform normal ref checksPatrick Steinhardt-0/+2
In a subsequent commit we'll introduce new generic checks for direct refs. These checks will be independent of the actual backend. Introduce a new function `refs_fsck_ref()` that will be used for this purpose. At the current point in time it's still empty, but it will get populated in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12refs/files: extract generic symref target checksPatrick Steinhardt-33/+21
The consistency checks for the "files" backend contain a couple of verifications for symrefs that verify generic properties of the target reference. These properties need to hold for every backend, no matter whether it's using the "files" or "reftable" backend. Reimplementing these checks for every single backend doesn't really make sense. Extract it into a generic `refs_fsck_symref()` function that can be used by other backends, as well. The "reftable" backend will be wired up in a subsequent commit. While at it, improve the consistency checks so that we don't complain about refs pointing to a non-ref target in case the target refname format does not verify. Otherwise it's very likely that we'll generate both error messages, which feels somewhat redundant in this case. Note that the function has a couple of `UNUSED` parameters. These will become referenced in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12refs/files: perform consistency checks for root refsPatrick Steinhardt-3/+47
While the "files" backend already knows to perform consistency checks for the "refs/" hierarchy, it doesn't verify any of its root refs. Plug this omission. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12refs/files: improve error handling when verifying symrefsPatrick Steinhardt-15/+13
The error handling when verifying symbolic refs is a bit on the wild side: - `fsck_report_ref()` can be told to ignore specific errors. If an error has been ignored and a previous check raised an unignored error, then assigning `ret = fsck_report_ref()` will cause us to swallow the previous error. - When the target reference is not valid we bail out early without checking for other errors. Fix both of these issues by consistently or'ing the return value and not bailing out early. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12refs/files: extract function to check single refPatrick Steinhardt-29/+51
When checking the consistency of references we create a directory iterator and then verify each single reference in a loop. The logic to perform the actual checks is embedded into that loop, which makes it hard to reuse. But In a subsequent commit we're about to introduce a second path that wants to verify references. Prepare for this by extracting the logic to check a single reference into a standalone function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12refs/files: remove useless indirectionPatrick Steinhardt-9/+7
The function `files_fsck_refs()` only has a single callsite and forwards all of its arguments as-is, so it's basically a useless indirection. Inline the function call. While at it, also remove the bitwise or that we have for return values. We don't really want to or them at all, but rather just want to return an error in case either of the functions has failed. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12refs/files: remove `refs_check_dir` parameterPatrick Steinhardt-5/+3
The parameter `refs_check_dir` determines which directory we want to check references for. But as we always want to check the complete refs hierarchy, this parameter is always set to "refs". Drop the parameter and hardcode it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12refs/files: move fsck functions into global scopePatrick Steinhardt-9/+8
When performing consistency checks we pass the functions that perform the verification down the calling stack. This is somewhat unnecessary though, as the set of functions doesn't ever change. Simplify the code by moving the array into global scope and remove the parameter. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12refs/files: simplify iterating through root refsPatrick Steinhardt-8/+3
When iterating through root refs we first need to determine the directory in which the refs live. This is done by retrieving the root of the loose refs via `refs->loose->root->name`, and putting it through `files_ref_path()` to derive the final path. This is somewhat redundant though: the root name of the loose files cache is always going to be the empty string. As such, we always end up passing that empty string to `files_ref_path()` as the ref hierarchy we want to start. And this actually makes sense: `files_ref_path()` already computes the location of the root directory, so of course we need to pass the empty string for the ref hierarchy itself. So going via the loose ref cache to figure out that the root of a ref hierarchy is empty is only causing confusion. But next to the added confusion, it can also lead to a segfault. The loose ref cache is populated lazily, so it may not always be set. It seems to be sheer luck that this is a condition we do not currently hit. The right thing to do would be to call `get_loose_ref_cache()`, which knows to populate the cache if required. Simplify the code and fix the potential segfault by simply removing the indirection via the loose ref cache completely. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-30Merge branch 'gf/maintenance-is-needed-fix'Junio C Hamano-1/+1
Brown-paper-bag fix to a recently graduated 'kn/maintenance-is-needed' topic. * gf/maintenance-is-needed-fix: refs: dereference the value of the required pointer
2025-12-19refs: dereference the value of the required pointerGreg Funni-1/+1
Currently, this always prints yes because required is non-null. This is the wrong behavior. The boolean must be dereferenced. Signed-off-by: Greg Funni <gfunni234@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-21Merge branch 'kn/maintenance-is-needed'Junio C Hamano-0/+68
"git maintenance" command learned "is-needed" subcommand to tell if it is necessary to perform various maintenance tasks. * kn/maintenance-is-needed: maintenance: add 'is-needed' subcommand maintenance: add checking logic in `pack_refs_condition()` refs: add a `optimize_required` field to `struct ref_storage_be` reftable/stack: add function to check if optimization is required reftable/stack: return stack segments directly
2025-11-19Merge branch 'ps/ref-peeled-tags-fixes'Junio C Hamano-2/+2
Another fix-up to "peeled-tags" topic. * ps/ref-peeled-tags-fixes: object: fix performance regression when peeling tags
2025-11-19Merge branch 'kn/refs-optim-cleanup'Junio C Hamano-38/+18
Code clean-up. * kn/refs-optim-cleanup: t/pack-refs-tests: move the 'test_done' to callees refs: rename 'pack_refs_opts' to 'refs_optimize_opts' refs: move to using the '.optimize' functions
2025-11-19Merge branch 'ps/ref-peeled-tags'Junio C Hamano-239/+83
Some ref backend storage can hold not just the object name of an annotated tag, but the object name of the object the tag points at. The code to handle this information has been streamlined. * ps/ref-peeled-tags: t7004: do not chdir around in the main process ref-filter: fix stale parsed objects ref-filter: parse objects on demand ref-filter: detect broken tags when dereferencing them refs: don't store peeled object IDs for invalid tags object: add flag to `peel_object()` to verify object type refs: drop infrastructure to peel via iterators refs: drop `current_ref_iter` hack builtin/show-ref: convert to use `reference_get_peeled_oid()` ref-filter: propagate peeled object ID upload-pack: convert to use `reference_get_peeled_oid()` refs: expose peeled object ID via the iterator refs: refactor reference status flags refs: fully reset `struct ref_iterator::ref` on iteration refs: introduce `.ref` field for the base iterator refs: introduce wrapper struct for `each_ref_fn`
2025-11-10refs: add a `optimize_required` field to `struct ref_storage_be`Karthik Nayak-0/+68
To allow users of the refs namespace to check if the reference backend requires optimization, add a new field `optimize_required` field to `struct ref_storage_be`. This field is of type `optimize_required_fn` which is also introduced in this commit. Modify the debug, files, packed and reftable backend to implement this field. A following commit will expose this via 'git pack-refs' and 'git refs optimize'. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-06Merge branch 'pk/reflog-migrate-message-fix'Junio C Hamano-2/+2
Message fix. * pk/reflog-migrate-message-fix: refs: add missing space in messages
2025-11-06object: fix performance regression when peeling tagsPatrick Steinhardt-2/+2
Our Bencher dashboards [1] have recently alerted us about a bunch of performance regressions when writing references, specifically with the reftable backend. There is a 3x regression when writing many refs with preexisting refs in the reftable format, and a 10x regression when migrating refs between backends in either of the formats. Bisecting the issue lands us at 6ec4c0b45b (refs: don't store peeled object IDs for invalid tags, 2025-10-23). The gist of the commit is that we may end up storing peeled objects in both reftables and packed-refs for corrupted tags, where the claimed tagged object type is different than the actual tagged object type. This will then cause us to create the `struct object *` with a wrong type, as well, and obviously nothing good comes out of that. The fix for this issue was to introduce a new flag to `peel_object()` that causes us to verify the tagged object's type before writing it into the refdb -- if the tag is corrupt, we skip writing the peeled value. To verify whether the peeled value is correct we have to look up the object type via the ODB and compare the actual type with the claimed type, and that additional object lookup is costly. This also explains why we see the regression only when writing refs with the reftable backend, but we see the regression with both backends when migrating refs: - The reftable backend knows to store peeled values in the new table immediately, so it has to try and peel each ref it's about to write to the transaction. So the performance regression is visible for all writes. - The files backend only stores peeled values when writing the packed-refs file, so it wouldn't hit the performance regression for normal writes. But on ref migrations we know to write all new values into the packed-refs file immediately, and that's why we see the regression for both backends there. Taking a step back though reveals an oddity in the new verification logic: we not only verify the _tagged_ object's type, but we also verify the type of the tag itself. But this isn't really needed, as we wouldn't hit the bug in such a case anyway, as we only hit the issue with corrupt tags claiming an invalid type for the tagged object. The consequence of this is that we now started to look up the target object of every single reference we're about to write, regardless of whether it even is a tag or not. And that is of course quite costly. Fix the issue by only verifying the type of the tagged objects. This means that we of course still have a performance hit for actual tags. But this only happens for writes anyway, and I'd claim it's preferable to not store corrupted data in the refdb than to be fast here. Rename the flag accordingly to clarify that we only verify the tagged object's type. This fix brings performance back to previous levels: Benchmark 1: baseline Time (mean ± σ): 46.0 ms ± 0.4 ms [User: 40.0 ms, System: 5.7 ms] Range (min … max): 45.0 ms … 47.1 ms 54 runs Benchmark 2: regression Time (mean ± σ): 140.2 ms ± 1.3 ms [User: 77.5 ms, System: 60.5 ms] Range (min … max): 138.0 ms … 142.7 ms 20 runs Benchmark 3: fix Time (mean ± σ): 46.2 ms ± 0.4 ms [User: 40.2 ms, System: 5.7 ms] Range (min … max): 45.0 ms … 47.3 ms 55 runs Summary update-ref: baseline 1.00 ± 0.01 times faster than fix 3.05 ± 0.04 times faster than regression [1]: https://bencher.dev/perf/git/plots Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-06Merge branch 'ps/ref-peeled-tags' into ps/ref-peeled-tags-fixesJunio C Hamano-239/+83
* ps/ref-peeled-tags: t7004: do not chdir around in the main process ref-filter: fix stale parsed objects ref-filter: parse objects on demand ref-filter: detect broken tags when dereferencing them refs: don't store peeled object IDs for invalid tags object: add flag to `peel_object()` to verify object type refs: drop infrastructure to peel via iterators refs: drop `current_ref_iter` hack builtin/show-ref: convert to use `reference_get_peeled_oid()` ref-filter: propagate peeled object ID upload-pack: convert to use `reference_get_peeled_oid()` refs: expose peeled object ID via the iterator refs: refactor reference status flags refs: fully reset `struct ref_iterator::ref` on iteration refs: introduce `.ref` field for the base iterator refs: introduce wrapper struct for `each_ref_fn`
2025-11-05refs: add missing space in messagesPeter Krefting-2/+2
Signed-off-by: Peter Krefting <peter@softwolves.pp.se> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04Merge branch 'xr/ref-debug-remove-on-disk'Junio C Hamano-0/+9
The "debug" ref-backend was missing a method implementation, which has been corrected. * xr/ref-debug-remove-on-disk: refs: add missing remove_on_disk implementation for debug backend
2025-11-04Merge branch 'kn/refs-optim-cleanup' into kn/maintenance-is-neededJunio C Hamano-38/+18
* kn/refs-optim-cleanup: t/pack-refs-tests: move the 'test_done' to callees refs: rename 'pack_refs_opts' to 'refs_optimize_opts' refs: move to using the '.optimize' functions
2025-11-04Merge branch 'ps/ref-peeled-tags' into kn/maintenance-is-neededJunio C Hamano-239/+83
* ps/ref-peeled-tags: (23 commits) t7004: do not chdir around in the main process ref-filter: fix stale parsed objects ref-filter: parse objects on demand ref-filter: detect broken tags when dereferencing them refs: don't store peeled object IDs for invalid tags object: add flag to `peel_object()` to verify object type refs: drop infrastructure to peel via iterators refs: drop `current_ref_iter` hack builtin/show-ref: convert to use `reference_get_peeled_oid()` ref-filter: propagate peeled object ID upload-pack: convert to use `reference_get_peeled_oid()` refs: expose peeled object ID via the iterator refs: refactor reference status flags refs: fully reset `struct ref_iterator::ref` on iteration refs: introduce `.ref` field for the base iterator refs: introduce wrapper struct for `each_ref_fn` builtin/repo: add progress meter for structure stats builtin/repo: add keyvalue and nul format for structure stats builtin/repo: add object counts in structure output builtin/repo: introduce structure subcommand ...
2025-11-04refs: rename 'pack_refs_opts' to 'refs_optimize_opts'Karthik Nayak-10/+10
The previous commit removed all references to 'pack_refs()' within the refs subsystem. Continue this cleanup by also renaming 'pack_refs_opts' to 'refs_optimize_opts' and the respective flags accordingly. Keeping the naming consistent will make the code easier to maintain. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: move to using the '.optimize' functionsKarthik Nayak-32/+12
The `struct ref_store` variable exposes two ways to optimize a reftable backend: 1. pack_refs 2. optimize The former was specific to the 'files' + 'packed' refs backend. The latter is more generic and covers all backends. While the naming is different, both of these functions perform the same functionality. Consolidate this code to only maintain the 'optimize' functions. Do this by modifying the backends so that they exclusively implement the `optimize` callback, only. All users of the refs subsystem already use the 'optimize' function so there is no changes needed on the callee side. Finally, cleanup all references to the 'pack_refs' field of the structure and code around it. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: don't store peeled object IDs for invalid tagsPatrick Steinhardt-2/+3
Both the "files" and "reftable" backend store peeled object IDs for references that point to tags: - The "files" backend stores the value when packing refs, where each peeled object ID is prefixed with "^". - The "reftable" backend stores the value whenever writing a new reference that points to a tag via a special ref record type. Both of these backends use `peel_object()` to find the peeled object ID. But as explained in the preceding commit, that function does not detect the case where the tag's tagged object and its claimed type mismatch. The consequence of storing these bogus peeled object IDs is that we're less likely to detect such corruption in other parts of Git. git-for-each-ref(1) for example does not notice anymore that the tag is broken when using "--format=%(*objectname)" to dereference tags. One could claim that this is good, because it still allows us to mostly use the tag as intended. But the biggest problem here is that we now have different behaviour for such a broken tag depending on whether or not we have its peeled value in the refdb. Fix the issue by verifying the object type when peeling the object. If that verification fails we simply skip storing the peeled value in either of the reference formats. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04object: add flag to `peel_object()` to verify object typePatrick Steinhardt-5/+4
When peeling a tag to a non-tag object we repeatedly call `parse_object()` on the tagged object until we find the first object that isn't a tag. While this feels sensible at first, there is a big catch here: `parse_object()` doesn't actually verify the type of the tagged object. The relevant code path here eventually ends up in `parse_tag_buffer()`. Here, we parse the various fields of the tag, including the "type". Once we've figured out the type and the tagged object ID, we call one of the `lookup_${type}()` functions for whatever type we have found. There is two possible outcomes in the successful case: 1. The object is already part of our cached objects. In that case we double-check whether the type we're trying to look up matches the type that was cached. 2. The object is _not_ part of our cached objects. In that case, we simply create a new object with the expected type, but we don't parse that object. In the first case we might notice type mismatches, but only in the case where our cache has the object with the correct type. In the second case, we'll blindly assume that the type is correct and then go with it. We'll only notice that the type might be wrong when we try to parse the object at a later point. Now arguably, we could change `parse_tag_buffer()` to verify the tagged object's type for us. But that would have the effect that such a tag cannot be parsed at all anymore, and we have a small bunch of tests for exactly this case that assert we still can open such tags. So this change does not feel like something we can retroactively tighten, even though one shouldn't ever hit such corrupted tags. Instead, add a new `flags` field to `peel_object()` that allows the caller to opt in to strict object verification. This will be wired up at a subset of callsites over the next few commits. Note that this change also inlines `deref_tag_noverify()`. There's only been two callsites of that function, the one we're changing and one in our test helpers. The latter callsite can trivially use `deref_tag()` instead, so by inlining the function we avoid having to pass down the flag. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: drop infrastructure to peel via iteratorsPatrick Steinhardt-127/+1
Now that the peeled object ID gets propagated via the `struct reference` there is no need anymore to call into the reference iterator itself to dereference an object. Remove this infrastructure. Most of the changes are straight-forward deletions of code. There is one exception though in `refs/packed-backend.c::write_with_updates()`. Here we stop peeling the iterator and instead just pass the peeled object ID of that iterator directly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: drop `current_ref_iter` hackPatrick Steinhardt-18/+0
In preceding commits we have refactored all callers of `peel_iterated_oid()` to instead use `reference_get_peeled_oid()`. This allows us to thus get rid of the former function. Getting rid of that function is nice, but even nicer is that this also allows us to get rid of the `current_ref_iter` hack. This global variable tracked the currently-active ref iterator so that we can use it to peel an object ID. Now that the peeled object ID is propagated via `struct reference` though we don't have to depend on this hack anymore, which makes for a more robust and easier-to-understand infrastructure. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: expose peeled object ID via the iteratorPatrick Steinhardt-0/+6
Both the "files" and "reftable" backend are able to store peeled values for tags in the respective formats. This allows for a more efficient lookup of the target object of such a tag without having to manually peel via the object database. The infrastructure to access these peeled object IDs is somewhat funky though. When iterating through objects, we store a pointer reference to the current iterator in a global variable. The callbacks invoked by that iterator are then expected to call `peel_iterated_oid()`, which checks whether the globally-stored iterator's current reference refers to the one handed into that function. If so, we ask the iterator to peel the object, otherwise we manually peel the object via the object database. Depending on global state like this is somewhat weird and also quite fragile. Introduce a new `struct reference::peeled_oid` field that can be populated by the reference backends. This field can be accessed via a new function `reference_get_peeled_oid()` that either uses that value, if set, or alternatively peels via the ODB. With this change we don't have to rely on global state anymore, but make the peeled object ID available to the callback functions directly. Adjust trivial callers that already have a `struct reference` available. Remaining callers will be adjusted in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: fully reset `struct ref_iterator::ref` on iterationPatrick Steinhardt-1/+4
With the introduction of the `struct ref_iterator::ref` field it now is a whole lot easier to introduce new fields that become accessible to the caller without having to adapt every single callsite. But there's a downside: when a new field is introduced we always have to adapt all backends to set that field. This isn't something we can avoid in the general case: when the new field is expected to be populated by all backends we of course cannot avoid doing so. But new fields may be entirely optional, in which case we'd still have such churn. And furthermore, it is very easy right now to leak state from a previous iteration into the next iteration. Address this issue by ensuring that the reference backends all fully reset the field on every single iteration. This ensures that no state from previous iterations can leak into the next one. And it ensures that any newly introduced fields will be zeroed out by default. Note that we don't have to explicitly adapt the "files" backend, as it uses the `cache_ref_iterator` internally. Furthermore, other "wrapping" iterators like for example the `prefix_ref_iterator` copy around the whole reference, so these don't need to be adapted either. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: introduce `.ref` field for the base iteratorPatrick Steinhardt-96/+71
The base iterator has a couple of fields that tracks the name, target, object ID and flags for the current reference. Due to this design we have to create a new `struct reference` whenever we want to hand over that reference to the callback function, which is tedious and not very efficient. Convert the structure to instead contain a `struct reference` as member. This member is expected to be populated by the implementations of the iterator and is handed over to the callback directly. While at it, simplify `should_pack_ref()` to take a `struct reference` directly instead of passing its respective fields. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: introduce wrapper struct for `each_ref_fn`Patrick Steinhardt-6/+10
The `each_ref_fn` callback function type is used across our code base for several different functions that iterate through reference. There's a bunch of callbacks implementing this type, which makes any changes to the callback signature extremely noisy. An example of the required churn is e8207717f1 (refs: add referent to each_ref_fn, 2024-08-09): adding a single argument required us to change 48 files. It was already proposed back then [1] that we might want to introduce a wrapper structure to alleviate the pain going forward. While this of course requires the same kind of global refactoring as just introducing a new parameter, it at least allows us to more change the callback type afterwards by just extending the wrapper structure. One counterargument to this refactoring is that it makes the structure more opaque. While it is obvious which callsites need to be fixed up when we change the function type, it's not obvious anymore once we use a structure. That being said, we only have a handful of sites that actually need to populate this wrapper structure: our ref backends, "refs/iterator.c" as well as very few sites that invoke the iterator callback functions directly. Introduce this wrapper structure so that we can adapt the iterator interfaces more readily. [1]: <ZmarVcF5JjsZx0dl@tanuki> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-30Merge branch 'ps/symlink-symref-deprecation'Junio C Hamano-2/+17
"Symlink symref" has been added to the list of things that will disappear at Git 3.0 boundary. * ps/symlink-symref-deprecation: refs/files: deprecate writing symrefs as symbolic links
2025-10-27refs: add missing remove_on_disk implementation for debug backendXinyu Ruan-0/+9
The debug ref backend (refs_be_debug) was missing the remove_on_disk function pointer, which caused a segmentation fault when running 'GIT_TRACE_REFS=1 git refs migrate --ref-format=reftable' commands. Signed-off-by: Xinyu Ruan <r200981113@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-26Merge branch 'js/unreachable-workaround-for-no-symlink-head' into maint-2.51Junio C Hamano-1/+7
Code clean-up. * js/unreachable-workaround-for-no-symlink-head: refs: forbid clang to complain about unreachable code