aboutsummaryrefslogtreecommitdiffstats
path: root/refs/files-backend.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-11-06Merge branch 'pk/reflog-migrate-message-fix'Junio C Hamano1-1/+1
Message fix. * pk/reflog-migrate-message-fix: refs: add missing space in messages
2025-11-05refs: add missing space in messagesPeter Krefting1-1/+1
Signed-off-by: Peter Krefting <peter@softwolves.pp.se> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-30Merge branch 'ps/symlink-symref-deprecation'Junio C Hamano1-2/+17
"Symlink symref" has been added to the list of things that will disappear at Git 3.0 boundary. * ps/symlink-symref-deprecation: refs/files: deprecate writing symrefs as symbolic links
2025-10-26Merge branch 'js/unreachable-workaround-for-no-symlink-head' into maint-2.51Junio C Hamano1-1/+7
Code clean-up. * js/unreachable-workaround-for-no-symlink-head: refs: forbid clang to complain about unreachable code
2025-10-20Merge branch 'js/unreachable-workaround-for-no-symlink-head'Junio C Hamano1-1/+7
Code clean-up. * js/unreachable-workaround-for-no-symlink-head: refs: forbid clang to complain about unreachable code
2025-10-15Merge branch 'kn/refs-files-case-insensitive' into maint-2.51Junio C Hamano1-12/+66
Deal more gracefully with directory / file conflicts when the files backend is used for ref storage, by failing only the ones that are involved in the conflict while allowing others. * kn/refs-files-case-insensitive: refs/files: handle D/F conflicts during locking refs/files: handle F/D conflicts in case-insensitive FS refs/files: use correct error type when lock exists refs/files: catch conflicts on case-insensitive file-systems
2025-10-15Merge branch 'ps/reflog-migrate-fixes' into maint-2.51Junio C Hamano1-7/+58
"git refs migrate" to migrate the reflog entries from a refs backend to another had a handful of bugs squashed. * ps/reflog-migrate-fixes: refs: fix invalid old object IDs when migrating reflogs refs: stop unsetting REF_HAVE_OLD for log-only updates refs/files: detect race when generating reflog entry for HEAD refs: fix identity for migrated reflogs ident: fix type of string length parameter builtin/reflog: implement subcommand to write new entries refs: export `ref_transaction_update_reflog()` builtin/reflog: improve grouping of subcommands Documentation/git-reflog: convert to use synopsis type
2025-10-15refs/files: deprecate writing symrefs as symbolic linksPatrick Steinhardt1-2/+17
The "files" backend has the ability to store symbolic refs as symbolic links, which can be configured via "core.preferSymlinkRefs". This feature stems back from the early days: the initial implementation of symbolic refs used symlinks exclusively. The symref format was only introduced in 9b143c6e15 (Teach update-ref about a symbolic ref stored in a textfile., 2005-09-25) and made the default in 9f0bb90d16 (core.prefersymlinkrefs: use symlinks for .git/HEAD, 2006-05-02). This is all about 20 years ago, and there are no known reasons nowadays why one would want to use symlinks instead of symrefs. Mark the feature for deprecation in Git 3.0. Note that this only deprecates _writing_ symrefs as symbolic links. Reading such symrefs is still supported for now. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-13Merge branch 'kn/reftable-consistency-checks'Junio C Hamano1-3/+0
The reftable backend learned to sanity check its on-disk data more carefully. * kn/reftable-consistency-checks: refs/reftable: add fsck check for checking the table name reftable: add code to facilitate consistency checks fsck: order 'fsck_msg_type' alphabetically Documentation/fsck-msgids: remove duplicate msg id reftable: check for trailing newline in 'tables.list' refs: move consistency check msg to generic layer refs: remove unused headers
2025-10-09refs: forbid clang to complain about unreachable codeJohannes Schindelin1-1/+7
When `NO_SYMLINK_HEAD` is defined, `create_ref_symlink()` is hard-coded as `(-1)`, and as a consequence the condition `!create_ref_symlink()` always evaluates to false, rendering any code guarded by that condition unreachable. Therefore, clang is _technically_ correct when it complains about unreachable code. It does completely miss the fact that this is okay because on _other_ platforms, where `NO_SYMLINK_HEAD` is not defined, the code isn't unreachable at all. Let's use the same trick as in 82e79c63642c (git-compat-util: add NOT_CONSTANT macro and use it in atfork_prepare(), 2025-03-17) to appease clang while at the same time keeping the `-Wunreachable` flag to potentially find _actually_ unreachable code. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-07refs: move consistency check msg to generic layerKarthik Nayak1-2/+0
The files-backend prints a message before the consistency checks run. Move this to the generic layer so both the files and reftable backend can benefit from this message. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-07refs: remove unused headersKarthik Nayak1-1/+0
In the 'refs/' namespace, some of the included header files are not needed, let's remove them. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-02Merge branch 'ms/refs-optimize'Junio C Hamano1-0/+10
"git refs optimize" is added for not very well explained reason despite it does the same thing as "git pack-refs"... * ms/refs-optimize: t: add test for git refs optimize subcommand t0601: refactor tests to be shareable builtin/refs: add optimize subcommand doc: pack-refs: factor out common options builtin/pack-refs: factor out core logic into a shared library builtin/pack-refs: convert to use the generic refs_optimize() API reftable-backend: implement 'optimize' action files-backend: implement 'optimize' action refs: add a generic 'optimize' API
2025-09-29Merge branch 'kn/refs-files-case-insensitive'Junio C Hamano1-12/+66
Deal more gracefully with directory / file conflicts when the files backend is used for ref storage, by failing only the ones that are involved in the conflict while allowing others. * kn/refs-files-case-insensitive: refs/files: handle D/F conflicts during locking refs/files: handle F/D conflicts in case-insensitive FS refs/files: use correct error type when lock exists refs/files: catch conflicts on case-insensitive file-systems
2025-09-19files-backend: implement 'optimize' actionMeet Soni1-0/+10
With the generic `refs_optimize()` API now in place, provide the first implementation for the 'files' reference backend. This makes the new API functional for existing repositories and serves as the foundation for migrating user-facing commands to the new architecture. The implementation simply calls the existing `files_pack_refs()` function, as 'packing' is the method used to optimize the files-based reference store. Wire up the new `files_optimize()` function to the `optimize` slot in the files backend's virtual table. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-17refs/files: handle D/F conflicts during lockingKarthik Nayak1-5/+6
The previous commit added the necessary validation and checks for F/D conflicts in the files backend when working on case insensitive systems. There is still a possibility for D/F conflicts. This is a different from the F/D since for F/D conflicts, there would not be a conflict during the lock creation phase: refs/heads/foo.lock refs/heads/foo/bar.lock However there would be a conflict when the locks are committed, since we cannot have 'refs/heads/foo/bar' and 'refs/heads/foo'. These kinds of conflicts are checked and resolved in `refs_verify_refnames_available()`, so the previous commit ensured that for case-insensitive filesystems, we would lowercase the inputs to that function. For D/F conflicts, there is a conflict during the lock creation phase itself: refs/heads/foo/bar.lock refs/heads/foo.lock As in `lock_raw_ref()` after creating the lock, we also check for D/F conflicts. This can occur in case-insensitive filesystems when trying to fetch case-conflicted references like: refs/heads/Foo/new refs/heads/foo D/F conflicts can also occur in case-sensitive filesystems, when the repository already contains a directory with a lock file 'refs/heads/foo/bar.lock' and trying to fetch 'refs/heads/foo'. This doesn't concern directories containing garbage files as those are handled on a higher level. To fix this, simply categorize the error as a name conflict. Also remove this reference from the list of valid refnames for availability checks. By categorizing the error and removing it from the list of valid references, batched updates now knows to reject such reference updates and apply the other reference updates. Fix a small typo in `ref_transaction_maybe_set_rejected()` while here. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-17refs/files: handle F/D conflicts in case-insensitive FSKarthik Nayak1-2/+17
When using the files-backend on case-insensitive filesystems, there is possibility of hitting F/D conflicts when creating references within a single transaction, such as: - 'refs/heads/foo' - 'refs/heads/Foo/bar' Ideally such conflicts are caught in `refs_verify_refnames_available()` which is responsible for checking F/D conflicts within a given transaction. This utility function is shared across the reference backends. As such, it doesn't consider the issues of using a case-insensitive file system, which only affects the files-backend. While one solution would be to make the function aware of such issues, this feels like leaking implementation details of file-backend specific issues into the utility function. So opt for the more simpler option, of lowercasing all references sent to this function when on a case-insensitive filesystem and operating on the files-backend. To do this, simply use a `struct strbuf` to convert the refname to lowercase and append it to the list of refnames to be checked. Since we use a `struct strbuf` and the memory is cleared right after, make sure that the string list duplicates all provided string. Without this change, the user would simply be left with a repository with '.lock' files which were created in the 'prepare' phase of the transaction, as the 'commit' phase would simply abort and not do the necessary cleanup. Reported-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-17refs/files: use correct error type when lock existsKarthik Nayak1-3/+18
When fetching references into a repository, if a lock for a particular reference exists, then `lock_raw_ref()` throws: - REF_TRANSACTION_ERROR_CASE_CONFLICT: when there is a conflict because the transaction contains conflicting references while being on a case-insensitive filesystem. - REF_TRANSACTION_ERROR_GENERIC: for all other errors. The latter causes the entire set of batched updates to fail, even in case sensitive filessystems. Instead, return a 'REF_TRANSACTION_ERROR_CREATE_EXISTS' error. This allows batched updates to reject the individual update which conflicts with the existing file, while updating the rest of the references. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-17refs/files: catch conflicts on case-insensitive file-systemsKarthik Nayak1-5/+28
During the 'prepare' phase of a reference transaction in the files backend, we create the lock files for references to be created. When using batched updates on case-insensitive filesystems, the entire batched updates would be aborted if there are conflicting names such as: refs/heads/Foo refs/heads/foo This affects all commands which were migrated to use batched updates in Git 2.51, including 'git-fetch(1)' and 'git-receive-pack(1)'. Before that, reference updates would be applied serially with one transaction used per update. When users fetched multiple references on case-insensitive systems, subsequent references would simply overwrite any earlier references. So when fetching: refs/heads/foo: 5f34ec0bfeac225b1c854340257a65b106f70ea6 refs/heads/Foo: ec3053b0977e83d9b67fc32c4527a117953994f3 refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56 The user would simply end up with: refs/heads/foo: ec3053b0977e83d9b67fc32c4527a117953994f3 refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56 This is buggy behavior since the user is never informed about the overrides performed and missing references. Nevertheless, the user is left with a working repository with a subset of the references. Since Git 2.51, in such situations fetches would simply fail without updating any references. Which is also buggy behavior and worse off since the user is left without any references. The error is triggered in `lock_raw_ref()` where the files backend attempts to create a lock file. When a lock file already exists the function returns a 'REF_TRANSACTION_ERROR_GENERIC'. When this happens, the entire batched updates, not individual operation, is aborted as if it were in a transaction. Change this to return 'REF_TRANSACTION_ERROR_CASE_CONFLICT' instead to aid the batched update mechanism to simply reject such errors. The change only affects batched updates since batched updates will reject individual updates with non-generic errors. So specifically this would only affect: 1. git fetch 2. git receive-pack 3. git update-ref --batch-updates This bubbles the error type up to `files_transaction_prepare()` which tries to lock each reference update. So if the locking fails, we check if the rejection type can be ignored, which is done by calling `ref_transaction_maybe_set_rejected()`. As the error type is now 'REF_TRANSACTION_ERROR_CASE_CONFLICT', the specific reference update would simply be rejected, while other updates in the transaction would continue to be applied. This allows partial application of references in case-insensitive filesystems when fetching colliding references. While the earlier implementation allowed the last reference to be applied overriding the initial references, this change would allow the first reference to be applied while rejecting consequent collisions. This should be an okay compromise since with the files backend, there is no scenario possible where we would retain all colliding references. Let's also be more proactive and notify users on case-insensitive filesystems about such problems by providing a brief about the issue while also recommending using the reftable backend, which doesn't have the same issue. Reported-by: Joe Drew <joe.drew@indexexchange.com> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-29Merge branch 'jk/no-clobber-dangling-symref-with-fetch'Junio C Hamano1-4/+30
"git fetch" can clobber a symref that is dangling when the remote-tracking HEAD is set to auto update, which has been corrected. * jk/no-clobber-dangling-symref-with-fetch: refs: do not clobber dangling symrefs t5510: prefer "git -C" to subshell for followRemoteHEAD tests t5510: stop changing top-level working directory t5510: make confusing config cleanup more explicit
2025-08-21Merge branch 'ps/remote-rename-fix'Junio C Hamano1-6/+9
"git remote rename origin upstream" failed to move origin/HEAD to upstream/HEAD when origin/HEAD is unborn and performed other renames extremely inefficiently, which has been corrected. * ps/remote-rename-fix: builtin/remote: only iterate through refs that are to be renamed builtin/remote: rework how remote refs get renamed builtin/remote: determine whether refs need renaming early on builtin/remote: fix sign comparison warnings refs: simplify logic when migrating reflog entries refs: pass refname when invoking reflog entry callback
2025-08-21Merge branch 'ps/reflog-migrate-fixes'Junio C Hamano1-7/+58
"git refs migrate" to migrate the reflog entries from a refs backend to another had a handful of bugs squashed. * ps/reflog-migrate-fixes: refs: fix invalid old object IDs when migrating reflogs refs: stop unsetting REF_HAVE_OLD for log-only updates refs/files: detect race when generating reflog entry for HEAD refs: fix identity for migrated reflogs ident: fix type of string length parameter builtin/reflog: implement subcommand to write new entries refs: export `ref_transaction_update_reflog()` builtin/reflog: improve grouping of subcommands Documentation/git-reflog: convert to use synopsis type
2025-08-19refs: do not clobber dangling symrefsJeff King1-4/+30
When given an expected "before" state, the ref-writing code will avoid overwriting any ref that does not match that expected state. We use the null oid as a sentinel value for "nothing should exist", and likewise that is the sentinel value we get when trying to read a ref that does not exist. But there's one corner case where this is ambiguous: dangling symrefs. Trying to read them will yield the null oid, but there is potentially something of value there: the dangling symref itself. For a normal recursive write, this is OK. Imagine we have a symref "FOO_HEAD" that points to a ref "refs/heads/bar" that does not exist, and we try to write to it with a create operation like: oid=$(git rev-parse HEAD) ;# or whatever git symbolic-ref FOO_HEAD refs/heads/bar echo "create FOO_HEAD $oid" | git update-ref --stdin The attempt to resolve FOO_HEAD will actually resolve "bar", yielding the null oid. That matches our expectation, and the write proceeds. This is correct, because we are not writing FOO_HEAD at all, but writing its destination "bar", which in fact does not exist. But what if the operation asked not to dereference symrefs? Like this: echo "create FOO_HEAD $oid" | git update-ref --no-deref --stdin Resolving FOO_HEAD would still result in a null oid, and the write will proceed. But it will overwrite FOO_HEAD itself, removing the fact that it ever pointed to "bar". This case is a little esoteric; we are clobbering a symref with a no-deref write of a regular ref value. But the same problem occurs when writing symrefs. For example: echo "symref-create FOO_HEAD refs/heads/other" | git update-ref --no-deref --stdin The "create" operation asked us to create FOO_HEAD only if it did not exist. But we silently overwrite the existing value. You can trigger this without using update-ref via the fetch followRemoteHEAD code. In "create" mode, it should not overwrite an existing value. But if you manually create a symref pointing to a value that does not yet exist (either via symbolic-ref or with "remote add -m"), create mode will happily overwrite it. Instead, we should detect this case and refuse to write. The correct specification to overwrite FOO_HEAD in this case is to provide an expected target ref value, like: echo "symref-update FOO_HEAD refs/heads/other ref refs/heads/bar" | git update-ref --no-deref --stdin Note that the non-symref "update" directive does not allow you to do this (you can only specify an oid). This is a weakness in the update-ref interface, and you'd have to overwrite unconditionally, like: echo "update FOO_HEAD $oid" | git update-ref --no-deref --stdin Likewise other symref operations like symref-delete do not accept the "ref" keyword. You should be able to do: echo "symref-delete FOO_HEAD ref refs/heads/bar" but cannot (and can only delete unconditionally). This patch doesn't address those gaps. We may want to do so in a future patch for completeness, but it's not clear if anybody actually wants to perform those operations. The symref update case (specifically, via followRemoteHEAD) is what I ran into in the wild. The code for the fix is relatively straight-forward given the discussion above. But note that we have to implement it independently for the files and reftable backends. The "old oid" checks happen as part of the locking process, which is implemented separately for each system. We may want to factor this out somehow, but it's beyond the scope of this patch. (Another curiosity is that the messages in the reftable code are marked for translation, but the ones in the files backend are not. I followed local convention in each case, but we may want to harmonize this at some point). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06refs: pass refname when invoking reflog entry callbackPatrick Steinhardt1-6/+9
With `refs_for_each_reflog_ent()` callers can iterate through all the reflog entries for a given reference. The callback that is being invoked for each such entry does not receive the name of the reference that we are currently iterating through. This isn't really a limiting factor, as callers can simply pass the name via the callback data. But this layout sometimes does make for a bit of an awkward calling pattern. One example: when iterating through all reflogs, and for each reflog we iterate through all refnames, we have to do some extra book keeping to track which reference name we are currently yielding reflog entries for. Change the signature of the callback function so that the reference name of the reflog gets passed through to it. Adapt callers accordingly and start using the new parameter in trivial cases. The next commit will refactor the reference migration logic to make use of this parameter so that we can simplify its logic a bit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06Merge branch 'ps/reflog-migrate-fixes' into ps/remote-rename-fixJunio C Hamano1-7/+58
* ps/reflog-migrate-fixes: refs: fix invalid old object IDs when migrating reflogs refs: stop unsetting REF_HAVE_OLD for log-only updates refs/files: detect race when generating reflog entry for HEAD refs: fix identity for migrated reflogs ident: fix type of string length parameter builtin/reflog: implement subcommand to write new entries refs: export `ref_transaction_update_reflog()` builtin/reflog: improve grouping of subcommands Documentation/git-reflog: convert to use synopsis type
2025-08-06refs: fix invalid old object IDs when migrating reflogsPatrick Steinhardt1-1/+15
When migrating reflog entries between different storage formats we end up with invalid old object IDs for the migrated entries: instead of writing the old object ID of the to-be-migrated entry, we end up with the all-zeroes object ID. The root cause of this issue is that we don't know to use the old object ID provided by the caller. Instead, we manually resolve the old object ID by resolving the current value of its matching reference. But as that reference does not yet exist in the target ref storage we always end up resolving it to all-zeroes. This issue got unnoticed as there is no user-facing command that would even show the old object ID. While `git log -g` knows to show the new object ID, we don't have any formatting directive to show the old object ID. Fix the bug by introducing a new flag `REF_LOG_USE_PROVIDED_OIDS`. If set, backends are instructed to use the old and new object IDs provided by the caller, without doing any manual resolving. Set this flag in `ref_transaction_update_reflog()`. Amend our tests in t1460-refs-migrate to use our test tool to read reflog entries. This test tool prints out both old and new object ID of each reflog entry, which fixes the test gap. Furthermore it also prints the full identity used to write the reflog, which provides test coverage for the previous commit in this patch series that fixed the identity for migrated reflogs. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06refs: stop unsetting REF_HAVE_OLD for log-only updatesPatrick Steinhardt1-4/+5
The `REF_HAVE_OLD` flag indicates whether a given ref update has its old object ID set. If so, the value of that field is used to verify whether the current state of the reference matches this expected state. It is thus an important part of mitigating races with a concurrent process that updates the same set of references. When writing reflogs though we explicitly unset that flag. This is a sensible thing to do: the old state of reflog entry updates may not necessarily match the current on-disk state of its accompanying ref, but it's only intended to signal what old object ID we want to write into the new reflog entry. For example when migrating refs we end up writing many reflog entries for a single reference, and most likely those reflog entries will have many different old object IDs. But unsetting this flag also removes a useful signal, namely that the caller _did_ provide an old object ID for a given reflog entry. This signal will become useful in a subsequent commit, where we add a new flag that tells the transaction to use the provided old and new object IDs to write a reflog entry. The `REF_HAVE_OLD` flag is then used as a signal to verify that the caller really did provide an old object ID. Stop unsetting the flag so that we can use it as this described signal in a subsequent commit. Skip checking the old object ID for log-only updates so that we don't expect it to match the current on-disk state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06refs/files: detect race when generating reflog entry for HEADPatrick Steinhardt1-2/+38
When updating a reference that is being pointed to HEAD we don't only write a reflog message for that particular reference, but also generate one for HEAD. This logic is handled by `split_head_update()`, where we: 1. Verify that the condition actually triggered. This is done by reading HEAD at the start of the transaction so that we can then check whether a given reference update refers to its target. 2. Queue a new log-only update for HEAD in case it did. But the logic is unfortunately not free of races, as we do not lock the HEAD reference after we have read its target. This can lead to the following two scenarios: - HEAD gets concurrently updated to point to one of the references we have already processed. This causes us not writing a reflog message even though we should have done so. - HEAD gets concurrently updated to no longer point to a reference anymore that we have already processed. This causes us to write a reflog message even though we should _not_ have done so. Improve the situation by introducing a new `REF_LOG_VIA_SPLIT` flag that is specific to the "files" backend. If set, we will double check that the HEAD reference still points to the reference that we are creating the reflog entry for after we have locked HEAD. Furthermore, instead of manually resolving the old object ID of that entry, we now use the same old state as for the parent update. If we detect such a racy update we abort the transaction. This is a bit heavy-handed: the user didn't even ask us to write a reflog update for "HEAD", so it might be surprising if we abort the transaction. That being said: - Normal users wouldn't typically hit this case as we only hit the relevant code when committing to a branch that is being pointed to by "HEAD" directly. Commands like git-commit(1) typically commit to "HEAD" itself though. - Scripted users that use git-update-ref(1) and related plumbing commands are unlikely to hit this case either, as they would have to update the pointed-to-branch at the same as "HEAD" is being updated, which is an exceedingly rare event. The alternative would be to instead drop the log-only update completely, but that would require more logic that is hard to verify without adding infrastructure specific for such a test. So we rather do the pragmatic thing and don't worry too much about an edge case that is very unlikely to happen. Unfortunately, this change only helps with the second race. We cannot reliably plug the first race without locking the HEAD reference at the start of the transaction. Locking HEAD unconditionally would effectively serialize all writes though, and that doesn't seem like an option. Also, double checking its value at the end of the transaction is not an option either, as its target may have flip-flopped during the transaction. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-03Merge branch 'kn/for-each-ref-skip'Junio C Hamano1-3/+4
"git for-each-ref" learns "--start-after" option to help applications that want to page its output. * kn/for-each-ref-skip: ref-cache: set prefix_state when seeking for-each-ref: introduce a '--start-after' option ref-filter: remove unnecessary else clause refs: selectively set prefix in the seek functions ref-cache: remove unused function 'find_ref_entry()' refs: expose `ref_iterator` via 'refs.h'
2025-07-16Merge branch 'ps/refs-files-remove-empty-parent'Junio C Hamano1-0/+2
When a ref creation at refs/heads/foo/bar fails, the files backend now removes refs/heads/foo/ if the directory is otherwise not used. * ps/refs-files-remove-empty-parent: refs/files: remove empty parent dirs when ref creation fails
2025-07-15refs: selectively set prefix in the seek functionsKarthik Nayak1-3/+4
The ref iterator exposes a `ref_iterator_seek()` function. The name suggests that this would seek the iterator to a specific reference in some ways similar to how `fseek()` works for the filesystem. However, the function actually sets the prefix for refs iteration. So further iteration would only yield references which match the particular prefix. This is a bit confusing. Let's add a 'flags' field to the function, which when set with the 'REF_ITERATOR_SEEK_SET_PREFIX' flag, will set the prefix for the iteration in-line with the existing behavior. Otherwise, the reference backends will simply seek to the specified reference and clears any previously set prefix. This allows users to start iteration from a specific reference. In the packed and reftable backend, since references are available in a sorted list, the changes are simply setting the prefix if needed. The changes on the files-backend are a little more involved, since the files backend uses the 'ref-cache' mechanism. We move out the existing logic within `cache_ref_iterator_seek()` to `cache_ref_iterator_set_prefix()` which is called when the 'REF_ITERATOR_SEEK_SET_PREFIX' flag is set. We then parse the provided seek string and set the required levels and their indexes to ensure that seeking is possible. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-08Merge branch 'kn/fetch-push-bulk-ref-update'Junio C Hamano1-0/+7
"git push" and "git fetch" are taught to update refs in batches to gain performance. * kn/fetch-push-bulk-ref-update: receive-pack: handle reference deletions separately refs/files: skip updates with errors in batched updates receive-pack: use batched reference updates send-pack: fix memory leak around duplicate refs fetch: use batched reference updates refs: add function to translate errors to strings
2025-07-08refs/files: remove empty parent dirs when ref creation failsPatrick Steinhardt1-0/+2
When creating a new reference in the "files" backend we first create the directory hierarchy for that reference, then create the lockfile for that reference, and finally rename the lockfile into place. When the transaction gets aborted we prune the lockfile, but we don't clean up the directory hierarchy that we may have created for the lockfile. In some egde cases this can lead to lots of empty directories being cluttered in the ".git/refs" directory that really serve no purpose at all. We know to prune such empty directories when packing refs, but that only patches over the issue. Improve this by removing empty parents when cleaning up still-locked references in `files_transaction_cleanup()`. This function is also called when preparing or committing the transaction, so this change also helps when not explicitly aborting the transaction. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-25refs/files: skip updates with errors in batched updatesKarthik Nayak1-0/+7
The commit 23fc8e4f61 (refs: implement batch reference update support, 2025-04-08) introduced support for batched reference updates. This allows users to batch updates together, while allowing some of the updates to fail. Under the hood, batched updates use the reference transaction mechanism. Each update which fails is marked as such. Any failed updates must be skipped over in the rest of the code, as they wouldn't apply any more. In two of the loops within 'files_transaction_finish()' of the files backend, the failed updates aren't skipped over. This can cause a SEGFAULT otherwise. Add the missing skips and a test to validate the same. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-03Merge branch 'sj/ref-contents-check-fix'Junio C Hamano1-0/+3
"git verify-refs" (and hence "git fsck --reference") started erroring out in a repository in which secondary worktrees were prepared with Git 2.43 or lower. * sj/ref-contents-check-fix: fsck: ignore missing "refs" directory for linked worktrees
2025-06-02fsck: ignore missing "refs" directory for linked worktreesshejialuo1-0/+3
"git refs verify" doesn't work if there are worktrees created on Git v2.43.0 or older versions. These versions don't automatically create the "refs" directory, causing the error: error: cannot open directory .git/worktrees/<worktree name>/refs: No such file or directory Since 8f4c00de95 (builtin/worktree: create refdb via ref backend, 2024-01-08), we automatically create the "refs" directory for new worktrees. And in 7c78d819e6 (ref: support multiple worktrees check for refs, 2024-11-20), we assume that all linked worktrees have this directory and would wrongly report an error to the user, thus introducing compatibility issue. Check for ENOENT errno before reporting directory access errors for linked worktrees to maintain backward compatibility. Reported-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-24Merge branch 'ps/object-file-cleanup'Junio C Hamano1-2/+2
Code clean-up. * ps/object-file-cleanup: object-store: merge "object-store-ll.h" and "object-store.h" object-store: remove global array of cached objects object: split out functions relating to object store subsystem object-file: drop `index_blob_stream()` object-file: split up concerns of `HASH_*` flags object-file: split out functions relating to object store subsystem object-file: move `xmmap()` into "wrapper.c" object-file: move `git_open_cloexec()` to "compat/open.c" object-file: move `safe_create_leading_directories()` into "path.c" object-file: move `mkdir_in_gitdir()` into "path.c"
2025-04-16Merge branch 'kn/non-transactional-batch-updates'Junio C Hamano1-176/+138
Updating multiple references have only been possible in all-or-none fashion with transactions, but it can be more efficient to batch multiple updates even when some of them are allowed to fail in a best-effort manner. A new "best effort batches of updates" mode has been introduced. * kn/non-transactional-batch-updates: update-ref: add --batch-updates flag for stdin mode refs: support rejection in batch updates during F/D checks refs: implement batch reference update support refs: introduce enum-based transaction error types refs/reftable: extract code from the transaction preparation refs/files: remove duplicate duplicates check refs: move duplicate refname update check to generic layer refs/files: remove redundant check in split_symref_update()
2025-04-15Merge branch 'ps/object-wo-the-repository'Junio C Hamano1-1/+1
The object layer has been updated to take an explicit repository instance as a parameter in more code paths. * ps/object-wo-the-repository: hash: stop depending on `the_repository` in `null_oid()` hash: fix "-Wsign-compare" warnings object-file: split out logic regarding hash algorithms delta-islands: stop depending on `the_repository` object-file-convert: stop depending on `the_repository` pack-bitmap-write: stop depending on `the_repository` pack-revindex: stop depending on `the_repository` pack-check: stop depending on `the_repository` environment: move access to "core.bigFileThreshold" into repo settings pack-write: stop depending on `the_repository` and `the_hash_algo` object: stop depending on `the_repository` csum-file: stop depending on `the_repository`
2025-04-15object-file: move `safe_create_leading_directories()` into "path.c"Patrick Steinhardt1-2/+2
The `safe_create_leading_directories()` function and its relatives are located in "object-file.c", which is not a good fit as they provide generic functionality not related to objects at all. Move them into "path.c", which already hosts `safe_create_dir()` and its relative `safe_create_dir_in_gitdir()`. "path.c" is free of `the_repository`, but the moved functions depend on `the_repository` to read the "core.sharedRepository" config. Adapt the function signature to accept a repository as argument to fix the issue and adjust callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08Merge branch 'ps/object-wo-the-repository' into ps/object-file-cleanupJunio C Hamano1-1/+1
* ps/object-wo-the-repository: hash: stop depending on `the_repository` in `null_oid()` hash: fix "-Wsign-compare" warnings object-file: split out logic regarding hash algorithms delta-islands: stop depending on `the_repository` object-file-convert: stop depending on `the_repository` pack-bitmap-write: stop depending on `the_repository` pack-revindex: stop depending on `the_repository` pack-check: stop depending on `the_repository` environment: move access to "core.bigFileThreshold" into repo settings pack-write: stop depending on `the_repository` and `the_hash_algo` object: stop depending on `the_repository` csum-file: stop depending on `the_repository`
2025-04-08refs: support rejection in batch updates during F/D checksKarthik Nayak1-9/+18
The `refs_verify_refnames_available()` is used to batch check refnames for F/D conflicts. While this is the more performant alternative than its individual version, it does not provide rejection capabilities on a single update level. For batched updates, this would mean a rejection of the entire transaction whenever one reference has a F/D conflict. Modify the function to call `ref_transaction_maybe_set_rejected()` to check if a single update can be rejected. Since this function is only internally used within 'refs/' and we want to pass in a `struct ref_transaction *` as a variable. We also move and mark `refs_verify_refnames_available()` to 'refs-internal.h' to be an internal function. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08refs: implement batch reference update supportKarthik Nayak1-1/+11
Git supports making reference updates with or without transactions. Updates with transactions are generally better optimized. But transactions are all or nothing. This means, if a user wants to batch updates to take advantage of the optimizations without the hard requirement that all updates must succeed, there is no way currently to do so. Particularly with the reftable backend where batching multiple reference updates is more efficient than performing them sequentially. Introduce batched update support with a new flag, 'REF_TRANSACTION_ALLOW_FAILURE'. Batched updates while different from transactions, use the transaction infrastructure under the hood. When enabled, this flag allows individual reference updates that would typically cause the entire transaction to fail due to non-system-related errors to be marked as rejected while permitting other updates to proceed. System errors referred by 'REF_TRANSACTION_ERROR_GENERIC' continue to result in the entire transaction failing. This approach enhances flexibility while preserving transactional integrity where necessary. The implementation introduces several key components: - Add 'rejection_err' field to struct `ref_update` to track failed updates with failure reason. - Add a new struct `ref_transaction_rejections` and a field within `ref_transaction` to this struct to allow quick iteration over rejected updates. - Modify reference backends (files, packed, reftable) to handle partial transactions by using `ref_transaction_set_rejected()` instead of failing the entire transaction when `REF_TRANSACTION_ALLOW_FAILURE` is set. - Add `ref_transaction_for_each_rejected_update()` to let callers examine which updates were rejected and why. This foundational change enables batched update support throughout the reference subsystem. A following commit will expose this capability to users by adding a `--batch-updates` flag to 'git-update-ref(1)', providing both a user-facing feature and a testable implementation. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08refs: introduce enum-based transaction error typesKarthik Nayak1-100/+102
Replace preprocessor-defined transaction errors with a strongly-typed enum `ref_transaction_error`. This change: - Improves type safety and function signature clarity. - Makes error handling more explicit and discoverable. - Maintains existing error cases, while adding new error cases for common scenarios. This refactoring paves the way for more comprehensive error handling which we will utilize in the upcoming commits to add batch reference update support. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08refs/files: remove duplicate duplicates checkKarthik Nayak1-6/+0
Within the files reference backend's transaction's 'finish' phase, a verification step is currently performed wherein the refnames list is sorted and examined for multiple updates targeting the same refname. It has been observed that this verification is redundant, as an identical check is already executed during the transaction's 'prepare' stage. Since the refnames list remains unmodified following the 'prepare' stage, this secondary verification can be safely eliminated. The duplicate check has been removed accordingly, and the `ref_update_reject_duplicates()` function has been marked as static, as its usage is now confined to 'refs.c'. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08refs: move duplicate refname update check to generic layerKarthik Nayak1-53/+14
Move the tracking of refnames in `affected_refnames` from individual backends into the generic layer in 'refs.c'. This centralizes the duplicate refname detection that was previously handled separately by each backend. Make some changes to accommodate this move: - Add a `string_list` field `refnames` to `ref_transaction` to contain all the references in a transaction. This field is updated whenever a new update is added via `ref_transaction_add_update`, so manual additions in reference backends are dropped. - Modify the backends to use this field internally as needed. The backends need to check if an update for refname already exists when splitting symrefs or adding an update for 'HEAD'. - In the reftable backend, within `reftable_be_transaction_prepare()`, move the `string_list_has_string()` check above `ref_transaction_add_update()`. Since `ref_transaction_add_update()` automatically adds the refname to `transaction->refnames`, performing the check after will always return true, so we perform the check before adding the update. This helps reduce duplication of functionality between the backends and makes it easier to make changes in a more centralized manner. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08refs/files: remove redundant check in split_symref_update()Karthik Nayak1-17/+3
In `split_symref_update()`, there were two checks for duplicate refnames: - At the start, `string_list_has_string()` ensures the refname is not already in `affected_refnames`, preventing duplicates from being added. - After adding the refname, another check verifies whether the newly inserted item has a `util` value. The second check is unnecessary because the first one guarantees that `string_list_insert()` will never encounter a preexisting entry. The `item->util` field is assigned to validate that a rename doesn't already exist in the list. The validation is done after the first check. As this check is removed, clean up the validation and the assignment of this field in `split_head_update()` and `files_transaction_prepare()`. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-29Merge branch 'ps/refname-avail-check-optim'Junio C Hamano1-44/+73
The code paths to check whether a refname X is available (by seeing if another ref X/Y exists, etc.) have been optimized. * ps/refname-avail-check-optim: refs: reuse iterators when determining refname availability refs/iterator: implement seeking for files iterators refs/iterator: implement seeking for packed-ref iterators refs/iterator: implement seeking for ref-cache iterators refs/iterator: implement seeking for reftable iterators refs/iterator: implement seeking for merged iterators refs/iterator: provide infrastructure to re-seek iterators refs/iterator: separate lifecycle from iteration refs: stop re-verifying common prefixes for availability refs/files: batch refname availability checks for initial transactions refs/files: batch refname availability checks for normal transactions refs/reftable: batch refname availability checks refs: introduce function to batch refname availability checks builtin/update-ref: skip ambiguity checks when parsing object IDs object-name: allow skipping ambiguity checks in `get_oid()` family object-name: introduce `repo_get_oid_with_flags()`
2025-03-12refs/iterator: implement seeking for files iteratorsPatrick Steinhardt1-0/+16
Implement seeking for "files" iterators. As we simply use a ref-cache iterator under the hood the implementation is straight-forward. Note that we do not implement seeking on reflog iterators, same as with the "reftable" backend. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-12refs/iterator: separate lifecycle from iterationPatrick Steinhardt1-26/+10
The ref and reflog iterators have their lifecycle attached to iteration: once the iterator reaches its end, it is automatically released and the caller doesn't have to care about that anymore. When the iterator should be released before it has been exhausted, callers must explicitly abort the iterator via `ref_iterator_abort()`. This lifecycle is somewhat unusual in the Git codebase and creates two problems: - Callsites need to be very careful about when exactly they call `ref_iterator_abort()`, as calling the function is only valid when the iterator itself still is. This leads to somewhat awkward calling patterns in some situations. - It is impossible to reuse iterators and re-seek them to a different prefix. This feature isn't supported by any iterator implementation except for the reftable iterators anyway, but if it was implemented it would allow us to optimize cases where we need to search for specific references repeatedly by reusing internal state. Detangle the lifecycle from iteration so that we don't deallocate the iterator anymore once it is exhausted. Instead, callers are now expected to always call a newly introduce `ref_iterator_free()` function that deallocates the iterator and its internal state. Note that the `dir_iterator` is somewhat special because it does not implement the `ref_iterator` interface, but is only used to implement other iterators. Consequently, we have to provide `dir_iterator_free()` instead of `dir_iterator_release()` as the allocated structure itself is managed by the `dir_iterator` interfaces, as well, and not freed by `ref_iterator_free()` like in all the other cases. While at it, drop the return value of `ref_iterator_abort()`, which wasn't really required by any of the iterator implementations anyway. Furthermore, stop calling `base_ref_iterator_free()` in any of the backends, but instead call it in `ref_iterator_free()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>