| Age | Commit message (Collapse) | Author | Files | Lines |
|
Message fix.
* pk/reflog-migrate-message-fix:
refs: add missing space in messages
|
|
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "debug" ref-backend was missing a method implementation, which
has been corrected.
* xr/ref-debug-remove-on-disk:
refs: add missing remove_on_disk implementation for debug backend
|
|
"Symlink symref" has been added to the list of things that will
disappear at Git 3.0 boundary.
* ps/symlink-symref-deprecation:
refs/files: deprecate writing symrefs as symbolic links
|
|
The debug ref backend (refs_be_debug) was missing the remove_on_disk
function pointer, which caused a segmentation fault when running
'GIT_TRACE_REFS=1 git refs migrate --ref-format=reftable' commands.
Signed-off-by: Xinyu Ruan <r200981113@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code clean-up.
* js/unreachable-workaround-for-no-symlink-head:
refs: forbid clang to complain about unreachable code
|
|
Code clean-up.
* js/unreachable-workaround-for-no-symlink-head:
refs: forbid clang to complain about unreachable code
|
|
Deal more gracefully with directory / file conflicts when the files
backend is used for ref storage, by failing only the ones that are
involved in the conflict while allowing others.
* kn/refs-files-case-insensitive:
refs/files: handle D/F conflicts during locking
refs/files: handle F/D conflicts in case-insensitive FS
refs/files: use correct error type when lock exists
refs/files: catch conflicts on case-insensitive file-systems
|
|
"git refs migrate" to migrate the reflog entries from a refs
backend to another had a handful of bugs squashed.
* ps/reflog-migrate-fixes:
refs: fix invalid old object IDs when migrating reflogs
refs: stop unsetting REF_HAVE_OLD for log-only updates
refs/files: detect race when generating reflog entry for HEAD
refs: fix identity for migrated reflogs
ident: fix type of string length parameter
builtin/reflog: implement subcommand to write new entries
refs: export `ref_transaction_update_reflog()`
builtin/reflog: improve grouping of subcommands
Documentation/git-reflog: convert to use synopsis type
|
|
The "files" backend has the ability to store symbolic refs as symbolic
links, which can be configured via "core.preferSymlinkRefs". This
feature stems back from the early days: the initial implementation of
symbolic refs used symlinks exclusively. The symref format was only
introduced in 9b143c6e15 (Teach update-ref about a symbolic ref stored
in a textfile., 2005-09-25) and made the default in 9f0bb90d16
(core.prefersymlinkrefs: use symlinks for .git/HEAD, 2006-05-02).
This is all about 20 years ago, and there are no known reasons nowadays
why one would want to use symlinks instead of symrefs. Mark the feature
for deprecation in Git 3.0.
Note that this only deprecates _writing_ symrefs as symbolic links.
Reading such symrefs is still supported for now.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The reftable backend learned to sanity check its on-disk data more
carefully.
* kn/reftable-consistency-checks:
refs/reftable: add fsck check for checking the table name
reftable: add code to facilitate consistency checks
fsck: order 'fsck_msg_type' alphabetically
Documentation/fsck-msgids: remove duplicate msg id
reftable: check for trailing newline in 'tables.list'
refs: move consistency check msg to generic layer
refs: remove unused headers
|
|
When `NO_SYMLINK_HEAD` is defined, `create_ref_symlink()` is hard-coded
as `(-1)`, and as a consequence the condition `!create_ref_symlink()`
always evaluates to false, rendering any code guarded by that condition
unreachable.
Therefore, clang is _technically_ correct when it complains about
unreachable code. It does completely miss the fact that this is okay
because on _other_ platforms, where `NO_SYMLINK_HEAD` is not defined,
the code isn't unreachable at all.
Let's use the same trick as in 82e79c63642c (git-compat-util: add
NOT_CONSTANT macro and use it in atfork_prepare(), 2025-03-17) to
appease clang while at the same time keeping the `-Wunreachable` flag
to potentially find _actually_ unreachable code.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Handling of an empty subdirectory of .git/refs/ in the ref-files
backend has been corrected.
* kn/ref-cache-seek-fix:
refs/ref-cache: fix SEGFAULT when seeking in empty directories
|
|
Add glue code in 'refs/reftable-backend.c' which calls the reftable
library to perform the fsck checks. Here we also map the reftable errors
to Git' fsck errors.
Introduce a check to validate table names for a given reftable stack.
Also add 'badReftableTableName' as a corresponding error within Git. The
reftable specification mentions:
It suggested to use
${min_update_index}-${max_update_index}-${random}.ref as a naming
convention.
So treat non-conformant file names as warnings.
While adding the fsck header to 'refs/reftable-backend.c', modify the
list to maintain lexicographical ordering.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The files-backend prints a message before the consistency checks run.
Move this to the generic layer so both the files and reftable backend
can benefit from this message.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the 'refs/' namespace, some of the included header files are not
needed, let's remove them.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git refs optimize" is added for not very well explained reason
despite it does the same thing as "git pack-refs"...
* ms/refs-optimize:
t: add test for git refs optimize subcommand
t0601: refactor tests to be shareable
builtin/refs: add optimize subcommand
doc: pack-refs: factor out common options
builtin/pack-refs: factor out core logic into a shared library
builtin/pack-refs: convert to use the generic refs_optimize() API
reftable-backend: implement 'optimize' action
files-backend: implement 'optimize' action
refs: add a generic 'optimize' API
|
|
The 'cache_ref_iterator_seek()' function is used to seek the
`ref_iterator` to the desired reference in the ref-cache mechanism. We
use the seeking functionality to implement the '--start-after' flag in
'git-for-each-ref(1)'.
When using the files-backend with packed-refs, it is possible that some
of the refs directories are empty. For e.g. just after repacking, the
'refs/heads' directory would be empty. The ref-cache seek mechanism,
doesn't take this into consideration when descending into a
subdirectory, and makes an out of bounds access, causing SEGFAULT as we
try to access entries within the directory. Fix this by breaking out of
the loop when we enter an empty directory.
Since we start with the base directory of 'refs/' which is never empty,
it is okay to perform this check after the first iteration in the
`do..while` clause.
Add tests which simulate this behavior and also provide coverage over
using the feature over packed-refs.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Deal more gracefully with directory / file conflicts when the files
backend is used for ref storage, by failing only the ones that are
involved in the conflict while allowing others.
* kn/refs-files-case-insensitive:
refs/files: handle D/F conflicts during locking
refs/files: handle F/D conflicts in case-insensitive FS
refs/files: use correct error type when lock exists
refs/files: catch conflicts on case-insensitive file-systems
|
|
To make the new generic `optimize` API fully functional, provide an
implementation for the 'reftable' reference backend.
For the reftable backend, the 'optimize' action is to compact its
tables. The existing `reftable_be_pack_refs()` function already provides
this logic, so the new `reftable_be_optimize()` function simply calls
it.
Wire up the new function to the `optimize` slot in the reftable
backend's virtual table.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
With the generic `refs_optimize()` API now in place, provide the first
implementation for the 'files' reference backend. This makes the new API
functional for existing repositories and serves as the foundation for
migrating user-facing commands to the new architecture.
The implementation simply calls the existing `files_pack_refs()`
function, as 'packing' is the method used to optimize the files-based
reference store.
Wire up the new `files_optimize()` function to the `optimize` slot in
the files backend's virtual table.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The existing `pack-refs` API is conceptually tied to the 'files'
backend, but its behavior is generic (e.g., it triggers compaction for
reftable). This naming is confusing.
Introduce a new generic refs_optimize() API that dispatches to a
backend-specific implementation via a new 'optimize' vtable method.
This lays the architectural groundwork for different reference backends
(like 'files' and 'reftable') to provide their own storage optimization
logic, which will be called from a single, generic entry point.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The previous commit added the necessary validation and checks for F/D
conflicts in the files backend when working on case insensitive systems.
There is still a possibility for D/F conflicts. This is a different from
the F/D since for F/D conflicts, there would not be a conflict during
the lock creation phase:
refs/heads/foo.lock
refs/heads/foo/bar.lock
However there would be a conflict when the locks are committed, since we
cannot have 'refs/heads/foo/bar' and 'refs/heads/foo'. These kinds of
conflicts are checked and resolved in
`refs_verify_refnames_available()`, so the previous commit ensured that
for case-insensitive filesystems, we would lowercase the inputs to that
function.
For D/F conflicts, there is a conflict during the lock creation phase
itself:
refs/heads/foo/bar.lock
refs/heads/foo.lock
As in `lock_raw_ref()` after creating the lock, we also check for D/F
conflicts. This can occur in case-insensitive filesystems when trying to
fetch case-conflicted references like:
refs/heads/Foo/new
refs/heads/foo
D/F conflicts can also occur in case-sensitive filesystems, when the
repository already contains a directory with a lock file
'refs/heads/foo/bar.lock' and trying to fetch 'refs/heads/foo'. This
doesn't concern directories containing garbage files as those are
handled on a higher level.
To fix this, simply categorize the error as a name conflict. Also remove
this reference from the list of valid refnames for availability checks.
By categorizing the error and removing it from the list of valid
references, batched updates now knows to reject such reference updates
and apply the other reference updates.
Fix a small typo in `ref_transaction_maybe_set_rejected()` while here.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When using the files-backend on case-insensitive filesystems, there is
possibility of hitting F/D conflicts when creating references within a
single transaction, such as:
- 'refs/heads/foo'
- 'refs/heads/Foo/bar'
Ideally such conflicts are caught in `refs_verify_refnames_available()`
which is responsible for checking F/D conflicts within a given
transaction. This utility function is shared across the reference
backends. As such, it doesn't consider the issues of using a
case-insensitive file system, which only affects the files-backend.
While one solution would be to make the function aware of such issues,
this feels like leaking implementation details of file-backend specific
issues into the utility function. So opt for the more simpler option, of
lowercasing all references sent to this function when on a
case-insensitive filesystem and operating on the files-backend.
To do this, simply use a `struct strbuf` to convert the refname to
lowercase and append it to the list of refnames to be checked. Since we
use a `struct strbuf` and the memory is cleared right after, make sure
that the string list duplicates all provided string.
Without this change, the user would simply be left with a repository
with '.lock' files which were created in the 'prepare' phase of the
transaction, as the 'commit' phase would simply abort and not do the
necessary cleanup.
Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When fetching references into a repository, if a lock for a particular
reference exists, then `lock_raw_ref()` throws:
- REF_TRANSACTION_ERROR_CASE_CONFLICT: when there is a conflict
because the transaction contains conflicting references while being
on a case-insensitive filesystem.
- REF_TRANSACTION_ERROR_GENERIC: for all other errors.
The latter causes the entire set of batched updates to fail, even in
case sensitive filessystems.
Instead, return a 'REF_TRANSACTION_ERROR_CREATE_EXISTS' error. This
allows batched updates to reject the individual update which conflicts
with the existing file, while updating the rest of the references.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
During the 'prepare' phase of a reference transaction in the files
backend, we create the lock files for references to be created. When
using batched updates on case-insensitive filesystems, the entire
batched updates would be aborted if there are conflicting names such as:
refs/heads/Foo
refs/heads/foo
This affects all commands which were migrated to use batched updates in
Git 2.51, including 'git-fetch(1)' and 'git-receive-pack(1)'. Before
that, reference updates would be applied serially with one transaction
used per update. When users fetched multiple references on
case-insensitive systems, subsequent references would simply overwrite
any earlier references. So when fetching:
refs/heads/foo: 5f34ec0bfeac225b1c854340257a65b106f70ea6
refs/heads/Foo: ec3053b0977e83d9b67fc32c4527a117953994f3
refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56
The user would simply end up with:
refs/heads/foo: ec3053b0977e83d9b67fc32c4527a117953994f3
refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56
This is buggy behavior since the user is never informed about the
overrides performed and missing references. Nevertheless, the user is
left with a working repository with a subset of the references. Since
Git 2.51, in such situations fetches would simply fail without updating
any references. Which is also buggy behavior and worse off since the
user is left without any references.
The error is triggered in `lock_raw_ref()` where the files backend
attempts to create a lock file. When a lock file already exists the
function returns a 'REF_TRANSACTION_ERROR_GENERIC'. When this happens,
the entire batched updates, not individual operation, is aborted as if
it were in a transaction.
Change this to return 'REF_TRANSACTION_ERROR_CASE_CONFLICT' instead to
aid the batched update mechanism to simply reject such errors. The
change only affects batched updates since batched updates will reject
individual updates with non-generic errors. So specifically this would
only affect:
1. git fetch
2. git receive-pack
3. git update-ref --batch-updates
This bubbles the error type up to `files_transaction_prepare()` which
tries to lock each reference update. So if the locking fails, we check
if the rejection type can be ignored, which is done by calling
`ref_transaction_maybe_set_rejected()`.
As the error type is now 'REF_TRANSACTION_ERROR_CASE_CONFLICT',
the specific reference update would simply be rejected, while other
updates in the transaction would continue to be applied. This allows
partial application of references in case-insensitive filesystems when
fetching colliding references.
While the earlier implementation allowed the last reference to be
applied overriding the initial references, this change would allow the
first reference to be applied while rejecting consequent collisions.
This should be an okay compromise since with the files backend, there is
no scenario possible where we would retain all colliding references.
Let's also be more proactive and notify users on case-insensitive
filesystems about such problems by providing a brief about the issue
while also recommending using the reftable backend, which doesn't have
the same issue.
Reported-by: Joe Drew <joe.drew@indexexchange.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git fetch" can clobber a symref that is dangling when the
remote-tracking HEAD is set to auto update, which has been
corrected.
* jk/no-clobber-dangling-symref-with-fetch:
refs: do not clobber dangling symrefs
t5510: prefer "git -C" to subshell for followRemoteHEAD tests
t5510: stop changing top-level working directory
t5510: make confusing config cleanup more explicit
|
|
Code clean-ups.
* ps/reftable-libgit2-cleanup:
refs/reftable: always reload stacks when creating lock
reftable: don't second-guess errors from flock interface
reftable/stack: handle outdated stacks when compacting
reftable/stack: allow passing flags to `reftable_stack_add()`
reftable/stack: fix compiler warning due to missing braces
reftable/stack: reorder code to avoid forward declarations
reftable/writer: drop Git-specific `QSORT()` macro
reftable/writer: fix type used for number of records
|
|
"git remote rename origin upstream" failed to move origin/HEAD to
upstream/HEAD when origin/HEAD is unborn and performed other
renames extremely inefficiently, which has been corrected.
* ps/remote-rename-fix:
builtin/remote: only iterate through refs that are to be renamed
builtin/remote: rework how remote refs get renamed
builtin/remote: determine whether refs need renaming early on
builtin/remote: fix sign comparison warnings
refs: simplify logic when migrating reflog entries
refs: pass refname when invoking reflog entry callback
|
|
"git refs migrate" to migrate the reflog entries from a refs
backend to another had a handful of bugs squashed.
* ps/reflog-migrate-fixes:
refs: fix invalid old object IDs when migrating reflogs
refs: stop unsetting REF_HAVE_OLD for log-only updates
refs/files: detect race when generating reflog entry for HEAD
refs: fix identity for migrated reflogs
ident: fix type of string length parameter
builtin/reflog: implement subcommand to write new entries
refs: export `ref_transaction_update_reflog()`
builtin/reflog: improve grouping of subcommands
Documentation/git-reflog: convert to use synopsis type
|
|
When given an expected "before" state, the ref-writing code will avoid
overwriting any ref that does not match that expected state. We use the
null oid as a sentinel value for "nothing should exist", and likewise
that is the sentinel value we get when trying to read a ref that does
not exist.
But there's one corner case where this is ambiguous: dangling symrefs.
Trying to read them will yield the null oid, but there is potentially
something of value there: the dangling symref itself.
For a normal recursive write, this is OK. Imagine we have a symref
"FOO_HEAD" that points to a ref "refs/heads/bar" that does not exist,
and we try to write to it with a create operation like:
oid=$(git rev-parse HEAD) ;# or whatever
git symbolic-ref FOO_HEAD refs/heads/bar
echo "create FOO_HEAD $oid" | git update-ref --stdin
The attempt to resolve FOO_HEAD will actually resolve "bar", yielding
the null oid. That matches our expectation, and the write proceeds. This
is correct, because we are not writing FOO_HEAD at all, but writing its
destination "bar", which in fact does not exist.
But what if the operation asked not to dereference symrefs? Like this:
echo "create FOO_HEAD $oid" | git update-ref --no-deref --stdin
Resolving FOO_HEAD would still result in a null oid, and the write will
proceed. But it will overwrite FOO_HEAD itself, removing the fact that
it ever pointed to "bar".
This case is a little esoteric; we are clobbering a symref with a
no-deref write of a regular ref value. But the same problem occurs when
writing symrefs. For example:
echo "symref-create FOO_HEAD refs/heads/other" |
git update-ref --no-deref --stdin
The "create" operation asked us to create FOO_HEAD only if it did not
exist. But we silently overwrite the existing value.
You can trigger this without using update-ref via the fetch
followRemoteHEAD code. In "create" mode, it should not overwrite an
existing value. But if you manually create a symref pointing to a value
that does not yet exist (either via symbolic-ref or with "remote add
-m"), create mode will happily overwrite it.
Instead, we should detect this case and refuse to write. The correct
specification to overwrite FOO_HEAD in this case is to provide an
expected target ref value, like:
echo "symref-update FOO_HEAD refs/heads/other ref refs/heads/bar" |
git update-ref --no-deref --stdin
Note that the non-symref "update" directive does not allow you to do
this (you can only specify an oid). This is a weakness in the update-ref
interface, and you'd have to overwrite unconditionally, like:
echo "update FOO_HEAD $oid" | git update-ref --no-deref --stdin
Likewise other symref operations like symref-delete do not accept the
"ref" keyword. You should be able to do:
echo "symref-delete FOO_HEAD ref refs/heads/bar"
but cannot (and can only delete unconditionally). This patch doesn't
address those gaps. We may want to do so in a future patch for
completeness, but it's not clear if anybody actually wants to perform
those operations. The symref update case (specifically, via
followRemoteHEAD) is what I ran into in the wild.
The code for the fix is relatively straight-forward given the discussion
above. But note that we have to implement it independently for the files
and reftable backends. The "old oid" checks happen as part of the
locking process, which is implemented separately for each system. We may
want to factor this out somehow, but it's beyond the scope of this
patch. (Another curiosity is that the messages in the reftable code are
marked for translation, but the ones in the files backend are not. I
followed local convention in each case, but we may want to harmonize
this at some point).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When creating a new addition via either `reftable_stack_new_addition()`
or its convenince wrapper `reftable_stack_add()` we:
1. Create the "tables.list.lock" file.
2. Verify that the current version of the "tables.list" file is
up-to-date.
3. Write the new table records if so.
By default, the second step would cause us to bail out if we see that
there has been a concurrent write to the stack that made our in-memory
copy of the stack out-of-date. This is a safety mechanism to not write
records to the stack based on outdated information.
The downside though is that concurrent writes may now cause us to bail
out, which is not a good user experience. In addition, this isn't even
necessary for us, as Git knows to perform all checks for the old state
of references under the lock. (Well, in all except one case: when we
expire the reflog we first create the log iterator before we create the
lock, but this ordering is fixed as part of this commit.)
Consequently, most writers pass the `REFTABLE_STACK_NEW_ADDITION_RELOAD`
flag. The effect of this flag is that we reload the stack after having
acquired the lock in case the stack is out-of-date. This plugs the race
with concurrent writers, but we continue performing the verifications of
the expected old state to catch actual conflicts in the references we
are about to write.
Adapt the remaining callsites that don't yet pass this flag to do so.
While at it, drop a needless manual reload.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `reftable_stack_add()` function is a simple wrapper to lock the
stack, add records to it via a callback and then commit the
result. One problem with it though is that it doesn't accept any flags
for creating the addition. This makes it impossible to automatically
reload the stack in case it was modified before we managed to lock the
stack.
Add a `flags` field to plug this gap and pass it through accordingly.
For now this new flag won't be used by us, but it will be used by
libgit2.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
With `refs_for_each_reflog_ent()` callers can iterate through all the
reflog entries for a given reference. The callback that is being invoked
for each such entry does not receive the name of the reference that we
are currently iterating through. This isn't really a limiting factor, as
callers can simply pass the name via the callback data.
But this layout sometimes does make for a bit of an awkward calling
pattern. One example: when iterating through all reflogs, and for each
reflog we iterate through all refnames, we have to do some extra book
keeping to track which reference name we are currently yielding reflog
entries for.
Change the signature of the callback function so that the reference name
of the reflog gets passed through to it. Adapt callers accordingly and
start using the new parameter in trivial cases. The next commit will
refactor the reference migration logic to make use of this parameter so
that we can simplify its logic a bit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* ps/reflog-migrate-fixes:
refs: fix invalid old object IDs when migrating reflogs
refs: stop unsetting REF_HAVE_OLD for log-only updates
refs/files: detect race when generating reflog entry for HEAD
refs: fix identity for migrated reflogs
ident: fix type of string length parameter
builtin/reflog: implement subcommand to write new entries
refs: export `ref_transaction_update_reflog()`
builtin/reflog: improve grouping of subcommands
Documentation/git-reflog: convert to use synopsis type
|
|
When migrating reflog entries between different storage formats we end
up with invalid old object IDs for the migrated entries: instead of
writing the old object ID of the to-be-migrated entry, we end up with
the all-zeroes object ID.
The root cause of this issue is that we don't know to use the old object
ID provided by the caller. Instead, we manually resolve the old object
ID by resolving the current value of its matching reference. But as that
reference does not yet exist in the target ref storage we always end up
resolving it to all-zeroes.
This issue got unnoticed as there is no user-facing command that would
even show the old object ID. While `git log -g` knows to show the new
object ID, we don't have any formatting directive to show the old object
ID.
Fix the bug by introducing a new flag `REF_LOG_USE_PROVIDED_OIDS`. If
set, backends are instructed to use the old and new object IDs provided
by the caller, without doing any manual resolving. Set this flag in
`ref_transaction_update_reflog()`.
Amend our tests in t1460-refs-migrate to use our test tool to read
reflog entries. This test tool prints out both old and new object ID of
each reflog entry, which fixes the test gap. Furthermore it also prints
the full identity used to write the reflog, which provides test coverage
for the previous commit in this patch series that fixed the identity for
migrated reflogs.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `REF_HAVE_OLD` flag indicates whether a given ref update has its old
object ID set. If so, the value of that field is used to verify whether
the current state of the reference matches this expected state. It is
thus an important part of mitigating races with a concurrent process
that updates the same set of references.
When writing reflogs though we explicitly unset that flag. This is a
sensible thing to do: the old state of reflog entry updates may not
necessarily match the current on-disk state of its accompanying ref, but
it's only intended to signal what old object ID we want to write into
the new reflog entry. For example when migrating refs we end up writing
many reflog entries for a single reference, and most likely those reflog
entries will have many different old object IDs.
But unsetting this flag also removes a useful signal, namely that the
caller _did_ provide an old object ID for a given reflog entry. This
signal will become useful in a subsequent commit, where we add a new
flag that tells the transaction to use the provided old and new object
IDs to write a reflog entry. The `REF_HAVE_OLD` flag is then used as a
signal to verify that the caller really did provide an old object ID.
Stop unsetting the flag so that we can use it as this described signal
in a subsequent commit. Skip checking the old object ID for log-only
updates so that we don't expect it to match the current on-disk state.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When updating a reference that is being pointed to HEAD we don't only
write a reflog message for that particular reference, but also generate
one for HEAD. This logic is handled by `split_head_update()`, where we:
1. Verify that the condition actually triggered. This is done by
reading HEAD at the start of the transaction so that we can then
check whether a given reference update refers to its target.
2. Queue a new log-only update for HEAD in case it did.
But the logic is unfortunately not free of races, as we do not lock the
HEAD reference after we have read its target. This can lead to the
following two scenarios:
- HEAD gets concurrently updated to point to one of the references we
have already processed. This causes us not writing a reflog message
even though we should have done so.
- HEAD gets concurrently updated to no longer point to a reference
anymore that we have already processed. This causes us to write a
reflog message even though we should _not_ have done so.
Improve the situation by introducing a new `REF_LOG_VIA_SPLIT` flag that
is specific to the "files" backend. If set, we will double check that
the HEAD reference still points to the reference that we are creating
the reflog entry for after we have locked HEAD. Furthermore, instead of
manually resolving the old object ID of that entry, we now use the same
old state as for the parent update.
If we detect such a racy update we abort the transaction. This is a bit
heavy-handed: the user didn't even ask us to write a reflog update for
"HEAD", so it might be surprising if we abort the transaction. That
being said:
- Normal users wouldn't typically hit this case as we only hit the
relevant code when committing to a branch that is being pointed to
by "HEAD" directly. Commands like git-commit(1) typically commit to
"HEAD" itself though.
- Scripted users that use git-update-ref(1) and related plumbing
commands are unlikely to hit this case either, as they would have to
update the pointed-to-branch at the same as "HEAD" is being updated,
which is an exceedingly rare event.
The alternative would be to instead drop the log-only update completely,
but that would require more logic that is hard to verify without adding
infrastructure specific for such a test. So we rather do the pragmatic
thing and don't worry too much about an edge case that is very unlikely
to happen.
Unfortunately, this change only helps with the second race. We cannot
reliably plug the first race without locking the HEAD reference at the
start of the transaction. Locking HEAD unconditionally would effectively
serialize all writes though, and that doesn't seem like an option. Also,
double checking its value at the end of the transaction is not an option
either, as its target may have flip-flopped during the transaction.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The config API had a set of convenience wrapper functions that
implicitly use the_repository instance; they have been removed and
inlined at the calling sites.
* ps/config-wo-the-repository: (21 commits)
config: fix sign comparison warnings
config: move Git config parsing into "environment.c"
config: remove unused `the_repository` wrappers
config: drop `git_config_set_multivar()` wrapper
config: drop `git_config_get_multivar_gently()` wrapper
config: drop `git_config_set_multivar_in_file_gently()` wrapper
config: drop `git_config_set_in_file_gently()` wrapper
config: drop `git_config_set()` wrapper
config: drop `git_config_set_gently()` wrapper
config: drop `git_config_set_in_file()` wrapper
config: drop `git_config_get_bool()` wrapper
config: drop `git_config_get_ulong()` wrapper
config: drop `git_config_get_int()` wrapper
config: drop `git_config_get_string()` wrapper
config: drop `git_config_get_string()` wrapper
config: drop `git_config_get_string_multi()` wrapper
config: drop `git_config_get_value()` wrapper
config: drop `git_config_get_value()` wrapper
config: drop `git_config_get()` wrapper
config: drop `git_config_clear()` wrapper
...
|
|
Code clean-up.
* kn/for-each-ref-skip-updates:
ref-filter: use REF_ITERATOR_SEEK_SET_PREFIX instead of '1'
t6302: add test combining '--start-after' with '--exclude'
for-each-ref: reword the documentation for '--start-after'
for-each-ref: fix documentation argument ordering
ref-cache: use 'size_t' instead of int for length
|
|
"git for-each-ref" learns "--start-after" option to help
applications that want to page its output.
* kn/for-each-ref-skip:
ref-cache: set prefix_state when seeking
for-each-ref: introduce a '--start-after' option
ref-filter: remove unnecessary else clause
refs: selectively set prefix in the seek functions
ref-cache: remove unused function 'find_ref_entry()'
refs: expose `ref_iterator` via 'refs.h'
|
|
The commit 090eb5336c (refs: selectively set prefix in the seek
functions, 2025-07-15) modified the ref-cache iterator to support
seeking to a specified marker without setting the prefix.
The commit adds and uses an integer 'len' to capture the length of the
seek marker to compare with the entries of a given directory. Since the
type of the variable is 'int', this is met with a typecast of converting
a `strlen` to 'int' so it can be assigned to the 'len' variable.
This is whole operation is a bit wrong:
1. Since the 'len' variable is eventually used in a 'strncmp', it should
have been of type 'size_t'.
2. This also truncates the value provided from 'strlen' to an int, which
could cause a large refname to produce a negative number.
Let's do the correct thing here and simply use 'size_t' for `len`.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 090eb5336c (refs: selectively set prefix in the seek functions,
2025-07-15) we separated the seeking functionality of reference
iterators from the functionality to set prefix to an iterator. This
allows users of ref iterators to seek to a particular reference to
provide pagination support.
The files-backend, uses the ref-cache iterator to iterate over loose
refs. The iterator tracks directories and entries already processed via
a stack of levels. Each level corresponds to a directory under the files
backend. New levels are added to the stack, and when all entries from a
level is yielded, the corresponding level is popped from the stack.
To accommodate seeking, we need to populate and traverse the levels to
stop the requested seek marker at the appropriate level and its entry
index. Each level also contains a 'prefix_state' which is used for
prefix matching, this allows the iterator to skip levels/entries which
don't match a prefix. The default value of 'prefix_state' is
PREFIX_CONTAINS_DIR, which yields all entries within a level. When
purely seeking without prefix matching, we want to yield all entries.
The commit however, skips setting the value explicitly. This causes the
MemorySanitizer to issue a 'use-of-uninitialized-value' error when
running 't/t6302-for-each-ref-filter'.
Set the value explicitly to avoid to fix the issue.
Reported-by: Kyle Lippincott <spectral@google.com>
Helped-by: Kyle Lippincott <spectral@google.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 036876a1067 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those function have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config_get_int()`. All
callsites are adjusted so that they use
`repo_config_get_int(the_repository, ...)` instead. While some callsites
might already have a repository available, this mechanical conversion is
the exact same as the current situation and thus cannot cause any
regression. Those sites should eventually be cleaned up in a later patch
series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 036876a1067 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those function have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config()`. All callsites
are adjusted so that they use `repo_config(the_repository, ...)`
instead. While some callsites might already have a repository available,
this mechanical conversion is the exact same as the current situation
and thus cannot cause any regression. Those sites should eventually be
cleaned up in a later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When a ref creation at refs/heads/foo/bar fails, the files backend
now removes refs/heads/foo/ if the directory is otherwise not used.
* ps/refs-files-remove-empty-parent:
refs/files: remove empty parent dirs when ref creation fails
|
|
The ref iterator exposes a `ref_iterator_seek()` function. The name
suggests that this would seek the iterator to a specific reference in
some ways similar to how `fseek()` works for the filesystem.
However, the function actually sets the prefix for refs iteration. So
further iteration would only yield references which match the particular
prefix. This is a bit confusing.
Let's add a 'flags' field to the function, which when set with the
'REF_ITERATOR_SEEK_SET_PREFIX' flag, will set the prefix for the
iteration in-line with the existing behavior. Otherwise, the reference
backends will simply seek to the specified reference and clears any
previously set prefix. This allows users to start iteration from a
specific reference.
In the packed and reftable backend, since references are available in a
sorted list, the changes are simply setting the prefix if needed. The
changes on the files-backend are a little more involved, since the files
backend uses the 'ref-cache' mechanism. We move out the existing logic
within `cache_ref_iterator_seek()` to `cache_ref_iterator_set_prefix()`
which is called when the 'REF_ITERATOR_SEEK_SET_PREFIX' flag is set. We
then parse the provided seek string and set the required levels and
their indexes to ensure that seeking is possible.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The 'find_ref_entry' function is no longer used, so remove it.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `ref_iterator` is an internal structure to the 'refs/'
sub-directory, which allows iteration over refs. All reference iteration
is built on top of these iterators.
External clients of the 'refs' subsystem use the various
'refs_for_each...()' functions to iterate over refs. However since these
are wrapper functions, each combination of functionality requires a new
wrapper function. This is not feasible as the functions pile up with the
increase in requirements. Expose the internal reference iterator, so
advanced users can mix and match options as needed.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git push" and "git fetch" are taught to update refs in batches to
gain performance.
* kn/fetch-push-bulk-ref-update:
receive-pack: handle reference deletions separately
refs/files: skip updates with errors in batched updates
receive-pack: use batched reference updates
send-pack: fix memory leak around duplicate refs
fetch: use batched reference updates
refs: add function to translate errors to strings
|