aboutsummaryrefslogtreecommitdiffstats
path: root/t/t6120-describe.sh (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-08-20describe: handle blob traversal with no commitsJeff King1-0/+16
When describing a blob, we traverse from HEAD, remembering each commit we saw, and then checking each blob to report the containing commit. But if we haven't seen any commits at all, we'll segfault (we store the "current" commit as an oid initialized to the null oid, causing lookup_commit_reference() to return NULL). This shouldn't be able to happen normally. We always start our traversal at HEAD, which must be a commit (a property which is enforced by the refs code). But you can trigger the segfault like this: blob=$(echo foo | git hash-object -w --stdin) echo $blob >.git/HEAD git describe $blob We can instead catch this case and return an empty result, which hits the usual "we didn't find $blob while traversing HEAD" error. This is a minor lie in that we did "find" the blob. And this even hints at a bigger problem in this code: what if the traversal pointed to the blob as _not_ part of a commit at all, but we had previously filled in the recorded "current commit"? One could imagine this happening due to a tag pointing directly to the blob in question. But that can't happen, because we only traverse from HEAD, never from any other refs. And the intent of the blob-describing code is to find blobs within commits. So I think this matches the original intent as closely as we can (and again, this segfault cannot be triggered without corrupting your repository!). The test here does not use the formula above, which works only for the files backend (and not reftables). Instead we use another loophole to create the bogus state using only Git commands. See the comment in the test for details. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-18describe: catch unborn branch in describe_blob()Jeff King1-0/+8
When describing a blob, we search for it by traversing from HEAD. We do this by feeding the name HEAD to setup_revisions(). But if we are on an unborn branch, this will fail with a confusing message: $ git describe $blob fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git <command> [<revision>...] -- [<file>...]' It is OK for this to be an error (we cannot find $blob in an empty traversal, so we'd eventually complain about that). But the error message could be more helpful. Let's resolve HEAD ourselves and pass the resolved object id to setup_revisions(). If resolving fails, then we can print a more useful message. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-18describe: error if blob not foundJeff King1-0/+6
If describe_blob() does not find the blob in question, it returns an empty strbuf, and we print an empty line. This differs from describe_commit(), which always either returns an answer or calls die() itself. As the blob function was bolted onto the command afterwards, I think its behavior is not intentional, and it is just a bug that it does not report an error. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-12name-rev: remove "--stdin" supportJunio C Hamano1-2/+8
As part of Git 3.0, remove the hidden synonym for "--annotate-stdin" for real. As this does not change the fact that it used to be called "--stdin" in older version of Git, keep that passage in the documentation for "--annotate-stdin". Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-12t6120: further modernizeJunio C Hamano1-3/+3
There is absolutely no reason why a pattern given to grep to find 'warning: --stdin is deprecated' must be quoted within a pair of single quotes, or the pattern to look for the literal string as ERE. Quote the test body with a pair of single quotes like everybody else, and quote the needle string in a pair of double quotes. Also use test_grep instead of "grep -E". Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-12t6120: avoid hiding "git" exit statusJunio C Hamano1-2/+4
A handful of tests invoke "git" on the upstream side of a pipe, hiding its exit status. Correct them. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-13object-name: be more strict in parsing describe-like outputElijah Newren1-0/+24
From Documentation/revisions.txt: '<describeOutput>', e.g. 'v1.7.4.2-679-g3bee7fb':: Output from `git describe`; i.e. a closest tag, optionally followed by a dash and a number of commits, followed by a dash, a 'g', and an abbreviated object name. which means that output of the format ${REFNAME}-${INTEGER}-g${HASH} should parse to fully expanded ${HASH}. This is fine. However, we currently don't validate any of ${REFNAME}-${INTEGER}, we only parse -g${HASH} and assume the rest is valid. That is problematic, since it breaks things like git cat-file -p branchname:path/to/file/named/i-gaffed which, when commit (or tree or blob) affed exists, will not return us information about the file we are looking for but will instead erroneously tell us about object affed. A few additional notes: - This is a slight backward incompatibility break, because we used to allow ${GARBAGE}-g${HASH} as a way to spell ${HASH}. However, a backward incompatible break is necessary, because there is no other way for someone to be more specific and disambiguate that they want the blob master:path/to/who-gabbed instead of the object abbed. - There is a possibility that check_refname_format() rules change in the future. However, we can only realistically loosen the rules for what that function accepts rather than tighten. If we were to tighten the rules, some real world repositories may already have refnames that suddenly become unacceptable and we break those repositories. As such, any describe-like syntax of the form ${VALID_FOR_A_REFNAME}-${INTEGER}-g${HASH} that is valid with the changes in this commit will remain valid in the future. - The fact that check_refname_format() rules could loosen in the future is probably also an important reason to make this change. If the rules loosen, there might be additional cases within ${GARBAGE}-g${HASH} that become ambiguous in the future. While abbreviated hashes can be disambiguated by abbreviating less, it may well be that these alternative object names have no way of being disambiguated (much like pathnames cannot be). Accepting all random ${GARBAGE} thus makes it difficult for us to allow future extensions to object naming. So, tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are present in the string, and would be considered a valid ref and non-negative integer. Also, add a few tests for git describe using object names of the form ${REVISION_NAME}${MODIFIERS} since an early version of this patch failed on constructs like git describe v2.48.0-rc2-161-g6c2274cdbc^0 Reported-by: Gabriel Amaral <gabriel-amaral@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-15Merge branch 'jk/describe-perf'Junio C Hamano1-3/+21
"git describe" optimization. * jk/describe-perf: describe: split "found all tags" and max_candidates logic describe: stop traversing when we run out of names describe: stop digging for max_candidates+1 t/perf: add tests for git-describe t6120: demonstrate weakness in disjoint-root handling
2024-12-06describe: split "found all tags" and max_candidates logicJeff King1-0/+10
Commit a30154187a (describe: stop traversing when we run out of names, 2024-10-31) taught git-describe to automatically reduce the max_candidates setting to match the total number of possible names. This lets us break out of the traversal rather than fruitlessly searching for more candidates when there are no more to be found. However, setting max_candidates to 0 (e.g., if the repo has no tags) overlaps with the --exact-match option, which explicitly uses the same value. And this causes a regression with --always, which is ignored in exact-match mode. We used to get this in a repo with no tags: $ git describe --always HEAD b2f0a7f and now we get: $ git describe --always HEAD fatal: no tag exactly matches 'b2f0a7f47f5f2aebe1e7fceff19a57de20a78c06' The reason is that we bail early in describe_commit() when max_candidates is set to 0. This logic goes all the way back to 2c33f75754 (Teach git-describe --exact-match to avoid expensive tag searches, 2008-02-24). We should obviously fix this regression, but there are two paths, depending on what you think: $ git describe --always --exact-match and $ git describe --always --candidates=0 should do. Since the "--always" option was added, it has always been ignored in --exact-match (or --candidates=0) mode. I.e., we treat --exact-match as a true exact match of a tag, and never fall back to using --always, even if it was requested. If we think that's a bug (or at least a misfeature), then the right solution is to fix it by removing the early bail-out from 2c33f75754, letting the noop algorithm run and then hitting the --always fallback output. And then our regression naturally goes away, because it follows the same path. If we think that the current "--exact-match --always" behavior is the right thing, then we have to differentiate the case where we automatically reduced max_candidates to 0 from the case where the user asked for it specifically. That's possible to do with a flag, but we can also just reimplement the logic from a30154187a to explicitly break out of the traversal when we run out of candidates (rather than relying on the existing max_candidates check). My gut feeling is along the lines of option 1 (it's a bug, and people would be happy for "--exact-match --always" to give the fallback rather than ignoring "--always"). But the documentation can be interpreted in the other direction, and we've certainly lived with the existing behavior for many years. So it's possible that changing it now is the wrong thing. So this patch fixes the regression by taking the second option, retaining the "--exact-match" behavior as-is. There are two new tests. The first shows that the regression is fixed (we don't even need a new repo without tags; a restrictive --match is enough to create the situation that there are no candidate names). The second test confirms that the "--exact-match --always" behavior remains unchanged and continues to die when there is no tag pointing at the specified commit. It's possible we may reconsider this in the future, but this shows that the approach described above is implemented faithfully. We can also run the perf tests in p6100 to see that we've retained the speedup that a30154187a was going for: Test HEAD^ HEAD -------------------------------------------------------------------------------------- 6100.2: describe HEAD 0.72(0.64+0.07) 0.72(0.66+0.06) +0.0% 6100.3: describe HEAD with one max candidate 0.01(0.00+0.00) 0.01(0.00+0.00) +0.0% 6100.4: describe HEAD with one tag 0.01(0.01+0.00) 0.01(0.01+0.00) +0.0% Reported-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21t: remove TEST_PASSES_SANITIZE_LEAK annotationsPatrick Steinhardt1-1/+0
Now that the default value for TEST_PASSES_SANITIZE_LEAK is `true` there is no longer a need to have that variable declared in all of our tests. Drop it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-07t6120: demonstrate weakness in disjoint-root handlingJeff King1-3/+11
Commit 30b1c7ad9d (describe: don't abort too early when searching tags, 2020-02-26) tried to fix a problem that happens when there are disjoint histories: to accurately compare the counts for different tags, we need to keep walking the history longer in order to find a common base. But its fix misses a case: we may still bail early if we hit the max_candidates limit, producing suboptimal output. You can see this in action by adding "--candidates=2" to the tests; we'll stop traversing as soon as we see the second tag and will produce the wrong answer. I hit this in practice while trying to teach git-describe not to keep looking for candidates after we've seen all tags in the repo (effectively adding --candidates=2, since these toy repos have only two tags each). This is probably fixable by continuing to walk after hitting the max-candidates limit, all the way down to a common ancestor of all candidates. But it's not clear in practice what the preformance implications would be (it would depend on how long the branches that hold the candidates are). So I'm punting on that for now, but I'd like to adjust the tests to be more resilient, and to document the findings. So this patch: 1. Adds an extra tag at the bottom of history. This shouldn't change the output, but does mean we are more resilient to low values of --candidates (e.g., if we start reducing it to the total number of tags). This is arguably closer to the real world anyway, where you're not going to have just 2 tags, but an arbitrarily long history going back in time, possibly with multiple irrelevant tags in it (I called the new tag "H" here for "history"). 2. Run the same tests with --candidates=2, which shows that even with the current code they can fail if we end the traversal early. That leaves a trail for anybody interested in trying to improve the behavior. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01builtin/name-rev: fix various trivial memory leaksPatrick Steinhardt1-0/+1
There are several structures that we don't release after `cmd_name_rev()` is done. Plug those leaks. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-26describe: refresh the index when 'broken' flag is usedAbhijeet Sonar1-0/+36
When describe is run with 'dirty' flag, we refresh the index to make sure it is in sync with the filesystem before determining if the working tree is dirty. However, this is not done for the codepath where the 'broken' flag is used. This causes `git describe --broken --dirty` to false positively report the worktree being dirty if a file has different stat info than what is recorded in the index. Running `git update-index -q --refresh` to refresh the index before running diff-index fixes the problem. Also add tests to deliberately update stat info of a file before running describe to verify it behaves correctly. Reported-by: Paul Millar <paul.millar@desy.de> Suggested-by: Junio C Hamano <gitster@pobox.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-02tests: teach callers of test_i18ngrep to use test_grepJunio C Hamano1-1/+1
They are equivalents and the former still exists, so as long as the only change this commit makes are to rewrite test_i18ngrep to test_grep, there won't be any new bug, even if there still are callers of test_i18ngrep remaining in the tree, or when merged to other topics that add new uses of test_i18ngrep. This patch was produced more or less with git grep -l -e 'test_i18ngrep ' 't/t[0-9][0-9][0-9][0-9]-*.sh' | xargs perl -p -i -e 's/test_i18ngrep /test_grep /' and a good way to sanity check the result yourself is to run the above in a checkout of c4603c1c (test framework: further deprecate test_i18ngrep, 2023-10-31) and compare the resulting working tree contents with the result of applying this patch to the same commit. You'll see that test_i18ngrep in a few t/lib-*.sh files corrected, in addition to the manual reproduction. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-21describe: fix --no-exact-matchRené Scharfe1-0/+8
Since 2c33f75754 (Teach git-describe --exact-match to avoid expensive tag searches, 2008-02-24) git describe accepts --no-exact-match, but it does the same as --exact-match, an alias for --candidates=0. That's because it's defined using OPT_SET_INT with a value of 0, which sets 0 when negated as well. Let --no-exact-match set the number of candidates to the default value instead. Users that need a more specific lack of exactitude can specify their preferred value using --candidates, as before. The "--no-exact-match" option was not covered in the tests, so let's add a few. Also add a case where --exact-match option is used on a commit that cannot be described without distance from tags and make sure the command fails. Signed-off-by: René Scharfe <l.s.r@web.de> [jc: added trivial tests] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-09name-rev: fix names by dropping taggerdate workaroundElijah Newren1-0/+6
Commit 7550424804 ("name-rev: include taggerdate in considering the best name", 2016-04-22) introduced the idea of using taggerdate in the criteria for selecting the best name. At the time, a certain commit in linux.git -- namely, aed06b9cfcab -- was being named by name-rev as v4.6-rc1~9^2~792 which, while correct, was very suboptimal. Some investigation found that tweaking the MERGE_TRAVERSAL_WEIGHT to lower it could give alternate answers such as v3.13-rc7~9^2~14^2~42 or v3.13~5^2~4^2~2^2~1^2~42 A manual solution involving looking at tagger dates came up with v3.13-rc1~65^2^2~42 which is much nicer. That workaround was then implemented in name-rev. Unfortunately, the taggerdate heuristic is causing bugs. I was pointed to a case in a private repository where name-rev reports a name of the form v2022.10.02~86 when users expected to see one of the form v2022.10.01~2 (I've modified the names and numbers a bit from the real testcase.) As you can probably guess, v2022.10.01 was created after v2022.10.02 (by a few hours), even though it pointed to an older commit. While the condition is unusual even in the repository in question, it is not the only problematic set of tags in that repository. The taggerdate logic is causing problems. Further, it turns out that this taggerdate heuristic isn't even helping anymore. Due to the fix to naming logic in 3656f84278 ("name-rev: prefer shorter names over following merges", 2021-12-04), we get improved names without the taggerdate heuristic. For the original commit of interest in linux.git, a modern git without the taggerdate heuristic still provides the same optimal answer of interest, namely: v3.13-rc1~65^2^2~42 So, the taggerdate is no longer providing benefit, and it is causing problems. Simply get rid of it. However, note that "taggerdate" as a variable is used to store things besides a taggerdate these days. Ever since commit ef1e74065c ("name-rev: favor describing with tags and use committer date to tiebreak", 2017-03-29), this has been used to store committer dates and there it is used as a fallback tiebreaker (as opposed to a primary criteria overriding effective distance calculations). We do not want to remove that fallback tiebreaker, so not all instances of "taggerdate" are removed in this change. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-13name-rev: use generation numbers if availableJacob Keller1-0/+118
If a commit in a sequence of linear history has a non-monotonically increasing commit timestamp, git name-rev might not properly name the commit. This occurs because name-rev uses a heuristic of the commit date to avoid searching down tags which lead to commits that are older than the named commit. This is intended to avoid work on larger repositories. This heuristic impacts git name-rev, and by extension git describe --contains which is built on top of name-rev. Further more, if --all or --annotate-stdin is used, the heuristic is not enabled because the full history has to be analyzed anyways. This results in some confusion if a user sees that --annotate-stdin works but a normal name-rev does not. If the repository has a commit graph, we can use the generation numbers instead of using the commit dates. This is essentially the same check except that generation numbers make it exact, where the commit date heuristic could be incorrect due to clock errors. Since we're extending the notion of cutoff to more than one variable, create a series of functions for setting and checking the cutoff. This avoids duplication and moves access of the global cutoff and generation_cutoff to as few functions as possible. Add several test cases including a test that covers the new commitGraph behavior, as well as tests for --all and --annotate-stdin with and without commitGraphs. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-09Merge branch 'jc/name-rev-stdin'Junio C Hamano1-2/+7
"git name-rev --stdin" does not behave like usual "--stdin" at all. Start the process of renaming it to "--annotate-stdin". * jc/name-rev-stdin: name-rev.c: use strbuf_getline instead of limited size buffer name-rev: deprecate --stdin in favor of --annotate-stdin
2022-01-10name-rev: deprecate --stdin in favor of --annotate-stdinJohn Cai1-2/+7
Introduce a --annotate-stdin that is functionally equivalent of --stdin. --stdin does not behave as --stdin in other subcommands, such as pack-objects whereby it takes one argument per line. Since --stdin can be a confusing and misleading name, rename it to --annotate-stdin. This change adds a warning to --stdin warning that it will be removed in the future. Signed-off-by: "John Cai" <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13t6000-t9999: detect and signal failure within loopEric Sunshine1-3/+3
Failures within `for` and `while` loops can go unnoticed if not detected and signaled manually since the loop itself does not abort when a contained command fails, nor will a failure necessarily be detected when the loop finishes since the loop returns the exit code of the last command it ran on the final iteration, which may not be the command which failed. Therefore, detect and signal failures manually within loops using the idiom `|| return 1` (or `|| exit 1` within subshells). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13tests: fix broken &&-chains in compound statementsEric Sunshine1-2/+5
The top-level &&-chain checker built into t/test-lib.sh causes tests to magically exit with code 117 if the &&-chain is broken. However, it has the shortcoming that the magic does not work within `{...}` groups, `(...)` subshells, `$(...)` substitutions, or within bodies of compound statements, such as `if`, `for`, `while`, `case`, etc. `chainlint.sed` partly fills in the gap by catching broken &&-chains in `(...)` subshells, but bugs can still lurk behind broken &&-chains in the other cases. Fix broken &&-chains in compound statements in order to reduce the number of possible lurking bugs. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02t6120: use git-update-ref rather than filesystem accessHan-Wen Nienhuys1-2/+4
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11describe tests: support -C in "check_describe"Ævar Arnfjörð Bjarmason1-9/+18
Change a subshell added in a preceding commit to instead use a new "-C" option to "check_describe". The idiom for this is copied as-is from the "test_commit" function in test-lib-functions.sh Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11describe tests: fix nested "test_expect_success" callÆvar Arnfjörð Bjarmason1-8/+14
Fix a nested invocation of "test_expect_success", the "check_describe()" function is a wrapper for calling test_expect_success, and therefore needs to be called outside the body of another "test_expect_success". The two tests added in 30b1c7ad9d6 (describe: don't abort too early when searching tags, 2020-02-26) were not testing for anything due to this logic error. Without this fix reverting the C code changes in that commit still has all tests passing, with this fix we're actually testing the "describe" output. This is because "test_expect_success" calls "test_finish_", whose last statement happens to be true. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11describe tests: don't rely on err.actual from "check_describe"Ævar Arnfjörð Bjarmason1-14/+11
Convert the one test that relied on the "err.actual" file produced by check_describe() to instead do its own check of "git describe" output. This means that the two tests won't have an inter-dependency (e.g. if the earlier test is skipped). An earlier version of this patch instead asserted that no other test had any output on stderr. We're not doing that here out of fear that "gc --auto" or another future change to "git describe" will cause it to legitimately emit output on stderr unexpectedly[1]. I'd think that inverting the test added in 3291fe4072e (Add git-describe test for "verify annotated tag names on output", 2008-03-03) to make checking that we don't have warnings the rule rather than the exception would be the sort of thing the describe tests should be catching, but for now let's leave it as it is. 1. http://lore.kernel.org/git/xmqqwnuqo8ze.fsf@gitster.c.googlers.com Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11describe tests: refactor away from glob matchingÆvar Arnfjörð Bjarmason1-40/+38
Change the glob matching via a "case" statement to a "test_cmp" after we've stripped out the hash-specific g<hash-abbrev> suffix. 5312ab11fbf (Add describe test., 2007-01-13). This means that we can use test_cmp to compare the output. I could omit the "-8" change of e.g. "A-*" to "A-8-gHASH", but I think it makes sense to test that here explicitly. It means you need to add new tests to the bottom of the file, but that's not a burden in this case. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11describe tests: improve test for --work-tree & --dirtyÆvar Arnfjörð Bjarmason1-6/+6
Improve tests added in 9f67d2e8279 (Teach "git describe" --dirty option, 2009-10-21) and 2ed5c8e174d (describe: setup working tree for --dirty, 2019-02-03) so that they make sense in combination with each other. The "check_describe" being removed here was the earlier test, we then later added these --work-tree tests which really just wanted to check if we got the exact same output from "describe", but the test wasn't structured to test for that. Let's change it to do that, which both improves test coverage and makes it more obvious what's going on here. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11describe tests: convert setup to use test_commitÆvar Arnfjörð Bjarmason1-45/+13
Convert the setup of the describe tests to use test_commit when possible. This makes use of the new --annotate option to test_commit. Some of the setup here could simply be removed since the data being created wasn't important to any of the subsequent tests, so I've done so. E.g. assigning to the "one" variable was always useless, and just checking that we can describe HEAD after the first commit wasn't useful. In the case of the "two" variable we could instead use the tag we just created. See 5312ab11fbf (Add describe test., 2007-01-13) for the initial version of this code. There's other cases here like redundant "test_tick" invocations, or the simplification of not echoing "X" to a file we're about to tag as "x", now we just use "x" in both cases. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10tests: remove most uses of test_i18ncmpÆvar Arnfjörð Bjarmason1-1/+1
As a follow-up to d162b25f956 (tests: remove support for GIT_TEST_GETTEXT_POISON, 2021-01-20) remove most uses of test_i18ncmp via a simple s/test_i18ncmp/test_cmp/g search-replacement. I'm leaving t6300-for-each-ref.sh out due to a conflict with in-flight changes between "master" and "seen", as well as the prerequisite itself due to other changes between "master" and "next/seen" which add new test_i18ncmp uses. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19t6[0-3]*: adjust the references to the default branch name "main"Johannes Schindelin1-11/+11
Carefully excluding t6300, which sees independent development elsewhere at the time of writing, we use `main` as the default branch name in t6[0-3]*. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -- t6[0-3]*.sh && git checkout HEAD -- t6300\*) This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19tests: mark tests relying on the current default for `init.defaultBranch`Johannes Schindelin1-0/+3
In addition to the manual adjustment to let the `linux-gcc` CI job run the test suite with `master` and then with `main`, this patch makes sure that GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME is set in all test scripts that currently rely on the initial branch name being `master by default. To determine which test scripts to mark up, the first step was to force-set the default branch name to `master` in - all test scripts that contain the keyword `master`, - t4211, which expects `t/t4211/history.export` with a hard-coded ref to initialize the default branch, - t5560 because it sources `t/t556x_common` which uses `master`, - t8002 and t8012 because both source `t/annotate-tests.sh` which also uses `master`) This trick was performed by this command: $ sed -i '/^ *\. \.\/\(test-lib\|lib-\(bash\|cvs\|git-svn\)\|gitweb-lib\)\.sh$/i\ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\ ' $(git grep -l master t/t[0-9]*.sh) \ t/t4211*.sh t/t5560*.sh t/t8002*.sh t/t8012*.sh After that, careful, manual inspection revealed that some of the test scripts containing the needle `master` do not actually rely on a specific default branch name: either they mention `master` only in a comment, or they initialize that branch specificially, or they do not actually refer to the current default branch. Therefore, the aforementioned modification was undone in those test scripts thusly: $ git checkout HEAD -- \ t/t0027-auto-crlf.sh t/t0060-path-utils.sh \ t/t1011-read-tree-sparse-checkout.sh \ t/t1305-config-include.sh t/t1309-early-config.sh \ t/t1402-check-ref-format.sh t/t1450-fsck.sh \ t/t2024-checkout-dwim.sh \ t/t2106-update-index-assume-unchanged.sh \ t/t3040-subprojects-basic.sh t/t3301-notes.sh \ t/t3308-notes-merge.sh t/t3423-rebase-reword.sh \ t/t3436-rebase-more-options.sh \ t/t4015-diff-whitespace.sh t/t4257-am-interactive.sh \ t/t5323-pack-redundant.sh t/t5401-update-hooks.sh \ t/t5511-refspec.sh t/t5526-fetch-submodules.sh \ t/t5529-push-errors.sh t/t5530-upload-pack-error.sh \ t/t5548-push-porcelain.sh \ t/t5552-skipping-fetch-negotiator.sh \ t/t5572-pull-submodule.sh t/t5608-clone-2gb.sh \ t/t5614-clone-submodules-shallow.sh \ t/t7508-status.sh t/t7606-merge-custom.sh \ t/t9302-fast-import-unpack-limit.sh We excluded one set of test scripts in these commands, though: the range of `git p4` tests. The reason? `git p4` stores the (foreign) remote branch in the branch called `p4/master`, which is obviously not the default branch. Manual analysis revealed that only five of these tests actually require a specific default branch name to pass; They were modified thusly: $ sed -i '/^ *\. \.\/lib-git-p4\.sh$/i\ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\ ' t/t980[0167]*.sh t/t9811*.sh Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-26Merge branch 'jc/describe-misnamed-annotated-tag'Junio C Hamano1-1/+19
When "git describe C" finds an annotated tag with tagname A to be the best name to explain commit C, and the tag is stored in a "wrong" place in the refs/tags hierarchy, e.g. refs/tags/B, the command gave a warning message but used A (not B) to describe C. If C is exactly at the tag, the describe output would be "A", but "git rev-parse A^0" would not be equal as "git rev-parse C^0". The behavior of the command has been changed to use the "long" form i.e. A-0-gOBJECTNAME, which is correctly interpreted by rev-parse. * jc/describe-misnamed-annotated-tag: describe: force long format for a name based on a mislocated tag
2020-02-26describe: don't abort too early when searching tagsBenno Evers1-0/+51
When searching the commit graph for tag candidates, `git-describe` will stop as soon as there is only one active branch left and it already found an annotated tag as a candidate. This works well as long as all branches eventually connect back to a common root, but if the tags are found across branches with no common ancestor B o----. \ o-----o---o----x A it can happen that the search on one branch terminates prematurely because a tag was found on another, independent branch. This scenario isn't quite as obscure as it sounds, since cloning with a limited depth often introduces many independent "dead ends" into the commit graph. The help text of `git-describe` states pretty clearly that when describing a commit D, the number appended to the emitted tag X should correspond to the number of commits found by `git log X..D`. Thus, this commit modifies the stopping condition to only abort the search when only one branch is left to search *and* all current best candidates are descendants from that branch. For repositories with a single root, this condition is always true: When the search is reduced to a single active branch, the current commit must be an ancestor of *all* tag candidates. This means that in the common case, this change will have no negative performance impact since the same number of commits as before will be traversed. Signed-off-by: Benno Evers <benno@bmevers.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-20describe: force long format for a name based on a mislocated tagJunio C Hamano1-1/+19
An annotated tag has two names: where it sits in the refs/tags hierarchy and the tagname recorded in the "tag" field in the object itself. They usually should match. Since 212945d4 ("Teach git-describe to verify annotated tag names before output", 2008-02-28), a commit described using an annotated tag bases its name on the tagname from the object. While this was a deliberate design decision to make it easier to converse about tags with others, even if the tags happen to be fetched to a different name than it was given by its creator, it had one downside. The output from "git describe", at least in the modern Git, should be usable as an object name to name the exact commit given to the "git describe" command. Using the tagname, when two names differ, breaks this property, when describing a commit that is directly pointed at by such a tag. An annotated tag Bob made as "v1.0" may sit at "refs/tags/v1.0-bob" in the ref hierarchy, and output from "git describe v1.0-bob^0" would say "v1.0", but there may not be any tag at "refs/tags/v1.0" locally or there may be another tag that points at a different object. Note that this won't be a problem if a commit being described is not directly pointed at by such a mislocated tag. In the example in the previous paragraph, describing a commit whose parent is v1.0-bob would result in "v1.0" (i.e. the tagname taken from the tag object) followed by "-1-gXXXXX" where XXXXX is the abbreviated object name, and a string that ends with "-g" followed by a hexadecimal string is an object name for the object whose name begins with hexadecimal string (as long as it is unique), so it does not matter if the leading part is "v1.0" or "v1.0-bob". Show the name in the long format, i.e. with "-0-gXXXXX" suffix, when the name we give is based on a mislocated annotated tag to ensure that the output can be used as the object name for the object originally given to the command to fix the issue. While at it, remove an overly cautious dead code to protect against an annotated tag object without the tagname. Such a tag is filtered out much earlier in the codeflow, and will not reach this part of the code. Helped-by: Matheus Tavares <matheus.bernardino@usp.br> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-09name-rev: eliminate recursion in name_rev()SZEDER Gábor1-1/+1
The name_rev() function calls itself recursively for each interesting parent of the commit it got as parameter, and, consequently, it can segfault when processing a deep history if it exhausts the available stack space. E.g. running 'git name-rev --all' and 'git name-rev HEAD~100000' in the gcc, gecko-dev, llvm, and WebKit repositories results in segfaults on my machine ('ulimit -s' reports 8192kB of stack size limit), and nowadays the former segfaults in the Linux repo as well (it reached the necessasry depth sometime between v5.3-rc4 and -rc5). Eliminate the recursion by inserting the interesting parents into a LIFO 'prio_queue' [1] and iterating until the queue becomes empty. Note that the parent commits must be added in reverse order to the LIFO 'prio_queue', so their relative order is preserved during processing, i.e. the first parent should come out first from the queue, because otherwise performance greatly suffers on mergy histories [2]. The stacksize-limited test 'name-rev works in a deep repo' in 't6120-describe.sh' demonstrated this issue and expected failure. Now the recursion is gone, so flip it to expect success. Also gone are the dmesg entries logging the segfault of that segfaulting 'git name-rev' process on every execution of the test suite. Note that this slightly changes the order of lines in the output of 'git name-rev --all', usually swapping two lines every 35 lines in git.git or every 150 lines in linux.git. This shouldn't matter in practice, because the output has always been unordered anyway. This patch is best viewed with '--ignore-all-space'. [1] Early versions of this patch used a 'commit_list', resulting in ~15% performance penalty for 'git name-rev --all' in 'linux.git', presumably because of the memory allocation and release for each insertion and removal. Using a LIFO 'prio_queue' has basically no effect on performance. [2] We prefer shorter names, i.e. 'v0.1~234' is preferred over 'v0.1^2~5', meaning that usually following the first parent of a merge results in the best name for its ancestors. So when later we follow the remaining parent(s) of a merge, and reach an already named commit, then we usually find that we can't give that commit a better name, and thus we don't have to visit any of its ancestors again. OTOH, if we were to follow the Nth parent of the merge first, then the name of all its ancestors would include a corresponding '^N'. Those are not the best names for those commits, so when later we reach an already named commit following the first parent of that merge, then we would have to update the name of that commit and the names of all of its ancestors as well. Consequently, we would have to visit many commits several times, resulting in a significant slowdown. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-06t6120: add a test to cover inner conditions in 'git name-rev's name_rev()SZEDER Gábor1-0/+41
In 'builtin/name-rev.c' in the name_rev() function there is a loop iterating over all parents of the given commit, and the loop body looks like this: if (parent_number > 1) { if (generation > 0) // branch #1 new_name = ... else // branch #2 new_name = ... name_rev(parent, new_name, ...); } else { // branch #3 name_rev(...); } These conditions are not covered properly in the test suite. As far as purely test coverage goes, they are all executed several times over in 't6120-describe.sh'. However, they don't directly influence the command's output, because the repository used in that test script contains several branches and tags pointing somewhere into the middle of the commit DAG, and thus result in a better name for the to-be-named commit. This can hide bugs: e.g. by replacing the 'new_name' parameter of the first recursive name_rev() call with 'tip_name' (effectively making both branch #1 and #2 a noop) 'git name-rev --all' shows thousands of bogus names in the Git repository, but the whole test suite still passes successfully. In an early version of a later patch in this series I managed to mess up all three branches (at once!), but the test suite still passed. So add a new test case that operates on the following history: A--------------master \ / \----------M2 \ / \---M1-C \ / B and names the commit 'B' to make sure that all three branches are crucial to determine 'B's name: - There is only a single ref, so all names are based on 'master', without any undesired interference from other refs. - Each time name_rev() follows the second parent of a merge commit, it appends "^2" to the name. Following 'master's second parent right at the start ensures that all commits on the ancestry path from 'master' to 'B' have a different base name from the original 'tip_name' of the very first name_rev() invocation. Currently, while name_rev() is recursive, it doesn't matter, but it will be necessary to properly cover all three branches after the recursion is eliminated later in this series. - Following 'M2's second parent makes sure that branch #2 (i.e. when 'generation = 0') affects 'B's name. - Following the only parent of the non-merge commit 'C' ensures that branch #3 affects 'B's name, and that it increments 'generation'. - Coming from 'C' 'generation' is 1, thus following 'M1's second parent makes sure that branch #1 affects 'B's name. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-06t6120-describe: modernize the 'check_describe' helperSZEDER Gábor1-6/+4
The 'check_describe' helper function runs 'git describe' outside of 'test_expect_success' blocks, with extra hand-rolled code to record and examine its exit code. Update this helper and move the 'git describe' invocation inside the 'test_expect_success' block. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-13t6120-describe: correct test repo history graph in commentSZEDER Gábor1-9/+10
At the top of 't6120-describe.sh' an ASCII graph illustrates the repository's history used in this test script. This graph is a bit misleading, because it swapped the second merge commit's first and second parents. When describing/naming a commit it does make a difference which parent is the first and which is the second/Nth, so update this graph to accurately represent that second merge. While at it, move this history graph from the 'test_description' variable to a regular comment. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-28name-rev: avoid cutoff timestamp underflowSZEDER Gábor1-0/+15
When 'git name-rev' is invoked with commit-ish parameters, it tries to save some work, and doesn't visit commits older than the committer date of the oldest given commit minus a one day worth of slop. Since our 'timestamp_t' is an unsigned type, this leads to a timestamp underflow when the committer date of the oldest given commit is within a day of the UNIX epoch. As a result the cutoff timestamp ends up far-far in the future, and 'git name-rev' doesn't visit any commits, and names each given commit as 'undefined'. Check whether subtracting the slop from the oldest committer date would lead to an underflow, and use no cutoff in that case. We don't have a TIME_MIN constant, dddbad728c (timestamp_t: a new data type for timestamps, 2017-04-26) didn't add one, so do it now. Note that the type of the cutoff timestamp variable used to be signed before 5589e87fd8 (name-rev: change a "long" variable to timestamp_t, 2017-05-20). The behavior was still the same even back then, but the underflow didn't happen when substracting the slop from the oldest committer date, but when comparing the signed cutoff timestamp with unsigned committer dates in name_rev(). IOW, this underflow bug is as old as 'git name-rev' itself. Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-21tests: fix unportable "\?" and "\+" regex syntaxÆvar Arnfjörð Bjarmason1-4/+4
Fix widely supported but non-POSIX basic regex syntax introduced in [1] and [2]. On GNU, NetBSD and FreeBSD the following works: $ echo xy >f $ grep 'xy\?' f; echo $? xy 0 The same goes for "\+". The "?" and "+" syntax is not in the BRE syntax, just in ERE, but on some implementations it can be invoked by prefixing the meta-operator with "\", but not on OpenBSD: $ uname -a OpenBSD obsd.my.domain 6.2 GENERIC#132 amd64 $ grep --version grep version 0.9 $ grep 'xy\?' f; echo $? 1 Let's fix this by moving to ERE syntax instead, where "?" and "+" are universally supported: $ grep -E 'xy?' f; echo $? xy 0 1. 2ed5c8e174 ("describe: setup working tree for --dirty", 2019-02-03) 2. c801170b0c ("t6120: test for describe with a bare repository", 2019-02-03) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-06Merge branch 'ss/describe-dirty-in-the-right-directory'Junio C Hamano1-0/+39
"git --work-tree=$there --git-dir=$here describe --dirty" did not work correctly as it did not pay attention to the location of the worktree specified by the user by mistake, which has been corrected. * ss/describe-dirty-in-the-right-directory: t6120: test for describe with a bare repository describe: setup working tree for --dirty
2019-02-04t6120: test for describe with a bare repositorySebastian Staudt1-0/+6
This ensures that nothing breaks the basic functionality of describe for bare repositories. Please note that --broken and --dirty need a working tree. Signed-off-by: Sebastian Staudt <koraktor@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-04describe: setup working tree for --dirtySebastian Staudt1-0/+33
We don't use NEED_WORK_TREE when running the git-describe builtin, since you should be able to describe a commit even in a bare repository. However, the --dirty flag does need a working tree. Since we don't call setup_work_tree(), it uses whatever directory we happen to be in. That's unlikely to match our index, meaning we'd say "dirty" even when the real working tree is clean. We can fix that by calling setup_work_tree() once we know that the user has asked for --dirty. The --broken option also needs a working tree. But because its implementation calls git-diff-index we don‘t have to setup the working tree in the git-describe process. Signed-off-by: Sebastian Staudt <koraktor@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-21tests: use 'test_must_be_empty' instead of 'test_cmp <empty> <out>'SZEDER Gábor1-2/+1
Using 'test_must_be_empty' is shorter and more idiomatic than >empty && test_cmp empty out as it saves the creation of an empty file. Furthermore, sometimes the expected empty file doesn't have such a descriptive name like 'empty', and its creation is far away from the place where it's finally used for comparison (e.g. in 't7600-merge.sh', where two expected empty files are created in the 'setup' test, but are used only about 500 lines later). These cases were found by instrumenting 'test_cmp' to error out the test script when it's used to compare empty files, and then converted manually. Note that even after this patch there still remain a lot of cases where we use 'test_cmp' to check empty files: - Sometimes the expected output is not hard-coded in the test, but 'test_cmp' is used to ensure that two similar git commands produce the same output, and that output happens to be empty, e.g. the test 'submodule update --merge - ignores --merge for new submodules' in 't7406-submodule-update.sh'. - Repetitive common tasks, including preparing the expected results and running 'test_cmp', are often extracted into a helper function, and some of this helper's callsites expect no output. - For the same reason as above, the whole 'test_expect_success' block is within a helper function, e.g. in 't3070-wildmatch.sh'. - Or 'test_cmp' is invoked in a loop, e.g. the test 'cvs update (-p)' in 't9400-git-cvsserver-server.sh'. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-14t: switch $_z40 to $ZERO_OIDbrian m. carlson1-1/+1
Switch all uses of $_z40 to $ZERO_OID so that they work correctly with larger hashes. This commit was created by using the following sed command to modify all files in the t directory except t/test-lib.sh: sed -i 's/\$_z40/$ZERO_OID/g' Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-27Merge branch 'sb/describe-blob'Junio C Hamano1-0/+8
"git describe $garbage" stopped giving any errors when the garbage happens to be a string with 40 hexadecimal letters. * sb/describe-blob: describe: confirm that blobs actually exist
2018-02-12describe: confirm that blobs actually existJeff King1-0/+8
Prior to 644eb60bd0 (builtin/describe.c: describe a blob, 2017-11-15), we noticed and complained about missing objects, since they were not valid commits: $ git describe 0000000000000000000000000000000000000000 fatal: 0000000000000000000000000000000000000000 is not a valid 'commit' object After that commit, we feed any non-commit to lookup_blob(), and complain only if it returns NULL. But the lookup_* functions do not actually look at the on-disk object database at all. They return an entry from the in-memory object hash if present (and if it matches the requested type), and otherwise auto-create a "struct object" of the requested type. A missing object would hit that latter case: we create a bogus blob struct, walk all of history looking for it, and then exit successfully having produced no output. One reason nobody may have noticed this is that some related cases do still work OK: 1. If we ask for a tree by sha1, then the call to lookup_commit_referecne_gently() would have parsed it, and we would have its true type in the in-memory object hash. 2. If we ask for a name that doesn't exist but isn't a 40-hex sha1, then get_oid() would complain before we even look at the objects at all. We can fix this by replacing the lookup_blob() call with a check of the true type via sha1_object_info(). This is not quite as efficient as we could possibly make this check. We know in most cases that the object was already parsed in the earlier commit lookup, so we could call lookup_object(), which does auto-create, and check the resulting struct's type (or NULL). However it's not worth the fragility nor code complexity to save a single object lookup. The new tests cover this case, as well as that of a tree-by-sha1 (which does work as described above, but was not explicitly tested). Noticed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Acked-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-23Merge branch 'dk/describe-all-output-fix'Junio C Hamano1-1/+5
An old regression in "git describe --all $annotated_tag^0" has been fixed. * dk/describe-all-output-fix: describe: prepend "tags/" when describing tags with embedded name
2017-12-27describe: prepend "tags/" when describing tags with embedded nameDaniel Knittl-Frank1-1/+5
The man page of the "git describe" command explains the expected output when using the --all option, i.e. the full reference path is shown, including heads/ or tags/ prefix. When 212945d4a85dfa172ea55ec73b1d830ef2d8582f ("Teach git-describe to verify annotated tag names before output") made Git favor the embedded name of annotated tags, it accidentally changed the output format when the --all flag is given, only printing the tag's name without the prefix. Check if --all was specified and re-add the "tags/" prefix for this special case to fix the regresssion. Signed-off-by: Daniel Knittl-Frank <knittl89+git@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-19builtin/describe.c: describe a blobStefan Beller1-0/+34
Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For example git describe --tags v0.99:Makefile conversion-901-g7672db20c2:Makefile tells us the Makefile as it was in v0.99 was introduced in commit 7672db20. The walking is performed in reverse order to show the introduction of a blob rather than its last occurrence. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>