git - Mirror of https://git.kernel.org/pub/scm/git/git.git/

Age	Commit message (Collapse)	Author	Files	Lines
2023-12-18	The second batch	Junio C Hamano	1	-0/+29
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-15	tests: adjust whitespace in chainlint expectations	Patrick Steinhardt	27	-74/+90
	The "check-chainlint" target runs automatically when running tests and performs self-checks to verify that the chainlinter itself produces the expected output. Originally, the chainlinter was implemented via sed, but the infrastructure has been rewritten in fb41727b7e (t: retire unused chainlint.sed, 2022-09-01) to use a Perl script instead. The rewrite caused some slight whitespace changes in the output that are ultimately not of much importance. In order to be able to assert that the actual chainlinter errors match our expectations we thus have to ignore whitespace characters when diffing them. As the `-w` flag is not in POSIX we try to use `git diff -w --no-index` before we fall back to `diff -w -u`. To accomodate for cases where the host system has no Git installation we use the locally-compiled version of Git. This can result in problems though when the Git project's repository is using extensions that the locally-compiled version of Git doesn't understand. It will refuse to run and thus cause the checks to fail. Instead of improving the detection logic, fix our ".expect" files so that we do not need any post-processing at all anymore. This allows us to drop the `-w` flag when diffing so that we can always use diff(1) now. Note that we keep some of the post-processing of `chainlint.pl` output intact to strip leading line numbers generated by the script. Having these would cause a rippling effect whenever we add a new test that sorts into the middle of existing tests and would require us to renumerate all subsequent lines, which seems rather pointless. Signed-off-by: Patrick Steinhardt <ps@pks.im> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14	mailinfo: avoid recursion when unquoting From headers	Jeff King	1	-2/+6
	Our unquote_comment() function is recursive; when it sees a comment within a comment, like: (this is an (embedded) comment) it recurses to handle the inner comment. This is fine for practical use, but it does mean that you can easily run out of stack space with a malicious header. For example: perl -e 'print "From: ", "(" x 2**18;' \| git mailinfo /dev/null /dev/null segfaults on my system. And since mailinfo is likely to be fed untrusted input from the Internet (if not by human users, who might recognize a garbage header, but certainly there are automated systems that apply patches from a list) it may be possible for an attacker to trigger the problem. That said, I don't think there's an interesting security vulnerability here. All an attacker can do is make it impossible to parse their email and apply their patch, and there are lots of ways to generate bogus emails. So it's more of an annoyance than anything. But it's pretty easy to fix it. The recursion is not helping us preserve any particular state from each level. The only flag in our parsing is take_next_literally, and we can never recurse when it is set (since the start of a new comment implies it was not backslash-escaped). So it is really only useful for finding the end of the matched pair of parentheses. We can do that easily with a simple depth counter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14	t5100: make rfc822 comment test more careful	Jeff King	2	-2/+2
	When processing "From" headers in an email, mailinfo "unquotes" quoted strings and rfc822 parenthesized comments. For quoted strings, we actually remove the double-quotes, so: From: "A U Thor" <someone@example.com> become: Author: A U Thor Email: someone@example.com But for comments, we leave the outer parentheses in place, so: From: A U (this is a comment) Thor <someone@example.com> becomes: Author: A U (this is a comment) Thor Email: someone@example.com So what is the comment "unquoting" actually doing? In our code, being in a comment section has exactly two effects: 1. We'll unquote backslash-escaped characters inside a comment section. 2. We _won't_ unquote double-quoted strings inside a comment section. Our test for comments in t5100 checks this: From: "A U Thor" <somebody@example.com> (this is $really$ a comment (honestly)) So it is covering (1), but not (2). Let's add in a quoted string to cover this. Moreover, because the comment appears at the end of the From header, there's nothing to confirm that we correctly found the end of the comment section (and not just the end-of-string). Let's instead move it to the beginning of the header, which means we can confirm that the existing quoted string is detected (which will only happen if we know we've left the comment block). As expected, the test continues to pass, but this will give us more confidence as we refactor the code in the next patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-13	checkout: forbid "-B <branch>" from touching a branch used elsewhere	Junio C Hamano	5	-3/+41
	"git checkout -B <branch> [<start-point>]", being a "forced" version of "-b", switches to the <branch>, after optionally resetting its tip to the <start-point>, even if the <branch> is in use in another worktree, which is somewhat unexpected. Protect the <branch> using the same logic that forbids "git checkout <branch>" from touching a branch that is in use elsewhere. This is a breaking change that may deserve backward compatibliity warning in the Release Notes. The "--ignore-other-worktrees" option can be used as an escape hatch if the finger memory of existing users depend on the current behaviour of "-B". Reported-by: Willem Verstraeten <willem.verstraeten@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12	mailinfo: fix out-of-bounds memory reads in unquote_quoted_pair()	Jeff King	2	-4/+26
	When processing a header like a "From" line, mailinfo uses unquote_quoted_pair() to handle double-quotes and rfc822 parenthesized comments. It takes a NUL-terminated string on input, and loops over the "in" pointer until it sees the NUL. When it finds the start of an interesting block, it delegates to helper functions which also increment "in", and return the updated pointer. But there's a bug here: the helpers find the NUL with a post-increment in the loop condition, like: while ((c = *in++) != 0) So when they do see a NUL (rather than the correct termination of the quote or comment section), they return "in" as one _past_ the NUL terminator. And thus the outer loop in unquote_quoted_pair() does not realize we hit the NUL, and keeps reading past the end of the buffer. We should instead make sure to return "in" positioned at the NUL, so that the caller knows to stop their loop, too. A hacky way to do this is to return "in - 1" after leaving the inner loop. But a slightly cleaner solution is to avoid incrementing "in" until we are sure it contained a non-NUL byte (i.e., doing it inside the loop body). The two tests here show off the problem. Since we check the output, they'll _usually_ report a failure in a normal build, but it depends on what garbage bytes are found after the heap buffer. Building with SANITIZE=address reliably notices the problem. The outcome (both the exit code and the exact bytes) are just what Git happens to produce for these cases today, and shouldn't be taken as an endorsement. It might be reasonable to abort on an unterminated string, for example. The priority for this patch is fixing the out-of-bounds memory access. Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11	reftable/block: reuse buffer to compute record keys	Patrick Steinhardt	2	-11/+10
	When iterating over entries in the block iterator we compute the key of each of the entries and write it into a buffer. We do not reuse the buffer though and thus re-allocate it on every iteration, which is wasteful. Refactor the code to reuse the buffer. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11	reftable/block: introduce macro to initialize `struct block_iter`	Patrick Steinhardt	5	-13/+14
	There are a bunch of locations where we initialize members of `struct block_iter`, which makes it harder than necessary to expand this struct to have additional members. Unify the logic via a new `BLOCK_ITER_INIT` macro that initializes all members. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11	reftable/merged: reuse buffer to compute record keys	Patrick Steinhardt	2	-15/+18
	When iterating over entries in the merged iterator's queue, we compute the key of each of the entries and write it into a buffer. We do not reuse the buffer though and thus re-allocate it on every iteration, which is wasteful given that we never transfer ownership of the allocated bytes outside of the loop. Refactor the code to reuse the buffer. This also fixes a potential memory leak when `merged_iter_advance_subiter()` returns an error. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11	reftable/stack: fix use of unseeded randomness	Patrick Steinhardt	2	-4/+4
	When writing a new reftable stack, Git will first create the stack with a random suffix so that concurrent updates will not try to write to the same file. This random suffix is computed via a call to rand(3P). But we never seed the function via srand(3P), which means that the suffix is in fact always the same. Fix this bug by using `git_rand()` instead, which does not need to be initialized. While this function is likely going to be slower depending on the platform, this slowness should not matter in practice as we only use it when writing a new reftable stack. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11	reftable/stack: fix stale lock when dying	Patrick Steinhardt	1	-32/+15
	When starting a transaction via `reftable_stack_init_addition()`, we create a lockfile for the reftable stack itself which we'll write the new list of tables to. But if we terminate abnormally e.g. via a call to `die()`, then we do not remove the lockfile. Subsequent executions of Git which try to modify references will thus fail with an out-of-date error. Fix this bug by registering the lock as a `struct tempfile`, which ensures automatic cleanup for us. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11	reftable/stack: reuse buffers when reloading stack	Patrick Steinhardt	1	-8/+4
	In `reftable_stack_reload_once()` we iterate over all the tables added to the stack in order to figure out whether any of the tables needs to be reloaded. We use a set of buffers in this context to compute the paths of these tables, but discard those buffers on every iteration. This is quite wasteful given that we do not need to transfer ownership of the allocated buffer outside of the loop. Refactor the code to instead reuse the buffers to reduce the number of allocations we need to do. Note that we do not have to manually reset the buffer because `stack_filename()` does this for us already. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11	reftable/stack: perform auto-compaction with transactional interface	Patrick Steinhardt	2	-0/+62
	Whenever updating references or reflog entries in the reftable stack, we need to add a new table to the stack, thus growing the stack's length by one. The stack can grow to become quite long rather quickly, leading to performance issues when trying to read records. But besides performance issues, this can also lead to exhaustion of file descriptors very rapidly as every single table requires a separate descriptor when opening the stack. While git-pack-refs(1) fixes this issue for us by merging the tables, it runs too irregularly to keep the length of the stack within reasonable limits. This is why the reftable stack has an auto-compaction mechanism: `reftable_stack_add()` will call `reftable_stack_auto_compact()` after its added the new table, which will auto-compact the stack as required. But while this logic works alright for `reftable_stack_add()`, we do not do the same in `reftable_addition_commit()`, which is the transactional equivalent to the former function that allows us to write multiple updates to the stack atomically. Consequentially, we will easily run into file descriptor exhaustion in code paths that use many separate transactions like e.g. non-atomic fetches. Fix this issue by calling `reftable_stack_auto_compact()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11	reftable/stack: verify that `reftable_stack_add()` uses auto-compaction	Patrick Steinhardt	1	-0/+49
	While we have several tests that check whether we correctly perform auto-compaction when manually calling `reftable_stack_auto_compact()`, we don't have any tests that verify whether `reftable_stack_add()` does call it automatically. Add one. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11	reftable: handle interrupted writes	Patrick Steinhardt	2	-4/+4
	There are calls to write(3P) where we don't properly handle interrupts. Convert them to use `write_in_full()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11	reftable: handle interrupted reads	Patrick Steinhardt	2	-2/+2
	There are calls to pread(3P) and read(3P) where we don't properly handle interrupts. Convert them to use `pread_in_full()` and `read_in_full()`, respectively. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11	reftable: wrap EXPECT macros in do/while	Patrick Steinhardt	1	-26/+32
	The `EXPECT` macros used by the reftable test framework are all using a single `if` statement with the actual condition. This results in weird syntax when using them in if/else statements like the following: ``` if (foo) EXPECT(foo == 2) else EXPECT(bar == 2) ``` Note that there need not be a trailing semicolon. Furthermore, it is not immediately obvious whether the else now belongs to the `if (foo)` or whether it belongs to the expanded `if (foo == 2)` from the macro. Fix this by wrapping the macros in a do/while loop. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11	show-ref: use die_for_incompatible_opt3()	René Scharfe	2	-10/+12
	Use the standard message for reporting the use of multiple mutually exclusive options by calling die_for_incompatible_opt3() instead of rolling our own. This has the benefits of showing only the actually given options, reducing the number of strings to translate and making the UI slightly more consistent. Adjust the test to no longer insist on a specific order of the reported options, as this implementation detail does not affect the usefulness of the error message. Reported-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: René Scharfe <l.s.r@web.de> Reviewed-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	Start the 2.44 cycle	Junio C Hamano	3	-2/+45
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	revision: parse integer arguments to --max-count, --skip, etc., more carefully	Junio C Hamano	3	-13/+57
	The "rev-list" and other commands in the "log" family, being the oldest part of the system, use their own custom argument parsers, and integer values of some options are parsed with atoi(), which allows a non-digit after the number (e.g., "1q") to be silently ignored. As a natural consequence, an argument that does not begin with a digit (e.g., "q") silently becomes zero, too. Switch to use strtol_i() and parse_timestamp() appropriately to catch bogus input. Note that one may naïvely expect that --max-count, --skip, etc., to only take non-negative values, but we must allow them to also take negative values, as an escape hatch to countermand a limit set by an earlier option on the command line; the underlying variables are initialized to (-1) and "--max-count=-1", for example, is a legitimate way to reinitialize the limit. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	sequencer: simplify away extra git_config_string() call	Jeff King	1	-13/+8
	In our config callback, we call git_config_string() to copy the incoming value string into a local string. But we don't modify or store that string; we just look at it and then free it. We can make the code simpler by just looking at the value passed into the callback. Note that we do need to check for NULL, which is the one bit of logic git_config_string() did for us. And I could even see an argument that we are abstracting any error-checking of the value behind the git_config_string() layer. But in practice no other callbacks behave this way; it is standard to check for NULL and then just look at the string directly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	gpg-interface: drop pointless config_error_nonbool() checks	Jeff King	1	-12/+3
	Config callbacks which use git_config_string() or git_config_pathname() have no need to check for a NULL value. This is handled automatically by those helpers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	push: drop confusing configset/callback redundancy	Jeff King	2	-33/+25
	We parse push config by calling git_config() with our git_push_config() callback. But inside that callback, when we see "push.gpgsign", we ignore the value passed into the callback and instead make a new call to git_config_get_value(). This is unnecessary at best, and slightly wrong at worst (if there are multiple instances, get_value() only returns one; both methods end up with last-one-wins, but we'd fail to report errors if earlier incarnations were bogus). The call was added by 68c757f219 (push: add a config option push.gpgSign for default signed pushes, 2015-08-19). That commit doesn't give any reason to deviate from the usual strategy here; it was probably just somebody unfamiliar with our config API and its conventions. It also added identical code to builtin/send-pack.c, which also handles push.gpgsign. And then the same issue spread to its neighbor in b33a15b081 (push: add recurseSubmodules config option, 2015-11-17), presumably via cargo-culting. This patch fixes all three to just directly use the value provided to the callback. While I was adjusting the code to do so, I noticed that push.gpgsign is overly careful about a NULL value. After git_parse_maybe_bool() has returned anything besides 1, we know that the value cannot be NULL (if it were, it would be an implicit "true", and many callers of maybe_bool rely on that). Here that lets us shorten "if (v && !strcasecmp(v, ...))" to just "if (!strcasecmp(v, ...))". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	config: use git_config_string() for core.checkRoundTripEncoding	Jeff King	3	-8/+4
	Since this code path was recently converted to check for a NULL value, it now behaves exactly like git_config_string(). We can shorten the code a bit by using that helper. Note that git_config_string() takes a const pointer, but our storage variable is non-const. We're better off making this "const", though, since the default value points to a string literal (and thus it would be an error if anybody tried to write to it). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	diff: give more detailed messages for bogus diff.* config	Jeff King	1	-2/+6
	The config callbacks for a few diff.* variables simply return -1 when we encounter an error. The message you get mentions the offending location, like: fatal: bad config variable 'diff.algorithm' in file '.git/config' at line 7 but is vague about "bad" (as it must be, since the message comes from the generic config code). Most callbacks add their own messages here, so let's do the same. E.g.: error: unknown value for config 'diff.algorithm': foo fatal: bad config variable 'diff.algorithm' in file '.git/config' at line 7 I've written the string in a way that should be reusable for translators, and matches another similar message in transport.c (there doesn't yet seem to be a popular generic message to reuse here, so hopefully this will get the ball rolling). Note that in the case of diff.algorithm, our parse_algorithm_value() helper does detect a NULL value string. But it's still worth detecting it ourselves here, since we can give a more specific error message (and which is the usual one for unexpected implicit-bool values). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	config: use config_error_nonbool() instead of custom messages	Jeff King	3	-3/+3
	A few config callbacks use their own custom messages to report an unexpected implicit bool like: [merge "foo"] driver These should just use config_error_nonbool(), so the user sees consistent messages. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	imap-send: don't use git_die_config() inside callback	Jeff King	1	-1/+1
	The point of git_die_config() is to let configset users mention the file/line info for invalid config, like: if (!git_config_get_int("foo.bar", &value)) { if (!is_ok(value)) git_die_config("foo.bar"); } Using it from within a config callback is unnecessary, because we can simply return an error, at which point the config machinery will mention the file/line of the offending variable. Worse, using git_die_config() can actually produce the wrong location when the key is found in multiple spots. For instance, with config like: [imap] host host = foo we'll report the line number of the "host = foo" line, but the problem is on the implicit-bool "host" line. We can fix it by just returning an error code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	git_xmerge_config(): prefer error() to die()	Jeff King	1	-3/+4
	When parsing merge config, a few code paths die on error. It's preferable for us to call error() here, because the resulting error message from the config parsing code contains much more detail. For example, before: fatal: unknown style 'bogus' given for 'merge.conflictstyle' and after: error: unknown style 'bogus' given for 'merge.conflictstyle' fatal: bad config variable 'merge.conflictstyle' in file '.git/config' at line 7 Since we're touching these lines, I also marked them for translation. There's no reason they shouldn't behave like most other config-parsing errors. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	config: reject bogus values for core.checkstat	Jeff King	1	-0/+3
	If you feed nonsense config like: git -c core.checkstat=foobar status we'll silently ignore the unknown value, rather than reporting an error. This goes all the way back to c08e4d5b5c (Enable minimal stat checking, 2013-01-22). Detecting and complaining now is technically a backwards-incompatible change, but I don't think anybody has any reason to use an invalid value here. There are no historical values we'd want to allow for backwards compatibility or anything like that. We are better off loudly telling the user that their config may not be doing what they expect. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	fsck: handle NULL value when parsing message config	Jeff King	3	-10/+21
	When parsing fsck., receive.fsck., or fetch.fsck.*, we don't check for an implicit bool. So any of: [fsck] badTree [receive "fsck"] badTree [fetch "fsck"] badTree will cause us to segfault. We can fix it with config_error_nonbool() in the usual way, but we have to make a few more changes to get good error messages. The problem is that all three spots do: if (skip_prefix(var, "fsck.", &var)) to match and parse the actual message id. But that means that "var" now just says "badTree" instead of "receive.fsck.badTree", making the resulting message confusing. We can fix that by storing the parsed message id in its own separate variable. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	trailer: handle NULL value when parsing trailer-specific config	Jeff King	1	-0/+6
	When parsing the "key", "command", and "cmd" trailer config, we just make a copy of the value string. If we see an implicit bool like: [trailer "foo"] key we'll segfault trying to copy a NULL pointer. We can fix this with the usual config_error_nonbool() check. I split this out from the other vanilla cases, because at first glance it looks like a better fix here would be to move the NULL check out of the switch statement. But it would change the behavior of other keys like trailer.*.ifExists, where an implicit bool is interpreted as EXISTS_DEFAULT. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	submodule: handle NULL value when parsing submodule.*.branch	Jeff King	1	-1/+3
	We record the submodule branch config value as a string, so config that uses an implicit bool like: [submodule "foo"] branch will cause us to segfault. Note that unlike most other config-parsing bugs of this class, this can be triggered by parsing a bogus .gitmodules file (which we might do after cloning a malicious repository). I don't think the security implications are important, though. It's always a strict NULL dereference, not an out-of-bounds read or write. So we should reliably kill the process. That may be annoying, but the impact is limited to the attacker preventing the victim from successfully using "git clone --recurse-submodules", etc, on the malicious repo. The "branch" entry is the only one with this problem; other strings like "path" and "url" already check for NULL. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	help: handle NULL value for alias.* config	Jeff King	1	-1/+4
	When showing all config with "git help --all", we print the list of defined aliases. But our config callback to do so does not check for a NULL value, meaning a config block like: [alias] foo will cause us to segfault. We should detect and complain about this in the usual way. Since this command is purely informational (and we aren't trying to run the alias), we could perhaps just generate a warning and continue. But this sort of misconfiguration should be pretty rare, and the error message we will produce points directly to the line of config that needs to be fixed. So just generating the usual error should be OK. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	trace2: handle NULL values in tr2_sysenv config callback	Jeff King	1	-0/+2
	If you have config with an implicit bool like: [trace2] envvars we'll segfault, as we unconditionally try to xstrdup() the value. We should instead detect and complain, as a boolean value has no meaning here. The same is true for every variable in tr2_sysenv_settings (and this patch covers them all, as we check them in a loop). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	setup: handle NULL value when parsing extensions	Jeff King	1	-0/+2
	The "partialclone" extension config records a string, and hence it is an error to have an implicit bool like: [extensions] partialclone in your config. We should recognize and reject this, rather than segfaulting (which is the current behavior). Note that it's OK to use config_error_nonbool() here, even though the return value is an enum. We explicitly document EXTENSION_ERROR as -1 for compatibility with error(), etc. This is the only extension value that has this problem. Most of the others are bools that interpret this value naturally. The exception is extensions.objectformat, which does correctly check for NULL. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	config: handle NULL value when parsing non-bools	Jeff King	11	-5/+47
	When the config parser sees an "implicit" bool like: [core] someVariable it passes NULL to the config callback. Any callback code which expects a string must check for NULL. This usually happens via helpers like git_config_string(), etc, but some custom code forgets to do so and will segfault. These are all fairly vanilla cases where the solution is just the usual pattern of: if (!value) return config_error_nonbool(var); though note that in a few cases we have to split initializers like: int some_var = initializer(); into: int some_var; if (!value) return config_error_nonbool(var); some_var = initializer(); There are still some broken instances after this patch, which I'll address on their own in individual patches after this one. Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	bisect: always clean on reset	Jeff King	2	-5/+10
	Usually "bisect reset" cleans up any refs/bisect/ refs, along with meta-files like .git/BISECT_LOG. But it only does so after deciding that a bisection is active, which it does by reading BISECT_START. This is usually fine, but it's possible to get into a confusing state if the BISECT_START file is gone, but other cruft is left (this might be due to a bug, or a system crash, etc). And since "bisect reset" refuses to do anything in this state, the user has no easy way to clean up the leftover cruft. While another "bisect start" would clear the state, in the interim it can be annoying, as other tools (like our bash prompt code) think we are bisecting, and for-each-ref output may be polluted with refs/bisect/ entries. Further adding to the confusion is that running "bisect reset $some_ref" skips the BISECT_START check. So it never realizes that there's no bisection active and does the cleanup anyway! So let's just make sure we always do the cleanup, whether we looked at BISECT_START or not. If the user doesn't give us a commit to reset to, we'll still say "We are not bisecting" and skip the call to "git checkout". Reported-by: Janik Haag <janik@aq0.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	parse-options: decouple "--end-of-options" and "--"	Jeff King	3	-2/+25
	When we added generic end-of-options support in 51b4594b40 (parse-options: allow --end-of-options as a synonym for "--", 2019-08-06), we made them true synonyms. They both stop option parsing, and they are both returned in the resulting argv if the KEEP_DASHDASH flag is used. The hope was that this would work for all callers: - most generic callers would not pass KEEP_DASHDASH, and so would just do the right thing (stop parsing there) without needing to know anything more. - callers with KEEP_DASHDASH were generally going to rely on setup_revisions(), which knew to handle --end-of-options specially But that turned out miss quite a few cases that pass KEEP_DASHDASH but do their own manual parsing. For example, "git reset", "git checkout", and so on want pass KEEP_DASHDASH so they can support: git reset $revs -- $paths but of course aren't going to actually do a traversal, so they don't call setup_revisions(). And those cases currently get confused by --end-of-options being left in place, like: $ git reset --end-of-options HEAD fatal: option '--end-of-options' must come before non-option arguments We could teach each of these callers to handle the leftover option explicitly. But let's try to be a bit more clever and see if we can solve it centrally in parse-options.c. The bogus assumption here is that KEEP_DASHDASH tells us the caller wants to see --end-of-options in the result. But really, the callers which need to know that --end-of-options was reached are those that may potentially parse more options from argv. In other words, those that pass the KEEP_UNKNOWN_OPT flag. If such a caller is aware of --end-of-options (e.g., because they call setup_revisions() with the result), then this will continue to do the right thing, treating anything after --end-of-options as a non-option. And if the caller is not aware of --end-of-options, they are better off keeping it intact, because either: 1. They are just passing the options along to somebody else anyway, in which case that somebody would need to know about the --end-of-options marker. 2. They are going to parse the remainder themselves, at which point choking on --end-of-options is much better than having it silently removed. The point is to avoid option injection from untrusted command line arguments, and bailing is better than quietly treating the untrusted argument as an option. This fixes bugs with --end-of-options across several commands, but I've focused on two in particular here: - t7102 confirms that "git reset --end-of-options --foo" now works. This checks two things. One, that we no longer barf on "--end-of-options" itself (which previously we did, even if the rev was something vanilla like "HEAD" instead of "--foo"). And two, that we correctly treat "--foo" as a revision rather than an option. This fix applies to any other cases which pass KEEP_DASHDASH but not KEEP_UNKNOWN_OPT, like "git checkout", "git check-attr", "git grep", etc, which would previously choke on "--end-of-options". - t9350 shows the opposite case: fast-export passed KEEP_UNKNOWN_OPT but not KEEP_DASHDASH, but then passed the result on to setup_revisions(). So it never saw --end-of-options, and would erroneously parse "fast-export --end-of-options --foo" as having a "--foo" option. This is now fixed. Note that this does shut the door for callers which want to know if we hit end-of-options, but don't otherwise need to keep unknown opts. The obvious thing here is feeding it to the DWIM verify_filename() machinery. And indeed, this is a problem even for commands which do understand --end-of-options already. For example, without this patch, you get: $ git log --end-of-options --foo fatal: option '--foo' must come before non-option arguments because we refuse to accept "--foo" as a filename (because it starts with a dash) even though we could know that we saw end-of-options. The verify_filename() function simply doesn't accept this extra information. So that is the status quo, and this patch doubles down further on that. Commands like "git reset" have the same problem, but they won't even know that parse-options saw --end-of-options! So even if we fixed verify_filename(), they wouldn't have anything to pass to it. But in practice I don't think this is a big deal. If you are being careful enough to use --end-of-options, then you should also be using "--" to disambiguate and avoid the DWIM behavior in the first place. In other words, doing: git log --end-of-options --this-is-a-rev -- --this-is-a-path works correctly, and will continue to do so. And likewise, with this patch now: git reset --end-of-options --this-is-a-rev -- --this-is-a-path will work, as well. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	worktree: simplify incompatibility message for --orphan and commit-ish	René Scharfe	1	-2/+2
	Use a single translatable string to report that the worktree add option --orphan is incompatible with a commit-ish instead of having the commit-ish in a separate translatable string. This reduces the number of strings to translate and gives translators the full context. A similar message is used in builtin/describe.c, but with the plural of commit-ish, and here we need the singular form. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	worktree: standardize incompatibility messages	René Scharfe	2	-9/+10
	Use the standard parameterized message for reporting incompatible options for worktree add. This reduces the number of strings to translate and makes the UI slightly more consistent. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	clean: factorize incompatibility message	René Scharfe	1	-1/+1
	Use the standard parameterized message for reporting incompatible options to inform users that they can't use -x and -X together. This reduces the number of strings to translate and makes the UI slightly more consistent. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	revision, rev-parse: factorize incompatibility messages about - -exclude-hidden	René Scharfe	4	-15/+22
	Use the standard parameterized message for reporting incompatible options to report options that are not accepted in combination with --exclude-hidden. This reduces the number of strings to translate and makes the UI a bit more consistent. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	revision: use die_for_incompatible_opt3() for - -graph/--reverse/--walk-reflogs	René Scharfe	1	-6/+3
	The revision option --reverse is incompatible with --walk-reflogs and --graph is incompatible with both --reverse and --walk-reflogs. So they are all incompatible with each other. Use the function for checking three mutually incompatible options, die_for_incompatible_opt3(), to perform this check in one place and without repetition. This is shorter and clearer. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	repack: use die_for_incompatible_opt3() for -A/-k/--cruft	René Scharfe	1	-10/+4
	The repack option --keep-unreachable is incompatible with -A, --cruft is incompatible with -A and -k, and -k is short for --keep-unreachable. So they are all incompatible with each other. Use the function for checking three mutually incompatible options, die_for_incompatible_opt3(), to perform this check in one place and without repetition. This is shorter and clearer. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09	push: use die_for_incompatible_opt4() for - -delete/--tags/--all/--mirror	René Scharfe	1	-8/+4
	The push option --delete is incompatible with --all, --mirror, and --tags; --tags is incompatible with --all and --mirror; --all is incompatible with --mirror. This means they are all incompatible with each other. And --branches is an alias for --all. Use the function for checking four mutually incompatible options, die_for_incompatible_opt4(), to perform this check in one place and without repetition. This is shorter and clearer. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03	completion: avoid user confusion in non-cone mode	Elijah Newren	2	-2/+96
	It is tempting to think of "files and directories" of the current directory as valid inputs to the add and set subcommands of git sparse-checkout. However, in non-cone mode, they often aren't and using them as potential completions leads to many forms of confusion: Issue #1. It provides the wrong files and directories. For git sparse-checkout add we always want to add files and directories not currently in our sparse checkout, which means we want file and directories not currently present in the current working tree. Providing the files and directories currently present is thus always wrong. For git sparse-checkout set we have a similar problem except in the subset of cases where we are trying to narrow our checkout to a strict subset of what we already have. That is not a very common scenario, especially since it often does not even happen to be true for the first use of the command; for years we required users to create a sparse-checkout via git sparse-checkout init git sparse-checkout set <args...> (or use a clone option that did the init step for you at clone time). The init command creates a minimal sparse-checkout with just the top-level directory present, meaning the set command has to be used to expand the checkout. Thus, only in a special and perhaps unusual cases would any of the suggestions from normal file and directory completion be appropriate. Issue #2: Suggesting patterns that lead to warnings is unfriendly. If the user specifies any regular file and omits the leading '/', then the sparse-checkout command will warn the user that their command is problematic and suggest they use a leading slash instead. Issue #3: Completion gets confused by leading '/', and provides wrong paths. Users often want to anchor their patterns to the toplevel of the repository, especially when listing individual files. There are a number of reasons for this, but notably even sparse-checkout encourages them to do so (as noted above). However, if users do so (via adding a leading '/' to their pattern), then bash completion will interpret the leading slash not as a request for a path at the toplevel of the repository, but as a request for a path at the root of the filesytem. That means at best that completion cannot help with such paths, and if it does find any completions, they are almost guaranteed to be wrong. Issue #4: Suggesting invalid patterns from subdirectories is unfriendly. There is no per-directory equivalent to .gitignore with sparse-checkouts. There is only a single worktree-global $GIT_DIR/info/sparse-checkout file. As such, paths to files must be specified relative to the toplevel of a repository. Providing suggestions of paths that are relative to the current working directory, as bash completion defaults to, is wrong when the current working directory is not the worktree toplevel directory. Issue #5: Paths with special characters will be interpreted incorrectly The entries in the sparse-checkout file are patterns, not paths. While most paths also qualify as patterns (though even in such cases it would be better for users to not use them directly but prefix them with a leading '/'), there are a variety of special characters that would need special escaping beyond the normal shell escaping: '*', '?', '\', '[', ']', and any leading '#' or '!'. If completion suggests any such paths, users will likely expect them to be treated as an exact path rather than as a pattern that might match some number of files other than 1. However, despite the first four issues, we can note that _if_ users are using tab completion, then they are probably trying to specify a path in the index. As such, we transform their argument into a top-level-rooted pattern that matches such a file. For example, if they type: git sparse-checkout add Make<TAB> we could "complete" to git sparse-checkout add /Makefile or, if they ran from the Documentation/technical/ subdirectory: git sparse-checkout add m<TAB> we could "complete" it to: git sparse-checkout add /Documentation/technical/multi-pack-index.txt Note in both cases I use "complete" in quotes, because we actually add characters both before and after the argument in question, so we are kind of abusing "bash completions" to be "bash completions AND beginnings". The fifth issue is a bit stickier, especially when you consider that we not only need to deal with escaping issues because of special meanings of patterns in sparse-checkout & gitignore files, but also that we need to consider escaping issues due to ls-files needing to sometimes quote or escape characters, and because the shell needs to escape some characters. The multiple interacting forms of escaping could get ugly; this patch makes no attempt to do so and simply documents that we decided to not deal with those corner cases for now but at least get the common cases right. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03	completion: avoid misleading completions in cone mode	Elijah Newren	1	-0/+20
	The "set" and "add" subcommands of "sparse-checkout", when in cone mode, should only complete on directories. For bash_completion in general, when no completions are returned for any subcommands, it will often fall back to standard completion of files and directories as a substitute. That is not helpful here. Since we have already looked for all valid completions, if none are found then falling back to standard bash file and directory completion is at best actively misleading. In fact, there are three different ways it can be actively misleading. Add a long comment in the code about how that fallback behavior can deceive, and disable the fallback by returning a fake result as the sole completion. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03	completion: fix logic for determining whether cone mode is active	Elijah Newren	1	-2/+10
	_git_sparse_checkout() was checking whether we were in cone mode by checking whether either: A) core.sparseCheckoutCone was "true" B) "--cone" was specified on the command line This code has 2 bugs I didn't catch in my review at the time 1) core.sparseCheckout must be "true" for core.sparseCheckoutCone to be relevant (which matters since "git sparse-checkout disable" only unsets core.sparseCheckout, not core.sparseCheckoutCone) 2) The presence of "--no-cone" should override any config setting Further, I forgot to update this logic as part of 2d95707a02 ("sparse-checkout: make --cone the default", 2022-04-22) for the new default. Update the code for the new default and make it be more careful in determining whether to complete based on cone mode or non-cone mode. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03	completion: squelch stray errors in sparse-checkout completion	Elijah Newren	1	-1/+1
	If, in the root of a project, one types git sparse-checkout set --cone ../<TAB> then an error message of the form fatal: ../: '../' is outside repository at '/home/newren/floss/git' is written to stderr, which munges the users view of their own command. Squelch such messages by using the __git() wrapper, designed for this purpose; see commit e15098a314 (completion: consolidate silencing errors from git commands, 2017-02-03) for more on the wrapper. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-03	hooks--pre-commit: detect non-ASCII when renaming	Julian Prein	1	-1/+1
	When diff.renames is turned on, the diff-filter will not return renamed files (or copied ones with diff.renames=copy) and potential non-ASCII characters would not be caught by this hook. Use the plumbing command diff-index instead of the porcelain one to not be affected by diff.rename. Signed-off-by: Julian Prein <druckdev@protonmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>