summaryrefslogtreecommitdiffstats
path: root/submodule.c
AgeCommit message (Collapse)AuthorLines
2026-02-23refs: rename `each_ref_fn`Patrick Steinhardt-1/+1
Similar to the preceding commit, rename `each_ref_fn` to better match our current best practices around how we name things. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-13Merge branch 'ps/commit-list-functions-renamed'Junio C Hamano-2/+2
Rename three functions around the commit_list data structure. * ps/commit-list-functions-renamed: commit: rename `free_commit_list()` to conform to coding guidelines commit: rename `reverse_commit_list()` to conform to coding guidelines commit: rename `copy_commit_list()` to conform to coding guidelines
2026-02-05Merge branch 'ar/submodule-gitdir-tweak'Junio C Hamano-33/+190
Avoid local submodule repository directory paths overlapping with each other by encoding submodule names before using them as path components. * ar/submodule-gitdir-tweak: submodule: detect conflicts with existing gitdir configs submodule: hash the submodule name for the gitdir path submodule: fix case-folding gitdir filesystem collisions submodule--helper: fix filesystem collisions by encoding gitdir paths builtin/credential-store: move is_rfc3986_unreserved to url.[ch] submodule--helper: add gitdir migration command submodule: allow runtime enabling extensions.submodulePathConfig submodule: introduce extensions.submodulePathConfig builtin/submodule--helper: add gitdir command submodule: always validate gitdirs inside submodule_name_to_gitdir submodule--helper: use submodule_name_to_gitdir in add_submodule
2026-01-15commit: rename `free_commit_list()` to conform to coding guidelinesPatrick Steinhardt-2/+2
Our coding guidelines say that: Functions that operate on `struct S` are named `S_<verb>()` and should generally receive a pointer to `struct S` as first parameter. While most of the functions related to `struct commit_list` already follow that naming schema, `free_commit_list()` doesn't. Rename the function to address this and adjust all of its callers. Add a compatibility wrapper for the old function name to ease the transition and avoid any semantic conflicts with in-flight patch series. This wrapper will be removed once Git 2.53 has been released. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12submodule: detect conflicts with existing gitdir configsAdrian Ratiu-0/+61
Credit goes to Emily and Josh for testing and noticing a corner-case which caused conflicts with existing gitdir configs to silently pass validation, then fail later in add_submodule() with a cryptic error: fatal: A git directory for 'nested%2fsub' is found locally with remote(s): origin /.../trash directory.t7425-submodule-gitdir-path-extension/sub This change ensures the validation step checks existing gitdirs for conflicts. We only have to do this for submodules having gitdirs, because those without submodule.%s.gitdir need to be migrated and will throw an error earlier in the submodule codepath. Quoting Josh: My testing setup has been as follows: * Using our locally-built Git with our downstream patch of [1] included: * create a repo "sub" * create a repo "super" * In "super": * mkdir nested * git submodule add ../sub nested/sub * Verify that the submodule's gitdir is .git/modules/nested%2fsub * Using a build of git from upstream `next` plus this series: * git config set --global extensions.submodulepathconfig true * git clone --recurse-submodules super super2 * create a repo "nested%2fsub" * In "super2": * git submodule add ../nested%2fsub At this point I'd expect the collision detection / encoding to take effect, but instead I get the error listed above. End quote Suggested-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12submodule: fix case-folding gitdir filesystem collisionsAdrian Ratiu-1/+52
Add a new check when extension.submodulePathConfig is enabled, to detect and prevent case-folding filesystem colisions. When this new check is triggered, a stricter casefolding aware URI encoding is used to percent-encode uppercase characters. By using this check/retry mechanism the uppercase encoding is only applied when necessary, so case-sensitive filesystems are not affected. Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12submodule--helper: fix filesystem collisions by encoding gitdir pathsAdrian Ratiu-1/+41
Fix nested filesystem collisions by url-encoding gitdir paths stored in submodule.%s.gitdir, when extensions.submodulePathConfig is enabled. Credit goes to Junio and Patrick for coming up with this design: the encoding is only applied when necessary, to newly added submodules. Existing modules don't need the encoding because git already errors out when detecting nested gitdirs before this patch. This commit adds the basic url-encoding and some tests. Next commits extend the encode -> validate -> retry loop to fix more conflicts. Suggested-by: Junio C Hamano <gitster@pobox.com> Suggested-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12submodule: introduce extensions.submodulePathConfigAdrian Ratiu-26/+35
The idea of this extension is to abstract away the submodule gitdir path implementation: everyone is expected to use the config and not worry about how the path is computed internally, either in git or other implementations. With this extension enabled, the submodule.<name>.gitdir repo config becomes the single source of truth for all submodule gitdir paths. The submodule.<name>.gitdir config is added automatically for all new submodules when this extension is enabled. Git will throw an error if the extension is enabled and a config is missing, advising users how to migrate. Migration is manual for now. E.g. to add a missing config entry for an existing "foo" module: git config submodule.foo.gitdir .git/modules/foo Suggested-by: Junio C Hamano <gitster@pobox.com> Suggested-by: Phillip Wood <phillip.wood123@gmail.com> Suggested-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12submodule: always validate gitdirs inside submodule_name_to_gitdirAdrian Ratiu-8/+4
Move the ad-hoc validation checks sprinkled across the source tree, after calling submodule_name_to_gitdir() into the function proper, which now always validates the gitdir before returning it. This simplifies the API and helps to: 1. Avoid redundant validation calls after submodule_name_to_gitdir(). 2. Avoid the risk of callers forgetting to validate. 3. Ensure gitdir paths provided by users via configs are always valid (config gitdir paths are added in a subsequent commit). The validation function can still be called as many times as needed outside submodule_name_to_gitdir(), for example we keep two calls which are still required, to avoid parallel clone races by re-running the validation in builtin/submodule-helper.c. Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: introduce wrapper struct for `each_ref_fn`Patrick Steinhardt-9/+3
The `each_ref_fn` callback function type is used across our code base for several different functions that iterate through reference. There's a bunch of callbacks implementing this type, which makes any changes to the callback signature extremely noisy. An example of the required churn is e8207717f1 (refs: add referent to each_ref_fn, 2024-08-09): adding a single argument required us to change 48 files. It was already proposed back then [1] that we might want to introduce a wrapper structure to alleviate the pain going forward. While this of course requires the same kind of global refactoring as just introducing a new parameter, it at least allows us to more change the callback type afterwards by just extending the wrapper structure. One counterargument to this refactoring is that it makes the structure more opaque. While it is obvious which callsites need to be fixed up when we change the function type, it's not obvious anymore once we use a structure. That being said, we only have a handful of sites that actually need to populate this wrapper structure: our ref backends, "refs/iterator.c" as well as very few sites that invoke the iterator callback functions directly. Introduce this wrapper structure so that we can adapt the iterator interfaces more readily. [1]: <ZmarVcF5JjsZx0dl@tanuki> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-22treewide: use setup_revisions_from_strvec() when we have a strvecJeff King-1/+1
The previous commit introduced a wrapper to make using setup_revisions() with a strvec easier and safer. It converted spots that were already doing most of what the wrapper did. Let's now convert spots where we were not setting up the free_removed_argv_elements flag. As discussed in the previous commit, this probably isn't fixing any bugs or leaks (since these sites wouldn't trigger the re-shuffling of argv that causes them). This is mostly future-proofing us against setup_revisions() becoming more aggressive about its re-shuffling. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_set_in_file_gently()` wrapperPatrick Steinhardt-1/+1
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_set_in_file_gently()`. All callsites are adjusted so that they use `repo_config_set_in_file_gently(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `oid_object_info()`Patrick Steinhardt-3/+2
Rename `oid_object_info()` to `odb_read_object_info()` as well as their `_extended()` variant to match other functions related to the object database and our modern coding guidelines. Introduce compatibility wrappers so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: get rid of `the_repository` when handling submodule sourcesPatrick Steinhardt-26/+0
The "--recursive" flag for git-grep(1) allows users to grep for a string across submodule boundaries. To make this work we add each submodule's object sources to our own object database so that the objects can be accessed directly. The infrastructure for this depends on a global string list of submodule paths. The caller is expected to call `add_submodule_odb_by_path()` for each source and the object database will then eventually register all submodule sources via `do_oid_object_info_extended()` in case it isn't able to look up a specific object. This reliance on global state is of course suboptimal with regards to our libification efforts. Refactor the logic so that the list of submodule sources is instead tracked in the object database itself. This allows us to lose the condition of `r == the_repository` before registering submodule sources as we only ever add submodule sources to `the_repository` anyway. As such, behaviour before and after this refactoring should always be the same. Rename the functions accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: get rid of `the_repository` when handling alternatesPatrick Steinhardt-1/+2
The functions to manage alternates all depend on `the_repository`. Refactor them to accept an object database as a parameter and adjust all callers. The functions are renamed accordingly. Note that right now the situation is still somewhat weird because we end up using the object store path provided by the object store's repository anyway. Consequently, we could have instead passed in a pointer to the repository instead of passing in the pointer to the object store. This will be addressed in subsequent commits though, where we will start to use the path owned by the object store itself. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename files to "odb.{c,h}"Patrick Steinhardt-1/+1
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-store: merge "object-store-ll.h" and "object-store.h"Patrick Steinhardt-1/+1
The "object-store-ll.h" header has been introduced to keep transitive header dependendcies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-file: move `safe_create_leading_directories()` into "path.c"Patrick Steinhardt-1/+1
The `safe_create_leading_directories()` function and its relatives are located in "object-file.c", which is not a good fit as they provide generic functionality not related to objects at all. Move them into "path.c", which already hosts `safe_create_dir()` and its relative `safe_create_dir_in_gitdir()`. "path.c" is free of `the_repository`, but the moved functions depend on `the_repository` to read the "core.sharedRepository" config. Adapt the function signature to accept a repository as argument to fix the issue and adjust callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10hash: stop depending on `the_repository` in `null_oid()`Patrick Steinhardt-14/+14
The `null_oid()` function returns the object ID that only consists of zeroes. Naturally, this ID also depends on the hash algorithm used, as the number of zeroes is different between SHA1 and SHA256. Consequently, the function returns the hash-algorithm-specific null object ID. This is currently done by depending on `the_hash_algo`, which implicitly makes us depend on `the_repository`. Refactor the function to instead pass in the hash algorithm for which we want to retrieve the null object ID. Adapt callsites accordingly by passing in `the_repository`, thus bubbling up the dependency on that global variable by one layer. There are a couple of trivial exceptions for subsystems that already got rid of `the_repository`. These subsystems instead use the repository that is available via the calling context: - "builtin/grep.c" - "grep.c" - "refs/debug.c" There are also two non-trivial exceptions: - "diff-no-index.c": Here we know that we may not have a repository initialized at all, so we cannot rely on `the_repository`. Instead, we adapt `diff_no_index()` to get a `struct git_hash_algo` as parameter. The only caller is located in "builtin/diff.c", where we know to call `repo_set_hash_algo()` in case we're running outside of a Git repository. Consequently, it is fine to continue passing `the_repository->hash_algo` even in this case. - "builtin/ls-files.c": There is an in-flight patch series that drops `USE_THE_REPOSITORY_VARIABLE` in this file, which causes a semantic conflict because we use `null_oid()` in `show_submodule()`. The value is passed to `repo_submodule_init()`, which may use the object ID to resolve a tree-ish in the superproject from which we want to read the submodule config. As such, the object ID should refer to an object in the superproject, and consequently we need to use its hash algorithm. This means that we could in theory just not bother about this edge case at all and just use `the_repository` in "diff-no-index.c". But doing so would feel misdesigned. Remove the `USE_THE_REPOSITORY_VARIABLE` preprocessor define in "hash.c". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07submodule: refactor `submodule_to_gitdir()` to accept a repoPatrick Steinhardt-5/+6
The `submodule_to_gitdir()` function implicitly uses `the_repository` to resolve submodule paths. Refactor the function to instead accept a repo as parameter to remove the dependency on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07path: refactor `repo_git_path()` family of functionsPatrick Steinhardt-2/+2
As explained in an earlier commit, we're refactoring path-related functions to provide a consistent interface for computing paths into the commondir, gitdir and worktree. Refactor the "gitdir" family of functions accordingly. Note that the `repo_git_pathv()` function is converted into an internal implementation detail. It is only used to implement `the_repository` compatibility shims and will eventually be removed from the public interface. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-23Merge branch 'sk/calloc-not-malloc-plus-memset'Junio C Hamano-5/+5
Code clean-up. * sk/calloc-not-malloc-plus-memset: git: use calloc instead of malloc + memset where possible
2024-12-18git: use calloc instead of malloc + memset where possibleSeija Kijin-5/+5
Avoid calling malloc + memset by calling calloc. Signed-off-by: Seija Kijin <doremylover123@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06global: mark code units that generate warnings with `-Wsign-compare`Patrick Steinhardt-0/+1
Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12refspec: store raw refspecs inside refspec_itemJeff King-2/+2
The refspec struct keeps two matched arrays: one for the refspec_item structs and one for the original raw refspec strings. The main reason for this is that there are other users of refspec_item that do not care about the raw strings. But it does make managing the refspec struct awkward, as we must keep the two arrays in sync. This has led to bugs in the past (both leaks and double-frees). Let's just store a copy of the raw refspec string directly in each refspec_item struct. This simplifies the handling at a small cost: 1. Direct callers of refspec_item_init() will now get an extra copy of the refspec string, even if they don't need it. This should be negligible, as the struct is already allocating two strings for the parsed src/dst values (and we tend to only do it sparingly anyway for things like the TAG_REFSPEC literal). 2. Users of refspec_appendf() will now generate a temporary string, copy it, and then free the result (versus handing off ownership of the temporary string). We could get around this by having a "nodup" variant of refspec_item_init(), but it doesn't seem worth the extra complexity for something that is not remotely a hot code path. Code which accesses refspec->raw now needs to look at refspec->item.raw. Other callers which just use refspec_item directly can remain the same. We'll free the allocated string in refspec_item_clear(), which they should be calling anyway to free src/dst. One subtle note: refspec_item_init() can return an error, in which case we'll still have set its "raw" field. But that is also true of the "src" and "dst" fields, so any caller which does not _clear() the failed item is already potentially leaking. In practice most code just calls die() on an error anyway, but you can see the exception in valid_fetch_refspec(), which does correctly call _clear() even on error. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12refspec: drop separate raw_nr countJeff King-2/+2
A refspec struct contains zero or more refspec_item structs, along with matching "raw" strings. The items and raw strings are kept in separate arrays, but those arrays will always have the same length (because we write them only via refspec_append_nodup(), which grows both). This can lead to bugs when manipulating the array, since the arrays and lengths must be modified in lockstep. For example, the bug fixed in the previous commit, which forgot to decrement raw_nr. So let's get rid of "raw_nr" and have only "nr", making this kind of bug impossible (and also making it clear that the two are always matched, something that existing code already assumed but was not guaranteed by the interface). Even though we'd expect "alloc" and "raw_alloc" to likewise move in lockstep, we still need to keep separate counts there if we want to continue to use ALLOC_GROW() for both. Conceptually this would all be simpler if refspec_item just held onto its own raw string, and we had a single array. But there are callers which use refspec_item outside of "struct refspec" (and so don't hold on to a matching "raw" string at all), which we'd possibly need to adjust. So let's not worry about refactoring that for now, and just get rid of the redundant count variable. That is the first step on the road to combining them anyway. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02Merge branch 'ps/leakfixes-part-7'Junio C Hamano-2/+7
More leak-fixes. * ps/leakfixes-part-7: (23 commits) diffcore-break: fix leaking filespecs when merging broken pairs revision: fix leaking parents when simplifying commits builtin/maintenance: fix leak in `get_schedule_cmd()` builtin/maintenance: fix leaking config string promisor-remote: fix leaking partial clone filter grep: fix leaking grep pattern submodule: fix leaking submodule ODB paths trace2: destroy context stored in thread-local storage builtin/difftool: plug several trivial memory leaks builtin/repack: fix leaking configuration diffcore-order: fix leaking buffer when parsing orderfiles parse-options: free previous value of `OPTION_FILENAME` diff: fix leaking orderfile option builtin/pull: fix leaking "ff" option dir: fix off by one errors for ignored and untracked entries builtin/submodule--helper: fix leaking remote ref on errors t/helper: fix leaking subrepo in nested submodule config helper builtin/submodule--helper: fix leaking error buffer builtin/submodule--helper: clear child process when not running it submodule: fix leaking update strategy ...
2024-09-27submodule: fix leaking submodule ODB pathsPatrick Steinhardt-2/+2
In `add_submodule_odb_by_path()` we add a path into a global string list. The list is initialized with `NODUP`, which means that we do not pass ownership of strings to the list. But we use `xstrdup()` when we insert a path, with the consequence that the string will never get free'd. Plug the leak by marking the list as `DUP`. There is only a single callsite where we insert paths anyway, and as explained above that callsite was mishandling the allocation. This leak is exposed by t7814, but plugging it does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-27submodule: fix leaking update strategyPatrick Steinhardt-0/+5
We're not freeing the submodule update strategy command. Provide a helper function that does this for us and call it in `update_data_release()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23Merge branch 'ps/environ-wo-the-repository'Junio C Hamano-1/+1
Code clean-up. * ps/environ-wo-the-repository: (21 commits) environment: stop storing "core.notesRef" globally environment: stop storing "core.warnAmbiguousRefs" globally environment: stop storing "core.preferSymlinkRefs" globally environment: stop storing "core.logAllRefUpdates" globally refs: stop modifying global `log_all_ref_updates` variable branch: stop modifying `log_all_ref_updates` variable repo-settings: track defaults close to `struct repo_settings` repo-settings: split out declarations into a standalone header environment: guard state depending on a repository environment: reorder header to split out `the_repository`-free section environment: move `set_git_dir()` and related into setup layer environment: make `get_git_namespace()` self-contained environment: move object database functions into object layer config: make dependency on repo in `read_early_config()` explicit config: document `read_early_config()` and `read_very_early_config()` environment: make `get_git_work_tree()` accept a repository environment: make `get_graft_file()` accept a repository environment: make `get_index_file()` accept a repository environment: make `get_object_directory()` accept a repository environment: make `get_git_common_dir()` accept a repository ...
2024-09-12environment: make `get_git_common_dir()` accept a repositoryPatrick Steinhardt-1/+1
The `get_git_common_dir()` function retrieves the path to the common directory for `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05submodule: fix leaking fetch task dataPatrick Steinhardt-0/+2
The `submodule_parallel_fetch` structure contains various data structures that we use to set up parallel fetches of submodules. We do not free some of its data though, causing memory leaks. Plug those. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23Merge branch 'ps/config-wo-the-repository'Junio C Hamano-1/+1
Use of API functions that implicitly depend on the_repository object in the config subsystem has been rewritten to pass a repository object through the callchain. * ps/config-wo-the-repository: config: hide functions using `the_repository` by default global: prepare for hiding away repo-less config functions config: don't depend on `the_repository` with branch conditions config: don't have setters depend on `the_repository` config: pass repo to functions that rename or copy sections config: pass repo to `git_die_config()` config: pass repo to `git_config_get_expiry_in_days()` config: pass repo to `git_config_get_expiry()` config: pass repo to `git_config_get_max_percent_split_change()` config: pass repo to `git_config_get_split_index()` config: pass repo to `git_config_get_index_threads()` config: expose `repo_config_clear()` config: introduce missing setters that take repo as parameter path: hide functions using `the_repository` by default path: stop relying on `the_repository` in `worktree_git_path()` path: stop relying on `the_repository` when reporting garbage hooks: remove implicit dependency on `the_repository` editor: do not rely on `the_repository` for interactive edits path: expose `do_git_common_path()` as `repo_common_pathv()` path: expose `do_git_path()` as `repo_git_pathv()`
2024-08-15Merge branch 'jc/refs-symref-referent'Junio C Hamano-0/+2
The refs API has been taught to give symref target information to the users of ref iterators, allowing for-each-ref and friends to avoid an extra ref_resolve_* API call per a symbolic ref. * jc/refs-symref-referent: ref-filter: populate symref from iterator refs: add referent to each_ref_fn refs: keep track of unresolved reference value in iterators
2024-08-13config: pass repo to functions that rename or copy sectionsPatrick Steinhardt-1/+1
Refactor functions that rename or copy config sections to accept a `struct repository` such that we can get rid of the implicit dependency on `the_repository`. Rename the functions accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09refs: add referent to each_ref_fnJohn Cai-0/+2
Add a parameter to each_ref_fn so that callers to the ref APIs that use this function as a callback can have acess to the unresolved value of a symbolic ref. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08submodule: fix leaking seen submodule namesPatrick Steinhardt-0/+1
We keep track of submodules we have already seen via a string map such that we don't process the same submodule twice. We never free that map though, causing a memory leak. Fix this leak by clearing the map. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08submodule: fix leaking fetch tasksPatrick Steinhardt-10/+7
When done with a fetch task used for parallel fetches of submodules, we need to both call `fetch_task_release()` to release the task's contents and `free()` to release the task itself. Most sites do this already, but some only call `fetch_task_release()` and thus leak memory. While we could trivially fix this by adding the two missing calls to free(3P), the result would be that we always call both functions. Let's thus refactor the code such that `fetch_task_release()` also frees the structure itself. Rename it to `fetch_task_free()` accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14global: introduce `USE_THE_REPOSITORY_VARIABLE` macroPatrick Steinhardt-0/+2
Use of the `the_repository` variable is deprecated nowadays, and we slowly but steadily convert the codebase to not use it anymore. Instead, callers should be passing down the repository to work on via parameters. It is hard though to prove that a given code unit does not use this variable anymore. The most trivial case, merely demonstrating that there is no direct use of `the_repository`, is already a bit of a pain during code reviews as the reviewer needs to manually verify claims made by the patch author. The bigger problem though is that we have many interfaces that implicitly rely on `the_repository`. Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code units to opt into usage of `the_repository`. The intent of this macro is to demonstrate that a certain code unit does not use this variable anymore, and to keep it from new dependencies on it in future changes, be it explicit or implicit For now, the macro only guards `the_repository` itself as well as `the_hash_algo`. There are many more known interfaces where we have an implicit dependency on `the_repository`, but those are not guarded at the current point in time. Over time though, we should start to add guards as required (or even better, just remove them). Define the macro as required in our code units. As expected, most of our code still relies on the global variable. Nearly all of our builtins rely on the variable as there is no way yet to pass `the_repository` to their entry point. For now, declare the macro in "biultin.h" to keep the required changes at least a little bit more contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14hash: require hash algorithm in `empty_tree_oid_hex()`Patrick Steinhardt-3/+3
The `empty_tree_oid_hex()` function use `the_repository` to derive the hash function that shall be used. Require callers to pass in the hash algorithm to get rid of this implicit dependency. While at it, remove the unused `empty_blob_oid_hex()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17refs: pass repo when retrieving submodule ref storePatrick Steinhardt-1/+2
Looking up submodule ref stores has two deficiencies: - The initialized subrepo will be attributed to `the_repository`. - The submodule ref store will be tracked in a global map. This makes it impossible to have submodule ref stores for a repository other than `the_repository`. Modify the function to accept the parent repository as parameter and move the global map into `struct repository`. Like this it becomes possible to look up submodule ref stores for arbitrary repositories. Note that this also adds a new reference to `the_repository` in `resolve_gitlink_ref()`, which is part of the refs interfaces. This will get adjusted in the next patch. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-16Merge branch 'ps/refs-without-the-repository' into ↵Junio C Hamano-2/+4
ps/refs-without-the-repository-updates * ps/refs-without-the-repository: refs: remove functions without ref store cocci: apply rules to rewrite callers of "refs" interfaces cocci: introduce rules to transform "refs" to pass ref store refs: add `exclude_patterns` parameter to `for_each_fullref_in()` refs: introduce missing functions that accept a `struct ref_store`
2024-05-07cocci: apply rules to rewrite callers of "refs" interfacesPatrick Steinhardt-2/+4
Apply the rules that rewrite callers of "refs" interfaces to explicitly pass `struct ref_store`. The resulting patch has been applied with the `--whitespace=fix` option. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-29Sync with 2.44.1Johannes Schindelin-0/+89
* maint-2.44: (41 commits) Git 2.44.1 Git 2.43.4 Git 2.42.2 Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel ...
2024-04-19Sync with 2.42.2Johannes Schindelin-0/+89
* maint-2.42: (39 commits) Git 2.42.2 Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' ...
2024-04-19Sync with 2.41.1Johannes Schindelin-0/+89
* maint-2.41: (38 commits) Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' docs: document security issues around untrusted .git dirs ...
2024-04-19Sync with 2.40.2Johannes Schindelin-0/+89
* maint-2.40: (39 commits) Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' docs: document security issues around untrusted .git dirs upload-pack: disable lazy-fetching by default ...
2024-04-19Sync with 2.39.4Johannes Schindelin-0/+89
* maint-2.39: (38 commits) Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' docs: document security issues around untrusted .git dirs upload-pack: disable lazy-fetching by default fetch/clone: detect dubious ownership of local repositories ...
2024-04-17submodule: require the submodule path to contain directories onlyJohannes Schindelin-0/+72
Submodules are stored in subdirectories of their superproject. When these subdirectories have been replaced with symlinks by a malicious actor, all kinds of mayhem can be caused. This _should_ not be possible, but many CVEs in the past showed that _when_ possible, it allows attackers to slip in code that gets executed during, say, a `git clone --recursive` operation. Let's add some defense-in-depth to disallow submodule paths to have anything except directories in them. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17clone: prevent clashing git dirs when cloning submodule in parallelFilip Hejsek-0/+17
While it is expected to have several git dirs within the `.git/modules/` tree, it is important that they do not interfere with each other. For example, if one submodule was called "captain" and another submodule "captain/hooks", their respective git dirs would clash, as they would be located in `.git/modules/captain/` and `.git/modules/captain/hooks/`, respectively, i.e. the latter's files could clash with the actual Git hooks of the former. To prevent these clashes, and in particular to prevent hooks from being written and then executed as part of a recursive clone, we introduced checks as part of the fix for CVE-2019-1387 in a8dee3ca61 (Disallow dubiously-nested submodule git directories, 2019-10-01). It is currently possible to bypass the check for clashing submodule git dirs in two ways: 1. parallel cloning 2. checkout --recurse-submodules Let's check not only before, but also after parallel cloning (and before checking out the submodule), that the git dir is not clashing with another one, otherwise fail. This addresses the parallel cloning issue. As to the parallel checkout issue: It requires quite a few manual steps to create clashing git dirs because Git itself would refuse to initialize the inner one, as demonstrated by the test case. Nevertheless, let's teach the recursive checkout (namely, the `submodule_move_head()` function that is used by the recursive checkout) to be careful to verify that it does not use a clashing git dir, and if it does, disable it (by deleting the `HEAD` file so that subsequent Git calls won't recognize it as a git dir anymore). Note: The parallel cloning test case contains a `cat err` that proved to be highly useful when analyzing the racy nature of the operation (the operation can fail with three different error messages, depending on timing), and was left on purpose to ease future debugging should the need arise. Signed-off-by: Filip Hejsek <filip.hejsek@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>