aboutsummaryrefslogtreecommitdiffstats
path: root/builtin/cat-file.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-08-05Merge branch 'ps/object-file-wo-the-repository'Junio C Hamano1-1/+1
Reduce implicit assumption and dependence on the_repository in the object-file subsystem. * ps/object-file-wo-the-repository: object-file: get rid of `the_repository` in index-related functions object-file: get rid of `the_repository` in `force_object_loose()` object-file: get rid of `the_repository` in `read_loose_object()` object-file: get rid of `the_repository` in loose object iterators object-file: remove declaration for `for_each_file_in_obj_subdir()` object-file: inline `for_each_loose_file_in_objdir_buf()` object-file: get rid of `the_repository` when writing objects odb: introduce `odb_write_object()` loose: write loose objects map via their source object-file: get rid of `the_repository` in `finalize_object_file()` object-file: get rid of `the_repository` in `loose_object_info()` object-file: get rid of `the_repository` when freshening objects object-file: inline `check_and_freshen()` functions object-file: get rid of `the_repository` in `has_loose_object()` object-file: stop using `the_hash_algo` object-file: fix -Wsign-compare warnings
2025-07-23config: drop `git_config()` wrapperPatrick Steinhardt1-1/+1
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config()`. All callsites are adjusted so that they use `repo_config(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-16object-file: get rid of `the_repository` in loose object iteratorsPatrick Steinhardt1-1/+1
The iterators for loose objects still rely on `the_repository`. Refactor them: - `for_each_loose_file_in_objdir()` is refactored so that the caller is now expected to pass an `odb_source` as parameter instead of the path to that source. Furthermore, it is renamed accordingly to `for_each_loose_file_in_source()`. - `for_each_loose_object()` is refactored to take in an object database now and calls the above function in a loop. This allows us to get rid of the global dependency. Adjust callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-15Merge branch 'ps/object-store'Junio C Hamano1-32/+30
Code clean-up around object access API. * ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`
2025-07-09Merge branch 'ps/object-store' into ps/object-file-wo-the-repositoryJunio C Hamano1-32/+30
* ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`
2025-07-01odb: rename `read_object_with_reference()`Patrick Steinhardt1-2/+2
Rename `read_object_with_reference()` to `odb_read_object_peeled()` to match other functions related to the object database and our modern coding guidelines. Furthermore though, the old name didn't really describe very well what this function actually does, which is to walk down any commit and tag objects until an object of the required type has been found. This is generally referred to as "peeling", so the new name should be way more descriptive. No compatibility wrapper is introduced as the function is not used a lot throughout our codebase. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `has_object()`Patrick Steinhardt1-2/+2
Rename `has_object()` to `odb_has_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `repo_read_object_file()`Patrick Steinhardt1-15/+11
Rename `repo_read_object_file()` to `odb_read_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `oid_object_info()`Patrick Steinhardt1-12/+14
Rename `oid_object_info()` to `odb_read_object_info()` as well as their `_extended()` variant to match other functions related to the object database and our modern coding guidelines. Introduce compatibility wrappers so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename files to "odb.{c,h}"Patrick Steinhardt1-1/+1
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-03cat-file.c: add batch handling for submodulesVictoria Dye1-1/+4
When an object specification is passed to 'cat-file --batch[-check]' referring to a submodule (e.g. 'HEAD:path/to/my/submodule'), the current behavior of the command is to print the "missing" error message. However, it is often valuable for callers to distinguish between paths that are actually missing and "the submodule tree entry exists, but the object does not exist in the repository". To disambiguate without needing to invoke a separate Git process (e.g. 'ls-tree'), print the message "<oid> submodule" for such objects instead of "<object> missing". In addition to the change from "missing" to "submodule", the new message differs from the old in that it always prints the resolved tree entry's OID, rather than the input object specification. Note that this implementation maintains a distinction between submodules where the commit OID is not present in the repo, and submodules where the commit OID *is* present; the former will now print "<object> submodule", but the latter will still print the full object content. Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-03cat-file: add %(objectmode) atomVictoria Dye1-2/+7
Add a formatting atom, used with the --batch-check/--batch-command options, that prints the octal representation of the object mode if a given revision includes that information, e.g. one that follows the format <tree-ish>:<path>. If the mode information does not exist, an empty string is printed instead. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16cat-file: use type enum instead of buffer for -t optionJeff King1-9/+4
Now that we no longer support OBJECT_INFO_ALLOW_UNKNOWN_TYPE, there is no need to pass a strbuf into oid_object_info_extended() to record the type. The regular object_type enum is sufficient to capture all of the types we will allow. This simplifies the code a bit, and will eventually let us drop object_info's type_name strbuf support. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16cat-file: make --allow-unknown-type a noopJeff King1-13/+5
The cat-file command has some minor support for handling objects with "unknown" types. I.e., strings that are not "blob", "commit", "tree", or "tag". In theory this could be used for debugging or experimenting with extensions to Git. But in practice this support is not very useful: 1. You can get the type and size of such objects, but nothing else. Not even the contents! 2. Only loose objects are supported, since packfiles use numeric ids for the types, rather than strings. 3. Likewise you cannot ever transfer objects between repositories, because they cannot be represented in the packfiles used for the on-the-wire protocol. The support for these unknown types complicates the object-parsing code, and has led to bugs such as b748ddb7a4 (unpack_loose_header(): fix infinite loop on broken zlib input, 2025-02-25). So let's drop it. The first step is to remove the user-facing parts, which are accessible only via cat-file. This is technically backwards-incompatible, but given the limitations listed above, these objects couldn't possibly be useful in any workflow. However, we can't just rip out the option entirely. That would hurt a caller who ran: git cat-file -t --allow-unknown-object <oid> and fed it normal, well-formed objects. There --allow-unknown-type was doing nothing, but we wouldn't want to start bailing with an error. So to protect any such callers, we'll retain --allow-unknown-type as a noop. The code change is fairly small (but we'll able to clean up more code in follow-on patches). The test updates drop any use of the option. We still retain tests that feed the broken objects to cat-file without --allow-unknown-type, as we should continue to confirm that those objects are rejected. Note that in one spot we can drop a layer of loop, re-indenting the body; viewing the diff with "-w" helps there. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-29treewide: convert users of `repo_has_object_file()` to `has_object()`Patrick Steinhardt1-1/+2
As the comment of `repo_has_object_file()` and its `_with_flags()` variant tells us, these functions are considered to be deprecated in favor of `has_object()`. There are a couple of slight benefits in favor of the replacement: - The new function has a short-and-sweet name. - More explicit defaults: `has_object()` doesn't fetch missing objects via promisor remotes, and neither does it reload packfiles if an object wasn't found by default. This ensures that it becomes immediately obvious when a simple object existence check may result in expensive actions. Most importantly though, it is confusing that we have two sets of functions that ultimately do the same thing, but with different defaults. Start sunsetting `repo_has_object_file()` and its `_with_flags()` sibling by replacing all callsites with `has_object()`: - `repo_has_object_file(...)` is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT)` is equivalent to `has_object(..., 0)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_SKIP_FETCH_OBJECT)` is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)` is equivalent to `has_object(..., HAS_OBJECT_FETCH_PROMISOR)`. The replacements should be functionally equivalent. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-24Merge branch 'ps/object-file-cleanup' into ps/object-store-cleanupJunio C Hamano1-1/+1
* ps/object-file-cleanup: object-store: merge "object-store-ll.h" and "object-store.h" object-store: remove global array of cached objects object: split out functions relating to object store subsystem object-file: drop `index_blob_stream()` object-file: split up concerns of `HASH_*` flags object-file: split out functions relating to object store subsystem object-file: move `xmmap()` into "wrapper.c" object-file: move `git_open_cloexec()` to "compat/open.c" object-file: move `safe_create_leading_directories()` into "path.c" object-file: move `mkdir_in_gitdir()` into "path.c"
2025-04-15object-store: merge "object-store-ll.h" and "object-store.h"Patrick Steinhardt1-1/+1
The "object-store-ll.h" header has been introduced to keep transitive header dependendcies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07builtin/cat-file: use bitmaps to efficiently filter by object typePatrick Steinhardt1-5/+37
While it is now possible to filter objects by type, this mechanism is for now mostly a convenience. Most importantly, we still have to iterate through the whole packfile to find all objects of a specific type. This can be prohibitively expensive depending on the size of the packfiles. It isn't really possible to do better than this when only considering a packfile itself, as the order of objects is not fixed. But when we have a packfile with a corresponding bitmap, either because the packfile itself has one or because the multi-pack index has a bitmap for it, then we can use these bitmaps to improve the runtime. While bitmaps are typically used to compute reachability of objects, they also contain one bitmap per object type that encodes which object has what type. So instead of reading through the whole packfile(s), we can use the bitmaps and iterate through the type-specific bitmap. Typically, only a subset of packfiles will have a bitmap. But this isn't really much of a problem: we can use bitmaps when available, and then use the non-bitmap walk for every packfile that isn't covered by one. Overall, this leads to quite a significant speedup depending on how many objects of a certain type exist. The following benchmarks have been executed in the Chromium repository, which has a 50GB packfile with almost 25 million objects. As expected, there isn't really much of a change in performance without an object filter: Benchmark 1: cat-file with no-filter (revision = HEAD~) Time (mean ± σ): 89.675 s ± 4.527 s [User: 40.807 s, System: 10.782 s] Range (min … max): 83.052 s … 96.084 s 10 runs Benchmark 2: cat-file with no-filter (revision = HEAD) Time (mean ± σ): 88.991 s ± 2.488 s [User: 42.278 s, System: 10.305 s] Range (min … max): 82.843 s … 91.271 s 10 runs Summary cat-file with no-filter (revision = HEAD) ran 1.01 ± 0.06 times faster than cat-file with no-filter (revision = HEAD~) We still have to scan through all objects as we yield all of them, so using the bitmap in this case doesn't really buy us anything. What is noticeable in this benchmark is that we're I/O-bound, not CPU-bound, as can be seen from the user/system runtimes, which combined are way lower than the overall benchmarked runtime. But when we do use a filter we can see a significant improvement: Benchmark 1: cat-file with filter=object:type=commit (revision = HEAD~) Time (mean ± σ): 86.444 s ± 4.081 s [User: 36.830 s, System: 11.312 s] Range (min … max): 80.305 s … 93.104 s 10 runs Benchmark 2: cat-file with filter=object:type=commit (revision = HEAD) Time (mean ± σ): 2.089 s ± 0.015 s [User: 1.872 s, System: 0.207 s] Range (min … max): 2.073 s … 2.119 s 10 runs Summary cat-file with filter=object:type=commit (revision = HEAD) ran 41.38 ± 1.98 times faster than cat-file with filter=object:type=commit (revision = HEAD~) This is because we don't have to scan through all packfiles anymore, but can instead directly look up relevant objects. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07builtin/cat-file: deduplicate logic to iterate over all objectsPatrick Steinhardt1-37/+48
Pull out a common function that allows us to iterate over all objects in a repository. Right now the logic is trivial and would only require two function calls, making this refactoring a bit pointless. But in the next commit we will iterate on this logic to make use of bitmaps, so this is about to become a bit more complex. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07builtin/cat-file: support "object:type=" objects filterPatrick Steinhardt1-1/+11
Implement support for the "object:type=" filter in git-cat-file(1), which causes us to omit all objects that don't match the provided object type. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07builtin/cat-file: support "blob:limit=" objects filterPatrick Steinhardt1-1/+14
Implement support for the "blob:limit=" filter in git-cat-file(1), which causes us to omit all blobs that are bigger than a certain size. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07builtin/cat-file: support "blob:none" objects filterPatrick Steinhardt1-1/+14
Implement support for the "blob:none" filter in git-cat-file(1), which causes us to omit all blobs. Note that this new filter requires us to read the object type via `oid_object_info_extended()` in `batch_object_write()`. But as we try to optimize away reading objects from the database the `data->info.typep` pointer may not be set. We thus have to adapt the logic to conditionally set the pointer in cases where the filter is given. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07builtin/cat-file: wire up an option to filter objectsPatrick Steinhardt1-4/+32
In batch mode, git-cat-file(1) enumerates all objects and prints them by iterating through both loose and packed objects. This works without considering their reachability at all, and consequently most options to filter objects as they exist in e.g. git-rev-list(1) are not applicable. In some situations it may still be useful though to filter objects based on properties that are inherent to them. This includes the object size as well as its type. Such a filter already exists in git-rev-list(1) with the `--filter=` command line option. While this option supports a couple of filters that are not applicable to our usecase, some of them are quite a neat fit. Wire up the filter as an option for git-cat-file(1). This allows us to reuse the same syntax as in git-rev-list(1) so that we don't have to reinvent the wheel. For now, we die when any of the filter options has been passed by the user, but they will be wired up in subsequent commits. Further note that the filters that we are about to introduce don't significantly speed up the runtime of git-cat-file(1). While we can skip emitting a lot of objects in case they are uninteresting to us, the majority of time is spent reading the packfile, which is bottlenecked by I/O and not the processor. This will change though once we start to make use of bitmaps, which will allow us to skip reading the whole packfile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07builtin/cat-file: introduce function to report object statusPatrick Steinhardt1-5/+13
We have multiple callsites that report the status of an object, for example when the objec tis missing or its name is ambiguous. We're about to add a couple more such callsites to report on "excluded" objects. Prepare for this by introducing a new function `report_object_status()` that encapsulates the functionality. Note that this function also flushes stdout, which is a requirement so that request-response style batched modes can learn about the status before proceeding to the next object. We already flush correctly at all existing callsites, even though the flush in `batch_one_object()` only comes after the switch statement. That flush is now redundant, and we could in theory deduplicate it by moving it into all branches that don't use `report_object_status()`. But that doesn't quite feel sensible: - The duplicate flush should ultimately just be a no-op for us and thus shouldn't impact performance significantly. - By keeping the flush in `report_object_status()` we ensure that all future callers get semantics correct. So let's just be pragmatic and live with the duplicated flush. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07builtin/cat-file: rename variable that tracks usagePatrick Steinhardt1-22/+25
The usage strings for git-cat-file(1) that we pass to `parse_options()` and `usage_msg_optf()` are stored in a variable called `usage`. This variable shadows the declaration of `usage()`, which we'll want to use in a subsequent commit. Rename the variable to `builtin_catfile_usage`, which is in line with how the variable is typically called in other builtins. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-23Merge branch 'ps/build-sign-compare'Junio C Hamano1-0/+3
Start working to make the codebase buildable with -Wsign-compare. * ps/build-sign-compare: t/helper: don't depend on implicit wraparound scalar: address -Wsign-compare warnings builtin/patch-id: fix type of `get_one_patchid()` builtin/blame: fix type of `length` variable when emitting object ID gpg-interface: address -Wsign-comparison warnings daemon: fix type of `max_connections` daemon: fix loops that have mismatching integer types global: trivial conversions to fix `-Wsign-compare` warnings pkt-line: fix -Wsign-compare warning on 32 bit platform csum-file: fix -Wsign-compare warning on 32-bit platform diff.h: fix index used to loop through unsigned integer config.mak.dev: drop `-Wno-sign-compare` global: mark code units that generate warnings with `-Wsign-compare` compat/win32: fix -Wsign-compare warning in "wWinMain()" compat/regex: explicitly ignore "-Wsign-compare" warnings git-compat-util: introduce macros to disable "-Wsign-compare" warnings
2024-12-06global: mark code units that generate warnings with `-Wsign-compare`Patrick Steinhardt1-0/+3
Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04packfile: pass down repository to `for_each_packed_object`Karthik Nayak1-3/+4
The function `for_each_packed_object` currently relies on the global variable `the_repository`. To eliminate global variable usage in `packfile.c`, we should progressively shift the dependency on the_repository to higher layers. Let's remove its usage from this function and closely related function `is_promisor_object`. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23Merge branch 'jc/pass-repo-to-builtins'Junio C Hamano1-3/+6
The convention to calling into built-in command implementation has been updated to pass the repository, if known, together with the prefix value. * jc/pass-repo-to-builtins: add: pass in repo variable instead of global the_repository builtin: remove USE_THE_REPOSITORY for those without the_repository builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.h builtin: add a repository parameter for builtin functions
2024-09-13builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.hJohn Cai1-1/+1
Instead of including USE_THE_REPOSITORY_VARIABLE by default on every builtin, remove it from builtin.h and add it to all the builtins that include builtin.h (by definition, that means all builtins/*.c). Also, remove the include statement for repository.h since it gets brought in through builtin.h. The next step will be to migrate each builtin from having to use the_repository. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13builtin: add a repository parameter for builtin functionsJohn Cai1-2/+5
In order to reduce the usage of the global the_repository, add a parameter to builtin functions that will get passed a repository variable. This commit uses UNUSED on most of the builtin functions, as subsequent commits will modify the actual builtins to pass the repository parameter down. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04builtin/cat-file: mark 'git cat-file' sparse-index compatibleKevin Lyles1-0/+3
This change affects how 'git cat-file' works with the index when specifying an object with the ":<path>" syntax (which will give file contents from the index). 'git cat-file' expands a sparse index to a full index any time contents are requested from the index by specifying an object with the ":<path>" syntax. This is true even when the requested file is part of the sparse index, and results in much slower 'git cat-file' operations when working within the sparse index. Mark 'git cat-file' as not needing a full index, so that you only pay the cost of expanding the sparse index to a full index when you request a file outside of the sparse index. Add tests to ensure both that: - 'git cat-file' returns the correct file contents whether or not the file is in the sparse index - 'git cat-file' expands to the full index any time you request something outside of the sparse index Signed-off-by: Kevin Lyles <klyles+github@epic.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11object-name: free leaking object contextsPatrick Steinhardt1-6/+11
While it is documented in `struct object_context::path` that this variable needs to be released by the caller, this fact is rather easy to miss given that we do not ever provide a function to release the object context. And of course, while some callers dutifully release the path, many others don't. Introduce a new `object_context_release()` function that releases the path. Convert callsites that used to free the path to use that new function and add missing calls to callsites that were leaking memory. Refactor those callsites as required to have a single return path, only. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18builtin: stop using `the_index`Patrick Steinhardt1-2/+2
Convert builtins to use `the_repository->index` instead of `the_index`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03Merge branch 'rs/strbuf-expand-bad-format'Junio C Hamano1-8/+8
Code clean-up. * rs/strbuf-expand-bad-format: cat-file: use strbuf_expand_bad_format() factor out strbuf_expand_bad_format()
2024-03-28Merge branch 'eb/hash-transition'Junio C Hamano1-3/+9
Work to support a repository that work with both SHA-1 and SHA-256 hash algorithms has started. * eb/hash-transition: (30 commits) t1016-compatObjectFormat: add tests to verify the conversion between objects t1006: test oid compatibility with cat-file t1006: rename sha1 to oid test-lib: compute the compatibility hash so tests may use it builtin/ls-tree: let the oid determine the output algorithm object-file: handle compat objects in check_object_signature tree-walk: init_tree_desc take an oid to get the hash algorithm builtin/cat-file: let the oid determine the output algorithm rev-parse: add an --output-object-format parameter repository: implement extensions.compatObjectFormat object-file: update object_info_extended to reencode objects object-file-convert: convert commits that embed signed tags object-file-convert: convert commit objects when writing object-file-convert: don't leak when converting tag objects object-file-convert: convert tag objects when writing object-file-convert: add a function to convert trees between algorithms object: factor out parse_mode out of fast-import and tree-walk into in object.h cache: add a function to read an OID of a specific algorithm tag: sign both hashes commit: export add_header_signature to support handling signatures on tags ...
2024-03-25cat-file: use strbuf_expand_bad_format()René Scharfe1-8/+8
Report unknown format elements and missing closing parentheses with consistent and translated messages by calling strbuf_expand_bad_format() at the very end of the combined if/else chain of expand_format() and expand_atom(). Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-14Merge branch 'js/check-null-from-read-object-file'Junio C Hamano1-2/+8
The code paths that call repo_read_object_file() have been tightened to react to errors. * js/check-null-from-read-object-file: Always check the return value of `repo_read_object_file()`
2024-02-06Always check the return value of `repo_read_object_file()`Johannes Schindelin1-2/+8
There are a couple of places in Git's source code where the return value is not checked. As a consequence, they are susceptible to segmentation faults. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26treewide: remove unnecessary includes in source filesElijah Newren1-1/+0
Each of these were checked with gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE} to ensure that removing the direct inclusion of the header actually resulted in that header no longer being included at all (i.e. that no other header pulled it in transitively). ...except for a few cases where we verified that although the header was brought in transitively, nothing from it was directly used in that source file. These cases were: * builtin/credential-cache.c * builtin/pull.c * builtin/send-pack.c Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09doc/cat-file: make synopsis and description less confusingŠtěpán Němec1-2/+2
The DESCRIPTION's "first form" is actually the 1st, 2nd, 3rd and 5th form in SYNOPSIS, the "second form" is the 4th one. Interestingly, this state of affairs was introduced in 97fe7250753b (cat-file docs: fix SYNOPSIS and "-h" output, 2021-12-28) with the claim of "Now the two will match again." ("the two" being DESCRIPTION and SYNOPSIS)... The description also suffers from other correctness and clarity issues, e.g., the "first form" paragraph discusses -p, -s and -t, but leaves out -e, which is included in the corresponding SYNOPSIS section; the second paragraph mentions <format>, which doesn't occur in SYNOPSIS at all, and of the three batch options, really only describes the behavior of --batch-check. Also the mention of "drivers" seems an implementation detail not adding much clarity in a short summary (and isn't expanded upon in the rest of the man page, either). Rather than trying to maintain one-to-one (or N-to-M) correspondence between the DESCRIPTION and SYNOPSIS forms, creating duplication and providing opportunities for error, shorten the former into a concise summary describing the two general modes of operation: batch and non-batch, leaving details to the subsequent manual sections. While here, fix a grammar error in the description of -e and make the following further minor improvements: NAME: shorten ("content or type and size" isn't the whole story; say "details" and leave the actual details to later sections) SYNOPSIS and --help: move the (--textconv | --filters) form before --batch, closer to the other non-batch forms Signed-off-by: Štěpán Němec <stepnem@smrk.net> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02builtin/cat-file: let the oid determine the output algorithmEric W. Biederman1-3/+9
Use GET_OID_HASH_ANY when calling get_oid_with_context. This implements the semi-obvious behaviour that specifying a sha1 oid shows the output for a sha1 encoded object, and specifying a sha256 oid shows the output for a sha256 encoded object. This is useful for testing the the conversion of an object to an equivalent object encoded with a different hash function. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-17Merge branch 'cw/compat-util-header-cleanup'Junio C Hamano1-1/+0
Further shuffling of declarations across header files to streamline file dependencies. * cw/compat-util-header-cleanup: git-compat-util: move alloc macros to git-compat-util.h treewide: remove unnecessary includes for wrapper.h kwset: move translation table from ctype sane-ctype.h: create header for sane-ctype macros git-compat-util: move wrapper.c funcs to its header git-compat-util: move strbuf.c funcs to its header
2023-07-06Merge branch 'gc/config-context'Junio C Hamano1-2/+3
Reduce reliance on a global state in the config reading API. * gc/config-context: config: pass source to config_parser_event_fn_t config: add kvi.path, use it to evaluate includes config.c: remove config_reader from configsets config: pass kvi to die_bad_number() trace2: plumb config kvi config.c: pass ctx with CLI config config: pass ctx with config files config.c: pass ctx in configsets config: add ctx arg to config_fn_t urlmatch.h: use config_fn_t type config: inline git_color_default_config
2023-07-06Merge branch 'rs/strbuf-expand-step'Junio C Hamano1-18/+17
Code clean-up around strbuf_expand() API. * rs/strbuf-expand-step: strbuf: simplify strbuf_expand_literal_cb() replace strbuf_expand() with strbuf_expand_step() replace strbuf_expand_dict_cb() with strbuf_expand_step() strbuf: factor out strbuf_expand_step() pretty: factor out expand_separator()
2023-07-05git-compat-util: move alloc macros to git-compat-util.hCalvin Wan1-1/+0
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for dynamic array allocation. Moving these macros to git-compat-util.h with the other alloc macros focuses alloc.[ch] to allocation for Git objects and additionally allows us to remove inclusions to alloc.h from files that solely used the above macros. Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29Merge branch 'en/header-split-cache-h-part-3'Junio C Hamano1-3/+2
Header files cleanup. * en/header-split-cache-h-part-3: (28 commits) fsmonitor-ll.h: split this header out of fsmonitor.h hash-ll, hashmap: move oidhash() to hash-ll object-store-ll.h: split this header out of object-store.h khash: name the structs that khash declares merge-ll: rename from ll-merge git-compat-util.h: remove unneccessary include of wildmatch.h builtin.h: remove unneccessary includes list-objects-filter-options.h: remove unneccessary include diff.h: remove unnecessary include of oidset.h repository: remove unnecessary include of path.h log-tree: replace include of revision.h with simple forward declaration cache.h: remove this no-longer-used header read-cache*.h: move declarations for read-cache.c functions from cache.h repository.h: move declaration of the_index from cache.h merge.h: move declarations for merge.c from cache.h diff.h: move declaration for global in diff.c from cache.h preload-index.h: move declarations for preload-index.c from elsewhere sparse-index.h: move declarations for sparse-index.c from cache.h name-hash.h: move declarations for name-hash.c from cache.h run-command.h: move declarations for run-command.c from cache.h ...
2023-06-28config: add ctx arg to config_fn_tGlen Choo1-2/+3
Add a new "const struct config_context *ctx" arg to config_fn_t to hold additional information about the config iteration operation. config_context has a "struct key_value_info kvi" member that holds metadata about the config source being read (e.g. what kind of config source it is, the filename, etc). In this series, we're only interested in .kvi, so we could have just used "struct key_value_info" as an arg, but config_context makes it possible to add/adjust members in the future without changing the config_fn_t signature. We could also consider other ways of organizing the args (e.g. moving the config name and value into config_context or key_value_info), but in my experiments, the incremental benefit doesn't justify the added complexity (e.g. a config_fn_t will sometimes invoke another config_fn_t but with a different config value). In subsequent commits, the .kvi member will replace the global "struct config_reader" in config.c, making config iteration a global-free operation. It requires much more work for the machinery to provide meaningful values of .kvi, so for now, merely change the signature and call sites, pass NULL as a placeholder value, and don't rely on the arg in any meaningful way. Most of the changes are performed by contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every config_fn_t: - Modifies the signature to accept "const struct config_context *ctx" - Passes "ctx" to any inner config_fn_t, if needed - Adds UNUSED attributes to "ctx", if needed Most config_fn_t instances are easily identified by seeing if they are called by the various config functions. Most of the remaining ones are manually named in the .cocci patch. Manual cleanups are still needed, but the majority of it is trivial; it's either adjusting config_fn_t that the .cocci patch didn't catch, or adding forward declarations of "struct config_context ctx" to make the signatures make sense. The non-trivial changes are in cases where we are invoking a config_fn_t outside of config machinery, and we now need to decide what value of "ctx" to pass. These cases are: - trace2/tr2_cfg.c:tr2_cfg_set_fl() This is indirectly called by git_config_set() so that the trace2 machinery can notice the new config values and update its settings using the tr2 config parsing function, i.e. tr2_cfg_cb(). - builtin/checkout.c:checkout_main() This calls git_xmerge_config() as a shorthand for parsing a CLI arg. This might be worth refactoring away in the future, since git_xmerge_config() can call git_default_config(), which can do much more than just parsing. Handle them by creating a KVI_INIT macro that initializes "struct key_value_info" to a reasonable default, and use that to construct the "ctx" arg. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-22Merge branch 'ps/cat-file-null-output'Junio C Hamano1-42/+43
"git cat-file --batch" and friends learned "-Z" that uses NUL delimiter for both input and output. * ps/cat-file-null-output: cat-file: add option '-Z' that delimits input and output with NUL cat-file: simplify reading from standard input strbuf: provide CRLF-aware helper to read until a specified delimiter t1006: modernize test style to use `test_cmp` t1006: don't strip timestamps from expected results
2023-06-21object-store-ll.h: split this header out of object-store.hElijah Newren1-1/+1
The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch: $ git grep -h include..object-store | sort | uniq -c 2 #include "object-store.h" 129 #include "object-store-ll.h" Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>