diff options
Diffstat (limited to 'Documentation/CodingGuidelines')
| -rw-r--r-- | Documentation/CodingGuidelines | 190 |
1 files changed, 155 insertions, 35 deletions
diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines index 1d92b2da03..1dc9a5e351 100644 --- a/Documentation/CodingGuidelines +++ b/Documentation/CodingGuidelines @@ -185,8 +185,8 @@ For shell scripts specifically (not exhaustive): - Even though "local" is not part of POSIX, we make heavy use of it in our test suite. We do not use it in scripted Porcelains, and - hopefully nobody starts using "local" before they are reimplemented - in C ;-) + hopefully nobody starts using "local" before all shells that matter + support it (notably, ksh from AT&T Research does not support it yet). - Some versions of shell do not understand "export variable=value", so we write "variable=value" and then "export variable" on two @@ -204,6 +204,33 @@ For shell scripts specifically (not exhaustive): local variable="$value" local variable="$(command args)" + - The common construct + + VAR=VAL command args + + to temporarily set and export environment variable VAR only while + "command args" is running is handy, but this triggers an + unspecified behaviour according to POSIX when used for a command + that is not an external command (like shell functions). Indeed, + dash 0.5.10.2-6 on Ubuntu 20.04, /bin/sh on FreeBSD 13, and AT&T + ksh all make a temporary assignment without exporting the variable, + in such a case. As it does not work portably across shells, do not + use this syntax for shell functions. A common workaround is to do + an explicit export in a subshell, like so: + + (incorrect) + VAR=VAL func args + + (correct) + ( + VAR=VAL && + export VAR && + func args + ) + + but be careful that the effect "func" makes to the variables in the + current shell will be lost across the subshell boundary. + - Use octal escape sequences (e.g. "\302\242"), not hexadecimal (e.g. "\xc2\xa2") in printf format strings, since hexadecimal escape sequences are not portable. @@ -214,6 +241,16 @@ For C programs: - We use tabs to indent, and interpret tabs as taking up to 8 spaces. + - Nested C preprocessor directives are indented after the hash by one + space per nesting level. + + #if FOO + # include <foo.h> + # if BAR + # include <bar.h> + # endif + #endif + - We try to keep to at most 80 characters per line. - As a Git developer we assume you have a reasonably modern compiler @@ -221,6 +258,14 @@ For C programs: ensure your patch is clear of all compiler warnings we care about, by e.g. "echo DEVELOPER=1 >>config.mak". + - When using DEVELOPER=1 mode, you may see warnings from the compiler + like "error: unused parameter 'foo' [-Werror=unused-parameter]", + which indicates that a function ignores its argument. If the unused + parameter can't be removed (e.g., because the function is used as a + callback and has to match a certain interface), you can annotate + the individual parameters with the UNUSED (or MAYBE_UNUSED) + keyword, like "int foo UNUSED". + - We try to support a wide range of C compilers to compile Git with, including old ones. As of Git v2.35.0 Git requires C99 (we check "__STDC_VERSION__"). You should not use features from a newer C @@ -234,7 +279,7 @@ For C programs: . since around 2007 with 2b6854c863a, we have been using initializer elements which are not computable at load time. E.g.: - const char *args[] = {"constant", variable, NULL}; + const char *args[] = { "constant", variable, NULL }; . since early 2012 with e1327023ea, we have been using an enum definition whose last element is followed by a comma. This, like @@ -266,7 +311,9 @@ For C programs: v12.01, 2022-03-28). - Variables have to be declared at the beginning of the block, before - the first statement (i.e. -Wdeclaration-after-statement). + the first statement (i.e. -Wdeclaration-after-statement). It is + encouraged to have a blank line between the end of the declarations + and the first statement in the block. - NULL pointers shall be written as NULL, not as 0. @@ -286,6 +333,13 @@ For C programs: while( condition ) func (bar+1); + - A binary operator (other than ",") and ternary conditional "?:" + have a space on each side of the operator to separate it from its + operands. E.g. "A + 1", not "A+1". + + - A unary operator (other than "." and "->") have no space between it + and its operand. E.g. "(char *)ptr", not "(char *) ptr". + - Do not explicitly compare an integral value with constant 0 or '\0', or a pointer value with constant NULL. For instance, to validate that counted array <ptr, cnt> is initialized but has no elements, write: @@ -531,6 +585,56 @@ For C programs: use your own debugger and arguments. Example: `GIT_DEBUGGER="ddd --gdb" ./bin-wrappers/git log` (See `wrap-for-bin.sh`.) + - The primary data structure that a subsystem 'S' deals with is called + `struct S`. Functions that operate on `struct S` are named + `S_<verb>()` and should generally receive a pointer to `struct S` as + first parameter. E.g. + + struct strbuf; + + void strbuf_add(struct strbuf *buf, ...); + + void strbuf_reset(struct strbuf *buf); + + is preferred over: + + struct strbuf; + + void add_string(struct strbuf *buf, ...); + + void reset_strbuf(struct strbuf *buf); + + - There are several common idiomatic names for functions performing + specific tasks on a structure `S`: + + - `S_init()` initializes a structure without allocating the + structure itself. + + - `S_release()` releases a structure's contents without freeing the + structure. + + - `S_clear()` is equivalent to `S_release()` followed by `S_init()` + such that the structure is directly usable after clearing it. When + `S_clear()` is provided, `S_init()` shall not allocate resources + that need to be released again. + + - `S_free()` releases a structure's contents and frees the + structure. + + - Function names should be clear and descriptive, accurately reflecting + their purpose or behavior. Arbitrary suffixes that do not add meaningful + context can lead to confusion, particularly for newcomers to the codebase. + + Historically, the '_1' suffix has been used in situations where: + + - A function handles one element among a group that requires similar + processing. + - A recursive function has been separated from its setup phase. + + The '_1' suffix can be used as a concise way to indicate these specific + cases. However, it is recommended to find a more descriptive name wherever + possible to improve the readability and maintainability of the code. + For Perl programs: - Most of the C guidelines above apply. @@ -599,16 +703,30 @@ Program Output Error Messages - - Do not end error messages with a full stop. + - Do not end a single-sentence error message with a full stop. - Do not capitalize the first word, only because it is the first word - in the message ("unable to open %s", not "Unable to open %s"). But + in the message ("unable to open '%s'", not "Unable to open '%s'"). But "SHA-3 not supported" is fine, because the reason the first word is capitalized is not because it is at the beginning of the sentence, but because the word would be spelled in capital letters even when it appeared in the middle of the sentence. - - Say what the error is first ("cannot open %s", not "%s: cannot open") + - Say what the error is first ("cannot open '%s'", not "%s: cannot open"). + + - Enclose the subject of an error inside a pair of single quotes, + e.g. `die(_("unable to open '%s'"), path)`. + + - Unless there is a compelling reason not to, error messages from + porcelain commands should be marked for translation, e.g. + `die(_("bad revision %s"), revision)`. + + - Error messages from the plumbing commands are sometimes meant for + machine consumption and should not be marked for translation, + e.g., `die("bad revision %s", revision)`. + + - BUG("message") are for communicating the specific error to developers, + thus should not be translated. Externally Visible Names @@ -738,78 +856,80 @@ Markup: _<new-branch-name>_ _<template-directory>_ - A placeholder is not enclosed in backticks, as it is not a literal. - When needed, use a distinctive identifier for placeholders, usually made of a qualification and a type: _<git-dir>_ _<key-id>_ - When literal and placeholders are mixed, each markup is applied for - each sub-entity. If they are stuck, a special markup, called - unconstrained formatting is required. - Unconstrained formating for placeholders is __<like-this>__ - Unconstrained formatting for literal formatting is ++like this++ - `--jobs` _<n>_ - ++--sort=++__<key>__ - __<directory>__++/.git++ - ++remote.++__<name>__++.mirror++ + Git's Asciidoc processor has been tailored to treat backticked text + as complex synopsis. When literal and placeholders are mixed, you can + use the backtick notation which will take care of correctly typesetting + the content. + `--jobs <n>` + `--sort=<key>` + `<directory>/.git` + `remote.<name>.mirror` + `ssh://[<user>@]<host>[:<port>]/<path-to-git-repo>` - caveat: ++ unconstrained format is not verbatim and may expand - content. Use Asciidoc escapes inside them. +As a side effect, backquoted placeholders are correctly typeset, but +this style is not recommended. Synopsis Syntax - Syntax grammar is formatted neither as literal nor as placeholder. + The synopsis (a paragraph with [synopsis] attribute) is automatically + formatted by the toolchain and does not need typesetting. A few commented examples follow to provide reference when writing or modifying command usage strings and synopsis sections in the manual pages: Possibility of multiple occurrences is indicated by three dots: - _<file>_... + <file>... (One or more of <file>.) Optional parts are enclosed in square brackets: - [_<file>_...] + [<file>...] (Zero or more of <file>.) - ++--exec-path++[++=++__<path>__] + An optional parameter needs to be typeset with unconstrained pairs + [<repository>] + + --exec-path[=<path>] (Option with an optional argument. Note that the "=" is inside the brackets.) - [_<patch>_...] + [<patch>...] (Zero or more of <patch>. Note that the dots are inside, not outside the brackets.) Multiple alternatives are indicated with vertical bars: - [`-q` | `--quiet`] - [`--utf8` | `--no-utf8`] + [-q | --quiet] + [--utf8 | --no-utf8] Use spacing around "|" token(s), but not immediately after opening or before closing a [] or () pair: - Do: [`-q` | `--quiet`] - Don't: [`-q`|`--quiet`] + Do: [-q | --quiet] + Don't: [-q|--quiet] Don't use spacing around "|" tokens when they're used to separate the alternate arguments of an option: - Do: ++--track++[++=++(`direct`|`inherit`)]` - Don't: ++--track++[++=++(`direct` | `inherit`)] + Do: --track[=(direct|inherit)] + Don't: --track[=(direct | inherit)] Parentheses are used for grouping: - [(_<rev>_ | _<range>_)...] + [(<rev>|<range>)...] (Any number of either <rev> or <range>. Parens are needed to make it clear that "..." pertains to both <rev> and <range>.) - [(`-p` _<parent>_)...] + [(-p <parent>)...] (Any number of option -p, each with one <parent> argument.) - `git remote set-head` _<name>_ (`-a` | `-d` | _<branch>_) + git remote set-head <name> (-a|-d|<branch>) (One and only one of "-a", "-d" or "<branch>" _must_ (no square brackets) be provided.) And a somewhat more contrived example: - `--diff-filter=[(A|C|D|M|R|T|U|X|B)...[*]]` + --diff-filter=[(A|C|D|M|R|T|U|X|B)...[*]] Here "=" is outside the brackets, because "--diff-filter=" is a valid usage. "*" has its own pair of brackets, because it can (optionally) be specified only when one or more of the letters is |
