path: root/src
Age | Commit message | Author | Files | Lines
2025-09-03 | maint: prefer issymlink to readlink with a small buffer | Collin Funk | 4 | -11/+10
* bootstrap.conf (gnulib_modules): Add issymlink and issymlinkat. * src/copy.c: Include issymlink.h. (copy_reg): Use issymlink instead of readlinkat. * src/rmdir.c: Include issymlink.h. (main): Use issymlink instead of readlink. * src/tail.c: Include issymlink.h. (recheck, any_symlinks): Use issymlink instead of readlink. * src/test.c: Include issymlink.h. (unary_operator): Use issymlink instead of readlink.
2025-09-02 | seq: be more accurate with large integer start values | Pádraig Brady | 1 | -2/+6
* src/seq.c (main): Avoid possibly inaccurate conversion to long double, for all digit start values. * tests/seq/seq-long-double.sh: Add a test case. * NEWS: Mention the improvement. Fixes https://bugs.gnu.org/79369
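As a rough illustration of why an all-digit start value can lose accuracy when routed through a floating-point type, here is a small Python sketch. It is only an analogy: Python's float is an IEEE double with a 53-bit significand, whereas seq uses long double, whose precision depends on the platform. The helper name is hypothetical and not taken from seq.c.

```python
def survives_float_round_trip(digits: str) -> bool:
    """Return True if the decimal integer in `digits` is unchanged
    after conversion to a binary floating-point value and back."""
    exact = int(digits)            # arbitrary precision, always exact
    via_float = int(float(exact))  # rounded to a 53-bit significand
    return via_float == exact


# 2**53 is exactly representable as a double; 2**53 + 1 is not.
print(survives_float_round_trip(str(2**53)))
print(survives_float_round_trip(str(2**53 + 1)))
```

Handling such values as integers (or as digit strings) sidesteps the rounding entirely, which is the general idea the commit message describes.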
2025-09-01 | df: pacify static analysis | Paul Eggert | 1 | -9/+5
Problem reported by Yubiao Hu <https://bugs.gnu.org/79336>. * src/df.c (get_dev): Assume MOUNT_POINT is non-null.
2025-08-31 | ls: fix alignment with locale formatted --size | Pádraig Brady | 1 | -4/+12
Fix allocated size alignment in locales with multi-byte grouping chars. Tested with: LC_ALL=sv_SE.utf8 ls --size --block-size=\'k * src/ls.c (print_file_name_and_frills): Don't rely on printf("%*s", width, string) to pad multi-byte strings appropriately. Instead work out the padding required and use: printf("%*s%s", padding, "", string) to pad multi-byte appropriately. * tests/ls/block-size.sh: Add a test case. * NEWS: Mention the bug fix. Fixes https://bugs.gnu.org/79347
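The padding problem can be modelled in a few lines of Python. This is a sketch, not the ls.c code: it assumes UTF-8 and a grouping character such as U+00A0 that occupies two bytes but one column, and it assumes every character is one column wide (real code would need a proper width function). Both function names are hypothetical.

```python
def pad_by_bytes(s: str, width: int) -> str:
    """Model of printf("%*s", width, s) in C, where width counts bytes."""
    nbytes = len(s.encode("utf-8"))
    return " " * max(0, width - nbytes) + s


def pad_by_columns(s: str, width: int) -> str:
    """Model of printf("%*s%s", padding, "", s): compute the padding
    from the displayed width first, then emit the string unchanged.
    Assumes one column per character."""
    padding = max(0, width - len(s))
    return " " * padding + s


size = "1\u00a0234k"  # six columns, seven bytes in UTF-8

print(repr(pad_by_bytes(size, 8)))    # one column short
print(repr(pad_by_columns(size, 8)))  # fills all eight columns
```

The byte-counting variant under-pads by one column for every extra byte in a multi-byte character, which is the misalignment the commit fixes.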
2025-08-30 | b2sum: --length: fix upper bound check | Pádraig Brady | 1 | -1/+1
* src/digest.c (main): Don't saturate -l to BLAKE2B_MAX_LEN, so that the subsequent bounds check is performed. * tests/cksum/b2sum.sh: Add a test case. * NEWS: Mention the fix introduced in commit v9.5-71-gf2c84fe63
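The general pitfall, clamping a value before the check that is meant to reject it, can be sketched as follows. This is an illustration only and does not reproduce digest.c; the 512-bit limit is an assumption based on BLAKE2b's 64-byte maximum digest, and the function names are hypothetical.

```python
MAX_BITS = 512  # assumed limit: BLAKE2b digests are at most 64 bytes


def parse_length_saturating(value: int) -> int:
    """Buggy pattern: after clamping, the bounds check can never fire."""
    value = min(value, MAX_BITS)
    if value > MAX_BITS:
        raise ValueError("maximum digest length exceeded")
    return value


def parse_length_checked(value: int) -> int:
    """Fixed pattern: keep the original value so the check sees it."""
    if value > MAX_BITS:
        raise ValueError("maximum digest length exceeded")
    return value


print(parse_length_saturating(1024))  # silently accepted as 512
```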
2025-08-28 | fold: fix handling of invalid multi-byte characters | Collin Funk | 1 | -5/+12
* src/fold.c (fold_file): Continue the loop when we have buffered bytes but nothing left to read from the file. (adjust_column): Don't assume that the character is printable. * tests/fold/fold-characters.sh: Add a new test case. (bad_unicode): New function.
2025-08-27 | tests: parameterize IO_BUFSIZE | Pádraig Brady | 1 | -0/+4
* src/getlimits.c (main): Output IO_BUFSIZE, useful for sizing data for tests. * tests/fold/fold-characters.sh: Use it rather than hardcoding.
2025-08-27 | build: fold: fix build failure with C99 | Pádraig Brady | 1 | -1/+2
GCC 10.2 gave the following error: "error: label at end of compound statement" * src/fold.c (fold_file): Add a ";" to avoid C2X specific syntax.
2025-08-26 | fold: don't truncate multibyte characters at the end of the buffer | Collin Funk | 1 | -2/+22
* src/fold.c (fold_file): Replace invalid characters with the original byte read. Copy multibyte sequences that may not yet be read to the start of the buffer before reading more bytes. * tests/fold/fold-characters.sh: Add a test case.
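The carry-over technique can be sketched in Python for UTF-8 input. This is a simplified model, not the fold.c implementation: it assumes UTF-8 only, both helper names are hypothetical, and invalid bytes are decoded to U+FFFD here, whereas the commit keeps the original byte.

```python
def split_incomplete_tail(buf: bytes) -> tuple:
    """Split buf into (complete, tail), where tail is a trailing UTF-8
    sequence cut off by the end of the buffer (possibly empty)."""
    # A UTF-8 sequence is at most 4 bytes, so only the last 3 bytes
    # can belong to a sequence that is still incomplete.
    for back in range(1, min(3, len(buf)) + 1):
        byte = buf[-back]
        if byte & 0xC0 == 0x80:
            continue  # continuation byte; keep looking for the lead byte
        if 0xF0 <= byte <= 0xF4:
            need = 4
        elif 0xE0 <= byte <= 0xEF:
            need = 3
        elif 0xC2 <= byte <= 0xDF:
            need = 2
        else:
            need = 1  # ASCII or an invalid lead byte
        if need > back:
            return buf[:-back], buf[-back:]
        break
    return buf, b""


def decode_chunks(chunks) -> str:
    """Decode byte chunks, moving any incomplete trailing sequence to
    the start of the next buffer before decoding further."""
    pending = b""
    out = []
    for chunk in chunks:
        buf = pending + chunk
        complete, pending = split_incomplete_tail(buf)
        out.append(complete.decode("utf-8", errors="replace"))
    out.append(pending.decode("utf-8", errors="replace"))  # leftovers at EOF
    return "".join(out)


data = "a\u00e9\u20ac".encode("utf-8")
print(decode_chunks([data[:2], data[2:]]))  # chunk boundary splits U+00E9
```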
2025-08-24 | fold: use fread instead of getline | Collin Funk | 1 | -9/+7
* src/fold.c: Include ioblksize.h. (fold_file): Use two IO_BUFSIZE-sized buffers. Use fread instead of getline. Check whether we reached the end of file.
2025-08-23 | maint: prefer STRUCT_UTMP to struct gl_utmp | Collin Funk | 4 | -21/+21
* cfg.mk (sc_prohibit-struct-gl_utmp): New rule for 'make syntax-check'. * src/pinky.c (time_string, print_entry, scan_entries, short_pinky): Use STRUCT_UTMP instead of struct gl_utmp. * src/uptime.c (print_uptime, print_uptime, uptime): Likewise. * src/users.c (list_entries_users, users): Likewise. * src/who.c (time_string, print_user, print_boottime, print_deadprocs) (print_login, print_initspawn, print_clockchange, print_runlevel) (list_entries_who, scan_entries, who): Likewise.
2025-08-22 | fold: add the --characters option | Collin Funk | 2 | -68/+108
* src/fold.c: Include mcel.h. (count_bytes): Remove variable. (counting_mode, last_character_width): New variables. (shortopts, long_options): Add the option. (adjust_column): If --characters is in use, account for the number of characters instead of their width. (fold_file): Use getline and iterate over the result with mcel functions to handle multibyte characters. (main): Check for the option. * src/local.mk (src_fold_LDADD): Add $(LIBC32CONV), $(LIBUNISTRING), and $(MBRTOWC_LIB). * tests/fold/fold-characters.sh: New file. * tests/fold/fold-spaces.sh: New file. * tests/fold/fold-nbsp.sh: New file. * tests/local.mk (all_tests): Add the tests. * NEWS: Mention the new option. * doc/coreutils.texi (fold invocation): Likewise.
2025-08-22 | cp: improve hole handling on squashfs | Paul Eggert | 1 | -21/+39
Better fix for problem reported by Jeremy Allison <https://bugs.gnu.org/79267>. * src/copy.c (struct scan_inference): New type, replacing union scan_inference. All uses changed. This is so infer_scantype can report the first hole's offset when known. (lseek_copy): 5th arg is now struct scan_inference const *, not just off_t. All uses changed. (infer_scantype): If SEEK_SET+SEEK_HOLE do not find a hole, fall back on ZERO_SCANTYPE.
2025-08-22 | cp: go back to copy_file_range optimization | Paul Eggert | 1 | -1/+1
This reverts part of the previous change. * src/copy.c (lseek_copy): When calling sparse_copy, do not ask it to scan for zeros unless --sparse=always, so that it can use copy_file_range which can be far more efficient.
2025-08-21 | cp: always punch holes that we make | Paul Eggert | 1 | -13/+8
Problem reported by Jeremy Allison <https://bugs.gnu.org/79267>. * src/copy.c (create_hole, sparse_copy): Omit arg PUNCH_HOLES, as we always punch holes now. All uses changed. (lseek_copy): When calling sparse_copy, scan for holes when sparse_mode == SPARSE_AUTO, as that means we are making holes. (copy_reg): Always punch any hole made at end.
2025-08-18 | logname: list David MacKenzie as the author | Collin Funk | 1 | -1/+1
* src/logname.c (AUTHORS): List David as the author. * AUTHORS: Likewise.
2025-08-14 | maint: avoid syntax-check failure from previous commit | Collin Funk | 1 | -1/+0
* src/tsort.c: Don't include long-options.h since the previous commit removed the call to parse_gnu_standard_options_only. This avoids a sc_prohibit_long_options_without_use syntax-check failure.
2025-08-14 | tsort: add do-nothing -w option | Paul Eggert | 1 | -3/+27
This is for conformance to POSIX.1-2024. * src/tsort.c: Include getopt.h. (main): Accept and ignore -w. Do not bother altering the usage message, as the option is useless. * tests/misc/tsort.pl (cycle-3): New test.
2025-08-12 | maint: use short form bug URLs | Pádraig Brady | 4 | -4/+4
* cfg.mk (sc_prohibit-long-form-bug-urls): Disallow long form in code. * scripts/git-hooks/commit-msg: Disallow long form in commit messages. * NEWS: Shorten long urls. * bootstrap.conf: Likewise. * configure.ac: Likewise. * scripts/git-hooks/commit-msg: Likewise. * src/csplit.c: Likewise. * src/fmt.c: Likewise. * src/make-prime-list.c: Likewise. * src/nohup.c: Likewise. * tests/od/od-float.sh: Likewise. * tests/rm/r-root.sh: Likewise. * tests/tail/inotify-race.sh: Likewise. * tests/tail/inotify-race2.sh: Likewise.
2025-08-11 | basenc: Don't trigger undefined behaviour in mini-gmp | Bruno Haible | 1 | -3/+5
* src/basenc.c (base58_encode): Avoid calling mpz_import on an empty limb sequence.
2025-08-10 | realpath: support the -E option required by POSIX | Collin Funk | 1 | -4/+9
* src/realpath.c (longopts): Add the option. (main): Likewise. (usage): Add the option to the --help message. * tests/misc/realpath.sh: Add a simple test. * doc/coreutils.texi (realpath invocation): Mention the new option. * NEWS: Likewise.
2025-08-09 | basenc: add base58 support | Pádraig Brady | 2 | -5/+348
A 58 character encoding that:
- avoids visually ambiguous 0OIl characters
- uses only alphanumeric characters
Described at:
- https://tools.ietf.org/html/draft-msporny-base58-03

This implementation uses GMP (or gnulib's gmp fallback). Performance is good in comparison to other implementations. For example when using libgmp on an i7-5600U system, encoding is 530 times faster, and decoding 830 times faster than the implementation using arbitrary precision ints in cpython 3.13. Memory use is proportional to the size of input.

Encoding benchmarks:
  $ time yes | head -c65535 | src/basenc --base58 -w0 >file.enc
  real 0m0.018s
  ./configure --quiet --without-libgmp && make -j $(nproc)
  $ time yes | head -c65535 | src/basenc --base58 -w0 >file.enc
  real 0m3.431s
  # dnf install python3-base58
  $ time yes | head -c65535 | base58 >file.enc # cpython 3.13
  real 0m9.700s

Decoding benchmarks:
  $ time src/basenc --base58 -d <file.enc >/dev/null
  real 0m0.010s
  $ ./configure --without-libgmp && make # gnulib gmp
  $ time src/basenc --base58 -d <file.enc >/dev/null
  real 0m0.145s
  $ time base58 -d <file.enc >/dev/null # cpython 3.13
  real 0m8.302s

* src/basenc.c (base_decode_ctx_finalize, base_encode_ctx_init, base_encode_ctx, base_encode_ctx_finalize): New functions to provide more general processing functionality. (base58_{de,en}code_ctx{_init,,_finalize}): New functions to accumulate all input before calling ... (base58_{de,en}code): ... the GMP based encoding/decoding routines. (do_encode, do_decode): Call the ctx variants if enabled. * doc/coreutils.texi (basenc invocation): Describe the new option, and indicate the main use case being interactive user use. * src/local.mk: Link basenc with GMP. * tests/basenc/basenc.pl: Add test cases. * NEWS: Mention the new feature.
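For readers unfamiliar with the encoding, a minimal Python sketch of base58 with arbitrary-precision integers follows. It is not the basenc implementation (which uses GMP in C) and is not tuned for speed. The alphabet is the commonly used one that omits 0, O, I and l; the function names are hypothetical.

```python
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes as base58; each leading zero byte becomes '1'."""
    zeros = len(data) - len(data.lstrip(b"\0"))
    n = int.from_bytes(data, "big")
    digits = []
    while n:
        n, rem = divmod(n, 58)
        digits.append(ALPHABET[rem])
    return "1" * zeros + "".join(reversed(digits))


def base58_decode(text: str) -> bytes:
    """Decode base58 text; raises ValueError on characters outside the alphabet."""
    zeros = len(text) - len(text.lstrip("1"))
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return b"\0" * zeros + body


print(base58_encode(bytes.fromhex("0000287fb4cd")))  # 11233QC4
```

Because the whole input is treated as one large integer, all input has to be accumulated before conversion, which matches the commit's note that memory use is proportional to input size.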
2025-08-09 | basenc: fix stripping of '=' chars in some encodings | Pádraig Brady | 1 | -1/+2
* src/basenc.c (do_decode): With -i ensure we strip '=' chars if there is no padding for the chosen encoding. * tests/basenc/basenc.pl: Add a test case. * NEWS: Mention the bug fix.
2025-08-08 | maint: prefer attribute.h in .c files | Collin Funk | 1 | -3/+3
* src/basenc.c (base16_encode, z85_encoding, do_decode): Use ATTRIBUTE_NONSTRING instead of _GL_ATTRIBUTE_NONSTRING. * src/basenc.c (sc_prohibit-_gl-attributes): New rule for 'make syntax-check'.
2025-08-07 | cp: omit some needless lseek calls | Paul Eggert | 1 | -92/+66
The sparse code sometimes issued multiple lseeks against the same file without doing anything in between. Optimize them away by keeping track of the last hole output, in a way that crosses the sparse_copy function call boundary. * src/copy.c (sparse_copy): New arg hole_size, replacing old args scan_holes and last_write_made_hole. All callers changed. (sparse_copy, lseek_copy): Do not create hole at end; let the caller deal with it. All callers changed. (lseek_copy): New args hole_size and total_n_read. Caller changed. (copy_reg): Create hole at end for both lseek_copy and sparse_copy.
2025-08-07 | cp: --sparse=always was missing some holes | Paul Eggert | 1 | -13/+17
The sparse code assumed that st_blksize was the minimum hole size. However, st_blksize is an optimum I/O buffer size, not the file system fundamental block size. Use ST_NBLOCKSIZE instead; although it may underestimate the true block size that just slows ‘cp’ down a bit, without introducing bugs. * src/copy.c (sparse_copy): Arg scan_holes replaces the old hole_size arg. All callers changed. (lseek_copy): Remove hole_size arg; no longer needed. Caller changed.
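The effect of the scan granularity can be shown with a small Python model. It is a sketch, not the copy.c logic: the helper name is hypothetical, 4096 stands in for a typical st_blksize, and 512 is an assumed value for ST_NBLOCKSIZE.

```python
def zero_blocks(data: bytes, block_size: int) -> list:
    """Return offsets of block_size-aligned blocks that are entirely
    zero and could therefore be written as holes."""
    holes = []
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        if block.count(0) == len(block):
            holes.append(off)
    return holes


# 512 bytes of data, 1024 zero bytes, then 2560 bytes of data.
data = b"x" * 512 + bytes(1024) + b"y" * 2560

print(zero_blocks(data, 4096))  # coarse scan: the zero run is missed
print(zero_blocks(data, 512))   # fine scan: two zero blocks are found
```

A smaller granularity means more comparisons, which is the slowdown the commit message accepts in exchange for not missing holes.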
2025-08-05 | maint: use consistent references to standard files in messages | Collin Funk | 6 | -7/+10
* cfg.mk (sc_standard_outputs): Add a grep command for source files. * src/du.c (main): Use standard input instead of stdin, standard output instead of stdout, and standard error instead of stderr in messages. * src/nohup.c (main): Likewise. * src/sort.c (main): Likewise. * src/split.c (main): Likewise. * src/stdbuf.c (main): Likewise. * src/wc.c (main): Likewise. * tests/du/files0-from.pl (@Tests): Adjust test case to new messages. * tests/sort/sort-files0-from.pl: Likewise. * tests/wc/wc-files0-from.pl: Likewise.
2025-08-04 | maint: remove now-unused include of 'safe-read.h' | Bernhard Voelker | 1 | -1/+0
'make syntax-check' complains: src/tail.c maint.mk: the above files include safe-read.h but don't use it make: *** [maint.mk:737: sc_prohibit_safe_read_without_use] Error 1 The removal was missed for tail.c in recent commit d3c7072a0950. * src/tail.c (safe-read.h): Remove include.
2025-08-03 | maint: prefer same-inode.h | Paul Eggert | 4 | -22/+28
This does not change behavior on POSIX platforms; it’s mostly to make it clearer when we’re looking for file identity. * src/cat.c (main): * src/copy.c (struct dir_list, is_ancestor, copy_internal): * src/tail.c (struct File_spec, record_open_fd, recheck) (tail_forever_inotify, tail_file): * src/test.c (binary_operator): Use psame_inode, PSAME_INODE, or SAME_INODE instead of comparing device and inode numbers by hand.
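The file-identity test that such helpers wrap amounts to comparing device and inode numbers. A minimal Python sketch follows; the function name is hypothetical and this is not the gnulib same-inode.h code.

```python
import os


def same_inode(a: os.stat_result, b: os.stat_result) -> bool:
    """Two stat results refer to the same file when both the device
    and the inode number match (the POSIX notion of file identity)."""
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


# "." and the absolute path of the working directory are the same file.
print(same_inode(os.stat("."), os.stat(os.getcwd())))
```

Giving the comparison a name makes it obvious at each call site that file identity, not some other property, is being tested.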
2025-08-03 | tail: refactor ‘failable’ | Paul Eggert | 1 | -26/+19
* src/tail.c (recheck, tail_file): Do not mark a file as tailable merely because --retry is not in effect. Simplify internal logic. This should not change behavior; it’s just for clarity and to make the code match the comments better.
2025-08-03 | tail: fix race between read and fstat | Paul Eggert | 1 | -24/+10
* src/tail.c (get_file_status): Remove, since after the changes described below it would be called in just one place and it’s a bit clearer to inline by hand. (tail_file): Don’t call fstat after reading the file, as that misses changes arriving between read and fstat. Instead, reuse the fstat done before reading the file.
2025-08-03 | tail: track errno more accurately | Paul Eggert | 1 | -32/+46
This matters only in some obscure cases hard to test for. * src/tail.c (file_lines, pipe_lines, pipe_bytes, start_bytes) (start_lines, tail_bytes, tail_lines, tail): New return convention, which reports errno. All callers changed. (recheck): Don’t lose track of errno if a regular file is replaced by a symlink. (get_file_status): Set errno to 0 on success. (tail_file): Be more careful about f->errnum. It is now -1 only if the failure was not due to a system call failing.
2025-08-03 | tail: omit redundant assignment | Paul Eggert | 1 | -1/+0
* src/tail.c (recheck): f->remote must be true already, so don’t set it to true.
2025-08-03 | tail: prefer readlink to lstat+S_ISLNK | Paul Eggert | 3 | -8/+8
When not already calling lstat for some other reason, prefer readlink to lstat+S_ISLNK, as readlink does not suffer from EOVERFLOW issues. * src/rmdir.c (main): * src/tail.c (recheck, any_symlinks): * src/test.c (unary_operator):
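A symlink test built on readlink rather than lstat can be sketched as below. This is a Python model under POSIX assumptions, not the coreutils code: readlink fails with EINVAL when the path exists but is not a symbolic link, and the function name is hypothetical.

```python
import errno
import os


def is_symlink(path: str) -> bool:
    """Test for a symlink via readlink instead of lstat + S_ISLNK.
    No stat structure is filled in, so there are no stat fields that
    could overflow."""
    try:
        os.readlink(path)
        return True
    except OSError as e:
        if e.errno == errno.EINVAL:  # exists, but is not a symlink
            return False
        raise  # e.g. ENOENT or EACCES: let the caller decide


print(is_symlink(os.getcwd()))
```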
2025-08-03 | tail: fix unlikely races with >=2 --pids | Paul Eggert | 1 | -37/+40
Also, fix commentary to talk about “nonexistent” rather than “dead” processes, since the code looks for the former not the latter and the difference matters for zombies. * src/tail.c (some_writers_exist): Rename from writers_are_dead, negate the sense, don’t have a special and counterintuitive case for !nbpids, remove PIDs found not to exist, and avoid some though not all unlikely races when kernels reuse PIDs. (tail_forever): Optimize via blocking I/O even if --pid was used, so long as all the writers no longer exist. (tail_forever, tail_forever_inotify): Simplify the writers_dead logic; there is no need to have a local var to track this, since we can use pids and nbpids now. (parse_options): Also free and clear pids if !forever.
2025-08-03 | tail: prefer < 0 to == -1 | Paul Eggert | 1 | -22/+22
* src/tail.c (valid_file_spec, recheck, writers_are_dead) (tail_forever, check_fspec, tail_forever_inotify, tail_file) (parse_options, main): Be a bit more systematic about checking for sign, rather than for exact equality or inequality, when the sign is enough. Makes the code a bit clearer now that -2 sometimes means success.
2025-08-03 | tail: record file offset more carefully | Paul Eggert | 1 | -184/+154
* src/tail.c (struct File_spec): New member read_pos, replacing size, since the value was really a read position not a size. All uses changed. (xlseek): Move defn up. (record_open_fd): If the read_pos (formerly size) arg is unknown, compute it here if it is a regular file. (file_lines): Return the resulting read pos (or -1 on failure) instead of storing it via a pointer. Caller changed. Simplify by using SEEK_CUR instead of SEEK_SET when that is easy. Avoid reading the same data twice when there are not enough lines in the file. (pipe_lines): Return -2 on success, -1 on failure, rather than updating a read pos via a pointer (which was weird for pipes anyway). Caller changed. (pipe_bytes, tail_bytes, tail_lines, tail): Return -1 on failure, a file offset if known, and < -1 otherwise, instead of storing a file offset via a pointer. Caller changed. (pipe_bytes): Take initial file offset as an arg, or -1 if unknown. (start_bytes, start_lines): Return -1 (not 1) on error, -2 (not -1) on EOF, and do not accept pointer to read pos as an arg since neither we nor our caller know the read pos. Callers changed. (recheck): Do not assume a newly-opened file is at offset zero, as this is not always true on Solaris. (tail_forever, check_fspec): Use dump_remainder result only on regular files, to prevent (very unlikely) overflow. (tail_file): Remove no-longer-needed TAIL_TEST_SLEEP code.
2025-08-03 | maint: prefer 'read' to 'safe_read' | Paul Eggert | 6 | -35/+32
In the old days, safe_read acted more like what full_read does now. When that went away, some code that invoked safe_read should have gone back to plain 'read' but I guess we never got around to it. Simplify this code by going back to plain 'read'. Use safe_read only in csplit.c, which has a signal handler and where 'read' can therefore fail with EINTR. Although safe_read also checks for oversize buffers, that is better done via io_blksize. * src/cat.c (simple_cat, cat): * src/head.c (copy_fd, elide_tail_lines_pipe) (elide_tail_lines_seekable, head_lines): * src/tac.c (tac_seekable, copy_to_temp): * src/tail.c (dump_remainder, file_lines, pipe_lines) (pipe_bytes, start_bytes, start_lines, tail_forever_inotify): * src/tr.c (plain_read): Use plain 'read', not safe_read, since there is no need to worry about signals or oversize requests. Also, there is no longer a need to include safe-read.h. * src/ioblksize.h: Include sys-limits.h, for SYS_BUFSIZE_MAX. (io_blksize): Max out at SYS_BUFSIZE_MAX.
2025-08-03 | tail: prefer signed types to size_t, blksize_t | Paul Eggert | 1 | -59/+59
* src/tail.c (struct File_spec, xwrite_stdout, file_lines) (pipe_lines, pipe_bytes, start_bytes, any_live_files) (tail_forever, any_remote_file, any_non_remote_file) (any_symlinks, any_non_regular_fifo, tailable_stdin) (tail_forever_inotify, ignore_fifo_and_pipe, main): Prefer a signed type to size_t, if possible. Ordinarily this is idx_t, but use int when the value must fit in int anyway. (file_lines): Similarly for blksize_t, which had no business being here anyway. (main): Check for overflow in the oddball case where ptrdiff_t is narrower than int.
2025-08-03 | tail: prefer intmax_t to uintmax_t | Paul Eggert | 1 | -55/+61
Signed types let us debug better, by using -fsanitize=undefined. * doc/local.mk (doc/constants.texi): Adjust change from macro to enum. * src/tail.c (COPY_TO_EOF, COPY_A_BUFFER) (DEFAULT_MAX_N_UNCHANGED_STATS_BETWEEN_OPENS): Now enum constants, not macros. (COPY_TO_EOF, COPY_A_BUFFER): Now negative, not positive. (count_t): New typedef. Use it instead of uintmax_t. (COUNT_MAX): New macro; use it instead of UINTMAX_MAX. (struct File_spec, max_n_unchanged_stats_between_opens) (dump_remainder, file_lines, pipe_lines, pipe_bytes) (start_bytes, start_lines, tail_forever, check_fspec) (tail_forever_inotify, tail_bytes, tail_lines, tail, tail_file) (parse_obsolete_option, parse_options, main): Prefer count_t to uintmax_t.
2025-08-03 | tail: don’t output more lines than requested | Paul Eggert | 1 | -1/+1
* src/tail.c (file_lines): Fix an unlikely bug where ‘tail -n N’ could output more than N lines if standard input is a largish regular file with large initial offset that starts with (say) N-1 lines after the initial offset, but grows to N+1 lines between the fstat and read calls. In this case ‘tail -n N’ now outputs N-1 lines, not N+1; that is, it pretends the file grew after ‘tail’ read it. That is better than outputting more than N lines.
2025-08-03 | tail: xlseek switch → table | Paul Eggert | 1 | -18/+8
* src/tail.c (xlseek): Turn a switch statement into a table.
2025-08-03 | tail: prettyname cleanup | Paul Eggert | 1 | -108/+101
* src/tail.c: Use ‘prettyname’ consistently as the identifier for a prettified file name, as opposed to ‘pretty_filename’, ‘pretty_name’, and ‘name’. This makes the code easier to follow. (struct File_spec): New member prettyname. (pretty_name): Remove. All uses of pretty_name (f) replaced by f->prettyname. (close_fd, fremote): Accept struct File_spec, not name. All callers changed. (main): Initialize the new prettyname member. This is simpler/smaller than calling pretty_name everywhere.
2025-08-03 | tail: optimize tail -n +2**63 | Paul Eggert | 1 | -1/+1
* src/tail.c (tail_lines): Also optimize ‘tail -n +N’ on a seekable file, where OFF_T_MAX <= N < UINTMAX_MAX. Of course this is very unlikely.
2025-08-03 | tail: refactor SEEK_END and lines | Paul Eggert | 1 | -25/+17
* src/tail.c (tail_lines): Refactor to simplify the confusing code for using SEEK_END when counting lines. The old code had a ‘end_pos != 0’ expression that was always true.
2025-08-03 | tail: refactor to skip stat call on failure | Paul Eggert | 1 | -47/+47
* src/tail.c (tail_bytes): New function. (tail_bytes, tail_lines, tail): Accept struct stat pointer from caller instead of calling fstat ourselves. All callers changed. (tail_file): Skip a call to fstat if fstat already failed. * tests/tail/follow-stdin.sh: Adjust to match new behavior on failure, which omits a redundant diagnostic.
2025-08-03 | tail: speed up -c N for huge N | Paul Eggert | 1 | -26/+45
When the user specifies -c N where 2**63 <= N, don’t give up and use the slow method (which will exhaust memory if the file is large). Instead, treat it as N = 2**63 - 1, since that has equivalent effect. * src/tail.c (tail_bytes): With -c N and large N, adjust the code so that lseeks can still be used without affecting correctness. Formerly the code gave up and did a sequential pass through the whole input, which could easily exhaust memory.
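Why the clamp is harmless can be seen from the offset arithmetic. The sketch below is a Python model, not the tail.c code; it assumes a 64-bit off_t, so no file can be larger than 2**63 - 1 bytes, and the function name is hypothetical.

```python
OFF_T_MAX = 2**63 - 1  # assumed 64-bit off_t


def tail_start_offset(file_size: int, n: int) -> int:
    """Offset at which 'tail -c N' starts output on a seekable file.
    Clamping N to OFF_T_MAX cannot change the result, because
    file_size itself never exceeds OFF_T_MAX."""
    n = min(n, OFF_T_MAX)
    return max(0, file_size - n)


print(tail_start_offset(100, 10))     # 90
print(tail_start_offset(100, 2**64))  # 0, same as for any N >= 100
```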
2025-08-03 | tail: allow >=2**64 in traditional form | Paul Eggert | 1 | -18/+9
This better matches the treatment of POSIX form, e.g., ‘tail +Nc’ is now like ‘tail -c +N’ even when N is large. * src/tail.c: Don’t include xstrtol.h. (parse_obsolete_option): Treat numbers greater than UINTMAX_MAX as if they are UINTMAX_MAX. Parse the number by hand with saturating arithmetic; nowadays that’s simpler than using xstrtoumax. There is no need for a diagnostic now, as the error cannot happen any more. * tests/tail/tail.pl (obs-plus-c3): New test.
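Parsing by hand with saturating arithmetic can be sketched as follows. This is a Python model of the idea, not the parse_obsolete_option code: it assumes a 64-bit uintmax_t, and the function name is hypothetical. Python integers do not overflow, so the check stands in for the overflow test a C implementation would perform before multiplying.

```python
UINTMAX_MAX = 2**64 - 1  # assumed 64-bit uintmax_t


def parse_saturating(digits: str) -> int:
    """Parse a decimal string, saturating at UINTMAX_MAX instead of
    reporting an overflow error."""
    n = 0
    for ch in digits:
        if not "0" <= ch <= "9":
            raise ValueError(f"invalid digit: {ch!r}")
        d = ord(ch) - ord("0")
        # n * 10 + d fits exactly when n <= (UINTMAX_MAX - d) // 10.
        if n > (UINTMAX_MAX - d) // 10:
            n = UINTMAX_MAX  # stays saturated for all remaining digits
        else:
            n = n * 10 + d
    return n


print(parse_saturating("42"))
print(parse_saturating("99999999999999999999999") == UINTMAX_MAX)
```

Since every in-range input yields a value and every out-of-range input yields the maximum, no overflow diagnostic is needed, which is the simplification the commit message mentions.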
2025-08-03 | tail: check OFF_T_MAX vs COPY_A_BUFFER | Paul Eggert | 1 | -0/+1
* src/tail.c: Document an otherwise-unstated assumption.
2025-08-03 | tail: fix SEEK_END typo | Paul Eggert | 1 | -1/+1
* src/tail.c (tail_lines): Fix embarrassing SEEK_END typo. Luckily this matters only in never-used (though valid) invocations.