path: root/lib/regex.c
Age | Commit message | Author | Files | Lines
2025-10-23 | tests: date: test --reference | Pádraig Brady | 2 | -0/+51
* tests/date/reference.sh: Ensure the -r option is tested. * tests/local.mk: Add the test.
2025-10-22 | pr: promptly diagnose write errors | Pádraig Brady | 3 | -2/+9
* src/pr.c (print_page): Exit promptly for `yes | pr`. (print_clump): Exit promptly for `pr < /dev/zero`. * tests/misc/write-errors.sh: Enable test cases. * NEWS: Mention the improvement.
2025-10-22 | nl: promptly diagnose write errors | Pádraig Brady | 3 | -2/+5
* NEWS: Mention the improvement. * src/nl.c (process_file): Exit if error outputting line. * tests/misc/write-errors.sh: Enable the test case.
2025-10-22 | fmt: promptly diagnose write errors | Pádraig Brady | 3 | -2/+12
* NEWS: Mention the improvement. * src/fmt.c (put_line): Exit if any error writing line. (flush_paragraph): Exit if any error writing buffer. * tests/misc/write-errors.sh: Enable the (flush_paragraph) test case, and add another to check the put_line() case.
2025-10-22 | numfmt: promptly diagnose write errors | Pádraig Brady | 3 | -5/+13
* src/numfmt.c (process_line): Inspect the stdio error state when outputting each line so that we don't have to check each output function but do eventually exit upon write error, while also remaining buffered. (main): Also check when outputting a header for the edge case of very long headers. * tests/misc/write-errors.sh: Enable the numfmt test case. * NEWS: Mention the improvement, and reorganize all numfmt improvements.
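The per-line error check described above can be sketched as follows. This is only an illustration, not the numfmt code: `write_lines` is a hypothetical helper, and the sketch assumes that one `ferror` call per line is enough to notice a failure from any earlier buffered output call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper (not from numfmt.c): write N lines to OUT,
   checking the stream's sticky error flag once per line instead of
   checking the return value of every output call.  */
static bool
write_lines (FILE *out, char const *const *lines, size_t n)
{
  assert (out != NULL);
  for (size_t i = 0; i < n; i++)
    {
      fputs (lines[i], out);
      fputc ('\n', out);
      /* The error flag stays set once any earlier write has failed,
         so a single check here covers both calls above.  */
      if (ferror (out))
        return false;
    }
  /* Output is still buffered, so the final flush may also fail.  */
  return fflush (out) == 0;
}
```

A caller would typically exit with a diagnostic as soon as the helper returns false, which is what makes a pipeline such as `yes | cmd > /dev/full` stop promptly rather than keep producing output.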
2025-10-21 | install: prefer posix_spawnp to fork and execlp | Collin Funk | 3 | -19/+43
* NEWS: Mention the change. * bootstrap.conf (gnulib_modules): Add posix_spawnattr_destroy, posix_spawnattr_init, posix_spawnattr_setflags, and posix_spawnp. * src/install.c (strip): Use posix_spawnp instead of fork and execlp.
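For readers unfamiliar with the interface, a minimal sketch of running a child with posix_spawnp and waiting for it is shown here. It is not the code from src/install.c: `run_and_wait` is a hypothetical name, and the sketch passes NULL for the spawn attributes, whereas the commit sets flags through posix_spawnattr_setflags.

```c
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run PROG (looked up in PATH) with ARGV and
   return its exit status, or -1 if it could not be spawned, waited
   for, or did not exit normally.  */
static int
run_and_wait (char const *prog, char *const argv[])
{
  assert (prog != NULL && argv != NULL);
  pid_t pid;
  /* NULL file actions and NULL attributes: the child inherits the
     parent's descriptors and default spawn behavior.  */
  int err = posix_spawnp (&pid, prog, NULL, NULL, argv, environ);
  if (err != 0)
    return -1;
  int status;
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

Unlike fork followed by execlp, posix_spawnp reports errors through its return value rather than errno, and it avoids running application code in a forked copy of the parent.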
2025-10-20 | maint: remove a redundant write after mcel_scan returns an error | Collin Funk | 1 | -1/+0
* gl/lib/mbbuf.h (mbbuf_get_char): Don't set G.len to 1, since mcel_err has already done it.
2025-10-20 | tests: numfmt: add non-utf8 multi-byte test | Pádraig Brady | 3 | -1/+36
* tests/numfmt/mb-non-utf8.sh: Test GB18030 delimiter search. * tests/local.mk: Reference the new test, and move the existing numfmt.pl test from tests/misc to tests/numfmt.
2025-10-20 | numfmt: optimize multi-byte --delimiter search | Pádraig Brady | 1 | -1/+31
* src/numfmt.c (is_utf8_charset): A new function to efficiently determine if running with a UTF-8 charset. (mbsmbchr): A new function to efficiently search for a (multi-byte) character in a multi-byte string. (next_field): Use mbsmbchr() rather than mbsstr() directly.
2025-10-20 | numfmt: support multi-byte --delimiter | Pádraig Brady | 4 | -23/+34
* bootstrap.conf: Depend on mbsstr() to robustly search for a multi-byte delimiter character (string) within a multi-byte string. * src/numfmt.c (main): Accept a valid multi-byte delimiter character. (next_field): Adjust delimiter search from single byte to multi-byte aware. Use mbsstr to find the first match. * tests/misc/numfmt.pl: Add test case. * NEWS: Mention the improvement.
2025-10-18 | numfmt: use multi-byte aware suffix matching | Pádraig Brady | 2 | -6/+4
* src/numfmt.c (process_suffixed_number): Use gnulib's mbs_endswith() helper, which is more robust in non UTF-8 locales. Also always output a devmsg if a suffix is specified.
2025-10-18 | numfmt: fix issues with multi-byte blanks | Pádraig Brady | 3 | -8/+37
* src/numfmt.c (process_line): Restore byte overwritten with NUL, as it may be part of a multi-byte blank. (process_suffixed_number): Skip multi-byte blanks, and correctly determine width with mbswidth(). (parse_format_string): Use c_isblank() to explicitly indicate that's all the format spec supports. * tests/misc/numfmt.pl: Add test cases. * NEWS: Mention the bug fix.
2025-10-17 | numfmt: add --unit-separator | Pádraig Brady | 5 | -7/+131
Output, accept, or disallow a string between the number and unit as recommended in <https://physics.nist.gov/cuu/Units/checklist.html> I.e. support outputting numbers of the form: "1234 M" * src/numfmt.c (simple_strtod_human): Skip unit separator if present, or disallow a unit separator if empty. (double_to_human): Output unit separator if specified. (main): Accept --unit-separator. * tests/misc/numfmt.pl: Add test cases. * doc/coreutils.texi: Describe the new option, giving examples of interaction with --delimiter. * NEWS: Mention the new feature. * THANKS.in: Add Johannes Schauer Marin Rodrigues, who provided a preliminary patch.
2025-10-17 | numfmt: support reading numbers with grouping characters | Pádraig Brady | 3 | -4/+29
This does not validate grouping character placement, and currently just ignores grouping characters. * src/numfmt.c (simple_strtod_int): Skip grouping chars that are part of a number. * tests/misc/numfmt.pl: Add test cases. * NEWS: Mention the improvement.
2025-10-17 | numfmt: support reading numbers with NBSP before unit | Pádraig Brady | 4 | -11/+47
* src/numfmt.c (simple_strtod_human): Accept (multi-byte) non-breaking space character between number and unit. Note we restrict this to a single character between number and unit, to allow less ambiguous parsing if multiple blanks are used to delimit fields. * tests/misc/numfmt.pl: Add test cases. * doc/coreutils.texi (numfmt invocation): Fix stale description of --delimiter skipping whitespace. * NEWS: Mention the improvement.
2025-10-16 | tests: du/bigtime: try harder to find a suitable filesystem | Nicolas Boichat | 1 | -14/+25
* tests/du/bigtime.sh: At least on Linux, the ext4 filesystem doesn't support such large timestamps, while tmpfs does. Try a bit harder to look for a filesystem with large timestamp support.
2025-10-13 | tests: date: check that the hour format of the current locale is used | Collin Funk | 2 | -0/+41
* tests/date/date-locale-hour.sh: New file. * tests/local.mk (all_tests): Add the new test. Co-authored-by: Pádraig Brady <P@draigBrady.com>
2025-10-13 | tests: fix false failure in recent memory limit test | Pádraig Brady | 1 | -1/+2
* tests/basenc/bounded-memory.sh: Ensure we skip the test upon failure to determine the memory lower bound. Reported by Bruno Haible.
2025-10-11 | numfmt: fix buffer over-read (CWE-126) | Pádraig Brady | 3 | -3/+12
* src/numfmt.c (simple_strtod_human): Check for NULL after pointer adjustment to avoid Out-of-range pointer offset (CWE-823). * NEWS: Mention the fix.
2025-10-11 | tests: basenc: use less redundant naming | Pádraig Brady | 3 | -2/+2
Rename to less redundant names, now that we use a separate test directory per util. * tests/basenc/basenc-bounded-memory.sh -> .../bounded-memory.sh * tests/basenc/basenc-large.sh -> .../large-input.sh * tests/local.mk: Reference new names.
2025-10-11 | tests: fix memory limit determination in new test | Pádraig Brady | 1 | -1/+1
* tests/basenc/basenc-bounded-memory.sh: The passed command needs to succeed for memory limit determination to work.
2025-10-11 | tests: basenc: add a test for bounded memory operation | Collin Funk | 2 | -0/+40
* tests/basenc/basenc-bounded-memory.sh: New file. * tests/local.mk (all_tests): Add the test.
2025-10-10 | tests: ln -f: ensure existing link replaced | Sylvestre Ledru | 1 | -0/+8
Identified here: <https://github.com/uutils/coreutils/issues/8830> * tests/ln/misc.sh: Add the check.
2025-10-07 | tests: cksum: add a test case for robust file name parsing | Pádraig Brady | 1 | -0/+6
* tests/cksum/cksum-c.sh: Add a test case where the file name contains tagged format delimiter characters.
2025-10-07 | maint: cksum: document a base64/hex parsing ambiguity with untagged | Pádraig Brady | 1 | -1/+12
* src/digest.c (split_3): Mention the ambiguity in misinterpreting base64 characters as hex is not a practical consideration. Also add an example of both tagged formats which makes it easier to interpret the parsing logic.
2025-10-07 | cksum: fix --check with untagged base64 format with tag matches | Pádraig Brady | 3 | -24/+37
* src/digest.c (split_3): Fall back to untagged matching in the case where -a is specified and we have matched a TAG in the possibly base64 data. This might happen in 1 in every 64K files. Note we remove the modification of string S (and redundant streq) in the tag matching, as that was not needed since v8.32-223-g217cd278e. * tests/cksum/cksum-c.sh: Add a test case. * NEWS: Mention the bug fix.
2025-10-07 | cksum: fix length validation with SHA2- tagged format | Pádraig Brady | 3 | -9/+21
* src/digest.c (sha2_sum_stream): Change from unreachable() to affirm() so that we have defined behavior unless we configure with --disable-assert. (sha3_sum_stream): Likewise. (split_3): Validate SHA2-lengths before passing on. * tests/cksum/cksum-c.sh: Add a test case. * NEWS: Mention the bug fix.
2025-10-07 | cksum: fix --check with --algorithm=sha2 | Pádraig Brady | 3 | -6/+19
* src/digest.c (split_3): Look up the provided tag with -a sha2 because there is not a 1:1 mapping between them. * tests/cksum/cksum-c.sh: Add a test case. * NEWS: Mention the bug fix.
2025-10-06 | rm: remove redundant mark_ancestor_dirs call | Paul Eggert | 1 | -1/+0
* src/remove.c (rm_fts): Remove unnecessary call. Since this code is executed only when not recursive, there are no ancestors to mark.
2025-10-06 | rm: make ‘rm -d DIR’ more like ‘rmdir DIR’ | Paul Eggert | 2 | -10/+12
* src/remove.c (rm_fts): When not recursive, arrange for ‘rm -d DIR’ to behave more like ‘rmdir DIR’. This works better for Ceph snapshot directories. Problem reported by Yannick Le Pennec (bug#78245).
2025-10-05 | cksum: allow -a {blake2b,sha2,sha3} --check to work on base64 | Collin Funk | 4 | -4/+95
* NEWS: Mention the bug. * src/digest.c (split_3): Check that the base64 digest matches the length supported by the algorithm. (digest_check): Check that the read digest matches the base64 length of the algorithm's digest. The previous condition would not work for 'cksum -a blake2b -l 8 ...'. * tests/cksum/cksum-base64-untagged.sh: New file. * tests/local.mk (all_tests): Add the new test.
2025-10-03 | maint: omit trailing white space in config.h | Paul Eggert | 1 | -1/+1
* configure.ac (FORTIFY_SOURCE): Don’t indent a line where the indentation can cause trailing white space in config.h. Problem reported by Grisha Levit (Bug#79567).
2025-10-03 | maint: remove IRIX support | Collin Funk | 2 | -7/+0
* src/ptx.c (main) [HAVE_SETCHRCLASS]: Remove call to setchrclass. * src/stty.c (VREPRINT) [!VREPRINT && VRPRNT]: Remove definition.
2025-10-02 | tests: factor: add suggested large prime tests | Pádraig Brady | 2 | -3/+11
* tests/factor/create-test.sh: Add 2 new large primes from: https://github.com/coreutils/coreutils/issues/65 * tests/local.mk: Reference the 2 new generated tests.
2025-10-02 | unexpand: fix heap buffer overflow with --tabs=[+/]NUM | Pádraig Brady | 3 | -5/+22
This avoids CWE-122: Heap-based Buffer Overflow where we could write blank characters beyond the allocated heap buffer. * src/expand-common.c (set_max_column_width): Refactor function from ... (add_tab_stop): ... here. (set_extend_size): Call new function. (set_increment_size): Likewise. * NEWS: Mention the bug fix. Fixes https://bugs.gnu.org/79555
2025-10-02 | doc: man: consistently format -X[OPTIONAL] form | Pádraig Brady | 1 | -1/+3
This is significant for the date, od, and pr commands which have options of the form -X[OPTIONAL], which change like:
  diff -r man.orig/date.1 man/date.1
  < \fB\-I[FMT]\fR, \fB\-\-iso\-8601\fR[=\fI\,FMT\/\fR]
  > \fB\-I\fR[\fI\,FMT\/\fR], \fB\-\-iso\-8601\fR[=\fI\,FMT\/\fR]
  diff -r man.orig/od.1 man/od.1
  < \fB\-w[BYTES]\fR, \fB\-\-width\fR[=\fI\,BYTES\/\fR]
  > \fB\-w\fR[\fI\,BYTES\/\fR], \fB\-\-width\fR[=\fI\,BYTES\/\fR]
* man/help2man (convert_options): Support options of the form -X[PARAM], so that we now consistently format them (in italics).
2025-10-02 | doc: man: consistently italicize --option parameters | Pádraig Brady | 1 | -1/+6
This changes a few pages, but the changes in tail.1 concisely illustrate the resulting man page changes:
  $ diff -r man.orig/tail.1 man/tail.1
  < \fB\-c\fR, \fB\-\-bytes\fR=\fI\,[\/\fR+]NUM
  > \fB\-c\fR, \fB\-\-bytes\fR=\fI\,[+]NUM\/\fR
  < \fB\-f\fR, \fB\-\-follow[=\fR{name|descriptor}]
  > \fB\-f\fR, \fB\-\-follow\fR[=\fI\,{name|descriptor}\/\fR]
  < \fB\-n\fR, \fB\-\-lines\fR=\fI\,[\/\fR+]NUM
  > \fB\-n\fR, \fB\-\-lines\fR=\fI\,[+]NUM\/\fR
* man/help2man: Relax the option match so more --option variations are supported, and passed through to convert_option(). Specifically more variations after '=' are now supported. Also split and document the regular expression. Reported at https://github.com/coreutils/coreutils/issues/84
2025-10-01 | maint: prefer unreachable () to NOTREACHED comment | Collin Funk | 1 | -1/+1
* src/tsort.c (search_item): Use unreachable () instead of NOTREACHED.
2025-09-30 | fold: move multi-byte character reading to a module | Collin Funk | 7 | -93/+235
* gl/modules/mbbuf: New file. * gl/lib/mbbuf.c: Likewise. * gl/lib/mbbuf.h: Likewise. * gl/local.mk (EXTRA_DIST): Add the new files. * bootstrap.conf (gnulib_modules): Add mbbuf. * src/fold.c: Include mbbuf.h. (fold_file): Use the mbbuf functions instead of calling fread and handling the input buffer ourselves. * cfg.mk (exclude_file_name_regexp--sc_preprocessor_indentation) (exclude_file_name_regexp--sc_GPL_version): Match gl/lib/mbbuf.c and gl/lib/mbbuf.h.
2025-09-30 | wc: add AVX512 function for line counting | Mathieu Bordere | 7 | -6/+137
* configure.ac: Add detection of AVX512 intrinsics for wc. * src/local.mk: Build AVX512 wc libraries. * src/wc.c: Add runtime detection of AVX512 intrinsics and call appropriate function when detected. * src/wc.h (wc_lines_avx512): Declare function. * tests/wc/wc-cpu.sh: Add a test that disables AVX512 intrinsics. * src/wc_avx512.c: New file containing the wc -l implementation using AVX512. The logic and code is reused from the AVX2 implementation with slight adaptations. Replaced __builtin_popcount by __builtin_popcountll and the combination of _mm256_cmpeq_epi8 and _mm256_movemask_epi8 by a single call to _mm512_cmpeq_epi8_mask. * NEWS: Mention the improvement.
2025-09-28 | maint: update valgrind instructions | Pádraig Brady | 1 | -16/+20
* README-valgrind: Adjust to current repo structure, and give clearer step by step instructions.
2025-09-27 | maint: convert some overflow checks to ckd_add and ckd_mul | Collin Funk | 5 | -20/+7
* src/csplit.c (parse_repeat_count): Prefer ckd_add when checking for overflows. * src/install.c (get_ids): Likewise. * src/shred.c (dopass): Likewise. * src/tr.c (get_spec_stats): Likewise. * src/sort.c (specify_sort_size): Prefer ckd_mul when checking for overflows.
2025-09-26 | tests: test: ensure file operations are covered | Pádraig Brady | 2 | -0/+41
A coverage report indicated these weren't tested (as generally the test shell builtin is used). * tests/test/test-file.sh: Add a new test. * tests/local.mk: Reference the new test.
2025-09-25 | join: remove unused #include "argmatch.h" | Bernhard Voelker | 1 | -1/+0
Prompted by the syntax-check failure:
  maint.mk: the above files include argmatch.h but don't use it
  make: *** [maint.mk:741: sc_prohibit_argmatch_without_use] Error 1
* src/join.c: Remove include, as the previous commit changed from using ARRAY_CARDINALITY to countof - for which system.h includes the header.
2025-09-24 | maint: prefer countof over ARRAY_CARDINALITY | Collin Funk | 14 | -28/+23
* bootstrap.conf (gnulib_modules): Add stdcountof-h. * src/system.h: Include stdcountof.h. (ARRAY_CARDINALITY): Remove definition. * .gitignore (/lib/stdcountof.h): Ignore Gnulib generated file. * src/csplit.c: Use countof instead of ARRAY_CARDINALITY. * src/df.c: Likewise. * src/digest.c: Likewise. * src/dircolors.c: Likewise. * src/factor.c: Likewise. * src/join.c: Likewise. * src/ls.c: Likewise. * src/od.c: Likewise. * src/sort.c: Likewise. * src/stdbuf.c: Likewise. * src/tr.c: Likewise.
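For context, an element-count macro of the kind being replaced can be sketched as below. `EXAMPLE_COUNTOF`, `suffixes`, and `suffix_count` are hypothetical names used only for illustration; the sketch uses a plain sizeof division rather than <stdcountof.h> so that it builds without Gnulib or a compiler that provides that header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for countof: number of elements in array A.
   Only valid for true arrays, not for pointers.  */
#define EXAMPLE_COUNTOF(a) (sizeof (a) / sizeof (a)[0])

static char const *const suffixes[] = { "K", "M", "G", "T" };

/* Compile-time check that the macro sees all four entries.  */
static_assert (EXAMPLE_COUNTOF (suffixes) == 4, "unexpected table size");

/* Return the number of entries in the table above.  */
static size_t
suffix_count (void)
{
  return EXAMPLE_COUNTOF (suffixes);
}
```

Using a standard-named `countof` instead of a project-local macro means each source file no longer depends on a definition in src/system.h.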
2025-09-24 | tail: fix tailing larger number of lines in regular files | Hannes Braun | 4 | -1/+34
* src/tail.c (file_lines): Seek to the previous block instead of the beginning (or a little before) of the block that was just scanned. Otherwise, the same block is read and scanned (at least partially) again. This bug was introduced by commit v9.7-219-g976f8abc1. * tests/tail/basic-seek.sh: Add a new test. * tests/local.mk: Reference the new test. * NEWS: Mention the bug fix.
2025-09-24 | tests: wc: fix hardware acceleration disabling test | Pádraig Brady | 1 | -1/+1
* tests/wc/wc-cpu.sh: The message is only printed with wc -l. Reported by Mathieu Borderé.
2025-09-24 | build: copy: add dependency on $(LIB_SMACK) | Pádraig Brady | 1 | -0/+1
* src/local.mk: Due to gnulib adjustments, this explicit dependency is required with the mold linker at least. Reported at https://github.com/coreutils/coreutils/issues/113
2025-09-23 | basenc: --base58: fix buffer overflow with input > 15MB | Pádraig Brady | 4 | -18/+58
base58_length() operated naively on an int, which overflowed to a negative number for any input > (2^31-1)/138, i.e. 15,561,475 bytes. * src/basenc.c (base_length): Change input and output parameter types from int to idx_t since this needs to cater for the full input size in the base58 case. (base58_length): Likewise. Also reorder the calculation to do the division first, which is less exact but minimizes the chance of overflow (which on 64-bit systems would now only happen for inputs larger than about 6 exabytes). * tests/basenc/basenc-large.sh: Add a new test, that triggers with valgrind or ASAN. * tests/local.mk: Reference the new test. * NEWS: Mention the bug fix.
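The arithmetic behind the 15,561,475 byte threshold, and the effect of dividing first, can be sketched as follows. The exact formula used in src/basenc.c is not shown in this log, so `base58_len_div_first` is a hypothetical over-estimate chosen only to illustrate the reordering; the threshold assumes a 32-bit int.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Largest input length for which LEN * 138 still fits in an int.
   With a 32-bit int this is (2^31 - 1) / 138.  */
static int
max_safe_len_int (void)
{
  return INT_MAX / 138;
}

/* Hypothetical division-first bound on the base58 output size:
   rounds LEN up to the next multiple of 100 before scaling by 138,
   so it is less exact than LEN * 138 / 100 but the intermediate
   product stays small.  In a 64-bit type it overflows only for
   LEN above roughly 6.6e18.  */
static int64_t
base58_len_div_first (int64_t len)
{
  assert (len >= 0);
  return (len / 100 + 1) * 138;
}
```

For example, an input of 16,000,000 bytes is past the int threshold, since 16,000,000 * 138 exceeds 2^31-1, while the division-first form in a 64-bit type gives 22,080,138, slightly above the exact 22,080,000.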
2025-09-23 | doc: document gl_cv_crc_pclmul to control hardware acceleration | Pádraig Brady | 1 | -3/+4
* doc/coreutils.texi (Hardware acceleration configuration): Sort the list and add "gl_cv_crc_pclmul".