| Age | Commit message (Collapse) | Author | Lines |
|
cancel_work_sync() is a sleeping function so it cannot be called with
the spin lock of a port being held. Move the call to this function in
ata_port_detach() after EH completes, with the port lock released,
together with other work cancellation calls.
Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
|
|
A deferred qc may timeout while waiting for the device queue to drain
to be submitted. In such case, since the qc is not active,
ata_scsi_cmd_error_handler() ends up calling scsi_eh_finish_cmd(),
which frees the qc. But as the port deferred_qc field still references
this finished/freed qc, the deferred qc work may eventually attempt to
call ata_qc_issue() against this invalid qc, leading to errors such as
reported by UBSAN (syzbot run):
UBSAN: shift-out-of-bounds in drivers/ata/libata-core.c:5166:24
shift exponent 4210818301 is too large for 64-bit type 'long long unsigned int'
...
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120
ubsan_epilogue+0xa/0x30 lib/ubsan.c:233
__ubsan_handle_shift_out_of_bounds+0x279/0x2a0 lib/ubsan.c:494
ata_qc_issue.cold+0x38/0x9f drivers/ata/libata-core.c:5166
ata_scsi_deferred_qc_work+0x154/0x1f0 drivers/ata/libata-scsi.c:1679
process_one_work+0x9d7/0x1920 kernel/workqueue.c:3275
process_scheduled_works kernel/workqueue.c:3358 [inline]
worker_thread+0x5da/0xe40 kernel/workqueue.c:3439
kthread+0x370/0x450 kernel/kthread.c:467
ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
Fix this by checking if the qc of a timed out SCSI command is a deferred
one, and in such case, clear the port deferred_qc field and finish the
SCSI command with DID_TIME_OUT.
Reported-by: syzbot+1f77b8ca15336fff21ff@syzkaller.appspotmail.com
Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
|
|
This converts some of the visually simpler cases that have been split
over multiple lines. I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.
Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script. I probably had made it a bit _too_ trivial.
So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.
The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is the exact same thing as the 'alloc_obj()' version, only much
smaller because there are a lot fewer users of the *alloc_flex()
interface.
As with alloc_obj() version, this was done entirely with mindless brute
force, using the same script, except using 'flex' in the pattern rather
than 'objs*'.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ATA updates from Damien Le Moal:
- Cleanup IRQ masking in the handling of completed report zones
commands (Niklas)
- Improve the handling of Thunderbolt attached devices to speed up
device removal (Henry)
- Several patches to generalize the existing max_sec quirks to
facilitates quirking the maximum command size of buggy drives, many
of which have recently showed up with the recent increase of the
default max_sectors block limit (Niklas)
- Cleanup the ahci-platform and sata dt-bindings schema (Rob,
Manivannan)
- Improve device node scan in the ahci-dwc driver (Krzysztof)
- Remove clang W=1 warnings with the ahci-imx and ahci-xgene drivers
(Krzysztof)
- Fix a long standing potential command starvation situation with
non-NCQ commands issued when NCQ commands are on-going (me)
- Limit max_sectors to 8191 on the INTEL SSDSC2KG480G8 SSD (Niklas)
- Remove Vesa Local Bus (VLB) support in the pata_legacy driver (Ethan)
- Simple fixes in the pata_cypress (typo) and pata_ftide010 (timing)
drivers (Ethan, Linus W)
* tag 'ata-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: pata_ftide010: Fix some DMA timings
ata: pata_cypress: fix typo in error message
ata: pata_legacy: remove VLB support
ata: libata-core: Quirk INTEL SSDSC2KG480G8 max_sectors
dt-bindings: ata: sata: Document the graph port
ata: libata-scsi: avoid Non-NCQ command starvation
ata: libata-scsi: refactor ata_scsi_translate()
ata: ahci-xgene: Fix Wvoid-pointer-to-enum-cast warning
ata: ahci-imx: Fix Wvoid-pointer-to-enum-cast warning
ata: ahci-dwc: Simplify with scoped for each OF child loop
dt-bindings: ata: ahci-platform: Drop unnecessary select schema
ata: libata: Allow more quirks
ata: libata: Add libata.force parameter max_sec
ata: libata: Add support to parse equal sign in libata.force
ata: libata: Change libata.force to use the generic ATA_QUIRK_MAX_SEC quirk
ata: libata: Add ata_force_get_fe_for_dev() helper
ata: libata: Add ATA_QUIRK_MAX_SEC and convert all device quirks
ata: libata: avoid long timeouts on hot-unplugged SATA DAS
ata: libata-scsi: Remove superfluous local_irq_save()
|
|
Pull SCSI updates from James Bottomley:
"Usual driver updates (qla2xxx, mpi3mr, mpt3sas, ufs) plus assorted
cleanups and fixes.
The biggest core change is the massive code motion in the sd driver to
remove forward declarations and the most significant change is to
enumify the queuecommand return"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (78 commits)
scsi: csiostor: Fix dereference of null pointer rn
scsi: buslogic: Reduce stack usage
scsi: ufs: host: mediatek: Require CONFIG_PM
scsi: ufs: mediatek: Fix page faults in ufs_mtk_clk_scale() trace event
scsi: smartpqi: Fix memory leak in pqi_report_phys_luns()
scsi: mpi3mr: Make driver probing asynchronous
scsi: ufs: core: Flush exception handling work when RPM level is zero
scsi: efct: Use IRQF_ONESHOT and default primary handler
scsi: ufs: core: Use a host-wide tagset in SDB mode
scsi: qla2xxx: target: Add WQ_PERCPU to alloc_workqueue() users
scsi: qla2xxx: Add WQ_PERCPU to alloc_workqueue() users
scsi: qla4xxx: Add WQ_PERCPU to alloc_workqueue() users
scsi: mpi3mr: Driver version update to 8.17.0.3.50
scsi: mpi3mr: Fixed the W=1 compilation warning
scsi: mpi3mr: Record and report controller firmware faults
scsi: mpi3mr: Update MPI Headers to revision 39
scsi: mpi3mr: Use negotiated link rate from DevicePage0
scsi: mpi3mr: Avoid redundant diag-fault resets
scsi: mpi3mr: Rename log data save helper to reflect threaded/BH context
scsi: mpi3mr: Add module parameter to control threaded IRQ polling
...
|
|
The FTIDE010 has been missing some timing settings since its
inception, since the upstream OpenWrt patch was missing these.
The community has since come up with the appropriate timings.
Fixes: be4e456ed3a5 ("ata: Add driver for Faraday Technology FTIDE010")
Cc: stable@vger.kernel.org
Signed-off-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
An error message in the pata_cypress driver has a typo "mome" for
"mode". Fix it.
Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
This significantly reduces the complexity of the pata_legacy driver.
The VLB bus is very obsolete and last appeared on P5 Pentium-era
hardware. Support for it has been removed from other drivers, and
it is highly unlikely anyone is using it with modern Linux kernels.
Some of these chips were integrated on motherboards, but they
seem to have all been 486-era boards, which are equally obsolete.
Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
Commit 9b8b84879d4a ("block: Increase BLK_DEF_MAX_SECTORS_CAP") increased
the default max_sectors_kb from 1280 KiB to 4096 KiB.
INTEL SSDSC2KG480G8 with FW rev XCV10120 times out when sending I/Os of
size 4096 KiB.
Enable ATA_QUIRK_MAX_SEC, with value 8191 (sectors) for this device,
since any I/O with more sectors than that lead to I/O timeouts.
With this, the INTEL SSDSC2KG480G8 is usable again.
Link: https://lore.kernel.org/linux-ide/176839089913.2398366.61500945766820256@eldamar.lan/
Fixes: 9b8b84879d4a ("block: Increase BLK_DEF_MAX_SECTORS_CAP")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
In clang version 21.1 and later the -Wimplicit-enum-enum-cast warning
option has been introduced. This warning is enabled by default and can
be used to catch .queuecommand() implementations that return another
value than 0 or one of the SCSI_MLQUEUE_* constants. Hence this patch
that changes the return type of the .queuecommand() implementations from
'int' into 'enum scsi_qc_status'. No functionality has been changed.
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260115210357.2501991-6-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
When a non-NCQ command is issued while NCQ commands are being executed,
ata_scsi_qc_issue() indicates to the SCSI layer that the command issuing
should be deferred by returning SCSI_MLQUEUE_XXX_BUSY. This command
deferring is correct and as mandated by the ACS specifications since
NCQ and non-NCQ commands cannot be mixed.
However, in the case of a host adapter using multiple submission queues,
when the target device is under a constant load of NCQ commands, there
are no guarantees that requeueing the non-NCQ command will be executed
later and it may be deferred again repeatedly as other submission queues
can constantly issue NCQ commands from different CPUs ahead of the
non-NCQ command. This can lead to very long delays for the execution of
non-NCQ commands, and even complete starvation for these commands in the
worst case scenario.
Since the block layer and the SCSI layer do not distinguish between
queueable (NCQ) and non queueable (non-NCQ) commands, libata-scsi SAT
implementation must ensure forward progress for non-NCQ commands in the
presence of NCQ command traffic. This is similar to what SAS HBAs with a
hardware/firmware based SAT implementation do.
Implement such forward progress guarantee by limiting requeueing of
non-NCQ commands from ata_scsi_qc_issue(): when a non-NCQ command is
received and NCQ commands are in-flight, do not force a requeue of the
non-NCQ command by returning SCSI_MLQUEUE_XXX_BUSY and instead return 0
to indicate that the command was accepted but hold on to the qc using
the new deferred_qc field of struct ata_port.
This deferred qc will be issued using the work item deferred_qc_work
running the function ata_scsi_deferred_qc_work() once all in-flight
commands complete, which is checked with the port qc_defer() callback
return value indicating that no further delay is necessary. This check
is done using the helper function ata_scsi_schedule_deferred_qc() which
is called from ata_scsi_qc_complete(). This thus excludes this mechanism
from all internal non-NCQ commands issued by ATA EH.
When a port deferred_qc is non NULL, that is, the port has a command
waiting for the device queue to drain, the issuing of all incoming
commands (both NCQ and non-NCQ) is deferred using the regular busy
mechanism. This simplifies the code and also avoids potential denial of
service problems if a user issues too many non-NCQ commands.
Finally, whenever ata EH is scheduled, regardless of the reason, a
deferred qc is always requeued so that it can be retried once EH
completes. This is done by calling the function
ata_scsi_requeue_deferred_qc() from ata_eh_set_pending(). This avoids
the need for any special processing for the deferred qc in case of NCQ
error, link or device reset, or device timeout.
Reported-by: Xingui Yang <yangxingui@huawei.com>
Reported-by: Igor Pylypiv <ipylypiv@google.com>
Fixes: bdb01301f3ea ("scsi: Add host and host template flag 'host_tagset'")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Tested-by: Igor Pylypiv <ipylypiv@google.com>
Tested-by: Xingui Yang <yangxingui@huawei.com>
|
|
Commit d633b8a702ab ("libata: print feature list on device scan")
added a print of the features supported by the device for ATA_DEV_ATA and
ATA_DEV_ZAC devices, but not for ATA_DEV_ATAPI devices.
Fix this by printing the features also for ATAPI devices.
Before changes:
ata1.00: ATAPI: Slimtype DVD A DU8AESH, 6C2M, max UDMA/133
After changes:
ata1.00: ATAPI: Slimtype DVD A DU8AESH, 6C2M, max UDMA/133
ata1.00: Features: Dev-Attention HIPM DIPM
Fixes: d633b8a702ab ("libata: print feature list on device scan")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Tested-by: Wolf <wolf@yoxt.cc>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
ata_dev_print_features() is supposed to return early and not print anything
if there are no features supported.
However, commit b1f5af54f1f5 ("ata: libata-core: Advertize device support
for DIPM and HIPM features") added additional features to
ata_dev_print_features() without updating the early return conditional.
Add the missing features to the early return conditional.
Fixes: b1f5af54f1f5 ("ata: libata-core: Advertize device support for DIPM and HIPM features")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Tested-by: Wolf <wolf@yoxt.cc>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
ata_dev_print_features() is supposed to return early and not print anything
if there are no features supported.
However, commit fe22e1c2f705 ("libata: support concurrent positioning
ranges log") added another feature to ata_dev_print_features() without
updating the early return conditional.
Add the missing feature to the early return conditional.
Fixes: fe22e1c2f705 ("libata: support concurrent positioning ranges log")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Tested-by: Wolf <wolf@yoxt.cc>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
The link_power_management_supported sysfs attribute is currently set as
true even for ata ports that lack a .set_lpm() callback, e.g. dummy ports.
This is a bit silly, because while writing to the
link_power_management_policy sysfs attribute will make ata_scsi_lpm_store()
update ap->target_lpm_policy (thus sysfs will reflect the new value) and
call ata_port_schedule_eh() for the port, it is essentially a no-op.
This is because for a port without a .set_lpm() callback, once EH gets to
run, the ata_eh_link_set_lpm() will simply return, since the port does not
provide a .set_lpm() callback.
Thus, make sure that the link_power_management_supported sysfs attribute
is set to false for ports that lack a .set_lpm() callback. This way the
link_power_management_policy sysfs attribute will no longer be writable,
so we will no longer be misleading users to think that their sysfs write
actually does something.
Fixes: 0060beec0bfa ("ata: libata-sata: Add link_power_management_supported sysfs attribute")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Tested-by: Wolf <wolf@yoxt.cc>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
Commit d360121832d8 ("ata: libata-core: Introduce ata_dev_config_lpm()")
introduced ata_dev_config_lpm(). However, it only called this function for
ATA_DEV_ATA and ATA_DEV_ZAC devices, not for ATA_DEV_ATAPI devices.
Additionally, commit d99a9142e782 ("ata: libata-core: Move device LPM quirk
settings to ata_dev_config_lpm()") moved the LPM quirk application from
ata_dev_configure() to ata_dev_config_lpm(), causing LPM quirks for ATAPI
devices to no longer be applied.
Call ata_dev_config_lpm() also for ATAPI devices, such that LPM quirks are
applied for ATAPI devices with an entry in __ata_dev_quirks once again.
Fixes: d360121832d8 ("ata: libata-core: Introduce ata_dev_config_lpm()")
Fixes: d99a9142e782 ("ata: libata-core: Move device LPM quirk settings to ata_dev_config_lpm()")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Tested-by: Wolf <wolf@yoxt.cc>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
An AHCI HBA specifies the number of ports it supports using CAP.NP.
The HBA is free to only make a subset of the number of ports available
using the PI (Ports Implemented) register.
libata currently creates dummy ports for HBA ports that are provided by
the HBA, but which are marked as "unavailable" using the PI register.
Each port will have a per port area of registers in the HBA, regardless
if the port is marked as "unavailable" or not.
ahci_mark_external_port() currently reads this per port area of registers
using readl() to see if the port is marked as external/hotplug-capable.
However, AHCI 1.3.1, section "3.1.4 Offset 0Ch: PI – Ports Implemented"
states: "Software must not read or write to registers within unavailable
ports."
Thus, make sure that we only call ahci_mark_external_port() and
ahci_update_initial_lpm_policy() for ports that are implemented.
From a libata perspective, this should not change anything related to LPM,
as dummy ports do not provide any ap->ops (they do not have a .set_lpm()
callback), so even if EH were to call .set_lpm() on a dummy port, it was
already a no-op.
Fixes: f7131935238d ("ata: ahci: move marking of external port earlier")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Tested-by: Wolf <wolf@yoxt.cc>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
Factor out of ata_scsi_translate() the code handling queued command
deferral using the port qc_defer callback and issuing the queued
command with ata_qc_issue() into the new function ata_scsi_qc_issue(),
and simplify the goto used in ata_scsi_translate().
While at it, also add a lockdep annotation to check that the port lock
is held when ata_scsi_translate() is called.
No functional changes.
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
|
|
"version" is an enum, thus cast of pointer on 64-bit compile test with
clang W=1 causes:
ahci_xgene.c:776:13: error: cast to smaller integer type 'enum xgene_ahci_version' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast]
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
"imxpriv->type" is an enum, thus cast of pointer on 64-bit compile test
with clang W=1 causes:
ahci_imx.c:872:18: error: cast to smaller integer type 'enum ahci_imx_type' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast]
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
Use scoped for-each loop when iterating over device nodes and switch to
iterating already over available nodes to make code a bit simpler.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
According to a user report, the ST2000DM008-2FR102 has problems with LPM.
Reported-by: Emerson Pinter <e@pinter.dev>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220693
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
We have currently used up 30 out of the 32-bits in the struct ata_device
struct member quirks. Thus, it is only possible to add two more quirks.
Change the struct ata_device struct member quirks from an unsigned int to
an u64.
Doing this core level change now, will make it easier for us now, as we
will not need to also do core level changes once the final two bits are
used as well.
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
Add a new libata.force parameter called max_sec.
The parameter can take an arbitrary value using the format:
libata.force=max_sec=<number of 512B sectors>
e.g. libata.force=max_sec=8191
or
libata.force=max_sec=2048
This will allow the user to set an arbitrary maximum command size
(dev->max_sectors) using libata.force.
We cannot remove the existing libata.force parameters "max_sec_128" and
"max_sec_1024", as these are a part of the exising user facing API.
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
Currently, no libata.force parameter supports an arbitrary value.
All allowed values, e.g. udma/16, udma/25, udma/33, udma/44, udma/66,
udma/100, udma/133 have hardcoded entries in the force_tbl table.
Add code to allow a libata.force param with the format
libata.force=param=param_value, where param_value can be an arbitrary
value.
This code will be used in a follow up commit.
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
Modify the existing libata.force parameters "max_sec_128" and
"max_sec_1024" to use the generic ATA_QUIRK_MAX_SEC quirk rather than
individual quirks.
This also allows us to remove the individual quirks ATA_QUIRK_MAX_SEC_128
and ATA_QUIRK_MAX_SEC_1024.
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
Add ata_force_get_fe_for_dev() helper to get the struct ata_force_ent for
a struct ata_device.
Use the helper in ata_force_quirks().
The helper will also be used in follow up commits.
No functional change intended.
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
Add a new quirk ATA_QUIRK_MAX_SEC, which has a separate table with device
specific values.
Convert all existing ATA_QUIRK_MAX_SEC_XXX device quirks in
__ata_dev_quirks to the new format.
Quirks ATA_QUIRK_MAX_SEC_128 and ATA_QUIRK_MAX_SEC_1024 cannot be removed
yet, since they are also used by libata.force, which functionally, is a
separate user of the quirks. The quirks will be removed once all users
have been converted to use the new format.
The quirk ATA_QUIRK_MAX_SEC_8191 can be removed since it has no equivalent
libata.force parameter.
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
When a SATA DAS enclosure is connected behind a Thunderbolt PCIe
switch, hot-unplugging the whole enclosure causes pciehp to tear down
the PCI hierarchy before the SCSI layer issues SYNCHRONIZE CACHE and
START STOP UNIT for the disks.
libata still queues these commands and the AHCI driver tries to access
the HBA registers even though the PCI channel is already offline. This
results in a series of timeouts and error recovery attempts, e.g.:
[ 824.778346] pcieport 0000:00:07.0: pciehp: Slot(14): Link Down
[ 891.612720] ata8.00: qc timeout after 5000 msecs (cmd 0xec)
[ 902.876501] ata8.00: qc timeout after 10000 msecs (cmd 0xec)
[ 934.107998] ata8.00: qc timeout after 30000 msecs (cmd 0xec)
[ 936.206431] sd 7:0:0:0: [sda] Synchronize Cache(10) failed:
Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
...
[ 1006.298356] ata1.00: qc timeout after 5000 msecs (cmd 0xec)
[ 1017.561926] ata1.00: qc timeout after 10000 msecs (cmd 0xec)
[ 1048.791790] ata1.00: qc timeout after 30000 msecs (cmd 0xec)
[ 1050.890035] sd 0:0:0:0: [sdb] Synchronize Cache(10) failed:
Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
With this patch applied, the same hot-unplug looks like:
[ 59.965496] pcieport 0000:00:07.0: pciehp: Slot(14): Link Down
[ 60.002502] sd 7:0:0:0: [sda] Synchronize Cache(10) failed:
Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
...
[ 60.103050] sd 0:0:0:0: [sdb] Synchronize Cache(10) failed:
Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
In this test setup with two disks, the hot-unplug sequence shrinks from
about 226 seconds (~3.8 minutes) between the Link Down event and the
last SYNCHRONIZE CACHE failure to under a second. Without this patch the
total delay grows roughly with the number of disks, because each disk
gets its own SYNCHRONIZE CACHE and qc timeout series.
If the underlying PCI device is already gone, these commands cannot
succeed anyway. Avoid issuing them by introducing
ata_adapter_is_online(), which checks pci_channel_offline() for
PCI-based hosts. It is used from ata_scsi_find_dev() to return NULL,
causing the SCSI layer to fail new commands with DID_BAD_TARGET
immediately, and from ata_qc_issue() to bail out before touching the
HBA registers.
Since such failures would otherwise trigger libata error handling,
ata_adapter_is_online() is also consulted from ata_scsi_port_error_handler().
When the adapter is offline, libata skips ap->ops->error_handler(ap) and
completes error handling using the existing path, rather than running
a full EH sequence against a dead adapter.
With this change, SYNCHRONIZE CACHE and START STOP UNIT commands
issued during hot-unplug fail quickly once the PCI channel is offline,
without qc timeout spam or long libata EH delays.
Suggested-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Henry Tseng <henrytseng@qnap.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
Commit 28a3fc2295a7 ("libata: implement ZBC IN translation") added
ata_scsi_report_zones_complete(). Since the beginning, this function
has disabled IRQs on the local CPU using local_irq_save().
qc->complete_fn is always called with ap->lock held, and the ap->lock
is always taken using spin_lock_irq*().
Thus, this local_irq_save() is superfluous and can be removed.
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fix from Niklas Cassel:
- The DELLBOSS VD SATA controller times out when sending I/Os of size
4096 KiB or larger, even though it claims LBA48 support, which per
the ACS standard requires support for a maximum command size of
65535 sectors, i.e. 32 MiB - 512. Thus, quirk the device so that it
sets a lower maximum command size (me)
* tag 'ata-6.19-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: libata-core: Quirk DELLBOSS VD max_sectors
ata: libata: Move quirk flags to their own enum
|
|
Pull SCSI updates from James Bottomley:
"Usual driver updates (ufs, lpfc, target, qla2xxx) plus assorted
cleanups and fixes including the WQ_PERCPU series.
The biggest core change is the new allocation of pseudo-devices which
allow the sending of internal commands to a given SCSI target"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (147 commits)
scsi: MAINTAINERS: Add the UFS include directory
scsi: scsi_debug: Support injecting unaligned write errors
scsi: qla2xxx: Fix improper freeing of purex item
scsi: ufs: rockchip: Fix compile error without CONFIG_GPIOLIB
scsi: ufs: rockchip: Reset controller on PRE_CHANGE of hce enable notify
scsi: ufs: core: Use scsi_device_busy()
scsi: ufs: core: Fix single doorbell mode support
scsi: pm80xx: Add WQ_PERCPU to alloc_workqueue() users
scsi: target: Add WQ_PERCPU to alloc_workqueue() users
scsi: qedi: Add WQ_PERCPU to alloc_workqueue() users
scsi: target: ibmvscsi: Add WQ_PERCPU to alloc_workqueue() users
scsi: qedf: Add WQ_PERCPU to alloc_workqueue() users
scsi: bnx2fc: Add WQ_PERCPU to alloc_workqueue() users
scsi: be2iscsi: Add WQ_PERCPU to alloc_workqueue() users
scsi: message: fusion: Add WQ_PERCPU to alloc_workqueue() users
scsi: lpfc: WQ_PERCPU added to alloc_workqueue() users
scsi: scsi_transport_fc: WQ_PERCPU added to alloc_workqueue users()
scsi: scsi_dh_alua: WQ_PERCPU added to alloc_workqueue() users
scsi: qla2xxx: WQ_PERCPU added to alloc_workqueue() users
scsi: target: sbp: Replace use of system_unbound_wq with system_dfl_wq
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata updates from Niklas Cassel:
- Add DT binding for the Eswin EIC7700 SoC SATA Controller (Yulin Lu)
- Allow 'iommus' property in the Synopsys DWC AHCI SATA controller DT
binding (Rob Herring)
- Replace deprecated strcpy with strscpy in the pata_it821x driver
(Thorsten Blum)
- Add Iomega Clik! PCMCIA ATA/ATAPI Adapter PCMCIA ID to the
pata_pcmcia driver (René Rebe)
- Add ATA_QUIRK_NOLPM quirk for two Silicon Motion SSDs with broken LPM
support (me)
- Add flag WQ_PERCPU to the workqueue in the libata-sff helper library
to explicitly request the use of the per-CPU behavior (Marco
Crivellari)
* tag 'ata-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: libata-core: Disable LPM on Silicon Motion MD619{H,G}XCLDE3TC
ata: pata_pcmcia: Add Iomega Clik! PCMCIA ATA/ATAPI Adapter
ata: libata-sff: add WQ_PERCPU to alloc_workqueue users
dt-bindings: ata: snps,dwc-ahci: Allow 'iommus' property
ata: pata_it821x: Replace deprecated strcpy with strscpy in it821x_display_disk
dt-bindings: ata: eswin: Document for EIC7700 SoC ahci
|
|
Commit 9b8b84879d4a ("block: Increase BLK_DEF_MAX_SECTORS_CAP") increased
the default max_sectors_kb from 1280 KiB to 4096 KiB.
DELLBOSS VD with FW rev MV.R00-0 times out when sending I/Os of size
4096 KiB.
Enable ATA_QUIRK_MAX_SEC, with value 8191 (sectors) for this device,
since any I/O with more sectors than that lead to I/O timeouts.
With this, the DELLBOSS VD SATA controller is usable again.
Cc: stable+noautosel@kernel.org # depends on Move quirk flags to their own enum
Fixes: 9b8b84879d4a ("block: Increase BLK_DEF_MAX_SECTORS_CAP")
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
According to a user report, the Silicon Motion MD619HXCLDE3TC SSD and
the Silicon Motion MD619GXCLDE3TC SSD have problems with LPM.
Reported-by: Yihang Li <liyihang9@h-partners.com>
Closes: https://lore.kernel.org/linux-ide/20251121073502.3388239-1-liyihang9@h-partners.com/
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
Bean Huo <beanhuo@iokpp.de> says:
This patch series introduces OP-TEE based RPMB (Replay Protected
Memory Block) support for UFS devices, extending the kernel-level
secure storage capabilities that are currently available for eMMC
devices.
Previously, OP-TEE required a userspace supplicant to access RPMB
partitions, which created complex dependencies and reliability issues,
especially during early boot scenarios. Recent work by Linaro has
moved core supplicant functionality directly into the Linux kernel for
eMMC devices, eliminating userspace dependencies and enabling
immediate secure storage access. This series extends the same approach
to UFS devices, which are used in enterprise and mobile applications
that require secure storage capabilities.
Benefits:
- Eliminates dependency on userspace supplicant for UFS RPMB access
- Enables early boot secure storage access (e.g., fTPM, secure UEFI
variables)
- Provides kernel-level RPMB access as soon as UFS driver is
initialized
- Removes complex initramfs dependencies and boot ordering
requirements
- Ensures reliable and deterministic secure storage operations
- Supports both built-in and modular fTPM configurations.
Prerequisites:
--------------
This patch series depends on commit 7e8242405b94 ("rpmb: move struct
rpmb_frame to common header") which has been merged into mainline
v6.18-rc2.
Link: https://patch.msgid.link/20251107230518.4060231-1-beanhuo@iokpp.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
For Security locked drives (drives that have Security enabled, and have
not been Security unlocked by boot firmware), the automatic partition
scanning will result in the user being spammed with errors such as:
ata5.00: failed command: READ DMA
ata5.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 7 dma 4096 in
res 51/04:08:00:00:00/00:00:00:00:00/e0 Emask 0x1 (device error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { ABRT }
sd 4:0:0:0: [sda] tag#7 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
sd 4:0:0:0: [sda] tag#7 Sense Key : Aborted Command [current]
sd 4:0:0:0: [sda] tag#7 Add. Sense: No additional sense information
during boot, because most commands except for IDENTIFY will be aborted by
a Security locked drive.
For a Security locked drive, set capacity to zero, so that no automatic
partition scanning will happen.
If the user later unlocks the drive using e.g. hdparm, the close() by the
user space application should trigger a revalidation of the drive.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
Commit cf3fc037623c ("ata: libata-scsi: Fix ata_to_sense_error() status
handling") fixed ata_to_sense_error() to properly generate sense key
ABORTED COMMAND (without any additional sense code), instead of the
previous bogus sense key ILLEGAL REQUEST with the additional sense code
UNALIGNED WRITE COMMAND, for a failed command.
However, this broke suspend for Security locked drives (drives that have
Security enabled, and have not been Security unlocked by boot firmware).
The reason for this is that the SCSI disk driver, for the Synchronize
Cache command only, treats any sense data with sense key ILLEGAL REQUEST
as a successful command (regardless of ASC / ASCQ).
After commit cf3fc037623c ("ata: libata-scsi: Fix ata_to_sense_error()
status handling") the code that treats any sense data with sense key
ILLEGAL REQUEST as a successful command is no longer applicable, so the
command fails, which causes the system suspend to be aborted:
sd 1:0:0:0: PM: dpm_run_callback(): scsi_bus_suspend returns -5
sd 1:0:0:0: PM: failed to suspend async: error -5
PM: Some devices failed to suspend, or early wake event detected
To make suspend work once again, for a Security locked device only,
return sense data LOGICAL UNIT ACCESS NOT AUTHORIZED, the actual sense
data which a real SCSI device would have returned if locked.
The SCSI disk driver treats this sense data as a successful command.
Cc: stable@vger.kernel.org
Reported-by: Ilia Baryshnikov <qwelias@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220704
Fixes: cf3fc037623c ("ata: libata-scsi: Fix ata_to_sense_error() status handling")
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
Call scsi_device_put() in ata_scsi_dev_rescan() if the device or its
queue are not running.
Fixes: 0c76106cb975 ("scsi: sd: Fix TCG OPAL unlock on system resume")
Cc: stable@vger.kernel.org
Signed-off-by: Yihang Li <liyihang9@h-partners.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
Add Iomega Clik! "PCMCIA ATA/ATAPI Adapter" ID to pata_pcmcia.
Signed-off-by: René Rebe <rene@exactco.de>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
Some embedded devices have the ability to control whether power is
provided to the disks via the SATA power connector or not. ACPI power
resources are usually off by default, thus making it unclear if the
specific power resource will retain its state after a restart. If power
resources are defined on ATA ports / devices in ACPI, we should stop the
disk on SYSTEM_RESTART, to ensure the disk will not lose power while
active.
Add a new function, ata_acpi_dev_manage_restart(), that will be used to
determine if a disk should be stopped before restarting the system. If a
usable ACPI power resource has been found, it is assumed that the disk
will lose power after a restart and should be stopped to avoid unclean
shutdown due to power loss.
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Markus Probst <markus.probst@posteo.de>
Link: https://patch.msgid.link/20251104142413.322347-4-markus.probst@posteo.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Some embedded devices have the ability to control whether power is
provided to the disks via the SATA power connector or not. If power
resources are defined on ATA ports / devices in ACPI, we should try to
set the power state to D0 before probing the disk to ensure that any
power supply or power gate that may exist is providing power to the
disk.
An example for such devices would be newer synology NAS devices. Every
disk slot has its own SATA power connector. Whether the connector is
providing power is controlled via an gpio, which is *off by default*.
Also the disk loses power on reboots.
Add a new function, ata_acpi_port_power_on(), that will be used to power
on the SATA power connector if usable ACPI power resources on the
associated ATA port / device are found. It will be called right before
probing the port, therefore the disk will be powered on just in time.
Signed-off-by: Markus Probst <markus.probst@posteo.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://patch.msgid.link/20251104142413.322347-3-markus.probst@posteo.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:
commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
strcpy() is deprecated; use strscpy() instead.
Replace the hard-coded buffer size 8 with sizeof(mbuf) when using
snprintf() while we're at it.
Link: https://github.com/KSPP/linux/issues/88
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
Commit 6d4405b16d37 ("ata: libata-core: Cache the general purpose log
directory") introduced caching of a device general purpose log directory
to avoid repeated access to this log page during device scan. This
change also added a check on this log page to verify that the log page
version is 0x0001 as mandated by the ACS specifications.
And it turns out that some devices do not bother reporting this version,
instead reporting a version 0, resulting in error messages such as:
ata6.00: Invalid log directory version 0x0000
and to the device being marked as not supporting the general purpose log
directory log page.
Since before commit 6d4405b16d37 the log page version check did not
exist and things were still working correctly for these devices, relax
ata_read_log_directory() version check and only warn about the invalid
log page version number without disabling access to the log directory
page.
Fixes: 6d4405b16d37 ("ata: libata-core: Cache the general purpose log directory")
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220635
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "mm, swap: improve cluster scan strategy" from Kairui Song improves
performance and reduces the failure rate of swap cluster allocation
- "support large align and nid in Rust allocators" from Vitaly Wool
permits Rust allocators to set NUMA node and large alignment when
perforning slub and vmalloc reallocs
- "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend
DAMOS_STAT's handling of the DAMON operations sets for virtual
address spaces for ops-level DAMOS filters
- "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren
Baghdasaryan reduces mmap_lock contention during reads of
/proc/pid/maps
- "mm/mincore: minor clean up for swap cache checking" from Kairui Song
performs some cleanup in the swap code
- "mm: vm_normal_page*() improvements" from David Hildenbrand provides
code cleanup in the pagemap code
- "add persistent huge zero folio support" from Pankaj Raghav provides
a block layer speedup by optionalls making the
huge_zero_pagepersistent, instead of releasing it when its refcount
falls to zero
- "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to
the recently added Kexec Handover feature
- "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo
Stoakes turns mm_struct.flags into a bitmap. To end the constant
struggle with space shortage on 32-bit conflicting with 64-bit's
needs
- "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap
code
- "selftests/mm: Fix false positives and skip unsupported tests" from
Donet Tom fixes a few things in our selftests code
- "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised"
from David Hildenbrand "allows individual processes to opt-out of
THP=always into THP=madvise, without affecting other workloads on the
system".
It's a long story - the [1/N] changelog spells out the considerations
- "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on
the memdesc project. Please see
https://kernelnewbies.org/MatthewWilcox/Memdescs and
https://blogs.oracle.com/linux/post/introducing-memdesc
- "Tiny optimization for large read operations" from Chi Zhiling
improves the efficiency of the pagecache read path
- "Better split_huge_page_test result check" from Zi Yan improves our
folio splitting selftest code
- "test that rmap behaves as expected" from Wei Yang adds some rmap
selftests
- "remove write_cache_pages()" from Christoph Hellwig removes that
function and converts its two remaining callers
- "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD
selftests issues
- "introduce kernel file mapped folios" from Boris Burkov introduces
the concept of "kernel file pages". Using these permits btrfs to
account its metadata pages to the root cgroup, rather than to the
cgroups of random inappropriate tasks
- "mm/pageblock: improve readability of some pageblock handling" from
Wei Yang provides some readability improvements to the page allocator
code
- "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON
to understand arm32 highmem
- "tools: testing: Use existing atomic.h for vma/maple tests" from
Brendan Jackman performs some code cleanups and deduplication under
tools/testing/
- "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes
a couple of 32-bit issues in tools/testing/radix-tree.c
- "kasan: unify kasan_enabled() and remove arch-specific
implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific
initialization code into a common arch-neutral implementation
- "mm: remove zpool" from Johannes Weiner removes zspool - an
indirection layer which now only redirects to a single thing
(zsmalloc)
- "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a
couple of cleanups in the fork code
- "mm: remove nth_page()" from David Hildenbrand makes rather a lot of
adjustments at various nth_page() callsites, eventually permitting
the removal of that undesirable helper function
- "introduce kasan.write_only option in hw-tags" from Yeoreum Yun
creates a KASAN read-only mode for ARM, using that architecture's
memory tagging feature. It is felt that a read-only mode KASAN is
suitable for use in production systems rather than debug-only
- "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does
some tidying in the hugetlb folio allocation code
- "mm: establish const-correctness for pointer parameters" from Max
Kellermann makes quite a number of the MM API functions more accurate
about the constness of their arguments. This was getting in the way
of subsystems (in this case CEPH) when they attempt to improving
their own const/non-const accuracy
- "Cleanup free_pages() misuse" from Vishal Moola fixes a number of
code sites which were confused over when to use free_pages() vs
__free_pages()
- "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the
mapletree code accessible to Rust. Required by nouveau and by its
forthcoming successor: the new Rust Nova driver
- "selftests/mm: split_huge_page_test: split_pte_mapped_thp
improvements" from David Hildenbrand adds a fix and some cleanups to
the thp selftesting code
- "mm, swap: introduce swap table as swap cache (phase I)" from Chris
Li and Kairui Song is the first step along the path to implementing
"swap tables" - a new approach to swap allocation and state tracking
which is expected to yield speed and space improvements. This
patchset itself yields a 5-20% performance benefit in some situations
- "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc
layer to clean up the ptdesc code a little
- "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some
issues in our 5-level pagetable selftesting code
- "Minor fixes for memory allocation profiling" from Suren Baghdasaryan
addresses a couple of minor issues in relatively new memory
allocation profiling feature
- "Small cleanups" from Matthew Wilcox has a few cleanups in
preparation for more memdesc work
- "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from
Quanmin Yan makes some changes to DAMON in furtherance of supporting
arm highmem
- "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad
Anjum adds that compiler check to selftests code and fixes the
fallout, by removing dead code
- "Improvements to Victim Process Thawing and OOM Reaper Traversal
Order" from zhongjinji makes a number of improvements in the OOM
killer: mainly thawing a more appropriate group of victim threads so
they can release resources
- "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park
is a bunch of small and unrelated fixups for DAMON
- "mm/damon: define and use DAMON initialization check function" from
SeongJae Park implement reliability and maintainability improvements
to a recently-added bug fix
- "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from
SeongJae Park provides additional transparency to userspace clients
of the DAMON_STAT information
- "Expand scope of khugepaged anonymous collapse" from Dev Jain removes
some constraints on khubepaged's collapsing of anon VMAs. It also
increases the success rate of MADV_COLLAPSE against an anon vma
- "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()"
from Lorenzo Stoakes moves us further towards removal of
file_operations.mmap(). This patchset concentrates upon clearing up
the treatment of stacked filesystems
- "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau
provides some fixes and improvements to mlock's tracking of large
folios. /proc/meminfo's "Mlocked" field became more accurate
- "mm/ksm: Fix incorrect accounting of KSM counters during fork" from
Donet Tom fixes several user-visible KSM stats inaccuracies across
forks and adds selftest code to verify these counters
- "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses
some potential but presently benign issues in KSM's mm_slot handling
* tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits)
mm: swap: check for stable address space before operating on the VMA
mm: convert folio_page() back to a macro
mm/khugepaged: use start_addr/addr for improved readability
hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
alloc_tag: fix boot failure due to NULL pointer dereference
mm: silence data-race in update_hiwater_rss
mm/memory-failure: don't select MEMORY_ISOLATION
mm/khugepaged: remove definition of struct khugepaged_mm_slot
mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL
hugetlb: increase number of reserving hugepages via cmdline
selftests/mm: add fork inheritance test for ksm_merging_pages counter
mm/ksm: fix incorrect KSM counter handling in mm_struct during fork
drivers/base/node: fix double free in register_one_node()
mm: remove PMD alignment constraint in execmem_vmalloc()
mm/memory_hotplug: fix typo 'esecially' -> 'especially'
mm/rmap: improve mlock tracking for large folios
mm/filemap: map entire large folio faultaround
mm/fault: try to map the entire file folio in finish_fault()
mm/rmap: mlock large folios in try_to_unmap_one()
mm/rmap: fix a mlock race condition in folio_referenced_one()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block updates from Jens Axboe:
- NVMe pull request via Keith:
- FC target fixes (Daniel)
- Authentication fixes and updates (Martin, Chris)
- Admin controller handling (Kamaljit)
- Target lockdep assertions (Max)
- Keep-alive updates for discovery (Alastair)
- Suspend quirk (Georg)
- MD pull request via Yu:
- Add support for a lockless bitmap.
A key feature for the new bitmap are that the IO fastpath is
lockless. If a user issues lots of write IO to the same bitmap
bit in a short time, only the first write has additional overhead
to update bitmap bit, no additional overhead for the following
writes.
By supporting only resync or recover written data, means in the
case creating new array or replacing with a new disk, there is no
need to do a full disk resync/recovery.
- Switch ->getgeo() and ->bios_param() to using struct gendisk rather
than struct block_device.
- Rust block changes via Andreas. This series adds configuration via
configfs and remote completion to the rnull driver. The series also
includes a set of changes to the rust block device driver API: a few
cleanup patches, and a few features supporting the rnull changes.
The series removes the raw buffer formatting logic from
`kernel::block` and improves the logic available in `kernel::string`
to support the same use as the removed logic.
- floppy arch cleanups
- Reduce the number of dereferencing needed for ublk commands
- Restrict supported sockets for nbd. Mostly done to eliminate a class
of issues perpetually reported by syzbot, by using nonsensical socket
setups.
- A few s390 dasd block fixes
- Fix a few issues around atomic writes
- Improve DMA interation for integrity requests
- Improve how iovecs are treated with regards to O_DIRECT aligment
constraints.
We used to require each segment to adhere to the constraints, now
only the request as a whole needs to.
- Clean up and improve p2p support, enabling use of p2p for metadata
payloads
- Improve locking of request lookup, using SRCU where appropriate
- Use page references properly for brd, avoiding very long RCU sections
- Fix ordering of recursively submitted IOs
- Clean up and improve updating nr_requests for a live device
- Various fixes and cleanups
* tag 'for-6.18/block-20250929' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (164 commits)
s390/dasd: enforce dma_alignment to ensure proper buffer validation
s390/dasd: Return BLK_STS_INVAL for EINVAL from do_dasd_request
ublk: remove redundant zone op check in ublk_setup_iod()
nvme: Use non zero KATO for persistent discovery connections
nvmet: add safety check for subsys lock
nvme-core: use nvme_is_io_ctrl() for I/O controller check
nvme-core: do ioccsz/iorcsz validation only for I/O controllers
nvme-core: add method to check for an I/O controller
blk-cgroup: fix possible deadlock while configuring policy
blk-mq: fix null-ptr-deref in blk_mq_free_tags() from error path
blk-mq: Fix more tag iteration function documentation
selftests: ublk: fix behavior when fio is not installed
ublk: don't access ublk_queue in ublk_unmap_io()
ublk: pass ublk_io to __ublk_complete_rq()
ublk: don't access ublk_queue in ublk_need_complete_req()
ublk: don't access ublk_queue in ublk_check_commit_and_fetch()
ublk: don't pass ublk_queue to ublk_fetch()
ublk: don't access ublk_queue in ublk_config_io_buf()
ublk: don't access ublk_queue in ublk_check_fetch_buf()
ublk: pass q_id and tag to __ublk_check_and_get_req()
...
|