| author | Linus Torvalds <torvalds@linux-foundation.org> | 2025-07-31 14:57:54 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2025-07-31 14:57:54 -0700 |
| commit | beace86e61e465dba204a268ab3f3377153a4973 (patch) | |
| tree | 24f90cb26bf39eb7724326cdf3e8bffed7c05e50 /mm/page_isolation.c | |
| parent | Merge tag 'mtd/for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd... (diff) | |
| parent | MAINTAINERS: add missing headers to memory policy & migration section (diff) | |
Merge tag 'mm-stable-2025-07-30-15-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"As usual, many cleanups. The below blurbiage describes 42 patchsets.
21 of those are partially or fully cleanup work. "cleans up",
"cleanup", "maintainability", "rationalizes", etc.
I never knew the MM code was so dirty.
"mm: ksm: prevent KSM from breaking merging of new VMAs" (Lorenzo Stoakes)
addresses an issue with KSM's PR_SET_MEMORY_MERGE mode: newly
mapped VMAs were not eligible for merging with existing adjacent
VMAs.
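For context, PR_SET_MEMORY_MERGE is the prctl() that opts an entire process into KSM. A minimal sketch of enabling it (the fallback define is only for older userspace headers; error handling kept deliberately thin):

```c
#include <stdio.h>
#include <sys/prctl.h>

#ifndef PR_SET_MEMORY_MERGE
#define PR_SET_MEMORY_MERGE 67		/* available since Linux 6.4 */
#endif

int main(void)
{
	/* Mark all eligible VMAs (current and future) mergeable by KSM;
	 * the fix above makes newly mapped VMAs in this mode also merge
	 * with adjacent existing VMAs rather than staying split. */
	if (prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0))
		perror("PR_SET_MEMORY_MERGE");
	return 0;
}
```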
"mm/damon: introduce DAMON_STAT for simple and practical access monitoring" (SeongJae Park)
adds a new kernel module which simplifies the setup and usage of
DAMON in production environments.
"stop passing a writeback_control to swap/shmem writeout" (Christoph Hellwig)
is a cleanup to the writeback code which removes a couple of
pointers from struct writeback_control.
"drivers/base/node.c: optimization and cleanups" (Donet Tom)
contains largely uncorrelated cleanups to the NUMA node setup and
management code.
"mm: userfaultfd: assorted fixes and cleanups" (Tal Zussman)
does some maintenance work on the userfaultfd code.
"Readahead tweaks for larger folios" (Ryan Roberts)
implements some tuneups for pagecache readahead when it is reading
into order>0 folios.
"selftests/mm: Tweaks to the cow test" (Mark Brown)
provides some cleanups and consistency improvements to the
selftests code.
"Optimize mremap() for large folios" (Dev Jain)
does that. A 37% reduction in execution time was measured in a
memset+mremap+munmap microbenchmark.
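The benchmark itself is not included in the merge, but its shape is straightforward; a hypothetical reconstruction (large-folio coverage depends on the system's THP settings):

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>

#define SZ (512UL << 20)	/* 512 MiB of anonymous memory */

int main(void)
{
	void *p = mmap(NULL, SZ, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return 1;
	memset(p, 1, SZ);	/* fault the range in, ideally as large folios */
	/* the optimized path moves PTEs in folio-sized batches here */
	void *q = mremap(p, SZ, SZ, MREMAP_MAYMOVE);
	if (q == MAP_FAILED)
		return 1;
	return munmap(q, SZ);
}
```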
"Remove zero_user()" (Matthew Wilcox)
expunges zero_user() in favor of the more modern memzero_page().
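The conversion is one-for-one: both helpers zero a byte range within a page, so call sites change mechanically (kernel-internal fragment, shown only to illustrate the replacement):

```c
	/* before the series: */
	zero_user(page, offset, len);
	/* after, same arguments, modern helper: */
	memzero_page(page, offset, len);
```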
"mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes" (David Hildenbrand)
addresses some warts which David noticed in the huge page code.
These were not known to be causing any issues at this time.
"mm/damon: use alloc_migrate_target() for DAMOS_MIGRATE_{HOT,COLD" (SeongJae Park)
provides some cleanup and consolidation work in DAMON.
"use vm_flags_t consistently" (Lorenzo Stoakes)
uses vm_flags_t in places where we were inappropriately using other
types.
"mm/memfd: Reserve hugetlb folios before allocation" (Vivek Kasireddy)
increases the reliability of large page allocation in the memfd
code.
"mm: Remove pXX_devmap page table bit and pfn_t type" (Alistair Popple)
removes several now-unneeded PFN_* flags.
"mm/damon: decouple sysfs from core" (SeongJae Park)
implements some cleanup and maintainability work in the DAMON
sysfs layer.
"madvise cleanup" (Lorenzo Stoakes)
does quite a lot of cleanup/maintenance work in the madvise() code.
"madvise anon_name cleanups" (Vlastimil Babka)
provides additional cleanups on top of Lorenzo's effort.
"Implement numa node notifier" (Oscar Salvador)
creates a standalone notifier for NUMA node memory state changes.
Previously these were lumped under the more general memory
on/offline notifier.
"Make MIGRATE_ISOLATE a standalone bit" (Zi Yan)
cleans up the pageblock isolation code and fixes a potential issue
which doesn't seem to cause any problems in practice.
"selftests/damon: add python and drgn based DAMON sysfs functionality tests" (SeongJae Park)
adds additional drgn- and python-based DAMON selftests which are
more comprehensive than the existing selftest suite.
"Misc rework on hugetlb faulting path" (Oscar Salvador)
fixes a rather obscure deadlock in the hugetlb fault code and
follows that fix with a series of cleanups.
"cma: factor out allocation logic from __cma_declare_contiguous_nid" (Mike Rapoport)
rationalizes and cleans up the highmem-specific code in the CMA
allocator.
"mm/migration: rework movable_ops page migration (part 1)" (David Hildenbrand)
provides cleanups and future-preparedness to the migration code.
"mm/damon: add trace events for auto-tuned monitoring intervals and DAMOS quota" (SeongJae Park)
adds some tracepoints to some DAMON auto-tuning code.
"mm/damon: fix misc bugs in DAMON modules" (SeongJae Park)
does that.
"mm/damon: misc cleanups" (SeongJae Park)
also does what it claims.
"mm: folio_pte_batch() improvements" (David Hildenbrand)
cleans up the large folio PTE batching code.
"mm/damon/vaddr: Allow interleaving in migrate_{hot,cold} actions" (SeongJae Park)
facilitates dynamic alteration of DAMON's inter-node allocation
policy.
"Remove unmap_and_put_page()" (Vishal Moola)
provides a couple of page->folio conversions.
"mm: per-node proactive reclaim" (Davidlohr Bueso)
implements a per-node control of proactive reclaim - beyond the
current memcg-based implementation.
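Usage presumably mirrors memcg's memory.reclaim interface, but keyed by NUMA node; note that the sysfs path and value syntax below are assumptions for illustration, so check the merged documentation before relying on them:

```c
#include <stdio.h>

int main(void)
{
	/* ASSUMED path and syntax, modelled on memcg's memory.reclaim:
	 * ask the kernel to proactively reclaim 512 MiB from node 0. */
	FILE *f = fopen("/sys/devices/system/node/node0/reclaim", "w");
	if (!f) {
		perror("open per-node reclaim knob");
		return 1;
	}
	fprintf(f, "512M");
	return fclose(f);
}
```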
"mm/damon: remove damon_callback" (SeongJae Park)
replaces the damon_callback interface with a more general and
powerful damon_call()+damos_walk() interface.
"mm/mremap: permit mremap() move of multiple VMAs" (Lorenzo Stoakes)
implements a number of mremap cleanups (of course) in preparation
for adding new mremap() functionality: it newly permits the
remapping of multiple VMAs when the user specifies MREMAP_FIXED
(see the sketch below). It still excludes some specialized
situations where this cannot be performed reliably.
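In userspace terms, a move of the following shape, where the source range spans several VMAs, may now succeed instead of failing with EFAULT (illustrative sketch):

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
	long pg = sysconf(_SC_PAGESIZE);
	char *old = mmap(NULL, 4 * pg, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	char *dst = mmap(NULL, 4 * pg, PROT_NONE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (old == MAP_FAILED || dst == MAP_FAILED)
		return 1;
	/* split the source into three VMAs by changing the middle's
	 * protection */
	mprotect(old + pg, 2 * pg, PROT_READ);
	/* move the whole multi-VMA range to a fixed destination; older
	 * kernels reject this because the range is not a single VMA */
	void *p = mremap(old, 4 * pg, 4 * pg,
			 MREMAP_MAYMOVE | MREMAP_FIXED, dst);
	printf("mremap: %s\n", p == MAP_FAILED ? "failed" : "moved");
	return 0;
}
```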
"drop hugetlb_free_pgd_range()" (Anthony Yznaga)
switches some sparc hugetlb code over to the generic version and
removes the thus-unneeded hugetlb_free_pgd_range().
"mm/damon/sysfs: support periodic and automated stats update" (SeongJae Park)
augments the present userspace-requested update of DAMON sysfs
monitoring files. Automatic update is now provided, along with a
tunable to control the update interval.
"Some randome fixes and cleanups to swapfile" (Kemeng Shi)
does what is claims.
"mm: introduce snapshot_page" (Luiz Capitulino and David Hildenbrand)
provides (and uses) a means by which debug-style functions can grab
a copy of a pageframe and inspect it locklessly without tripping
over the races inherent in operating on the live pageframe
directly.
"use per-vma locks for /proc/pid/maps reads" (Suren Baghdasaryan)
addresses the large contention issues which can be triggered by
reads from that procfs file. Latencies are reduced by more than
half in some situations. The series also introduces several new
selftests for the /proc/pid/maps interface.
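The contended workload is nothing more than concurrent readers of that file while the target process maps and unmaps memory; a minimal reader looks like:

```c
#include <stdio.h>

int main(void)
{
	char line[512];
	FILE *f = fopen("/proc/self/maps", "r");	/* or /proc/<pid>/maps */
	if (!f)
		return 1;
	while (fgets(line, sizeof(line), f))
		fputs(line, stdout);
	return fclose(f);
}
```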
"__folio_split() clean up" (Zi Yan)
cleans up __folio_split()!
"Optimize mprotect() for large folios" (Dev Jain)
provides some quite large (>3x) speedups to mprotect() when dealing
with large folios.
"selftests/mm: reuse FORCE_READ to replace "asm volatile("" : "+r" (XXX));" and some cleanup" (wang lian)
does some cleanup work in the selftests code.
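The idiom being replaced is an inline-asm compiler barrier that forces a load; FORCE_READ presumably wraps the same effect along these lines (a sketch, the authoritative macro lives in the selftests headers):

```c
/* Read x through a volatile lvalue so the compiler must emit the
 * access, mirroring what asm volatile("" : "+r" (x)) achieved. */
#define FORCE_READ(x) (*(volatile typeof(x) *)&(x))
```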
"tools/testing: expand mremap testing" (Lorenzo Stoakes)
extends the mremap() selftest in several ways, including adding
more checking of Lorenzo's recently added "permit mremap() move of
multiple VMAs" feature.
"selftests/damon/sysfs.py: test all parameters" (SeongJae Park)
extends the DAMON sysfs interface selftest so that it tests all
possible user-requested parameters, rather than the present minimal
subset"
* tag 'mm-stable-2025-07-30-15-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (370 commits)
MAINTAINERS: add missing headers to memory policy & migration section
MAINTAINERS: add missing file to cgroup section
MAINTAINERS: add MM MISC section, add missing files to MISC and CORE
MAINTAINERS: add missing zsmalloc file
MAINTAINERS: add missing files to page alloc section
MAINTAINERS: add missing shrinker files
MAINTAINERS: move memremap.[ch] to hotplug section
MAINTAINERS: add missing mm_slot.h file to THP section
MAINTAINERS: add missing interval_tree.c to memory mapping section
MAINTAINERS: add missing percpu-internal.h file to per-cpu section
mm/page_alloc: remove trace_mm_alloc_contig_migrate_range_info()
selftests/damon: introduce _common.sh to host shared function
selftests/damon/sysfs.py: test runtime reduction of DAMON parameters
selftests/damon/sysfs.py: test non-default parameters runtime commit
selftests/damon/sysfs.py: generalize DAMON context commit assertion
selftests/damon/sysfs.py: generalize monitoring attributes commit assertion
selftests/damon/sysfs.py: generalize DAMOS schemes commit assertion
selftests/damon/sysfs.py: test DAMOS filters commitment
selftests/damon/sysfs.py: generalize DAMOS scheme commit assertion
selftests/damon/sysfs.py: test DAMOS destinations commitment
...
Diffstat (limited to 'mm/page_isolation.c')
| mode | file | lines changed |
|---|---|---|
| -rw-r--r-- | mm/page_isolation.c | 112 |

1 file changed, 50 insertions(+), 62 deletions(-)
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index b2fc5266e3d2..f72b6cd38b95 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -21,9 +21,9 @@
  * consequently belong to a single zone.
  *
  * PageLRU check without isolation or lru_lock could race so that
- * MIGRATE_MOVABLE block might include unmovable pages. And __PageMovable
- * check without lock_page also may miss some movable non-lru pages at
- * race condition. So you can't expect this function should be exact.
+ * MIGRATE_MOVABLE block might include unmovable pages. Similarly, pages
+ * with movable_ops can only be identified some time after they were
+ * allocated. So you can't expect this function should be exact.
  *
  * Returns a page without holding a reference. If the caller wants to
  * dereference that page (e.g., dumping), it has to make sure that it
@@ -31,7 +31,7 @@
  *
  */
 static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long end_pfn,
-				int migratetype, int flags)
+				enum pb_isolate_mode mode)
 {
 	struct page *page = pfn_to_page(start_pfn);
 	struct zone *zone = page_zone(page);
@@ -46,7 +46,7 @@ static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long e
 		 * isolate CMA pageblocks even when they are not movable in fact
 		 * so consider them movable here.
 		 */
-		if (is_migrate_cma(migratetype))
+		if (mode == PB_ISOLATE_MODE_CMA_ALLOC)
 			return NULL;
 
 		return page;
@@ -92,7 +92,7 @@ static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long e
 			h = size_to_hstate(folio_size(folio));
 			if (h && !hugepage_migration_supported(h))
 				return page;
-		} else if (!folio_test_lru(folio) && !__folio_test_movable(folio)) {
+		} else if (!folio_test_lru(folio)) {
 			return page;
 		}
 
@@ -117,7 +117,7 @@ static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long e
 		 * The HWPoisoned page may be not in buddy system, and
 		 * page_count() is not 0.
 		 */
-		if ((flags & MEMORY_OFFLINE) && PageHWPoison(page))
+		if ((mode == PB_ISOLATE_MODE_MEM_OFFLINE) && PageHWPoison(page))
 			continue;
 
 		/*
@@ -130,10 +130,10 @@ static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long e
 		 * move these pages that still have a reference count > 0.
 		 * (false negatives in this function only)
 		 */
-		if ((flags & MEMORY_OFFLINE) && PageOffline(page))
+		if ((mode == PB_ISOLATE_MODE_MEM_OFFLINE) && PageOffline(page))
 			continue;
 
-		if (__PageMovable(page) || PageLRU(page))
+		if (PageLRU(page) || page_has_movable_ops(page))
 			continue;
 
 		/*
@@ -151,7 +151,7 @@ static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long e
  * present in [start_pfn, end_pfn). The pageblock must intersect with
  * [start_pfn, end_pfn).
  */
-static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags,
+static int set_migratetype_isolate(struct page *page, enum pb_isolate_mode mode,
 			unsigned long start_pfn, unsigned long end_pfn)
 {
 	struct zone *zone = page_zone(page);
@@ -186,9 +186,9 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
 							  end_pfn);
 
 		unmovable = has_unmovable_pages(check_unmovable_start, check_unmovable_end,
-				migratetype, isol_flags);
+				mode);
 		if (!unmovable) {
-			if (!move_freepages_block_isolate(zone, page, MIGRATE_ISOLATE)) {
+			if (!pageblock_isolate_and_move_free_pages(zone, page)) {
 				spin_unlock_irqrestore(&zone->lock, flags);
 				return -EBUSY;
 			}
@@ -198,7 +198,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
 	}
 
 	spin_unlock_irqrestore(&zone->lock, flags);
-	if (isol_flags & REPORT_FAILURE) {
+	if (mode == PB_ISOLATE_MODE_MEM_OFFLINE) {
 		/*
 		 * printk() with zone->lock held will likely trigger a
 		 * lockdep splat, so defer it here.
@@ -209,7 +209,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
 	return -EBUSY;
 }
 
-static void unset_migratetype_isolate(struct page *page, int migratetype)
+static void unset_migratetype_isolate(struct page *page)
 {
 	struct zone *zone;
 	unsigned long flags;
@@ -262,10 +262,10 @@ static void unset_migratetype_isolate(struct page *page, int migratetype)
 			 * Isolating this block already succeeded, so this
 			 * should not fail on zone boundaries.
 			 */
-			WARN_ON_ONCE(!move_freepages_block_isolate(zone, page, migratetype));
+			WARN_ON_ONCE(!pageblock_unisolate_and_move_free_pages(zone, page));
 		} else {
-			set_pageblock_migratetype(page, migratetype);
-			__putback_isolated_page(page, order, migratetype);
+			clear_pageblock_isolate(page);
+			__putback_isolated_page(page, order, get_pageblock_migratetype(page));
 		}
 		zone->nr_isolate_pageblock--;
 out:
@@ -292,11 +292,10 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
  * isolate_single_pageblock() -- tries to isolate a pageblock that might be
  * within a free or in-use page.
  * @boundary_pfn:		pageblock-aligned pfn that a page might cross
- * @flags:			isolation flags
+ * @mode:			isolation mode
  * @isolate_before:	isolate the pageblock before the boundary_pfn
 * @skip_isolation:	the flag to skip the pageblock isolation in second
 *			isolate_single_pageblock()
- * @migratetype:	migrate type to set in error recovery.
 *
 * Free and in-use pages can be as big as MAX_PAGE_ORDER and contain more than one
 * pageblock. When not all pageblocks within a page are isolated at the same
@@ -311,8 +310,9 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
 * either. The function handles this by splitting the free page or migrating
 * the in-use page then splitting the free page.
 */
-static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
-		bool isolate_before, bool skip_isolation, int migratetype)
+static int isolate_single_pageblock(unsigned long boundary_pfn,
+		enum pb_isolate_mode mode, bool isolate_before,
+		bool skip_isolation)
 {
 	unsigned long start_pfn;
 	unsigned long isolate_pageblock;
@@ -338,12 +338,11 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 				      zone->zone_start_pfn);
 
 	if (skip_isolation) {
-		int mt __maybe_unused = get_pageblock_migratetype(pfn_to_page(isolate_pageblock));
-
-		VM_BUG_ON(!is_migrate_isolate(mt));
+		VM_BUG_ON(!get_pageblock_isolate(pfn_to_page(isolate_pageblock)));
 	} else {
-		ret = set_migratetype_isolate(pfn_to_page(isolate_pageblock), migratetype,
-				flags, isolate_pageblock, isolate_pageblock + pageblock_nr_pages);
+		ret = set_migratetype_isolate(pfn_to_page(isolate_pageblock),
+				mode, isolate_pageblock,
+				isolate_pageblock + pageblock_nr_pages);
 
 		if (ret)
 			return ret;
@@ -383,7 +382,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 		if (PageBuddy(page)) {
 			int order = buddy_order(page);
 
-			/* move_freepages_block_isolate() handled this */
+			/* pageblock_isolate_and_move_free_pages() handled this */
 			VM_WARN_ON_ONCE(pfn + (1 << order) > boundary_pfn);
 
 			pfn += 1UL << order;
@@ -422,7 +421,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 			 * proper free and split handling for them.
 			 */
 			VM_WARN_ON_ONCE_PAGE(PageLRU(page), page);
-			VM_WARN_ON_ONCE_PAGE(__PageMovable(page), page);
+			VM_WARN_ON_ONCE_PAGE(page_has_movable_ops(page), page);
 
 			goto failed;
 		}
@@ -433,7 +432,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 failed:
 	/* restore the original migratetype */
 	if (!skip_isolation)
-		unset_migratetype_isolate(pfn_to_page(isolate_pageblock), migratetype);
+		unset_migratetype_isolate(pfn_to_page(isolate_pageblock));
 	return -EBUSY;
 }
 
@@ -441,14 +440,7 @@ failed:
 * start_isolate_page_range() - mark page range MIGRATE_ISOLATE
 * @start_pfn:		The first PFN of the range to be isolated.
 * @end_pfn:		The last PFN of the range to be isolated.
- * @migratetype:	Migrate type to set in error recovery.
- * @flags:		The following flags are allowed (they can be combined in
- *			a bit mask)
- *			MEMORY_OFFLINE - isolate to offline (!allocate) memory
- *					 e.g., skip over PageHWPoison() pages
- *					 and PageOffline() pages.
- *			REPORT_FAILURE - report details about the failure to
- *					 isolate the range
+ * @mode:		isolation mode
 *
 * Making page-allocation-type to be MIGRATE_ISOLATE means free pages in
 * the range will never be allocated. Any free pages and pages freed in the
@@ -481,7 +473,7 @@ failed:
 * Return: 0 on success and -EBUSY if any part of range cannot be isolated.
 */
 int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-			     int migratetype, int flags)
+			     enum pb_isolate_mode mode)
 {
 	unsigned long pfn;
 	struct page *page;
@@ -492,8 +484,8 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 	bool skip_isolation = false;
 
 	/* isolate [isolate_start, isolate_start + pageblock_nr_pages) pageblock */
-	ret = isolate_single_pageblock(isolate_start, flags, false,
-			skip_isolation, migratetype);
+	ret = isolate_single_pageblock(isolate_start, mode, false,
+			skip_isolation);
 	if (ret)
 		return ret;
 
@@ -501,10 +493,9 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 	skip_isolation = true;
 
 	/* isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock */
-	ret = isolate_single_pageblock(isolate_end, flags, true,
-			skip_isolation, migratetype);
+	ret = isolate_single_pageblock(isolate_end, mode, true, skip_isolation);
 	if (ret) {
-		unset_migratetype_isolate(pfn_to_page(isolate_start), migratetype);
+		unset_migratetype_isolate(pfn_to_page(isolate_start));
 		return ret;
 	}
 
@@ -513,12 +504,11 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 	     pfn < isolate_end - pageblock_nr_pages;
 	     pfn += pageblock_nr_pages) {
 		page = __first_valid_page(pfn, pageblock_nr_pages);
-		if (page && set_migratetype_isolate(page, migratetype, flags,
-					start_pfn, end_pfn)) {
-			undo_isolate_page_range(isolate_start, pfn, migratetype);
+		if (page && set_migratetype_isolate(page, mode, start_pfn,
+						    end_pfn)) {
+			undo_isolate_page_range(isolate_start, pfn);
 			unset_migratetype_isolate(
-				pfn_to_page(isolate_end - pageblock_nr_pages),
-				migratetype);
+				pfn_to_page(isolate_end - pageblock_nr_pages));
 			return -EBUSY;
 		}
 	}
@@ -529,13 +519,10 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 * undo_isolate_page_range - undo effects of start_isolate_page_range()
 * @start_pfn:		The first PFN of the isolated range
 * @end_pfn:		The last PFN of the isolated range
- * @migratetype:	New migrate type to set on the range
 *
- * This finds every MIGRATE_ISOLATE page block in the given range
- * and switches it to @migratetype.
+ * This finds and unsets every MIGRATE_ISOLATE page block in the given range
 */
-void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-			     int migratetype)
+void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn)
 {
 	unsigned long pfn;
 	struct page *page;
@@ -548,7 +535,7 @@ void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 		page = __first_valid_page(pfn, pageblock_nr_pages);
 		if (!page || !is_migrate_isolate_page(page))
 			continue;
-		unset_migratetype_isolate(page, migratetype);
+		unset_migratetype_isolate(page);
 	}
 }
 /*
@@ -560,7 +547,7 @@ void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 */
 static unsigned long
 __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
-				  int flags)
+				  enum pb_isolate_mode mode)
 {
 	struct page *page;
 
@@ -573,11 +560,12 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
 			 * simple way to verify that as VM_BUG_ON(), though.
 			 */
 			pfn += 1 << buddy_order(page);
-		else if ((flags & MEMORY_OFFLINE) && PageHWPoison(page))
+		else if ((mode == PB_ISOLATE_MODE_MEM_OFFLINE) &&
+			 PageHWPoison(page))
 			/* A HWPoisoned page cannot be also PageBuddy */
 			pfn++;
-		else if ((flags & MEMORY_OFFLINE) && PageOffline(page) &&
-			 !page_count(page))
+		else if ((mode == PB_ISOLATE_MODE_MEM_OFFLINE) &&
+			 PageOffline(page) && !page_count(page))
 			/*
 			 * The responsible driver agreed to skip PageOffline()
 			 * pages when offlining memory by dropping its
@@ -595,11 +583,11 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
 * test_pages_isolated - check if pageblocks in range are isolated
 * @start_pfn:		The first PFN of the isolated range
 * @end_pfn:		The first PFN *after* the isolated range
- * @isol_flags:		Testing mode flags
+ * @mode:		Testing mode
 *
 * This tests if all in the specified range are free.
 *
- * If %MEMORY_OFFLINE is specified in @flags, it will consider
+ * If %PB_ISOLATE_MODE_MEM_OFFLINE specified in @mode, it will consider
 * poisoned and offlined pages free as well.
 *
 * Caller must ensure the requested range doesn't span zones.
@@ -607,7 +595,7 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
 * Returns 0 if true, -EBUSY if one or more pages are in use.
 */
 int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
-			int isol_flags)
+			enum pb_isolate_mode mode)
 {
 	unsigned long pfn, flags;
 	struct page *page;
@@ -643,7 +631,7 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
 	/* Check all pages are free or marked as ISOLATED */
 	zone = page_zone(page);
 	spin_lock_irqsave(&zone->lock, flags);
-	pfn = __test_page_isolated_in_pageblock(start_pfn, end_pfn, isol_flags);
+	pfn = __test_page_isolated_in_pageblock(start_pfn, end_pfn, mode);
 	spin_unlock_irqrestore(&zone->lock, flags);
 
 	ret = pfn < end_pfn ? -EBUSY : 0;
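For readability: the diff collapses the old (migratetype, flags) argument pairs into a single mode. Judging from the constants used above, the enum presumably looks something like the following; the two named values appear verbatim in the new code, while PB_ISOLATE_MODE_OTHER is an assumption for the "no special handling" case (the authoritative definition is in include/linux/page-isolation.h):

```c
enum pb_isolate_mode {
	PB_ISOLATE_MODE_MEM_OFFLINE,	/* was: flags & MEMORY_OFFLINE */
	PB_ISOLATE_MODE_CMA_ALLOC,	/* was: is_migrate_cma(migratetype) */
	PB_ISOLATE_MODE_OTHER,		/* assumed: everything else */
};
```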
