From 2115dc3e3376b7bd5021950b45eebbcd992e9be9 Mon Sep 17 00:00:00 2001 From: Bart Van Assche Date: Thu, 24 Jul 2025 08:34:42 -0700 Subject: docs: filesystems: sysfs: Recommend sysfs_emit() for new code only The advantages of converting existing sysfs show() methods to sysfs_emit() and sysfs_emit_at() do not outweigh the risk of introducing bugs. Hence recommend sysfs_emit() and sysfs_emit_at() only for new implementations of show() methods. Cc: Greg Kroah-Hartman Cc: James Bottomley Cc: Martin K. Petersen Signed-off-by: Bart Van Assche Reviewed-by: Martin K. Petersen Acked-by: Greg Kroah-Hartman Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20250724153449.2433395-1-bvanassche@acm.org --- Documentation/filesystems/sysfs.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/sysfs.rst b/Documentation/filesystems/sysfs.rst index c32993bc83c7..624e4f51212e 100644 --- a/Documentation/filesystems/sysfs.rst +++ b/Documentation/filesystems/sysfs.rst @@ -243,8 +243,8 @@ Other notes: - show() methods should return the number of bytes printed into the buffer. -- show() should only use sysfs_emit() or sysfs_emit_at() when formatting - the value to be returned to user space. +- New implementations of show() methods should only use sysfs_emit() or + sysfs_emit_at() when formatting the value to be returned to user space. - store() should return the number of bytes used from the buffer. If the entire buffer has been used, just return the count argument. -- cgit v1.2.3 From 81fd803b5a5da3fad0140163091e12b3ac2ae484 Mon Sep 17 00:00:00 2001 From: Bjorn Helgaas Date: Wed, 13 Aug 2025 15:05:01 -0500 Subject: Documentation: Fix filesystems typos Fix typos. 
Signed-off-by: Bjorn Helgaas Reviewed-by: Randy Dunlap Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20250813200526.290420-6-helgaas@kernel.org --- Documentation/filesystems/erofs.rst | 2 +- Documentation/filesystems/gfs2-glocks.rst | 2 +- Documentation/filesystems/hpfs.rst | 2 +- Documentation/filesystems/resctrl.rst | 2 +- Documentation/filesystems/xfs/xfs-online-fsck-design.rst | 4 ++-- 5 files changed, 6 insertions(+), 6 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/erofs.rst b/Documentation/filesystems/erofs.rst index 7ddb235aee9d..08194f194b94 100644 --- a/Documentation/filesystems/erofs.rst +++ b/Documentation/filesystems/erofs.rst @@ -116,7 +116,7 @@ cache_strategy=%s Select a strategy for cached decompression from now on: cluster for further reading. It still does in-place I/O decompression for the rest compressed physical clusters; - readaround Cache the both ends of incomplete compressed + readaround Cache both ends of incomplete compressed physical clusters for further reading. It still does in-place I/O decompression for the rest compressed physical clusters. diff --git a/Documentation/filesystems/gfs2-glocks.rst b/Documentation/filesystems/gfs2-glocks.rst index adc0d4c4d979..ce5ff08cbd59 100644 --- a/Documentation/filesystems/gfs2-glocks.rst +++ b/Documentation/filesystems/gfs2-glocks.rst @@ -105,7 +105,7 @@ go_unlocked Yes No Operations must not drop either the bit lock or the spinlock if its held on entry. go_dump and do_demote_ok must never block. Note that go_dump will only be called if the glock's state - indicates that it is caching uptodate data. + indicates that it is caching up-to-date data. 
Glock locking order within GFS2: diff --git a/Documentation/filesystems/hpfs.rst b/Documentation/filesystems/hpfs.rst index 7e0dd2f4373e..0f9516b5eb07 100644 --- a/Documentation/filesystems/hpfs.rst +++ b/Documentation/filesystems/hpfs.rst @@ -65,7 +65,7 @@ are case sensitive, so for example when you create a file FOO, you can use 'cat FOO', 'cat Foo', 'cat foo' or 'cat F*' but not 'cat f*'. Note, that you also won't be able to compile linux kernel (and maybe other things) on HPFS because kernel creates different files with names like bootsect.S and -bootsect.s. When searching for file thats name has characters >= 128, codepages +bootsect.s. When searching for file whose name has characters >= 128, codepages are used - see below. OS/2 ignores dots and spaces at the end of file name, so this driver does as well. If you create 'a. ...', the file 'a' will be created, but you can still diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst index c7949dd44f2f..4db3b07c16c5 100644 --- a/Documentation/filesystems/resctrl.rst +++ b/Documentation/filesystems/resctrl.rst @@ -563,7 +563,7 @@ this would be dependent on number of cores the benchmark is run on. depending on # of threads: For the same SKU in #1, a 'single thread, with 10% bandwidth' and '4 -thread, with 10% bandwidth' can consume upto 10GBps and 40GBps although +thread, with 10% bandwidth' can consume up to 10GBps and 40GBps although they have same percentage bandwidth of 10%. This is simply because as threads start using more cores in an rdtgroup, the actual bandwidth may increase or vary although user specified bandwidth percentage is same. 
diff --git a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst index e231d127cd40..9fe994353395 100644 --- a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst +++ b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst @@ -454,7 +454,7 @@ filesystem so that it can apply pending filesystem updates to the staging information. Once the scan is done, the owning object is re-locked, the live data is used to write a new ondisk structure, and the repairs are committed atomically. -The hooks are disabled and the staging staging area is freed. +The hooks are disabled and the staging area is freed. Finally, the storage from the old data structure are carefully reaped. Introducing concurrency helps online repair avoid various locking problems, but @@ -2185,7 +2185,7 @@ The chapter about :ref:`secondary metadata` mentioned that checking and repairing of secondary metadata commonly requires coordination between a live metadata scan of the filesystem and writer threads that are updating that metadata. -Keeping the scan data up to date requires requires the ability to propagate +Keeping the scan data up to date requires the ability to propagate metadata updates from the filesystem into the data being collected by the scan. 
This *can* be done by appending concurrent updates into a separate log file and applying them before writing the new metadata to disk, but this leads to -- cgit v1.2.3 From 41ecad8b233bf480a1f39b76462dcb2c2d3cdfed Mon Sep 17 00:00:00 2001 From: Raphael Pinsonneault-Thibeault Date: Mon, 18 Aug 2025 14:19:34 -0400 Subject: docs: fix trailing whitespace error and remove repeated words in propagate_umount.txt in Documentation/filesystems/propagate_umount.txt: line 289: remove whitespace on blank line line 315: remove duplicate "that" line 364: remove duplicate "in" Signed-off-by: Raphael Pinsonneault-Thibeault Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20250818181934.55491-2-rpthibeault@gmail.com --- Documentation/filesystems/propagate_umount.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/propagate_umount.txt b/Documentation/filesystems/propagate_umount.txt index c90349e5b889..9a7eb96df300 100644 --- a/Documentation/filesystems/propagate_umount.txt +++ b/Documentation/filesystems/propagate_umount.txt @@ -286,7 +286,7 @@ Trim_one(m) strip the "seen by Trim_ancestors" mark from m remove m from the Candidates list return - + remove_this = false found = false for each n in children(m) @@ -312,7 +312,7 @@ Trim_ancestors(m) } Terminating condition in the loop in Trim_ancestors() is correct, -since that that loop will never run into p belonging to U - p is always +since that loop will never run into p belonging to U - p is always an ancestor of argument of Trim_one() and since U is closed, the argument of Trim_one() would also have to belong to U. But Trim_one() is never called for elements of U. In other words, p belongs to S if and only @@ -361,7 +361,7 @@ such removals. Proof: suppose S was non-shifting, x is a locked element of S, parent of x is not in S and S - {x} is not non-shifting. 
Then there is an element m in S - {x} and a subtree mounted strictly inside m, such that m contains -an element not in in S - {x}. Since S is non-shifting, everything in +an element not in S - {x}. Since S is non-shifting, everything in that subtree must belong to S. But that means that this subtree must contain x somewhere *and* that parent of x either belongs that subtree or is equal to m. Either way it must belong to S. Contradiction. -- cgit v1.2.3 From 61578493ca7f9d4acd804544f3f5651f5124b12f Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Tue, 26 Aug 2025 09:47:56 +0700 Subject: Documentation: ocfs2: Properly reindent filecheck operations list Some of the text in the filecheck operations list is indented out of the list. In particular, the third operation is shown not as the third list item but rather as a separate paragraph. Reindent the list so that it gets properly rendered as such. Signed-off-by: Bagas Sanjaya Acked-by: Joseph Qi Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20250826024756.16073-1-bagasdotme@gmail.com --- Documentation/filesystems/ocfs2-online-filecheck.rst | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/ocfs2-online-filecheck.rst b/Documentation/filesystems/ocfs2-online-filecheck.rst index 2257bb53edc1..9e8449416e0b 100644 --- a/Documentation/filesystems/ocfs2-online-filecheck.rst +++ b/Documentation/filesystems/ocfs2-online-filecheck.rst @@ -58,33 +58,33 @@ inode, fixing inode and setting the size of result record history. # echo "" > /sys/fs/ocfs2//filecheck/check # cat /sys/fs/ocfs2//filecheck/check -The output is like this:: + The output is like this:: INO DONE ERROR 39502 1 GENERATION - lists the inode numbers. - indicates whether the operation has been finished. - says what kind of errors was found. For the detailed error numbers, - please refer to the file linux/fs/ocfs2/filecheck.h. + lists the inode numbers. 
+ indicates whether the operation has been finished. + says what kind of errors was found. For the detailed error numbers, + please refer to the file linux/fs/ocfs2/filecheck.h. 2. If you determine to fix this inode, do:: # echo "" > /sys/fs/ocfs2//filecheck/fix # cat /sys/fs/ocfs2//filecheck/fix -The output is like this::: + The output is like this:: INO DONE ERROR 39502 1 SUCCESS -This time, the column indicates whether this fix is successful or not. + This time, the column indicates whether this fix is successful or not. 3. The record cache is used to store the history of check/fix results. It's -default size is 10, and can be adjust between the range of 10 ~ 100. You can -adjust the size like this:: + default size is 10, and can be adjust between the range of 10 ~ 100. You can + adjust the size like this:: - # echo "" > /sys/fs/ocfs2//filecheck/set + # echo "" > /sys/fs/ocfs2//filecheck/set Fixing stuff ============ -- cgit v1.2.3 From ba653158f40deccb3f79005bf1d5c6c37d45b247 Mon Sep 17 00:00:00 2001 From: Alperen Aksu Date: Thu, 21 Aug 2025 13:13:47 +0000 Subject: Documentation/filesystems/xfs: Fix typo error Fixed typo error in referring to the section's headline Fixed to correct spelling of "mapping" Signed-off-by: Alperen Aksu Reviewed-by: Randy Dunlap Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20250821131404.25461-1-aksulperen@gmail.com --- Documentation/filesystems/xfs/xfs-online-fsck-design.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst index 9fe994353395..8cbcd3c26434 100644 --- a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst +++ b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst @@ -475,7 +475,7 @@ operation, which may cause application failure or an unplanned filesystem shutdown. 
Inspiration for the secondary metadata repair strategy was drawn from section -2.4 of Srinivasan above, and sections 2 ("NSF: Inded Build Without Side-File") +2.4 of Srinivasan above, and sections 2 ("NSF: Index Build Without Side-File") and 3.1.1 ("Duplicate Key Insert Problem") in C. Mohan, `"Algorithms for Creating Indexes for Very Large Tables Without Quiescing Updates" `_, 1992. @@ -4179,7 +4179,7 @@ When the exchange is initiated, the sequence of operations is as follows: This will be discussed in more detail in subsequent sections. If the filesystem goes down in the middle of an operation, log recovery will -find the most recent unfinished maping exchange log intent item and restart +find the most recent unfinished mapping exchange log intent item and restart from there. This is how atomic file mapping exchanges guarantees that an outside observer will either see the old broken structure or the new one, and never a mismash of -- cgit v1.2.3 From 7d1c5e52ec1549adc4394b6e2f38278fc671e522 Mon Sep 17 00:00:00 2001 From: Mallikarjun Thammanavar Date: Tue, 19 Aug 2025 12:46:04 +0000 Subject: docs: fix spelling and grammar in atomic_writes Fix minor spelling and grammatical issues in the ext4 atomic_writes documentation. Signed-off-by: Mallikarjun Thammanavar Reviewed-by: Randy Dunlap Reviewed-by: Darrick J. Wong Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20250819124604.8995-1-mallikarjunst09@gmail.com --- Documentation/filesystems/ext4/atomic_writes.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/ext4/atomic_writes.rst b/Documentation/filesystems/ext4/atomic_writes.rst index aeb47ace738d..ae8995740aa8 100644 --- a/Documentation/filesystems/ext4/atomic_writes.rst +++ b/Documentation/filesystems/ext4/atomic_writes.rst @@ -14,7 +14,7 @@ I/O) on regular files with extents, provided the underlying storage device supports hardware atomic writes. 
This is supported in the following two ways: 1. **Single-fsblock Atomic Writes**: - EXT4's supports atomic write operations with a single filesystem block since + EXT4 supports atomic write operations with a single filesystem block since v6.13. In this the atomic write unit minimum and maximum sizes are both set to filesystem blocksize. e.g. doing atomic write of 16KB with 16KB filesystem blocksize on 64KB @@ -50,7 +50,7 @@ Multi-fsblock Implementation Details The bigalloc feature changes ext4 to allocate in units of multiple filesystem blocks, also known as clusters. With bigalloc each bit within block bitmap -represents cluster (power of 2 number of blocks) rather than individual +represents a cluster (power of 2 number of blocks) rather than individual filesystem blocks. EXT4 supports multi-fsblock atomic writes with bigalloc, subject to the following constraints. The minimum atomic write size is the larger of the fs @@ -189,7 +189,7 @@ The write must be aligned to the filesystem's block size and not exceed the filesystem's maximum atomic write unit size. See ``generic_atomic_write_valid()`` for more details. -``statx()`` system call with ``STATX_WRITE_ATOMIC`` flag can provides following +``statx()`` system call with ``STATX_WRITE_ATOMIC`` flag can provide following details: * ``stx_atomic_write_unit_min``: Minimum size of an atomic write request. -- cgit v1.2.3 From 69c6739d671df58b9f034b94ac8310f569e2b632 Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Tue, 19 Aug 2025 13:12:49 +0700 Subject: Documentation: sharedsubtree: Format remaining of shell snippets as literal code blocks Fix formatting inconsistency of shell snippets by wrapping the rest of them in literal code blocks. 
Signed-off-by: Bagas Sanjaya Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20250819061254.31220-2-bagasdotme@gmail.com --- Documentation/filesystems/sharedsubtree.rst | 68 ++++++++++++++++------------- 1 file changed, 37 insertions(+), 31 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/sharedsubtree.rst b/Documentation/filesystems/sharedsubtree.rst index 1cf56489ed48..06497c4455b4 100644 --- a/Documentation/filesystems/sharedsubtree.rst +++ b/Documentation/filesystems/sharedsubtree.rst @@ -90,37 +90,42 @@ replicas continue to be exactly same. Here is an example: - Let's say /mnt has a mount which is shared. - # mount --make-shared /mnt + Let's say /mnt has a mount which is shared:: - Let's bind mount /mnt to /tmp - # mount --bind /mnt /tmp + # mount --make-shared /mnt + + Let's bind mount /mnt to /tmp:: + + # mount --bind /mnt /tmp the new mount at /tmp becomes a shared mount and it is a replica of the mount at /mnt. - Now let's make the mount at /tmp; a slave of /mnt - # mount --make-slave /tmp + Now let's make the mount at /tmp; a slave of /mnt:: + + # mount --make-slave /tmp + + let's mount /dev/sd0 on /mnt/a:: - let's mount /dev/sd0 on /mnt/a - # mount /dev/sd0 /mnt/a + # mount /dev/sd0 /mnt/a - #ls /mnt/a - t1 t2 t3 + # ls /mnt/a + t1 t2 t3 - #ls /tmp/a - t1 t2 t3 + # ls /tmp/a + t1 t2 t3 Note the mount event has propagated to the mount at /tmp - However let's see what happens if we mount something on the mount at /tmp + However let's see what happens if we mount something on the mount at + /tmp:: - # mount /dev/sd1 /tmp/b + # mount /dev/sd1 /tmp/b - #ls /tmp/b - s1 s2 s3 + # ls /tmp/b + s1 s2 s3 - #ls /mnt/b + # ls /mnt/b Note how the mount event has not propagated to the mount at /mnt @@ -137,7 +142,7 @@ replicas continue to be exactly same. 
# mount --make-unbindable /mnt - Let's try to bind mount this mount somewhere else:: + Let's try to bind mount this mount somewhere else:: # mount --bind /mnt /tmp mount: wrong fs type, bad option, bad superblock on /mnt, @@ -471,9 +476,9 @@ replicas continue to be exactly same. 5d) Move semantics - Consider the following command + Consider the following command:: - mount --move A B/b + mount --move A B/b where 'A' is the source mount, 'B' is the destination mount and 'b' is the dentry in the destination mount. @@ -663,9 +668,9 @@ replicas continue to be exactly same. 'B' is the slave of 'A' and 'C' is a slave of 'B' A -> B -> C - at this point if we execute the following command + at this point if we execute the following command:: - mount --bind /bin /tmp/test + mount --bind /bin /tmp/test The mount is attempted on 'A' @@ -706,8 +711,8 @@ replicas continue to be exactly same. / \ tmp usr - And we want to replicate the tree at multiple - mountpoints under /root/tmp + And we want to replicate the tree at multiple + mountpoints under /root/tmp step 2: :: @@ -731,7 +736,7 @@ replicas continue to be exactly same. / m1 - it has two vfsmounts + it has two vfsmounts step 3: :: @@ -739,7 +744,7 @@ replicas continue to be exactly same. mkdir -p /tmp/m2 mount --rbind /root /tmp/m2 - the new tree now looks like this:: + the new tree now looks like this:: root / \ @@ -759,14 +764,15 @@ replicas continue to be exactly same. / \ m1 m2 - it has 6 vfsmounts + it has 6 vfsmounts step 4: - :: + :: + mkdir -p /tmp/m3 mount --rbind /root /tmp/m3 - I won't draw the tree..but it has 24 vfsmounts + I won't draw the tree..but it has 24 vfsmounts at step i the number of vfsmounts is V[i] = i*V[i-1]. @@ -785,8 +791,8 @@ replicas continue to be exactly same. 
/ \ tmp usr - How do we set up the same tree at multiple locations under - /root/tmp + How do we set up the same tree at multiple locations under + /root/tmp step 2: :: -- cgit v1.2.3 From a8886b42d57b8280e0c064779d87030266d9c7ce Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Tue, 19 Aug 2025 13:12:50 +0700 Subject: Documentation: sharedsubtree: Use proper enumerator sequence for enumerated lists Sphinx does not recognize mixed-letter sequences (e.g. 2a) as enumerator for enumerated lists. As such, lists that use such sequences end up as definition lists instead. Use proper enumeration sequences for this purpose. Signed-off-by: Bagas Sanjaya Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20250819061254.31220-3-bagasdotme@gmail.com --- Documentation/filesystems/sharedsubtree.rst | 40 ++++++++++++++--------------- 1 file changed, 20 insertions(+), 20 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/sharedsubtree.rst b/Documentation/filesystems/sharedsubtree.rst index 06497c4455b4..7ad5101b4c03 100644 --- a/Documentation/filesystems/sharedsubtree.rst +++ b/Documentation/filesystems/sharedsubtree.rst @@ -39,8 +39,8 @@ precise d. unbindable mount -2a) A shared mount can be replicated to as many mountpoints and all the -replicas continue to be exactly same. +a) A shared mount can be replicated to as many mountpoints and all the + replicas continue to be exactly same. Here is an example: @@ -83,8 +83,8 @@ replicas continue to be exactly same. contents will be visible under /tmp/a too. -2b) A slave mount is like a shared mount except that mount and umount events - only propagate towards it. +b) A slave mount is like a shared mount except that mount and umount events + only propagate towards it. All slave mounts have a master mount which is a shared. @@ -131,12 +131,12 @@ replicas continue to be exactly same. /mnt -2c) A private mount does not forward or receive propagation. 
+c) A private mount does not forward or receive propagation. This is the mount we are familiar with. Its the default type. -2d) A unbindable mount is a unbindable private mount +d) A unbindable mount is a unbindable private mount let's say we have a mount at /mnt and we make it unbindable:: @@ -185,7 +185,7 @@ replicas continue to be exactly same. namespaces. B) A process wants its mounts invisible to any other process, but - still be able to see the other system mounts. + still be able to see the other system mounts. Solution: @@ -250,7 +250,7 @@ replicas continue to be exactly same. Note: the word 'vfsmount' and the noun 'mount' have been used to mean the same thing, throughout this document. -5a) Mount states +a) Mount states A given mount can be in one of the following states @@ -360,7 +360,7 @@ replicas continue to be exactly same. the state of a mount depending on type of the destination mount. Its explained in section 5d. -5b) Bind semantics +b) Bind semantics Consider the following command:: @@ -437,7 +437,7 @@ replicas continue to be exactly same. 8. 'A' is a unbindable mount and 'B' is a non-shared mount. This is a invalid operation. A unbindable mount cannot be bind mounted. -5c) Rbind semantics +c) Rbind semantics rbind is same as bind. Bind replicates the specified mount. Rbind replicates all the mounts in the tree belonging to the specified mount. @@ -474,7 +474,7 @@ replicas continue to be exactly same. -5d) Move semantics +d) Move semantics Consider the following command:: @@ -551,7 +551,7 @@ replicas continue to be exactly same. 'A' is mounted on mount 'B' at dentry 'b'. Mount 'A' continues to be a unbindable mount. -5e) Mount semantics +e) Mount semantics Consider the following command:: @@ -564,7 +564,7 @@ replicas continue to be exactly same. that the source mount is always a private mount. -5f) Unmount semantics +f) Unmount semantics Consider the following command:: @@ -598,7 +598,7 @@ replicas continue to be exactly same. 
to be unmounted and 'C1' has some sub-mounts, the umount operation is failed entirely. -5g) Clone Namespace +g) Clone Namespace A cloned namespace contains all the mounts as that of the parent namespace. @@ -682,18 +682,18 @@ replicas continue to be exactly same. 7) FAQ ------ - Q1. Why is bind mount needed? How is it different from symbolic links? + 1. Why is bind mount needed? How is it different from symbolic links? symbolic links can get stale if the destination mount gets unmounted or moved. Bind mounts continue to exist even if the other mount is unmounted or moved. - Q2. Why can't the shared subtree be implemented using exportfs? + 2. Why can't the shared subtree be implemented using exportfs? exportfs is a heavyweight way of accomplishing part of what shared subtree can do. I cannot imagine a way to implement the semantics of slave mount using exportfs? - Q3 Why is unbindable mount needed? + 3. Why is unbindable mount needed? Let's say we want to replicate the mount tree at multiple locations within the same subtree. @@ -852,7 +852,7 @@ replicas continue to be exactly same. 8) Implementation ----------------- -8A) Datastructure +A) Datastructure 4 new fields are introduced to struct vfsmount: @@ -941,7 +941,7 @@ replicas continue to be exactly same. NOTE: The propagation tree is orthogonal to the mount tree. -8B Locking: +B) Locking: ->mnt_share, ->mnt_slave, ->mnt_slave_list, ->mnt_master are protected by namespace_sem (exclusive for modifications, shared for reading). @@ -953,7 +953,7 @@ replicas continue to be exactly same. The latter holds namespace_sem and the only references to vfsmount are in lists that can't be traversed without namespace_sem. -8C Algorithm: +C) Algorithm: The crux of the implementation resides in rbind/move operation. 
-- cgit v1.2.3 From 570924bf17de3e0c86c0502e8a20f6017e17bbb2 Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Tue, 19 Aug 2025 13:12:51 +0700 Subject: Documentation: sharedsubtree: Don't repeat lists with explanation Don't repeat lists only mentioning the items when a corresponding list with item's explanations suffices. Signed-off-by: Bagas Sanjaya Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20250819061254.31220-4-bagasdotme@gmail.com --- Documentation/filesystems/sharedsubtree.rst | 106 ++++++++++++---------------- 1 file changed, 44 insertions(+), 62 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/sharedsubtree.rst b/Documentation/filesystems/sharedsubtree.rst index 7ad5101b4c03..64858ff0471b 100644 --- a/Documentation/filesystems/sharedsubtree.rst +++ b/Documentation/filesystems/sharedsubtree.rst @@ -31,15 +31,10 @@ and versioned filesystem. ----------- Shared subtree provides four different flavors of mounts; struct vfsmount to be -precise +precise: - a. shared mount - b. slave mount - c. private mount - d. unbindable mount - -a) A shared mount can be replicated to as many mountpoints and all the +a) A **shared mount** can be replicated to as many mountpoints and all the replicas continue to be exactly same. Here is an example: @@ -83,7 +78,7 @@ a) A shared mount can be replicated to as many mountpoints and all the contents will be visible under /tmp/a too. -b) A slave mount is like a shared mount except that mount and umount events +b) A **slave mount** is like a shared mount except that mount and umount events only propagate towards it. All slave mounts have a master mount which is a shared. @@ -131,12 +126,13 @@ b) A slave mount is like a shared mount except that mount and umount events /mnt -c) A private mount does not forward or receive propagation. +c) A **private mount** does not forward or receive propagation. This is the mount we are familiar with. Its the default type. 
-d) A unbindable mount is a unbindable private mount +d) An **unbindable mount** is, as the name suggests, an unbindable private + mount. let's say we have a mount at /mnt and we make it unbindable:: @@ -252,24 +248,18 @@ d) A unbindable mount is a unbindable private mount a) Mount states - A given mount can be in one of the following states - - 1) shared - 2) slave - 3) shared and slave - 4) private - 5) unbindable - - A 'propagation event' is defined as event generated on a vfsmount + A **propagation event** is defined as event generated on a vfsmount that leads to mount or unmount actions in other vfsmounts. - A 'peer group' is defined as a group of vfsmounts that propagate + A **peer group** is defined as a group of vfsmounts that propagate events to each other. + A given mount can be in one of the following states: + (1) Shared mounts - A 'shared mount' is defined as a vfsmount that belongs to a - 'peer group'. + A **shared mount** is defined as a vfsmount that belongs to a + peer group. For example:: @@ -284,7 +274,7 @@ a) Mount states (2) Slave mounts - A 'slave mount' is defined as a vfsmount that receives + A **slave mount** is defined as a vfsmount that receives propagation events and does not forward propagation events. A slave mount as the name implies has a master mount from which @@ -299,7 +289,7 @@ a) Mount states (3) Shared and Slave - A vfsmount can be both shared as well as slave. This state + A vfsmount can be both **shared** as well as **slave**. This state indicates that the mount is a slave of some vfsmount, and has its own peer group too. This vfsmount receives propagation events from its master vfsmount, and also forwards propagation @@ -318,12 +308,12 @@ a) Mount states (4) Private mount - A 'private mount' is defined as vfsmount that does not + A **private mount** is defined as vfsmount that does not receive or forward any propagation events. 
(5) Unbindable mount - A 'unbindable mount' is defined as vfsmount that does not + A **unbindable mount** is defined as vfsmount that does not receive or forward any propagation events and cannot be bind mounted. @@ -854,31 +844,26 @@ g) Clone Namespace A) Datastructure - 4 new fields are introduced to struct vfsmount: - - * ->mnt_share - * ->mnt_slave_list - * ->mnt_slave - * ->mnt_master + Several new fields are introduced to struct vfsmount: ->mnt_share - links together all the mount to/from which this vfsmount + Links together all the mount to/from which this vfsmount send/receives propagation events. ->mnt_slave_list - links all the mounts to which this vfsmount propagates + Links all the mounts to which this vfsmount propagates to. ->mnt_slave - links together all the slaves that its master vfsmount + Links together all the slaves that its master vfsmount propagates to. ->mnt_master - points to the master vfsmount from which this vfsmount + Points to the master vfsmount from which this vfsmount receives propagation. ->mnt_flags - takes two more flags to indicate the propagation status of + Takes two more flags to indicate the propagation status of the vfsmount. MNT_SHARE indicates that the vfsmount is a shared vfsmount. MNT_UNCLONABLE indicates that the vfsmount cannot be replicated. @@ -960,39 +945,36 @@ C) Algorithm: The overall algorithm breaks the operation into 3 phases: (look at attach_recursive_mnt() and propagate_mnt()) - 1. prepare phase. - 2. commit phases. - 3. abort phases. + 1. Prepare phase. - Prepare phase: + For each mount in the source tree: - for each mount in the source tree: + a) Create the necessary number of mount trees to + be attached to each of the mounts that receive + propagation from the destination mount. + b) Do not attach any of the trees to its destination. 
+ However note down its ->mnt_parent and ->mnt_mountpoint + c) Link all the new mounts to form a propagation tree that + is identical to the propagation tree of the destination + mount. - a) Create the necessary number of mount trees to - be attached to each of the mounts that receive - propagation from the destination mount. - b) Do not attach any of the trees to its destination. - However note down its ->mnt_parent and ->mnt_mountpoint - c) Link all the new mounts to form a propagation tree that - is identical to the propagation tree of the destination - mount. + If this phase is successful, there should be 'n' new + propagation trees; where 'n' is the number of mounts in the + source tree. Go to the commit phase - If this phase is successful, there should be 'n' new - propagation trees; where 'n' is the number of mounts in the - source tree. Go to the commit phase + Also there should be 'm' new mount trees, where 'm' is + the number of mounts to which the destination mount + propagates to. - Also there should be 'm' new mount trees, where 'm' is - the number of mounts to which the destination mount - propagates to. + If any memory allocations fail, go to the abort phase. - if any memory allocations fail, go to the abort phase. + 2. Commit phase. - Commit phase - attach each of the mount trees to their corresponding - destination mounts. + Attach each of the mount trees to their corresponding + destination mounts. - Abort phase - delete all the newly created trees. + 3. Abort phase. + Delete all the newly created trees. .. Note:: all the propagation related functionality resides in the file pnode.c -- cgit v1.2.3 From b293fd55a1b8925b9b0578dca88d93eaaa6942c5 Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Tue, 19 Aug 2025 13:12:52 +0700 Subject: Documentation: sharedsubtree: Align text The docs make heavy use of lists. 
As it is currently written, these generate a lot of unnecessary hanging indents since these are not semantically meant to be definition lists by accident. Align text to trim these indents.

Signed-off-by: Bagas Sanjaya
Signed-off-by: Jonathan Corbet
Link: https://lore.kernel.org/r/20250819061254.31220-5-bagasdotme@gmail.com
---
 Documentation/filesystems/sharedsubtree.rst | 1301 ++++++++++++++-------------
 1 file changed, 651 insertions(+), 650 deletions(-)

(limited to 'Documentation/filesystems')

diff --git a/Documentation/filesystems/sharedsubtree.rst b/Documentation/filesystems/sharedsubtree.rst
index 64858ff0471b..b09650e28534 100644
--- a/Documentation/filesystems/sharedsubtree.rst
+++ b/Documentation/filesystems/sharedsubtree.rst
@@ -37,947 +37,948 @@ precise:
 a) A **shared mount** can be replicated to as many mountpoints and all the replicas continue to be exactly same.

- Here is an example:
+ Here is an example:

- Let's say /mnt has a mount that is shared::
+ Let's say /mnt has a mount that is shared::

- mount --make-shared /mnt
+ # mount --make-shared /mnt

- Note: mount(8) command now supports the --make-shared flag,
- so the sample 'smount' program is no longer needed and has been
- removed.
+ Note: mount(8) command now supports the --make-shared flag,
+ so the sample 'smount' program is no longer needed and has been
+ removed.

- ::
+ ::

- # mount --bind /mnt /tmp
+ # mount --bind /mnt /tmp

- The above command replicates the mount at /mnt to the mountpoint /tmp
- and the contents of both the mounts remain identical.
+ The above command replicates the mount at /mnt to the mountpoint /tmp
+ and the contents of both the mounts remain identical.
- ::
+ ::

- #ls /mnt
- a b c
+ #ls /mnt
+ a b c

- #ls /tmp
- a b c
+ #ls /tmp
+ a b c

- Now let's say we mount a device at /tmp/a::
+ Now let's say we mount a device at /tmp/a::

- # mount /dev/sd0 /tmp/a
+ # mount /dev/sd0 /tmp/a

- #ls /tmp/a
- t1 t2 t3
+ # ls /tmp/a
+ t1 t2 t3

- #ls /mnt/a
- t1 t2 t3
+ # ls /mnt/a
+ t1 t2 t3

- Note that the mount has propagated to the mount at /mnt as well.
+ Note that the mount has propagated to the mount at /mnt as well.

- And the same is true even when /dev/sd0 is mounted on /mnt/a. The
- contents will be visible under /tmp/a too.
+ And the same is true even when /dev/sd0 is mounted on /mnt/a. The
+ contents will be visible under /tmp/a too.

 b) A **slave mount** is like a shared mount except that mount and umount events only propagate towards it.

- All slave mounts have a master mount which is a shared.
+ All slave mounts have a master mount which is a shared.

- Here is an example:
+ Here is an example:

- Let's say /mnt has a mount which is shared::
+ Let's say /mnt has a mount which is shared::

- # mount --make-shared /mnt
+ # mount --make-shared /mnt

- Let's bind mount /mnt to /tmp::
+ Let's bind mount /mnt to /tmp::

- # mount --bind /mnt /tmp
+ # mount --bind /mnt /tmp

- the new mount at /tmp becomes a shared mount and it is a replica of
- the mount at /mnt.
+ the new mount at /tmp becomes a shared mount and it is a replica of
+ the mount at /mnt.
- Now let's make the mount at /tmp; a slave of /mnt::
+ Now let's make the mount at /tmp; a slave of /mnt::

- # mount --make-slave /tmp
+ # mount --make-slave /tmp

- let's mount /dev/sd0 on /mnt/a::
+ let's mount /dev/sd0 on /mnt/a::

- # mount /dev/sd0 /mnt/a
+ # mount /dev/sd0 /mnt/a

- # ls /mnt/a
- t1 t2 t3
+ # ls /mnt/a
+ t1 t2 t3

- # ls /tmp/a
- t1 t2 t3
+ # ls /tmp/a
+ t1 t2 t3

- Note the mount event has propagated to the mount at /tmp
+ Note the mount event has propagated to the mount at /tmp

- However let's see what happens if we mount something on the mount at
- /tmp::
+ However let's see what happens if we mount something on the mount at
+ /tmp::

- # mount /dev/sd1 /tmp/b
+ # mount /dev/sd1 /tmp/b

- # ls /tmp/b
- s1 s2 s3
+ # ls /tmp/b
+ s1 s2 s3

- # ls /mnt/b
+ # ls /mnt/b

- Note how the mount event has not propagated to the mount at
- /mnt
+ Note how the mount event has not propagated to the mount at
+ /mnt

 c) A **private mount** does not forward or receive propagation.

- This is the mount we are familiar with. Its the default type.
+ This is the mount we are familiar with. Its the default type.

 d) An **unbindable mount** is, as the name suggests, an unbindable private mount.

- let's say we have a mount at /mnt and we make it unbindable::
+ let's say we have a mount at /mnt and we make it unbindable::

- # mount --make-unbindable /mnt
+ # mount --make-unbindable /mnt

- Let's try to bind mount this mount somewhere else::
+ Let's try to bind mount this mount somewhere else::

- # mount --bind /mnt /tmp
- mount: wrong fs type, bad option, bad superblock on /mnt,
- or too many mounted file systems
+ # mount --bind /mnt /tmp mount: wrong fs type, bad option, bad
+ superblock on /mnt, or too many mounted file systems

- Binding a unbindable mount is a invalid operation.
+ Binding a unbindable mount is a invalid operation.
 3) Setting mount states
 -----------------------

- The mount command (util-linux package) can be used to set mount
- states::
+The mount command (util-linux package) can be used to set mount
+states::

- mount --make-shared mountpoint
- mount --make-slave mountpoint
- mount --make-private mountpoint
- mount --make-unbindable mountpoint
+ mount --make-shared mountpoint
+ mount --make-slave mountpoint
+ mount --make-private mountpoint
+ mount --make-unbindable mountpoint

 4) Use cases
 ------------

- A) A process wants to clone its own namespace, but still wants to
- access the CD that got mounted recently.
+A) A process wants to clone its own namespace, but still wants to
+ access the CD that got mounted recently.

- Solution:
+ Solution:

- The system administrator can make the mount at /cdrom shared::
+ The system administrator can make the mount at /cdrom shared::

- mount --bind /cdrom /cdrom
- mount --make-shared /cdrom
+ mount --bind /cdrom /cdrom
+ mount --make-shared /cdrom

- Now any process that clones off a new namespace will have a
- mount at /cdrom which is a replica of the same mount in the
- parent namespace.
+ Now any process that clones off a new namespace will have a
+ mount at /cdrom which is a replica of the same mount in the
+ parent namespace.

- So when a CD is inserted and mounted at /cdrom that mount gets
- propagated to the other mount at /cdrom in all the other clone
- namespaces.
+ So when a CD is inserted and mounted at /cdrom that mount gets
+ propagated to the other mount at /cdrom in all the other clone
+ namespaces.

- B) A process wants its mounts invisible to any other process, but
- still be able to see the other system mounts.
+B) A process wants its mounts invisible to any other process, but
+ still be able to see the other system mounts.
- Solution:
+ Solution:

- To begin with, the administrator can mark the entire mount tree
- as shareable::
+ To begin with, the administrator can mark the entire mount tree
+ as shareable::

- mount --make-rshared /
+ mount --make-rshared /

- A new process can clone off a new namespace. And mark some part
- of its namespace as slave::
+ A new process can clone off a new namespace. And mark some part
+ of its namespace as slave::

- mount --make-rslave /myprivatetree
+ mount --make-rslave /myprivatetree

- Hence forth any mounts within the /myprivatetree done by the
- process will not show up in any other namespace. However mounts
- done in the parent namespace under /myprivatetree still shows
- up in the process's namespace.
+ Hence forth any mounts within the /myprivatetree done by the
+ process will not show up in any other namespace. However mounts
+ done in the parent namespace under /myprivatetree still shows
+ up in the process's namespace.

- Apart from the above semantics this feature provides the
- building blocks to solve the following problems:
+Apart from the above semantics this feature provides the
+building blocks to solve the following problems:

- C) Per-user namespace
+C) Per-user namespace

- The above semantics allows a way to share mounts across
- namespaces. But namespaces are associated with processes. If
- namespaces are made first class objects with user API to
- associate/disassociate a namespace with userid, then each user
- could have his/her own namespace and tailor it to his/her
- requirements. This needs to be supported in PAM.
+ The above semantics allows a way to share mounts across
+ namespaces. But namespaces are associated with processes. If
+ namespaces are made first class objects with user API to
+ associate/disassociate a namespace with userid, then each user
+ could have his/her own namespace and tailor it to his/her
+ requirements. This needs to be supported in PAM.
- D) Versioned files
+D) Versioned files

- If the entire mount tree is visible at multiple locations, then
- an underlying versioning file system can return different
- versions of the file depending on the path used to access that
- file.
+ If the entire mount tree is visible at multiple locations, then
+ an underlying versioning file system can return different
+ versions of the file depending on the path used to access that
+ file.

- An example is::
+ An example is::

- mount --make-shared /
- mount --rbind / /view/v1
- mount --rbind / /view/v2
- mount --rbind / /view/v3
- mount --rbind / /view/v4
+ mount --make-shared /
+ mount --rbind / /view/v1
+ mount --rbind / /view/v2
+ mount --rbind / /view/v3
+ mount --rbind / /view/v4

- and if /usr has a versioning filesystem mounted, then that
- mount appears at /view/v1/usr, /view/v2/usr, /view/v3/usr and
- /view/v4/usr too
+ and if /usr has a versioning filesystem mounted, then that
+ mount appears at /view/v1/usr, /view/v2/usr, /view/v3/usr and
+ /view/v4/usr too

- A user can request v3 version of the file /usr/fs/namespace.c
- by accessing /view/v3/usr/fs/namespace.c . The underlying
- versioning filesystem can then decipher that v3 version of the
- filesystem is being requested and return the corresponding
- inode.
+ A user can request v3 version of the file /usr/fs/namespace.c
+ by accessing /view/v3/usr/fs/namespace.c . The underlying
+ versioning filesystem can then decipher that v3 version of the
+ filesystem is being requested and return the corresponding
+ inode.

 5) Detailed semantics
 ---------------------

- The section below explains the detailed semantics of
- bind, rbind, move, mount, umount and clone-namespace operations.
+The section below explains the detailed semantics of
+bind, rbind, move, mount, umount and clone-namespace operations.

- Note: the word 'vfsmount' and the noun 'mount' have been used
- to mean the same thing, throughout this document.
+Note: the word 'vfsmount' and the noun 'mount' have been used
+to mean the same thing, throughout this document.

 a) Mount states

- A **propagation event** is defined as event generated on a vfsmount
- that leads to mount or unmount actions in other vfsmounts.
+ A **propagation event** is defined as event generated on a vfsmount
+ that leads to mount or unmount actions in other vfsmounts.

- A **peer group** is defined as a group of vfsmounts that propagate
- events to each other.
+ A **peer group** is defined as a group of vfsmounts that propagate
+ events to each other.

- A given mount can be in one of the following states:
+ A given mount can be in one of the following states:

- (1) Shared mounts
+ (1) Shared mounts

- A **shared mount** is defined as a vfsmount that belongs to a
- peer group.
+ A **shared mount** is defined as a vfsmount that belongs to a
+ peer group.

- For example::
+ For example::

- mount --make-shared /mnt
- mount --bind /mnt /tmp
+ mount --make-shared /mnt
+ mount --bind /mnt /tmp

- The mount at /mnt and that at /tmp are both shared and belong
- to the same peer group. Anything mounted or unmounted under
- /mnt or /tmp reflect in all the other mounts of its peer
- group.
+ The mount at /mnt and that at /tmp are both shared and belong
+ to the same peer group. Anything mounted or unmounted under
+ /mnt or /tmp reflect in all the other mounts of its peer
+ group.

- (2) Slave mounts
+ (2) Slave mounts

- A **slave mount** is defined as a vfsmount that receives
- propagation events and does not forward propagation events.
+ A **slave mount** is defined as a vfsmount that receives
+ propagation events and does not forward propagation events.

- A slave mount as the name implies has a master mount from which
- mount/unmount events are received. Events do not propagate from
- the slave mount to the master.
Only a shared mount can be made
- a slave by executing the following command::
+ A slave mount as the name implies has a master mount from which
+ mount/unmount events are received. Events do not propagate from
+ the slave mount to the master. Only a shared mount can be made
+ a slave by executing the following command::

- mount --make-slave mount
+ mount --make-slave mount

- A shared mount that is made as a slave is no more shared unless
- modified to become shared.
+ A shared mount that is made as a slave is no more shared unless
+ modified to become shared.

- (3) Shared and Slave
+ (3) Shared and Slave

- A vfsmount can be both **shared** as well as **slave**. This state
- indicates that the mount is a slave of some vfsmount, and
- has its own peer group too. This vfsmount receives propagation
- events from its master vfsmount, and also forwards propagation
- events to its 'peer group' and to its slave vfsmounts.
+ A vfsmount can be both **shared** as well as **slave**. This state
+ indicates that the mount is a slave of some vfsmount, and
+ has its own peer group too. This vfsmount receives propagation
+ events from its master vfsmount, and also forwards propagation
+ events to its 'peer group' and to its slave vfsmounts.

- Strictly speaking, the vfsmount is shared having its own
- peer group, and this peer-group is a slave of some other
- peer group.
+ Strictly speaking, the vfsmount is shared having its own
+ peer group, and this peer-group is a slave of some other
+ peer group.

- Only a slave vfsmount can be made as 'shared and slave' by
- either executing the following command::
+ Only a slave vfsmount can be made as 'shared and slave' by
+ either executing the following command::

- mount --make-shared mount
+ mount --make-shared mount

- or by moving the slave vfsmount under a shared vfsmount.
+ or by moving the slave vfsmount under a shared vfsmount.
- (4) Private mount
+ (4) Private mount

- A **private mount** is defined as vfsmount that does not
- receive or forward any propagation events.
+ A **private mount** is defined as vfsmount that does not
+ receive or forward any propagation events.

- (5) Unbindable mount
+ (5) Unbindable mount

- A **unbindable mount** is defined as vfsmount that does not
- receive or forward any propagation events and cannot
- be bind mounted.
+ A **unbindable mount** is defined as vfsmount that does not
+ receive or forward any propagation events and cannot
+ be bind mounted.

- State diagram:
+ State diagram:

- The state diagram below explains the state transition of a mount,
- in response to various commands::
+ The state diagram below explains the state transition of a mount,
+ in response to various commands::

- -----------------------------------------------------------------------
- | |make-shared | make-slave | make-private |make-unbindab|
- --------------|------------|--------------|--------------|-------------|
- |shared |shared |*slave/private| private | unbindable |
- | | | | | |
- |-------------|------------|--------------|--------------|-------------|
- |slave |shared | **slave | private | unbindable |
- | |and slave | | | |
- |-------------|------------|--------------|--------------|-------------|
- |shared |shared | slave | private | unbindable |
- |and slave |and slave | | | |
- |-------------|------------|--------------|--------------|-------------|
- |private |shared | **private | private | unbindable |
- |-------------|------------|--------------|--------------|-------------|
- |unbindable |shared |**unbindable | private | unbindable |
- ------------------------------------------------------------------------
+ -----------------------------------------------------------------------
+ | |make-shared | make-slave | make-private |make-unbindab|
+ --------------|------------|--------------|--------------|-------------|
+ |shared |shared |*slave/private| private |
unbindable |
+ | | | | | |
+ |-------------|------------|--------------|--------------|-------------|
+ |slave |shared | **slave | private | unbindable |
+ | |and slave | | | |
+ |-------------|------------|--------------|--------------|-------------|
+ |shared |shared | slave | private | unbindable |
+ |and slave |and slave | | | |
+ |-------------|------------|--------------|--------------|-------------|
+ |private |shared | **private | private | unbindable |
+ |-------------|------------|--------------|--------------|-------------|
+ |unbindable |shared |**unbindable | private | unbindable |
+ ------------------------------------------------------------------------

- * if the shared mount is the only mount in its peer group, making it
- slave, makes it private automatically. Note that there is no master to
- which it can be slaved to.
+ * if the shared mount is the only mount in its peer group, making it
+ slave, makes it private automatically. Note that there is no master to
+ which it can be slaved to.

- ** slaving a non-shared mount has no effect on the mount.
+ ** slaving a non-shared mount has no effect on the mount.

- Apart from the commands listed below, the 'move' operation also changes
- the state of a mount depending on type of the destination mount. Its
- explained in section 5d.
+ Apart from the commands listed below, the 'move' operation also changes
+ the state of a mount depending on type of the destination mount. Its
+ explained in section 5d.

 b) Bind semantics

- Consider the following command::
-
- mount --bind A/a B/b
-
- where 'A' is the source mount, 'a' is the dentry in the mount 'A', 'B'
- is the destination mount and 'b' is the dentry in the destination mount.
-
- The outcome depends on the type of mount of 'A' and 'B'.
The table
- below contains quick reference::
-
- --------------------------------------------------------------------------
- | BIND MOUNT OPERATION |
- |************************************************************************|
- |source(A)->| shared | private | slave | unbindable |
- | dest(B) | | | | |
- | | | | | | |
- | v | | | | |
- |************************************************************************|
- | shared | shared | shared | shared & slave | invalid |
- | | | | | |
- |non-shared| shared | private | slave | invalid |
- **************************************************************************
-
- Details:
-
- 1. 'A' is a shared mount and 'B' is a shared mount. A new mount 'C'
- which is clone of 'A', is created. Its root dentry is 'a' . 'C' is
- mounted on mount 'B' at dentry 'b'. Also new mount 'C1', 'C2', 'C3' ...
- are created and mounted at the dentry 'b' on all mounts where 'B'
- propagates to. A new propagation tree containing 'C1',..,'Cn' is
- created. This propagation tree is identical to the propagation tree of
- 'B'. And finally the peer-group of 'C' is merged with the peer group
- of 'A'.
-
- 2. 'A' is a private mount and 'B' is a shared mount. A new mount 'C'
- which is clone of 'A', is created. Its root dentry is 'a'. 'C' is
- mounted on mount 'B' at dentry 'b'. Also new mount 'C1', 'C2', 'C3' ...
- are created and mounted at the dentry 'b' on all mounts where 'B'
- propagates to. A new propagation tree is set containing all new mounts
- 'C', 'C1', .., 'Cn' with exactly the same configuration as the
- propagation tree for 'B'.
-
- 3. 'A' is a slave mount of mount 'Z' and 'B' is a shared mount. A new
- mount 'C' which is clone of 'A', is created. Its root dentry is 'a' .
- 'C' is mounted on mount 'B' at dentry 'b'. Also new mounts 'C1', 'C2',
- 'C3' ... are created and mounted at the dentry 'b' on all mounts where
- 'B' propagates to. A new propagation tree containing the new mounts
- 'C','C1',.. 'Cn' is created.
This propagation tree is identical to the
- propagation tree for 'B'. And finally the mount 'C' and its peer group
- is made the slave of mount 'Z'. In other words, mount 'C' is in the
- state 'slave and shared'.
-
- 4. 'A' is a unbindable mount and 'B' is a shared mount. This is a
- invalid operation.
-
- 5. 'A' is a private mount and 'B' is a non-shared(private or slave or
- unbindable) mount. A new mount 'C' which is clone of 'A', is created.
- Its root dentry is 'a'. 'C' is mounted on mount 'B' at dentry 'b'.
-
- 6. 'A' is a shared mount and 'B' is a non-shared mount. A new mount 'C'
- which is a clone of 'A' is created. Its root dentry is 'a'. 'C' is
- mounted on mount 'B' at dentry 'b'. 'C' is made a member of the
- peer-group of 'A'.
-
- 7. 'A' is a slave mount of mount 'Z' and 'B' is a non-shared mount. A
- new mount 'C' which is a clone of 'A' is created. Its root dentry is
- 'a'. 'C' is mounted on mount 'B' at dentry 'b'. Also 'C' is set as a
- slave mount of 'Z'. In other words 'A' and 'C' are both slave mounts of
- 'Z'. All mount/unmount events on 'Z' propagates to 'A' and 'C'. But
- mount/unmount on 'A' do not propagate anywhere else. Similarly
- mount/unmount on 'C' do not propagate anywhere else.
-
- 8. 'A' is a unbindable mount and 'B' is a non-shared mount. This is a
- invalid operation. A unbindable mount cannot be bind mounted.
+ Consider the following command::
+
+ mount --bind A/a B/b
+
+ where 'A' is the source mount, 'a' is the dentry in the mount 'A', 'B'
+ is the destination mount and 'b' is the dentry in the destination mount.
+
+ The outcome depends on the type of mount of 'A' and 'B'.
The table
+ below contains quick reference::
+
+ --------------------------------------------------------------------------
+ | BIND MOUNT OPERATION |
+ |************************************************************************|
+ |source(A)->| shared | private | slave | unbindable |
+ | dest(B) | | | | |
+ | | | | | | |
+ | v | | | | |
+ |************************************************************************|
+ | shared | shared | shared | shared & slave | invalid |
+ | | | | | |
+ |non-shared| shared | private | slave | invalid |
+ **************************************************************************
+
+ Details:
+
+ 1. 'A' is a shared mount and 'B' is a shared mount. A new mount 'C'
+ which is clone of 'A', is created. Its root dentry is 'a' . 'C' is
+ mounted on mount 'B' at dentry 'b'. Also new mount 'C1', 'C2', 'C3' ...
+ are created and mounted at the dentry 'b' on all mounts where 'B'
+ propagates to. A new propagation tree containing 'C1',..,'Cn' is
+ created. This propagation tree is identical to the propagation tree of
+ 'B'. And finally the peer-group of 'C' is merged with the peer group
+ of 'A'.
+
+ 2. 'A' is a private mount and 'B' is a shared mount. A new mount 'C'
+ which is clone of 'A', is created. Its root dentry is 'a'. 'C' is
+ mounted on mount 'B' at dentry 'b'. Also new mount 'C1', 'C2', 'C3' ...
+ are created and mounted at the dentry 'b' on all mounts where 'B'
+ propagates to. A new propagation tree is set containing all new mounts
+ 'C', 'C1', .., 'Cn' with exactly the same configuration as the
+ propagation tree for 'B'.
+
+ 3. 'A' is a slave mount of mount 'Z' and 'B' is a shared mount. A new
+ mount 'C' which is clone of 'A', is created. Its root dentry is 'a' .
+ 'C' is mounted on mount 'B' at dentry 'b'. Also new mounts 'C1', 'C2',
+ 'C3' ... are created and mounted at the dentry 'b' on all mounts where
+ 'B' propagates to. A new propagation tree containing the new mounts
+ 'C','C1',.. 'Cn' is created.
This propagation tree is identical to the
+ propagation tree for 'B'. And finally the mount 'C' and its peer group
+ is made the slave of mount 'Z'. In other words, mount 'C' is in the
+ state 'slave and shared'.
+
+ 4. 'A' is a unbindable mount and 'B' is a shared mount. This is a
+ invalid operation.
+
+ 5. 'A' is a private mount and 'B' is a non-shared(private or slave or
+ unbindable) mount. A new mount 'C' which is clone of 'A', is created.
+ Its root dentry is 'a'. 'C' is mounted on mount 'B' at dentry 'b'.
+
+ 6. 'A' is a shared mount and 'B' is a non-shared mount. A new mount 'C'
+ which is a clone of 'A' is created. Its root dentry is 'a'. 'C' is
+ mounted on mount 'B' at dentry 'b'. 'C' is made a member of the
+ peer-group of 'A'.
+
+ 7. 'A' is a slave mount of mount 'Z' and 'B' is a non-shared mount. A
+ new mount 'C' which is a clone of 'A' is created. Its root dentry is
+ 'a'. 'C' is mounted on mount 'B' at dentry 'b'. Also 'C' is set as a
+ slave mount of 'Z'. In other words 'A' and 'C' are both slave mounts of
+ 'Z'. All mount/unmount events on 'Z' propagates to 'A' and 'C'. But
+ mount/unmount on 'A' do not propagate anywhere else. Similarly
+ mount/unmount on 'C' do not propagate anywhere else.
+
+ 8. 'A' is a unbindable mount and 'B' is a non-shared mount. This is a
+ invalid operation. A unbindable mount cannot be bind mounted.

 c) Rbind semantics

- rbind is same as bind. Bind replicates the specified mount. Rbind
- replicates all the mounts in the tree belonging to the specified mount.
- Rbind mount is bind mount applied to all the mounts in the tree.
+ rbind is same as bind. Bind replicates the specified mount. Rbind
+ replicates all the mounts in the tree belonging to the specified mount.
+ Rbind mount is bind mount applied to all the mounts in the tree.

- If the source tree that is rbind has some unbindable mounts,
- then the subtree under the unbindable mount is pruned in the new
- location.
+ If the source tree that is rbind has some unbindable mounts,
+ then the subtree under the unbindable mount is pruned in the new
+ location.

- eg:
+ eg:

- let's say we have the following mount tree::
+ let's say we have the following mount tree::

- A
- / \
- B C
- / \ / \
- D E F G
+ A
+ / \
+ B C
+ / \ / \
+ D E F G

- Let's say all the mount except the mount C in the tree are
- of a type other than unbindable.
+ Let's say all the mount except the mount C in the tree are
+ of a type other than unbindable.

- If this tree is rbound to say Z
+ If this tree is rbound to say Z

- We will have the following tree at the new location::
+ We will have the following tree at the new location::

- Z
- |
- A'
- /
- B' Note how the tree under C is pruned
- / \ in the new location.
- D' E'
+ Z
+ |
+ A'
+ /
+ B' Note how the tree under C is pruned
+ / \ in the new location.
+ D' E'

 d) Move semantics

- Consider the following command::
-
- mount --move A B/b
-
- where 'A' is the source mount, 'B' is the destination mount and 'b' is
- the dentry in the destination mount.
-
- The outcome depends on the type of the mount of 'A' and 'B'. The table
- below is a quick reference::
-
- ---------------------------------------------------------------------------
- | MOVE MOUNT OPERATION |
- |**************************************************************************
- | source(A)->| shared | private | slave | unbindable |
- | dest(B) | | | | |
- | | | | | | |
- | v | | | | |
- |**************************************************************************
- | shared | shared | shared |shared and slave| invalid |
- | | | | | |
- |non-shared| shared | private | slave | unbindable |
- ***************************************************************************
-
- .. Note:: moving a mount residing under a shared mount is invalid.
-
- Details follow:
-
- 1. 'A' is a shared mount and 'B' is a shared mount. The mount 'A' is
- mounted on mount 'B' at dentry 'b'.
Also new mounts 'A1', 'A2'...'An'
- are created and mounted at dentry 'b' on all mounts that receive
- propagation from mount 'B'. A new propagation tree is created in the
- exact same configuration as that of 'B'. This new propagation tree
- contains all the new mounts 'A1', 'A2'... 'An'. And this new
- propagation tree is appended to the already existing propagation tree
- of 'A'.
-
- 2. 'A' is a private mount and 'B' is a shared mount. The mount 'A' is
- mounted on mount 'B' at dentry 'b'. Also new mount 'A1', 'A2'... 'An'
- are created and mounted at dentry 'b' on all mounts that receive
- propagation from mount 'B'. The mount 'A' becomes a shared mount and a
- propagation tree is created which is identical to that of
- 'B'. This new propagation tree contains all the new mounts 'A1',
- 'A2'... 'An'.
-
- 3. 'A' is a slave mount of mount 'Z' and 'B' is a shared mount. The
- mount 'A' is mounted on mount 'B' at dentry 'b'. Also new mounts 'A1',
- 'A2'... 'An' are created and mounted at dentry 'b' on all mounts that
- receive propagation from mount 'B'. A new propagation tree is created
- in the exact same configuration as that of 'B'. This new propagation
- tree contains all the new mounts 'A1', 'A2'... 'An'. And this new
- propagation tree is appended to the already existing propagation tree of
- 'A'. Mount 'A' continues to be the slave mount of 'Z' but it also
- becomes 'shared'.
-
- 4. 'A' is a unbindable mount and 'B' is a shared mount. The operation
- is invalid. Because mounting anything on the shared mount 'B' can
- create new mounts that get mounted on the mounts that receive
- propagation from 'B'. And since the mount 'A' is unbindable, cloning
- it to mount at other mountpoints is not possible.
-
- 5. 'A' is a private mount and 'B' is a non-shared(private or slave or
- unbindable) mount. The mount 'A' is mounted on mount 'B' at dentry 'b'.
-
- 6. 'A' is a shared mount and 'B' is a non-shared mount. The mount 'A'
- is mounted on mount 'B' at dentry 'b'.
Mount 'A' continues to be a
- shared mount.
-
- 7. 'A' is a slave mount of mount 'Z' and 'B' is a non-shared mount.
- The mount 'A' is mounted on mount 'B' at dentry 'b'. Mount 'A'
- continues to be a slave mount of mount 'Z'.
-
- 8. 'A' is a unbindable mount and 'B' is a non-shared mount. The mount
- 'A' is mounted on mount 'B' at dentry 'b'. Mount 'A' continues to be a
- unbindable mount.
+ Consider the following command::
+
+ mount --move A B/b
+
+ where 'A' is the source mount, 'B' is the destination mount and 'b' is
+ the dentry in the destination mount.
+
+ The outcome depends on the type of the mount of 'A' and 'B'. The table
+ below is a quick reference::
+
+ ---------------------------------------------------------------------------
+ | MOVE MOUNT OPERATION |
+ |**************************************************************************
+ | source(A)->| shared | private | slave | unbindable |
+ | dest(B) | | | | |
+ | | | | | | |
+ | v | | | | |
+ |**************************************************************************
+ | shared | shared | shared |shared and slave| invalid |
+ | | | | | |
+ |non-shared| shared | private | slave | unbindable |
+ ***************************************************************************
+
+ .. Note:: moving a mount residing under a shared mount is invalid.
+
+ Details follow:
+
+ 1. 'A' is a shared mount and 'B' is a shared mount. The mount 'A' is
+ mounted on mount 'B' at dentry 'b'. Also new mounts 'A1', 'A2'...'An'
+ are created and mounted at dentry 'b' on all mounts that receive
+ propagation from mount 'B'. A new propagation tree is created in the
+ exact same configuration as that of 'B'. This new propagation tree
+ contains all the new mounts 'A1', 'A2'... 'An'. And this new
+ propagation tree is appended to the already existing propagation tree
+ of 'A'.
+
+ 2. 'A' is a private mount and 'B' is a shared mount. The mount 'A' is
+ mounted on mount 'B' at dentry 'b'. Also new mount 'A1', 'A2'...
'An'
+ are created and mounted at dentry 'b' on all mounts that receive
+ propagation from mount 'B'. The mount 'A' becomes a shared mount and a
+ propagation tree is created which is identical to that of
+ 'B'. This new propagation tree contains all the new mounts 'A1',
+ 'A2'... 'An'.
+
+ 3. 'A' is a slave mount of mount 'Z' and 'B' is a shared mount. The
+ mount 'A' is mounted on mount 'B' at dentry 'b'. Also new mounts 'A1',
+ 'A2'... 'An' are created and mounted at dentry 'b' on all mounts that
+ receive propagation from mount 'B'. A new propagation tree is created
+ in the exact same configuration as that of 'B'. This new propagation
+ tree contains all the new mounts 'A1', 'A2'... 'An'. And this new
+ propagation tree is appended to the already existing propagation tree of
+ 'A'. Mount 'A' continues to be the slave mount of 'Z' but it also
+ becomes 'shared'.
+
+ 4. 'A' is a unbindable mount and 'B' is a shared mount. The operation
+ is invalid. Because mounting anything on the shared mount 'B' can
+ create new mounts that get mounted on the mounts that receive
+ propagation from 'B'. And since the mount 'A' is unbindable, cloning
+ it to mount at other mountpoints is not possible.
+
+ 5. 'A' is a private mount and 'B' is a non-shared(private or slave or
+ unbindable) mount. The mount 'A' is mounted on mount 'B' at dentry 'b'.
+
+ 6. 'A' is a shared mount and 'B' is a non-shared mount. The mount 'A'
+ is mounted on mount 'B' at dentry 'b'. Mount 'A' continues to be a
+ shared mount.
+
+ 7. 'A' is a slave mount of mount 'Z' and 'B' is a non-shared mount.
+ The mount 'A' is mounted on mount 'B' at dentry 'b'. Mount 'A'
+ continues to be a slave mount of mount 'Z'.
+
+ 8. 'A' is a unbindable mount and 'B' is a non-shared mount. The mount
+ 'A' is mounted on mount 'B' at dentry 'b'. Mount 'A' continues to be a
+ unbindable mount.
e) Mount semantics - Consider the following command:: + Consider the following command:: - mount device B/b + mount device B/b - 'B' is the destination mount and 'b' is the dentry in the destination - mount. + 'B' is the destination mount and 'b' is the dentry in the destination + mount. - The above operation is the same as bind operation with the exception - that the source mount is always a private mount. + The above operation is the same as bind operation with the exception + that the source mount is always a private mount. f) Unmount semantics - Consider the following command:: + Consider the following command:: - umount A + umount A - where 'A' is a mount mounted on mount 'B' at dentry 'b'. + where 'A' is a mount mounted on mount 'B' at dentry 'b'. - If mount 'B' is shared, then all most-recently-mounted mounts at dentry - 'b' on mounts that receive propagation from mount 'B' and does not have - sub-mounts within them are unmounted. + If mount 'B' is shared, then all most-recently-mounted mounts at dentry + 'b' on mounts that receive propagation from mount 'B' and does not have + sub-mounts within them are unmounted. - Example: Let's say 'B1', 'B2', 'B3' are shared mounts that propagate to - each other. + Example: Let's say 'B1', 'B2', 'B3' are shared mounts that propagate to + each other. - let's say 'A1', 'A2', 'A3' are first mounted at dentry 'b' on mount - 'B1', 'B2' and 'B3' respectively. + let's say 'A1', 'A2', 'A3' are first mounted at dentry 'b' on mount + 'B1', 'B2' and 'B3' respectively. - let's say 'C1', 'C2', 'C3' are next mounted at the same dentry 'b' on - mount 'B1', 'B2' and 'B3' respectively. + let's say 'C1', 'C2', 'C3' are next mounted at the same dentry 'b' on + mount 'B1', 'B2' and 'B3' respectively. - if 'C1' is unmounted, all the mounts that are most-recently-mounted on - 'B1' and on the mounts that 'B1' propagates-to are unmounted. 
+ if 'C1' is unmounted, all the mounts that are most-recently-mounted on + 'B1' and on the mounts that 'B1' propagates-to are unmounted. - 'B1' propagates to 'B2' and 'B3'. And the most recently mounted mount - on 'B2' at dentry 'b' is 'C2', and that of mount 'B3' is 'C3'. + 'B1' propagates to 'B2' and 'B3'. And the most recently mounted mount + on 'B2' at dentry 'b' is 'C2', and that of mount 'B3' is 'C3'. - So all 'C1', 'C2' and 'C3' should be unmounted. + So all 'C1', 'C2' and 'C3' should be unmounted. - If any of 'C2' or 'C3' has some child mounts, then that mount is not - unmounted, but all other mounts are unmounted. However if 'C1' is told - to be unmounted and 'C1' has some sub-mounts, the umount operation is - failed entirely. + If any of 'C2' or 'C3' has some child mounts, then that mount is not + unmounted, but all other mounts are unmounted. However if 'C1' is told + to be unmounted and 'C1' has some sub-mounts, the umount operation is + failed entirely. g) Clone Namespace - A cloned namespace contains all the mounts as that of the parent - namespace. + A cloned namespace contains all the mounts as that of the parent + namespace. - Let's say 'A' and 'B' are the corresponding mounts in the parent and the - child namespace. + Let's say 'A' and 'B' are the corresponding mounts in the parent and the + child namespace. - If 'A' is shared, then 'B' is also shared and 'A' and 'B' propagate to - each other. + If 'A' is shared, then 'B' is also shared and 'A' and 'B' propagate to + each other. - If 'A' is a slave mount of 'Z', then 'B' is also the slave mount of - 'Z'. + If 'A' is a slave mount of 'Z', then 'B' is also the slave mount of + 'Z'. - If 'A' is a private mount, then 'B' is a private mount too. + If 'A' is a private mount, then 'B' is a private mount too. - If 'A' is unbindable mount, then 'B' is a unbindable mount too. + If 'A' is unbindable mount, then 'B' is a unbindable mount too. 6) Quiz ------- - A. 
What is the result of the following command sequence? +A. What is the result of the following command sequence? - :: + :: - mount --bind /mnt /mnt - mount --make-shared /mnt - mount --bind /mnt /tmp - mount --move /tmp /mnt/1 + mount --bind /mnt /mnt + mount --make-shared /mnt + mount --bind /mnt /tmp + mount --move /tmp /mnt/1 - what should be the contents of /mnt /mnt/1 /mnt/1/1 should be? - Should they all be identical? or should /mnt and /mnt/1 be - identical only? + what should be the contents of /mnt /mnt/1 /mnt/1/1 should be? + Should they all be identical? or should /mnt and /mnt/1 be + identical only? - B. What is the result of the following command sequence? +B. What is the result of the following command sequence? - :: + :: - mount --make-rshared / - mkdir -p /v/1 - mount --rbind / /v/1 + mount --make-rshared / + mkdir -p /v/1 + mount --rbind / /v/1 - what should be the content of /v/1/v/1 be? + what should be the content of /v/1/v/1 be? - C. What is the result of the following command sequence? +C. What is the result of the following command sequence? - :: + :: - mount --bind /mnt /mnt - mount --make-shared /mnt - mkdir -p /mnt/1/2/3 /mnt/1/test - mount --bind /mnt/1 /tmp - mount --make-slave /mnt - mount --make-shared /mnt - mount --bind /mnt/1/2 /tmp1 - mount --make-slave /mnt + mount --bind /mnt /mnt + mount --make-shared /mnt + mkdir -p /mnt/1/2/3 /mnt/1/test + mount --bind /mnt/1 /tmp + mount --make-slave /mnt + mount --make-shared /mnt + mount --bind /mnt/1/2 /tmp1 + mount --make-slave /mnt - At this point we have the first mount at /tmp and - its root dentry is 1. Let's call this mount 'A' - And then we have a second mount at /tmp1 with root - dentry 2. Let's call this mount 'B' - Next we have a third mount at /mnt with root dentry - mnt. Let's call this mount 'C' + At this point we have the first mount at /tmp and + its root dentry is 1. Let's call this mount 'A' + And then we have a second mount at /tmp1 with root + dentry 2. 
Let's call this mount 'B' + Next we have a third mount at /mnt with root dentry + mnt. Let's call this mount 'C' - 'B' is the slave of 'A' and 'C' is a slave of 'B' - A -> B -> C + 'B' is the slave of 'A' and 'C' is a slave of 'B' + A -> B -> C - at this point if we execute the following command:: + at this point if we execute the following command:: - mount --bind /bin /tmp/test + mount --bind /bin /tmp/test - The mount is attempted on 'A' + The mount is attempted on 'A' - will the mount propagate to 'B' and 'C' ? + will the mount propagate to 'B' and 'C' ? - what would be the contents of - /mnt/1/test be? + what would be the contents of + /mnt/1/test be? 7) FAQ ------ - 1. Why is bind mount needed? How is it different from symbolic links? - symbolic links can get stale if the destination mount gets - unmounted or moved. Bind mounts continue to exist even if the - other mount is unmounted or moved. +1. Why is bind mount needed? How is it different from symbolic links? + + symbolic links can get stale if the destination mount gets + unmounted or moved. Bind mounts continue to exist even if the + other mount is unmounted or moved. - 2. Why can't the shared subtree be implemented using exportfs? +2. Why can't the shared subtree be implemented using exportfs? - exportfs is a heavyweight way of accomplishing part of what - shared subtree can do. I cannot imagine a way to implement the - semantics of slave mount using exportfs? + exportfs is a heavyweight way of accomplishing part of what + shared subtree can do. I cannot imagine a way to implement the + semantics of slave mount using exportfs? - 3. Why is unbindable mount needed? +3. Why is unbindable mount needed? - Let's say we want to replicate the mount tree at multiple - locations within the same subtree. + Let's say we want to replicate the mount tree at multiple + locations within the same subtree. 
- if one rbind mounts a tree within the same subtree 'n' times - the number of mounts created is an exponential function of 'n'. - Having unbindable mount can help prune the unneeded bind - mounts. Here is an example. + if one rbind mounts a tree within the same subtree 'n' times + the number of mounts created is an exponential function of 'n'. + Having unbindable mount can help prune the unneeded bind + mounts. Here is an example. - step 1: - let's say the root tree has just two directories with - one vfsmount:: + step 1: + let's say the root tree has just two directories with + one vfsmount:: - root - / \ - tmp usr + root + / \ + tmp usr - And we want to replicate the tree at multiple - mountpoints under /root/tmp + And we want to replicate the tree at multiple + mountpoints under /root/tmp - step 2: - :: + step 2: + :: - mount --make-shared /root + mount --make-shared /root - mkdir -p /tmp/m1 + mkdir -p /tmp/m1 - mount --rbind /root /tmp/m1 + mount --rbind /root /tmp/m1 - the new tree now looks like this:: + the new tree now looks like this:: - root - / \ - tmp usr - / - m1 - / \ - tmp usr - / - m1 + root + / \ + tmp usr + / + m1 + / \ + tmp usr + / + m1 - it has two vfsmounts + it has two vfsmounts - step 3: - :: + step 3: + :: - mkdir -p /tmp/m2 - mount --rbind /root /tmp/m2 + mkdir -p /tmp/m2 + mount --rbind /root /tmp/m2 - the new tree now looks like this:: + the new tree now looks like this:: - root - / \ - tmp usr - / \ - m1 m2 - / \ / \ - tmp usr tmp usr - / \ / - m1 m2 m1 - / \ / \ - tmp usr tmp usr - / / \ - m1 m1 m2 - / \ - tmp usr - / \ - m1 m2 + root + / \ + tmp usr + / \ + m1 m2 + / \ / \ + tmp usr tmp usr + / \ / + m1 m2 m1 + / \ / \ + tmp usr tmp usr + / / \ + m1 m1 m2 + / \ + tmp usr + / \ + m1 m2 - it has 6 vfsmounts + it has 6 vfsmounts - step 4: - :: + step 4: + :: - mkdir -p /tmp/m3 - mount --rbind /root /tmp/m3 + mkdir -p /tmp/m3 + mount --rbind /root /tmp/m3 - I won't draw the tree..but it has 24 vfsmounts + I won't draw the tree..but it 
has 24 vfsmounts - at step i the number of vfsmounts is V[i] = i*V[i-1]. - This is an exponential function. And this tree has way more - mounts than what we really needed in the first place. + at step i the number of vfsmounts is V[i] = i*V[i-1]. + This is an exponential function. And this tree has way more + mounts than what we really needed in the first place. - One could use a series of umount at each step to prune - out the unneeded mounts. But there is a better solution. - Unclonable mounts come in handy here. + One could use a series of umount at each step to prune + out the unneeded mounts. But there is a better solution. + Unclonable mounts come in handy here. - step 1: - let's say the root tree has just two directories with - one vfsmount:: + step 1: + let's say the root tree has just two directories with + one vfsmount:: - root - / \ - tmp usr + root + / \ + tmp usr - How do we set up the same tree at multiple locations under - /root/tmp + How do we set up the same tree at multiple locations under + /root/tmp - step 2: - :: + step 2: + :: - mount --bind /root/tmp /root/tmp + mount --bind /root/tmp /root/tmp - mount --make-rshared /root - mount --make-unbindable /root/tmp + mount --make-rshared /root + mount --make-unbindable /root/tmp - mkdir -p /tmp/m1 + mkdir -p /tmp/m1 - mount --rbind /root /tmp/m1 + mount --rbind /root /tmp/m1 - the new tree now looks like this:: + the new tree now looks like this:: - root - / \ - tmp usr - / - m1 - / \ - tmp usr + root + / \ + tmp usr + / + m1 + / \ + tmp usr - step 3: - :: + step 3: + :: - mkdir -p /tmp/m2 - mount --rbind /root /tmp/m2 + mkdir -p /tmp/m2 + mount --rbind /root /tmp/m2 - the new tree now looks like this:: + the new tree now looks like this:: - root - / \ - tmp usr - / \ - m1 m2 - / \ / \ - tmp usr tmp usr + root + / \ + tmp usr + / \ + m1 m2 + / \ / \ + tmp usr tmp usr - step 4: - :: + step 4: + :: - mkdir -p /tmp/m3 - mount --rbind /root /tmp/m3 + mkdir -p /tmp/m3 + mount --rbind /root /tmp/m3 - the 
new tree now looks like this:: + the new tree now looks like this:: - root - / \ - tmp usr - / \ \ - m1 m2 m3 - / \ / \ / \ - tmp usr tmp usr tmp usr + root + / \ + tmp usr + / \ \ + m1 m2 m3 + / \ / \ / \ + tmp usr tmp usr tmp usr 8) Implementation ----------------- A) Datastructure - Several new fields are introduced to struct vfsmount: + Several new fields are introduced to struct vfsmount: - ->mnt_share - Links together all the mount to/from which this vfsmount - send/receives propagation events. + ->mnt_share + Links together all the mount to/from which this vfsmount + send/receives propagation events. - ->mnt_slave_list - Links all the mounts to which this vfsmount propagates - to. + ->mnt_slave_list + Links all the mounts to which this vfsmount propagates + to. - ->mnt_slave - Links together all the slaves that its master vfsmount - propagates to. + ->mnt_slave + Links together all the slaves that its master vfsmount + propagates to. - ->mnt_master - Points to the master vfsmount from which this vfsmount - receives propagation. + ->mnt_master + Points to the master vfsmount from which this vfsmount + receives propagation. - ->mnt_flags - Takes two more flags to indicate the propagation status of - the vfsmount. MNT_SHARE indicates that the vfsmount is a shared - vfsmount. MNT_UNCLONABLE indicates that the vfsmount cannot be - replicated. + ->mnt_flags + Takes two more flags to indicate the propagation status of + the vfsmount. MNT_SHARE indicates that the vfsmount is a shared + vfsmount. MNT_UNCLONABLE indicates that the vfsmount cannot be + replicated. - All the shared vfsmounts in a peer group form a cyclic list through - ->mnt_share. + All the shared vfsmounts in a peer group form a cyclic list through + ->mnt_share. - All vfsmounts with the same ->mnt_master form on a cyclic list anchored - in ->mnt_master->mnt_slave_list and going through ->mnt_slave. 
+ All vfsmounts with the same ->mnt_master form on a cyclic list anchored + in ->mnt_master->mnt_slave_list and going through ->mnt_slave. - ->mnt_master can point to arbitrary (and possibly different) members - of master peer group. To find all immediate slaves of a peer group - you need to go through _all_ ->mnt_slave_list of its members. - Conceptually it's just a single set - distribution among the - individual lists does not affect propagation or the way propagation - tree is modified by operations. + ->mnt_master can point to arbitrary (and possibly different) members + of master peer group. To find all immediate slaves of a peer group + you need to go through _all_ ->mnt_slave_list of its members. + Conceptually it's just a single set - distribution among the + individual lists does not affect propagation or the way propagation + tree is modified by operations. - All vfsmounts in a peer group have the same ->mnt_master. If it is - non-NULL, they form a contiguous (ordered) segment of slave list. + All vfsmounts in a peer group have the same ->mnt_master. If it is + non-NULL, they form a contiguous (ordered) segment of slave list. - A example propagation tree looks as shown in the figure below. - [ NOTE: Though it looks like a forest, if we consider all the shared - mounts as a conceptual entity called 'pnode', it becomes a tree]:: + A example propagation tree looks as shown in the figure below. + [ NOTE: Though it looks like a forest, if we consider all the shared + mounts as a conceptual entity called 'pnode', it becomes a tree]:: - A <--> B <--> C <---> D - /|\ /| |\ - / F G J K H I - / - E<-->K - /|\ - M L N + A <--> B <--> C <---> D + /|\ /| |\ + / F G J K H I + / + E<-->K + /|\ + M L N - In the above figure A,B,C and D all are shared and propagate to each - other. 'A' has got 3 slave mounts 'E' 'F' and 'G' 'C' has got 2 slave - mounts 'J' and 'K' and 'D' has got two slave mounts 'H' and 'I'. 
- 'E' is also shared with 'K' and they propagate to each other. And - 'K' has 3 slaves 'M', 'L' and 'N' + In the above figure A,B,C and D all are shared and propagate to each + other. 'A' has got 3 slave mounts 'E' 'F' and 'G' 'C' has got 2 slave + mounts 'J' and 'K' and 'D' has got two slave mounts 'H' and 'I'. + 'E' is also shared with 'K' and they propagate to each other. And + 'K' has 3 slaves 'M', 'L' and 'N' - A's ->mnt_share links with the ->mnt_share of 'B' 'C' and 'D' + A's ->mnt_share links with the ->mnt_share of 'B' 'C' and 'D' - A's ->mnt_slave_list links with ->mnt_slave of 'E', 'K', 'F' and 'G' + A's ->mnt_slave_list links with ->mnt_slave of 'E', 'K', 'F' and 'G' - E's ->mnt_share links with ->mnt_share of K + E's ->mnt_share links with ->mnt_share of K - 'E', 'K', 'F', 'G' have their ->mnt_master point to struct vfsmount of 'A' + 'E', 'K', 'F', 'G' have their ->mnt_master point to struct vfsmount of 'A' - 'M', 'L', 'N' have their ->mnt_master point to struct vfsmount of 'K' + 'M', 'L', 'N' have their ->mnt_master point to struct vfsmount of 'K' - K's ->mnt_slave_list links with ->mnt_slave of 'M', 'L' and 'N' + K's ->mnt_slave_list links with ->mnt_slave of 'M', 'L' and 'N' - C's ->mnt_slave_list links with ->mnt_slave of 'J' and 'K' + C's ->mnt_slave_list links with ->mnt_slave of 'J' and 'K' - J and K's ->mnt_master points to struct vfsmount of C + J and K's ->mnt_master points to struct vfsmount of C - and finally D's ->mnt_slave_list links with ->mnt_slave of 'H' and 'I' + and finally D's ->mnt_slave_list links with ->mnt_slave of 'H' and 'I' - 'H' and 'I' have their ->mnt_master pointing to struct vfsmount of 'D'. + 'H' and 'I' have their ->mnt_master pointing to struct vfsmount of 'D'. - NOTE: The propagation tree is orthogonal to the mount tree. + NOTE: The propagation tree is orthogonal to the mount tree. 
B) Locking: - ->mnt_share, ->mnt_slave, ->mnt_slave_list, ->mnt_master are protected - by namespace_sem (exclusive for modifications, shared for reading). + ->mnt_share, ->mnt_slave, ->mnt_slave_list, ->mnt_master are protected + by namespace_sem (exclusive for modifications, shared for reading). - Normally we have ->mnt_flags modifications serialized by vfsmount_lock. - There are two exceptions: do_add_mount() and clone_mnt(). - The former modifies a vfsmount that has not been visible in any shared - data structures yet. - The latter holds namespace_sem and the only references to vfsmount - are in lists that can't be traversed without namespace_sem. + Normally we have ->mnt_flags modifications serialized by vfsmount_lock. + There are two exceptions: do_add_mount() and clone_mnt(). + The former modifies a vfsmount that has not been visible in any shared + data structures yet. + The latter holds namespace_sem and the only references to vfsmount + are in lists that can't be traversed without namespace_sem. C) Algorithm: - The crux of the implementation resides in rbind/move operation. + The crux of the implementation resides in rbind/move operation. + + The overall algorithm breaks the operation into 3 phases: (look at + attach_recursive_mnt() and propagate_mnt()) - The overall algorithm breaks the operation into 3 phases: (look at - attach_recursive_mnt() and propagate_mnt()) + 1. Prepare phase. - 1. Prepare phase. + For each mount in the source tree: - For each mount in the source tree: + a) Create the necessary number of mount trees to + be attached to each of the mounts that receive + propagation from the destination mount. + b) Do not attach any of the trees to its destination. + However note down its ->mnt_parent and ->mnt_mountpoint + c) Link all the new mounts to form a propagation tree that + is identical to the propagation tree of the destination + mount. 
- a) Create the necessary number of mount trees to - be attached to each of the mounts that receive - propagation from the destination mount. - b) Do not attach any of the trees to its destination. - However note down its ->mnt_parent and ->mnt_mountpoint - c) Link all the new mounts to form a propagation tree that - is identical to the propagation tree of the destination - mount. + If this phase is successful, there should be 'n' new + propagation trees; where 'n' is the number of mounts in the + source tree. Go to the commit phase - If this phase is successful, there should be 'n' new - propagation trees; where 'n' is the number of mounts in the - source tree. Go to the commit phase + Also there should be 'm' new mount trees, where 'm' is + the number of mounts to which the destination mount + propagates to. - Also there should be 'm' new mount trees, where 'm' is - the number of mounts to which the destination mount - propagates to. + If any memory allocations fail, go to the abort phase. - If any memory allocations fail, go to the abort phase. + 2. Commit phase. - 2. Commit phase. + Attach each of the mount trees to their corresponding + destination mounts. - Attach each of the mount trees to their corresponding - destination mounts. + 3. Abort phase. - 3. Abort phase. - Delete all the newly created trees. + Delete all the newly created trees. - .. Note:: - all the propagation related functionality resides in the file pnode.c + .. Note:: + all the propagation related functionality resides in the file pnode.c ------------------------------------------------------------------------ -- cgit v1.2.3 From ec1a37468f15b5fa69ecd01f49a0d818ed559943 Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Tue, 19 Aug 2025 13:12:53 +0700 Subject: Documentation: sharedsubtree: Convert notes to note directive While a few of the notes are already in reST syntax, others are left intact (inconsistent). Convert them to reST syntax too. 
Signed-off-by: Bagas Sanjaya Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20250819061254.31220-6-bagasdotme@gmail.com --- Documentation/filesystems/sharedsubtree.rst | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/sharedsubtree.rst b/Documentation/filesystems/sharedsubtree.rst index b09650e28534..8b7dc9159083 100644 --- a/Documentation/filesystems/sharedsubtree.rst +++ b/Documentation/filesystems/sharedsubtree.rst @@ -43,9 +43,10 @@ a) A **shared mount** can be replicated to as many mountpoints and all the # mount --make-shared /mnt - Note: mount(8) command now supports the --make-shared flag, - so the sample 'smount' program is no longer needed and has been - removed. + .. note:: + mount(8) command now supports the --make-shared flag, + so the sample 'smount' program is no longer needed and has been + removed. :: @@ -242,8 +243,9 @@ D) Versioned files The section below explains the detailed semantics of bind, rbind, move, mount, umount and clone-namespace operations. -Note: the word 'vfsmount' and the noun 'mount' have been used -to mean the same thing, throughout this document. +.. Note:: + the word 'vfsmount' and the noun 'mount' have been used + to mean the same thing, throughout this document. a) Mount states @@ -885,8 +887,12 @@ A) Datastructure non-NULL, they form a contiguous (ordered) segment of slave list. A example propagation tree looks as shown in the figure below. - [ NOTE: Though it looks like a forest, if we consider all the shared - mounts as a conceptual entity called 'pnode', it becomes a tree]:: + + .. note:: + Though it looks like a forest, if we consider all the shared + mounts as a conceptual entity called 'pnode', it becomes a tree. 
+ + :: A <--> B <--> C <---> D -- cgit v1.2.3 From a4c2ff6e507ebb47761865289772494e717d035a Mon Sep 17 00:00:00 2001 From: Ranganath V N Date: Wed, 3 Sep 2025 01:08:22 +0530 Subject: Documentation: Fix spelling mistakes Corrected a few spelling mistakes to improve the readability. Signed-off-by: Ranganath V N Acked-by: Rob Herring (Arm) Reviewed-by: Randy Dunlap Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20250902193822.6349-1-vnranganath.20@gmail.com --- Documentation/devicetree/bindings/submitting-patches.rst | 2 +- Documentation/filesystems/iomap/operations.rst | 2 +- Documentation/virt/kvm/review-checklist.rst | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/devicetree/bindings/submitting-patches.rst b/Documentation/devicetree/bindings/submitting-patches.rst index 46d0b036c97e..191085b0d5e8 100644 --- a/Documentation/devicetree/bindings/submitting-patches.rst +++ b/Documentation/devicetree/bindings/submitting-patches.rst @@ -66,7 +66,7 @@ I. For patch submitters any DTS patches, regardless whether using existing or new bindings, should be placed at the end of patchset to indicate no dependency of drivers on the DTS. DTS will be anyway applied through separate tree or branch, so - different order would indicate the serie is non-bisectable. + different order would indicate the series is non-bisectable. If a driver subsystem maintainer prefers to apply entire set, instead of their relevant portion of patchset, please split the DTS patches into diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst index 067ed8e14ef3..387fd9cc72ca 100644 --- a/Documentation/filesystems/iomap/operations.rst +++ b/Documentation/filesystems/iomap/operations.rst @@ -321,7 +321,7 @@ The fields are as follows: - ``writeback_submit``: Submit the previous built writeback context. 
Block based file systems should use the iomap_ioend_writeback_submit helper, other file system can implement their own. - File systems can optionall to hook into writeback bio submission. + File systems can optionally hook into writeback bio submission. This might include pre-write space accounting updates, or installing a custom ``->bi_end_io`` function for internal purposes, such as deferring the ioend completion to a workqueue to run metadata update diff --git a/Documentation/virt/kvm/review-checklist.rst b/Documentation/virt/kvm/review-checklist.rst index debac54e14e7..053f00c50d66 100644 --- a/Documentation/virt/kvm/review-checklist.rst +++ b/Documentation/virt/kvm/review-checklist.rst @@ -98,7 +98,7 @@ New APIs It is important to demonstrate your use case. This can be as simple as explaining that the feature is already in use on bare metal, or it can be a proof-of-concept implementation in userspace. The latter need not be - open source, though that is of course preferrable for easier testing. + open source, though that is of course preferable for easier testing. Selftests should test corner cases of the APIs, and should also cover basic host and guest operation if no open source VMM uses the feature. -- cgit v1.2.3 From 7e5a0fe4e8ae2eb341f8ebbee2b24231a58fc28b Mon Sep 17 00:00:00 2001 From: Baruch Siach Date: Wed, 27 Aug 2025 14:37:27 +0300 Subject: doc: filesystems: proc: remove stale information from intro Most of the information in the first paragraph of the Introduction/Credits section is outdated. Documentation update suggestions should go to documentation maintainers listed in MAINTAINERS. Remove misleading contact information. 
Signed-off-by: Baruch Siach Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/cb4987a16ed96ee86841aec921d914bd44249d0b.1756294647.git.baruch@tkos.co.il --- Documentation/filesystems/proc.rst | 21 --------------------- 1 file changed, 21 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst index 2971551b7235..ede654478dff 100644 --- a/Documentation/filesystems/proc.rst +++ b/Documentation/filesystems/proc.rst @@ -61,19 +61,6 @@ Preface 0.1 Introduction/Credits ------------------------ -This documentation is part of a soon (or so we hope) to be released book on -the SuSE Linux distribution. As there is no complete documentation for the -/proc file system and we've used many freely available sources to write these -chapters, it seems only fair to give the work back to the Linux community. -This work is based on the 2.2.* kernel version and the upcoming 2.4.*. I'm -afraid it's still far from complete, but we hope it will be useful. As far as -we know, it is the first 'all-in-one' document about the /proc file system. It -is focused on the Intel x86 hardware, so if you are looking for PPC, ARM, -SPARC, AXP, etc., features, you probably won't find what you are looking for. -It also only covers IPv4 networking, not IPv6 nor other protocols - sorry. But -additions and patches are welcome and will be added to this document if you -mail them to Bodo. - We'd like to thank Alan Cox, Rik van Riel, and Alexey Kuznetsov and a lot of other people for help compiling this documentation. We'd also like to extend a special thank you to Andi Kleen for documentation, which we relied on heavily @@ -81,17 +68,9 @@ to create this document, as well as the additional information he provided. Thanks to everybody else who contributed source or docs to the Linux kernel and helped create a great piece of software... 
:) -If you have any comments, corrections or additions, please don't hesitate to -contact Bodo Bauer at bb@ricochet.net. We'll be happy to add them to this -document. - The latest version of this document is available online at https://www.kernel.org/doc/html/latest/filesystems/proc.html -If the above direction does not works for you, you could try the kernel -mailing list at linux-kernel@vger.kernel.org and/or try to reach me at -comandante@zaralinux.com. - 0.2 Legal Stuff --------------- -- cgit v1.2.3 From a1d4416f8682d3c6d0545ad8a887d2a77f170808 Mon Sep 17 00:00:00 2001 From: Alex Tran Date: Mon, 1 Sep 2025 19:30:39 -0700 Subject: docs: filesystems: sysfs: remove top level sysfs net directory The net/ directory is not present as a top level sysfs directory in standard Linux systems. Network interfaces can be accessible via /sys/class/net instead. Signed-off-by: Alex Tran Signed-off-by: Jonathan Corbet Message-ID: <20250902023039.1351270-3-alex.t.tran@gmail.com> --- Documentation/filesystems/sysfs.rst | 1 - 1 file changed, 1 deletion(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/sysfs.rst b/Documentation/filesystems/sysfs.rst index 624e4f51212e..db153cda0786 100644 --- a/Documentation/filesystems/sysfs.rst +++ b/Documentation/filesystems/sysfs.rst @@ -299,7 +299,6 @@ The top level sysfs directory looks like:: hypervisor/ kernel/ module/ - net/ power/ devices/ contains a filesystem representation of the device tree. It maps -- cgit v1.2.3 From 63e6e9dde28a8e8967308a9dfb807edea3810aab Mon Sep 17 00:00:00 2001 From: Alex Tran Date: Mon, 1 Sep 2025 19:30:38 -0700 Subject: docs: filesystems: sysfs: clarify symlink destinations in dev and bus/devices descriptions Change sysfs bus/devices and dev directory descriptions to provide more verbose information about the specific symlink destination the devices point to. 
Signed-off-by: Alex Tran Signed-off-by: Jonathan Corbet Message-ID: <20250902023039.1351270-2-alex.t.tran@gmail.com> --- Documentation/filesystems/sysfs.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/sysfs.rst b/Documentation/filesystems/sysfs.rst index db153cda0786..f1a776a7ac66 100644 --- a/Documentation/filesystems/sysfs.rst +++ b/Documentation/filesystems/sysfs.rst @@ -312,7 +312,7 @@ kernel. Each bus's directory contains two subdirectories:: drivers/ devices/ contains symlinks for each device discovered in the system -that point to the device's directory under root/. +that point to the device's directory under /sys/devices. drivers/ contains a directory for each device driver that is loaded for devices on that particular bus (this assumes that drivers do not @@ -327,7 +327,7 @@ loaded system modules, for both builtin and loadable modules. dev/ contains two directories: char/ and block/. Inside these two directories there are symlinks named :. These symlinks -point to the sysfs directory for the given device. /sys/dev provides a +point to the directories under /sys/devices for each device. /sys/dev provides a quick way to lookup the sysfs interface for a device from the result of a stat(2) operation. -- cgit v1.2.3 From a3d13ec44aea30794c4764b3516b4b0396ce4814 Mon Sep 17 00:00:00 2001 From: Alex Tran Date: Mon, 1 Sep 2025 19:30:37 -0700 Subject: docs: filesystems: sysfs: add remaining top level sysfs directory descriptions Finish top level sysfs directory descriptions for block, class, firmware, hypervisor, kernel, and power. Did not write one for net directory. 
See commit bc3a88431672 ("docs: filesystems: sysfs: remove top level sysfs net directory") Signed-off-by: Alex Tran Signed-off-by: Jonathan Corbet Message-ID: <20250902023039.1351270-1-alex.t.tran@gmail.com> --- Documentation/filesystems/sysfs.rst | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/sysfs.rst b/Documentation/filesystems/sysfs.rst index f1a776a7ac66..354c5fb310b4 100644 --- a/Documentation/filesystems/sysfs.rst +++ b/Documentation/filesystems/sysfs.rst @@ -334,8 +334,22 @@ a stat(2) operation. More information on driver-model specific features can be found in Documentation/driver-api/driver-model/. +block/ contains symlinks to all the block devices discovered on the system. +These symlinks point to directories under /sys/devices. -TODO: Finish this section. +class/ contains a directory for each device class, grouped by functional type. +Each directory in class/ contains symlinks to devices in the /sys/devices directory. + +firmware/ contains system firmware data and configuration such as firmware tables, +ACPI information, and device tree data. + +hypervisor/ contains virtualization platform information and provides an interface to +the underlying hypervisor. It is only present when running on a virtual machine. + +kernel/ contains runtime kernel parameters, configuration settings, and status. + +power/ contains power management subsystem information including +sleep states, suspend/resume capabilities, and policies. Current Interfaces -- cgit v1.2.3
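Note on the sysfs hunks above: the context lines say that /sys/dev gives a quick way to look up the sysfs interface for a device from the result of a stat(2) operation. A minimal sketch of that lookup follows, assuming the symlinks are named `<major>:<minor>` under /sys/dev/char and /sys/dev/block. The helper names `sysfs_dev_link` and `sysfs_dir_for` are hypothetical and come from neither the patch nor the kernel tree.

```python
import os
import stat


def sysfs_dev_link(st_mode: int, st_rdev: int) -> str:
    """Build the /sys/dev symlink path for a device node.

    st_mode and st_rdev are the fields of the same name returned by stat(2).
    Hypothetical helper, for illustration only.
    """
    if stat.S_ISBLK(st_mode):
        kind = "block"
    elif stat.S_ISCHR(st_mode):
        kind = "char"
    else:
        raise ValueError("not a block or character device node")
    # Assumption: symlinks are named "<major>:<minor>".
    return f"/sys/dev/{kind}/{os.major(st_rdev)}:{os.minor(st_rdev)}"


def sysfs_dir_for(node: str) -> str:
    """Resolve a device node such as /dev/null to its sysfs directory."""
    st = os.stat(node)
    # Following the symlink should land in a directory under /sys/devices.
    return os.path.realpath(sysfs_dev_link(st.st_mode, st.st_rdev))
```

On a system with sysfs mounted at /sys, `sysfs_dir_for("/dev/null")` should resolve to a directory under /sys/devices. The exact path depends on the machine, so none is shown here.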