Changes in 4.19.325
netlink: terminate outstanding dump on socket close
ocfs2: uncache inode which has failed entering the group
nilfs2: fix null-ptr-deref in block_touch_buffer tracepoint
ocfs2: fix UBSAN warning in ocfs2_verify_volume()
nilfs2: fix null-ptr-deref in block_dirty_buffer tracepoint
Revert "mmc: dw_mmc: Fix IDMAC operation with pages bigger than 4K"
media: dvbdev: fix the logic when DVB_DYNAMIC_MINORS is not set
kbuild: Use uname for LINUX_COMPILE_HOST detection
mm: revert "mm: shmem: fix data-race in shmem_getattr()"
ASoC: Intel: bytcr_rt5640: Add DMI quirk for Vexia Edu Atla 10 tablet
mac80211: fix user-power when emulating chanctx
selftests/watchdog-test: Fix system accidentally reset after watchdog-test
x86/amd_nb: Fix compile-testing without CONFIG_AMD_NB
net: usb: qmi_wwan: add Quectel RG650V
proc/softirqs: replace seq_printf with seq_put_decimal_ull_width
nvme: fix metadata handling in nvme-passthrough
initramfs: avoid filename buffer overrun
m68k: mvme147: Fix SCSI controller IRQ numbers
m68k: mvme16x: Add and use "mvme16x.h"
m68k: mvme147: Reinstate early console
acpi/arm64: Adjust error handling procedure in gtdt_parse_timer_block()
s390/syscalls: Avoid creation of arch/arch/ directory
hfsplus: don't query the device logical block size multiple times
EDAC/fsl_ddr: Fix bad bit shift operations
crypto: pcrypt - Call crypto layer directly when padata_do_parallel() return -EBUSY
crypto: cavium - Fix the if condition to exit loop after timeout
crypto: bcm - add error check in the ahash_hmac_init function
crypto: cavium - Fix an error handling path in cpt_ucode_load_fw()
time: Fix references to _msecs_to_jiffies() handling of values
soc: qcom: geni-se: fix array underflow in geni_se_clk_tbl_get()
mmc: mmc_spi: drop buggy snprintf()
ARM: dts: cubieboard4: Fix DCDC5 regulator constraints
regmap: irq: Set lockdep class for hierarchical IRQ domains
firmware: arm_scpi: Check the DVFS OPP count returned by the firmware
drm/mm: Mark drm_mm_interval_tree*() functions with __maybe_unused
wifi: ath9k: add range check for conn_rsp_epid in htc_connect_service()
drm/omap: Fix locking in omap_gem_new_dmabuf()
bpf: Fix the xdp_adjust_tail sample prog issue
wifi: mwifiex: Fix memcpy() field-spanning write warning in mwifiex_config_scan()
drm/etnaviv: consolidate hardware fence handling in etnaviv_gpu
drm/etnaviv: dump: fix sparse warnings
drm/etnaviv: fix power register offset on GC300
drm/etnaviv: hold GPU lock across perfmon sampling
net: rfkill: gpio: Add check for clk_enable()
ALSA: us122l: Use snd_card_free_when_closed() at disconnection
ALSA: caiaq: Use snd_card_free_when_closed() at disconnection
ALSA: 6fire: Release resources at card release
netpoll: Use rcu_access_pointer() in netpoll_poll_lock
trace/trace_event_perf: remove duplicate samples on the first tracepoint event
powerpc/vdso: Flag VDSO64 entry points as functions
mfd: da9052-spi: Change read-mask to write-mask
cpufreq: loongson2: Unregister platform_driver on failure
mtd: rawnand: atmel: Fix possible memory leak
RDMA/bnxt_re: Check cqe flags to know imm_data vs inv_irkey
mfd: rt5033: Fix missing regmap_del_irq_chip()
scsi: bfa: Fix use-after-free in bfad_im_module_exit()
scsi: fusion: Remove unused variable 'rc'
scsi: qedi: Fix a possible memory leak in qedi_alloc_and_init_sb()
ocfs2: fix uninitialized value in ocfs2_file_read_iter()
powerpc/sstep: make emulate_vsx_load and emulate_vsx_store static
fbdev/sh7760fb: Alloc DMA memory from hardware device
fbdev: sh7760fb: Fix a possible memory leak in sh7760fb_alloc_mem()
dt-bindings: clock: adi,axi-clkgen: convert old binding to yaml format
dt-bindings: clock: axi-clkgen: include AXI clk
clk: axi-clkgen: use devm_platform_ioremap_resource() short-hand
clk: clk-axi-clkgen: make sure to enable the AXI bus clock
perf probe: Correct demangled symbols in C++ program
PCI: cpqphp: Use PCI_POSSIBLE_ERROR() to check config reads
PCI: cpqphp: Fix PCIBIOS_* return value confusion
m68k: mcfgpio: Fix incorrect register offset for CONFIG_M5441x
m68k: coldfire/device.c: only build FEC when HW macros are defined
rpmsg: glink: Add TX_DATA_CONT command while sending
rpmsg: glink: Send READ_NOTIFY command in FIFO full case
rpmsg: glink: Fix GLINK command prefix
rpmsg: glink: use only lower 16-bits of param2 for CMD_OPEN name length
NFSD: Prevent NULL dereference in nfsd4_process_cb_update()
NFSD: Cap the number of bytes copied by nfs4_reset_recoverydir()
vfio/pci: Properly hide first-in-list PCIe extended capability
power: supply: core: Remove might_sleep() from power_supply_put()
net: usb: lan78xx: Fix memory leak on device unplug by freeing PHY device
tg3: Set coherent DMA mask bits to 31 for BCM57766 chipsets
net: usb: lan78xx: Fix refcounting and autosuspend on invalid WoL configuration
marvell: pxa168_eth: fix call balance of pep->clk handling routines
net: stmmac: dwmac-socfpga: Set RX watchdog interrupt as broken
usb: using mutex lock and supporting O_NONBLOCK flag in iowarrior_read()
USB: chaoskey: fail open after removal
USB: chaoskey: Fix possible deadlock chaoskey_list_lock
misc: apds990x: Fix missing pm_runtime_disable()
apparmor: fix 'Do simple duplicate message elimination'
usb: ehci-spear: fix call balance of sehci clk handling routines
ext4: supress data-race warnings in ext4_free_inodes_{count,set}()
ext4: fix FS_IOC_GETFSMAP handling
jfs: xattr: check invalid xattr size more strictly
ASoC: codecs: Fix atomicity violation in snd_soc_component_get_drvdata()
PCI: Fix use-after-free of slot->bus on hot remove
tty: ldsic: fix tty_ldisc_autoload sysctl's proc_handler
Bluetooth: Fix type of len in rfcomm_sock_getsockopt{,_old}()
ALSA: usb-audio: Fix potential out-of-bound accesses for Extigy and Mbox devices
Revert "usb: gadget: composite: fix OS descriptors w_value logic"
serial: sh-sci: Clean sci_ports[0] after at earlycon exit
Revert "serial: sh-sci: Clean sci_ports[0] after at earlycon exit"
netfilter: ipset: add missing range check in bitmap_ip_uadt
spi: Fix acpi deferred irq probe
ubi: wl: Put source PEB into correct list if trying locking LEB failed
um: ubd: Do not use drvdata in release
um: net: Do not use drvdata in release
serial: 8250: omap: Move pm_runtime_get_sync
um: vector: Do not use drvdata in release
sh: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK
arm64: tls: Fix context-switching of tpidrro_el0 when kpti is enabled
block: fix ordering between checking BLK_MQ_S_STOPPED request adding
HID: wacom: Interpret tilt data from Intuos Pro BT as signed values
media: wl128x: Fix atomicity violation in fmc_send_cmd()
usb: dwc3: gadget: Fix checking for number of TRBs left
lib: string_helpers: silence snprintf() output truncation warning
NFSD: Prevent a potential integer overflow
rpmsg: glink: Propagate TX failures in intentless mode as well
um: Fix the return value of elf_core_copy_task_fpregs
NFSv4.0: Fix a use-after-free problem in the asynchronous open()
rtc: check if __rtc_read_time was successful in rtc_timer_do_work()
ubifs: Correct the total block count by deducting journal reservation
ubi: fastmap: Fix duplicate slab cache names while attaching
jffs2: fix use of uninitialized variable
block: return unsigned int from bdev_io_min
9p/xen: fix init sequence
9p/xen: fix release of IRQ
modpost: remove incorrect code in do_eisa_entry()
sh: intc: Fix use-after-free bug in register_intc_controller()
Linux 4.19.325
Change-Id: I50250c8bd11f9ff4b40da75225c1cfb060e0c258
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
JFFS2 LOCKING DOCUMENTATION
---------------------------
This document attempts to describe the existing locking rules for
JFFS2. It is not expected to remain perfectly up to date, but ought to
be fairly close.
alloc_sem
---------
The alloc_sem is a per-filesystem mutex, used primarily to ensure
contiguous allocation of space on the medium. It is automatically
obtained during space allocations (jffs2_reserve_space()) and freed
upon write completion (jffs2_complete_reservation()). Note that
the garbage collector will obtain this right at the beginning of
jffs2_garbage_collect_pass() and release it at the end, thereby
preventing any other write activity on the file system during a
garbage collect pass.
When writing new nodes, the alloc_sem must be held until the new nodes
have been properly linked into the data structures for the inode to
which they belong. This is for the benefit of NAND flash - adding new
nodes to an inode may obsolete old ones, and by holding the alloc_sem
until this happens we ensure that any data in the write-buffer at the
time this happens are part of the new node, not just something that
was written afterwards. Hence, we can ensure the newly-obsoleted nodes
don't actually get erased until the write-buffer has been flushed to
the medium.
With the introduction of NAND flash support and the write-buffer,
the alloc_sem is also used to protect the wbuf-related members of the
jffs2_sb_info structure. Atomically reading the wbuf_len member to see
if the wbuf is currently holding any data is permitted, though.
Ordering constraints: See f->sem.
File Mutex f->sem
---------------------
This is the JFFS2-internal equivalent of the inode mutex i->i_sem.
It protects the contents of the jffs2_inode_info private inode data,
including the linked list of node fragments (but see the notes below on
erase_completion_lock), etc.
The reason that the i_sem itself isn't used for this purpose is to
avoid deadlocks with garbage collection -- the VFS will lock the i_sem
before calling a function which may need to allocate space. The
allocation may trigger garbage-collection, which may need to move a
node belonging to the inode which was locked in the first place by the
VFS. If the garbage collection code were to attempt to lock the i_sem
of the inode from which it's garbage-collecting a physical node, this
lead to deadlock, unless we played games with unlocking the i_sem
before calling the space allocation functions.
Instead of playing such games, we just have an extra internal
mutex, which is obtained by the garbage collection code and also
by the normal file system code _after_ allocation of space.
Ordering constraints:
1. Never attempt to allocate space or lock alloc_sem with
any f->sem held.
2. Never attempt to lock two file mutexes in one thread.
No ordering rules have been made for doing so.
3. Never lock a page cache page with f->sem held.
erase_completion_lock spinlock
------------------------------
This is used to serialise access to the eraseblock lists, to the
per-eraseblock lists of physical jffs2_raw_node_ref structures, and
(NB) the per-inode list of physical nodes. The latter is a special
case - see below.
As the MTD API no longer permits erase-completion callback functions
to be called from bottom-half (timer) context (on the basis that nobody
ever actually implemented such a thing), it's now sufficient to use
a simple spin_lock() rather than spin_lock_bh().
Note that the per-inode list of physical nodes (f->nodes) is a special
case. Any changes to _valid_ nodes (i.e. ->flash_offset & 1 == 0) in
the list are protected by the file mutex f->sem. But the erase code
may remove _obsolete_ nodes from the list while holding only the
erase_completion_lock. So you can walk the list only while holding the
erase_completion_lock, and can drop the lock temporarily mid-walk as
long as the pointer you're holding is to a _valid_ node, not an
obsolete one.
The erase_completion_lock is also used to protect the c->gc_task
pointer when the garbage collection thread exits. The code to kill the
GC thread locks it, sends the signal, then unlocks it - while the GC
thread itself locks it, zeroes c->gc_task, then unlocks on the exit path.
inocache_lock spinlock
----------------------
This spinlock protects the hashed list (c->inocache_list) of the
in-core jffs2_inode_cache objects (each inode in JFFS2 has the
correspondent jffs2_inode_cache object). So, the inocache_lock
has to be locked while walking the c->inocache_list hash buckets.
This spinlock also covers allocation of new inode numbers, which is
currently just '++->highest_ino++', but might one day get more complicated
if we need to deal with wrapping after 4 milliard inode numbers are used.
Note, the f->sem guarantees that the correspondent jffs2_inode_cache
will not be removed. So, it is allowed to access it without locking
the inocache_lock spinlock.
Ordering constraints:
If both erase_completion_lock and inocache_lock are needed, the
c->erase_completion has to be acquired first.
erase_free_sem
--------------
This mutex is only used by the erase code which frees obsolete node
references and the jffs2_garbage_collect_deletion_dirent() function.
The latter function on NAND flash must read _obsolete_ nodes to
determine whether the 'deletion dirent' under consideration can be
discarded or whether it is still required to show that an inode has
been unlinked. Because reading from the flash may sleep, the
erase_completion_lock cannot be held, so an alternative, more
heavyweight lock was required to prevent the erase code from freeing
the jffs2_raw_node_ref structures in question while the garbage
collection code is looking at them.
Suggestions for alternative solutions to this problem would be welcomed.
wbuf_sem
--------
This read/write semaphore protects against concurrent access to the
write-behind buffer ('wbuf') used for flash chips where we must write
in blocks. It protects both the contents of the wbuf and the metadata
which indicates which flash region (if any) is currently covered by
the buffer.
Ordering constraints:
Lock wbuf_sem last, after the alloc_sem or and f->sem.
c->xattr_sem
------------
This read/write semaphore protects against concurrent access to the
xattr related objects which include stuff in superblock and ic->xref.
In read-only path, write-semaphore is too much exclusion. It's enough
by read-semaphore. But you must hold write-semaphore when updating,
creating or deleting any xattr related object.
Once xattr_sem released, there would be no assurance for the existence
of those objects. Thus, a series of processes is often required to retry,
when updating such a object is necessary under holding read semaphore.
For example, do_jffs2_getxattr() holds read-semaphore to scan xref and
xdatum at first. But it retries this process with holding write-semaphore
after release read-semaphore, if it's necessary to load name/value pair
from medium.
Ordering constraints:
Lock xattr_sem last, after the alloc_sem.