lineage-22.2
21 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
a1757f43e8 |
Merge 4.19.247 into android-4.19-stable
Changes in 4.19.247
binfmt_flat: do not stop relocating GOT entries prematurely on riscv
ALSA: hda/realtek - Fix microphone noise on ASUS TUF B550M-PLUS
USB: serial: option: add Quectel BG95 modem
USB: new quirk for Dell Gen 2 devices
ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
btrfs: add "0x" prefix for unsupported optional features
btrfs: repair super block num_devices automatically
drm/virtio: fix NULL pointer dereference in virtio_gpu_conn_get_modes
mwifiex: add mutex lock for call in mwifiex_dfs_chan_sw_work_queue
b43legacy: Fix assigning negative value to unsigned variable
b43: Fix assigning negative value to unsigned variable
ipw2x00: Fix potential NULL dereference in libipw_xmit()
ipv6: fix locking issues with loops over idev->addr_list
fbcon: Consistently protect deferred_takeover with console_lock()
ACPICA: Avoid cache flush inside virtual machines
ALSA: jack: Access input_dev under mutex
drm/amd/pm: fix double free in si_parse_power_table()
ath9k: fix QCA9561 PA bias level
media: venus: hfi: avoid null dereference in deinit
media: pci: cx23885: Fix the error handling in cx23885_initdev()
media: cx25821: Fix the warning when removing the module
md/bitmap: don't set sb values if can't pass sanity check
scsi: megaraid: Fix error check return value of register_chrdev()
drm/plane: Move range check for format_count earlier
drm/amd/pm: fix the compile warning
ipv6: Don't send rs packets to the interface of ARPHRD_TUNNEL
ASoC: dapm: Don't fold register value changes into notifications
mlxsw: spectrum_dcb: Do not warn about priority changes
ASoC: tscs454: Add endianness flag in snd_soc_component_driver
s390/preempt: disable __preempt_count_add() optimization for PROFILE_ALL_BRANCHES
dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATIOMIC
ipmi:ssif: Check for NULL msg when handling events and messages
rtlwifi: Use pr_warn instead of WARN_ONCE
media: cec-adap.c: fix is_configuring state
openrisc: start CPU timer early in boot
nvme-pci: fix a NULL pointer dereference in nvme_alloc_admin_tags
ASoC: rt5645: Fix errorenous cleanup order
net: phy: micrel: Allow probing without .driver_data
media: exynos4-is: Fix compile warning
hwmon: Make chip parameter for with_info API mandatory
rxrpc: Return an error to sendmsg if call failed
eth: tg3: silence the GCC 12 array-bounds warning
ARM: dts: ox820: align interrupt controller node name with dtschema
PM / devfreq: rk3399_dmc: Disable edev on remove()
fs: jfs: fix possible NULL pointer dereference in dbFree()
ARM: OMAP1: clock: Fix UART rate reporting algorithm
fat: add ratelimit to fat*_ent_bread()
ARM: versatile: Add missing of_node_put in dcscb_init
ARM: dts: exynos: add atmel,24c128 fallback to Samsung EEPROM
ARM: hisi: Add missing of_node_put after of_find_compatible_node
PCI: Avoid pci_dev_lock() AB/BA deadlock with sriov_numvfs_store()
tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate
powerpc/xics: fix refcount leak in icp_opal_init()
macintosh/via-pmu: Fix build failure when CONFIG_INPUT is disabled
RDMA/hfi1: Prevent panic when SDMA is disabled
drm: fix EDID struct for old ARM OABI format
ath9k: fix ar9003_get_eepmisc
drm/edid: fix invalid EDID extension block filtering
drm/bridge: adv7511: clean up CEC adapter when probe fails
ASoC: mediatek: Fix error handling in mt8173_max98090_dev_probe
ASoC: mediatek: Fix missing of_node_put in mt2701_wm8960_machine_probe
x86/delay: Fix the wrong asm constraint in delay_loop()
drm/mediatek: Fix mtk_cec_mask()
drm/vc4: txp: Don't set TXP_VSTART_AT_EOF
drm/vc4: txp: Force alpha to be 0xff if it's disabled
nl80211: show SSID for P2P_GO interfaces
spi: spi-ti-qspi: Fix return value handling of wait_for_completion_timeout
NFC: NULL out the dev->rfkill to prevent UAF
efi: Add missing prototype for efi_capsule_setup_info
HID: hid-led: fix maximum brightness for Dream Cheeky
HID: elan: Fix potential double free in elan_input_configured
spi: img-spfi: Fix pm_runtime_get_sync() error checking
ath9k_htc: fix potential out of bounds access with invalid rxstatus->rs_keyix
inotify: show inotify mask flags in proc fdinfo
fsnotify: fix wrong lockdep annotations
of: overlay: do not break notify on NOTIFY_{OK|STOP}
scsi: ufs: core: Exclude UECxx from SFR dump list
x86/pm: Fix false positive kmemleak report in msr_build_context()
x86/speculation: Add missing prototype for unpriv_ebpf_notify()
drm/msm/disp/dpu1: set vbif hw config to NULL to avoid use after memory free during pm runtime resume
drm/msm/dsi: fix error checks and return values for DSI xmit functions
drm/msm/hdmi: check return value after calling platform_get_resource_byname()
drm/rockchip: vop: fix possible null-ptr-deref in vop_bind()
x86: Fix return value of __setup handlers
irqchip/aspeed-i2c-ic: Fix irq_of_parse_and_map() return value
x86/mm: Cleanup the control_va_addr_alignment() __setup handler
drm/msm/mdp5: Return error code in mdp5_pipe_release when deadlock is detected
drm/msm/mdp5: Return error code in mdp5_mixer_release when deadlock is detected
drm/msm: return an error pointer in msm_gem_prime_get_sg_table()
media: uvcvideo: Fix missing check to determine if element is found in list
perf/amd/ibs: Use interrupt regs ip for stack unwinding
ASoC: mxs-saif: Fix refcount leak in mxs_saif_probe
regulator: pfuze100: Fix refcount leak in pfuze_parse_regulators_dt
scripts/faddr2line: Fix overlapping text section failures
media: st-delta: Fix PM disable depth imbalance in delta_probe
media: exynos4-is: Change clk_disable to clk_disable_unprepare
media: pvrusb2: fix array-index-out-of-bounds in pvr2_i2c_core_init
media: vsp1: Fix offset calculation for plane cropping
Bluetooth: fix dangling sco_conn and use-after-free in sco_sock_timeout
m68k: math-emu: Fix dependencies of math emulation support
sctp: read sk->sk_bound_dev_if once in sctp_rcv()
ext4: reject the 'commit' option on ext2 filesystems
drm: msm: fix possible memory leak in mdp5_crtc_cursor_set()
ASoC: wm2000: fix missing clk_disable_unprepare() on error in wm2000_anc_transition()
NFC: hci: fix sleep in atomic context bugs in nfc_hci_hcp_message_tx
rxrpc: Fix listen() setting the bar too high for the prealloc rings
rxrpc: Don't try to resend the request if we're receiving the reply
soc: qcom: smp2p: Fix missing of_node_put() in smp2p_parse_ipc
soc: qcom: smsm: Fix missing of_node_put() in smsm_parse_ipc
PCI: cadence: Fix find_first_zero_bit() limit
PCI: rockchip: Fix find_first_zero_bit() limit
ARM: dts: bcm2835-rpi-zero-w: Fix GPIO line name for Wifi/BT
ARM: dts: bcm2835-rpi-b: Fix GPIO line names
crypto: marvell/cesa - ECB does not IV
mfd: ipaq-micro: Fix error check return value of platform_get_irq()
scsi: fcoe: Fix Wstringop-overflow warnings in fcoe_wwn_from_mac()
firmware: arm_scmi: Fix list protocols enumeration in the base protocol
pinctrl: mvebu: Fix irq_of_parse_and_map() return value
drivers/base/node.c: fix compaction sysfs file leak
dax: fix cache flush on PMD-mapped pages
powerpc/8xx: export 'cpm_setbrg' for modules
powerpc/idle: Fix return value of __setup() handler
powerpc/4xx/cpm: Fix return value of __setup() handler
proc: fix dentry/inode overinstantiating under /proc/${pid}/net
tty: fix deadlock caused by calling printk() under tty_port->lock
Input: sparcspkr - fix refcount leak in bbc_beep_probe
powerpc/perf: Fix the threshold compare group constraint for power9
powerpc/fsl_rio: Fix refcount leak in fsl_rio_setup
mailbox: forward the hrtimer if not queued and under a lock
RDMA/hfi1: Prevent use of lock before it is initialized
f2fs: fix dereference of stale list iterator after loop body
iommu/mediatek: Add list_del in mtk_iommu_remove
i2c: at91: use dma safe buffers
i2c: at91: Initialize dma_buf in at91_twi_xfer()
NFSv4/pNFS: Do not fail I/O when we fail to allocate the pNFS layout
video: fbdev: clcdfb: Fix refcount leak in clcdfb_of_vram_setup
dmaengine: stm32-mdma: remove GISR1 register
iommu/amd: Increase timeout waiting for GA log enablement
perf c2c: Use stdio interface if slang is not supported
perf jevents: Fix event syntax error caused by ExtSel
f2fs: fix deadloop in foreground GC
wifi: mac80211: fix use-after-free in chanctx code
iwlwifi: mvm: fix assert 1F04 upon reconfig
fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages
netfilter: nf_tables: disallow non-stateful expression in sets earlier
ext4: fix use-after-free in ext4_rename_dir_prepare
ext4: fix bug_on in ext4_writepages
ext4: verify dir block before splitting it
ext4: avoid cycles in directory h-tree
tracing: Fix potential double free in create_var_ref()
PCI/PM: Fix bridge_d3_blacklist[] Elo i2 overwrite of Gigabyte X299
PCI: qcom: Fix runtime PM imbalance on probe errors
PCI: qcom: Fix unbalanced PHY init on probe errors
dlm: fix plock invalid read
dlm: fix missing lkb refcount handling
ocfs2: dlmfs: fix error handling of user_dlm_destroy_lock
scsi: dc395x: Fix a missing check on list iterator
scsi: ufs: qcom: Add a readl() to make sure ref_clk gets enabled
drm/amdgpu/cs: make commands with 0 chunks illegal behaviour.
drm/nouveau/clk: Fix an incorrect NULL check on list iterator
drm/bridge: analogix_dp: Grab runtime PM reference for DP-AUX
md: fix an incorrect NULL check in does_sb_need_changing
md: fix an incorrect NULL check in md_reload_sb
media: coda: Fix reported H264 profile
media: coda: Add more H264 levels for CODA960
RDMA/hfi1: Fix potential integer multiplication overflow errors
irqchip/armada-370-xp: Do not touch Performance Counter Overflow on A375, A38x, A39x
irqchip: irq-xtensa-mx: fix initial IRQ affinity
mac80211: upgrade passive scan to active scan on DFS channels after beacon rx
um: chan_user: Fix winch_tramp() return value
um: Fix out-of-bounds read in LDT setup
iommu/msm: Fix an incorrect NULL check on list iterator
nodemask.h: fix compilation error with GCC12
hugetlb: fix huge_pmd_unshare address update
rtl818x: Prevent using not initialized queues
ASoC: rt5514: Fix event generation for "DSP Voice Wake Up" control
carl9170: tx: fix an incorrect use of list iterator
gma500: fix an incorrect NULL check on list iterator
arm64: dts: qcom: ipq8074: fix the sleep clock frequency
phy: qcom-qmp: fix struct clk leak on probe errors
docs/conf.py: Cope with removal of language=None in Sphinx 5.0.0
dt-bindings: gpio: altera: correct interrupt-cells
blk-iolatency: Fix inflight count imbalances and IO hangs on offline
phy: qcom-qmp: fix reset-controller leak on probe errors
RDMA/rxe: Generate a completion for unsupported/invalid opcode
MIPS: IP27: Remove incorrect `cpu_has_fpu' override
md: bcache: check the return value of kzalloc() in detached_dev_do_request()
pcmcia: db1xxx_ss: restrict to MIPS_DB1XXX boards
staging: greybus: codecs: fix type confusion of list iterator variable
tty: goldfish: Use tty_port_destroy() to destroy port
usb: usbip: fix a refcount leak in stub_probe()
usb: usbip: add missing device lock on tweak configuration cmd
USB: storage: karma: fix rio_karma_init return
usb: musb: Fix missing of_node_put() in omap2430_probe
pwm: lp3943: Fix duty calculation in case period was clamped
rpmsg: qcom_smd: Fix irq_of_parse_and_map() return value
usb: dwc3: pci: Fix pm_runtime_get_sync() error checking
iio: adc: sc27xx: fix read big scale voltage not right
rpmsg: qcom_smd: Fix returning 0 if irq_of_parse_and_map() fails
coresight: cpu-debug: Replace mutex with mutex_trylock on panic notifier
soc: rockchip: Fix refcount leak in rockchip_grf_init
clocksource/drivers/riscv: Events are stopped during CPU suspend
rtc: mt6397: check return value after calling platform_get_resource()
serial: meson: acquire port->lock in startup()
serial: 8250_fintek: Check SER_RS485_RTS_* only with RS485
serial: digicolor-usart: Don't allow CS5-6
serial: txx9: Don't allow CS5-6
serial: sh-sci: Don't allow CS5-6
serial: st-asc: Sanitize CSIZE and correct PARENB for CS7
serial: stm32-usart: Correct CSIZE, bits, and parity
firmware: dmi-sysfs: Fix memory leak in dmi_sysfs_register_handle
bus: ti-sysc: Fix warnings for unbind for serial
clocksource/drivers/oxnas-rps: Fix irq_of_parse_and_map() return value
s390/crypto: fix scatterwalk_unmap() callers in AES-GCM
net: ethernet: mtk_eth_soc: out of bounds read in mtk_hwlro_get_fdir_entry()
net: dsa: mv88e6xxx: Fix refcount leak in mv88e6xxx_mdios_register
modpost: fix removing numeric suffixes
jffs2: fix memory leak in jffs2_do_fill_super
ubi: ubi_create_volume: Fix use-after-free when volume creation failed
nfp: only report pause frame configuration for physical device
net/mlx5e: Update netdev features after changing XDP state
tcp: tcp_rtx_synack() can be called from process context
afs: Fix infinite loop found by xfstest generic/676
tipc: check attribute length for bearer name
perf c2c: Fix sorting in percent_rmt_hitm_cmp()
mips: cpc: Fix refcount leak in mips_cpc_default_phys_base
tracing: Fix sleeping function called from invalid context on RT kernel
tracing: Avoid adding tracer option before update_tracer_options
i2c: cadence: Increase timeout per message if necessary
m68knommu: set ZERO_PAGE() to the allocated zeroed page
m68knommu: fix undefined reference to `_init_sp'
NFSv4: Don't hold the layoutget locks across multiple RPC calls
video: fbdev: pxa3xx-gcu: release the resources correctly in pxa3xx_gcu_probe/remove()
xprtrdma: treat all calls not a bcall when bc_serv is NULL
ata: pata_octeon_cf: Fix refcount leak in octeon_cf_probe
af_unix: Fix a data-race in unix_dgram_peer_wake_me().
bpf, arm64: Clear prog->jited_len along prog->jited
net/mlx4_en: Fix wrong return value on ioctl EEPROM query failure
SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
net: mdio: unexport __init-annotated mdio_bus_init()
net: xfrm: unexport __init-annotated xfrm4_protocol_init()
net: ipv6: unexport __init-annotated seg6_hmac_init()
net/mlx5: Rearm the FW tracer after each tracer event
ip_gre: test csum_start instead of transport header
net: altera: Fix refcount leak in altera_tse_mdio_create
drm: imx: fix compiler warning with gcc-12
iio: dummy: iio_simple_dummy: check the return value of kstrdup()
lkdtm/usercopy: Expand size of "out of frame" object
tty: synclink_gt: Fix null-pointer-dereference in slgt_clean()
tty: Fix a possible resource leak in icom_probe
drivers: staging: rtl8192u: Fix deadlock in ieee80211_beacons_stop()
drivers: staging: rtl8192e: Fix deadlock in rtllib_beacons_stop()
USB: host: isp116x: check return value after calling platform_get_resource()
drivers: tty: serial: Fix deadlock in sa1100_set_termios()
drivers: usb: host: Fix deadlock in oxu_bus_suspend()
USB: hcd-pci: Fully suspend across freeze/thaw cycle
usb: dwc2: gadget: don't reset gadget's driver->bus
misc: rtsx: set NULL intfdata when probe fails
extcon: Modify extcon device to be created after driver data is set
clocksource/drivers/sp804: Avoid error on multiple instances
staging: rtl8712: fix uninit-value in r871xu_drv_init()
serial: msm_serial: disable interrupts in __msm_console_write()
kernfs: Separate kernfs_pr_cont_buf and rename_lock.
md: protect md_unregister_thread from reentrancy
Revert "net: af_key: add check for pfkey_broadcast in function pfkey_process"
ceph: allow ceph.dir.rctime xattr to be updatable
drm/radeon: fix a possible null pointer dereference
modpost: fix undefined behavior of is_arm_mapping_symbol()
nbd: call genl_unregister_family() first in nbd_cleanup()
nbd: fix race between nbd_alloc_config() and module removal
nbd: fix io hung while disconnecting device
nodemask: Fix return values to be unsigned
vringh: Fix loop descriptors check in the indirect cases
ALSA: hda/conexant - Fix loopback issue with CX20632
cifs: return errors during session setup during reconnects
ata: libata-transport: fix {dma|pio|xfer}_mode sysfs files
mmc: block: Fix CQE recovery reset success
nfc: st21nfca: fix incorrect validating logic in EVT_TRANSACTION
nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling
ixgbe: fix bcast packets Rx on VF after promisc removal
ixgbe: fix unexpected VLAN Rx in promisc mode on VF
Input: bcm5974 - set missing URB_NO_TRANSFER_DMA_MAP urb flag
powerpc/32: Fix overread/overwrite of thread_struct via ptrace
md/raid0: Ignore RAID0 layout if the second zone has only one device
mtd: cfi_cmdset_0002: Move and rename chip_check/chip_ready/chip_good_for_write
mtd: cfi_cmdset_0002: Use chip_ready() for write on S29GL064N
tcp: fix tcp_mtup_probe_success vs wrong snd_cwnd
Linux 4.19.247
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I58c002ddc38e389a13e2bdb9f291f05805718c9d
|
||
|
|
515d077ee3 |
blk-iolatency: Fix inflight count imbalances and IO hangs on offline
commit 8a177a36da6c54c98b8685d4f914cb3637d53c0d upstream.
iolatency needs to track the number of inflight IOs per cgroup. As this
tracking can be expensive, it is disabled when no cgroup has iolatency
configured for the device. To ensure that the inflight counters stay
balanced, iolatency_set_limit() freezes the request_queue while manipulating
the enabled counter, which ensures that no IO is in flight and thus all
counters are zero.
Unfortunately, iolatency_set_limit() isn't the only place where the enabled
counter is manipulated. iolatency_pd_offline() can also dec the counter and
trigger disabling. As this disabling happens without freezing the q, this
can easily happen while some IOs are in flight and thus leak the counts.
This can be easily demonstrated by turning on iolatency on an one empty
cgroup while IOs are in flight in other cgroups and then removing the
cgroup. Note that iolatency shouldn't have been enabled elsewhere in the
system to ensure that removing the cgroup disables iolatency for the whole
device.
The following keeps flipping on and off iolatency on sda:
echo +io > /sys/fs/cgroup/cgroup.subtree_control
while true; do
mkdir -p /sys/fs/cgroup/test
echo '8:0 target=100000' > /sys/fs/cgroup/test/io.latency
sleep 1
rmdir /sys/fs/cgroup/test
sleep 1
done
and there's concurrent fio generating direct rand reads:
fio --name test --filename=/dev/sda --direct=1 --rw=randread \
--runtime=600 --time_based --iodepth=256 --numjobs=4 --bs=4k
while monitoring with the following drgn script:
while True:
for css in css_for_each_descendant_pre(prog['blkcg_root'].css.address_of_()):
for pos in hlist_for_each(container_of(css, 'struct blkcg', 'css').blkg_list):
blkg = container_of(pos, 'struct blkcg_gq', 'blkcg_node')
pd = blkg.pd[prog['blkcg_policy_iolatency'].plid]
if pd.value_() == 0:
continue
iolat = container_of(pd, 'struct iolatency_grp', 'pd')
inflight = iolat.rq_wait.inflight.counter.value_()
if inflight:
print(f'inflight={inflight} {disk_name(blkg.q.disk).decode("utf-8")} '
f'{cgroup_path(css.cgroup).decode("utf-8")}')
time.sleep(1)
The monitoring output looks like the following:
inflight=1 sda /user.slice
inflight=1 sda /user.slice
...
inflight=14 sda /user.slice
inflight=13 sda /user.slice
inflight=17 sda /user.slice
inflight=15 sda /user.slice
inflight=18 sda /user.slice
inflight=17 sda /user.slice
inflight=20 sda /user.slice
inflight=19 sda /user.slice <- fio stopped, inflight stuck at 19
inflight=19 sda /user.slice
inflight=19 sda /user.slice
If a cgroup with stuck inflight ends up getting throttled, the throttled IOs
will never get issued as there's no completion event to wake it up leading
to an indefinite hang.
This patch fixes the bug by unifying enable handling into a work item which
is automatically kicked off from iolatency_set_min_lat_nsec() which is
called from both iolatency_set_limit() and iolatency_pd_offline() paths.
Punting to a work item is necessary as iolatency_pd_offline() is called
under spinlocks while freezing a request_queue requires a sleepable context.
This also simplifies the code reducing LOC sans the comments and avoids the
unnecessary freezes which were happening whenever a cgroup's latency target
is newly set or cleared.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Liu Bo <bo.liu@linux.alibaba.com>
Fixes: 8c772a9bfc7c ("blk-iolatency: fix IO hang due to negative inflight counter")
Cc: stable@vger.kernel.org # v5.0+
Link: https://lore.kernel.org/r/Yn9ScX6Nx2qIiQQi@slm.duckdns.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
|
78a4d032ec |
Merge 4.19.203 into android-4.19-stable
Changes in 4.19.203 Revert "ACPICA: Fix memory leak caused by _CID repair function" ALSA: seq: Fix racy deletion of subscriber ARM: imx: add missing iounmap() ARM: dts: colibri-imx6ull: limit SDIO clock to 25MHz ALSA: usb-audio: fix incorrect clock source setting clk: stm32f4: fix post divisor setup for I2S/SAI PLLs omap5-board-common: remove not physically existing vdds_1v8_main fixed-regulator scsi: sr: Return correct event when media event code is 3 media: videobuf2-core: dequeue if start_streaming fails net: natsemi: Fix missing pci_disable_device() in probe and remove sctp: move the active_key update after sh_keys is added nfp: update ethtool reporting of pauseframe control net: ipv6: fix returned variable type in ip6_skb_dst_mtu mips: Fix non-POSIX regexp bnx2x: fix an error code in bnx2x_nic_load() net: pegasus: fix uninit-value in get_interrupt_interval net: fec: fix use-after-free in fec_drv_remove net: vxge: fix use-after-free in vxge_device_unregister blk-iolatency: error out if blk_get_queue() failed in iolatency_set_limit() Bluetooth: defer cleanup of resources in hci_unregister_dev() USB: usbtmc: Fix RCU stall warning USB: serial: option: add Telit FD980 composition 0x1056 USB: serial: ch341: fix character loss at high transfer rates USB: serial: ftdi_sio: add device ID for Auto-M3 OP-COM v2 firmware_loader: use -ETIMEDOUT instead of -EAGAIN in fw_load_sysfs_fallback firmware_loader: fix use-after-free in firmware_fallback_sysfs ALSA: usb-audio: Add registration quirk for JBL Quantum 600 usb: gadget: f_hid: added GET_IDLE and SET_IDLE handlers usb: gadget: f_hid: fixed NULL pointer dereference usb: gadget: f_hid: idle uses the highest byte for duration usb: otg-fsm: Fix hrtimer list corruption scripts/tracing: fix the bug that can't parse raw_trace_func tracing / histogram: Give calculation hist_fields a size tracing/histogram: Rename "cpu" to "common_cpu" optee: Clear stale cache entries during initialization staging: rtl8723bs: Fix a resource leak in sd_int_dpc media: rtl28xxu: fix zero-length control request pipe: increase minimum default pipe size to 2 pages ext4: fix potential htree corruption when growing large_dir directories serial: 8250: Mask out floating 16/32-bit bus bits MIPS: Malta: Do not byte-swap accesses to the CBUS UART pcmcia: i82092: fix a null pointer dereference bug KVM: x86: accept userspace interrupt only if no event is injected KVM: x86/mmu: Fix per-cpu counter corruption on 32-bit builds spi: meson-spicc: fix memory leak in meson_spicc_remove perf/x86/amd: Don't touch the AMD64_EVENTSEL_HOSTONLY bit inside the guest qmi_wwan: add network device usage statistics for qmimux devices libata: fix ata_pio_sector for CONFIG_HIGHMEM reiserfs: add check for root_inode in reiserfs_fill_super reiserfs: check directory items on read from disk alpha: Send stop IPI to send to online CPUs net/qla3xxx: fix schedule while atomic in ql_wait_for_drvr_lock and ql_adapter_reset ARM: imx: add mmdc ipg clock operation for mmdc Linux 4.19.203 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I4adcce1092ab000faec667eda6cf569e7a269562 |
||
|
|
76ab02d9b8 |
blk-iolatency: error out if blk_get_queue() failed in iolatency_set_limit()
[ Upstream commit 8d75d0eff6887bcac7225e12b9c75595e523d92d ]
If queue is dying while iolatency_set_limit() is in progress,
blk_get_queue() won't increment the refcount of the queue. However,
blk_put_queue() will still decrement the refcount later, which will
cause the refcout to be unbalanced.
Thus error out in such case to fix the problem.
Fixes: 8c772a9bfc7c ("blk-iolatency: fix IO hang due to negative inflight counter")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20210805124645.543797-1-yukuai3@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
||
|
|
8ca5759502 |
Merge 4.19.73 into android-4.19
Changes in 4.19.73
ALSA: hda - Fix potential endless loop at applying quirks
ALSA: hda/realtek - Fix overridden device-specific initialization
ALSA: hda/realtek - Add quirk for HP Pavilion 15
ALSA: hda/realtek - Enable internal speaker & headset mic of ASUS UX431FL
ALSA: hda/realtek - Fix the problem of two front mics on a ThinkCentre
sched/fair: Don't assign runtime for throttled cfs_rq
drm/vmwgfx: Fix double free in vmw_recv_msg()
vhost/test: fix build for vhost test
vhost/test: fix build for vhost test - again
powerpc/tm: Fix FP/VMX unavailable exceptions inside a transaction
batman-adv: fix uninit-value in batadv_netlink_get_ifindex()
batman-adv: Only read OGM tvlv_len after buffer len check
hv_sock: Fix hang when a connection is closed
Blk-iolatency: warn on negative inflight IO counter
blk-iolatency: fix STS_AGAIN handling
{nl,mac}80211: fix interface combinations on crypto controlled devices
timekeeping: Use proper ktime_add when adding nsecs in coarse offset
selftests: fib_rule_tests: use pre-defined DEV_ADDR
x86/ftrace: Fix warning and considate ftrace_jmp_replace() and ftrace_call_replace()
powerpc/64: mark start_here_multiplatform as __ref
media: stm32-dcmi: fix irq = 0 case
arm64: dts: rockchip: enable usb-host regulators at boot on rk3328-rock64
scripts/decode_stacktrace: match basepath using shell prefix operator, not regex
riscv: remove unused variable in ftrace
nvme-fc: use separate work queue to avoid warning
clk: s2mps11: Add used attribute to s2mps11_dt_match
remoteproc: qcom: q6v5: shore up resource probe handling
modules: always page-align module section allocations
kernel/module: Fix mem leak in module_add_modinfo_attrs
drm/i915: Re-apply "Perform link quality check, unconditionally during long pulse"
media: cec/v4l2: move V4L2 specific CEC functions to V4L2
media: cec: remove cec-edid.c
scsi: qla2xxx: Move log messages before issuing command to firmware
keys: Fix the use of the C++ keyword "private" in uapi/linux/keyctl.h
Drivers: hv: kvp: Fix two "this statement may fall through" warnings
x86, hibernate: Fix nosave_regions setup for hibernation
remoteproc: qcom: q6v5-mss: add SCM probe dependency
drm/amdgpu/gfx9: Update gfx9 golden settings.
drm/amdgpu: Update gc_9_0 golden settings.
KVM: x86: hyperv: enforce vp_index < KVM_MAX_VCPUS
KVM: x86: hyperv: consistently use 'hv_vcpu' for 'struct kvm_vcpu_hv' variables
KVM: x86: hyperv: keep track of mismatched VP indexes
KVM: hyperv: define VP assist page helpers
x86/kvm/lapic: preserve gfn_to_hva_cache len on cache reinit
drm/i915: Fix intel_dp_mst_best_encoder()
drm/i915: Rename PLANE_CTL_DECOMPRESSION_ENABLE
drm/i915/gen9+: Fix initial readout for Y tiled framebuffers
drm/atomic_helper: Disallow new modesets on unregistered connectors
Drivers: hv: kvp: Fix the indentation of some "break" statements
Drivers: hv: kvp: Fix the recent regression caused by incorrect clean-up
powerplay: Respect units on max dcfclk watermark
drm/amd/pp: Fix truncated clock value when set watermark
drm/amd/dm: Understand why attaching path/tile properties are needed
ARM: davinci: da8xx: define gpio interrupts as separate resources
ARM: davinci: dm365: define gpio interrupts as separate resources
ARM: davinci: dm646x: define gpio interrupts as separate resources
ARM: davinci: dm355: define gpio interrupts as separate resources
ARM: davinci: dm644x: define gpio interrupts as separate resources
s390/zcrypt: reinit ap queue state machine during device probe
media: vim2m: use workqueue
media: vim2m: use cancel_delayed_work_sync instead of flush_schedule_work
drm/i915: Restore sane defaults for KMS on GEM error load
drm/i915: Cleanup gt powerstate from gem
KVM: PPC: Book3S HV: Fix race between kvm_unmap_hva_range and MMU mode switch
Btrfs: clean up scrub is_dev_replace parameter
Btrfs: fix deadlock with memory reclaim during scrub
btrfs: Remove extent_io_ops::fill_delalloc
btrfs: Fix error handling in btrfs_cleanup_ordered_extents
scsi: megaraid_sas: Fix combined reply queue mode detection
scsi: megaraid_sas: Add check for reset adapter bit
scsi: megaraid_sas: Use 63-bit DMA addressing
powerpc/pkeys: Fix handling of pkey state across fork()
btrfs: volumes: Make sure no dev extent is beyond device boundary
btrfs: Use real device structure to verify dev extent
media: vim2m: only cancel work if it is for right context
ARC: show_regs: lockdep: re-enable preemption
ARC: mm: do_page_fault fixes #1: relinquish mmap_sem if signal arrives while handle_mm_fault
IB/uverbs: Fix OOPs upon device disassociation
crypto: ccree - fix resume race condition on init
crypto: ccree - add missing inline qualifier
drm/vblank: Allow dynamic per-crtc max_vblank_count
drm/i915/ilk: Fix warning when reading emon_status with no output
mfd: Kconfig: Fix I2C_DESIGNWARE_PLATFORM dependencies
tpm: Fix some name collisions with drivers/char/tpm.h
bcache: replace hard coded number with BUCKET_GC_GEN_MAX
bcache: treat stale && dirty keys as bad keys
KVM: VMX: Compare only a single byte for VMCS' "launched" in vCPU-run
iio: adc: exynos-adc: Add S5PV210 variant
dt-bindings: iio: adc: exynos-adc: Add S5PV210 variant
iio: adc: exynos-adc: Use proper number of channels for Exynos4x12
mt76: fix corrupted software generated tx CCMP PN
drm/nouveau: Don't WARN_ON VCPI allocation failures
iwlwifi: fix devices with PCI Device ID 0x34F0 and 11ac RF modules
iwlwifi: add new card for 9260 series
x86/kvmclock: set offset for kvm unstable clock
spi: spi-gpio: fix SPI_CS_HIGH capability
powerpc/kvm: Save and restore host AMR/IAMR/UAMOR
mmc: renesas_sdhi: Fix card initialization failure in high speed mode
btrfs: scrub: pass fs_info to scrub_setup_ctx
btrfs: scrub: move scrub_setup_ctx allocation out of device_list_mutex
btrfs: scrub: fix circular locking dependency warning
btrfs: init csum_list before possible free
PCI: qcom: Fix error handling in runtime PM support
PCI: qcom: Don't deassert reset GPIO during probe
drm: add __user attribute to ptr_to_compat()
CIFS: Fix error paths in writeback code
CIFS: Fix leaking locked VFS cache pages in writeback retry
drm/i915: Handle vm_mmap error during I915_GEM_MMAP ioctl with WC set
drm/i915: Sanity check mmap length against object size
usb: typec: tcpm: Try PD-2.0 if sink does not respond to 3.0 source-caps
arm64: dts: stratix10: add the sysmgr-syscon property from the gmac's
IB/mlx5: Reset access mask when looping inside page fault handler
kvm: mmu: Fix overflow on kvm mmu page limit calculation
x86/kvm: move kvm_load/put_guest_xcr0 into atomic context
KVM: x86: Always use 32-bit SMRAM save state for 32-bit kernels
cifs: Fix lease buffer length error
media: i2c: tda1997x: select V4L2_FWNODE
ext4: protect journal inode's blocks using block_validity
ARM: dts: qcom: ipq4019: fix PCI range
ARM: dts: qcom: ipq4019: Fix MSI IRQ type
ARM: dts: qcom: ipq4019: enlarge PCIe BAR range
dt-bindings: mmc: Add supports-cqe property
dt-bindings: mmc: Add disable-cqe-dcmd property.
PCI: Add macro for Switchtec quirk declarations
PCI: Reset Lenovo ThinkPad P50 nvgpu at boot if necessary
dm mpath: fix missing call of path selector type->end_io
blk-mq: free hw queue's resource in hctx's release handler
mmc: sdhci-pci: Add support for Intel CML
PCI: dwc: Use devm_pci_alloc_host_bridge() to simplify code
cifs: smbd: take an array of reqeusts when sending upper layer data
dm crypt: move detailed message into debug level
signal/arc: Use force_sig_fault where appropriate
ARC: mm: fix uninitialised signal code in do_page_fault
ARC: mm: SIGSEGV userspace trying to access kernel virtual memory
drm/amdkfd: Add missing Polaris10 ID
kvm: Check irqchip mode before assign irqfd
drm/amdgpu: fix ring test failure issue during s3 in vce 3.0 (V2)
drm/amdgpu/{uvd,vcn}: fetch ring's read_ptr after alloc
Btrfs: fix race between block group removal and block group allocation
cifs: add spinlock for the openFileList to cifsInodeInfo
clk: tegra: Fix maximum audio sync clock for Tegra124/210
clk: tegra210: Fix default rates for HDA clocks
IB/hfi1: Avoid hardlockup with flushlist_lock
apparmor: reset pos on failure to unpack for various functions
scsi: target/core: Use the SECTOR_SHIFT constant
scsi: target/iblock: Fix overrun in WRITE SAME emulation
staging: wilc1000: fix error path cleanup in wilc_wlan_initialize()
scsi: zfcp: fix request object use-after-free in send path causing wrong traces
cifs: Properly handle auto disabling of serverino option
ALSA: hda - Don't resume forcibly i915 HDMI/DP codec
ceph: use ceph_evict_inode to cleanup inode's resource
KVM: x86: optimize check for valid PAT value
KVM: VMX: Always signal #GP on WRMSR to MSR_IA32_CR_PAT with bad value
KVM: VMX: Fix handling of #MC that occurs during VM-Entry
KVM: VMX: check CPUID before allowing read/write of IA32_XSS
KVM: PPC: Use ccr field in pt_regs struct embedded in vcpu struct
KVM: PPC: Book3S HV: Fix CR0 setting in TM emulation
ARM: dts: gemini: Set DIR-685 SPI CS as active low
RDMA/srp: Document srp_parse_in() arguments
RDMA/srp: Accept again source addresses that do not have a port number
btrfs: correctly validate compression type
resource: Include resource end in walk_*() interfaces
resource: Fix find_next_iomem_res() iteration issue
resource: fix locking in find_next_iomem_res()
pstore: Fix double-free in pstore_mkfile() failure path
dm thin metadata: check if in fail_io mode when setting needs_check
drm/panel: Add support for Armadeus ST0700 Adapt
ALSA: hda - Fix intermittent CORB/RIRB stall on Intel chips
powerpc/mm: Limit rma_size to 1TB when running without HV mode
iommu/iova: Remove stale cached32_node
gpio: don't WARN() on NULL descs if gpiolib is disabled
i2c: at91: disable TXRDY interrupt after sending data
i2c: at91: fix clk_offset for sama5d2
mm/migrate.c: initialize pud_entry in migrate_vma()
iio: adc: gyroadc: fix uninitialized return code
NFSv4: Fix delegation state recovery
bcache: only clear BTREE_NODE_dirty bit when it is set
bcache: add comments for mutex_lock(&b->write_lock)
bcache: fix race in btree_flush_write()
drm/i915: Make sure cdclk is high enough for DP audio on VLV/CHV
virtio/s390: fix race on airq_areas[]
drm/atomic_helper: Allow DPMS On<->Off changes for unregistered connectors
ext4: don't perform block validity checks on the journal inode
ext4: fix block validity checks for journal inodes using indirect blocks
ext4: unsigned int compared against zero
PCI: Reset both NVIDIA GPU and HDA in ThinkPad P50 workaround
powerpc/tm: Remove msr_tm_active()
powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts
vhost: make sure log_num < in_num
Linux 4.19.73
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I7bc57825aeb36759bb8e8726888da9af06392c09
|
||
|
|
178d1337a5 |
blk-iolatency: fix STS_AGAIN handling
[ Upstream commit c9b3007feca018d3f7061f5d5a14cb00766ffe9b ]
The iolatency controller is based on rq_qos. It increments on
rq_qos_throttle() and decrements on either rq_qos_cleanup() or
rq_qos_done_bio(). a3fb01ba5af0 fixes the double accounting issue where
blk_mq_make_request() may call both rq_qos_cleanup() and
rq_qos_done_bio() on REQ_NO_WAIT. So checking STS_AGAIN prevents the
double decrement.
The above works upstream as the only way we can get STS_AGAIN is from
blk_mq_get_request() failing. The STS_AGAIN handling isn't a real
problem as bio_endio() skipping only happens on reserved tag allocation
failures which can only be caused by driver bugs and already triggers
WARN.
However, the fix creates a not so great dependency on how STS_AGAIN can
be propagated. Internally, we (Facebook) carry a patch that kills read
ahead if a cgroup is io congested or a fatal signal is pending. This
combined with chained bios progagate their bi_status to the parent is
not already set can can cause the parent bio to not clean up properly
even though it was successful. This consequently leaks the inflight
counter and can hang all IOs under that blkg.
To nip the adverse interaction early, this removes the rq_qos_cleanup()
callback in iolatency in favor of cleaning up always on the
rq_qos_done_bio() path.
Fixes: a3fb01ba5af0 ("blk-iolatency: only account submitted bios")
Debugged-by: Tejun Heo <tj@kernel.org>
Debugged-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
||
|
|
5f33e81250 |
Blk-iolatency: warn on negative inflight IO counter
[ Upstream commit 391f552af213985d3d324c60004475759a7030c5 ] This is to catch any unexpected negative value of inflight IO counter. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
71ce27c31a |
Merge 4.19.61 into android-4.19
Changes in 4.19.61
MIPS: ath79: fix ar933x uart parity mode
MIPS: fix build on non-linux hosts
arm64/efi: Mark __efistub_stext_offset as an absolute symbol explicitly
scsi: iscsi: set auth_protocol back to NULL if CHAP_A value is not supported
dmaengine: imx-sdma: fix use-after-free on probe error path
wil6210: fix potential out-of-bounds read
ath10k: Do not send probe response template for mesh
ath9k: Check for errors when reading SREV register
ath6kl: add some bounds checking
ath10k: add peer id check in ath10k_peer_find_by_id
wil6210: fix spurious interrupts in 3-msi
ath: DFS JP domain W56 fixed pulse type 3 RADAR detection
regmap: debugfs: Fix memory leak in regmap_debugfs_init
batman-adv: fix for leaked TVLV handler.
media: dvb: usb: fix use after free in dvb_usb_device_exit
media: spi: IR LED: add missing of table registration
crypto: talitos - fix skcipher failure due to wrong output IV
media: ov7740: avoid invalid framesize setting
media: marvell-ccic: fix DMA s/g desc number calculation
media: vpss: fix a potential NULL pointer dereference
media: media_device_enum_links32: clean a reserved field
net: stmmac: dwmac1000: Clear unused address entries
net: stmmac: dwmac4/5: Clear unused address entries
qed: Set the doorbell address correctly
signal/pid_namespace: Fix reboot_pid_ns to use send_sig not force_sig
af_key: fix leaks in key_pol_get_resp and dump_sp.
xfrm: Fix xfrm sel prefix length validation
fscrypt: clean up some BUG_ON()s in block encryption/decryption
perf annotate TUI browser: Do not use member from variable within its own initialization
media: mc-device.c: don't memset __user pointer contents
media: saa7164: fix remove_proc_entry warning
media: staging: media: davinci_vpfe: - Fix for memory leak if decoder initialization fails.
net: phy: Check against net_device being NULL
crypto: talitos - properly handle split ICV.
crypto: talitos - Align SEC1 accesses to 32 bits boundaries.
tua6100: Avoid build warnings.
batman-adv: Fix duplicated OGMs on NETDEV_UP
locking/lockdep: Fix merging of hlocks with non-zero references
media: wl128x: Fix some error handling in fm_v4l2_init_video_device()
net: hns3: set ops to null when unregister ad_dev
cpupower : frequency-set -r option misses the last cpu in related cpu list
arm64: mm: make CONFIG_ZONE_DMA32 configurable
perf jvmti: Address gcc string overflow warning for strncpy()
net: stmmac: dwmac4: fix flow control issue
net: stmmac: modify default value of tx-frames
crypto: inside-secure - do not rely on the hardware last bit for result descriptors
net: fec: Do not use netdev messages too early
net: axienet: Fix race condition causing TX hang
s390/qdio: handle PENDING state for QEBSM devices
RAS/CEC: Fix pfn insertion
net: sfp: add mutex to prevent concurrent state checks
ipset: Fix memory accounting for hash types on resize
perf cs-etm: Properly set the value of 'old' and 'head' in snapshot mode
perf test 6: Fix missing kvm module load for s390
perf report: Fix OOM error in TUI mode on s390
irqchip/meson-gpio: Add support for Meson-G12A SoC
media: uvcvideo: Fix access to uninitialized fields on probe error
media: fdp1: Support M3N and E3 platforms
iommu: Fix a leak in iommu_insert_resv_region
gpio: omap: fix lack of irqstatus_raw0 for OMAP4
gpio: omap: ensure irq is enabled before wakeup
regmap: fix bulk writes on paged registers
bpf: silence warning messages in core
media: s5p-mfc: fix reading min scratch buffer size on MFC v6/v7
selinux: fix empty write to keycreate file
x86/cpu: Add Ice Lake NNPI to Intel family
ASoC: meson: axg-tdm: fix sample clock inversion
rcu: Force inlining of rcu_read_lock()
x86/cpufeatures: Add FDP_EXCPTN_ONLY and ZERO_FCS_FDS
qed: iWARP - Fix tc for MPA ll2 connection
net: hns3: fix for skb leak when doing selftest
block: null_blk: fix race condition for null_del_dev
blkcg, writeback: dead memcgs shouldn't contribute to writeback ownership arbitration
xfrm: fix sa selector validation
sched/core: Add __sched tag for io_schedule()
sched/fair: Fix "runnable_avg_yN_inv" not used warnings
perf/x86/intel/uncore: Handle invalid event coding for free-running counter
x86/atomic: Fix smp_mb__{before,after}_atomic()
perf evsel: Make perf_evsel__name() accept a NULL argument
vhost_net: disable zerocopy by default
ipoib: correcly show a VF hardware address
x86/cacheinfo: Fix a -Wtype-limits warning
blk-iolatency: only account submitted bios
ACPICA: Clear status of GPEs on first direct enable
EDAC/sysfs: Fix memory leak when creating a csrow object
nvme: fix possible io failures when removing multipathed ns
nvme-pci: properly report state change failure in nvme_reset_work
nvme-pci: set the errno on ctrl state change error
lightnvm: pblk: fix freeing of merged pages
arm64: Do not enable IRQs for ct_user_exit
ipsec: select crypto ciphers for xfrm_algo
ipvs: defer hook registration to avoid leaks
media: s5p-mfc: Make additional clocks optional
media: i2c: fix warning same module names
ntp: Limit TAI-UTC offset
timer_list: Guard procfs specific code
acpi/arm64: ignore 5.1 FADTs that are reported as 5.0
media: coda: fix mpeg2 sequence number handling
media: coda: fix last buffer handling in V4L2_ENC_CMD_STOP
media: coda: increment sequence offset for the last returned frame
media: vimc: cap: check v4l2_fill_pixfmt return value
media: hdpvr: fix locking and a missing msleep
net: stmmac: sun8i: force select external PHY when no internal one
rtlwifi: rtl8192cu: fix error handle when usb probe failed
mt7601u: do not schedule rx_tasklet when the device has been disconnected
x86/build: Add 'set -e' to mkcapflags.sh to delete broken capflags.c
mt7601u: fix possible memory leak when the device is disconnected
ipvs: fix tinfo memory leak in start_sync_thread
ath10k: add missing error handling
ath10k: fix PCIE device wake up failed
perf tools: Increase MAX_NR_CPUS and MAX_CACHES
ASoC: Intel: hdac_hdmi: Set ops to NULL on remove
libata: don't request sense data on !ZAC ATA devices
clocksource/drivers/exynos_mct: Increase priority over ARM arch timer
xsk: Properly terminate assignment in xskq_produce_flush_desc
rslib: Fix decoding of shortened codes
rslib: Fix handling of of caller provided syndrome
ixgbe: Check DDM existence in transceiver before access
crypto: serpent - mark __serpent_setkey_sbox noinline
crypto: asymmetric_keys - select CRYPTO_HASH where needed
wil6210: drop old event after wmi_call timeout
EDAC: Fix global-out-of-bounds write when setting edac_mc_poll_msec
bcache: check CACHE_SET_IO_DISABLE in allocator code
bcache: check CACHE_SET_IO_DISABLE bit in bch_journal()
bcache: acquire bch_register_lock later in cached_dev_free()
bcache: check c->gc_thread by IS_ERR_OR_NULL in cache_set_flush()
bcache: fix potential deadlock in cached_def_free()
net: hns3: fix a -Wformat-nonliteral compile warning
net: hns3: add some error checking in hclge_tm module
ath10k: destroy sdio workqueue while remove sdio module
net: mvpp2: prs: Don't override the sign bit in SRAM parser shift
igb: clear out skb->tstamp after reading the txtime
iwlwifi: mvm: Drop large non sta frames
bpf: fix uapi bpf_prog_info fields alignment
perf stat: Make metric event lookup more robust
perf stat: Fix group lookup for metric group
bnx2x: Prevent ptp_task to be rescheduled indefinitely
net: usb: asix: init MAC address buffers
rxrpc: Fix oops in tracepoint
bpf, libbpf, smatch: Fix potential NULL pointer dereference
selftests: bpf: fix inlines in test_lwt_seg6local
bonding: validate ip header before check IPPROTO_IGMP
gpiolib: Fix references to gpiod_[gs]et_*value_cansleep() variants
tools: bpftool: Fix json dump crash on powerpc
Bluetooth: hci_bcsp: Fix memory leak in rx_skb
Bluetooth: Add new 13d3:3491 QCA_ROME device
Bluetooth: Add new 13d3:3501 QCA_ROME device
Bluetooth: 6lowpan: search for destination address in all peers
perf tests: Fix record+probe_libc_inet_pton.sh for powerpc64
Bluetooth: Check state in l2cap_disconnect_rsp
gtp: add missing gtp_encap_disable_sock() in gtp_encap_enable()
Bluetooth: validate BLE connection interval updates
gtp: fix suspicious RCU usage
gtp: fix Illegal context switch in RCU read-side critical section.
gtp: fix use-after-free in gtp_encap_destroy()
gtp: fix use-after-free in gtp_newlink()
net: mvmdio: defer probe of orion-mdio if a clock is not ready
iavf: fix dereference of null rx_buffer pointer
floppy: fix div-by-zero in setup_format_params
floppy: fix out-of-bounds read in next_valid_format
floppy: fix invalid pointer dereference in drive_name
floppy: fix out-of-bounds read in copy_buffer
xen: let alloc_xenballooned_pages() fail if not enough memory free
scsi: NCR5380: Reduce goto statements in NCR5380_select()
scsi: NCR5380: Always re-enable reselection interrupt
Revert "scsi: ncr5380: Increase register polling limit"
scsi: core: Fix race on creating sense cache
scsi: megaraid_sas: Fix calculation of target ID
scsi: mac_scsi: Increase PIO/PDMA transfer length threshold
scsi: mac_scsi: Fix pseudo DMA implementation, take 2
crypto: ghash - fix unaligned memory access in ghash_setkey()
crypto: ccp - Validate the the error value used to index error messages
crypto: arm64/sha1-ce - correct digest for empty data in finup
crypto: arm64/sha2-ce - correct digest for empty data in finup
crypto: chacha20poly1305 - fix atomic sleep when using async algorithm
crypto: crypto4xx - fix AES CTR blocksize value
crypto: crypto4xx - fix blocksize for cfb and ofb
crypto: crypto4xx - block ciphers should only accept complete blocks
crypto: ccp - memset structure fields to zero before reuse
crypto: ccp/gcm - use const time tag comparison.
crypto: crypto4xx - fix a potential double free in ppc4xx_trng_probe
Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()"
bcache: Revert "bcache: fix high CPU occupancy during journal"
bcache: Revert "bcache: free heap cache_set->flush_btree in bch_journal_free"
bcache: ignore read-ahead request failure on backing device
bcache: fix mistaken sysfs entry for io_error counter
bcache: destroy dc->writeback_write_wq if failed to create dc->writeback_thread
Input: gtco - bounds check collection indent level
Input: alps - don't handle ALPS cs19 trackpoint-only device
Input: synaptics - whitelist Lenovo T580 SMBus intertouch
Input: alps - fix a mismatch between a condition check and its comment
regulator: s2mps11: Fix buck7 and buck8 wrong voltages
arm64: tegra: Update Jetson TX1 GPU regulator timings
iwlwifi: pcie: don't service an interrupt that was masked
iwlwifi: pcie: fix ALIVE interrupt handling for gen2 devices w/o MSI-X
iwlwifi: don't WARN when calling iwl_get_shared_mem_conf with RF-Kill
iwlwifi: fix RF-Kill interrupt while FW load for gen2 devices
NFSv4: Handle the special Linux file open access mode
pnfs/flexfiles: Fix PTR_ERR() dereferences in ff_layout_track_ds_error
pNFS: Fix a typo in pnfs_update_layout
pnfs: Fix a problem where we gratuitously start doing I/O through the MDS
lib/scatterlist: Fix mapping iterator when sg->offset is greater than PAGE_SIZE
ASoC: dapm: Adapt for debugfs API change
raid5-cache: Need to do start() part job after adding journal device
ALSA: seq: Break too long mutex context in the write loop
ALSA: hda/realtek - Fixed Headphone Mic can't record on Dell platform
ALSA: hda/realtek: apply ALC891 headset fixup to one Dell machine
media: v4l2: Test type instead of cfg->type in v4l2_ctrl_new_custom()
media: coda: Remove unbalanced and unneeded mutex unlock
media: videobuf2-core: Prevent size alignment wrapping buffer size to 0
media: videobuf2-dma-sg: Prevent size from overflowing
KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed
arm64: tegra: Fix AGIC register range
fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.
kconfig: fix missing choice values in auto.conf
drm/nouveau/i2c: Enable i2c pads & busses during preinit
padata: use smp_mb in padata_reorder to avoid orphaned padata jobs
dm zoned: fix zone state management race
xen/events: fix binding user event channels to cpus
9p/xen: Add cleanup path in p9_trans_xen_init
9p/virtio: Add cleanup path in p9_virtio_init
x86/boot: Fix memory leak in default_get_smp_config()
perf/x86/intel: Fix spurious NMI on fixed counter
perf/x86/amd/uncore: Do not set 'ThreadMask' and 'SliceMask' for non-L3 PMCs
perf/x86/amd/uncore: Set the thread mask for F17h L3 PMCs
drm/edid: parse CEA blocks embedded in DisplayID
intel_th: pci: Add Ice Lake NNPI support
PCI: hv: Fix a use-after-free bug in hv_eject_device_work()
PCI: Do not poll for PME if the device is in D3cold
PCI: qcom: Ensure that PERST is asserted for at least 100 ms
Btrfs: fix data loss after inode eviction, renaming it, and fsync it
Btrfs: fix fsync not persisting dentry deletions due to inode evictions
Btrfs: add missing inode version, ctime and mtime updates when punching hole
IB/mlx5: Report correctly tag matching rendezvous capability
HID: wacom: generic: only switch the mode on devices with LEDs
HID: wacom: generic: Correct pad syncing
HID: wacom: correct touch resolution x/y typo
libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields
coda: pass the host file in vma->vm_file on mmap
include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures
xfs: fix pagecache truncation prior to reflink
xfs: flush removing page cache in xfs_reflink_remap_prep
xfs: don't overflow xattr listent buffer
xfs: rename m_inotbt_nores to m_finobt_nores
xfs: don't ever put nlink > 0 inodes on the unlinked list
xfs: reserve blocks for ifree transaction during log recovery
xfs: fix reporting supported extra file attributes for statx()
xfs: serialize unaligned dio writes against all other dio writes
xfs: abort unaligned nowait directio early
gpu: ipu-v3: ipu-ic: Fix saturation bit offset in TPMEM
crypto: caam - limit output IV to CBC to work around CTR mode DMA issue
parisc: Ensure userspace privilege for ptraced processes in regset functions
parisc: Fix kernel panic due invalid values in IAOQ0 or IAOQ1
powerpc/32s: fix suspend/resume when IBATs 4-7 are used
powerpc/watchpoint: Restore NV GPRs while returning from exception
powerpc/powernv/npu: Fix reference leak
powerpc/pseries: Fix oops in hotplug memory notifier
mmc: sdhci-msm: fix mutex while in spinlock
eCryptfs: fix a couple type promotion bugs
mtd: rawnand: mtk: Correct low level time calculation of r/w cycle
mtd: spinand: read returns badly if the last page has bitflips
intel_th: msu: Fix single mode with disabled IOMMU
Bluetooth: Add SMP workaround Microsoft Surface Precision Mouse bug
usb: Handle USB3 remote wakeup for LPM enabled devices correctly
blk-throttle: fix zero wait time for iops throttled group
blk-iolatency: clear use_delay when io.latency is set to zero
blkcg: update blkcg_print_stat() to handle larger outputs
net: mvmdio: allow up to four clocks to be specified for orion-mdio
dt-bindings: allow up to four clocks for orion-mdio
dm bufio: fix deadlock with loop device
Linux 4.19.61
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I2f565111b1c16f369fa86e0481527fcc6357fe1b
|
||
|
|
73efdc5d7d |
blk-iolatency: clear use_delay when io.latency is set to zero
commit 5de0073fcd50cc1f150895a7bb04d3cf8067b1d7 upstream.
If use_delay was non-zero when the latency target of a cgroup was set
to zero, it will stay stuck until io.latency is enabled on the cgroup
again. This keeps readahead disabled for the cgroup impacting
performance negatively.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Josef Bacik <jbacik@fb.com>
Fixes:
|
||
|
|
3ae98dc2db |
blk-iolatency: only account submitted bios
[ Upstream commit a3fb01ba5af066521f3f3421839e501bb2c71805 ] As is, iolatency recognizes done_bio and cleanup as ending paths. If a request is marked REQ_NOWAIT and fails to get a request, the bio is cleaned up via rq_qos_cleanup() and ended in bio_wouldblock_error(). This results in underflowing the inflight counter. Fix this by only accounting bios that were actually submitted. Signed-off-by: Dennis Zhou <dennis@kernel.org> Cc: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
10f41ccfc7 |
Merge 4.19.36 into android-4.19
Changes in 4.19.36 ARC: u-boot args: check that magic number is correct arc: hsdk_defconfig: Enable CONFIG_BLK_DEV_RAM inotify: Fix fsnotify_mark refcount leak in inotify_update_existing_watch() perf/core: Restore mmap record type correctly ext4: avoid panic during forced reboot ext4: add missing brelse() in add_new_gdb_meta_bg() ext4: report real fs size after failed resize ALSA: echoaudio: add a check for ioremap_nocache ALSA: sb8: add a check for request_region auxdisplay: hd44780: Fix memory leak on ->remove() drm/udl: use drm_gem_object_put_unlocked. IB/mlx4: Fix race condition between catas error reset and aliasguid flows i40iw: Avoid panic when handling the inetdev event mmc: davinci: remove extraneous __init annotation ALSA: opl3: fix mismatch between snd_opl3_drum_switch definition and declaration thermal/intel_powerclamp: fix __percpu declaration of worker_data thermal: samsung: Fix incorrect check after code merge thermal: bcm2835: Fix crash in bcm2835_thermal_debugfs thermal/int340x_thermal: Add additional UUIDs thermal/int340x_thermal: fix mode setting thermal/intel_powerclamp: fix truncated kthread name scsi: iscsi: flush running unbind operations when removing a session sched/cpufreq: Fix 32-bit math overflow sched/core: Fix buffer overflow in cgroup2 property cpu.max x86/mm: Don't leak kernel addresses tools/power turbostat: return the exit status of a command perf list: Don't forget to drop the reference to the allocated thread_map perf config: Fix an error in the config template documentation perf config: Fix a memory leak in collect_config() perf build-id: Fix memory leak in print_sdt_events() perf top: Fix error handling in cmd_top() perf hist: Add missing map__put() in error case perf evsel: Free evsel->counts in perf_evsel__exit() perf tests: Fix a memory leak of cpu_map object in the openat_syscall_event_on_all_cpus test perf tests: Fix memory leak by expr__find_other() in test__expr() perf tests: Fix a memory leak in test__perf_evsel__tp_sched_test() ACPI / utils: Drop reference in test for device presence PM / Domains: Avoid a potential deadlock blk-iolatency: #include "blk.h" drm/exynos/mixer: fix MIXER shadow registry synchronisation code irqchip/stm32: Don't clear rising/falling config registers at init irqchip/mbigen: Don't clear eventid when freeing an MSI x86/hpet: Prevent potential NULL pointer dereference x86/hyperv: Prevent potential NULL pointer dereference x86/cpu/cyrix: Use correct macros for Cyrix calls on Geode processors drm/nouveau/debugfs: Fix check of pm_runtime_get_sync failure iommu/vt-d: Check capability before disabling protected memory x86/hw_breakpoints: Make default case in hw_breakpoint_arch_parse() return an error fix incorrect error code mapping for OBJECTID_NOT_FOUND x86/gart: Exclude GART aperture from kcore ext4: prohibit fstrim in norecovery mode drm/cirrus: Use drm_framebuffer_put to avoid kernel oops in clean-up gpio: pxa: handle corner case of unprobed device rsi: improve kernel thread handling to fix kernel panic f2fs: fix to avoid NULL pointer dereference on se->discard_map 9p: do not trust pdu content for stat item size 9p locks: add mount option for lock retry interval ASoC: Fix UBSAN warning at snd_soc_get/put_volsw_sx() f2fs: fix to do sanity check with current segment number netfilter: xt_cgroup: shrink size of v2 path serial: uartps: console_setup() can't be placed to init section powerpc/pseries: Remove prrn_work workqueue media: au0828: cannot kfree dev before usb disconnect Bluetooth: Fix debugfs NULL pointer dereference HID: i2c-hid: override HID descriptors for certain devices pinctrl: core: make sure strcmp() doesn't get a null parameter ARM: samsung: Limit SAMSUNG_PM_CHECK config option to non-Exynos platforms usbip: fix vhci_hcd controller counting ACPI / SBS: Fix GPE storm on recent MacBookPro's HID: usbhid: Add quirk for Redragon/Dragonrise Seymur 2 KVM: nVMX: restore host state in nested_vmx_vmexit for VMFail compiler.h: update definition of unreachable() netfilter: nf_flow_table: remove flowtable hook flush routine in netns exit routine f2fs: cleanup dirty pages if recover failed net: stmmac: Set OWN bit for jumbo frames cifs: fallback to older infolevels on findfirst queryinfo retry kernel: hung_task.c: disable on suspend platform/x86: Add Intel AtomISP2 dummy / power-management driver drm/ttm: Fix bo_global and mem_global kfree error ALSA: hda: fix front speakers on Huawei MBXP ACPI: EC / PM: Disable non-wakeup GPEs for suspend-to-idle net/rds: fix warn in rds_message_alloc_sgs xfrm: destroy xfrm_state synchronously on net exit path crypto: sha256/arm - fix crash bug in Thumb2 build crypto: sha512/arm - fix crash bug in Thumb2 build net: ip6_gre: fix possible NULL pointer dereference in ip6erspan_set_version iommu/dmar: Fix buffer overflow during PCI bus notification scsi: core: Avoid that system resume triggers a kernel warning soc/tegra: pmc: Drop locking from tegra_powergate_is_powered() lkdtm: Print real addresses lkdtm: Add tests for NULL pointer dereference drm/panel: panel-innolux: set display off in innolux_panel_unprepare crypto: axis - fix for recursive locking from bottom half Revert "ACPI / EC: Remove old CLEAR_ON_RESUME quirk" coresight: cpu-debug: Support for CA73 CPUs PCI: Blacklist power management of Gigabyte X299 DESIGNARE EX PCIe ports drm/nouveau/volt/gf117: fix speedo readout register ARM: 8839/1: kprobe: make patch_lock a raw_spinlock_t drm/amdkfd: use init_mqd function to allocate object for hid_mqd (CI) appletalk: Fix use-after-free in atalk_proc_exit lib/div64.c: off by one in shift rxrpc: Fix client call connect/disconnect race f2fs: fix to dirty inode for i_mode recovery include/linux/swap.h: use offsetof() instead of custom __swapoffset macro bpf: fix use after free in bpf_evict_inode IB/hfi1: Failed to drain send queue when QP is put into error state mm: hide incomplete nr_indirectly_reclaimable in /proc/zoneinfo mm: hide incomplete nr_indirectly_reclaimable in sysfs appletalk: Fix compile regression Linux 4.19.36 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
bde271d1ad |
blk-iolatency: #include "blk.h"
[ Upstream commit 373e915cd8e84544609eced57a44fbc084f8d60f ]
This patch avoids that the following warning is reported when building
with W=1:
block/blk-iolatency.c:734:5: warning: no previous prototype for 'blk_iolatency_init' [-Wmissing-prototypes]
Cc: Josef Bacik <jbacik@fb.com>
Fixes:
|
||
|
|
2ba18b41d3 |
BACKPORT: sched: loadavg: consolidate LOAD_INT, LOAD_FRAC, CALC_LOAD
There are several definitions of those functions/macros in places that mess with fixed-point load averages. Provide an official version. [akpm@linux-foundation.org: fix missed conversion in block/blk-iolatency.c] Link: http://lkml.kernel.org/r/20180828172258.3185-5-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Suren Baghdasaryan <surenb@google.com> Tested-by: Daniel Drake <drake@endlessm.com> Cc: Christopher Lameter <cl@linux.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <jweiner@fb.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Enderborg <peter.enderborg@sony.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 8508cf3ffad4defa202b303e5b6379efc4cd9054) Conflicts: block/blk-iolatency.c (1. manual merge to replace stat->rqs.mean with stat.mean) Bug: 127712811 Test: lmkd in PSI mode Change-Id: I716b4874491cff75a2355c6d95c64cf02d05e7ee Signed-off-by: Suren Baghdasaryan <surenb@google.com> |
||
|
|
6d482bc569 |
blk-iolatency: fix IO hang due to negative inflight counter
[ Upstream commit 8c772a9bfc7c07c76f4a58b58910452fbb20843b ] Our test reported the following stack, and vmcore showed that ->inflight counter is -1. [ffffc9003fcc38d0] __schedule at ffffffff8173d95d [ffffc9003fcc3958] schedule at ffffffff8173de26 [ffffc9003fcc3970] io_schedule at ffffffff810bb6b6 [ffffc9003fcc3988] blkcg_iolatency_throttle at ffffffff813911cb [ffffc9003fcc3a20] rq_qos_throttle at ffffffff813847f3 [ffffc9003fcc3a48] blk_mq_make_request at ffffffff8137468a [ffffc9003fcc3b08] generic_make_request at ffffffff81368b49 [ffffc9003fcc3b68] submit_bio at ffffffff81368d7d [ffffc9003fcc3bb8] ext4_io_submit at ffffffffa031be00 [ext4] [ffffc9003fcc3c00] ext4_writepages at ffffffffa03163de [ext4] [ffffc9003fcc3d68] do_writepages at ffffffff811c49ae [ffffc9003fcc3d78] __filemap_fdatawrite_range at ffffffff811b6188 [ffffc9003fcc3e30] filemap_write_and_wait_range at ffffffff811b6301 [ffffc9003fcc3e60] ext4_sync_file at ffffffffa030cee8 [ext4] [ffffc9003fcc3ea8] vfs_fsync_range at ffffffff8128594b [ffffc9003fcc3ee8] do_fsync at ffffffff81285abd [ffffc9003fcc3f18] sys_fsync at ffffffff81285d50 [ffffc9003fcc3f28] do_syscall_64 at ffffffff81003c04 [ffffc9003fcc3f50] entry_SYSCALL_64_after_swapgs at ffffffff81742b8e The ->inflight counter may be negative (-1) if 1) blk-iolatency was disabled when the IO was issued, 2) blk-iolatency was enabled before this IO reached its endio, 3) the ->inflight counter is decreased from 0 to -1 in endio() In fact the hang can be easily reproduced by the below script, H=/sys/fs/cgroup/unified/ P=/sys/fs/cgroup/unified/test echo "+io" > $H/cgroup.subtree_control mkdir -p $P echo $$ > $P/cgroup.procs xfs_io -f -d -c "pwrite 0 4k" /dev/sdg echo "`cat /sys/block/sdg/dev` target=1000000" > $P/io.latency xfs_io -f -d -c "pwrite 0 4k" /dev/sdg This fixes the problem by freezing the queue so that while enabling/disabling iolatency, there is no inflight rq running. Note that quiesce_queue is not needed as this only updating iolatency configuration about which dispatching request_queue doesn't care. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
c480bcf97b |
block: make iolatency avg_lat exponentially decay
Currently, avg_lat is calculated by accumulating the mean of every window in a long running cumulative average. As time goes on, the metric becomes less and less useful due to the accumulated history. This patch reuses the same calculation done in load averages to make the avg_lat metric more lively. Unlike load averages, the avg only advances when a window elapses (due to an io). Idle periods extend the most recent window. Bucketing is used to limit the history of avg_lat by binding it to the window size. So, the window range for 1/exp (decay rate) is [1 min, 2.5 min) when windows elapse immediately. The current sample window size is exposed in the debug info to enable calculation of the window range. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
52a1199ccd |
blk-iolatency: fix blkg leak in timer_fn
At this point we have a ref on the blkg, we need to drop it if we don't have a iolat. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
71e9690b59 |
blk-iolatency: truncate our current time
In our longer tests we noticed that some boxes would degrade to the point of uselessness. This is because we truncate the current time when saving it in our bio, but I was using the raw current time to subtract from. So once the box had been up a certain amount of time it would appear as if our IO's were taking several years to complete. Fix this by truncating the current time so it matches the issue time. Verified this worked by running with this patch for a week on our test tier. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
d607eefa3b |
blk-iolatency: don't change the latency window
Early versions of these patches had us waiting for seconds at a time during submission, so we had to adjust the timing window we monitored for latency. Now we don't do things like that so this is unnecessary code. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
a284390b39 |
blk-iolatency: fix max_depth comparisons
max_depth used to be a u64, but I changed it to a unsigned int but didn't convert my comparisons over everywhere. Fix by using UINT_MAX everywhere instead of (u64)-1. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
88b7210c81 |
block: iolatency: avoid 64-bit division
On 32-bit architectures, dividing a 64-bit number needs to use the
do_div() function or something like it to avoid a link failure:
block/blk-iolatency.o: In function `iolatency_prfill_limit':
blk-iolatency.c:(.text+0x8cc): undefined reference to `__aeabi_uldivmod'
Using div_u64() gives us the best output and avoids the need for an
explicit cast.
Fixes:
|
||
|
|
d706751215 |
block: introduce blk-iolatency io controller
Current IO controllers for the block layer are less than ideal for our use case. The io.max controller is great at hard limiting, but it is not work conserving. This patch introduces io.latency. You provide a latency target for your group and we monitor the io in short windows to make sure we are not exceeding those latency targets. This makes use of the rq-qos infrastructure and works much like the wbt stuff. There are a few differences from wbt - It's bio based, so the latency covers the whole block layer in addition to the actual io. - We will throttle all IO types that comes in here if we need to. - We use the mean latency over the 100ms window. This is because writes can be particularly fast, which could give us a false sense of the impact of other workloads on our protected workload. - By default there's no throttling, we set the queue_depth to INT_MAX so that we can have as many outstanding bio's as we're allowed to. Only at throttle time do we pay attention to the actual queue depth. - We backcharge cgroups for root cg issued IO and induce artificial delays in order to deal with cases like metadata only or swap heavy workloads. In testing this has worked out relatively well. Protected workloads will throttle noisy workloads down to 1 io at time if they are doing normal IO on their own, or induce up to a 1 second delay per syscall if they are doing a lot of root issued IO (metadata/swap IO). Our testing has revolved mostly around our production web servers where we have hhvm (the web server application) in a protected group and everything else in another group. We see slightly higher requests per second (RPS) on the test tier vs the control tier, and much more stable RPS across all machines in the test tier vs the control tier. Another test we run is a slow memory allocator in the unprotected group. Before this would eventually push us into swap and cause the whole box to die and not recover at all. With these patches we see slight RPS drops (usually 10-15%) before the memory consumer is properly killed and things recover within seconds. Signed-off-by: Josef Bacik <jbacik@fb.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |