9c20bfd64d3f79d929a37d70e8aaaad5405d4d6e
744 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
24a799db09 |
Merge 4.19.297 into android-4.19-stable
Changes in 4.19.297 indirect call wrappers: helpers to speed-up indirect calls of builtin net: use indirect calls helpers at the socket layer net: fix kernel-doc warnings for socket.c net: prevent rewrite of msg_name in sock_sendmsg() RDMA/cxgb4: Check skb value for failure to allocate HID: logitech-hidpp: Fix kernel crash on receiver USB disconnect quota: Fix slow quotaoff net: prevent address rewrite in kernel_bind() drm: etvnaviv: fix bad backport leading to warning drm/msm/dsi: skip the wait for video mode done if not applicable ieee802154: ca8210: Fix a potential UAF in ca8210_probe xen-netback: use default TX queue size for vifs drm/vmwgfx: fix typo of sizeof argument ixgbe: fix crash with empty VF macvlan list net: nfc: fix races in nfc_llcp_sock_get() and nfc_llcp_sock_get_sn() nfc: nci: assert requested protocol is valid workqueue: Override implicit ordered attribute in workqueue_apply_unbound_cpumask() sched,idle,rcu: Push rcu_idle deeper into the idle path dmaengine: stm32-mdma: abort resume if no ongoing transfer usb: xhci: xhci-ring: Use sysdev for mapping bounce buffer net: usb: dm9601: fix uninitialized variable use in dm9601_mdio_read usb: dwc3: Soft reset phy on probe for host usb: musb: Get the musb_qh poniter after musb_giveback usb: musb: Modify the "HWVers" register address iio: pressure: bmp280: Fix NULL pointer exception iio: pressure: ms5611: ms5611_prom_is_valid false negative bug mcb: remove is_added flag from mcb_device struct ceph: fix incorrect revoked caps assert in ceph_fill_file_size() Input: powermate - fix use-after-free in powermate_config_complete Input: psmouse - fix fast_reconnect function for PS/2 mode Input: xpad - add PXN V900 support cgroup: Remove duplicates in cgroup v1 tasks file pinctrl: avoid unsafe code pattern in find_pinctrl() x86/cpu: Fix AMD erratum #1485 on Zen4-based CPUs usb: gadget: udc-xilinx: replace memcpy with memcpy_toio usb: gadget: ncm: Handle decoding of multiple NTB's in unwrap call powerpc/64e: Fix wrong test in __ptep_test_and_clear_young() x86/alternatives: Disable KASAN in apply_alternatives() dev_forward_skb: do not scrub skb mark within the same name space usb: hub: Guard against accesses to uninitialized BOS descriptors Bluetooth: hci_event: Ignore NULL link key Bluetooth: Reject connection with the device which has same BD_ADDR Bluetooth: Fix a refcnt underflow problem for hci_conn Bluetooth: vhci: Fix race when opening vhci device Bluetooth: hci_event: Fix coding style Bluetooth: avoid memcmp() out of bounds warning nfc: nci: fix possible NULL pointer dereference in send_acknowledge() regmap: fix NULL deref on lookup KVM: x86: Mask LVTPC when handling a PMI netfilter: nft_payload: fix wrong mac header matching xfrm: fix a data-race in xfrm_gen_index() xfrm: interface: use DEV_STATS_INC() net: ipv4: fix return value check in esp_remove_trailer net: ipv6: fix return value check in esp_remove_trailer net: rfkill: gpio: prevent value glitch during probe tcp: fix excessive TLP and RACK timeouts from HZ rounding tcp: tsq: relax tcp_small_queue_check() when rtx queue contains a single skb net: usb: smsc95xx: Fix an error code in smsc95xx_reset() i40e: prevent crash on probe if hw registers have invalid values net/sched: sch_hfsc: upgrade 'rt' to 'sc' when it becomes a inner curve netfilter: nft_set_rbtree: .deactivate fails if element has expired net: pktgen: Fix interface flags printing libceph: fix unaligned accesses in ceph_entity_addr handling libceph: use kernel_connect() ARM: dts: ti: omap: Fix noisy serial with overrun-throttle-ms for mapphone btrfs: return -EUCLEAN for delayed tree ref with a ref count not equals to 1 btrfs: initialize start_slot in btrfs_log_prealloc_extents i2c: mux: Avoid potential false error message in i2c_mux_add_adapter overlayfs: set ctime when setting mtime and atime gpio: timberdale: Fix potential deadlock on &tgpio->lock ata: libata-eh: Fix compilation warning in ata_eh_link_report() tracing: relax trace_event_eval_update() execution with cond_resched() HID: holtek: fix slab-out-of-bounds Write in holtek_kbd_input_event Bluetooth: Avoid redundant authentication Bluetooth: hci_core: Fix build warnings wifi: mac80211: allow transmitting EAPOL frames with tainted key wifi: cfg80211: avoid leaking stack data into trace sky2: Make sure there is at least one frag_addr available drm: panel-orientation-quirks: Add quirk for One Mix 2S btrfs: fix some -Wmaybe-uninitialized warnings in ioctl.c Bluetooth: hci_event: Fix using memcmp when comparing keys mtd: rawnand: qcom: Unmap the right resource upon probe failure mtd: spinand: micron: correct bitmask for ecc status mmc: core: Capture correct oemid-bits for eMMC cards Revert "pinctrl: avoid unsafe code pattern in find_pinctrl()" ACPI: irq: Fix incorrect return value in acpi_register_gsi() USB: serial: option: add Telit LE910C4-WWX 0x1035 composition USB: serial: option: add entry for Sierra EM9191 with new firmware USB: serial: option: add Fibocom to DELL custom modem FM101R-GL perf: Disallow mis-matched inherited group reads s390/pci: fix iommu bitmap allocation gpio: vf610: set value before the direction to avoid a glitch ASoC: pxa: fix a memory leak in probe() phy: mapphone-mdm6600: Fix runtime PM for remove Bluetooth: hci_sock: fix slab oob read in create_monitor_event Bluetooth: hci_sock: Correctly bounds check and pad HCI_MON_NEW_INDEX name xfrm6: fix inet6_dev refcount underflow problem Linux 4.19.297 Change-Id: I495e8b8fbb6416ec3f94094fa905bdde364618b4 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
6d022a7abf |
tcp: fix excessive TLP and RACK timeouts from HZ rounding
commit 1c2709cfff1dedbb9591e989e2f001484208d914 upstream.
We discovered from packet traces of slow loss recovery on kernels with
the default HZ=250 setting (and min_rtt < 1ms) that after reordering,
when receiving a SACKed sequence range, the RACK reordering timer was
firing after about 16ms rather than the desired value of roughly
min_rtt/4 + 2ms. The problem is largely due to the RACK reorder timer
calculation adding in TCP_TIMEOUT_MIN, which is 2 jiffies. On kernels
with HZ=250, this is 2*4ms = 8ms. The TLP timer calculation has the
exact same issue.
This commit fixes the TLP transmit timer and RACK reordering timer
floor calculation to more closely match the intended 2ms floor even on
kernels with HZ=250. It does this by adding in a new
TCP_TIMEOUT_MIN_US floor of 2000 us and then converting to jiffies,
instead of the current approach of converting to jiffies and then
adding th TCP_TIMEOUT_MIN value of 2 jiffies.
Our testing has verified that on kernels with HZ=1000, as expected,
this does not produce significant changes in behavior, but on kernels
with the default HZ=250 the latency improvement can be large. For
example, our tests show that for HZ=250 kernels at low RTTs this fix
roughly halves the latency for the RACK reorder timer: instead of
mostly firing at 16ms it mostly fires at 8ms.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Fixes:
|
||
|
|
7f2e810705 |
Merge 4.19.296 into android-4.19-stable
Changes in 4.19.296
NFS/pNFS: Report EINVAL errors from connect() to the server
ata: ahci: Drop pointless VPRINTK() calls and convert the remaining ones
ata: libahci: clear pending interrupt status
netfilter: nf_tables: disallow element removal on anonymous sets
selftests/tls: Add {} to avoid static checker warning
selftests: tls: swap the TX and RX sockets in some tests
ipv4: fix null-deref in ipv4_link_failure
powerpc/perf/hv-24x7: Update domain value check
net: hns3: add 5ms delay before clear firmware reset irq source
net: add atomic_long_t to net_device_stats fields
net: bridge: use DEV_STATS_INC()
team: fix null-ptr-deref when team device type is changed
gpio: tb10x: Fix an error handling path in tb10x_gpio_probe()
i2c: mux: demux-pinctrl: check the return value of devm_kstrdup()
Input: i8042 - add quirk for TUXEDO Gemini 17 Gen1/Clevo PD70PN
scsi: qla2xxx: Add protection mask module parameters
scsi: qla2xxx: Remove unsupported ql2xenabledif option
scsi: megaraid_sas: Load balance completions across all MSI-X
scsi: megaraid_sas: Fix deadlock on firmware crashdump
ext4: remove the 'group' parameter of ext4_trim_extent
ext4: add new helper interface ext4_try_to_trim_range()
ext4: scope ret locally in ext4_try_to_trim_range()
ext4: change s_last_trim_minblks type to unsigned long
ext4: mark group as trimmed only if it was fully scanned
ext4: replace the traditional ternary conditional operator with with max()/min()
ext4: move setting of trimmed bit into ext4_try_to_trim_range()
ext4: do not let fstrim block system suspend
MIPS: Alchemy: only build mmc support helpers if au1xmmc is enabled
clk: tegra: fix error return case for recalc_rate
ARM: dts: ti: omap: motorola-mapphone: Fix abe_clkctrl warning on boot
gpio: pmic-eic-sprd: Add can_sleep flag for PMIC EIC chip
parisc: sba: Fix compile warning wrt list of SBA devices
parisc: iosapic.c: Fix sparse warnings
parisc: drivers: Fix sparse warning
parisc: irq: Make irq_stack_union static to avoid sparse warning
selftests/ftrace: Correctly enable event in instance-event.tc
ring-buffer: Avoid softlockup in ring_buffer_resize()
ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset()
bpf: Clarify error expectations from bpf_clone_redirect
fbdev/sh7760fb: Depend on FB=y
nvme-pci: do not set the NUMA node of device if it has none
watchdog: iTCO_wdt: No need to stop the timer in probe
watchdog: iTCO_wdt: Set NO_REBOOT if the watchdog is not already running
net: Fix unwanted sign extension in netdev_stats_to_stats64()
scsi: megaraid_sas: Enable msix_load_balance for Invader and later controllers
Smack:- Use overlay inode label in smack_inode_copy_up()
smack: Retrieve transmuting information in smack_inode_getsecurity()
smack: Record transmuting in smk_transmuted
serial: 8250_port: Check IRQ data before use
nilfs2: fix potential use after free in nilfs_gccache_submit_read_data()
ALSA: hda: Disable power save for solving pop issue on Lenovo ThinkCentre M70q
ata: libata-scsi: ignore reserved bits for REPORT SUPPORTED OPERATION CODES
i2c: i801: unregister tco_pdev in i801_probe() error path
btrfs: properly report 0 avail for very full file systems
net: thunderbolt: Fix TCPv6 GSO checksum calculation
ata: libata-core: Fix ata_port_request_pm() locking
ata: libata-core: Fix port and device removal
ata: libata-core: Do not register PM operations for SAS ports
ata: libata-sata: increase PMP SRST timeout to 10s
fs: binfmt_elf_efpic: fix personality for ELF-FDPIC
ext4: fix rec_len verify error
ata: libata: disallow dev-initiated LPM transitions to unsupported states
Revert "drivers core: Use sysfs_emit and sysfs_emit_at for show(device *...) functions"
media: dvb: symbol fixup for dvb_attach() - again
Revert "PCI: qcom: Disable write access to read only registers for IP v2.3.3"
scsi: zfcp: Fix a double put in zfcp_port_enqueue()
qed/red_ll2: Fix undefined behavior bug in struct qed_ll2_info
wifi: mwifiex: Fix tlv_buf_left calculation
net: replace calls to sock->ops->connect() with kernel_connect()
ubi: Refuse attaching if mtd's erasesize is 0
wifi: mwifiex: Fix oob check condition in mwifiex_process_rx_packet
drivers/net: process the result of hdlc_open() and add call of hdlc_close() in uhdlc_close()
regmap: rbtree: Fix wrong register marked as in-cache when creating new node
scsi: target: core: Fix deadlock due to recursive locking
modpost: add missing else to the "of" check
ipv4, ipv6: Fix handling of transhdrlen in __ip{,6}_append_data()
net: usb: smsc75xx: Fix uninit-value access in __smsc75xx_read_reg
net: stmmac: dwmac-stm32: fix resume on STM32 MCU
tcp: fix quick-ack counting to count actual ACKs of new data
tcp: fix delayed ACKs for MSS boundary condition
sctp: update transport state when processing a dupcook packet
sctp: update hb timer immediately after users change hb_interval
cpupower: add Makefile dependencies for install targets
IB/mlx4: Fix the size of a buffer in add_port_entries()
gpio: aspeed: fix the GPIO number passed to pinctrl_gpio_set_config()
gpio: pxa: disable pinctrl calls for MMP_GPIO
RDMA/cma: Fix truncation compilation warning in make_cma_ports
RDMA/mlx5: Fix NULL string error
parisc: Restore __ldcw_align for PA-RISC 2.0 processors
dccp: fix dccp_v4_err()/dccp_v6_err() again
Revert "rtnetlink: Reject negative ifindexes in RTM_NEWLINK"
rtnetlink: Reject negative ifindexes in RTM_NEWLINK
xen/events: replace evtchn_rwlock with RCU
Linux 4.19.296
Change-Id: I4f76959ead91691fe9a242cb5b158fc8dc67bf39
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
|
||
|
|
b86bfa8334 |
tcp: fix quick-ack counting to count actual ACKs of new data
[ Upstream commit 059217c18be6757b95bfd77ba53fb50b48b8a816 ]
This commit fixes quick-ack counting so that it only considers that a
quick-ack has been provided if we are sending an ACK that newly
acknowledges data.
The code was erroneously using the number of data segments in outgoing
skbs when deciding how many quick-ack credits to remove. This logic
does not make sense, and could cause poor performance in
request-response workloads, like RPC traffic, where requests or
responses can be multi-segment skbs.
When a TCP connection decides to send N quick-acks, that is to
accelerate the cwnd growth of the congestion control module
controlling the remote endpoint of the TCP connection. That quick-ack
decision is purely about the incoming data and outgoing ACKs. It has
nothing to do with the outgoing data or the size of outgoing data.
And in particular, an ACK only serves the intended purpose of allowing
the remote congestion control to grow the congestion window quickly if
the ACK is ACKing or SACKing new data.
The fix is simple: only count packets as serving the goal of the
quickack mechanism if they are ACKing/SACKing new data. We can tell
whether this is the case by checking inet_csk_ack_scheduled(), since
we schedule an ACK exactly when we are ACKing/SACKing new data.
Fixes:
|
||
|
|
501b721387 |
Merge 4.19.295 into android-4.19-stable
Changes in 4.19.295
erofs: ensure that the post-EOF tails are all zeroed
ARM: pxa: remove use of symbol_get()
mmc: au1xmmc: force non-modular build and remove symbol_get usage
rtc: ds1685: use EXPORT_SYMBOL_GPL for ds1685_rtc_poweroff
modules: only allow symbol_get of EXPORT_SYMBOL_GPL modules
USB: serial: option: add Quectel EM05G variant (0x030e)
USB: serial: option: add FOXCONN T99W368/T99W373 product
HID: wacom: remove the battery when the EKR is off
Bluetooth: btsdio: fix use after free bug in btsdio_remove due to race condition
serial: sc16is7xx: fix bug when first setting GPIO direction
fsi: master-ast-cf: Add MODULE_FIRMWARE macro
nilfs2: fix general protection fault in nilfs_lookup_dirty_data_buffers()
nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse
pinctrl: amd: Don't show `Invalid config param` errors
9p: virtio: make sure 'offs' is initialized in zc_request
ASoC: da7219: Flush pending AAD IRQ when suspending
ASoC: da7219: Check for failure reading AAD IRQ events
ethernet: atheros: fix return value check in atl1c_tso_csum()
vxlan: generalize vxlan_parse_gpe_hdr and remove unused args
m68k: Fix invalid .section syntax
s390/dasd: use correct number of retries for ERP requests
s390/dasd: fix hanging device after request requeue
fs/nls: make load_nls() take a const parameter
ASoc: codecs: ES8316: Fix DMIC config
ASoC: atmel: Fix the 8K sample parameter in I2SC master
platform/x86: intel: hid: Always call BTNL ACPI method
security: keys: perform capable check only on privileged operations
net: usb: qmi_wwan: add Quectel EM05GV2
idmaengine: make FSL_EDMA and INTEL_IDMA64 depends on HAS_IOMEM
scsi: qedi: Fix potential deadlock on &qedi_percpu->p_work_lock
netlabel: fix shift wrapping bug in netlbl_catmap_setlong()
bnx2x: fix page fault following EEH recovery
sctp: handle invalid error codes without calling BUG()
cifs: add a warning when the in-flight count goes negative
ALSA: seq: oss: Fix racy open/close of MIDI devices
net: Avoid address overwrite in kernel_connect
powerpc/32: Include .branch_lt in data section
powerpc/32s: Fix assembler warning about r0
udf: Check consistency of Space Bitmap Descriptor
udf: Handle error when adding extent to a file
Revert "net: macsec: preserve ingress frame ordering"
reiserfs: Check the return value from __getblk()
eventfd: Export eventfd_ctx_do_read()
eventfd: prevent underflow for eventfd semaphores
new helper: lookup_positive_unlocked()
netfilter: nft_flow_offload: fix underflow in flowtable reference counter
netfilter: nf_tables: missing NFT_TRANS_PREPARE_ERROR in flowtable deactivatation
fs: Fix error checking for d_hash_and_lookup()
cpufreq: powernow-k8: Use related_cpus instead of cpus in driver.exit()
bpf: Clear the probe_addr for uprobe
tcp: tcp_enter_quickack_mode() should be static
regmap: rbtree: Use alloc_flags for memory allocations
spi: tegra20-sflash: fix to check return value of platform_get_irq() in tegra_sflash_probe()
can: gs_usb: gs_usb_receive_bulk_callback(): count RX overflow errors also in case of OOM
wifi: mwifiex: Fix OOB and integer underflow when rx packets
mwifiex: drop 'set_consistent_dma_mask' log message
mwifiex: switch from 'pci_' to 'dma_' API
wifi: mwifiex: fix error recovery in PCIE buffer descriptor management
Bluetooth: nokia: fix value check in nokia_bluetooth_serdev_probe()
crypto: caam - fix unchecked return value error
lwt: Check LWTUNNEL_XMIT_CONTINUE strictly
fs: ocfs2: namei: check return value of ocfs2_add_entry()
wifi: mwifiex: fix memory leak in mwifiex_histogram_read()
wifi: mwifiex: Fix missed return in oob checks failed path
wifi: ath9k: fix races between ath9k_wmi_cmd and ath9k_wmi_ctrl_rx
wifi: ath9k: protect WMI command response buffer replacement with a lock
wifi: mwifiex: avoid possible NULL skb pointer dereference
wifi: ath9k: use IS_ERR() with debugfs_create_dir()
net: arcnet: Do not call kfree_skb() under local_irq_disable()
net/sched: sch_hfsc: Ensure inner classes have fsc curve
netrom: Deny concurrent connect().
quota: add dqi_dirty_list description to comment of Dquot List Management
quota: avoid increasing DQST_LOOKUPS when iterating over dirty/inuse list
quota: factor out dquot_write_dquot()
quota: rename dquot_active() to inode_quota_active()
quota: add new helper dquot_active()
quota: fix dqput() to follow the guarantees dquot_srcu should provide
arm64: dts: msm8996: thermal: Add interrupt support
arm64: dts: qcom: msm8996: Add missing interrupt to the USB2 controller
drm/amdgpu: avoid integer overflow warning in amdgpu_device_resize_fb_bar()
ARM: dts: BCM5301X: Harmonize EHCI/OHCI DT nodes name
ARM: dts: BCM53573: Describe on-SoC BCM53125 rev 4 switch
ARM: dts: BCM53573: Drop nonexistent #usb-cells
ARM: dts: BCM53573: Add cells sizes to PCIe node
ARM: dts: BCM53573: Use updated "spi-gpio" binding properties
ARM: dts: s3c6410: move fixed clocks under root node in Mini6410
ARM: dts: s3c6410: align node SROM bus node name with dtschema in Mini6410
ARM: dts: s3c64xx: align pinctrl with dtschema
ARM: dts: samsung: s3c6410-mini6410: correct ethernet reg addresses (split)
ARM: dts: s5pv210: add RTC 32 KHz clock in SMDKV210
ARM: dts: s5pv210: use defines for IRQ flags in SMDKV210
ARM: dts: s5pv210: correct ethernet unit address in SMDKV210
ARM: dts: s5pv210: add dummy 5V regulator for backlight on SMDKv210
ARM: dts: samsung: s5pv210-smdkv210: correct ethernet reg addresses (split)
drm: adv7511: Fix low refresh rate register for ADV7533/5
ARM: dts: BCM53573: Fix Ethernet info for Luxul devices
drm/tegra: Remove superfluous error messages around platform_get_irq()
drm/tegra: dpaux: Fix incorrect return value of platform_get_irq
of: unittest: fix null pointer dereferencing in of_unittest_find_node_by_name()
drm/msm: Replace drm_framebuffer_{un/reference} with put, get functions
drm/msm/mdp5: Don't leak some plane state
smackfs: Prevent underflow in smk_set_cipso()
audit: fix possible soft lockup in __audit_inode_child()
of: unittest: Fix overlay type in apply/revert check
ALSA: ac97: Fix possible error value of *rac97
drivers: clk: keystone: Fix parameter judgment in _of_pll_clk_init()
clk: sunxi-ng: Modify mismatched function name
PCI: Mark NVIDIA T4 GPUs to avoid bus reset
PCI: pciehp: Use RMW accessors for changing LNKCTL
PCI/ASPM: Use RMW accessors for changing LNKCTL
PCI/ATS: Add pci_prg_resp_pasid_required() interface.
PCI: Cleanup register definition width and whitespace
PCI: Decode PCIe 32 GT/s link speed
PCI: Add #defines for Enter Compliance, Transmit Margin
drm/amdgpu: Correct Transmit Margin masks
drm/amdgpu: Replace numbers with PCI_EXP_LNKCTL2 definitions
drm/amdgpu: Prefer pcie_capability_read_word()
drm/amdgpu: Use RMW accessors for changing LNKCTL
drm/radeon: Correct Transmit Margin masks
drm/radeon: Replace numbers with PCI_EXP_LNKCTL2 definitions
drm/radeon: Prefer pcie_capability_read_word()
drm/radeon: Use RMW accessors for changing LNKCTL
wifi: ath10k: Use RMW accessors for changing LNKCTL
nfs/blocklayout: Use the passed in gfp flags
powerpc/iommu: Fix notifiers being shared by PCI and VIO buses
jfs: validate max amount of blocks before allocation.
fs: lockd: avoid possible wrong NULL parameter
NFSD: da_addr_body field missing in some GETDEVICEINFO replies
media: Use of_node_name_eq for node name comparisons
media: v4l2-fwnode: fix v4l2_fwnode_parse_link handling
media: v4l2-fwnode: simplify v4l2_fwnode_parse_link
media: v4l2-core: Fix a potential resource leak in v4l2_fwnode_parse_link()
drivers: usb: smsusb: fix error handling code in smsusb_init_device
media: dib7000p: Fix potential division by zero
media: dvb-usb: m920x: Fix a potential memory leak in m920x_i2c_xfer()
media: cx24120: Add retval check for cx24120_message_send()
media: mediatek: vcodec: Return NULL if no vdec_fb is found
usb: phy: mxs: fix getting wrong state with mxs_phy_is_otg_host()
scsi: iscsi: Add strlen() check in iscsi_if_set{_host}_param()
scsi: be2iscsi: Add length check when parsing nlattrs
scsi: qla4xxx: Add length check when parsing nlattrs
x86/APM: drop the duplicate APM_MINOR_DEV macro
scsi: qedf: Do not touch __user pointer in qedf_dbg_stop_io_on_error_cmd_read() directly
scsi: qedf: Do not touch __user pointer in qedf_dbg_fp_int_cmd_read() directly
dma-buf/sync_file: Fix docs syntax
IB/uverbs: Fix an potential error pointer dereference
media: go7007: Remove redundant if statement
USB: gadget: f_mass_storage: Fix unused variable warning
media: i2c: ov2680: Set V4L2_CTRL_FLAG_MODIFY_LAYOUT on flips
media: ov2680: Remove auto-gain and auto-exposure controls
media: ov2680: Fix ov2680_bayer_order()
media: ov2680: Fix vflip / hflip set functions
media: ov2680: Fix regulators being left enabled on ov2680_power_on() errors
cgroup:namespace: Remove unused cgroup_namespaces_init()
scsi: core: Use 32-bit hostnum in scsi_host_lookup()
scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock
serial: tegra: handle clk prepare error in tegra_uart_hw_init()
amba: bus: fix refcount leak
Revert "IB/isert: Fix incorrect release of isert connection"
HID: multitouch: Correct devm device reference for hidinput input_dev name
rpmsg: glink: Add check for kstrdup
arch: um: drivers: Kconfig: pedantic formatting
um: Fix hostaudio build errors
dmaengine: ste_dma40: Add missing IRQ check in d40_probe
igmp: limit igmpv3_newpack() packet size to IP_MAX_MTU
netfilter: ipset: add the missing IP_SET_HASH_WITH_NET0 macro for ip_set_hash_netportnet.c
netfilter: xt_u32: validate user space input
netfilter: xt_sctp: validate the flag_info count
skbuff: skb_segment, Call zero copy functions before using skbuff frags
igb: set max size RX buffer when store bad packet is enabled
PM / devfreq: Fix leak in devfreq_dev_release()
ALSA: pcm: Fix missing fixup call in compat hw_refine ioctl
ipmi_si: fix a memleak in try_smi_init()
ARM: OMAP2+: Fix -Warray-bounds warning in _pwrdm_state_switch()
backlight/gpio_backlight: Compare against struct fb_info.device
backlight/bd6107: Compare against struct fb_info.device
backlight/lv5207lp: Compare against struct fb_info.device
media: dvb: symbol fixup for dvb_attach()
ntb: Drop packets when qp link is down
ntb: Clean up tx tail index on link down
ntb: Fix calculation ntb_transport_tx_free_entry()
Revert "PCI: Mark NVIDIA T4 GPUs to avoid bus reset"
procfs: block chmod on /proc/thread-self/comm
parisc: Fix /proc/cpuinfo output for lscpu
dlm: fix plock lookup when using multiple lockspaces
dccp: Fix out of bounds access in DCCP error handler
crypto: stm32 - fix loop iterating through scatterlist for DMA
cpufreq: brcmstb-avs-cpufreq: Fix -Warray-bounds bug
X.509: if signature is unsupported skip validation
net: handle ARPHRD_PPP in dev_is_mac_header_xmit()
pstore/ram: Check start of empty przs during init
PCI/ATS: Add inline to pci_prg_resp_pasid_required()
sc16is7xx: Set iobase to device index
serial: sc16is7xx: fix broken port 0 uart init
usb: typec: tcpci: clear the fault status bit
udf: initialize newblock to 0
scsi: qla2xxx: fix inconsistent TMF timeout
scsi: qla2xxx: Turn off noisy message log
fbdev/ep93xx-fb: Do not assign to struct fb_info.dev
drm/ast: Fix DRAM init on AST2200
parisc: led: Fix LAN receive and transmit LEDs
parisc: led: Reduce CPU overhead for disk & lan LED computation
clk: qcom: gcc-mdm9615: use proper parent for pll0_vote clock
soc: qcom: qmi_encdec: Restrict string length in decode
NFSv4/pnfs: minor fix for cleanup path in nfs4_get_device_info
kconfig: fix possible buffer overflow
x86/virt: Drop unnecessary check on extended CPUID level in cpu_has_svm()
watchdog: intel-mid_wdt: add MODULE_ALIAS() to allow auto-load
pwm: lpc32xx: Remove handling of PWM channels
net: read sk->sk_family once in sk_mc_loop()
igb: disable virtualization features on 82580
veth: Fixing transmit return status for dropped packets
net: ipv6/addrconf: avoid integer underflow in ipv6_create_tempaddr
af_unix: Fix data-races around user->unix_inflight.
af_unix: Fix data-race around unix_tot_inflight.
af_unix: Fix data-races around sk->sk_shutdown.
af_unix: Fix data race around sk->sk_err.
net: sched: sch_qfq: Fix UAF in qfq_dequeue()
kcm: Destroy mutex in kcm_exit_net()
igbvf: Change IGBVF_MIN to allow set rx/tx value between 64 and 80
igb: Change IGB_MIN to allow set rx/tx value between 64 and 80
idr: fix param name in idr_alloc_cyclic() doc
netfilter: nfnetlink_osf: avoid OOB read
ata: sata_gemini: Add missing MODULE_DESCRIPTION
ata: pata_ftide010: Add missing MODULE_DESCRIPTION
btrfs: don't start transaction when joining with TRANS_JOIN_NOSTART
mtd: rawnand: brcmnand: Fix crash during the panic_write
mtd: rawnand: brcmnand: Fix potential out-of-bounds access in oob write
mtd: rawnand: brcmnand: Fix potential false time out warning
perf hists browser: Fix hierarchy mode header
net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_hwlro_get_fdir_all()
kcm: Fix memory leak in error path of kcm_sendmsg()
ixgbe: fix timestamp configuration code
kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg().
drm/amd/display: Fix a bug when searching for insert_above_mpcc
parisc: Drop loops_per_jiffy from per_cpu struct
autofs: fix memory leak of waitqueues in autofs_catatonic_mode
btrfs: output extra debug info if we failed to find an inline backref
ACPICA: Add AML_NO_OPERAND_RESOLVE flag to Timer
ACPI: video: Add backlight=native DMI quirk for Lenovo Ideapad Z470
hw_breakpoint: fix single-stepping when using bpf_overflow_handler
wifi: ath9k: fix printk specifier
wifi: mwifiex: fix fortify warning
crypto: lib/mpi - avoid null pointer deref in mpi_cmp_ui()
tpm_tis: Resend command to recover from data transfer errors
alx: fix OOB-read compiler warning
drm/exynos: fix a possible null-pointer dereference due to data race in exynos_drm_crtc_atomic_disable()
md: raid1: fix potential OOB in raid1_remove_disk()
ext2: fix datatype of block number in ext2_xattr_set2()
fs/jfs: prevent double-free in dbUnmount() after failed jfs_remount()
jfs: fix invalid free of JFS_IP(ipimap)->i_imap in diUnmount
powerpc/pseries: fix possible memory leak in ibmebus_bus_init()
media: dvb-usb-v2: af9035: Fix null-ptr-deref in af9035_i2c_master_xfer
media: dw2102: Fix null-ptr-deref in dw2102_i2c_transfer()
media: af9005: Fix null-ptr-deref in af9005_i2c_xfer
media: anysee: fix null-ptr-deref in anysee_master_xfer
media: az6007: Fix null-ptr-deref in az6007_i2c_xfer()
iio: core: Use min() instead of min_t() to make code more robust
media: tuners: qt1010: replace BUG_ON with a regular error
media: pci: cx23885: replace BUG with error return
usb: gadget: fsl_qe_udc: validate endpoint index for ch9 udc
scsi: target: iscsi: Fix buffer overflow in lio_target_nacl_info_show()
serial: cpm_uart: Avoid suspicious locking
media: pci: ipu3-cio2: Initialise timing struct to avoid a compiler warning
kobject: Add sanity check for kset->kobj.ktype in kset_register()
md/raid1: fix error: ISO C90 forbids mixed declarations
attr: block mode changes of symlinks
btrfs: fix lockdep splat and potential deadlock after failure running delayed items
nfsd: fix change_info in NFSv4 RENAME replies
mtd: rawnand: brcmnand: Fix ECC level field setting for v7.2 controller
net/sched: cls_fw: No longer copy tcf_result on update to avoid use-after-free
net/sched: Retire rsvp classifier
Linux 4.19.295
Change-Id: I5de88dc1e8cebe5736df3023205233cb40c4aa35
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
|
||
|
|
0f9ca28a3a |
tcp: tcp_enter_quickack_mode() should be static
[ Upstream commit 03b123debcbc8db987bda17ed8412cc011064c22 ] After commit |
||
|
|
813e482b1b |
Merge 4.19.291 into android-4.19-stable
Changes in 4.19.291
gfs2: Don't deref jdesc in evict
x86/smp: Use dedicated cache-line for mwait_play_dead()
video: imsttfb: check for ioremap() failures
fbdev: imsttfb: Fix use after free bug in imsttfb_probe
drm/edid: Fix uninitialized variable in drm_cvt_modes()
scripts/tags.sh: Resolve gtags empty index generation
drm/amdgpu: Validate VM ioctl flags.
treewide: Remove uninitialized_var() usage
md/raid10: check slab-out-of-bounds in md_bitmap_get_counter
md/raid10: fix overflow of md/safe_mode_delay
md/raid10: fix wrong setting of max_corr_read_errors
md/raid10: fix io loss while replacement replace rdev
irqchip/jcore-aic: Kill use of irq_create_strict_mappings()
irqchip/jcore-aic: Fix missing allocation of IRQ descriptors
clocksource/drivers: Unify the names to timer-* format
clocksource/drivers/cadence-ttc: Use ttc driver as platform driver
clocksource/drivers/cadence-ttc: Fix memory leak in ttc_timer_probe
PM: domains: fix integer overflow issues in genpd_parse_state()
ARM: 9303/1: kprobes: avoid missing-declaration warnings
evm: Complete description of evm_inode_setattr()
wifi: ath9k: fix AR9003 mac hardware hang check register offset calculation
wifi: ath9k: avoid referencing uninit memory in ath9k_wmi_ctrl_rx
samples/bpf: Fix buffer overflow in tcp_basertt
wifi: mwifiex: Fix the size of a memory allocation in mwifiex_ret_802_11_scan()
nfc: constify several pointers to u8, char and sk_buff
nfc: llcp: fix possible use of uninitialized variable in nfc_llcp_send_connect()
wifi: orinoco: Fix an error handling path in spectrum_cs_probe()
wifi: orinoco: Fix an error handling path in orinoco_cs_probe()
wifi: atmel: Fix an error handling path in atmel_probe()
wl3501_cs: Fix a bunch of formatting issues related to function docs
wl3501_cs: Remove unnecessary NULL check
wl3501_cs: Fix misspelling and provide missing documentation
net: create netdev->dev_addr assignment helpers
wl3501_cs: use eth_hw_addr_set()
wifi: wl3501_cs: Fix an error handling path in wl3501_probe()
wifi: ray_cs: Utilize strnlen() in parse_addr()
wifi: ray_cs: Drop useless status variable in parse_addr()
wifi: ray_cs: Fix an error handling path in ray_probe()
wifi: ath9k: don't allow to overwrite ENDPOINT0 attributes
wifi: rsi: Do not set MMC_PM_KEEP_POWER in shutdown
watchdog/perf: define dummy watchdog_update_hrtimer_threshold() on correct config
watchdog/perf: more properly prevent false positives with turbo modes
kexec: fix a memory leak in crash_shrink_memory()
memstick r592: make memstick_debug_get_tpc_name() static
wifi: ath9k: Fix possible stall on ath9k_txq_list_has_key()
wifi: ath9k: convert msecs to jiffies where needed
netlink: fix potential deadlock in netlink_set_err()
netlink: do not hard code device address lenth in fdb dumps
gtp: Fix use-after-free in __gtp_encap_destroy().
lib/ts_bm: reset initial match offset for every block of text
netfilter: nf_conntrack_sip: fix the ct_sip_parse_numerical_param() return value.
ipvlan: Fix return value of ipvlan_queue_xmit()
netlink: Add __sock_i_ino() for __netlink_diag_dump().
radeon: avoid double free in ci_dpm_init()
Input: drv260x - sleep between polling GO bit
ARM: dts: BCM5301X: Drop "clock-names" from the SPI node
Input: adxl34x - do not hardcode interrupt trigger type
drm/panel: simple: fix active size for Ampire AM-480272H3TMQW-T01H
ARM: ep93xx: fix missing-prototype warnings
ASoC: es8316: Increment max value for ALC Capture Target Volume control
soc/fsl/qe: fix usb.c build errors
IB/hfi1: Fix sdma.h tx->num_descs off-by-one errors
arm64: dts: renesas: ulcb-kf: Remove flow control for SCIF1
fbdev: omapfb: lcd_mipid: Fix an error handling path in mipid_spi_probe()
drm/radeon: fix possible division-by-zero errors
ALSA: ac97: Fix possible NULL dereference in snd_ac97_mixer
scsi: 3w-xxxx: Add error handling for initialization failure in tw_probe()
PCI: Add pci_clear_master() stub for non-CONFIG_PCI
pinctrl: cherryview: Return correct value if pin in push-pull mode
perf dwarf-aux: Fix off-by-one in die_get_varname()
pinctrl: at91-pio4: check return value of devm_kasprintf()
hwrng: virtio - add an internal buffer
hwrng: virtio - don't wait on cleanup
hwrng: virtio - don't waste entropy
hwrng: virtio - always add a pending request
hwrng: virtio - Fix race on data_avail and actual data
crypto: nx - fix build warnings when DEBUG_FS is not enabled
modpost: fix section mismatch message for R_ARM_ABS32
modpost: fix section mismatch message for R_ARM_{PC24,CALL,JUMP24}
ARCv2: entry: comments about hardware auto-save on taken interrupts
ARCv2: entry: push out the Z flag unclobber from common EXCEPTION_PROLOGUE
ARCv2: entry: avoid a branch
ARCv2: entry: rewrite to enable use of double load/stores LDD/STD
ARC: define ASM_NL and __ALIGN(_STR) outside #ifdef __ASSEMBLY__ guard
USB: serial: option: add LARA-R6 01B PIDs
block: change all __u32 annotations to __be32 in affs_hardblocks.h
w1: fix loop in w1_fini()
sh: j2: Use ioremap() to translate device tree address into kernel memory
media: usb: Check az6007_read() return value
media: videodev2.h: Fix struct v4l2_input tuner index comment
media: usb: siano: Fix warning due to null work_func_t function pointer
extcon: Fix kernel doc of property fields to avoid warnings
extcon: Fix kernel doc of property capability fields to avoid warnings
usb: phy: phy-tahvo: fix memory leak in tahvo_usb_probe()
mfd: rt5033: Drop rt5033-battery sub-device
KVM: s390: fix KVM_S390_GET_CMMA_BITS for GFNs in memslot holes
mfd: intel-lpss: Add missing check for platform_get_resource
mfd: stmpe: Only disable the regulators if they are enabled
rtc: st-lpc: Release some resources in st_rtc_probe() in case of error
sctp: fix potential deadlock on &net->sctp.addr_wq_lock
Add MODULE_FIRMWARE() for FIRMWARE_TG357766.
spi: bcm-qspi: return error if neither hif_mspi nor mspi is available
mailbox: ti-msgmgr: Fill non-message tx data fields with 0x0
f2fs: fix error path handling in truncate_dnode()
powerpc: allow PPC_EARLY_DEBUG_CPM only when SERIAL_CPM=y
net: bridge: keep ports without IFF_UNICAST_FLT in BR_PROMISC mode
tcp: annotate data races in __tcp_oow_rate_limited()
net/sched: act_pedit: Add size check for TCA_PEDIT_PARMS_EX
sh: dma: Fix DMA channel offset calculation
i2c: xiic: Defer xiic_wakeup() and __xiic_start_xfer() in xiic_process()
i2c: xiic: Don't try to handle more interrupt events after error
ALSA: jack: Fix mutex call in snd_jack_report()
NFSD: add encoding of op_recall flag for write delegation
mmc: core: disable TRIM on Kingston EMMC04G-M627
mmc: core: disable TRIM on Micron MTFC4GACAJCN-1M
bcache: Remove unnecessary NULL point check in node allocations
integrity: Fix possible multiple allocation in integrity_inode_get()
jffs2: reduce stack usage in jffs2_build_xattr_subsystem()
btrfs: fix race when deleting quota root from the dirty cow roots list
ARM: orion5x: fix d2net gpio initialization
spi: spi-fsl-spi: remove always-true conditional in fsl_spi_do_one_msg
spi: spi-fsl-spi: relax message sanity checking a little
spi: spi-fsl-spi: allow changing bits_per_word while CS is still active
netfilter: nf_tables: fix nat hook table deletion
netfilter: nf_tables: add rescheduling points during loop detection walks
netfilter: nftables: add helper function to set the base sequence number
netfilter: add helper function to set up the nfnetlink header and use it
netfilter: nf_tables: use net_generic infra for transaction data
netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE
netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain
netfilter: nf_tables: reject unbound anonymous set before commit phase
netfilter: nf_tables: unbind non-anonymous set if rule construction fails
netfilter: nf_tables: fix scheduling-while-atomic splat
netfilter: conntrack: Avoid nf_ct_helper_hash uses after free
netfilter: nf_tables: prevent OOB access in nft_byteorder_eval
net: lan743x: Don't sleep in atomic context
workqueue: clean up WORK_* constant types, clarify masking
net: mvneta: fix txq_map in case of txq_number==1
vrf: Increment Icmp6InMsgs on the original netdev
icmp6: Fix null-ptr-deref of ip6_null_entry->rt6i_idev in icmp6_dev().
udp6: fix udp6_ehashfn() typo
ntb: idt: Fix error handling in idt_pci_driver_init()
NTB: amd: Fix error handling in amd_ntb_pci_driver_init()
ntb: intel: Fix error handling in intel_ntb_pci_driver_init()
NTB: ntb_transport: fix possible memory leak while device_register() fails
NTB: ntb_tool: Add check for devm_kcalloc
ipv6/addrconf: fix a potential refcount underflow for idev
wifi: airo: avoid uninitialized warning in airo_get_rate()
net/sched: make psched_mtu() RTNL-less safe
pinctrl: amd: Fix mistake in handling clearing pins at startup
pinctrl: amd: Detect internal GPIO0 debounce handling
pinctrl: amd: Only use special debounce behavior for GPIO 0
tpm: tpm_vtpm_proxy: fix a race condition in /dev/vtpmx creation
net: bcmgenet: Ensure MDIO unregistration has clocks enabled
SUNRPC: Fix UAF in svc_tcp_listen_data_ready()
perf intel-pt: Fix CYC timestamps after standalone CBR
ext4: fix wrong unit use in ext4_mb_clear_bb
ext4: only update i_reserved_data_blocks on successful block allocation
jfs: jfs_dmap: Validate db_l2nbperpage while mounting
PCI/PM: Avoid putting EloPOS E2/S2/H2 PCIe Ports in D3cold
PCI: Add function 1 DMA alias quirk for Marvell 88SE9235
PCI: qcom: Disable write access to read only registers for IP v2.3.3
PCI: rockchip: Assert PCI Configuration Enable bit after probe
PCI: rockchip: Write PCI Device ID to correct register
PCI: rockchip: Add poll and timeout to wait for PHY PLLs to be locked
PCI: rockchip: Fix legacy IRQ generation for RK3399 PCIe endpoint core
PCI: rockchip: Use u32 variable to access 32-bit registers
misc: pci_endpoint_test: Free IRQs before removing the device
misc: pci_endpoint_test: Re-init completion for every test
md/raid0: add discard support for the 'original' layout
fs: dlm: return positive pid value for F_GETLK
serial: atmel: don't enable IRQs prematurely
hwrng: imx-rngc - fix the timeout for init and self check
ceph: don't let check_caps skip sending responses for revoke msgs
meson saradc: fix clock divider mask length
Revert "8250: add support for ASIX devices with a FIFO bug"
tty: serial: samsung_tty: Fix a memory leak in s3c24xx_serial_getclk() in case of error
tty: serial: samsung_tty: Fix a memory leak in s3c24xx_serial_getclk() when iterating clk
ring-buffer: Fix deadloop issue on reading trace_pipe
xtensa: ISS: fix call to split_if_spec
scsi: qla2xxx: Wait for io return on terminate rport
scsi: qla2xxx: Fix potential NULL pointer dereference
scsi: qla2xxx: Check valid rport returned by fc_bsg_to_rport()
scsi: qla2xxx: Pointer may be dereferenced
drm/atomic: Fix potential use-after-free in nonblocking commits
tracing/histograms: Add histograms to hist_vars if they have referenced variables
perf probe: Add test for regression introduced by switch to die_get_decl_file()
fuse: revalidate: don't invalidate if interrupted
can: bcm: Fix UAF in bcm_proc_show()
ext4: correct inline offset when handling xattrs in inode body
debugobjects: Recheck debug_objects_enabled before reporting
nbd: Add the maximum limit of allocated index in nbd_dev_add
md: fix data corruption for raid456 when reshape restart while grow up
md/raid10: prevent soft lockup while flush writes
posix-timers: Ensure timer ID search-loop limit is valid
sched/fair: Don't balance task to its current running CPU
bpf: Address KCSAN report on bpf_lru_list
wifi: wext-core: Fix -Wstringop-overflow warning in ioctl_standard_iw_point()
wifi: iwlwifi: mvm: avoid baid size integer overflow
igb: Fix igb_down hung on surprise removal
spi: bcm63xx: fix max prepend length
fbdev: imxfb: warn about invalid left/right margin
pinctrl: amd: Use amd_pinconf_set() for all config options
net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()/cpsw_ale_set_field()
net:ipv6: check return value of pskb_trim()
Revert "tcp: avoid the lookup process failing to get sk in ehash table"
fbdev: au1200fb: Fix missing IRQ check in au1200fb_drv_probe
llc: Don't drop packet from non-root netns.
netfilter: nf_tables: fix spurious set element insertion failure
netfilter: nf_tables: can't schedule in nft_chain_validate
net: Replace the limit of TCP_LINGER2 with TCP_FIN_TIMEOUT_MAX
tcp: annotate data-races around tp->linger2
tcp: annotate data-races around rskq_defer_accept
tcp: annotate data-races around tp->notsent_lowat
tcp: annotate data-races around fastopenq.max_qlen
tracing/histograms: Return an error if we fail to add histogram to hist_vars list
gpio: tps68470: Make tps68470_gpio_output() always set the initial value
bcache: use MAX_CACHES_PER_SET instead of magic number 8 in __bch_bucket_alloc_set
bcache: remove 'int n' from parameter list of bch_bucket_alloc_set()
bcache: Fix __bch_btree_node_alloc to make the failure behavior consistent
btrfs: fix extent buffer leak after tree mod log failure at split_node()
ext4: rename journal_dev to s_journal_dev inside ext4_sb_info
ext4: Fix reusing stale buffer heads from last failed mounting
PCI: Rework pcie_retrain_link() wait loop
PCI/ASPM: Return 0 or -ETIMEDOUT from pcie_retrain_link()
PCI/ASPM: Factor out pcie_wait_for_retrain()
PCI/ASPM: Avoid link retraining race
dlm: cleanup plock_op vs plock_xop
dlm: rearrange async condition return
fs: dlm: interrupt posix locks only when process is killed
ftrace: Add information on number of page groups allocated
ftrace: Check if pages were allocated before calling free_pages()
ftrace: Store the order of pages allocated in ftrace_page
ftrace: Fix possible warning on checking all pages used in ftrace_process_locs()
scsi: qla2xxx: Fix inconsistent format argument type in qla_os.c
scsi: qla2xxx: Array index may go out of bound
ext4: fix to check return value of freeze_bdev() in ext4_shutdown()
i40e: Fix an NULL vs IS_ERR() bug for debugfs_create_dir()
phy: hisilicon: Fix an out of bounds check in hisi_inno_phy_probe()
ethernet: atheros: fix return value check in atl1e_tso_csum()
ipv6 addrconf: fix bug where deleting a mngtmpaddr can create a new temporary address
tcp: Reduce chance of collisions in inet6_hashfn().
bonding: reset bond's flags when down link is P2P device
team: reset team's flags when down link is P2P device
platform/x86: msi-laptop: Fix rfkill out-of-sync on MSI Wind U100
net/sched: mqprio: refactor nlattr parsing to a separate function
net/sched: mqprio: add extack to mqprio_parse_nlattr()
net/sched: mqprio: Add length check for TCA_MQPRIO_{MAX/MIN}_RATE64
benet: fix return value check in be_lancer_xmit_workarounds()
RDMA/mlx4: Make check for invalid flags stricter
drm/msm: Fix IS_ERR_OR_NULL() vs NULL check in a5xx_submit_in_rb()
ASoC: fsl_spdif: Silence output on stop
block: Fix a source code comment in include/uapi/linux/blkzoned.h
dm raid: fix missing reconfig_mutex unlock in raid_ctr() error paths
ata: pata_ns87415: mark ns87560_tf_read static
ring-buffer: Fix wrong stat of cpu_buffer->read
tracing: Fix warning in trace_buffered_event_disable()
USB: serial: option: support Quectel EM060K_128
USB: serial: option: add Quectel EC200A module support
USB: serial: simple: add Kaufmann RKS+CAN VCP
USB: serial: simple: sort driver entries
can: gs_usb: gs_can_close(): add missing set of CAN state to CAN_STATE_STOPPED
Revert "usb: dwc3: core: Enable AutoRetry feature in the controller"
usb: dwc3: pci: skip BYT GPIO lookup table for hardwired phy
usb: dwc3: don't reset device side if dwc3 was configured as host-only
usb: ohci-at91: Fix the unhandle interrupt when resume
USB: quirks: add quirk for Focusrite Scarlett
usb: xhci-mtk: set the dma max_seg_size
Documentation: security-bugs.rst: update preferences when dealing with the linux-distros group
Documentation: security-bugs.rst: clarify CVE handling
staging: ks7010: potential buffer overflow in ks_wlan_set_encode_ext()
hwmon: (nct7802) Fix for temp6 (PECI1) processed even if PECI1 disabled
btrfs: check for commit error at btrfs_attach_transaction_barrier()
tpm_tis: Explicitly check for error code
irq-bcm6345-l1: Do not assume a fixed block to cpu mapping
serial: 8250_dw: split Synopsys DesignWare 8250 common functions
serial: 8250_dw: Preserve original value of DLF register
virtio-net: fix race between set queues and probe
s390/dasd: fix hanging device after quiesce/resume
ASoC: wm8904: Fill the cache for WM8904_ADC_TEST_0 register
dm cache policy smq: ensure IO doesn't prevent cleaner policy progress
drm/client: Fix memory leak in drm_client_target_cloned
net/sched: cls_fw: Fix improper refcount update leads to use-after-free
net/sched: sch_qfq: account for stab overhead in qfq_enqueue
ASoC: cs42l51: fix driver to properly autoload with automatic module loading
net/sched: cls_u32: Fix reference counter leak leading to overflow
perf: Fix function pointer case
loop: Select I/O scheduler 'none' from inside add_disk()
word-at-a-time: use the same return type for has_zero regardless of endianness
KVM: s390: fix sthyi error handling
net/mlx5e: fix return value check in mlx5e_ipsec_remove_trailer()
perf test uprobe_from_different_cu: Skip if there is no gcc
net: sched: cls_u32: Fix match key mis-addressing
net: add missing data-race annotations around sk->sk_peek_off
net: add missing data-race annotation for sk_ll_usec
net/sched: cls_u32: No longer copy tcf_result on update to avoid use-after-free
net/sched: cls_route: No longer copy tcf_result on update to avoid use-after-free
ip6mr: Fix skb_under_panic in ip6mr_cache_report()
tcp_metrics: fix addr_same() helper
tcp_metrics: annotate data-races around tm->tcpm_stamp
tcp_metrics: annotate data-races around tm->tcpm_lock
tcp_metrics: annotate data-races around tm->tcpm_vals[]
tcp_metrics: annotate data-races around tm->tcpm_net
tcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen
scsi: zfcp: Defer fc_rport blocking until after ADISC response
libceph: fix potential hang in ceph_osdc_notify()
USB: zaurus: Add ID for A-300/B-500/C-700
fs/sysv: Null check to prevent null-ptr-deref bug
Bluetooth: L2CAP: Fix use-after-free in l2cap_sock_ready_cb
net: usbnet: Fix WARNING in usbnet_start_xmit/usb_submit_urb
ext2: Drop fragment support
test_firmware: fix a memory leak with reqs buffer
test_firmware: return ENOMEM instead of ENOSPC on failed memory allocation
mtd: rawnand: omap_elm: Fix incorrect type in assignment
powerpc/mm/altmap: Fix altmap boundary check
PM / wakeirq: support enabling wake-up irq after runtime_suspend called
PM: sleep: wakeirq: fix wake irq arming
ARM: dts: imx6sll: Make ssi node name same as other platforms
ARM: dts: imx: add usb alias
ARM: dts: imx6sll: fixup of operating points
ARM: dts: nxp/imx6sll: fix wrong property name in usbphy node
drivers core: Use sysfs_emit and sysfs_emit_at for show(device *...) functions
arm64: dts: stratix10: fix incorrect I2C property for SCL signal
drm/edid: fix objtool warning in drm_cvt_modes()
Linux 4.19.291
Change-Id: I4f78e25efd18415989ecf5e227a17e05b0d6386c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
|
||
|
|
73e2b8a15e |
tcp: annotate data-races around tp->notsent_lowat
[ Upstream commit 1aeb87bc1440c5447a7fa2d6e3c2cca52cbd206b ]
tp->notsent_lowat can be read locklessly from do_tcp_getsockopt()
and tcp_poll().
Fixes:
|
||
|
|
457202c473 |
net: Replace the limit of TCP_LINGER2 with TCP_FIN_TIMEOUT_MAX
[ Upstream commit f0628c524fd188c3f9418e12478dfdfadacba815 ] This patch changes the behavior of TCP_LINGER2 about its limit. The sysctl_tcp_fin_timeout used to be the limit of TCP_LINGER2 but now it's only the default value. A new macro named TCP_FIN_TIMEOUT_MAX is added as the limit of TCP_LINGER2, which is 2 minutes. Since TCP_LINGER2 used sysctl_tcp_fin_timeout as the default value and the limit in the past, the system administrator cannot set the default value for most of sockets and let some sockets have a greater timeout. It might be a mistake that let the sysctl to be the limit of the TCP_LINGER2. Maybe we can add a new sysctl to set the max of TCP_LINGER2, but FIN-WAIT-2 timeout is usually no need to be too long and 2 minutes are legal considering TCP specs. Changes in v3: - Remove the new socket option and change the TCP_LINGER2 behavior so that the timeout can be set to value between sysctl_tcp_fin_timeout and 2 minutes. Changes in v2: - Add int overflow check for the new socket option. Changes in v1: - Add a new socket option to set timeout greater than sysctl_tcp_fin_timeout. Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: 9df5335ca974 ("tcp: annotate data-races around tp->linger2") Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
2d7501f8af |
Revert "tcp: reduce POLLOUT events caused by TCP_NOTSENT_LOWAT"
This reverts commit
|
||
|
|
9b73585f8e |
Revert "tcp: factor out __tcp_close() helper"
This reverts commit
|
||
|
|
4e2cad2c2a |
Merge 4.19.284 into android-4.19-stable
Changes in 4.19.284
net: Fix load-tearing on sk->sk_stamp in sock_recv_cmsgs().
netlink: annotate accesses to nlk->cb_running
net: annotate sk->sk_err write from do_recvmmsg()
tcp: reduce POLLOUT events caused by TCP_NOTSENT_LOWAT
tcp: return EPOLLOUT from tcp_poll only when notsent_bytes is half the limit
tcp: factor out __tcp_close() helper
tcp: add annotations around sk->sk_shutdown accesses
ipvlan:Fix out-of-bounds caused by unclear skb->cb
net: datagram: fix data-races in datagram_poll()
af_unix: Fix a data race of sk->sk_receive_queue->qlen.
af_unix: Fix data races around sk->sk_shutdown.
fs: hfsplus: remove WARN_ON() from hfsplus_cat_{read,write}_inode()
drm/amd/display: Use DC_LOG_DC in the trasform pixel function
regmap: cache: Return error in cache sync operations for REGCACHE_NONE
memstick: r592: Fix UAF bug in r592_remove due to race condition
firmware: arm_sdei: Fix sleep from invalid context BUG
ACPI: EC: Fix oops when removing custom query handlers
drm/tegra: Avoid potential 32-bit integer overflow
ACPICA: Avoid undefined behavior: applying zero offset to null pointer
ACPICA: ACPICA: check null return of ACPI_ALLOCATE_ZEROED in acpi_db_display_objects
wifi: brcmfmac: cfg80211: Pass the PMK in binary instead of hex
ext2: Check block size validity during mount
net: pasemi: Fix return type of pasemi_mac_start_tx()
net: Catch invalid index in XPS mapping
lib: cpu_rmap: Avoid use after free on rmap->obj array entries
scsi: message: mptlan: Fix use after free bug in mptlan_remove() due to race condition
gfs2: Fix inode height consistency check
ext4: set goal start correctly in ext4_mb_normalize_request
ext4: Fix best extent lstart adjustment logic in ext4_mb_new_inode_pa()
f2fs: fix to drop all dirty pages during umount() if cp_error is set
wifi: iwlwifi: dvm: Fix memcpy: detected field-spanning write backtrace
Bluetooth: L2CAP: fix "bad unlock balance" in l2cap_disconnect_rsp
staging: rtl8192e: Replace macro RTL_PCI_DEVICE with PCI_DEVICE
HID: logitech-hidpp: Don't use the USB serial for USB devices
HID: logitech-hidpp: Reconcile USB and Unifying serials
spi: spi-imx: fix MX51_ECSPI_* macros when cs > 3
HID: wacom: generic: Set battery quirk only when we see battery data
usb: typec: tcpm: fix multiple times discover svids error
serial: 8250: Reinit port->pm on port specific driver unbind
mcb-pci: Reallocate memory region to avoid memory overlapping
sched: Fix KCSAN noinstr violation
recordmcount: Fix memory leaks in the uwrite function
clk: tegra20: fix gcc-7 constant overflow warning
Input: xpad - add constants for GIP interface numbers
phy: st: miphy28lp: use _poll_timeout functions for waits
mfd: dln2: Fix memory leak in dln2_probe()
btrfs: replace calls to btrfs_find_free_ino with btrfs_find_free_objectid
btrfs: fix space cache inconsistency after error loading it from disk
cpupower: Make TSC read per CPU for Mperf monitor
af_key: Reject optional tunnel/BEET mode templates in outbound policies
net: fec: Better handle pm_runtime_get() failing in .remove()
vsock: avoid to close connected socket after the timeout
drivers: provide devm_platform_ioremap_resource()
serial: arc_uart: fix of_iomap leak in `arc_serial_probe`
ip6_gre: Fix skb_under_panic in __gre6_xmit()
ip6_gre: Make o_seqno start from 0 in native mode
ip_gre, ip6_gre: Fix race condition on o_seqno in collect_md mode
erspan: get the proto with the md version for collect_md
media: netup_unidvb: fix use-after-free at del_timer()
drm/exynos: fix g2d_open/close helper function definitions
net: nsh: Use correct mac_offset to unwind gso skb in nsh_gso_segment()
net: bcmgenet: Remove phy_stop() from bcmgenet_netif_stop()
net: bcmgenet: Restore phy_stop() depending upon suspend/close
cassini: Fix a memory leak in the error handling path of cas_init_one()
igb: fix bit_shift to be in [1..8] range
vlan: fix a potential uninit-value in vlan_dev_hard_start_xmit()
usb-storage: fix deadlock when a scsi command timeouts more than once
usb: typec: altmodes/displayport: fix pin_assignment_show
ALSA: hda: Fix Oops by 9.1 surround channel names
ALSA: hda: Add NVIDIA codec IDs a3 through a7 to patch table
statfs: enforce statfs[64] structure initialization
serial: Add support for Advantech PCI-1611U card
ceph: force updating the msg pointer in non-split case
tpm/tpm_tis: Disable interrupts for more Lenovo devices
nilfs2: fix use-after-free bug of nilfs_root in nilfs_evict_inode()
netfilter: nftables: add nft_parse_register_load() and use it
netfilter: nftables: add nft_parse_register_store() and use it
netfilter: nftables: statify nft_parse_register()
netfilter: nf_tables: validate registers coming from userspace.
netfilter: nf_tables: add nft_setelem_parse_key()
netfilter: nf_tables: allow up to 64 bytes in the set element data area
netfilter: nf_tables: stricter validation of element data
netfilter: nf_tables: validate NFTA_SET_ELEM_OBJREF based on NFT_SET_OBJECT flag
netfilter: nf_tables: do not allow RULE_ID to refer to another chain
HID: wacom: Force pen out of prox if no events have been received in a while
Add Acer Aspire Ethos 8951G model quirk
ALSA: hda/realtek - More constifications
ALSA: hda/realtek - Add Headset Mic supported for HP cPC
ALSA: hda/realtek - Enable headset mic of Acer X2660G with ALC662
ALSA: hda/realtek - Enable the headset of Acer N50-600 with ALC662
ALSA: hda/realtek - The front Mic on a HP machine doesn't work
ALSA: hda/realtek: Fix the mic type detection issue for ASUS G551JW
ALSA: hda/realtek - Add headset Mic support for Lenovo ALC897 platform
ALSA: hda/realtek - ALC897 headset MIC no sound
ALSA: hda/realtek: Add a quirk for HP EliteDesk 805
lib/string_helpers: Introduce string_upper() and string_lower() helpers
usb: gadget: u_ether: Convert prints to device prints
usb: gadget: u_ether: Fix host MAC address case
vc_screen: rewrite vcs_size to accept vc, not inode
vc_screen: reload load of struct vc_data pointer in vcs_write() to avoid UAF
s390/qdio: get rid of register asm
s390/qdio: fix do_sqbs() inline assembly constraint
spi: spi-fsl-spi: automatically adapt bits-per-word in cpu mode
spi: fsl-spi: Re-organise transfer bits_per_word adaptation
spi: fsl-cpm: Use 16 bit mode for large transfers with even size
ALSA: hda/ca0132: add quirk for EVGA X299 DARK
m68k: Move signal frame following exception on 68020/030
parisc: Allow to reboot machine after system halt
btrfs: use nofs when cleaning up aborted transactions
x86/mm: Avoid incomplete Global INVLPG flushes
selftests/memfd: Fix unknown type name build failure
parisc: Fix flush_dcache_page() for usage from irq context
ALSA: hda/realtek - Fixed one of HP ALC671 platform Headset Mic supported
ALSA: hda/realtek - Fix inverted bass GPIO pin on Acer 8951G
udplite: Fix NULL pointer dereference in __sk_mem_raise_allocated().
USB: core: Add routines for endpoint checks in old drivers
USB: sisusbvga: Add endpoint checks
media: radio-shark: Add endpoint checks
net: fix skb leak in __skb_tstamp_tx()
bpf: Fix mask generation for 32-bit narrow loads of 64-bit fields
ipv6: Fix out-of-bounds access in ipv6_find_tlv()
power: supply: leds: Fix blink to LED on transition
power: supply: bq27xxx: Fix bq27xxx_battery_update() race condition
power: supply: bq27xxx: Fix I2C IRQ race on remove
power: supply: bq27xxx: Fix poll_interval handling and races on remove
power: supply: sbs-charger: Fix INHIBITED bit for Status reg
coresight: Fix signedness bug in tmc_etr_buf_insert_barrier_packet()
xen/pvcalls-back: fix double frees with pvcalls_new_active_socket()
x86/show_trace_log_lvl: Ensure stack pointer is aligned, again
ASoC: Intel: Skylake: Fix declaration of enum skl_ch_cfg
forcedeth: Fix an error handling path in nv_probe()
3c589_cs: Fix an error handling path in tc589_probe()
drivers: depend on HAS_IOMEM for devm_platform_ioremap_resource()
Linux 4.19.284
Change-Id: I88843be551e748e295ea608158a2db7ab4486a65
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
|
||
|
|
e62806034c |
tcp: factor out __tcp_close() helper
[ Upstream commit 77c3c95637526f1e4330cc9a4b2065f668c2c4fe ] unlocked version of protocol level close, will be used by MPTCP to allow decouple orphaning and subflow level close. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: e14cadfd80d7 ("tcp: add annotations around sk->sk_shutdown accesses") Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
0d70e638ab |
tcp: reduce POLLOUT events caused by TCP_NOTSENT_LOWAT
[ Upstream commit a74f0fa082b76c6a76cba5672f36218518bfdc09 ] TCP_NOTSENT_LOWAT socket option or sysctl was added in linux-3.12 as a step to enable bigger tcp sndbuf limits. It works reasonably well, but the following happens : Once the limit is reached, TCP stack generates an [E]POLLOUT event for every incoming ACK packet. This causes a high number of context switches. This patch implements the strategy David Miller added in sock_def_write_space() : - If TCP socket has a notsent_lowat constraint of X bytes, allow sendmsg() to fill up to X bytes, but send [E]POLLOUT only if number of notsent bytes is below X/2 This considerably reduces TCP_NOTSENT_LOWAT overhead, while allowing to keep the pipe full. Tested: 100 ms RTT netem testbed between A and B, 100 concurrent TCP_STREAM A:/# cat /proc/sys/net/ipv4/tcp_wmem 4096 262144 64000000 A:/# super_netperf 100 -H B -l 1000 -- -K bbr & A:/# grep TCP /proc/net/sockstat TCP: inuse 203 orphan 0 tw 19 alloc 414 mem 1364904 # This is about 54 MB of memory per flow :/ A:/# vmstat 5 5 procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 0 256220672 13532 694976 0 0 10 0 28 14 0 1 99 0 0 2 0 0 256320016 13532 698480 0 0 512 0 715901 5927 0 10 90 0 0 0 0 0 256197232 13532 700992 0 0 735 13 771161 5849 0 11 89 0 0 1 0 0 256233824 13532 703320 0 0 512 23 719650 6635 0 11 89 0 0 2 0 0 256226880 13532 705780 0 0 642 4 775650 6009 0 12 88 0 0 A:/# echo 2097152 >/proc/sys/net/ipv4/tcp_notsent_lowat A:/# grep TCP /proc/net/sockstat TCP: inuse 203 orphan 0 tw 19 alloc 414 mem 86411 # 3.5 MB per flow A:/# vmstat 5 5 # check that context switches have not inflated too much. procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 2 0 0 260386512 13592 662148 0 0 10 0 17 14 0 1 99 0 0 0 0 0 260519680 13592 604184 0 0 512 13 726843 12424 0 10 90 0 0 1 1 0 260435424 13592 598360 0 0 512 25 764645 12925 0 10 90 0 0 1 0 0 260855392 13592 578380 0 0 512 7 722943 13624 0 11 88 0 0 1 0 0 260445008 13592 601176 0 0 614 34 772288 14317 0 10 90 0 0 Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: e14cadfd80d7 ("tcp: add annotations around sk->sk_shutdown accesses") Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
c2ad6ebaaa |
Revert "tcp/udp: Make early_demux back namespacified."
This reverts commit
|
||
|
|
7162f05f1f |
tcp/udp: Make early_demux back namespacified.
commit 11052589cf5c0bab3b4884d423d5f60c38fcf25d upstream. Commit |
||
|
|
a434d10e7a |
tcp: fix tcp_cwnd_validate() to not forget is_cwnd_limited
[ Upstream commit f4ce91ce12a7c6ead19b128ffa8cff6e3ded2a14 ]
This commit fixes a bug in the tracking of max_packets_out and
is_cwnd_limited. This bug can cause the connection to fail to remember
that is_cwnd_limited is true, causing the connection to fail to grow
cwnd when it should, causing throughput to be lower than it should be.
The following event sequence is an example that triggers the bug:
(a) The connection is cwnd_limited, but packets_out is not at its
peak due to TSO deferral deciding not to send another skb yet.
In such cases the connection can advance max_packets_seq and set
tp->is_cwnd_limited to true and max_packets_out to a small
number.
(b) Then later in the round trip the connection is pacing-limited (not
cwnd-limited), and packets_out is larger. In such cases the
connection would raise max_packets_out to a bigger number but
(unexpectedly) flip tp->is_cwnd_limited from true to false.
This commit fixes that bug.
One straightforward fix would be to separately track (a) the next
window after max_packets_out reaches a maximum, and (b) the next
window after tp->is_cwnd_limited is set to true. But this would
require consuming an extra u32 sequence number.
Instead, to save space we track only the most important
information. Specifically, we track the strongest available signal of
the degree to which the cwnd is fully utilized:
(1) If the connection is cwnd-limited then we remember that fact for
the current window.
(2) If the connection not cwnd-limited then we track the maximum
number of outstanding packets in the current window.
In particular, note that the new logic cannot trigger the buggy
(a)/(b) sequence above because with the new logic a condition where
tp->packets_out > tp->max_packets_out can only trigger an update of
tp->is_cwnd_limited if tp->is_cwnd_limited is false.
This first showed up in a testing of a BBRv2 dev branch, but this
buggy behavior highlighted a general issue with the
tcp_cwnd_validate() logic that can cause cwnd to fail to increase at
the proper rate for any TCP congestion control, including Reno or
CUBIC.
Fixes:
|
||
|
|
3e39da4423 |
tcp: Fix a data-race around sysctl_tcp_adv_win_scale.
commit 36eeee75ef0157e42fb6593dcc65daab289b559e upstream.
While reading sysctl_tcp_adv_win_scale, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.
Fixes:
|
||
|
|
3b26e11b07 |
tcp: Fix data-races around sysctl_tcp_slow_start_after_idle.
[ Upstream commit 4845b5713ab18a1bb6e31d1fbb4d600240b8b691 ]
While reading sysctl_tcp_slow_start_after_idle, it can be changed
concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes:
|
||
|
|
0f75343584 |
tcp: Fix a data-race around sysctl_tcp_notsent_lowat.
[ Upstream commit 55be873695ed8912eb77ff46d1d1cadf028bd0f3 ]
While reading sysctl_tcp_notsent_lowat, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.
Fixes:
|
||
|
|
f197442a0e |
tcp: Fix data-races around some timeout sysctl knobs.
[ Upstream commit 39e24435a776e9de5c6dd188836cf2523547804b ]
While reading these sysctl knobs, they can be changed concurrently.
Thus, we need to add READ_ONCE() to their readers.
- tcp_retries1
- tcp_retries2
- tcp_orphan_retries
- tcp_fin_timeout
Fixes:
|
||
|
|
6c2176f5ad |
tcp: make sure treq->af_specific is initialized
commit ba5a4fdd63ae0c575707030db0b634b160baddd7 upstream. syzbot complained about a recent change in TCP stack, hitting a NULL pointer [1] tcp request sockets have an af_specific pointer, which was used before the blamed change only for SYNACK generation in non SYNCOOKIE mode. tcp requests sockets momentarily created when third packet coming from client in SYNCOOKIE mode were not using treq->af_specific. Make sure this field is populated, in the same way normal TCP requests sockets do in tcp_conn_request(). [1] TCP: request_sock_TCPv6: Possible SYN flooding on port 20002. Sending cookies. Check SNMP counters. general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 1 PID: 3695 Comm: syz-executor864 Not tainted 5.18.0-rc3-syzkaller-00224-g5fd1fe4807f9 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tcp_create_openreq_child+0xe16/0x16b0 net/ipv4/tcp_minisocks.c:534 Code: 48 c1 ea 03 80 3c 02 00 0f 85 e5 07 00 00 4c 8b b3 28 01 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7e 08 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 c9 07 00 00 48 8b 3c 24 48 89 de 41 ff 56 08 48 RSP: 0018:ffffc90000de0588 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888076490330 RCX: 0000000000000100 RDX: 0000000000000001 RSI: ffffffff87d67ff0 RDI: 0000000000000008 RBP: ffff88806ee1c7f8 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff87d67f00 R11: 0000000000000000 R12: ffff88806ee1bfc0 R13: ffff88801b0e0368 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f517fe58700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffcead76960 CR3: 000000006f97b000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> tcp_v6_syn_recv_sock+0x199/0x23b0 net/ipv6/tcp_ipv6.c:1267 tcp_get_cookie_sock+0xc9/0x850 net/ipv4/syncookies.c:207 cookie_v6_check+0x15c3/0x2340 net/ipv6/syncookies.c:258 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:1131 [inline] tcp_v6_do_rcv+0x1148/0x13b0 net/ipv6/tcp_ipv6.c:1486 tcp_v6_rcv+0x3305/0x3840 net/ipv6/tcp_ipv6.c:1725 ip6_protocol_deliver_rcu+0x2e9/0x1900 net/ipv6/ip6_input.c:422 ip6_input_finish+0x14c/0x2c0 net/ipv6/ip6_input.c:464 NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ip6_input+0x9c/0xd0 net/ipv6/ip6_input.c:473 dst_input include/net/dst.h:461 [inline] ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ipv6_rcv+0x27f/0x3b0 net/ipv6/ip6_input.c:297 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5405 __netif_receive_skb+0x24/0x1b0 net/core/dev.c:5519 process_backlog+0x3a0/0x7c0 net/core/dev.c:5847 __napi_poll+0xb3/0x6e0 net/core/dev.c:6413 napi_poll net/core/dev.c:6480 [inline] net_rx_action+0x8ec/0xc60 net/core/dev.c:6567 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558 invoke_softirq kernel/softirq.c:432 [inline] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097 Fixes: 5b0b9e4c2c89 ("tcp: md5: incorrect tcp_header_len for incoming connections") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net> [fruggeri: Account for backport conflicts from 35b2c3211609 and 6fc8c827dd4f] Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
cc639aa3c2 |
tcp: fix potential xmit stalls caused by TCP_NOTSENT_LOWAT
[ Upstream commit 4bfe744ff1644fbc0a991a2677dc874475dd6776 ]
I had this bug sitting for too long in my pile, it is time to fix it.
Thanks to Doug Porter for reminding me of it!
We had various attempts in the past, including commit
0cbe6a8f089e ("tcp: remove SOCK_QUEUE_SHRUNK"),
but the issue is that TCP stack currently only generates
EPOLLOUT from input path, when tp->snd_una has advanced
and skb(s) cleaned from rtx queue.
If a flow has a big RTT, and/or receives SACKs, it is possible
that the notsent part (tp->write_seq - tp->snd_nxt) reaches 0
and no more data can be sent until tp->snd_una finally advances.
What is needed is to also check if POLLOUT needs to be generated
whenever tp->snd_nxt is advanced, from output path.
This bug triggers more often after an idle period, as
we do not receive ACK for at least one RTT. tcp_notsent_lowat
could be a fraction of what CWND and pacing rate would allow to
send during this RTT.
In a followup patch, I will remove the bogus call
to tcp_chrono_stop(sk, TCP_CHRONO_SNDBUF_LIMITED)
from tcp_check_space(). Fact that we have decided to generate
an EPOLLOUT does not mean the application has immediately
refilled the transmit queue. This optimistic call
might have been the reason the bug seemed not too serious.
Tested:
200 ms rtt, 1% packet loss, 32 MB tcp_rmem[2] and tcp_wmem[2]
$ echo 500000 >/proc/sys/net/ipv4/tcp_notsent_lowat
$ cat bench_rr.sh
SUM=0
for i in {1..10}
do
V=`netperf -H remote_host -l30 -t TCP_RR -- -r 10000000,10000 -o LOCAL_BYTES_SENT | egrep -v "MIGRATED|Bytes"`
echo $V
SUM=$(($SUM + $V))
done
echo SUM=$SUM
Before patch:
$ bench_rr.sh
130000000
80000000
140000000
140000000
140000000
140000000
130000000
40000000
90000000
110000000
SUM=1140000000
After patch:
$ bench_rr.sh
430000000
590000000
530000000
450000000
450000000
350000000
450000000
490000000
480000000
460000000
SUM=4680000000 # This is 410 % of the value before patch.
Fixes:
|
||
|
|
92ba49b27e |
tcp: annotate tp->write_seq lockless reads
[ Upstream commit 0f31746452e6793ad6271337438af8f4defb8940 ] There are few places where we fetch tp->write_seq while this field can change from IRQ or other cpu. We need to add READ_ONCE() annotations, and also make sure write sides use corresponding WRITE_ONCE() to avoid store-tearing. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
777d796966 |
tcp: fix SO_RCVLOWAT related hangs under mem pressure
[ Upstream commit f969dc5a885736842c3511ecdea240fbb02d25d9 ]
While commit 24adbc1676af ("tcp: fix SO_RCVLOWAT hangs with fat skbs")
fixed an issue vs too small sk_rcvbuf for given sk_rcvlowat constraint,
it missed to address issue caused by memory pressure.
1) If we are under memory pressure and socket receive queue is empty.
First incoming packet is allowed to be queued, after commit
|
||
|
|
dc7bc439a8 |
tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN
commit 62d9f1a6945ba69c125e548e72a36d203b30596e upstream.
Upon receiving a cumulative ACK that changes the congestion state from
Disorder to Open, the TLP timer is not set. If the sender is app-limited,
it can only wait for the RTO timer to expire and retransmit.
The reason for this is that the TLP timer is set before the congestion
state changes in tcp_ack(), so we delay the time point of calling
tcp_set_xmit_timer() until after tcp_fastretrans_alert() returns and
remove the FLAG_SET_XMIT_TIMER from ack_flag when the RACK reorder timer
is set.
This commit has two additional benefits:
1) Make sure to reset RTO according to RFC6298 when receiving ACK, to
avoid spurious RTO caused by RTO timer early expires.
2) Reduce the xmit timer reschedule once per ACK when the RACK reorder
timer is set.
Fixes:
|
||
|
|
99779c24bf |
tcp: fix SO_RCVLOWAT hangs with fat skbs
[ Upstream commit 24adbc1676af4e134e709ddc7f34cf2adc2131e4 ] We autotune rcvbuf whenever SO_RCVLOWAT is set to account for 100% overhead in tcp_set_rcvlowat() This works well when skb->len/skb->truesize ratio is bigger than 0.5 But if we receive packets with small MSS, we can end up in a situation where not enough bytes are available in the receive queue to satisfy RCVLOWAT setting. As our sk_rcvbuf limit is hit, we send zero windows in ACK packets, preventing remote peer from sending more data. Even autotuning does not help, because it only triggers at the time user process drains the queue. If no EPOLLIN is generated, this can not happen. Note poll() has a similar issue, after commit |
||
|
|
3405bf51f6 |
tcp: cache line align MAX_TCP_HEADER
[ Upstream commit 9bacd256f1354883d3c1402655153367982bba49 ] TCP stack is dumb in how it cooks its output packets. Depending on MAX_HEADER value, we might chose a bad ending point for the headers. If we align the end of TCP headers to cache line boundary, we make sure to always use the smallest number of cache lines, which always help. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
a92c895e22 |
tcp: annotate lockless access to tcp_memory_pressure
[ Upstream commit 1f142c17d19a5618d5a633195a46f2c8be9bf232 ]
tcp_memory_pressure is read without holding any lock,
and its value could be changed on other cpus.
Use READ_ONCE() to annotate these lockless reads.
The write side is already using atomic ops.
Fixes:
|
||
|
|
fbcf85b047 |
tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE()
[ Upstream commit 721c8dafad26ccfa90ff659ee19755e3377b829d ] Syncookies borrow the ->rx_opt.ts_recent_stamp field to store the timestamp of the last synflood. Protect them with READ_ONCE() and WRITE_ONCE() since reads and writes aren't serialised. Use of .rx_opt.ts_recent_stamp for storing the synflood timestamp was introduced by |
||
|
|
4b8a98697a |
tcp: tighten acceptance of ACKs not matching a child socket
[ Upstream commit cb44a08f8647fd2e8db5cc9ac27cd8355fa392d8 ] When no synflood occurs, the synflood timestamp isn't updated. Therefore it can be so old that time_after32() can consider it to be in the future. That's a problem for tcp_synq_no_recent_overflow() as it may report that a recent overflow occurred while, in fact, it's just that jiffies has grown past 'last_overflow' + TCP_SYNCOOKIE_VALID + 2^31. Spurious detection of recent overflows lead to extra syncookie verification in cookie_v[46]_check(). At that point, the verification should fail and the packet dropped. But we should have dropped the packet earlier as we didn't even send a syncookie. Let's refine tcp_synq_no_recent_overflow() to report a recent overflow only if jiffies is within the [last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval. This way, no spurious recent overflow is reported when jiffies wraps and 'last_overflow' becomes in the future from the point of view of time_after32(). However, if jiffies wraps and enters the [last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval (with 'last_overflow' being a stale synflood timestamp), then tcp_synq_no_recent_overflow() still erroneously reports an overflow. In such cases, we have to rely on syncookie verification to drop the packet. We unfortunately have no way to differentiate between a fresh and a stale syncookie timestamp. In practice, using last_overflow as lower bound is problematic. If the synflood timestamp is concurrently updated between the time we read jiffies and the moment we store the timestamp in 'last_overflow', then 'now' becomes smaller than 'last_overflow' and tcp_synq_no_recent_overflow() returns true, potentially dropping a valid syncookie. Reading jiffies after loading the timestamp could fix the problem, but that'd require a memory barrier. Let's just accommodate for potential timestamp growth instead and extend the interval using 'last_overflow - HZ' as lower bound. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
bac9e8f345 |
tcp: fix rejected syncookies due to stale timestamps
[ Upstream commit 04d26e7b159a396372646a480f4caa166d1b6720 ]
If no synflood happens for a long enough period of time, then the
synflood timestamp isn't refreshed and jiffies can advance so much
that time_after32() can't accurately compare them any more.
Therefore, we can end up in a situation where time_after32(now,
last_overflow + HZ) returns false, just because these two values are
too far apart. In that case, the synflood timestamp isn't updated as
it should be, which can trick tcp_synq_no_recent_overflow() into
rejecting valid syncookies.
For example, let's consider the following scenario on a system
with HZ=1000:
* The synflood timestamp is 0, either because that's the timestamp
of the last synflood or, more commonly, because we're working with
a freshly created socket.
* We receive a new SYN, which triggers synflood protection. Let's say
that this happens when jiffies == 2147484649 (that is,
'synflood timestamp' + HZ + 2^31 + 1).
* Then tcp_synq_overflow() doesn't update the synflood timestamp,
because time_after32(2147484649, 1000) returns false.
With:
- 2147484649: the value of jiffies, aka. 'now'.
- 1000: the value of 'last_overflow' + HZ.
* A bit later, we receive the ACK completing the 3WHS. But
cookie_v[46]_check() rejects it because tcp_synq_no_recent_overflow()
says that we're not under synflood. That's because
time_after32(2147484649, 120000) returns false.
With:
- 2147484649: the value of jiffies, aka. 'now'.
- 120000: the value of 'last_overflow' + TCP_SYNCOOKIE_VALID.
Of course, in reality jiffies would have increased a bit, but this
condition will last for the next 119 seconds, which is far enough
to accommodate for jiffie's growth.
Fix this by updating the overflow timestamp whenever jiffies isn't
within the [last_overflow, last_overflow + HZ] range. That shouldn't
have any performance impact since the update still happens at most once
per second.
Now we're guaranteed to have fresh timestamps while under synflood, so
tcp_synq_no_recent_overflow() can safely use it with time_after32() in
such situations.
Stale timestamps can still make tcp_synq_no_recent_overflow() return
the wrong verdict when not under synflood. This will be handled in the
next patch.
For 64 bits architectures, the problem was introduced with the
conversion of ->tw_ts_recent_stamp to 32 bits integer by commit
|
||
|
|
3ad3a9a242 |
tcp: make tcp_space() aware of socket backlog
[ Upstream commit 85bdf7db5b53cdcc7a901db12bcb3d0063e3866d ] Jean-Louis Dupond reported poor iscsi TCP receive performance that we tracked to backlog drops. Apparently we fail to send window updates reflecting the fact that we are under stress. Note that we might lack a proper window increase when backlog is fully processed, since __release_sock() clears sk->sk_backlog.len _after_ all skbs have been processed. This should not matter in practice. If we had a significant load through socket backlog, we are in a dangerous situation. Reported-by: Jean-Louis Dupond <jean-louis@dupond.be> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Tested-by: Jean-Louis Dupond<jean-louis@dupond.be> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
c60f57dfe9 |
tcp: fix tcp_set_congestion_control() use from bpf hook
[ Upstream commit 8d650cdedaabb33e85e9b7c517c0c71fcecc1de9 ]
Neal reported incorrect use of ns_capable() from bpf hook.
bpf_setsockopt(...TCP_CONGESTION...)
-> tcp_set_congestion_control()
-> ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)
-> ns_capable_common()
-> current_cred()
-> rcu_dereference_protected(current->cred, 1)
Accessing 'current' in bpf context makes no sense, since packets
are processed from softirq context.
As Neal stated : The capability check in tcp_set_congestion_control()
was written assuming a system call context, and then was reused from
a BPF call site.
The fix is to add a new parameter to tcp_set_congestion_control(),
so that the ns_capable() call is only performed under the right
context.
Fixes:
|
||
|
|
6323c238bb |
tcp: be more careful in tcp_fragment()
[ Upstream commit b617158dc096709d8600c53b6052144d12b89fab ]
Some applications set tiny SO_SNDBUF values and expect
TCP to just work. Recent patches to address CVE-2019-11478
broke them in case of losses, since retransmits might
be prevented.
We should allow these flows to make progress.
This patch allows the first and last skb in retransmit queue
to be split even if memory limits are hit.
It also adds the some room due to the fact that tcp_sendmsg()
and tcp_sendpage() might overshoot sk_wmem_queued by about one full
TSO skb (64KB size). Note this allowance was already present
in stable backports for kernels < 4.15
Note for < 4.15 backports :
tcp_rtx_queue_tail() will probably look like :
static inline struct sk_buff *tcp_rtx_queue_tail(const struct sock *sk)
{
struct sk_buff *skb = tcp_send_head(sk);
return skb ? tcp_write_queue_prev(sk, skb) : tcp_write_queue_tail(sk);
}
Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrew Prout <aprout@ll.mit.edu>
Tested-by: Andrew Prout <aprout@ll.mit.edu>
Tested-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Tested-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Christoph Paasch <cpaasch@apple.com>
Cc: Jonathan Looney <jtl@netflix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
|
c09be31461 |
tcp: limit payload size of sacked skbs
commit 3b4929f65b0d8249f19a50245cd88ed1a2f78cff upstream.
Jonathan Looney reported that TCP can trigger the following crash
in tcp_shifted_skb() :
BUG_ON(tcp_skb_pcount(skb) < pcount);
This can happen if the remote peer has advertized the smallest
MSS that linux TCP accepts : 48
An skb can hold 17 fragments, and each fragment can hold 32KB
on x86, or 64KB on PowerPC.
This means that the 16bit witdh of TCP_SKB_CB(skb)->tcp_gso_segs
can overflow.
Note that tcp_sendmsg() builds skbs with less than 64KB
of payload, so this problem needs SACK to be enabled.
SACK blocks allow TCP to coalesce multiple skbs in the retransmit
queue, thus filling the 17 fragments to maximal capacity.
CVE-2019-11477 -- u16 overflow of TCP_SKB_CB(skb)->tcp_gso_segs
Fixes:
|
||
|
|
037b0b86ec |
tcp, ulp: add alias for all ulp modules
Lets not turn the TCP ULP lookup into an arbitrary module loader as
we only intend to load ULP modules through this mechanism, not other
unrelated kernel modules:
[root@bar]# cat foo.c
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/tcp.h>
#include <linux/in.h>
int main(void)
{
int sock = socket(PF_INET, SOCK_STREAM, 0);
setsockopt(sock, IPPROTO_TCP, TCP_ULP, "sctp", sizeof("sctp"));
return 0;
}
[root@bar]# gcc foo.c -O2 -Wall
[root@bar]# lsmod | grep sctp
[root@bar]# ./a.out
[root@bar]# lsmod | grep sctp
sctp 1077248 4
libcrc32c 16384 3 nf_conntrack,nf_nat,sctp
[root@bar]#
Fix it by adding module alias to TCP ULP modules, so probing module
via request_module() will be limited to tcp-ulp-[name]. The existing
modules like kTLS will load fine given tcp-ulp-tls alias, but others
will fail to load:
[root@bar]# lsmod | grep sctp
[root@bar]# ./a.out
[root@bar]# lsmod | grep sctp
[root@bar]#
Sockmap is not affected from this since it's either built-in or not.
Fixes:
|
||
|
|
40a1227ea8 |
tcp: Avoid TCP syncookie rejected by SO_REUSEPORT socket
Although the actual cookie check "__cookie_v[46]_check()" does
not involve sk specific info, it checks whether the sk has recent
synq overflow event in "tcp_synq_no_recent_overflow()". The
tcp_sk(sk)->rx_opt.ts_recent_stamp is updated every second
when it has sent out a syncookie (through "tcp_synq_overflow()").
The above per sk "recent synq overflow event timestamp" works well
for non SO_REUSEPORT use case. However, it may cause random
connection request reject/discard when SO_REUSEPORT is used with
syncookie because it fails the "tcp_synq_no_recent_overflow()"
test.
When SO_REUSEPORT is used, it usually has multiple listening
socks serving TCP connection requests destinated to the same local IP:PORT.
There are cases that the TCP-ACK-COOKIE may not be received
by the same sk that sent out the syncookie. For example,
if reuse->socks[] began with {sk0, sk1},
1) sk1 sent out syncookies and tcp_sk(sk1)->rx_opt.ts_recent_stamp
was updated.
2) the reuse->socks[] became {sk1, sk2} later. e.g. sk0 was first closed
and then sk2 was added. Here, sk2 does not have ts_recent_stamp set.
There are other ordering that will trigger the similar situation
below but the idea is the same.
3) When the TCP-ACK-COOKIE comes back, sk2 was selected.
"tcp_synq_no_recent_overflow(sk2)" returns true. In this case,
all syncookies sent by sk1 will be handled (and rejected)
by sk2 while sk1 is still alive.
The userspace may create and remove listening SO_REUSEPORT sockets
as it sees fit. E.g. Adding new thread (and SO_REUSEPORT sock) to handle
incoming requests, old process stopping and new process starting...etc.
With or without SO_ATTACH_REUSEPORT_[CB]BPF,
the sockets leaving and joining a reuseport group makes picking
the same sk to check the syncookie very difficult (if not impossible).
The later patches will allow bpf prog more flexibility in deciding
where a sk should be located in a bpf map and selecting a particular
SO_REUSEPORT sock as it sees fit. e.g. Without closing any sock,
replace the whole bpf reuseport_array in one map_update() by using
map-in-map. Getting the syncookie check working smoothly across
socks in the same "reuse->socks[]" is important.
A partial solution is to set the newly added sk's ts_recent_stamp
to the max ts_recent_stamp of a reuseport group but that will require
to iterate through reuse->socks[] OR
pessimistically set it to "now - TCP_SYNCOOKIE_VALID" when a sk is
joining a reuseport group. However, neither of them will solve the
existing sk getting moved around the reuse->socks[] and that
sk may not have ts_recent_stamp updated, unlikely under continuous
synflood but not impossible.
This patch opts to treat the reuseport group as a whole when
considering the last synq overflow timestamp since
they are serving the same IP:PORT from the userspace
(and BPF program) perspective.
"synq_overflow_ts" is added to "struct sock_reuseport".
The tcp_synq_overflow() and tcp_synq_no_recent_overflow()
will update/check reuse->synq_overflow_ts if the sk is
in a reuseport group. Similar to the reuseport decision in
__inet_lookup_listener(), both sk->sk_reuseport and
sk->sk_reuseport_cb are tested for SO_REUSEPORT usage.
Update on "synq_overflow_ts" happens at roughly once
every second.
A synflood test was done with a 16 rx-queues and 16 reuseport sockets.
No meaningful performance change is observed. Before and
after the change is ~9Mpps in IPv4.
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
||
|
|
19725496da | Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net | ||
|
|
24b711edfc |
net/ipv6: Fix linklocal to global address with VRF
Example setup:
host: ip -6 addr add dev eth1 2001:db8:104::4
where eth1 is enslaved to a VRF
switch: ip -6 ro add 2001:db8:104::4/128 dev br1
where br1 only has an LLA
ping6 2001:db8:104::4
ssh 2001:db8:104::4
(NOTE: UDP works fine if the PKTINFO has the address set to the global
address and ifindex is set to the index of eth1 with a destination an
LLA).
For ICMP, icmp6_iif needs to be updated to check if skb->dev is an
L3 master. If it is then return the ifindex from rt6i_idev similar
to what is done for loopback.
For TCP, restore the original tcp_v6_iif definition which is needed in
most places and add a new tcp_v6_iif_l3_slave that considers the
l3_slave variability. This latter check is only needed for socket
lookups.
Fixes:
|
||
|
|
c4c5551df1 |
Merge ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux
All conflicts were trivial overlapping changes, so reasonably easy to resolve. Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
a0496ef2c2 |
tcp: do not delay ACK in DCTCP upon CE status change
Per DCTCP RFC8257 (Section 3.2) the ACK reflecting the CE status change
has to be sent immediately so the sender can respond quickly:
""" When receiving packets, the CE codepoint MUST be processed as follows:
1. If the CE codepoint is set and DCTCP.CE is false, set DCTCP.CE to
true and send an immediate ACK.
2. If the CE codepoint is not set and DCTCP.CE is true, set DCTCP.CE
to false and send an immediate ACK.
"""
Previously DCTCP implementation may continue to delay the ACK. This
patch fixes that to implement the RFC by forcing an immediate ACK.
Tested with this packetdrill script provided by Larry Brakmo
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0
0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
0.110 < [ect0] . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_SOCKET, SO_DEBUG, [1], 4) = 0
0.200 < [ect0] . 1:1001(1000) ack 1 win 257
0.200 > [ect01] . 1:1(0) ack 1001
0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 1:2(1) ack 1001
0.200 < [ect0] . 1001:2001(1000) ack 2 win 257
+0.005 < [ce] . 2001:3001(1000) ack 2 win 257
+0.000 > [ect01] . 2:2(0) ack 2001
// Previously the ACK below would be delayed by 40ms
+0.000 > [ect01] E. 2:2(0) ack 3001
+0.500 < F. 9501:9501(0) ack 4 win 257
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
||
|
|
27cde44a25 |
tcp: do not cancel delay-AcK on DCTCP special ACK
Currently when a DCTCP receiver delays an ACK and receive a data packet with a different CE mark from the previous one's, it sends two immediate ACKs acking previous and latest sequences respectly (for ECN accounting). Previously sending the first ACK may mark off the delayed ACK timer (tcp_event_ack_sent). This may subsequently prevent sending the second ACK to acknowledge the latest sequence (tcp_ack_snd_check). The culprit is that tcp_send_ack() assumes it always acknowleges the latest sequence, which is not true for the first special ACK. The fix is to not make the assumption in tcp_send_ack and check the actual ack sequence before cancelling the delayed ACK. Further it's safer to pass the ack sequence number as a local variable into tcp_send_ack routine, instead of intercepting tp->rcv_nxt to avoid future bugs like this. Reported-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
a69258f7aa |
tcp: remove DELAYED ACK events in DCTCP
After fixing the way DCTCP tracking delayed ACKs, the delayed-ACK related callbacks are no longer needed Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
cca9bab1b7 |
tcp: use monotonic timestamps for PAWS
Using get_seconds() for timestamps is deprecated since it can lead to overflows on 32-bit systems. While the interface generally doesn't overflow until year 2106, the specific implementation of the TCP PAWS algorithm breaks in 2038 when the intermediate signed 32-bit timestamps overflow. A related problem is that the local timestamps in CLOCK_REALTIME form lead to unexpected behavior when settimeofday is called to set the system clock backwards or forwards by more than 24 days. While the first problem could be solved by using an overflow-safe method of comparing the timestamps, a nicer solution is to use a monotonic clocksource with ktime_get_seconds() that simply doesn't overflow (at least not until 136 years after boot) and that doesn't change during settimeofday(). To make 32-bit and 64-bit architectures behave the same way here, and also save a few bytes in the tcp_options_received structure, I'm changing the type to a 32-bit integer, which is now safe on all architectures. Finally, the ts_recent_stamp field also (confusingly) gets used to store a jiffies value in tcp_synq_overflow()/tcp_synq_no_recent_overflow(). This is currently safe, but changing the type to 32-bit requires some small changes there to keep it working. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
4929c9428a |
tcp: expose both send and receive intervals for rate sample
Congestion control algorithms, which access the rate sample through the tcp_cong_control function, only have access to the maximum of the send and receive interval, for cases where the acknowledgment rate may be inaccurate due to ACK compression or decimation. Algorithms may want to use send rates and receive rates as separate signals. Signed-off-by: Deepti Raghavan <deeptir@mit.edu> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
0ea488ff8d |
bpf: sockmap, convert bpf_compute_data_pointers to bpf_*_sk_skb
In commit 'bpf: bpf_compute_data uses incorrect cb structure' ( |
||
|
|
5cd3da4ba2 |
Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
Simple overlapping changes in stmmac driver. Adjust skb_gro_flush_final_remcsum function signature to make GRO list changes in net-next, as per Stephen Rothwell's example merge resolution. Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
a11e1d432b |
Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL
The poll() changes were not well thought out, and completely unexplained. They also caused a huge performance regression, because "->poll()" was no longer a trivial file operation that just called down to the underlying file operations, but instead did at least two indirect calls. Indirect calls are sadly slow now with the Spectre mitigation, but the performance problem could at least be largely mitigated by changing the "->get_poll_head()" operation to just have a per-file-descriptor pointer to the poll head instead. That gets rid of one of the new indirections. But that doesn't fix the new complexity that is completely unwarranted for the regular case. The (undocumented) reason for the poll() changes was some alleged AIO poll race fixing, but we don't make the common case slower and more complex for some uncommon special case, so this all really needs way more explanations and most likely a fundamental redesign. [ This revert is a revert of about 30 different commits, not reverted individually because that would just be unnecessarily messy - Linus ] Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
d4546c2509 |
net: Convert GRO SKB handling to list_head.
Manage pending per-NAPI GRO packets via list_head. Return an SKB pointer from the GRO receive handlers. When GRO receive handlers return non-NULL, it means that this SKB needs to be completed at this time and removed from the NAPI queue. Several operations are greatly simplified by this transformation, especially timing out the oldest SKB in the list when gro_count exceeds MAX_GRO_SKBS, and napi_gro_flush() which walks the queue in reverse order. Signed-off-by: David S. Miller <davem@davemloft.net> |