lineage-22.2
125 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
90e35adc2c |
bpf: add __weak hook for allocating executable memory
By default, BPF uses module_alloc() to allocate executable memory, but this is not necessary on all arches and potentially undesirable on some of them. So break out the module_alloc() and module_memfree() calls into __weak functions to allow them to be overridden in arch code. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Danny Lin <danny@kdrag0n.dev> Change-Id: I582794881942bc0b766515861f2232354860536b |
||
|
|
ca61495eb4 |
Merge 4.19.280 into android-4.19-stable
Changes in 4.19.280 power: supply: da9150: Fix use after free bug in da9150_charger_remove due to race condition i40evf: Change a VF mac without reloading the VF driver intel-ethernet: rename i40evf to iavf iavf: diet and reformat iavf: fix inverted Rx hash condition leading to disabled hash intel/igbvf: free irq on the error path in igbvf_request_msix() igbvf: Regard vf reset nack as success i2c: imx-lpi2c: check only for enabled interrupt flags scsi: scsi_dh_alua: Fix memleak for 'qdata' in alua_activate() net: usb: smsc95xx: Limit packet length to skb->len qed/qed_sriov: guard against NULL derefs from qed_iov_get_vf_info xirc2ps_cs: Fix use after free bug in xirc2ps_detach net: qcom/emac: Fix use after free bug in emac_remove due to race condition net/ps3_gelic_net: Fix RX sk_buff length net/ps3_gelic_net: Use dma_mapping_error bpf: Adjust insufficient default bpf_jit_limit net/mlx5: Read the TC mapping of all priorities on ETS query atm: idt77252: fix kmemleak when rmmod idt77252 erspan: do not use skb_mac_header() in ndo_start_xmit() net/sonic: use dma_mapping_error() for error check hvc/xen: prevent concurrent accesses to the shared ring net: mdio: thunder: Add missing fwnode_handle_put() Bluetooth: btqcomsmd: Fix command timeout after setting BD address Bluetooth: btsdio: fix use after free bug in btsdio_remove due to unfinished work hwmon (it87): Fix voltage scaling for chips with 10.9mV ADCs uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS583Gen 2 thunderbolt: Use const qualifier for `ring_interrupt_index` riscv: Bump COMMAND_LINE_SIZE value to 1024 ca8210: fix mac_len negative array access m68k: Only force 030 bus error if PC not in exception table scsi: target: iscsi: Fix an error message in iscsi_check_key() scsi: ufs: core: Add soft dependency on governor_simpleondemand net: usb: cdc_mbim: avoid altsetting toggling for Telit FE990 net: usb: qmi_wwan: add Telit 0x1080 composition sh: sanitize the flags on sigreturn cifs: empty interface list when server doesn't support query interfaces scsi: core: Add BLIST_SKIP_VPD_PAGES for SKhynix H28U74301AMR usb: gadget: u_audio: don't let userspace block driver unbind igb: revert rtnl_lock() that causes deadlock dm thin: fix deadlock when swapping to thin device usb: chipdea: core: fix return -EINVAL if request role is the same with current role usb: chipidea: core: fix possible concurrent when switch role nilfs2: fix kernel-infoleak in nilfs_ioctl_wrap_copy() i2c: xgene-slimpro: Fix out-of-bounds bug in xgene_slimpro_i2c_xfer() dm stats: check for and propagate alloc_percpu failure dm crypt: add cond_resched() to dmcrypt_write() sched/fair: sanitize vruntime of entity being placed sched/fair: Sanitize vruntime of entity being migrated tun: avoid double free in tun_free_netdev ocfs2: fix data corruption after failed write bus: imx-weim: fix branch condition evaluates to a garbage value md: avoid signed overflow in slot_store() ALSA: asihpi: check pao in control_message() ALSA: hda/ca0132: fixup buffer overrun at tuning_ctl_set() fbdev: tgafb: Fix potential divide by zero sched_getaffinity: don't assume 'cpumask_size()' is fully initialized fbdev: nvidia: Fix potential divide by zero fbdev: intelfb: Fix potential divide by zero fbdev: lxfb: Fix potential divide by zero fbdev: au1200fb: Fix potential divide by zero ca8210: Fix unsigned mac_len comparison with zero in ca8210_skb_tx() scsi: megaraid_sas: Fix crash after a double completion can: bcm: bcm_tx_setup(): fix KMSAN uninit-value in vfs_write i40e: fix registers dump after run ethtool adapter self test net: dsa: mv88e6xxx: Enable IGMP snooping on user ports only net: mvneta: make tx buffer array agnostic Input: alps - fix compatibility with -funsigned-char Input: focaltech - use explicitly signed char type cifs: prevent infinite recursion in CIFSGetDFSRefer() cifs: fix DFS traversal oops without CONFIG_CIFS_DFS_UPCALL xen/netback: don't do grant copy across page boundary pinctrl: at91-pio4: fix domain name assignment ALSA: hda/conexant: Partial revert of a quirk for Lenovo ALSA: usb-audio: Fix regression on detection of Roland VS-100 drm/etnaviv: fix reference leak when mmaping imported buffer s390/uaccess: add missing earlyclobber annotations to __clear_user() usb: host: ohci-pxa27x: Fix and & vs | typo ext4: fix kernel BUG in 'ext4_write_inline_data_end()' firmware: arm_scmi: Fix device node validation for mailbox transport gfs2: Always check inode size of inline inodes net: sched: cbq: dont intepret cls results when asked to drop cgroup/cpuset: Change cpuset_rwsem and hotplug lock order cgroup: Fix threadgroup_rwsem <-> cpus_read_lock() deadlock cgroup: Add missing cpus_read_lock() to cgroup_attach_task_all() Linux 4.19.280 Change-Id: I63f8dc1e674a396e468ee0ea314d141682d60b72 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
42049e65d3 |
bpf: Adjust insufficient default bpf_jit_limit
[ Upstream commit 10ec8ca8ec1a2f04c4ed90897225231c58c124a7 ]
We've seen recent AWS EKS (Kubernetes) user reports like the following:
After upgrading EKS nodes from v20230203 to v20230217 on our 1.24 EKS
clusters after a few days a number of the nodes have containers stuck
in ContainerCreating state or liveness/readiness probes reporting the
following error:
Readiness probe errored: rpc error: code = Unknown desc = failed to
exec in container: failed to start exec "4a11039f730203ffc003b7[...]":
OCI runtime exec failed: exec failed: unable to start container process:
unable to init seccomp: error loading seccomp filter into kernel:
error loading seccomp filter: errno 524: unknown
However, we had not been seeing this issue on previous AMIs and it only
started to occur on v20230217 (following the upgrade from kernel 5.4 to
5.10) with no other changes to the underlying cluster or workloads.
We tried the suggestions from that issue (sysctl net.core.bpf_jit_limit=452534528)
which helped to immediately allow containers to be created and probes to
execute but after approximately a day the issue returned and the value
returned by cat /proc/vmallocinfo | grep bpf_jit | awk '{s+=$2} END {print s}'
was steadily increasing.
I tested bpf tree to observe bpf_jit_charge_modmem, bpf_jit_uncharge_modmem
their sizes passed in as well as bpf_jit_current under tcpdump BPF filter,
seccomp BPF and native (e)BPF programs, and the behavior all looks sane
and expected, that is nothing "leaking" from an upstream perspective.
The bpf_jit_limit knob was originally added in order to avoid a situation
where unprivileged applications loading BPF programs (e.g. seccomp BPF
policies) consuming all the module memory space via BPF JIT such that loading
of kernel modules would be prevented. The default limit was defined back in
2018 and while good enough back then, we are generally seeing far more BPF
consumers today.
Adjust the limit for the BPF JIT pool from originally 1/4 to now 1/2 of the
module memory space to better reflect today's needs and avoid more users
running into potentially hard to debug issues.
Fixes: fdadd04931c2 ("bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K")
Reported-by: Stephen Haynes <sh@synk.net>
Reported-by: Lefteris Alexakis <lefteris.alexakis@kpn.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://github.com/awslabs/amazon-eks-ami/issues/1179
Link: https://github.com/awslabs/amazon-eks-ami/issues/1219
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230320143725.8394-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
||
|
|
6af002b301 |
Merge 4.19.274 into android-4.19-stable
Changes in 4.19.274 wifi: rtl8xxxu: gen2: Turn on the rate control powerpc: dts: t208x: Mark MAC1 and MAC2 as 10G random: always mix cycle counter in add_latent_entropy() can: kvaser_usb: hydra: help gcc-13 to figure out cmd_len powerpc: dts: t208x: Disable 10G on MAC1 and MAC2 alarmtimer: Prevent starvation by small intervals and SIG_IGN drm/i915/gvt: fix double free bug in split_2MB_gtt_entry mac80211: mesh: embedd mesh_paths and mpp_paths into ieee80211_if_mesh uaccess: Add speculation barrier to copy_from_user() wifi: mwifiex: Add missing compatible string for SD8787 ext4: Fix function prototype mismatch for ext4_feat_ktype bpf: add missing header file include Linux 4.19.274 Change-Id: Ibf649340dee25d21c329d09a1f19454dfd2e5e7f Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
|
c7603df976 |
bpf: add missing header file include
commit f3dd0c53370e70c0f9b7e931bbec12916f3bb8cc upstream.
Commit 74e19ef0ff80 ("uaccess: Add speculation barrier to
copy_from_user()") built fine on x86-64 and arm64, and that's the extent
of my local build testing.
It turns out those got the <linux/nospec.h> include incidentally through
other header files (<linux/kvm_host.h> in particular), but that was not
true of other architectures, resulting in build errors
kernel/bpf/core.c: In function ‘___bpf_prog_run’:
kernel/bpf/core.c:1913:3: error: implicit declaration of function ‘barrier_nospec’
so just make sure to explicitly include the proper <linux/nospec.h>
header file to make everybody see it.
Fixes: 74e19ef0ff80 ("uaccess: Add speculation barrier to copy_from_user()")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Viresh Kumar <viresh.kumar@linaro.org>
Reported-by: Huacai Chen <chenhuacai@loongson.cn>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
|
f8e54da1c7 |
uaccess: Add speculation barrier to copy_from_user()
commit 74e19ef0ff8061ef55957c3abd71614ef0f42f47 upstream. The results of "access_ok()" can be mis-speculated. The result is that you can end speculatively: if (access_ok(from, size)) // Right here even for bad from/size combinations. On first glance, it would be ideal to just add a speculation barrier to "access_ok()" so that its results can never be mis-speculated. But there are lots of system calls just doing access_ok() via "copy_to_user()" and friends (example: fstat() and friends). Those are generally not problematic because they do not _consume_ data from userspace other than the pointer. They are also very quick and common system calls that should not be needlessly slowed down. "copy_from_user()" on the other hand uses a user-controller pointer and is frequently followed up with code that might affect caches. Take something like this: if (!copy_from_user(&kernelvar, uptr, size)) do_something_with(kernelvar); If userspace passes in an evil 'uptr' that *actually* points to a kernel addresses, and then do_something_with() has cache (or other) side-effects, it could allow userspace to infer kernel data values. Add a barrier to the common copy_from_user() code to prevent mis-speculated values which happen after the copy. Also add a stub for architectures that do not define barrier_nospec(). This makes the macro usable in generic code. Since the barrier is now usable in generic code, the x86 #ifdef in the BPF code can also go away. Reported-by: Jordy Zomer <jordyzomer@google.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Daniel Borkmann <daniel@iogearbox.net> # BPF bits Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
88e34926b0 |
Merge 4.19.254 into android-4.19-stable
Changes in 4.19.254 riscv: add as-options for modules with assembly compontents xen/gntdev: Ignore failure to unmap INVALID_GRANT_HANDLE xfrm: xfrm_policy: fix a possible double xfrm_pols_put() in xfrm_bundle_lookup() power/reset: arm-versatile: Fix refcount leak in versatile_reboot_probe pinctrl: ralink: Check for null return of devm_kcalloc perf/core: Fix data race between perf_event_set_output() and perf_mmap_close() ip: Fix data-races around sysctl_ip_fwd_use_pmtu. ip: Fix data-races around sysctl_ip_nonlocal_bind. ip: Fix a data-race around sysctl_fwmark_reflect. tcp/dccp: Fix a data-race around sysctl_tcp_fwmark_accept. tcp: Fix data-races around sysctl_tcp_mtu_probing. tcp: Fix a data-race around sysctl_tcp_probe_threshold. tcp: Fix a data-race around sysctl_tcp_probe_interval. i2c: cadence: Change large transfer count reset logic to be unconditional net: stmmac: fix dma queue left shift overflow issue net/tls: Fix race in TLS device down flow igmp: Fix data-races around sysctl_igmp_llm_reports. igmp: Fix a data-race around sysctl_igmp_max_memberships. tcp: Fix data-races around sysctl_tcp_reordering. tcp: Fix data-races around some timeout sysctl knobs. tcp: Fix a data-race around sysctl_tcp_notsent_lowat. tcp: Fix a data-race around sysctl_tcp_tw_reuse. tcp: Fix data-races around sysctl_tcp_fastopen. be2net: Fix buffer overflow in be_get_module_eeprom tcp: Fix a data-race around sysctl_tcp_early_retrans. tcp: Fix data-races around sysctl_tcp_recovery. tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts. tcp: Fix data-races around sysctl_tcp_slow_start_after_idle. tcp: Fix a data-race around sysctl_tcp_retrans_collapse. tcp: Fix a data-race around sysctl_tcp_stdurg. tcp: Fix a data-race around sysctl_tcp_rfc1337. tcp: Fix data-races around sysctl_tcp_max_reordering. Revert "Revert "char/random: silence a lockdep splat with printk()"" mm/mempolicy: fix uninit-value in mpol_rebind_policy() bpf: Make sure mac_header was set before using it drm/tilcdc: Remove obsolete crtc_mode_valid() hack tilcdc: tilcdc_external: fix an incorrect NULL check on list iterator HID: multitouch: simplify the application retrieval HID: multitouch: Lenovo X1 Tablet Gen3 trackpoint and buttons HID: multitouch: add support for the Smart Tech panel HID: add ALWAYS_POLL quirk to lenovo pixart mouse dlm: fix pending remove if msg allocation fails ima: remove the IMA_TEMPLATE Kconfig option ALSA: memalloc: Align buffer allocations in page size Bluetooth: Add bt_skb_sendmsg helper Bluetooth: Add bt_skb_sendmmsg helper Bluetooth: SCO: Replace use of memcpy_from_msg with bt_skb_sendmsg Bluetooth: RFCOMM: Replace use of memcpy_from_msg with bt_skb_sendmmsg Bluetooth: Fix passing NULL to PTR_ERR Bluetooth: SCO: Fix sco_send_frame returning skb->len Bluetooth: Fix bt_skb_sendmmsg not allocating partial chunks serial: mvebu-uart: correctly report configured baudrate value tty: drivers/tty/, stop using tty_schedule_flip() tty: the rest, stop using tty_schedule_flip() tty: drop tty_schedule_flip() tty: extract tty_flip_buffer_commit() from tty_flip_buffer_push() tty: use new tty_insert_flip_string_and_push_buffer() in pty_write() net: usb: ax88179_178a needs FLAG_SEND_ZLP PCI: hv: Fix multi-MSI to allow more than one MSI vector PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI PCI: hv: Reuse existing IRTE allocation in compose_msi_msg() PCI: hv: Fix interrupt mapping for multi-MSI Linux 4.19.254 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I8164bc3c6ca4775c4fffb7983f7d5b3b11f5bb09 |
||
|
|
03fe739e27 |
bpf: Make sure mac_header was set before using it
commit 0326195f523a549e0a9d7fd44c70b26fd7265090 upstream. Classic BPF has a way to load bytes starting from the mac header. Some skbs do not have a mac header, and skb_mac_header() in this case is returning a pointer that 65535 bytes after skb->head. Existing range check in bpf_internal_load_pointer_neg_helper() was properly kicking and no illegal access was happening. New sanity check in skb_mac_header() is firing, so we need to avoid it. WARNING: CPU: 1 PID: 28990 at include/linux/skbuff.h:2785 skb_mac_header include/linux/skbuff.h:2785 [inline] WARNING: CPU: 1 PID: 28990 at include/linux/skbuff.h:2785 bpf_internal_load_pointer_neg_helper+0x1b1/0x1c0 kernel/bpf/core.c:74 Modules linked in: CPU: 1 PID: 28990 Comm: syz-executor.0 Not tainted 5.19.0-rc4-syzkaller-00865-g4874fb9484be #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/29/2022 RIP: 0010:skb_mac_header include/linux/skbuff.h:2785 [inline] RIP: 0010:bpf_internal_load_pointer_neg_helper+0x1b1/0x1c0 kernel/bpf/core.c:74 Code: ff ff 45 31 f6 e9 5a ff ff ff e8 aa 27 40 00 e9 3b ff ff ff e8 90 27 40 00 e9 df fe ff ff e8 86 27 40 00 eb 9e e8 2f 2c f3 ff <0f> 0b eb b1 e8 96 27 40 00 e9 79 fe ff ff 90 41 57 41 56 41 55 41 RSP: 0018:ffffc9000309f668 EFLAGS: 00010216 RAX: 0000000000000118 RBX: ffffffffffeff00c RCX: ffffc9000e417000 RDX: 0000000000040000 RSI: ffffffff81873f21 RDI: 0000000000000003 RBP: ffff8880842878c0 R08: 0000000000000003 R09: 000000000000ffff R10: 000000000000ffff R11: 0000000000000001 R12: 0000000000000004 R13: ffff88803ac56c00 R14: 000000000000ffff R15: dffffc0000000000 FS: 00007f5c88a16700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdaa9f6c058 CR3: 000000003a82c000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ____bpf_skb_load_helper_32 net/core/filter.c:276 [inline] bpf_skb_load_helper_32+0x191/0x220 net/core/filter.c:264 Fixes: f9aefd6b2aa3 ("net: warn if mac header was not set") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220707123900.945305-1-edumazet@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
47e51a7a22 |
Merge 4.19.218 into android-4.19-stable
Changes in 4.19.218
xhci: Fix USB 3.1 enumeration issues by increasing roothub power-on-good delay
binder: use euid from cred instead of using task
binder: use cred instead of task for selinux checks
Input: elantench - fix misreporting trackpoint coordinates
Input: i8042 - Add quirk for Fujitsu Lifebook T725
libata: fix read log timeout value
ocfs2: fix data corruption on truncate
mmc: dw_mmc: Dont wait for DRTO on Write RSP error
parisc: Fix ptrace check on syscall return
tpm: Check for integer overflow in tpm2_map_response_body()
firmware/psci: fix application of sizeof to pointer
crypto: s5p-sss - Add error handling in s5p_aes_probe()
media: ite-cir: IR receiver stop working after receive overflow
media: ir-kbd-i2c: improve responsiveness of hauppauge zilog receivers
ALSA: hda/realtek: Add quirk for Clevo PC70HS
ALSA: ua101: fix division by zero at probe
ALSA: 6fire: fix control and bulk message timeouts
ALSA: line6: fix control and interrupt message timeouts
ALSA: usb-audio: Add registration quirk for JBL Quantum 400
ALSA: synth: missing check for possible NULL after the call to kstrdup
ALSA: timer: Fix use-after-free problem
ALSA: timer: Unconditionally unlink slave instances, too
x86/sme: Use #define USE_EARLY_PGTABLE_L5 in mem_encrypt_identity.c
x86/irq: Ensure PI wakeup handler is unregistered before module unload
cavium: Return negative value when pci_alloc_irq_vectors() fails
scsi: qla2xxx: Fix unmap of already freed sgl
cavium: Fix return values of the probe function
sfc: Don't use netif_info before net_device setup
hyperv/vmbus: include linux/bitops.h
mmc: winbond: don't build on M68K
drm: panel-orientation-quirks: Add quirk for Aya Neo 2021
bpf: Prevent increasing bpf_jit_limit above max
xen/netfront: stop tx queues during live migration
spi: spl022: fix Microwire full duplex mode
watchdog: Fix OMAP watchdog early handling
vmxnet3: do not stop tx queues after netif_device_detach()
btrfs: clear MISSING device status bit in btrfs_close_one_device
btrfs: fix lost error handling when replaying directory deletes
btrfs: call btrfs_check_rw_degradable only if there is a missing device
ia64: kprobes: Fix to pass correct trampoline address to the handler
hwmon: (pmbus/lm25066) Add offset coefficients
regulator: s5m8767: do not use reset value as DVS voltage if GPIO DVS is disabled
regulator: dt-bindings: samsung,s5m8767: correct s5m8767,pmic-buck-default-dvs-idx property
EDAC/sb_edac: Fix top-of-high-memory value for Broadwell/Haswell
mwifiex: fix division by zero in fw download path
ath6kl: fix division by zero in send path
ath6kl: fix control-message timeout
ath10k: fix control-message timeout
ath10k: fix division by zero in send path
PCI: Mark Atheros QCA6174 to avoid bus reset
rtl8187: fix control-message timeouts
evm: mark evm_fixmode as __ro_after_init
wcn36xx: Fix HT40 capability for 2Ghz band
mwifiex: Read a PCI register after writing the TX ring write pointer
libata: fix checking of DMA state
wcn36xx: handle connection loss indication
rsi: fix occasional initialisation failure with BT coex
rsi: fix key enabled check causing unwanted encryption for vap_id > 0
rsi: fix rate mask set leading to P2P failure
rsi: Fix module dev_oper_mode parameter description
RDMA/qedr: Fix NULL deref for query_qp on the GSI QP
signal: Remove the bogus sigkill_pending in ptrace_stop
signal/mips: Update (_save|_restore)_fp_context to fail with -EFAULT
power: supply: max17042_battery: Prevent int underflow in set_soc_threshold
power: supply: max17042_battery: use VFSOC for capacity when no rsns
powerpc/85xx: Fix oops when mpc85xx_smp_guts_ids node cannot be found
serial: core: Fix initializing and restoring termios speed
ALSA: mixer: oss: Fix racy access to slots
ALSA: mixer: fix deadlock in snd_mixer_oss_set_volume
xen/balloon: add late_initcall_sync() for initial ballooning done
PCI: aardvark: Do not clear status bits of masked interrupts
PCI: aardvark: Do not unmask unused interrupts
PCI: aardvark: Fix return value of MSI domain .alloc() method
PCI: aardvark: Read all 16-bits from PCIE_MSI_PAYLOAD_REG
quota: check block number when reading the block in quota file
quota: correct error number in free_dqentry()
pinctrl: core: fix possible memory leak in pinctrl_enable()
iio: dac: ad5446: Fix ad5622_write() return value
USB: serial: keyspan: fix memleak on probe errors
USB: iowarrior: fix control-message timeouts
drm: panel-orientation-quirks: Add quirk for KD Kurio Smart C15200 2-in-1
Bluetooth: sco: Fix lock_sock() blockage by memcpy_from_msg()
Bluetooth: fix use-after-free error in lock_sock_nested()
platform/x86: wmi: do not fail if disabling fails
MIPS: lantiq: dma: add small delay after reset
MIPS: lantiq: dma: reset correct number of channel
locking/lockdep: Avoid RCU-induced noinstr fail
net: sched: update default qdisc visibility after Tx queue cnt changes
smackfs: Fix use-after-free in netlbl_catmap_walk()
x86: Increase exception stack sizes
mwifiex: Run SET_BSS_MODE when changing from P2P to STATION vif-type
mwifiex: Properly initialize private structure on interface type changes
media: mt9p031: Fix corrupted frame after restarting stream
media: netup_unidvb: handle interrupt properly according to the firmware
media: uvcvideo: Set capability in s_param
media: uvcvideo: Return -EIO for control errors
media: s5p-mfc: fix possible null-pointer dereference in s5p_mfc_probe()
media: s5p-mfc: Add checking to s5p_mfc_probe().
media: mceusb: return without resubmitting URB in case of -EPROTO error.
ia64: don't do IA64_CMPXCHG_DEBUG without CONFIG_PRINTK
media: rcar-csi2: Add checking to rcsi2_start_receiver()
ACPICA: Avoid evaluating methods too early during system resume
media: usb: dvd-usb: fix uninit-value bug in dibusb_read_eeprom_byte()
tracefs: Have tracefs directories not set OTH permission bits by default
ath: dfs_pattern_detector: Fix possible null-pointer dereference in channel_detector_create()
ACPI: battery: Accept charges over the design capacity as full
leaking_addresses: Always print a trailing newline
memstick: r592: Fix a UAF bug when removing the driver
lib/xz: Avoid overlapping memcpy() with invalid input with in-place decompression
lib/xz: Validate the value before assigning it to an enum variable
workqueue: make sysfs of unbound kworker cpumask more clever
tracing/cfi: Fix cmp_entries_* functions signature mismatch
mwl8k: Fix use-after-free in mwl8k_fw_state_machine()
PM: hibernate: Get block device exclusively in swsusp_check()
iwlwifi: mvm: disable RX-diversity in powersave
smackfs: use __GFP_NOFAIL for smk_cipso_doi()
ARM: clang: Do not rely on lr register for stacktrace
gre/sit: Don't generate link-local addr if addr_gen_mode is IN6_ADDR_GEN_MODE_NONE
ARM: 9136/1: ARMv7-M uses BE-8, not BE-32
spi: bcm-qspi: Fix missing clk_disable_unprepare() on error in bcm_qspi_probe()
x86/hyperv: Protect set_hv_tscchange_cb() against getting preempted
parisc: fix warning in flush_tlb_all
task_stack: Fix end_of_stack() for architectures with upwards-growing stack
parisc/unwind: fix unwinder when CONFIG_64BIT is enabled
parisc/kgdb: add kgdb_roundup() to make kgdb work with idle polling
Bluetooth: fix init and cleanup of sco_conn.timeout_work
cgroup: Make rebind_subsystems() disable v2 controllers all at once
net: dsa: rtl8366rb: Fix off-by-one bug
drm/amdgpu: fix warning for overflow check
media: em28xx: add missing em28xx_close_extension
media: dvb-usb: fix ununit-value in az6027_rc_query
media: mtk-vpu: Fix a resource leak in the error handling path of 'mtk_vpu_probe()'
media: si470x: Avoid card name truncation
media: cx23885: Fix snd_card_free call on null card pointer
cpuidle: Fix kobject memory leaks in error paths
media: em28xx: Don't use ops->suspend if it is NULL
ath9k: Fix potential interrupt storm on queue reset
media: dvb-frontends: mn88443x: Handle errors of clk_prepare_enable()
crypto: qat - detect PFVF collision after ACK
crypto: qat - disregard spurious PFVF interrupts
hwrng: mtk - Force runtime pm ops for sleep ops
b43legacy: fix a lower bounds test
b43: fix a lower bounds test
mmc: sdhci-omap: Fix NULL pointer exception if regulator is not configured
memstick: avoid out-of-range warning
memstick: jmb38x_ms: use appropriate free function in jmb38x_ms_alloc_host()
hwmon: Fix possible memleak in __hwmon_device_register()
hwmon: (pmbus/lm25066) Let compiler determine outer dimension of lm25066_coeff
ath10k: fix max antenna gain unit
drm/msm: uninitialized variable in msm_gem_import()
net: stream: don't purge sk_error_queue in sk_stream_kill_queues()
mmc: mxs-mmc: disable regulator on error and in the remove function
platform/x86: thinkpad_acpi: Fix bitwise vs. logical warning
rsi: stop thread firstly in rsi_91x_init() error handling
mwifiex: Send DELBA requests according to spec
phy: micrel: ksz8041nl: do not use power down mode
nvme-rdma: fix error code in nvme_rdma_setup_ctrl
PM: hibernate: fix sparse warnings
clocksource/drivers/timer-ti-dm: Select TIMER_OF
drm/msm: Fix potential NULL dereference in DPU SSPP
smackfs: use netlbl_cfg_cipsov4_del() for deleting cipso_v4_doi
s390/gmap: don't unconditionally call pte_unmap_unlock() in __gmap_zap()
irq: mips: avoid nested irq_enter()
tcp: don't free a FIN sk_buff in tcp_remove_empty_skb()
samples/kretprobes: Fix return value if register_kretprobe() failed
KVM: s390: Fix handle_sske page fault handling
libertas_tf: Fix possible memory leak in probe and disconnect
libertas: Fix possible memory leak in probe and disconnect
wcn36xx: add proper DMA memory barriers in rx path
net: amd-xgbe: Toggle PLL settings during rate change
net: phylink: avoid mvneta warning when setting pause parameters
crypto: pcrypt - Delay write to padata->info
selftests/bpf: Fix fclose/pclose mismatch in test_progs
ibmvnic: Process crqs after enabling interrupts
RDMA/rxe: Fix wrong port_cap_flags
ARM: s3c: irq-s3c24xx: Fix return value check for s3c24xx_init_intc()
arm64: dts: rockchip: Fix GPU register width for RK3328
RDMA/bnxt_re: Fix query SRQ failure
ARM: dts: at91: tse850: the emac<->phy interface is rmii
scsi: dc395: Fix error case unwinding
MIPS: loongson64: make CPU_LOONGSON64 depends on MIPS_FP_SUPPORT
JFS: fix memleak in jfs_mount
ALSA: hda: Reduce udelay() at SKL+ position reporting
arm: dts: omap3-gta04a4: accelerometer irq fix
soc/tegra: Fix an error handling path in tegra_powergate_power_up()
memory: fsl_ifc: fix leak of irq and nand_irq in fsl_ifc_ctrl_probe
video: fbdev: chipsfb: use memset_io() instead of memset()
serial: 8250_dw: Drop wrong use of ACPI_PTR()
usb: gadget: hid: fix error code in do_config()
power: supply: rt5033_battery: Change voltage values to µV
scsi: csiostor: Uninitialized data in csio_ln_vnp_read_cbfn()
RDMA/mlx4: Return missed an error if device doesn't support steering
ASoC: cs42l42: Correct some register default values
ASoC: cs42l42: Defer probe if request_threaded_irq() returns EPROBE_DEFER
phy: qcom-qusb2: Fix a memory leak on probe
serial: xilinx_uartps: Fix race condition causing stuck TX
mips: cm: Convert to bitfield API to fix out-of-bounds access
power: supply: bq27xxx: Fix kernel crash on IRQ handler register error
apparmor: fix error check
rpmsg: Fix rpmsg_create_ept return when RPMSG config is not defined
pnfs/flexfiles: Fix misplaced barrier in nfs4_ff_layout_prepare_ds
drm/plane-helper: fix uninitialized variable reference
PCI: aardvark: Don't spam about PIO Response Status
NFS: Fix deadlocks in nfs_scan_commit_list()
fs: orangefs: fix error return code of orangefs_revalidate_lookup()
mtd: spi-nor: hisi-sfc: Remove excessive clk_disable_unprepare()
dmaengine: at_xdmac: fix AT_XDMAC_CC_PERID() macro
auxdisplay: img-ascii-lcd: Fix lock-up when displaying empty string
auxdisplay: ht16k33: Connect backlight to fbdev
auxdisplay: ht16k33: Fix frame buffer device blanking
netfilter: nfnetlink_queue: fix OOB when mac header was cleared
dmaengine: dmaengine_desc_callback_valid(): Check for `callback_result`
m68k: set a default value for MEMORY_RESERVE
watchdog: f71808e_wdt: fix inaccurate report in WDIOC_GETTIMEOUT
ar7: fix kernel builds for compiler test
scsi: qla2xxx: Fix gnl list corruption
scsi: qla2xxx: Turn off target reset during issue_lip
i2c: xlr: Fix a resource leak in the error handling path of 'xlr_i2c_probe()'
xen-pciback: Fix return in pm_ctrl_init()
net: davinci_emac: Fix interrupt pacing disable
ACPI: PMIC: Fix intel_pmic_regs_handler() read accesses
bonding: Fix a use-after-free problem when bond_sysfs_slave_add() failed
mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration()
zram: off by one in read_block_state()
llc: fix out-of-bound array index in llc_sk_dev_hash()
nfc: pn533: Fix double free when pn533_fill_fragment_skbs() fails
arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions
vsock: prevent unnecessary refcnt inc for nonblocking connect
cxgb4: fix eeprom len when diagnostics not implemented
USB: chipidea: fix interrupt deadlock
ARM: 9155/1: fix early early_iounmap()
ARM: 9156/1: drop cc-option fallbacks for architecture selection
f2fs: should use GFP_NOFS for directory inodes
9p/net: fix missing error check in p9_check_errors
powerpc/lib: Add helper to check if offset is within conditional branch range
powerpc/bpf: Validate branch ranges
powerpc/bpf: Fix BPF_SUB when imm == 0x80000000
powerpc/security: Add a helper to query stf_barrier type
powerpc/bpf: Emit stf barrier instruction sequences for BPF_NOSPEC
mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks
mm, oom: do not trigger out_of_memory from the #PF
backlight: gpio-backlight: Correct initial power state handling
video: backlight: Drop maximum brightness override for brightness zero
s390/cio: check the subchannel validity for dev_busid
s390/tape: fix timer initialization in tape_std_assign()
PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros
fuse: truncate pagecache on atomic_o_trunc
x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
ext4: fix lazy initialization next schedule time computation in more granular unit
fortify: Explicitly disable Clang support
parisc/entry: fix trace test in syscall exit path
PCI/MSI: Destroy sysfs before freeing entries
PCI/MSI: Deal with devices lying about their MSI mask capability
PCI: Add MSI masking quirk for Nvidia ION AHCI
erofs: remove the occupied parameter from z_erofs_pagevec_enqueue()
erofs: fix unsafe pagevec reuse of hooked pclusters
arm64: zynqmp: Do not duplicate flash partition label property
arm64: zynqmp: Fix serial compatible string
scsi: lpfc: Fix list_add() corruption in lpfc_drain_txq()
arm64: dts: hisilicon: fix arm,sp805 compatible string
usb: musb: tusb6010: check return value after calling platform_get_resource()
usb: typec: tipd: Remove WARN_ON in tps6598x_block_read
arm64: dts: freescale: fix arm,sp805 compatible string
ASoC: nau8824: Add DMI quirk mechanism for active-high jack-detect
scsi: advansys: Fix kernel pointer leak
firmware_loader: fix pre-allocated buf built-in firmware use
ARM: dts: omap: fix gpmc,mux-add-data type
usb: host: ohci-tmio: check return value after calling platform_get_resource()
ALSA: ISA: not for M68K
tty: tty_buffer: Fix the softlockup issue in flush_to_ldisc
MIPS: sni: Fix the build
scsi: target: Fix ordered tag handling
scsi: target: Fix alua_tg_pt_gps_count tracking
powerpc/5200: dts: fix memory node unit name
ALSA: gus: fix null pointer dereference on pointer block
powerpc/dcr: Use cmplwi instead of 3-argument cmpli
sh: check return code of request_irq
maple: fix wrong return value of maple_bus_init().
f2fs: fix up f2fs_lookup tracepoints
sh: fix kconfig unmet dependency warning for FRAME_POINTER
sh: define __BIG_ENDIAN for math-emu
mips: BCM63XX: ensure that CPU_SUPPORTS_32BIT_KERNEL is set
sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
drm/nouveau: hdmigv100.c: fix corrupted HDMI Vendor InfoFrame
net: bnx2x: fix variable dereferenced before check
iavf: check for null in iavf_fix_features
iavf: Fix for the false positive ASQ/ARQ errors while issuing VF reset
MIPS: generic/yamon-dt: fix uninitialized variable error
mips: bcm63xx: add support for clk_get_parent()
mips: lantiq: add support for clk_get_parent()
platform/x86: hp_accel: Fix an error handling path in 'lis3lv02d_probe()'
net: virtio_net_hdr_to_skb: count transport header in UFO
i40e: Fix correct max_pkt_size on VF RX queue
i40e: Fix NULL ptr dereference on VSI filter sync
i40e: Fix changing previously set num_queue_pairs for PFs
i40e: Fix display error code in dmesg
NFC: reorganize the functions in nci_request
NFC: reorder the logic in nfc_{un,}register_device
perf/x86/intel/uncore: Fix filter_tid mask for CHA events on Skylake Server
perf/x86/intel/uncore: Fix IIO event constraints for Skylake Server
tun: fix bonding active backup with arp monitoring
hexagon: export raw I/O routines for modules
ipc: WARN if trying to remove ipc object which is absent
mm: kmemleak: slob: respect SLAB_NOLEAKTRACE flag
x86/hyperv: Fix NULL deref in set_hv_tscchange_cb() if Hyper-V setup fails
udf: Fix crash after seekdir
btrfs: fix memory ordering between normal and ordered work functions
parisc/sticon: fix reverse colors
cfg80211: call cfg80211_stop_ap when switch from P2P_GO type
drm/udl: fix control-message timeout
drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga and dvi connectors
perf/core: Avoid put_page() when GUP fails
batman-adv: mcast: fix duplicate mcast packets in BLA backbone from LAN
batman-adv: Consider fragmentation for needed_headroom
batman-adv: Reserve needed_*room for fragments
batman-adv: Don't always reallocate the fragmentation skb head
RDMA/netlink: Add __maybe_unused to static inline in C file
ASoC: DAPM: Cover regression by kctl change notification fix
usb: max-3421: Use driver data instead of maintaining a list of bound devices
soc/tegra: pmc: Fix imbalanced clock disabling in error code path
Linux 4.19.218
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I3f87fc92fe2a7a19ddddb522916f74dba7929583
|
||
|
|
5c6fb0e0c7 |
bpf: Prevent increasing bpf_jit_limit above max
[ Upstream commit fadb7ff1a6c2c565af56b4aacdd086b067eed440 ] Restrict bpf_jit_limit to the maximum supported by the arch's JIT. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211014142554.53120-4-lmb@cloudflare.com Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
11156bde8d |
Merge 4.19.207 into android-4.19-stable
Changes in 4.19.207 ext4: fix race writing to an inline_data file while its xattrs are changing xtensa: fix kconfig unmet dependency warning for HAVE_FUTEX_CMPXCHG gpu: ipu-v3: Fix i.MX IPU-v3 offset calculations for (semi)planar U/V formats qed: Fix the VF msix vectors flow net: macb: Add a NULL check on desc_ptp qede: Fix memset corruption perf/x86/intel/pt: Fix mask of num_address_ranges perf/x86/amd/ibs: Work around erratum #1197 cryptoloop: add a deprecation warning ARM: 8918/2: only build return_address() if needed ALSA: pcm: fix divide error in snd_pcm_lib_ioctl clk: fix build warning for orphan_list media: stkwebcam: fix memory leak in stk_camera_probe ARM: imx: add missing clk_disable_unprepare() ARM: imx: fix missing 3rd argument in macro imx_mmdc_perf_init igmp: Add ip_mc_list lock in ip_check_mc_rcu USB: serial: mos7720: improve OOM-handling in read_mos_reg() ipv4/icmp: l3mdev: Perform icmp error route lookup on source device routing table (v2) SUNRPC/nfs: Fix return value for nfs4_callback_compound() crypto: talitos - reduce max key size for SEC1 powerpc/module64: Fix comment in R_PPC64_ENTRY handling powerpc/boot: Delete unneeded .globl _zimage_start net: ll_temac: Remove left-over debug message mm/page_alloc: speed up the iteration of max_order Revert "btrfs: compression: don't try to compress if we don't have enough pages" ALSA: usb-audio: Add registration quirk for JBL Quantum 800 usb: host: xhci-rcar: Don't reload firmware after the completion usb: mtu3: use @mult for HS isoc or intr usb: mtu3: fix the wrong HS mult value x86/reboot: Limit Dell Optiplex 990 quirk to early BIOS versions PCI: Call Max Payload Size-related fixup quirks early locking/mutex: Fix HANDOFF condition regmap: fix the offset of register error log crypto: mxs-dcp - Check for DMA mapping errors sched/deadline: Fix reset_on_fork reporting of DL tasks power: supply: axp288_fuel_gauge: Report register-address on readb / writeb errors crypto: omap-sham - clear dma flags only after omap_sham_update_dma_stop() sched/deadline: Fix missing clock update in migrate_task_rq_dl() hrtimer: Avoid double reprogramming in __hrtimer_start_range_ns() udf: Check LVID earlier isofs: joliet: Fix iocharset=utf8 mount option bcache: add proper error unwinding in bcache_device_init nvme-rdma: don't update queue count when failing to set io queues power: supply: max17042_battery: fix typo in MAx17042_TOFF s390/cio: add dev_busid sysfs entry for each subchannel libata: fix ata_host_start() crypto: qat - do not ignore errors from enable_vf2pf_comms() crypto: qat - handle both source of interrupt in VF ISR crypto: qat - fix reuse of completion variable crypto: qat - fix naming for init/shutdown VF to PF notifications crypto: qat - do not export adf_iov_putmsg() fcntl: fix potential deadlock for &fasync_struct.fa_lock udf_get_extendedattr() had no boundary checks. m68k: emu: Fix invalid free in nfeth_cleanup() spi: spi-fsl-dspi: Fix issue with uninitialized dma_slave_config spi: spi-pic32: Fix issue with uninitialized dma_slave_config lib/mpi: use kcalloc in mpi_resize clocksource/drivers/sh_cmt: Fix wrong setting if don't request IRQ for clock source channel crypto: qat - use proper type for vf_mask certs: Trigger creation of RSA module signing key if it's not an RSA key spi: sprd: Fix the wrong WDG_LOAD_VAL media: TDA1997x: enable EDID support soc: rockchip: ROCKCHIP_GRF should not default to y, unconditionally media: dvb-usb: fix uninit-value in dvb_usb_adapter_dvb_init media: dvb-usb: fix uninit-value in vp702x_read_mac_addr media: go7007: remove redundant initialization Bluetooth: sco: prevent information leak in sco_conn_defer_accept() tcp: seq_file: Avoid skipping sk during tcp_seek_last_pos net: cipso: fix warnings in netlbl_cipsov4_add_std i2c: highlander: add IRQ check media: em28xx-input: fix refcount bug in em28xx_usb_disconnect media: venus: venc: Fix potential null pointer dereference on pointer fmt PCI: PM: Avoid forcing PCI_D0 for wakeup reasons inconsistently PCI: PM: Enable PME if it can be signaled from D3cold soc: qcom: smsm: Fix missed interrupts if state changes while masked Bluetooth: increase BTNAMSIZ to 21 chars to fix potential buffer overflow drm/msm/dpu: make dpu_hw_ctl_clear_all_blendstages clear necessary LMs arm64: dts: exynos: correct GIC CPU interfaces address range on Exynos7 Bluetooth: fix repeated calls to sco_sock_kill drm/msm/dsi: Fix some reference counted resource leaks usb: gadget: udc: at91: add IRQ check usb: phy: fsl-usb: add IRQ check usb: phy: twl6030: add IRQ checks Bluetooth: Move shutdown callback before flushing tx and rx queue usb: host: ohci-tmio: add IRQ check usb: phy: tahvo: add IRQ check mac80211: Fix insufficient headroom issue for AMSDU usb: gadget: mv_u3d: request_irq() after initializing UDC Bluetooth: add timeout sanity check to hci_inquiry i2c: iop3xx: fix deferred probing i2c: s3c2410: fix IRQ check mmc: dw_mmc: Fix issue with uninitialized dma_slave_config mmc: moxart: Fix issue with uninitialized dma_slave_config CIFS: Fix a potencially linear read overflow i2c: mt65xx: fix IRQ check usb: ehci-orion: Handle errors of clk_prepare_enable() in probe usb: bdc: Fix an error handling path in 'bdc_probe()' when no suitable DMA config is available tty: serial: fsl_lpuart: fix the wrong mapbase value ath6kl: wmi: fix an error code in ath6kl_wmi_sync_point() bcma: Fix memory leak for internally-handled cores ipv4: make exception cache less predictible net: sched: Fix qdisc_rate_table refcount leak when get tcf_block failed net: qualcomm: fix QCA7000 checksum handling ipv4: fix endianness issue in inet_rtm_getroute_build_skb() netns: protect netns ID lookups with RCU fscrypt: add fscrypt_symlink_getattr() for computing st_size ext4: report correct st_size for encrypted symlinks f2fs: report correct st_size for encrypted symlinks ubifs: report correct st_size for encrypted symlinks tty: Fix data race between tiocsti() and flush_to_ldisc() x86/resctrl: Fix a maybe-uninitialized build warning treated as error KVM: x86: Update vCPU's hv_clock before back to guest when tsc_offset is adjusted IMA: remove -Wmissing-prototypes warning IMA: remove the dependency on CRYPTO_MD5 fbmem: don't allow too huge resolutions backlight: pwm_bl: Improve bootloader/kernel device handover clk: kirkwood: Fix a clocking boot regression rtc: tps65910: Correct driver module alias btrfs: reset replace target device to allocation state on close blk-zoned: allow zone management send operations without CAP_SYS_ADMIN blk-zoned: allow BLKREPORTZONE without CAP_SYS_ADMIN PCI/MSI: Skip masking MSI-X on Xen PV powerpc/perf/hv-gpci: Fix counter value parsing xen: fix setting of max_pfn in shared_info include/linux/list.h: add a macro to test if entry is pointing to the head 9p/xen: Fix end of loop tests for list_for_each_entry bpf/verifier: per-register parent pointers bpf: correct slot_type marking logic to allow more stack slot sharing bpf: Support variable offset stack access from helpers bpf: Reject indirect var_off stack access in raw mode bpf: Reject indirect var_off stack access in unpriv mode bpf: Sanity check max value for var_off stack access selftests/bpf: Test variable offset stack access bpf: track spill/fill of constants selftests/bpf: fix tests due to const spill/fill bpf: Introduce BPF nospec instruction for mitigating Spectre v4 bpf: Fix leakage due to insufficient speculative store bypass mitigation bpf: verifier: Allocate idmap scratch in verifier env bpf: Fix pointer arithmetic mask tightening under state pruning tools/thermal/tmon: Add cross compiling support soc: aspeed: lpc-ctrl: Fix boundary check for mmap arm64: head: avoid over-mapping in map_memory crypto: public_key: fix overflow during implicit conversion block: bfq: fix bfq_set_next_ioprio_data() power: supply: max17042: handle fails of reading status register dm crypt: Avoid percpu_counter spinlock contention in crypt_page_alloc() VMCI: fix NULL pointer dereference when unmapping queue pair media: uvc: don't do DMA on stack media: rc-loopback: return number of emitters rather than error libata: add ATA_HORKAGE_NO_NCQ_TRIM for Samsung 860 and 870 SSDs ARM: 9105/1: atags_to_fdt: don't warn about stack size PCI: Restrict ASMedia ASM1062 SATA Max Payload Size Supported PCI: Return ~0 data on pciconfig_read() CAP_SYS_ADMIN failure PCI: xilinx-nwl: Enable the clock through CCF PCI: aardvark: Increase polling delay to 1.5s while waiting for PIO response PCI: aardvark: Fix masking and unmasking legacy INTx interrupts HID: input: do not report stylus battery state as "full" RDMA/iwcm: Release resources if iw_cm module initialization fails docs: Fix infiniband uverbs minor number pinctrl: samsung: Fix pinctrl bank pin count vfio: Use config not menuconfig for VFIO_NOIOMMU powerpc/stacktrace: Include linux/delay.h openrisc: don't printk() unconditionally pinctrl: single: Fix error return code in pcs_parse_bits_in_pinctrl_entry() scsi: qedi: Fix error codes in qedi_alloc_global_queues() platform/x86: dell-smbios-wmi: Add missing kfree in error-exit from run_smbios_call fscache: Fix cookie key hashing f2fs: fix to account missing .skipped_gc_rwsem f2fs: fix to unmap pages from userspace process in punch_hole() MIPS: Malta: fix alignment of the devicetree buffer userfaultfd: prevent concurrent API initialization media: dib8000: rewrite the init prbs logic crypto: mxs-dcp - Use sg_mapping_iter to copy data PCI: Use pci_update_current_state() in pci_enable_device_flags() tipc: keep the skb in rcv queue until the whole data is read iio: dac: ad5624r: Fix incorrect handling of an optional regulator. ARM: dts: qcom: apq8064: correct clock names video: fbdev: kyro: fix a DoS bug by restricting user input netlink: Deal with ESRCH error in nlmsg_notify() Smack: Fix wrong semantics in smk_access_entry() usb: host: fotg210: fix the endpoint's transactional opportunities calculation usb: host: fotg210: fix the actual_length of an iso packet usb: gadget: u_ether: fix a potential null pointer dereference usb: gadget: composite: Allow bMaxPower=0 if self-powered staging: board: Fix uninitialized spinlock when attaching genpd tty: serial: jsm: hold port lock when reporting modem line changes drm/amd/amdgpu: Update debugfs link_settings output link_rate field in hex bpf/tests: Fix copy-and-paste error in double word test bpf/tests: Do not PASS tests without actually testing the result video: fbdev: asiliantfb: Error out if 'pixclock' equals zero video: fbdev: kyro: Error out if 'pixclock' equals zero video: fbdev: riva: Error out if 'pixclock' equals zero ipv4: ip_output.c: Fix out-of-bounds warning in ip_copy_addrs() flow_dissector: Fix out-of-bounds warnings s390/jump_label: print real address in a case of a jump label bug serial: 8250: Define RX trigger levels for OxSemi 950 devices xtensa: ISS: don't panic in rs_init hvsi: don't panic on tty_register_driver failure serial: 8250_pci: make setup_port() parameters explicitly unsigned staging: ks7010: Fix the initialization of the 'sleep_status' structure samples: bpf: Fix tracex7 error raised on the missing argument ata: sata_dwc_460ex: No need to call phy_exit() befre phy_init() Bluetooth: skip invalid hci_sync_conn_complete_evt bonding: 3ad: fix the concurrency between __bond_release_one() and bond_3ad_state_machine_handler() ASoC: Intel: bytcr_rt5640: Move "Platform Clock" routes to the maps for the matching in-/output media: imx258: Rectify mismatch of VTS value media: imx258: Limit the max analogue gain to 480 media: v4l2-dv-timings.c: fix wrong condition in two for-loops media: TDA1997x: fix tda1997x_query_dv_timings() return value media: tegra-cec: Handle errors of clk_prepare_enable() ARM: dts: imx53-ppd: Fix ACHC entry arm64: dts: qcom: sdm660: use reg value for memory node net: ethernet: stmmac: Do not use unreachable() in ipq806x_gmac_probe() Bluetooth: schedule SCO timeouts with delayed_work Bluetooth: avoid circular locks in sco_sock_connect gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable access in amdgpu_i2c_router_select_ddc_port() ARM: tegra: tamonten: Fix UART pad setting Bluetooth: Fix handling of LE Enhanced Connection Complete serial: sh-sci: fix break handling for sysrq tcp: enable data-less, empty-cookie SYN with TFO_SERVER_COOKIE_NOT_REQD rpc: fix gss_svc_init cleanup on failure staging: rts5208: Fix get_ms_information() heap buffer size gfs2: Don't call dlm after protocol is unmounted of: Don't allow __of_attached_node_sysfs() without CONFIG_SYSFS mmc: sdhci-of-arasan: Check return value of non-void funtions mmc: rtsx_pci: Fix long reads when clock is prescaled selftests/bpf: Enlarge select() timeout for test_maps mmc: core: Return correct emmc response in case of ioctl error cifs: fix wrong release in sess_alloc_buffer() failed path Revert "USB: xhci: fix U1/U2 handling for hardware with XHCI_INTEL_HOST quirk set" usb: musb: musb_dsps: request_irq() after initializing musb usbip: give back URBs for unsent unlink requests during cleanup usbip:vhci_hcd USB port can get stuck in the disabled state ASoC: rockchip: i2s: Fix regmap_ops hang ASoC: rockchip: i2s: Fixup config for DAIFMT_DSP_A/B parport: remove non-zero check on count ath9k: fix OOB read ar9300_eeprom_restore_internal ath9k: fix sleeping in atomic context net: fix NULL pointer reference in cipso_v4_doi_free net: w5100: check return value after calling platform_get_resource() parisc: fix crash with signals and alloca ovl: fix BUG_ON() in may_delete() when called from ovl_cleanup() scsi: BusLogic: Fix missing pr_cont() use scsi: qla2xxx: Sync queue idx with queue_pair_map idx cpufreq: powernv: Fix init_chip_info initialization in numa=off mm/hugetlb: initialize hugetlb_usage in mm_init memcg: enable accounting for pids in nested pid namespaces platform/chrome: cros_ec_proto: Send command again when timeout occurs drm/amdgpu: Fix BUG_ON assert dm thin metadata: Fix use-after-free in dm_bm_set_read_only xen: reset legacy rtc flag for PV domU bnx2x: Fix enabling network interfaces without VFs arm64/sve: Use correct size when reinitialising SVE state PM: base: power: don't try to use non-existing RTC for storing data PCI: Add AMD GPU multi-function power dependencies x86/mm: Fix kern_addr_valid() to cope with existing but not present entries tipc: fix an use-after-free issue in tipc_recvmsg net-caif: avoid user-triggerable WARN_ON(1) ptp: dp83640: don't define PAGE0 dccp: don't duplicate ccid when cloning dccp sock net/l2tp: Fix reference count leak in l2tp_udp_recv_core r6040: Restore MDIO clock frequency after MAC reset tipc: increase timeout in tipc_sk_enqueue() perf machine: Initialize srcline string member in add_location struct net/mlx5: Fix potential sleeping in atomic context events: Reuse value read using READ_ONCE instead of re-reading it net/af_unix: fix a data-race in unix_dgram_poll net: dsa: destroy the phylink instance on any error in dsa_slave_phy_setup tcp: fix tp->undo_retrans accounting in tcp_sacktag_one() qed: Handle management FW error ibmvnic: check failover_pending in login response net: hns3: pad the short tunnel frame before sending to hardware mm/memory_hotplug: use "unsigned long" for PFN in zone_for_pfn_range() KVM: s390: index kvm->arch.idle_mask by vcpu_idx dt-bindings: mtd: gpmc: Fix the ECC bytes vs. OOB bytes equation mfd: Don't use irq_create_mapping() to resolve a mapping PCI: Add ACS quirks for Cavium multi-function devices net: usb: cdc_mbim: avoid altsetting toggling for Telit LN920 block, bfq: honor already-setup queue merges ethtool: Fix an error code in cxgb2.c NTB: perf: Fix an error code in perf_setup_inbuf() mfd: axp20x: Update AXP288 volatile ranges PCI: Fix pci_dev_str_match_path() alloc while atomic bug KVM: arm64: Handle PSCI resets before userspace touches vCPU state PCI: Sync __pci_register_driver() stub for CONFIG_PCI=n mtd: rawnand: cafe: Fix a resource leak in the error handling path of 'cafe_nand_probe()' ARC: export clear_user_page() for modules net: dsa: b53: Fix calculating number of switch ports netfilter: socket: icmp6: fix use-after-scope fq_codel: reject silly quantum parameters qlcnic: Remove redundant unlock in qlcnic_pinit_from_rom ip_gre: validate csum_start only on pull net: renesas: sh_eth: Fix freeing wrong tx descriptor s390/bpf: Fix 64-bit subtraction of the -0x80000000 constant Linux 4.19.207 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I18108cb47ba9e95838ebe55aaabe34de345ee846 |
||
|
|
91cdb5b362 |
bpf: Introduce BPF nospec instruction for mitigating Spectre v4
commit f5e81d1117501546b7be050c5fbafa6efd2c722c upstream. In case of JITs, each of the JIT backends compiles the BPF nospec instruction /either/ to a machine instruction which emits a speculation barrier /or/ to /no/ machine instruction in case the underlying architecture is not affected by Speculative Store Bypass or has different mitigations in place already. This covers both x86 and (implicitly) arm64: In case of x86, we use 'lfence' instruction for mitigation. In case of arm64, we rely on the firmware mitigation as controlled via the ssbd kernel parameter. Whenever the mitigation is enabled, it works for all of the kernel code with no need to provide any additional instructions here (hence only comment in arm64 JIT). Other archs can follow as needed. The BPF nospec instruction is specifically targeting Spectre v4 since i) we don't use a serialization barrier for the Spectre v1 case, and ii) mitigation instructions for v1 and v4 might be different on some archs. The BPF nospec is required for a future commit, where the BPF verifier does annotate intermediate BPF programs with speculation barriers. Co-developed-by: Piotr Krysiuk <piotras@gmail.com> Co-developed-by: Benedict Schlueter <benedict.schlueter@rub.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Piotr Krysiuk <piotras@gmail.com> Signed-off-by: Benedict Schlueter <benedict.schlueter@rub.de> Acked-by: Alexei Starovoitov <ast@kernel.org> [OP: adjusted context for 4.19, drop riscv and ppc32 changes] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
a6850bb536 |
Merge 4.19.206 into android-4.19-stable
Changes in 4.19.206 net: qrtr: fix another OOB Read in qrtr_endpoint_post bpf: Do not use ax register in interpreter on div/mod bpf: Fix 32 bit src register truncation on div/mod bpf: Fix truncation handling for mod32 dst reg wrt zero ARC: Fix CONFIG_STACKDEPOT netfilter: conntrack: collect all entries in one cycle once: Fix panic when module unload can: usb: esd_usb2: esd_usb2_rx_event(): fix the interchange of the CAN RX and TX error counters Revert "USB: serial: ch341: fix character loss at high transfer rates" USB: serial: option: add new VID/PID to support Fibocom FG150 usb: dwc3: gadget: Fix dwc3_calc_trbs_left() usb: dwc3: gadget: Stop EP0 transfers during pullup disable IB/hfi1: Fix possible null-pointer dereference in _extend_sdma_tx_descs() e1000e: Fix the max snoop/no-snoop latency for 10M ip_gre: add validation for csum_start xgene-v2: Fix a resource leak in the error handling path of 'xge_probe()' net: marvell: fix MVNETA_TX_IN_PRGRS bit number net: hns3: fix get wrong pfc_en when query PFC configuration usb: gadget: u_audio: fix race condition on endpoint stop opp: remove WARN when no valid OPPs remain virtio: Improve vq->broken access to avoid any compiler optimization virtio_pci: Support surprise removal of virtio pci device vringh: Use wiov->used to check for read/write desc order qed: qed ll2 race condition fixes qed: Fix null-pointer dereference in qed_rdma_create_qp() drm: Copy drm_wait_vblank to user before returning drm/nouveau/disp: power down unused DP links during init net/rds: dma_map_sg is entitled to merge entries vt_kdsetmode: extend console locking fbmem: add margin check to fb_check_caps() KVM: x86/mmu: Treat NX as used (not reserved) for all !TDP shadow MMUs Revert "floppy: reintroduce O_NDELAY fix" net: don't unconditionally copy_from_user a struct ifreq for socket ioctls Linux 4.19.206 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I04e05680c5e311bc4cd79daae49d654b66f774a0 |
||
|
|
c348d806ed |
bpf: Do not use ax register in interpreter on div/mod
Partially undo old commit 144cd91c4c2b ("bpf: move tmp variable into ax
register in interpreter"). The reason we need this here is because ax
register will be used for holding temporary state for div/mod instruction
which otherwise interpreter would corrupt. This will cause a small +8 byte
stack increase for interpreter, but with the gain that we can use it from
verifier rewrites as scratch register.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
[cascardo: This partial revert is needed in order to support using AX for
the following two commits, as there is no JMP32 on 4.19.y]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
|
4812ec5093 |
BACKPORT: bpf: add bpf_ktime_get_boot_ns()
On a device like a cellphone which is constantly suspending and resuming CLOCK_MONOTONIC is not particularly useful for keeping track of or reacting to external network events. Instead you want to use CLOCK_BOOTTIME. Hence add bpf_ktime_get_boot_ns() as a mirror of bpf_ktime_get_ns() based around CLOCK_BOOTTIME instead of CLOCK_MONOTONIC. Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> (cherry picked from commit 71d19214776e61b33da48f7c1b46e522c7f78221) Change-Id: Ifd62c410dcc5112fd1a473a7e1f70231ca514bc0 |
||
|
|
7f86a29040 |
ANDROID: bpf: fix export symbol type
In commit ff5bf35998cc ("ANDROID: bpf: validate bpf_func when BPF_JIT is
enabled with CFI") a new symbol was exported, but it should have been
set as a _GPL symbol.
Fix this up by properly.
Bug: 145210207
Cc: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I7239bb8e0ef329cd7eac6afcd06c341b17ea680b
|
||
|
|
9a11e8da57 |
ANDROID: bpf: validate bpf_func when BPF_JIT is enabled with CFI
With CONFIG_BPF_JIT, the kernel makes indirect calls to dynamically generated code, which the compile-time Control-Flow Integrity (CFI) checking cannot validate. This change adds basic sanity checking to ensure we are jumping to a valid location, which narrows down the attack surface on the stored pointer. In addition, this change adds a weak arch_bpf_jit_check_func function, which architectures that implement BPF JIT can override to perform additional validation, such as verifying that the pointer points to the correct memory region. Bug: 140377409 Change-Id: I8ebac6637ab6bd9db44716b1c742add267298669 Signed-off-by: Sami Tolvanen <samitolvanen@google.com> |
||
|
|
54e8cf41b2 |
bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K
[ Upstream commit fdadd04931c2d7cd294dc5b2b342863f94be53a3 ]
Michael and Sandipan report:
Commit ede95a63b5 introduced a bpf_jit_limit tuneable to limit BPF
JIT allocations. At compile time it defaults to PAGE_SIZE * 40000,
and is adjusted again at init time if MODULES_VADDR is defined.
For ppc64 kernels, MODULES_VADDR isn't defined, so we're stuck with
the compile-time default at boot-time, which is 0x9c400000 when
using 64K page size. This overflows the signed 32-bit bpf_jit_limit
value:
root@ubuntu:/tmp# cat /proc/sys/net/core/bpf_jit_limit
-1673527296
and can cause various unexpected failures throughout the network
stack. In one case `strace dhclient eth0` reported:
setsockopt(5, SOL_SOCKET, SO_ATTACH_FILTER, {len=11, filter=0x105dd27f8},
16) = -1 ENOTSUPP (Unknown error 524)
and similar failures can be seen with tools like tcpdump. This doesn't
always reproduce however, and I'm not sure why. The more consistent
failure I've seen is an Ubuntu 18.04 KVM guest booted on a POWER9
host would time out on systemd/netplan configuring a virtio-net NIC
with no noticeable errors in the logs.
Given this and also given that in near future some architectures like
arm64 will have a custom area for BPF JIT image allocations we should
get rid of the BPF_JIT_LIMIT_DEFAULT fallback / default entirely. For
4.21, we have an overridable bpf_jit_alloc_exec(), bpf_jit_free_exec()
so therefore add another overridable bpf_jit_alloc_exec_limit() helper
function which returns the possible size of the memory area for deriving
the default heuristic in bpf_jit_charge_init().
Like bpf_jit_alloc_exec() and bpf_jit_free_exec(), the new
bpf_jit_alloc_exec_limit() assumes that module_alloc() is the default
JIT memory provider, and therefore in case archs implement their custom
module_alloc() we use MODULES_{END,_VADDR} for limits and otherwise for
vmalloc_exec() cases like on ppc64 we use VMALLOC_{END,_START}.
Additionally, for archs supporting large page sizes, we should change
the sysctl to be handled as long to not run into sysctl restrictions
in future.
Fixes: ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict unpriv allocations")
Reported-by: Sandipan Das <sandipan@linux.ibm.com>
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
||
|
|
43caa29c99 |
bpf: add bpf_jit_limit knob to restrict unpriv allocations
commit ede95a63b5e84ddeea6b0c473b36ab8bfd8c6ce3 upstream. Rick reported that the BPF JIT could potentially fill the entire module space with BPF programs from unprivileged users which would prevent later attempts to load normal kernel modules or privileged BPF programs, for example. If JIT was enabled but unsuccessful to generate the image, then before commit |
||
|
|
232ac70dd3 |
bpf: enable access to ax register also from verifier rewrite
[ commit 9b73bfdd08e73231d6a90ae6db4b46b3fbf56c30 upstream ] Right now we are using BPF ax register in JIT for constant blinding as well as in interpreter as temporary variable. Verifier will not be able to use it simply because its use will get overridden from the former in bpf_jit_blind_insn(). However, it can be made to work in that blinding will be skipped if there is prior use in either source or destination register on the instruction. Taking constraints of ax into account, the verifier is then open to use it in rewrites under some constraints. Note, ax register already has mappings in every eBPF JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
b855e31037 |
bpf: move tmp variable into ax register in interpreter
[ commit 144cd91c4c2bced6eb8a7e25e590f6618a11e854 upstream ] This change moves the on-stack 64 bit tmp variable in ___bpf_prog_run() into the hidden ax register. The latter is currently only used in JITs for constant blinding as a temporary scratch register, meaning the BPF interpreter will never see the use of ax. Therefore it is safe to use it for the cases where tmp has been used earlier. This is needed to later on allow restricted hidden use of ax in both interpreter and JITs. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
965931e3a8 |
bpf: fix a rcu usage warning in bpf_prog_array_copy_core()
Commit |
||
|
|
cd33943176 |
bpf: introduce the bpf_get_local_storage() helper function
The bpf_get_local_storage() helper function is used to get a pointer to the bpf local storage from a bpf program. It takes a pointer to a storage map and flags as arguments. Right now it accepts only cgroup storage maps, and flags argument has to be 0. Further it can be extended to support other types of local storage: e.g. thread local storage etc. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
||
|
|
394e40a297 |
bpf: extend bpf_prog_array to store pointers to the cgroup storage
This patch converts bpf_prog_array from an array of prog pointers to the array of struct bpf_prog_array_item elements. This allows to save a cgroup storage pointer for each bpf program efficiently attached to a cgroup. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
||
|
|
d29ab6e1fa |
bpf: bpf_prog_array_alloc() should return a generic non-rcu pointer
Currently the return type of the bpf_prog_array_alloc() is
struct bpf_prog_array __rcu *, which is not quite correct.
Obviously, the returned pointer is a generic pointer, which
is valid for an indefinite amount of time and it's not shared
with anyone else, so there is no sense in marking it as __rcu.
This change eliminate the following sparse warnings:
kernel/bpf/core.c:1544:31: warning: incorrect type in return expression (different address spaces)
kernel/bpf/core.c:1544:31: expected struct bpf_prog_array [noderef] <asn:4>*
kernel/bpf/core.c:1544:31: got void *
kernel/bpf/core.c:1548:17: warning: incorrect type in return expression (different address spaces)
kernel/bpf/core.c:1548:17: expected struct bpf_prog_array [noderef] <asn:4>*
kernel/bpf/core.c:1548:17: got struct bpf_prog_array *<noident>
kernel/bpf/core.c:1681:15: warning: incorrect type in assignment (different address spaces)
kernel/bpf/core.c:1681:15: expected struct bpf_prog_array *array
kernel/bpf/core.c:1681:15: got struct bpf_prog_array [noderef] <asn:4>*
Fixes:
|
||
|
|
85782e037f |
bpf: undo prog rejection on read-only lock failure
Partially undo commit |
||
|
|
9facc33687 |
bpf: reject any prog that failed read-only lock
We currently lock any JITed image as read-only via bpf_jit_binary_lock_ro()
as well as the BPF image as read-only through bpf_prog_lock_ro(). In
the case any of these would fail we throw a WARN_ON_ONCE() in order to
yell loudly to the log. Perhaps, to some extend, this may be comparable
to an allocation where __GFP_NOWARN is explicitly not set.
Added via
|
||
|
|
7d1982b4e3 |
bpf: fix panic in prog load calls cleanup
While testing I found that when hitting error path in bpf_prog_load()
where we jump to free_used_maps and prog contained BPF to BPF calls
that were JITed earlier, then we never clean up the bpf_prog_kallsyms_add()
done under jit_subprogs(). Add proper API to make BPF kallsyms deletion
more clear and fix that.
Fixes:
|
||
|
|
bf6fa2c893 |
bpf: implement bpf_get_current_cgroup_id() helper
bpf has been used extensively for tracing. For example, bcc contains an almost full set of bpf-based tools to trace kernel and user functions/events. Most tracing tools are currently either filtered based on pid or system-wide. Containers have been used quite extensively in industry and cgroup is often used together to provide resource isolation and protection. Several processes may run inside the same container. It is often desirable to get container-level tracing results as well, e.g. syscall count, function count, I/O activity, etc. This patch implements a new helper, bpf_get_current_cgroup_id(), which will return cgroup id based on the cgroup within which the current task is running. The later patch will provide an example to show that userspace can get the same cgroup id so it could configure a filter or policy in the bpf program based on task cgroup id. The helper is currently implemented for tracing. It can be added to other program types as well when needed. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
|
|
170a7e3ea0 |
bpf: bpf_prog_array_copy() should return -ENOENT if exclude_prog not found
This makes is it possible for bpf prog detach to return -ENOENT. Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
||
|
|
6f6e434aa2 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
S390 bpf_jit.S is removed in net-next and had changes in 'net', since that code isn't used any more take the removal. TLS data structures split the TX and RX components in 'net-next', put the new struct members from the bug fix in 'net' into the RX part. The 'net-next' tree had some reworking of how the ERSPAN code works in the GRE tunneling code, overlapping with a one-line headroom calculation fix in 'net'. Overlapping changes in __sock_map_ctx_update_elem(), keep the bits that read the prog members via READ_ONCE() into local variables before using them. Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
050fad7c45 |
bpf: fix truncated jump targets on heavy expansions
Recently during testing, I ran into the following panic: [ 207.892422] Internal error: Accessing user space memory outside uaccess.h routines: 96000004 [#1] SMP [ 207.901637] Modules linked in: binfmt_misc [...] [ 207.966530] CPU: 45 PID: 2256 Comm: test_verifier Tainted: G W 4.17.0-rc3+ #7 [ 207.974956] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB18A 03/31/2017 [ 207.982428] pstate: 60400005 (nZCv daif +PAN -UAO) [ 207.987214] pc : bpf_skb_load_helper_8_no_cache+0x34/0xc0 [ 207.992603] lr : 0xffff000000bdb754 [ 207.996080] sp : ffff000013703ca0 [ 207.999384] x29: ffff000013703ca0 x28: 0000000000000001 [ 208.004688] x27: 0000000000000001 x26: 0000000000000000 [ 208.009992] x25: ffff000013703ce0 x24: ffff800fb4afcb00 [ 208.015295] x23: ffff00007d2f5038 x22: ffff00007d2f5000 [ 208.020599] x21: fffffffffeff2a6f x20: 000000000000000a [ 208.025903] x19: ffff000009578000 x18: 0000000000000a03 [ 208.031206] x17: 0000000000000000 x16: 0000000000000000 [ 208.036510] x15: 0000ffff9de83000 x14: 0000000000000000 [ 208.041813] x13: 0000000000000000 x12: 0000000000000000 [ 208.047116] x11: 0000000000000001 x10: ffff0000089e7f18 [ 208.052419] x9 : fffffffffeff2a6f x8 : 0000000000000000 [ 208.057723] x7 : 000000000000000a x6 : 00280c6160000000 [ 208.063026] x5 : 0000000000000018 x4 : 0000000000007db6 [ 208.068329] x3 : 000000000008647a x2 : 19868179b1484500 [ 208.073632] x1 : 0000000000000000 x0 : ffff000009578c08 [ 208.078938] Process test_verifier (pid: 2256, stack limit = 0x0000000049ca7974) [ 208.086235] Call trace: [ 208.088672] bpf_skb_load_helper_8_no_cache+0x34/0xc0 [ 208.093713] 0xffff000000bdb754 [ 208.096845] bpf_test_run+0x78/0xf8 [ 208.100324] bpf_prog_test_run_skb+0x148/0x230 [ 208.104758] sys_bpf+0x314/0x1198 [ 208.108064] el0_svc_naked+0x30/0x34 [ 208.111632] Code: 91302260 f9400001 f9001fa1 d2800001 (29500680) [ 208.117717] ---[ end trace 263cb8a59b5bf29f ]--- The program itself which caused this had a long jump over the whole instruction sequence where all of the inner instructions required heavy expansions into multiple BPF instructions. Additionally, I also had BPF hardening enabled which requires once more rewrites of all constant values in order to blind them. Each time we rewrite insns, bpf_adj_branches() would need to potentially adjust branch targets which cross the patchlet boundary to accommodate for the additional delta. Eventually that lead to the case where the target offset could not fit into insn->off's upper 0x7fff limit anymore where then offset wraps around becoming negative (in s16 universe), or vice versa depending on the jump direction. Therefore it becomes necessary to detect and reject any such occasions in a generic way for native eBPF and cBPF to eBPF migrations. For the latter we can simply check bounds in the bpf_convert_filter()'s BPF_EMIT_JMP helper macro and bail out once we surpass limits. The bpf_patch_insn_single() for native eBPF (and cBPF to eBPF in case of subsequent hardening) is a bit more complex in that we need to detect such truncations before hitting the bpf_prog_realloc(). Thus the latter is split into an extra pass to probe problematic offsets on the original program in order to fail early. With that in place and carefully tested I no longer hit the panic and the rewrites are rejected properly. The above example panic I've seen on bpf-next, though the issue itself is generic in that a guard against this issue in bpf seems more appropriate in this case. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
|
|
8111038444 |
bpf: sockmap, add hash map support
Sockmap is currently backed by an array and enforces keys to be four bytes. This works well for many use cases and was originally modeled after devmap which also uses four bytes keys. However, this has become limiting in larger use cases where a hash would be more appropriate. For example users may want to use the 5-tuple of the socket as the lookup key. To support this add hash support. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
||
|
|
6cb5fb3891 |
bpf: export bpf_event_output()
bpf_event_output() is useful for offloads to add events to BPF event rings, export it. Note that export is placed near the stub since tracing is optional and kernel/bpf/core.c is always going to be built. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
||
|
|
e0cea7ce98 |
bpf: implement ld_abs/ld_ind in native bpf
The main part of this work is to finally allow removal of LD_ABS and LD_IND from the BPF core by reimplementing them through native eBPF instead. Both LD_ABS/LD_IND were carried over from cBPF and keeping them around in native eBPF caused way more trouble than actually worth it. To just list some of the security issues in the past: * |
||
|
|
4d220ed0f8 |
bpf: remove tracepoints from bpf core
tracepoints to bpf core were added as a way to provide introspection to bpf programs and maps, but after some time it became clear that this approach is inadequate, so prog_id, map_id and corresponding get_next_id, get_fd_by_id, get_info_by_fd, prog_query APIs were introduced and fully adopted by bpftool and other applications. The tracepoints in bpf core started to rot and causing syzbot warnings: WARNING: CPU: 0 PID: 3008 at kernel/trace/trace_event_perf.c:274 Kernel panic - not syncing: panic_on_warn set ... perf_trace_bpf_map_keyval+0x260/0xbd0 include/trace/events/bpf.h:228 trace_bpf_map_update_elem include/trace/events/bpf.h:274 [inline] map_update_elem kernel/bpf/syscall.c:597 [inline] SYSC_bpf kernel/bpf/syscall.c:1478 [inline] Hence this patch deletes tracepoints in bpf core. Reported-by: Eric Biggers <ebiggers3@gmail.com> Reported-by: syzbot <bot+a9dbb3c3e64b62536a4bc5ee7bbd4ca627566188@syzkaller.appspotmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
||
|
|
c195651e56 |
bpf: add bpf_get_stack helper
Currently, stackmap and bpf_get_stackid helper are provided for bpf program to get the stack trace. This approach has a limitation though. If two stack traces have the same hash, only one will get stored in the stackmap table, so some stack traces are missing from user perspective. This patch implements a new helper, bpf_get_stack, will send stack traces directly to bpf program. The bpf program is able to see all stack traces, and then can do in-kernel processing or send stack traces to user space through shared map or bpf_perf_event_output. Acked-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
|
|
3a38bb98d9 |
bpf/tracing: fix a deadlock in perf_event_detach_bpf_prog
syzbot reported a possible deadlock in perf_event_detach_bpf_prog. The error details: ====================================================== WARNING: possible circular locking dependency detected 4.16.0-rc7+ #3 Not tainted ------------------------------------------------------ syz-executor7/24531 is trying to acquire lock: (bpf_event_mutex){+.+.}, at: [<000000008a849b07>] perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854 but task is already holding lock: (&mm->mmap_sem){++++}, at: [<0000000038768f87>] vm_mmap_pgoff+0x198/0x280 mm/util.c:353 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&mm->mmap_sem){++++}: __might_fault+0x13a/0x1d0 mm/memory.c:4571 _copy_to_user+0x2c/0xc0 lib/usercopy.c:25 copy_to_user include/linux/uaccess.h:155 [inline] bpf_prog_array_copy_info+0xf2/0x1c0 kernel/bpf/core.c:1694 perf_event_query_prog_array+0x1c7/0x2c0 kernel/trace/bpf_trace.c:891 _perf_ioctl kernel/events/core.c:4750 [inline] perf_ioctl+0x3e1/0x1480 kernel/events/core.c:4770 vfs_ioctl fs/ioctl.c:46 [inline] do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686 SYSC_ioctl fs/ioctl.c:701 [inline] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 -> #0 (bpf_event_mutex){+.+.}: lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920 __mutex_lock_common kernel/locking/mutex.c:756 [inline] __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908 perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854 perf_event_free_bpf_prog kernel/events/core.c:8147 [inline] _free_event+0xbdb/0x10f0 kernel/events/core.c:4116 put_event+0x24/0x30 kernel/events/core.c:4204 perf_mmap_close+0x60d/0x1010 kernel/events/core.c:5172 remove_vma+0xb4/0x1b0 mm/mmap.c:172 remove_vma_list mm/mmap.c:2490 [inline] do_munmap+0x82a/0xdf0 mm/mmap.c:2731 mmap_region+0x59e/0x15a0 mm/mmap.c:1646 do_mmap+0x6c0/0xe00 mm/mmap.c:1483 do_mmap_pgoff include/linux/mm.h:2223 [inline] vm_mmap_pgoff+0x1de/0x280 mm/util.c:355 SYSC_mmap_pgoff mm/mmap.c:1533 [inline] SyS_mmap_pgoff+0x462/0x5f0 mm/mmap.c:1491 SYSC_mmap arch/x86/kernel/sys_x86_64.c:100 [inline] SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:91 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&mm->mmap_sem); lock(bpf_event_mutex); lock(&mm->mmap_sem); lock(bpf_event_mutex); *** DEADLOCK *** ====================================================== The bug is introduced by Commit |
||
|
|
9c481b908b |
bpf: fix bpf_prog_array_copy_to_user warning from perf event prog query
syzkaller tried to perform a prog query in perf_event_query_prog_array() where struct perf_event_query_bpf had an ids_len of 1,073,741,353 and thus causing a warning due to failed kcalloc() allocation out of the bpf_prog_array_copy_to_user() helper. Given we cannot attach more than 64 programs to a perf event, there's no point in allowing huge ids_len. Therefore, allow a buffer that would fix the maximum number of ids and also add a __GFP_NOWARN to the temporary ids buffer. Fixes: |
||
|
|
0911287ce3 |
bpf: fix bpf_prog_array_copy_to_user() issues
1. move copy_to_user out of rcu section to fix the following issue:
./include/linux/rcupdate.h:302 Illegal context switch in RCU read-side critical section!
stack backtrace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
rcu_preempt_sleep_check include/linux/rcupdate.h:301 [inline]
___might_sleep+0x385/0x470 kernel/sched/core.c:6079
__might_sleep+0x95/0x190 kernel/sched/core.c:6067
__might_fault+0xab/0x1d0 mm/memory.c:4532
_copy_to_user+0x2c/0xc0 lib/usercopy.c:25
copy_to_user include/linux/uaccess.h:155 [inline]
bpf_prog_array_copy_to_user+0x217/0x4d0 kernel/bpf/core.c:1587
bpf_prog_array_copy_info+0x17b/0x1c0 kernel/bpf/core.c:1685
perf_event_query_prog_array+0x196/0x280 kernel/trace/bpf_trace.c:877
_perf_ioctl kernel/events/core.c:4737 [inline]
perf_ioctl+0x3e1/0x1480 kernel/events/core.c:4757
2. move *prog under rcu, since it's not ok to dereference it afterwards
3. in a rare case of prog array being swapped between bpf_prog_array_length()
and bpf_prog_array_copy_to_user() calls make sure to copy zeros to user space,
so the user doesn't walk over uninited prog_ids while kernel reported
uattr->query.prog_cnt > 0
Reported-by: syzbot+7dbcd2d3b85f9b608b23@syzkaller.appspotmail.com
Fixes:
|
||
|
|
f6b1b3bf0d |
bpf: fix subprog verifier bypass by div/mod by 0 exception
One of the ugly leftovers from the early eBPF days is that div/mod
operations based on registers have a hard-coded src_reg == 0 test
in the interpreter as well as in JIT code generators that would
return from the BPF program with exit code 0. This was basically
adopted from cBPF interpreter for historical reasons.
There are multiple reasons why this is very suboptimal and prone
to bugs. To name one: the return code mapping for such abnormal
program exit of 0 does not always match with a suitable program
type's exit code mapping. For example, '0' in tc means action 'ok'
where the packet gets passed further up the stack, which is just
undesirable for such cases (e.g. when implementing policy) and
also does not match with other program types.
While trying to work out an exception handling scheme, I also
noticed that programs crafted like the following will currently
pass the verifier:
0: (bf) r6 = r1
1: (85) call pc+8
caller:
R6=ctx(id=0,off=0,imm=0) R10=fp0,call_-1
callee:
frame1: R1=ctx(id=0,off=0,imm=0) R10=fp0,call_1
10: (b4) (u32) r2 = (u32) 0
11: (b4) (u32) r3 = (u32) 1
12: (3c) (u32) r3 /= (u32) r2
13: (61) r0 = *(u32 *)(r1 +76)
14: (95) exit
returning from callee:
frame1: R0_w=pkt(id=0,off=0,r=0,imm=0)
R1=ctx(id=0,off=0,imm=0) R2_w=inv0
R3_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))
R10=fp0,call_1
to caller at 2:
R0_w=pkt(id=0,off=0,r=0,imm=0) R6=ctx(id=0,off=0,imm=0)
R10=fp0,call_-1
from 14 to 2: R0=pkt(id=0,off=0,r=0,imm=0)
R6=ctx(id=0,off=0,imm=0) R10=fp0,call_-1
2: (bf) r1 = r6
3: (61) r1 = *(u32 *)(r1 +80)
4: (bf) r2 = r0
5: (07) r2 += 8
6: (2d) if r2 > r1 goto pc+1
R0=pkt(id=0,off=0,r=8,imm=0) R1=pkt_end(id=0,off=0,imm=0)
R2=pkt(id=0,off=8,r=8,imm=0) R6=ctx(id=0,off=0,imm=0)
R10=fp0,call_-1
7: (71) r0 = *(u8 *)(r0 +0)
8: (b7) r0 = 1
9: (95) exit
from 6 to 8: safe
processed 16 insns (limit 131072), stack depth 0+0
Basically what happens is that in the subprog we make use of a
div/mod by 0 exception and in the 'normal' subprog's exit path
we just return skb->data back to the main prog. This has the
implication that the verifier thinks we always get a pkt pointer
in R0 while we still have the implicit 'return 0' from the div
as an alternative unconditional return path earlier. Thus, R0
then contains 0, meaning back in the parent prog we get the
address range of [0x0, skb->data_end] as read and writeable.
Similar can be crafted with other pointer register types.
Since i) BPF_ABS/IND is not allowed in programs that contain
BPF to BPF calls (and generally it's also disadvised to use in
native eBPF context), ii) unknown opcodes don't return zero
anymore, iii) we don't return an exception code in dead branches,
the only last missing case affected and to fix is the div/mod
handling.
What we would really need is some infrastructure to propagate
exceptions all the way to the original prog unwinding the
current stack and returning that code to the caller of the
BPF program. In user space such exception handling for similar
runtimes is typically implemented with setjmp(3) and longjmp(3)
as one possibility which is not available in the kernel,
though (kgdb used to implement it in kernel long time ago). I
implemented a PoC exception handling mechanism into the BPF
interpreter with porting setjmp()/longjmp() into x86_64 and
adding a new internal BPF_ABRT opcode that can use a program
specific exception code for all exception cases we have (e.g.
div/mod by 0, unknown opcodes, etc). While this seems to work
in the constrained BPF environment (meaning, here, we don't
need to deal with state e.g. from memory allocations that we
would need to undo before going into exception state), it still
has various drawbacks: i) we would need to implement the
setjmp()/longjmp() for every arch supported in the kernel and
for x86_64, arm64, sparc64 JITs currently supporting calls,
ii) it has unconditional additional cost on main program
entry to store CPU register state in initial setjmp() call,
and we would need some way to pass the jmp_buf down into
___bpf_prog_run() for main prog and all subprogs, but also
storing on stack is not really nice (other option would be
per-cpu storage for this, but it also has the drawback that
we need to disable preemption for every BPF program types).
All in all this approach would add a lot of complexity.
Another poor-man's solution would be to have some sort of
additional shared register or scratch buffer to hold state
for exceptions, and test that after every call return to
chain returns and pass R0 all the way down to BPF prog caller.
This is also problematic in various ways: i) an additional
register doesn't map well into JITs, and some other scratch
space could only be on per-cpu storage, which, again has the
side-effect that this only works when we disable preemption,
or somewhere in the input context which is not available
everywhere either, and ii) this adds significant runtime
overhead by putting conditionals after each and every call,
as well as implementation complexity.
Yet another option is to teach verifier that div/mod can
return an integer, which however is also complex to implement
as verifier would need to walk such fake 'mov r0,<code>; exit;'
sequeuence and there would still be no guarantee for having
propagation of this further down to the BPF caller as proper
exception code. For parent prog, it is also is not distinguishable
from a normal return of a constant scalar value.
The approach taken here is a completely different one with
little complexity and no additional overhead involved in
that we make use of the fact that a div/mod by 0 is undefined
behavior. Instead of bailing out, we adapt the same behavior
as on some major archs like ARMv8 [0] into eBPF as well:
X div 0 results in 0, and X mod 0 results in X. aarch64 and
aarch32 ISA do not generate any traps or otherwise aborts
of program execution for unsigned divides. I verified this
also with a test program compiled by gcc and clang, and the
behavior matches with the spec. Going forward we adapt the
eBPF verifier to emit such rewrites once div/mod by register
was seen. cBPF is not touched and will keep existing 'return 0'
semantics. Given the options, it seems the most suitable from
all of them, also since major archs have similar schemes in
place. Given this is all in the realm of undefined behavior,
we still have the option to adapt if deemed necessary and
this way we would also have the option of more flexibility
from LLVM code generation side (which is then fully visible
to verifier). Thus, this patch i) fixes the panic seen in
above program and ii) doesn't bypass the verifier observations.
[0] ARM Architecture Reference Manual, ARMv8 [ARM DDI 0487B.b]
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0487b.b/DDI0487B_b_armv8_arm.pdf
1) aarch64 instruction set: section C3.4.7 and C6.2.279 (UDIV)
"A division by zero results in a zero being written to
the destination register, without any indication that
the division by zero occurred."
2) aarch32 instruction set: section F1.4.8 and F5.1.263 (UDIV)
"For the SDIV and UDIV instructions, division by zero
always returns a zero result."
Fixes:
|
||
|
|
5e581dad4f |
bpf: make unknown opcode handling more robust
Recent findings by syzcaller fixed in
|
||
|
|
ea9722e265 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says: ==================== pull-request: bpf-next 2018-01-19 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) bpf array map HW offload, from Jakub. 2) support for bpf_get_next_key() for LPM map, from Yonghong. 3) test_verifier now runs loaded programs, from Alexei. 4) xdp cpumap monitoring, from Jesper. 5) variety of tests, cleanups and small x64 JIT optimization, from Daniel. 6) user space can now retrieve HW JITed program, from Jiong. Note there is a minor conflict between Russell's arm32 JIT fixes and removal of bpf_jit_enable variable by Daniel which should be resolved by keeping Russell's comment and removing that variable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
8565d26bcb |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The BPF verifier conflict was some minor contextual issue. The TUN conflict was less trivial. Cong Wang fixed a memory leak of tfile->tx_array in 'net'. This is an skb_array. But meanwhile in net-next tun changed tfile->tx_arry into tfile->tx_ring which is a ptr_ring. Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
fa9dd599b4 |
bpf: get rid of pure_initcall dependency to enable jits
Having a pure_initcall() callback just to permanently enable BPF JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave a small race window in future where JIT is still disabled on boot. Since we know about the setting at compilation time anyway, just initialize it properly there. Also consolidate all the individual bpf_jit_enable variables into a single one and move them under one location. Moreover, don't allow for setting unspecified garbage values on them. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
|
|
c366287ebd |
bpf: fix divides by zero
Divides by zero are not nice, lets avoid them if possible.
Also do_div() seems not needed when dealing with 32bit operands,
but this seems a minor detail.
Fixes:
|
||
|
|
19d28fbd30 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
BPF alignment tests got a conflict because the registers are output as Rn_w instead of just Rn in net-next, and in net a fixup for a testcase prohibits logical operations on pointers before using them. Also, we should attempt to patch BPF call args if JIT always on is enabled. Instead, if we fail to JIT the subprogs we should pass an error back up and fail immediately. Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
290af86629 |
bpf: introduce BPF_JIT_ALWAYS_ON config
The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715. A quote from goolge project zero blog: "At this point, it would normally be necessary to locate gadgets in the host kernel code that can be used to actually leak data by reading from an attacker-controlled location, shifting and masking the result appropriately and then using the result of that as offset to an attacker-controlled address for a load. But piecing gadgets together and figuring out which ones work in a speculation context seems annoying. So instead, we decided to use the eBPF interpreter, which is built into the host kernel - while there is no legitimate way to invoke it from inside a VM, the presence of the code in the host kernel's text section is sufficient to make it usable for the attack, just like with ordinary ROP gadgets." To make attacker job harder introduce BPF_JIT_ALWAYS_ON config option that removes interpreter from the kernel in favor of JIT-only mode. So far eBPF JIT is supported by: x64, arm64, arm32, sparc64, s390, powerpc64, mips64 The start of JITed program is randomized and code page is marked as read-only. In addition "constant blinding" can be turned on with net.core.bpf_jit_harden v2->v3: - move __bpf_prog_ret0 under ifdef (Daniel) v1->v2: - fix init order, test_bpf and cBPF (Daniel's feedback) - fix offloaded bpf (Jakub's feedback) - add 'return 0' dummy in case something can invoke prog->bpf_func - retarget bpf tree. For bpf-next the patch would need one extra hunk. It will be sent when the trees are merged back to net-next Considered doing: int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT; but it seems better to land the patch as-is and in bpf-next remove bpf_jit_enable global variable from all JITs, consolidate in one place and remove this jit_init() function. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
||
|
|
7105e828c0 |
bpf: allow for correlation of maps and helpers in dump
Currently a dump of an xlated prog (post verifier stage) doesn't
correlate used helpers as well as maps. The prog info lists
involved map ids, however there's no correlation of where in the
program they are used as of today. Likewise, bpftool does not
correlate helper calls with the target functions.
The latter can be done w/o any kernel changes through kallsyms,
and also has the advantage that this works with inlined helpers
and BPF calls.
Example, via interpreter:
# tc filter show dev foo ingress
filter protocol all pref 49152 bpf chain 0
filter protocol all pref 49152 bpf chain 0 handle 0x1 foo.o:[ingress] \
direct-action not_in_hw id 1 tag c74773051b364165 <-- prog id:1
* Output before patch (calls/maps remain unclear):
# bpftool prog dump xlated id 1 <-- dump prog id:1
0: (b7) r1 = 2
1: (63) *(u32 *)(r10 -4) = r1
2: (bf) r2 = r10
3: (07) r2 += -4
4: (18) r1 = 0xffff95c47a8d4800
6: (85) call unknown#73040
7: (15) if r0 == 0x0 goto pc+18
8: (bf) r2 = r10
9: (07) r2 += -4
10: (bf) r1 = r0
11: (85) call unknown#73040
12: (15) if r0 == 0x0 goto pc+23
[...]
* Output after patch:
# bpftool prog dump xlated id 1
0: (b7) r1 = 2
1: (63) *(u32 *)(r10 -4) = r1
2: (bf) r2 = r10
3: (07) r2 += -4
4: (18) r1 = map[id:2] <-- map id:2
6: (85) call bpf_map_lookup_elem#73424 <-- helper call
7: (15) if r0 == 0x0 goto pc+18
8: (bf) r2 = r10
9: (07) r2 += -4
10: (bf) r1 = r0
11: (85) call bpf_map_lookup_elem#73424
12: (15) if r0 == 0x0 goto pc+23
[...]
# bpftool map show id 2 <-- show/dump/etc map id:2
2: hash_of_maps flags 0x0
key 4B value 4B max_entries 3 memlock 4096B
Example, JITed, same prog:
# tc filter show dev foo ingress
filter protocol all pref 49152 bpf chain 0
filter protocol all pref 49152 bpf chain 0 handle 0x1 foo.o:[ingress] \
direct-action not_in_hw id 3 tag c74773051b364165 jited
# bpftool prog show id 3
3: sched_cls tag c74773051b364165
loaded_at Dec 19/13:48 uid 0
xlated 384B jited 257B memlock 4096B map_ids 2
# bpftool prog dump xlated id 3
0: (b7) r1 = 2
1: (63) *(u32 *)(r10 -4) = r1
2: (bf) r2 = r10
3: (07) r2 += -4
4: (18) r1 = map[id:2] <-- map id:2
6: (85) call __htab_map_lookup_elem#77408 <-+ inlined rewrite
7: (15) if r0 == 0x0 goto pc+2 |
8: (07) r0 += 56 |
9: (79) r0 = *(u64 *)(r0 +0) <-+
10: (15) if r0 == 0x0 goto pc+24
11: (bf) r2 = r10
12: (07) r2 += -4
[...]
Example, same prog, but kallsyms disabled (in that case we are
also not allowed to pass any relative offsets, etc, so prog
becomes pointer sanitized on dump):
# sysctl kernel.kptr_restrict=2
kernel.kptr_restrict = 2
# bpftool prog dump xlated id 3
0: (b7) r1 = 2
1: (63) *(u32 *)(r10 -4) = r1
2: (bf) r2 = r10
3: (07) r2 += -4
4: (18) r1 = map[id:2]
6: (85) call bpf_unspec#0
7: (15) if r0 == 0x0 goto pc+2
[...]
Example, BPF calls via interpreter:
# bpftool prog dump xlated id 1
0: (85) call pc+2#__bpf_prog_run_args32
1: (b7) r0 = 1
2: (95) exit
3: (b7) r0 = 2
4: (95) exit
Example, BPF calls via JIT:
# sysctl net.core.bpf_jit_enable=1
net.core.bpf_jit_enable = 1
# sysctl net.core.bpf_jit_kallsyms=1
net.core.bpf_jit_kallsyms = 1
# bpftool prog dump xlated id 1
0: (85) call pc+2#bpf_prog_3b185187f1855c4c_F
1: (b7) r0 = 1
2: (95) exit
3: (b7) r0 = 2
4: (95) exit
And finally, an example for tail calls that is now working
as well wrt correlation:
# bpftool prog dump xlated id 2
[...]
10: (b7) r2 = 8
11: (85) call bpf_trace_printk#-41312
12: (bf) r1 = r6
13: (18) r2 = map[id:1]
15: (b7) r3 = 0
16: (85) call bpf_tail_call#12
17: (b7) r1 = 42
18: (6b) *(u16 *)(r6 +46) = r1
19: (b7) r0 = 0
20: (95) exit
# bpftool map show id 1
1: prog_array flags 0x0
key 4B value 4B max_entries 1 memlock 4096B
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
||
|
|
1c2a088a66 |
bpf: x64: add JIT support for multi-function programs
Typical JIT does several passes over bpf instructions to compute total size and relative offsets of jumps and calls. With multitple bpf functions calling each other all relative calls will have invalid offsets intially therefore we need to additional last pass over the program to emit calls with correct offsets. For example in case of three bpf functions: main: call foo call bpf_map_lookup exit foo: call bar exit bar: exit We will call bpf_int_jit_compile() indepedently for main(), foo() and bar() x64 JIT typically does 4-5 passes to converge. After these initial passes the image for these 3 functions will be good except call targets, since start addresses of foo() and bar() are unknown when we were JITing main() (note that call bpf_map_lookup will be resolved properly during initial passes). Once start addresses of 3 functions are known we patch call_insn->imm to point to right functions and call bpf_int_jit_compile() again which needs only one pass. Additional safety checks are done to make sure this last pass doesn't produce image that is larger or smaller than previous pass. When constant blinding is on it's applied to all functions at the first pass, since doing it once again at the last pass can change size of the JITed code. Tested on x64 and arm64 hw with JIT on/off, blinding on/off. x64 jits bpf-to-bpf calls correctly while arm64 falls back to interpreter. All other JITs that support normal BPF_CALL will behave the same way since bpf-to-bpf call is equivalent to bpf-to-kernel call from JITs point of view. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |