* refs/heads/tmp-3f534fa:
Linux 4.19.49
media: uvcvideo: Fix uvc_alloc_entity() allocation alignment
of: overlay: set node fields from properties when add new overlay node
of: overlay: validate overlay properties #address-cells and #size-cells
scsi: lpfc: Fix backport of faf5a744f4f8 ("scsi: lpfc: avoid uninitialized variable warning")
x86/kprobes: Set instruction page as executable
x86/ftrace: Set trampoline pages as executable
x86/ftrace: Do not call function graph from dynamic trampolines
binder: fix race between munmap() and direct reclaim
Revert "binder: fix handling of misaligned binder object"
Revert "x86/build: Move _etext to actual end of .text"
include/linux/module.h: copy __init/__exit attrs to init/cleanup_module
Compiler Attributes: add support for __copy (gcc >= 9)
drm/lease: Make sure implicit planes are leased
drm/rockchip: shutdown drm subsystem on shutdown
drm/sun4i: Fix sun8i HDMI PHY configuration for > 148.5 MHz
drm/sun4i: Fix sun8i HDMI PHY clock initialization
drm/vmwgfx: Don't send drm sysfs hotplug events on initial master set
drm/tegra: gem: Fix CPU-cache maintenance for BO's allocated using get_pages()
gcc-plugins: Fix build failures under Darwin host
Revert "lockd: Show pid of lockd for remote locks"
CIFS: cifs_read_allocate_pages: don't iterate through whole page array on ENOMEM
cifs: fix memory leak of pneg_inbuf on -EOPNOTSUPP ioctl case
staging: wlan-ng: fix adapter initialization failure
staging: vc04_services: prevent integer overflow in create_pagelist()
serial: sh-sci: disable DMA for uart_console
vt/fbcon: deinitialize resources in visual_init() after failed memory allocation
evm: check hash algorithm passed to init_desc()
ima: show rules with IMA_INMASK correctly
doc: Cope with Sphinx logging deprecations
doc: Cope with the deprecation of AutoReporter
docs: Fix conf.py for Sphinx 2.0
arm64: Fix the arm64_personality() syscall wrapper redirection
kernel/signal.c: trace_signal_deliver when signal_group_exit
memcg: make it work on sparse non-0-node systems
tty: max310x: Fix external crystal register setup
tty: serial: msm_serial: Fix XON/XOFF
i2c: synquacer: fix synquacer_i2c_doxfer() return value
i2c: mlxcpld: Fix wrong initialization order in probe
drm/nouveau/i2c: Disable i2c bus access after ->fini()
KVM: s390: Do not report unusabled IDs via KVM_CAP_MAX_VCPU_ID
ALSA: hda/realtek - Improve the headset mic for Acer Aspire laptops
ALSA: hda/realtek - Set default power save node to 0
ALSA: line6: Assure canceling delayed work at disconnection
powerpc/perf: Fix MMCRA corruption by bhrb_filter
KVM: PPC: Book3S HV: XIVE: Do not clear IRQ data of passthrough interrupts
s390/crypto: fix possible sleep during spinlock aquired
s390/crypto: fix gcm-aes-s390 selftest failures
iio: adc: ti-ads8688: fix timestamp is not updated in buffer
iio: dac: ds4422/ds4424 fix chip verification
Btrfs: incremental send, fix file corruption when no-holes feature is enabled
Btrfs: fix fsync not persisting changed attributes of a directory
Btrfs: fix race updating log root item during fsync
Btrfs: fix wrong ctime and mtime of a directory after log replay
tracing: Avoid memory leak in predicate_parse()
scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs)
scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove
brcmfmac: fix NULL pointer derefence during USB disconnect
media: smsusb: better handle optional alignment
media: usb: siano: Fix false-positive "uninitialized variable" warning
media: usb: siano: Fix general protection fault in smsusb
USB: rio500: fix memory leak in close after disconnect
USB: rio500: refuse more than one device at a time
USB: Add LPM quirk for Surface Dock GigE adapter
USB: sisusbvga: fix oops in error path of sisusb_probe
USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor
usbip: usbip_host: fix stub_dev lock context imbalance regression
usbip: usbip_host: fix BUG: sleeping function called from invalid context
usb: xhci: avoid null pointer deref when bos field is NULL
xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic()
xhci: Use %zu for printing size_t type
xhci: update bounce buffer with correct sg num
include/linux/bitops.h: sanitize rotate primitives
sparc64: Fix regression in non-hypervisor TLB flush xcall
ANDROID: uid_sys_stats: report uid_cputime stats in microseconds
Revert "fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied"
cuttlefish_defconfig: update with recent upstream change
cuttlefish_defconfig: update with recent upstream change
Change-Id: I62be41246e49d33b20377ca090ae4a73bc6b592d
Signed-off-by: Ivaylo Georgiev <irgeorgiev@codeaurora.org>
While ULMK is active, prevent OOM for order <= COSTLY_ORDER
pages. No special actions are taken for larger orders.
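The policy described above can be sketched as a small predicate. This is only an illustrative model: the name toy_may_invoke_oom is hypothetical, and TOY_COSTLY_ORDER is assumed to correspond to the kernel's PAGE_ALLOC_COSTLY_ORDER (3).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: mirrors PAGE_ALLOC_COSTLY_ORDER in the kernel. */
#define TOY_COSTLY_ORDER 3

/*
 * Hypothetical helper: returns true when the OOM killer may be invoked
 * for an allocation of the given order.
 */
static bool toy_may_invoke_oom(bool ulmk_active, unsigned int order)
{
	/* Orders are small exponents; 32 is an arbitrary sanity bound here. */
	assert(order < 32);

	/* While ULMK is active, low-order allocations must not trigger OOM. */
	if (ulmk_active && order <= TOY_COSTLY_ORDER)
		return false;

	/* Larger orders, or ULMK inactive: no special action is taken. */
	return true;
}
```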
Change-Id: I4ccba1b71155947569acc3d88ae2027e3b2e0620
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
* refs/heads/tmp-cca7d2d:
Linux 4.19.24
mm: proc: smaps_rollup: fix pss_locked calculation
drm/i915: Prevent a race during I915_GEM_MMAP ioctl with WC set
drm/i915: Block fbdev HPD processing during suspend
drm/vkms: Fix license inconsistent
drm: Use array_size() when creating lease
dm thin: fix bug where bio that overwrites thin block ignores FUA
dm crypt: don't overallocate the integrity tag space
x86/a.out: Clear the dump structure initially
md/raid1: don't clear bitmap bits on interrupted recovery.
signal: Restore the stop PTRACE_EVENT_EXIT
scsi: sd: fix entropy gathering for most rotational disks
x86/platform/UV: Use efi_runtime_lock to serialise BIOS calls
tracing/uprobes: Fix output for multiple string arguments
s390/zcrypt: fix specification exception on z196 during ap probe
alpha: Fix Eiger NR_IRQS to 128
alpha: fix page fault handling for r16-r18 targets
Revert "mm: slowly shrink slabs with a relatively small number of objects"
Revert "mm: don't reclaim inodes with many attached pages"
Revert "nfsd4: return default lease period"
Input: elantech - enable 3rd button support on Fujitsu CELSIUS H780
Input: bma150 - register input device after setting private data
mmc: block: handle complete_work on separate workqueue
mmc: sunxi: Filter out unsupported modes declared in the device tree
kvm: vmx: Fix entry number check for add_atomic_switch_msr()
x86/kvm/nVMX: read from MSR_IA32_VMX_PROCBASED_CTLS2 only when it is available
riscv: Add pte bit to distinguish swap from invalid
tools uapi: fix Alpha support
ASoC: hdmi-codec: fix oops on re-probe
ALSA: usb-audio: Fix implicit fb endpoint setup by quirk
ALSA: hda - Add quirk for HP EliteBook 840 G5
perf/x86: Add check_period PMU callback
perf/core: Fix impossible ring-buffer sizes warning
ARM: OMAP5+: Fix inverted nirq pin interrupts with irq_set_type
Input: elan_i2c - add ACPI ID for touchpad in Lenovo V330-15ISK
Revert "Input: elan_i2c - add ACPI ID for touchpad in ASUS Aspire F5-573G"
gpio: mxc: move gpio noirq suspend/resume to syscore phase
CIFS: Do not assume one credit for async responses
kvm: sev: Fail KVM_SEV_INIT if already initialized
cifs: Limit memory used by lock request calls to a page
drm/nouveau/falcon: avoid touching registers if engine is off
drm/nouveau: Don't disable polling in fallback mode
gpio: pl061: handle failed allocations
ARM: dts: kirkwood: Fix polarity of GPIO fan lines
ARM: dts: da850-lcdk: Correct the sound card name
ARM: dts: da850-lcdk: Correct the audio codec regulators
ARM: dts: da850-evm: Correct the sound card name
ARM: dts: da850-evm: Correct the audio codec regulators
drm/amdgpu: set WRITE_BURST_LENGTH to 64B to workaround SDMA1 hang
nvme: pad fake subsys NQN vid and ssvid with zeros
nvme-multipath: zero out ANA log buffer
nvme-pci: fix out of bounds access in nvme_cqe_pending
nvme-pci: use the same attributes when freeing host_mem_desc_bufs.
drm/bridge: tc358767: fix output H/V syncs
drm/bridge: tc358767: reject modes which require too much BW
drm/bridge: tc358767: fix initial DP0/1_SRCCTRL value
drm/bridge: tc358767: fix single lane configuration
drm/bridge: tc358767: add defines for DP1_SRCCTRL & PHY_2LANE
drm/bridge: tc358767: add bus flags
cpufreq: check if policy is inactive early in __cpufreq_get()
riscv: fix trace_sys_exit hook
tools uapi: fix RISC-V 64-bit support
perf test shell: Use a fallback to get the pathname in vfs_getname
perf report: Fix wrong iteration count in --branch-history
ACPI: NUMA: Use correct type for printing addresses on i386-PAE
drm/amdgpu/sriov:Correct pfvf exchange logic
ARM: fix the cockup in the previous patch
ARM: ensure that processor vtables is not lost after boot
ARM: spectre-v2: per-CPU vtables to work around big.Little systems
ARM: add PROC_VTABLE and PROC_TABLE macros
ARM: clean up per-processor check_bugs method call
ARM: split out processor lookup
ARM: make lookup_processor_type() non-__init
ARM: 8810/1: vfp: Fix wrong assignement to ufp_exc
ARM: 8797/1: spectre-v1.1: harden __copy_to_user
ARM: 8796/1: spectre-v1,v1.1: provide helpers for address sanitization
ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()
ARM: 8794/1: uaccess: Prevent speculative use of the current addr_limit
ARM: 8793/1: signal: replace __put_user_error with __put_user
ARM: 8792/1: oabi-compat: copy oabi events using __copy_to_user()
ARM: 8791/1: vfp: use __copy_to_user() when saving VFP state
ARM: 8790/1: signal: always use __copy_to_user to save iwmmxt context
ARM: 8789/1: signal: copy registers using __copy_to_user()
blk-mq: fix a hung issue when fsync
eeprom: at24: add support for 24c2048
dt-bindings: eeprom: at24: add "atmel,24c2048" compatible string
Conflicts:
drivers/mmc/core/block.c
include/linux/mmc/card.h
Change-Id: I829d46ab020fcefca26c7d12e03c64c0ca7c3528
Signed-off-by: Ivaylo Georgiev <irgeorgiev@codeaurora.org>
* refs/heads/tmp-0755dc9:
Linux 4.19.22
svcrdma: Remove max_sge check at connect time
svcrdma: Reduce max_send_sges
batman-adv: Force mac header to start of data on xmit
batman-adv: Avoid WARN on net_device without parent in netns
xfrm: refine validation of template and selector families
libceph: avoid KEEPALIVE_PENDING races in ceph_con_keepalive()
Revert "ext4: use ext4_write_inode() when fsyncing w/o a journal"
xfrm: Make set-mark default behavior backward compatible
SUNRPC: Always drop the XPRT_LOCK on XPRT_CLOSE_WAIT
drm/vmwgfx: Return error code from vmw_execbuf_copy_fence_user
drm/vmwgfx: Fix setting of dma masks
drm/i915: always return something on DDI clock selection
drm/amd/powerplay: Fix missing break in switch
drm/modes: Prevent division by zero htotal
mac80211: ensure that mgmt tx skbs have tailroom for encryption
mic: vop: Fix use-after-free on remove
powerpc/radix: Fix kernel crash with mremap()
firmware: arm_scmi: provide the mandatory device release callback
ARM: dts: da850: fix interrupt numbers for clocksource
ARM: tango: Improve ARCH_MULTIPLATFORM compatibility
ARM: iop32x/n2100: fix PCI IRQ mapping
MIPS: VDSO: Include $(ccflags-vdso) in o32,n32 .lds builds
mips: loongson64: remove unreachable(), fix loongson_poweroff().
MIPS: VDSO: Use same -m%-float cflag as the kernel proper
MIPS: OCTEON: don't set octeon_dma_bar_type if PCI is disabled
mips: cm: reprime error cause
tracing: uprobes: Fix typo in pr_fmt string
pinctrl: cherryview: fix Strago DMI workaround
pinctrl: sunxi: Correct number of IRQ banks on H6 main pin controller
debugfs: fix debugfs_rename parameter checking
samples: mei: use /dev/mei0 instead of /dev/mei
mei: me: add ice lake point device id.
misc: vexpress: Off by one in vexpress_syscfg_exec()
signal: Better detection of synchronous signals
signal: Always notice exiting tasks
iio: ti-ads8688: Update buffer allocation for timestamps
iio: chemical: atlas-ph-sensor: correct IIO_TEMP values to millicelsius
iio: adc: axp288: Fix TS-pin handling
tools: iio: iio_generic_buffer: make num_loops signed
libata: Add NOLPM quirk for SAMSUNG MZ7TE512HMHP-000L1 SSD
mtd: rawnand: gpmi: fix MX28 bus master lockup problem
mtd: spinand: Fix the error/cleanup path in spinand_init()
mtd: spinand: Handle the case where PROGRAM LOAD does not reset the cache
mtd: Make sure mtd->erasesize is valid even if the partition is of size 0
ANDROID: cuttlefish: enable CONFIG_NET_SCH_NETEM=y
Add XFRM-I to cuttlefish defconfigs
ANDROID: Move from clang r346389b to r349610.
Change-Id: Ie249267aa9e0d4eb169adecafc0cdc59a0a2eb0f
Signed-off-by: Ivaylo Georgiev <irgeorgiev@codeaurora.org>
commit cf43a757fd49442bc38f76088b70c2299eed2c2f upstream.
In the middle of do_exit() there is a call
"ptrace_event(PTRACE_EVENT_EXIT, code);" That call places the process
in TASK_TRACED aka "(TASK_WAKEKILL | __TASK_TRACED)" and waits for
the debugger to release the task or SIGKILL to be delivered.
Skipping past dequeue_signal when we know a fatal signal has already
been delivered resulted in SIGKILL remaining pending and
TIF_SIGPENDING remaining set. This in turn caused the
scheduler to not sleep in PTRACE_EVENT_EXIT as it figured
a fatal signal was pending. This also caused ptrace_freeze_traced
in ptrace_check_attach to fail because it left a per thread
SIGKILL pending which is what fatal_signal_pending tests for.
This difference in signal state caused strace to report:
    strace: Exit of unknown pid NNNNN ignored
Therefore update the signal handling state like dequeue_signal
would when removing a per thread SIGKILL, by removing SIGKILL
from the per thread signal mask and clearing TIF_SIGPENDING.
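The state change described above can be sketched in a small user-space model. This is not the kernel code: struct toy_thread and the toy_* helpers are hypothetical stand-ins, and the pending-flag recalculation is simplified (it ignores blocked and shared signals).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_SIGKILL 9

/* Hypothetical stand-in for the per-thread signal state. */
struct toy_thread {
	uint64_t pending;    /* bit (sig - 1) set => signal sig is pending */
	bool     sigpending; /* models TIF_SIGPENDING */
	bool     group_exit; /* models the group being marked for death */
};

static uint64_t toy_sigmask(int sig)
{
	assert(sig >= 1 && sig <= 64);
	return (uint64_t)1 << (sig - 1);
}

/* Models fatal_signal_pending(): a per-thread SIGKILL is still queued. */
static bool toy_fatal_signal_pending(const struct toy_thread *t)
{
	return t->sigpending && (t->pending & toy_sigmask(TOY_SIGKILL)) != 0;
}

/*
 * Models the fast path once the group is marked for death: report
 * SIGKILL, and also drop the per-thread SIGKILL and refresh the
 * pending flag, as a normal dequeue would have done.
 */
static int toy_get_signal(struct toy_thread *t)
{
	if (t->group_exit) {
		t->pending &= ~toy_sigmask(TOY_SIGKILL);
		t->sigpending = (t->pending != 0); /* simplified recalculation */
		return TOY_SIGKILL;
	}
	return 0; /* the regular dequeue path is not modelled */
}
```

With the SIGKILL bit removed, the model no longer reports a fatal signal as pending, which is the condition the text says lets the task sleep in PTRACE_EVENT_EXIT.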
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Ivan Delalande <colona@arista.com>
Cc: stable@vger.kernel.org
Fixes: 35634ffa1751 ("signal: Always notice exiting tasks")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7146db3317c67b517258cb5e1b08af387da0618b upstream.
Recently syzkaller was able to create unkillable processes by
creating a timer that is delivered as a thread local signal on SIGHUP,
and receiving SIGHUP with SA_NODEFER, ultimately causing a loop that
fails to deliver SIGHUP but keeps trying.
When the stack overflows delivery of SIGHUP fails and force_sigsegv is
called. Unfortunately because SIGSEGV is numerically higher than
SIGHUP next_signal tries again to deliver a SIGHUP.
From a quality of implementation standpoint attempting to deliver the
timer SIGHUP signal is wrong. We should attempt to deliver the
synchronous SIGSEGV signal we just forced.
We can make that happen in a fairly straightforward manner: instead
of just looking at the signal number we also look at the
si_code. In particular for exceptions (aka synchronous signals) the
si_code is always greater than 0.
That still has the potential to pick up a number of asynchronous
signals as in a few cases the same si_codes that are used
for synchronous signals are also used for asynchronous signals,
and SI_KERNEL is also included in the list of possible si_codes.
Still the heuristic is much better and timer signals are definitely
excluded. That is enough to prevent all known ways in which someone
sending a process signals fast enough can cause unexpected and
arguably incorrect behavior.
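The heuristic can be illustrated with a minimal user-space sketch. It is not the kernel implementation: struct toy_siginfo and the toy_* functions are hypothetical, the signal numbers are the Linux x86 values, and the choice of which signal numbers count as exception signals is an assumption of this model.

```c
#include <assert.h>
#include <stddef.h>

/* Linux x86 signal numbers, hard-coded so the sketch is self-contained. */
enum {
	TOY_SIGHUP = 1, TOY_SIGILL = 4, TOY_SIGTRAP = 5, TOY_SIGBUS = 7,
	TOY_SIGFPE = 8, TOY_SIGSEGV = 11, TOY_SIGSYS = 31
};

/* Hypothetical queue entry: signal number plus its si_code. */
struct toy_siginfo {
	int signo;
	int code;
};

/* Signal numbers that exceptions can raise (assumed set for this model). */
static int toy_is_exception_signo(int sig)
{
	switch (sig) {
	case TOY_SIGILL: case TOY_SIGTRAP: case TOY_SIGBUS:
	case TOY_SIGFPE: case TOY_SIGSEGV: case TOY_SIGSYS:
		return 1;
	default:
		return 0;
	}
}

/* Pick the next signal to deliver; returns 0 if the queue is empty. */
static int toy_next_signal(const struct toy_siginfo *q, size_t n)
{
	int lowest = 0;
	size_t i;

	assert(q != NULL || n == 0);

	/* First pass: an exception signal whose si_code is greater than 0. */
	for (i = 0; i < n; i++)
		if (toy_is_exception_signo(q[i].signo) && q[i].code > 0)
			return q[i].signo;

	/* Fallback: the numerically lowest pending signal. */
	for (i = 0; i < n; i++)
		if (lowest == 0 || q[i].signo < lowest)
			lowest = q[i].signo;
	return lowest;
}
```

In this model a timer-generated SIGHUP (negative si_code) no longer wins over a forced SIGSEGV, while a SIGSEGV sent from user space (si_code 0) is still treated as an ordinary signal.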
Cc: stable@vger.kernel.org
Fixes: a27341cd5f ("Prioritize synchronous signals over 'normal' signals")
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 35634ffa1751b6efd8cf75010b509dcb0263e29b upstream.
Recently syzkaller was able to create unkillable processes by
creating a timer that is delivered as a thread local signal on SIGHUP,
and receiving SIGHUP with SA_NODEFER, ultimately causing a loop that
fails to deliver SIGHUP but keeps trying.
Upon examination it turns out part of the problem is actually most of
the solution. Since 2.5 signal delivery has found all fatal signals,
marked the signal group for death, and queued SIGKILL in every thread's
signal queue, relying on signal->group_exit_code to preserve the
information of which was the actual fatal signal.
The conversion of all fatal signals to SIGKILL results in the
synchronous signal heuristic in next_signal kicking in and preferring
SIGHUP to SIGKILL, which is especially problematic as all
fatal signals have already been transformed into SIGKILL.
Instead of dequeueing signals and depending upon SIGKILL to
be the first signal dequeued, first test if the signal group
has already been marked for death. This guarantees that
nothing in the signal queue can prevent a process that needs
to exit from exiting.
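The ordering change can be sketched with a toy model that contrasts the two behaviours. The struct and function names are hypothetical, and the "old" path is reduced to picking the numerically lowest pending signal, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

enum { TOY_SIGHUP = 1, TOY_SIGKILL = 9 };

/* Hypothetical stand-in for a thread's view of its signal group. */
struct toy_task {
	int      group_exit; /* nonzero once the group is marked for death */
	uint64_t pending;    /* bit (sig - 1) set => signal sig is pending */
};

/* Returns the numerically lowest pending signal, or 0 if none. */
static int toy_lowest_pending(uint64_t pending)
{
	int sig;

	for (sig = 1; sig <= 64; sig++)
		if (pending & ((uint64_t)1 << (sig - 1)))
			return sig;
	return 0;
}

/* Old ordering: always dequeue, hoping SIGKILL comes out first. */
static int toy_get_signal_old(const struct toy_task *t)
{
	assert(t != 0);
	return toy_lowest_pending(t->pending);
}

/* New ordering: test the group-exit mark before looking at the queue. */
static int toy_get_signal_new(const struct toy_task *t)
{
	assert(t != 0);
	if (t->group_exit)
		return TOY_SIGKILL;
	return toy_lowest_pending(t->pending);
}
```

With both SIGHUP and SIGKILL pending on a group already marked for death, the old ordering in this model keeps returning SIGHUP, whereas the new ordering returns SIGKILL regardless of what else is queued.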
Cc: stable@vger.kernel.org
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Ref: ebf5ebe31d2c ("[PATCH] signal-fixes-2.5.59-A4")
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* refs/heads/tmp-7950eb3:
Revert "scsi: ufs: Schedule clk gating work on correct queue"
Linux 4.19.2
MD: fix invalid stored role for a disk - try2
vga_switcheroo: Fix missing gpu_bound call at audio client registration
bpf: wait for running BPF programs when updating map-in-map
userns: also map extents in the reverse map to kernel IDs
vt: fix broken display when running aptitude
net: sched: Remove TCA_OPTIONS from policy
Btrfs: fix use-after-free when dumping free space
Btrfs: fix use-after-free during inode eviction
btrfs: move the dio_sem higher up the callchain
btrfs: don't run delayed_iputs in commit
btrfs: fix insert_reserved error handling
btrfs: only free reserved extent if we didn't insert it
btrfs: don't use ctl->free_space for max_extent_size
btrfs: set max_extent_size properly
btrfs: reset max_extent_size properly
Btrfs: fix deadlock when writing out free space caches
Btrfs: fix assertion on fsync of regular file when using no-holes feature
Btrfs: fix null pointer dereference on compressed write path error
btrfs: qgroup: Dirty all qgroups before rescan
Btrfs: fix wrong dentries after fsync of file that got its parent replaced
Btrfs: fix warning when replaying log after fsync of a tmpfile
btrfs: make sure we create all new block groups
btrfs: reset max_extent_size on clear in a bitmap
btrfs: protect space cache inode alloc with GFP_NOFS
btrfs: release metadata before running delayed refs
Btrfs: don't clean dirty pages during buffered writes
btrfs: wait on caching when putting the bg cache
btrfs: keep trim from interfering with transaction commits
btrfs: don't attempt to trim devices that don't support it
btrfs: iterate all devices during trim, instead of fs_devices::alloc_list
btrfs: Ensure btrfs_trim_fs can trim the whole filesystem
btrfs: Enhance btrfs_trim_fs function to handle error better
btrfs: fix error handling in btrfs_dev_replace_start
btrfs: fix error handling in free_log_tree
btrfs: locking: Add extra check in btrfs_init_new_buffer() to avoid deadlock
btrfs: Handle owner mismatch gracefully when walking up tree
btrfs: qgroup: Avoid calling qgroup functions if qgroup is not enabled
tracing: Return -ENOENT if there is no target synthetic event
selftests/powerpc: Fix ptrace tm failure
selftests/ftrace: Fix synthetic event test to delete event correctly
soc/tegra: pmc: Fix child-node lookup
soc: qcom: rmtfs-mem: Validate that scm is available
arm64: dts: stratix10: Correct System Manager register size
ARM: dts: socfpga: Fix SDRAM node address for Arria10
Cramfs: fix abad comparison when wrap-arounds occur
rpmsg: smd: fix memory leak on channel create
arm64: lse: remove -fcall-used-x0 flag
media: hdmi.h: rename ADOBE_RGB to OPRGB and ADOBE_YCC to OPYCC
media: replace ADOBERGB by OPRGB
media: media colorspaces*.rst: rename AdobeRGB to opRGB
drm/mediatek: fix OF sibling-node lookup
media: adv7842: when the EDID is cleared, unconfigure CEC as well
media: adv7604: when the EDID is cleared, unconfigure CEC as well
media: em28xx: fix handler for vidioc_s_input()
media: em28xx: make v4l2-compliance happier by starting sequence on zero
media: em28xx: fix input name for Terratec AV 350
media: tvp5150: avoid going past array on v4l2_querymenu()
media: em28xx: use a default format if TRY_FMT fails
media: cec: forgot to cancel delayed work
media: cec: fix the Signal Free Time calculation
media: cec: add new tx/rx status bits to detect aborts/timeouts
xen-blkfront: fix kernel panic with negotiate_mq error path
xen: remove size limit of privcmd-buf mapping interface
xen: fix xen_qlock_wait()
media: cec: integrate cec_validate_phys_addr() in cec-api.c
media: cec: make cec_get_edid_spa_location() an inline function
remoteproc: qcom: q6v5: Propagate EPROBE_DEFER
kgdboc: Passing ekgdboc to command line causes panic
Revert "media: dvbsky: use just one mutex for serializing device R/W ops"
media: v4l2-tpg: fix kernel oops when enabling HFLIP and OSD
net: bcmgenet: fix OF child-node lookup
TC: Set DMA masks for devices
iommu/arm-smmu: Ensure that page-table updates are visible before TLBI
ocxl: Fix access to the AFU Descriptor Data
power: supply: twl4030-charger: fix OF sibling-node lookup
rtc: cmos: Remove the `use_acpi_alarm' module parameter for !ACPI
rtc: cmos: Fix non-ACPI undefined reference to `hpet_rtc_interrupt'
rtc: ds1307: fix ds1339 wakealarm support
MIPS: OCTEON: fix out of bounds array access on CN68XX
powerpc/64s/hash: Do not use PPC_INVALIDATE_ERAT on CPUs before POWER9
powerpc/tm: Fix HFSCR bit for no suspend case
powerpc/msi: Fix compile error on mpc83xx
powerpc64/module elfv1: Set opd addresses after module relocation
fsnotify: Fix busy inodes during unmount
media: ov7670: make "xclk" clock optional
dm zoned: fix various dmz_get_mblock() issues
dm zoned: fix metadata block ref counting
dm ioctl: harden copy_params()'s copy_from_user() from malicious users
lockd: fix access beyond unterminated strings in prints
nfsd: Fix an Oops in free_session()
nfsd: correctly decrement odstate refcount in error path
nfs: Fix a missed page unlock after pg_doio()
NFSv4.1: Fix the r/wsize checking
NFC: nfcmrvl_uart: fix OF child-node lookup
tpm: fix response size validation in tpm_get_random()
genirq: Fix race on spurious interrupt detection
printk: Fix panic caused by passing log_buf_len to command line
smb3: on kerberos mount if server doesn't specify auth type use krb5
smb3: do not attempt cifs operation in smb3 query info error path
smb3: allow stats which track session and share reconnects to be reset
w1: omap-hdq: fix missing bus unregister at removal
iio: adc: at91: fix wrong channel number in triggered buffer mode
iio: adc: at91: fix acking DRDY irq on simple conversions
iio: adc: imx25-gcq: Fix leak of device_node in mx25_gcq_setup_cfgs()
iio: ad5064: Fix regulator handling
kbuild: fix kernel/bounds.c 'W=1' warning
KVM: arm64: Fix caching of host MDCR_EL2 value
KVM: arm/arm64: Ensure only THP is candidate for adjustment
mm/hmm: fix race between hmm_mirror_unregister() and mmu_notifier callback
mm/rmap: map_pte() was not handling private ZONE_DEVICE page properly
hugetlbfs: dirty pages as they are added to pagecache
ima: open a new file instance if no read permissions
ima: fix showing large 'violations' or 'runtime_measurements_count'
userfaultfd: disable irqs when taking the waitqueue lock
mm: /proc/pid/smaps_rollup: fix NULL pointer deref in smaps_pte_range()
crypto: speck - remove Speck
crypto: aegis/generic - fix for big endian systems
crypto: morus/generic - fix for big endian systems
crypto: aesni - don't use GFP_ATOMIC allocation if the request doesn't cross a page in gcm
crypto: tcrypt - fix ghash-generic speed test
crypto: lrw - Fix out-of bounds access on counter overflow
signal: Guard against negative signal numbers in copy_siginfo_from_user32
signal/GenWQE: Fix sending of SIGKILL
PCI: Add Device IDs for Intel GPU "spurious interrupt" quirk
PCI/ASPM: Fix link_state teardown on device removal
ARM: dts: dra7: Fix up unaligned access setting for PCIe EP
EDAC, skx_edac: Fix logical channel intermediate decoding
EDAC, {i7core,sb,skx}_edac: Fix uncorrected error counting
EDAC, amd64: Add Family 17h, models 10h-2fh support
HID: hiddev: fix potential Spectre v1
HID: wacom: Work around HID descriptor bug in DTK-2451 and DTH-2452
selinux: fix mounting of cgroup2 under older policies
ext4: fix use-after-free race in ext4_remount()'s error path
ext4: propagate error from dquot_initialize() in EXT4_IOC_FSSETXATTR
ext4: fix setattr project check in fssetxattr ioctl
ext4: initialize retries variable in ext4_da_write_inline_data_begin()
ext4: fix EXT4_IOC_SWAP_BOOT
gfs2_meta: ->mount() can get NULL dev_name
jbd2: fix use after free in jbd2_log_do_checkpoint()
IB/rxe: Revise the ib_wr_opcode enum
IB/mlx5: Fix MR cache initialization
ASoC: sta32x: set ->component pointer in private struct
ASoC: intel: skylake: Add missing break in skl_tplg_get_token()
libnvdimm, pmem: Fix badblocks population for 'raw' namespaces
libnvdimm, region: Fail badblocks listing for inactive regions
libnvdimm: Hold reference on parent while scheduling async init
scsi: target: Fix target_wait_for_sess_cmds breakage with active signals
scsi: sched/wait: Add wait_event_lock_irq_timeout for TASK_UNINTERRUPTIBLE usage
dmaengine: ppc4xx: fix off-by-one build failure
net/ipv4: defensive cipso option parsing
iwlwifi: mvm: check return value of rs_rate_from_ucode_rate()
mt76: mt76x2: fix multi-interface beacon configuration
usb: gadget: udc: renesas_usb3: Fix b-device mode for "workaround"
usb: typec: tcpm: Fix APDO PPS order checking to be based on voltage
usbip:vudc: BUG kmalloc-2048 (Not tainted): Poison overwritten
libertas: don't set URB_ZERO_PACKET on IN USB transfer
xen/pvh: don't try to unplug emulated devices
xen/pvh: increase early stack size
xen: make xen_qlock_wait() nestable
xen: fix race in xen_qlock_wait()
xen/balloon: Support xend-based toolstack
xen/blkfront: avoid NULL blkfront_info dereference on device removal
tpm: Restore functionality to xen vtpm driver.
xen-swiotlb: use actually allocated size on check physical continuous
ARM: dts: exynos: Mark 1 GHz CPU OPP as suspend OPP on Exynos5250
ARM: dts: exynos: Convert exynos5250.dtsi to opp-v2 bindings
OPP: Free OPP table properly on performance state irregularities
f2fs: fix to account IO correctly
f2fs: fix to recover cold bit of inode block during POR
f2fs: fix missing up_read
Revert "f2fs: fix to clear PG_checked flag in set_page_dirty()"
cpupower: Fix AMD Family 0x17 msr_pstate size
ALSA: hda: Check the non-cached stream buffers more explicitly
IB/rxe: fix for duplicate request processing and ack psns
dmaengine: dma-jz4780: Return error if not probed from DT
mfd: menelaus: Fix possible race condition and leak
f2fs: fix to flush all dirty inodes recovered in readonly fs
signal: Always deliver the kernel's SIGKILL and SIGSTOP to a pid namespace init
f2fs: report error if quota off error during umount
f2fs: avoid sleeping under spin_lock
scsi: lpfc: Correct race with abort on completion path
scsi: lpfc: Correct soft lockup when running mds diagnostics
uio: ensure class is registered before devices
IB/mlx5: Allow transition of DCI QP to reset
IB/ipoib: Use dev_port to expose network interface port numbers
firmware: coreboot: Unmap ioregion after device population
ASoC: AMD: Fix capture unstable in beginning for some runs
driver/dma/ioat: Call del_timer_sync() without holding prep_lock
Smack: ptrace capability use fixes
usb: chipidea: Prevent unbalanced IRQ disable
crypto: caam - fix implicit casts in endianness helpers
PCI: dwc: pci-dra7xx: Enable errata i870 for both EP and RC mode
coresight: etb10: Fix handling of perf mode
PCI/MSI: Warn and return error if driver enables MSI/MSI-X twice
f2fs: fix to recover inode's i_flags during POR
f2fs: fix to recover inode's crtime during POR
scsi: qla2xxx: Fix recursive mailbox timeout
xhci: Avoid USB autosuspend when resuming USB2 ports.
nvmem: check the return value of nvmem_add_cells()
PCI: cadence: Correct probe behaviour when failing to get PHY
MD: fix invalid stored role for a disk
ext4: fix argument checking in EXT4_IOC_MOVE_EXT
usb: gadget: udc: atmel: handle at91sam9rl PMC
usb: dwc2: fix a race with external vbus supply
usb: dwc2: fix call to vbus supply exit routine, call it unlocked
irqchip/pdc: Setup all edge interrupts as rising edge at GIC
xprtrdma: Reset credit grant properly after a disconnect
PCI / ACPI: Enable wake automatically for power managed bridges
VMCI: Resource wildcard match fixed
Drivers: hv: vmbus: Use cpumask_var_t for on-stack cpu mask
f2fs: clear PageError on the read path
tpm: suppress transmit cmd error logs when TPM 1.2 is disabled/deactivated
usb: typec: tcpm: Report back negotiated PPS voltage and current
PCI: cadence: Use AXI region 0 to signal interrupts from EP
PCI: mediatek: Fix mtk_pcie_find_port() endpoint/port matching logic
usb: host: ohci-at91: fix request of irq for optional gpio
RDMA/bnxt_re: Fix recursive lock warning in debug kernel
RDMA/bnxt_re: Avoid accessing nq->bar_reg_iomem in failure case
IB/ipoib: Clear IPCB before icmp_send
RDMA/cm: Respect returned status of cm_init_av_by_path
RDMA/core: Do not expose unsupported counters
scsi: megaraid_sas: fix a missing-check bug
KVM: nVMX: Clear reserved bits of #DB exit qualification
UAPI: ndctl: Fix g++-unsupported initialisation in headers
scsi: ufs: Schedule clk gating work on correct queue
scsi: esp_scsi: Track residual for PIO transfers
of: Add missing exports of node name compare functions
md: fix memleak for mempool
MD: Memory leak when flush bio size is zero
f2fs: fix to account IO correctly for cgroup writeback
net: stmmac: dwmac-sun8i: fix OF child-node lookup
cgroup, netclassid: add a preemption point to write_classid
cifs: fix a credits leak for compund commands
thermal: da9062/61: Prevent hardware access during system suspend
thermal: rcar_thermal: Prevent doing work after unbind
libata: Apply NOLPM quirk for SAMSUNG MZ7TD256HAFV-000L9
ath10k: schedule hardware restart if WMI command times out
wil6210: fix RX buffers release and unmap
ixgbevf: VF2VF TCP RSS
ixgbe: disallow IPsec Tx offload when in SR-IOV mode
gpio: brcmstb: allow 0 width GPIO banks
iwlwifi: mvm: fix BAR seq ctrl reporting
libertas_tf: prevent underflow in process_cmdrequest()
rsi: fix memory alignment issue in ARM32 platforms
mt76x2u: run device cleanup routine if resume fails
net: dsa: mv88e6xxx: Fix writing to a PHY page.
net: hns3: Fix for vf vlan delete failed problem
net: hns3: Fix ping exited problem when doing lp selftest
net: hns3: Preserve vlan 0 in hardware table
pinctrl: ssbi-gpio: Fix pm8xxx_pin_config_get() to be compliant
pinctrl: spmi-mpp: Fix pmic_mpp_config_get() to be compliant
perf tests: Fix record+probe_libc_inet_pton.sh without ping's debuginfo
failover: Add missing check to validate 'slave_dev' in net_failover_slave_unregister
bpf/verifier: fix verifier instability
pinctrl: qcom: spmi-mpp: Fix drive strength setting
ACPI / LPSS: Add alternative ACPI HIDs for Cherry Trail DMA controllers
spi: gpio: No MISO does not imply no RX
kprobes: Return error if we fail to reuse kprobe instead of BUG_ON()
arm64: entry: Allow handling of undefined instructions from EL1
block, bfq: correctly charge and reset entity service in all cases
net: phy: phylink: ensure the carrier is off when starting phylink
net: hns3: Set STATE_DOWN bit of hdev state when stopping net
net: hns3: Check hdev state when getting link status
brcmfmac: fix for proper support of 160MHz bandwidth
pinctrl: qcom: spmi-mpp: Fix err handling of pmic_mpp_set_mux
pinctrl: sunxi: fix 'pctrl->functions' allocation in sunxi_pinctrl_build_state
net: hns3: Fix ets validate issue
net: hns3: Add nic state check before calling netif_tx_wake_queue
x86: boot: Fix EFI stub alignment
efi/x86: Call efi_parse_options() from efi_main()
Bluetooth: hci_qca: Remove hdev dereference in qca_close().
Bluetooth: btbcm: Add entry for BCM4335C0 UART bluetooth
net: hns3: Fix for packet buffer setting bug
ice: update fw version check logic
ice: fix changing of ring descriptor size (ethtool -G)
signal: Introduce COMPAT_SIGMINSTKSZ for use in compat_sys_sigaltstack
ath10k: fix tx status flag setting for management frames
nvme: call nvme_complete_rq when nvmf_check_ready fails for mpath I/O
mtd: rawnand: atmel: Fix potential NULL pointer dereference
x86/intel_rdt: Show missing resctrl mount options
cpufreq: dt: Try freeing static OPPs only if we have added them
ACPI / processor: Fix the return value of acpi_processor_ids_walk()
ACPI / PM: LPIT: Register sysfs attributes based on FADT
ACPI/PPTT: Handle architecturally unknown cache types
wlcore: Fix BUG with clear completion on timeout
x86/olpc: Indicate that legacy PC XO-1 platform should not register RTC
iwlwifi: mvm: check for n_profiles validity in EWRD ACPI
iwlwifi: mvm: clear HW_RESTART_REQUESTED when stopping the interface
iwlwifi: pcie: avoid empty free RB queue
mtd: rawnand: denali: set SPARE_AREA_SKIP_BYTES register to 8 if unset
sdhci: acpi: add free_slot callback
mmc: sdhci-pci-o2micro: Add quirk for O2 Micro dev 0x8620 rev 0x01
bcache: Populate writeback_rate_minimum attribute
cpupower: Fix coredump on VMWare
perf strbuf: Match va_{add,copy} with va_end
perf tools: Free 'printk' string in parse_ftrace_printk()
perf tools: Cleanup trace-event-info 'tdata' leak
perf tools: Free temporary 'sys' string in read_event_files()
spi: spi-ep93xx: Use dma_data_direction for ep93xx_spi_dma_{finish,prepare}
lightnvm: pblk: fix race condition on metadata I/O
lightnvm: pblk: fix two sleep-in-atomic-context bugs
lightnvm: pblk: fix race on sysfs line state
hwmon: (pwm-fan) Set fan speed to 0 on suspend
s390/sthyi: Fix machine name validity indication
tun: Consistently configure generic netdev params via rtnetlink
nfp: devlink port split support for 1x100G CXP NIC
hv_netvsc: fix vf serial matching with pci slot info
arm64: cpufeature: ctr: Fix cpu capability check for late CPUs
swim: fix cleanup on setup error
ataflop: fix error handling during setup
netfilter: xt_nat: fix DNAT target for shifted portmap ranges
locking/lockdep: Fix debug_locks off performance problem
net: loopback: clear skb->tstamp before netif_rx()
net: socionext: Reset tx queue in ndo_stop
ARM: dts: exynos: Disable pull control for MAX8997 interrupts on Origen
x86/numa_emulation: Fix uniform-split numa emulation
x86/mm/pat: Disable preemption around __flush_tlb_all()
x86/kvm/nVMX: allow bare VMXON state migration
x86/corruption-check: Fix panic in memory_corruption_check() when boot option without value is provided
x86/xen: Fix boot loader version reported for PVH guests
x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation
ALSA: hda - Fix incorrect clearance of thinkpad_acpi hooks
ALSA: ca0106: Disable IZD on SB0570 DAC to fix audio pops
ALSA: hda: Add 2 more models to the power_save blacklist
ALSA: hda - Add mic quirk for the Lenovo G50-30 (17aa:3905)
ALSA: hda/realtek - Fix the problem of the front MIC on the Lenovo M715
ALSA: hda - Fix headphone pin config for ASUS G751
ALSA: hda - Add quirk for ASUS G751 laptop
parisc: Fix exported address of os_hpmc handler
parisc: Fix map_pages() to not overwrite existing pte entries
parisc: Fix address in HPMC IVA
mailbox: PCC: handle parse error
ipmi: Fix timer race with module unload
kprobes/x86: Use preempt_enable() in optimized_callback()
acpi, nfit: Fix Address Range Scrub completion tracking
ACPICA: AML Parser: fix parse loop to correctly skip erroneous extended opcodes
ACPICA: AML interpreter: add region addresses in global list during initialization
ACPI / OSL: Use 'jiffies' as the time bassis for acpi_os_get_timer()
pcmcia: Implement CLKRUN protocol disabling for Ricoh bridges
dma-mapping: fix panic caused by passing empty cma command line argument
cpufreq: conservative: Take limits changes into account properly
block: make sure writesame bio is aligned with logical block size
block: make sure discard bio is aligned with logical block size
block: setup bounce bio_sets properly
jffs2: free jffs2_sb_info through jffs2_kill_sb()
hwmon: (pmbus) Fix page count auto-detection.
bcache: fix miss key refill->end in writeback
bcache: correct dirty data statistics
bcache: fix ioctl in flash device
bcache: trace missed reading by cache_missed
spi: bcm-qspi: fix calculation of address length
spi: bcm-qspi: switch back to reading flash using smaller chunks
spi: spi-mem: Adjust op len based on message/transfer size limitations
mtd: spi-nor: fsl-quadspi: Don't let -EINVAL on the bus
mtd: spi-nor: intel-spi: Add support for Intel Ice Lake SPI serial flash
mtd: spi-nor: fsl-quadspi: fix read error for flash size larger than 16MB
mtd: maps: gpio-addr-flash: Fix ioremapped size
mtd: rawnand: marvell: fix the IRQ handler complete() condition
gpio: mxs: Get rid of external API call
MIPS: VDSO: Reduce VDSO_RANDOMIZE_SIZE to 64MB for 64bit
bpf: fix partial copy of map_ptr when dst is scalar
Conflicts:
drivers/iommu/arm-smmu.c
Change-Id: Iff6f46fb6932b2a41a7a3df5f2a18f1eddfb9d66
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
commit a36700589b85443e28170be59fa11c8a104130a5 upstream.
While fixing an out of bounds array access in known_siginfo_layout
reported by the kernel test robot it became apparent that the same bug
exists in siginfo_layout and affects copy_siginfo_from_user32.
The straightforward fix that guards against making this mistake
in the future and should keep the code size small is to just take an
unsigned signal number instead of a signed signal number, as I did to
fix known_siginfo_layout.
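The effect of switching to an unsigned signal number can be sketched in
plain C. This is a minimal userspace illustration, not the kernel code:
the table, its values, and every demo_* name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-signal table; the values are placeholders. */
static const int demo_limits[] = { 0, 3, 5, 8 };
#define DEMO_LEN (sizeof(demo_limits) / sizeof(demo_limits[0]))

/* Buggy shape: only the upper bound is checked, so a negative
 * signal number passes and would index before the array. */
bool demo_in_range_signed(int sig)
{
	return sig < (int)DEMO_LEN;
}

/* Fixed shape: the parameter is unsigned, so a negative argument is
 * converted to a very large value and fails the same single check. */
bool demo_in_range_unsigned(unsigned sig)
{
	return sig < DEMO_LEN;
}

/* Bounds-checked lookup built on the unsigned check; -1 means "unknown". */
int demo_lookup(unsigned sig)
{
	return demo_in_range_unsigned(sig) ? demo_limits[sig] : -1;
}

/* Usage: -1 passes the signed check but not the unsigned one. */
void demo_range_usage(void)
{
	assert(demo_in_range_signed(-1));
	assert(!demo_in_range_unsigned((unsigned)-1));
	assert(demo_lookup(2) == 5);
	assert(demo_lookup(4) == -1);
}
```

One comparison covers both bounds once the type is unsigned, which is
why this kind of fix does not grow the code.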
Cc: stable@vger.kernel.org
Fixes: cc731525f2 ("signal: Remove kernel interal si_code magic")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3597dfe01d12f570bc739da67f857fd222a3ea66 ]
Instead of playing whack-a-mole and changing SEND_SIG_PRIV to
SEND_SIG_FORCED throughout the kernel to ensure a pid namespace init
gets signals sent by the kernel, stop allowing a pid namespace init to
ignore SIGKILL or SIGSTOP sent by the kernel. A pid namespace init is
only supposed to be able to ignore signals sent from itself and
children with SIG_DFL.
Fixes: 921cf9f630 ("signals: protect cinit from unblocked SIG_DFL signals")
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 22839869f21ab3850fbbac9b425ccc4c0023926f ]
The sigaltstack(2) system call fails with -ENOMEM if the new alternative
signal stack is found to be smaller than SIGMINSTKSZ. On architectures
such as arm64, where the native value for SIGMINSTKSZ is larger than
the compat value, this can result in an unexpected error being reported
to a compat task. See, for example:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=904385
This patch fixes the problem by extending do_sigaltstack to take the
minimum signal stack size as an additional parameter, allowing the
native and compat system call entry code to pass in their respective
values. COMPAT_SIGMINSTKSZ is just defined as SIGMINSTKSZ if it has not
been defined by the architecture.
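The shape of the change can be sketched as follows. This is a userspace
model only: the demo_* names are hypothetical and the two minimum sizes
are illustrative values chosen for the example, not taken from this
commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative minimum stack sizes (assumptions for this sketch). */
#define DEMO_NATIVE_MINSTK 5120u
#define DEMO_COMPAT_MINSTK 2048u

/* Shared helper: the caller supplies the minimum that applies to it,
 * instead of the helper hard-coding the native value. */
int demo_do_sigaltstack(size_t ss_size, size_t min_ss_size)
{
	if (ss_size < min_ss_size)
		return -ENOMEM;
	return 0;
}

/* Native entry point passes the native minimum. */
int demo_native_entry(size_t ss_size)
{
	return demo_do_sigaltstack(ss_size, DEMO_NATIVE_MINSTK);
}

/* Compat entry point passes the (smaller) compat minimum. */
int demo_compat_entry(size_t ss_size)
{
	return demo_do_sigaltstack(ss_size, DEMO_COMPAT_MINSTK);
}

/* Usage: a 4096-byte stack is too small natively but fine for compat. */
void demo_sigaltstack_usage(void)
{
	assert(demo_native_entry(4096) == -ENOMEM);
	assert(demo_compat_entry(4096) == 0);
	assert(demo_compat_entry(1024) == -ENOMEM);
}
```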
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Reported-by: Steve McIntyre <steve.mcintyre@arm.com>
Tested-by: Steve McIntyre <93sam@debian.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Free the pages in parallel for a task that receives SIGKILL, using the
oom_reaper. Freeing the pages this way helps return them to the buddy
system well in advance.
This reaps the process that received SIGKILL through either sys_kill
from userspace or kill_pid from the kernel, provided that the sending
process has the CAP_KILL capability.
Also, a sysctl interface, reap_mem_on_sigkill, is added to turn this
feature on or off.
Change-Id: I21adb95de5e380a80d7eb0b87d9b5b553f52e28a
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
[swatsrid@codeaurora.org: Fix merge conflicts]
Signed-off-by: Swathi Sridhar <swatsrid@codeaurora.org>
Merge more updates from Andrew Morton:
- the rest of MM
- procfs updates
- various misc things
- more y2038 fixes
- get_maintainer updates
- lib/ updates
- checkpatch updates
- various epoll updates
- autofs updates
- hfsplus
- some reiserfs work
- fatfs updates
- signal.c cleanups
- ipc/ updates
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (166 commits)
ipc/util.c: update return value of ipc_getref from int to bool
ipc/util.c: further variable name cleanups
ipc: simplify ipc initialization
ipc: get rid of ids->tables_initialized hack
lib/rhashtable: guarantee initial hashtable allocation
lib/rhashtable: simplify bucket_table_alloc()
ipc: drop ipc_lock()
ipc/util.c: correct comment in ipc_obtain_object_check
ipc: rename ipcctl_pre_down_nolock()
ipc/util.c: use ipc_rcu_putref() for failues in ipc_addid()
ipc: reorganize initialization of kern_ipc_perm.seq
ipc: compute kern_ipc_perm.id under the ipc lock
init/Kconfig: remove EXPERT from CHECKPOINT_RESTORE
fs/sysv/inode.c: use ktime_get_real_seconds() for superblock stamp
adfs: use timespec64 for time conversion
kernel/sysctl.c: fix typos in comments
drivers/rapidio/devices/rio_mport_cdev.c: remove redundant pointer md
fork: don't copy inconsistent signal handler state to child
signal: make get_signal() return bool
signal: make sigkill_pending() return bool
...
Wen Yang <wen.yang99@zte.com.cn> and majiang <ma.jiang@zte.com.cn>
report that a periodic signal received during fork can cause fork to
continually restart preventing an application from making progress.
The code was being overly pessimistic. Fork needs to guarantee that a
signal sent to multiple processes is logically delivered before the
fork and just to the forking process or logically delivered after the
fork to both the forking process and its newly spawned child. For
signals like periodic timers that are always delivered to a single
process fork can safely complete and let them appear to be logically
delivered after the fork().
While examining this issue I also discovered that fork today will miss
signals delivered to multiple processes during the fork and handled by
another thread. Similarly the current code will also miss blocked
signals that are delivered to multiple processes, as those signals will
not appear pending during fork.
Add a list of each thread that is currently forking, and keep on that
list a signal set that records all of the signals sent to multiple
processes. When fork completes, initialize the new process's
shared_pending signal set with it. The calculate_sigpending function
will see those signals and set TIF_SIGPENDING, causing the new task to
take the slow path to userspace to handle those signals, making it
appear as if those signals were received immediately after the fork.
It is not possible to send real time signals to multiple processes and
exceptions don't go to multiple processes, which means that there are
no signals sent to multiple processes that require siginfo. This
means it is safe to not bother collecting siginfo on signals sent
during fork.
The sigaction of a child of fork is initially the same as the
sigaction of the parent process. So a signal the parent ignores the
child will also initially ignore. Therefore it is safe to ignore
signals sent to multiple processes and ignored by the forking process.
Signals sent to only a single process or only a single thread and delivered
during fork are treated as if they are received after the fork, and generally
not dealt with. They won't cause any problems.
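The bookkeeping described above can be modeled with a plain bitmask.
This is a simplified userspace sketch, not the kernel implementation:
the demo_* types and functions are hypothetical, locking is omitted, and
the signal numbers in the usage function are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Bit (sig - 1) represents signal number sig, for sig in 1..64. */
typedef uint64_t demo_sigset_t;

struct demo_forker {
	demo_sigset_t ignored;  /* signals the forking process ignores */
	demo_sigset_t delayed;  /* multi-process signals seen during fork */
};

static demo_sigset_t demo_sigmask(int sig)
{
	return (demo_sigset_t)1 << (sig - 1);
}

/* Called when a signal aimed at multiple processes arrives mid-fork.
 * Ignored signals are dropped: the child would ignore them as well. */
void demo_record_multiprocess(struct demo_forker *f, int sig)
{
	if (f->ignored & demo_sigmask(sig))
		return;
	f->delayed |= demo_sigmask(sig);
}

/* At the end of fork, the recorded set seeds the child's shared
 * pending set, so the signals appear to arrive right after the fork. */
demo_sigset_t demo_finish_fork(struct demo_forker *f)
{
	demo_sigset_t child_shared_pending = f->delayed;

	f->delayed = 0;
	return child_shared_pending;
}

/* Usage: signal 15 is recorded, ignored signal 17 is dropped. */
void demo_fork_usage(void)
{
	struct demo_forker f = { .ignored = demo_sigmask(17), .delayed = 0 };

	demo_record_multiprocess(&f, 15);
	demo_record_multiprocess(&f, 17);
	assert(demo_finish_fork(&f) == demo_sigmask(15));
	assert(f.delayed == 0);
}
```

No siginfo is stored in this model, matching the reasoning above that
signals sent to multiple processes never need it.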
V2: Added removal from the multiprocess list on failure.
V3: Use -ERESTARTNOINTR directly
V4: - Don't queue both SIGCONT and SIGSTOP
- Initialize signal_struct.multiprocess in init_task
- Move setting of shared_pending to before the new task
is visible to signals. This prevents signals from coming
in before shared_pending.signal is set to delayed.signal
and being lost.
V5: - rework list add and delete to account for idle threads
v6: - Use sigdelsetmask when removing stop signals
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200447
Reported-by: Wen Yang <wen.yang99@zte.com.cn> and
Reported-by: majiang <ma.jiang@zte.com.cn>
Fixes: 4a2c7a7837 ("[PATCH] make fork() atomic wrt pgrp/session signals")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
There are only two signals that are delivered to every member of a
signal group: SIGSTOP and SIGKILL. Signal delivery requires every
signal appear to be delivered either before or after a clone syscall.
SIGKILL terminates the clone so does not need to be considered. Which
leaves only SIGSTOP that needs to be considered when creating new
threads.
Today in the event of a group stop TIF_SIGPENDING will get set and the
fork will restart ensuring the fork syscall participates in the group
stop.
A fork (especially of a process with a lot of memory) is one of the
most expensive system calls, so we really only want to restart a fork
when necessary.
It is easy to check whether a SIGSTOP is ongoing and have the new
thread join it immediately after the clone completes, making it appear
that the clone completed just before the SIGSTOP.
The calculate_sigpending function will see the bits set in jobctl and
set TIF_SIGPENDING to ensure the new task takes the slow path to userspace.
V2: The call to task_join_group_stop was moved before the new task is
added to the thread group list. This should not matter as
sighand->siglock is held over both the addition of the threads,
the call to task_join_group_stop and do_signal_stop. But the change
is trivial and it is one less thing to worry about when reading
the code.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Add a function calculate_sigpending to test to see if any signals are
pending for a new task immediately following fork. Signals have to
happen either before or after fork. Today our practice is to push
all of the signals to before the fork, but that has the downside that
frequent or periodic signals can make fork take much much longer than
normal or prevent fork from completing entirely.
So we need to move the signals that we can to after the fork to prevent that.
This updates the code to set TIF_SIGPENDING on a new task if there
are signals or other activities that have moved so that they appear
to happen after the fork.
As the code today restarts if it sees any such activity this won't
immediately have an effect, as there will be no reason for it
to set TIF_SIGPENDING immediately after the fork.
Adding calculate_sigpending means the code in fork can safely be
changed to not always restart if a signal is pending.
The new calculate_sigpending function sets sigpending if there
are pending bits in jobctl, pending signals, the freezer needs
to freeze the new task, or the live kernel patching framework
needs the new thread to take the slow path to userspace.
I have verified that setting TIF_SIGPENDING does make a new process
take the slow path to userspace before it executes its first userspace
instruction.
I have looked at the callers of signal_wake_up and the code paths
setting TIF_SIGPENDING and I don't see anything else that needs to be
handled. The code probably doesn't need to set TIF_SIGPENDING for the
kernel live patching as it uses a separate thread flag as well. But
at this point it seems safer to reuse the recalc_sigpending logic and get
the kernel live patching folks to sort out their story later.
V2: I have moved the test into schedule_tail where siglock can
be grabbed and recalc_sigpending can be reused directly.
Further as the last action of setting up a new task this
guarantees that TIF_SIGPENDING will be properly set in the
new process.
The helper calculate_sigpending takes the siglock and
unconditionally sets TIF_SIGPENDING and lets recalc_sigpending
clear TIF_SIGPENDING if it is unnecessary. This allows reusing
the existing code and keeps maintenance of the conditions simple.
Oleg Nesterov <oleg@redhat.com> suggested the movement
and pointed out the need to take siglock if this code
was going to be called while the new task is discoverable.
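The "set unconditionally, then let recalc clear" pattern can be
sketched in userspace C. This is a loose model under stated
assumptions, not the kernel code: struct demo_task and the demo_*
functions are hypothetical, the siglock is only indicated by comments,
and the live patching condition is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the task state that matters here. */
struct demo_task {
	bool sigpending;          /* models TIF_SIGPENDING */
	unsigned long pending;    /* pending signal bits */
	unsigned long blocked;    /* blocked signal bits */
	unsigned long jobctl;     /* pending job control bits */
	bool freezing;            /* freezer wants this task */
};

/* Models recalc: set the flag when there is signal work, leave it
 * alone when the freezer needs the task, otherwise clear it. */
void demo_recalc_sigpending(struct demo_task *t)
{
	if ((t->pending & ~t->blocked) || t->jobctl) {
		t->sigpending = true;
		return;
	}
	if (!t->freezing)
		t->sigpending = false;
}

/* Models calculate_sigpending: start pessimistic, then let the
 * existing recalc logic clear the flag when it is not needed. */
void demo_calculate_sigpending(struct demo_task *t)
{
	/* the real helper would take siglock here */
	t->sigpending = true;
	demo_recalc_sigpending(t);
	/* ...and release it here */
}

/* Usage: an idle task ends up clear, a freezing task stays flagged. */
void demo_sigpending_usage(void)
{
	struct demo_task idle = { 0 };
	struct demo_task frozen = { .freezing = true };
	struct demo_task blocked_only = { .pending = 0x4, .blocked = 0x4 };

	demo_calculate_sigpending(&idle);
	demo_calculate_sigpending(&frozen);
	demo_calculate_sigpending(&blocked_only);
	assert(!idle.sigpending);
	assert(frozen.sigpending);
	assert(!blocked_only.sigpending);
}
```

Starting from the set state means that conditions recalc merely
tolerates, such as the freezer case in this model, still send the new
task down the slow path.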
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
This is the bottom and by pushing this down it simplifies the callers
and otherwise leaves things as is. This is in preparation for allowing
fork to implement better handling of signals sent to groups of processes.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
This information is already available in the callers and by pushing it
down it makes the code a little clearer, and allows implementing
better handling of signals sent to a group of processes in fork.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
This information is already available in the callers and by pushing it
down it makes the code a little clearer, and allows better group
signal behavior in fork.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
This passes the information we already have at the call site into
do_send_sig_info, ultimately allowing for better handling of signals
sent to a group of processes during fork.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
This passes the information we already have at the call site
into group_send_sig_info, ultimately allowing better handling of
signals sent to a group of processes.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Make the code more maintainable by performing more of the signal
related work in send_sigqueue.
A quick inspection of do_timer_create will show that this code path
does not look up a thread group by a thread's pid, making it safe
to find the task pointed to by it_pid with "pid_task(it_pid, type)".
This supports the changes needed in fork to tell if a signal was sent
to a single process or a group of processes.
Having the pid to task transition in signal.c will also make it easier
to sort out races with de_thread and the thread group leader
exiting when it comes time to address that.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Commit a841796f11 ("signal: align __lock_task_sighand() irq disabling and
RCU") introduced a rcu read side critical section with interrupts
disabled. The changelog suggested that a better long-term fix would be "to
make rt_mutex_unlock() disable irqs when acquiring the rt_mutex structure's
->wait_lock".
This long-term fix has been made in commit b4abf91047 ("rtmutex: Make
wait_lock irq safe") for a different reason.
Therefore revert commit a841796f11 ("signal: align
__lock_task_sighand() irq disabling and RCU") as the interrupt disable
dance is no longer required.
The change was tested on the base of b4abf91047 ("rtmutex: Make wait_lock
irq safe") with a four hour run of rcutorture scenario TREE03 with lockdep
enabled as suggested by Paul McKenney.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: bigeasy@linutronix.de
Link: https://lkml.kernel.org/r/20180525090507.22248-3-anna-maria@linutronix.de
Pull siginfo updates from Eric Biederman:
"This set of changes close the known issues with setting si_code to an
invalid value, and with not fully initializing struct siginfo. There
remains work to do on nds32, arc, unicore32, powerpc, arm, arm64, ia64
and x86 to get the code that generates siginfo into a simpler and more
maintainable state. Most of that work involves refactoring the signal
handling code and thus careful code review.
Also not included is the work to shrink the in kernel version of
struct siginfo. That depends on getting the number of places that
directly manipulate struct siginfo under control, as it requires the
introduction of struct kernel_siginfo for the in kernel things.
Overall this set of changes looks like it is making good progress, and
with a little luck I will be wrapping up the siginfo work next
development cycle"
* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits)
signal/sh: Stop gcc warning about an impossible case in do_divide_error
signal/mips: Report FPE_FLTUNK for undiagnosed floating point exceptions
signal/um: More carefully relay signals in relay_signal.
signal: Extend siginfo_layout with SIL_FAULT_{MCEERR|BNDERR|PKUERR}
signal: Remove unncessary #ifdef SEGV_PKUERR in 32bit compat code
signal/signalfd: Add support for SIGSYS
signal/signalfd: Remove __put_user from signalfd_copyinfo
signal/xtensa: Use force_sig_fault where appropriate
signal/xtensa: Consistenly use SIGBUS in do_unaligned_user
signal/um: Use force_sig_fault where appropriate
signal/sparc: Use force_sig_fault where appropriate
signal/sparc: Use send_sig_fault where appropriate
signal/sh: Use force_sig_fault where appropriate
signal/s390: Use force_sig_fault where appropriate
signal/riscv: Replace do_trap_siginfo with force_sig_fault
signal/riscv: Use force_sig_fault where appropriate
signal/parisc: Use force_sig_fault where appropriate
signal/parisc: Use force_sig_mceerr where appropriate
signal/openrisc: Use force_sig_fault where appropriate
signal/nios2: Use force_sig_fault where appropriate
...
Gaurav reported a perceived problem with TASK_PARKED, which turned out
to be a broken wait-loop pattern in __kthread_parkme(), but the
reported issue can (and does) in fact happen for states that do not do
condition based sleeps.
When the 'current->state = TASK_RUNNING' store of a previous
(concurrent) try_to_wake_up() collides with the setting of a 'special'
sleep state, we can lose the sleep state.
Normal condition based wait-loops are immune to this problem, but
sleep states that are not condition based are subject to it.
There already is a fix for TASK_DEAD. Abstract that and also apply it
to TASK_STOPPED and TASK_TRACED, both of which are also without
condition based wait-loop.
Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Update the siginfo_layout function and enum siginfo_layout to represent
all of the possible field layouts of struct siginfo.
This allows the uses of siginfo_layout in um and arm64 where they are testing
for SIL_FAULT to be more accurate as this rules out the other cases.
Further this allows the switch statements on siginfo_layout to be simpler
if perhaps a little more wordy. Making it easier to understand what is
actually going on.
As SIL_FAULT_BNDERR and SIL_FAULT_PKUERR are never expected to appear
in signalfd just treat them as SIL_FAULT. To include them would take
20 extra bytes and pretty much fill up what is left of
signalfd_siginfo.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
The only architecture that does not support SEGV_PKUERR is ia64 and
ia64 has not had 32bit support since some time in 2008. Therefore
copy_siginfo_to_user32 and copy_siginfo_from_user32 do not need to
include support for a missing SEGV_PKUERR.
Compile tested on ia64.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
With the recent architecture cleanups these si_codes are always
defined so there is no need to test for them.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
After the last round of cleanups to siginfo.h SEGV_BNDERR is defined
on all architectures so testing to see if it is defined is unnecessary.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
After more experience with the cases where the si_code of 0
is used both as a signal specific si_code, and as SI_USER it appears
that no one cares about the signal specific si_code case and the
good solution is to just fix the architectures by using
a different si_code.
In none of the conversations has anyone even suggested that
anything depends on the signal specific redefinition of SI_USER.
There are at least test cases that care when si_code as 0 does
not work as SI_USER.
So make things simple and keep the generic code from introducing
problems by removing the special casing of TRAP_FIXME and FPE_FIXME.
This will ensure the generic case of sending a signal with
kill will always set SI_USER and work.
The architecture specific, and signal specific overloads that
set si_code to 0 will now have problems with signalfd and
the 32bit compat versions of siginfo copying. At least
until they are fixed.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Now that every instance of struct siginfo is initialized it is no
longer necessary to copy struct siginfo piece by piece to userspace
but instead the entire structure can be copied.
As well as making the code simpler and more efficient this means that
copy_siginfo_to_user no longer cares which union member of struct
siginfo is in use.
In practice this means that all 32bit architectures that define
FPE_FIXME will properly send SI_USER when kill(SIGFPE) is sent,
while still performing their historic architectural brokenness when 0
is used as a floating point signal. This matches the current behavior
of 64bit architectures that define FPE_FIXME, which get lucky: an
overloaded SI_USER has continued to work through copy_siginfo_to_user
because the 8 byte si_addr occupies the same bytes in struct siginfo
as the 4 byte si_pid and the 4 byte si_uid.
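The overlap that lets the 64bit case "get lucky" can be shown with a
small union. This is an illustrative model, not the real struct
siginfo: the demo_* types are hypothetical and a uint64_t stands in for
an 8 byte si_addr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models the kill-style fields: 4 byte pid followed by 4 byte uid. */
struct demo_kill {
	int32_t pid;
	uint32_t uid;
};

/* Models the fault-style fields: an 8 byte address. */
struct demo_fault {
	uint64_t addr;
};

/* Both views share the same storage, as union members do. */
union demo_sifields {
	struct demo_kill kill;
	struct demo_fault fault;
};

/* Copies only the fault view, as a layout-aware copy would when it
 * misreads an si_code of 0 as a floating point fault code. */
void demo_copy_fault_only(union demo_sifields *dst,
			  const union demo_sifields *src)
{
	dst->fault.addr = src->fault.addr;
}

/* Usage: pid and uid survive a copy that only knows about addr. */
void demo_overlay_usage(void)
{
	union demo_sifields src = { .kill = { .pid = 1234, .uid = 1000 } };
	union demo_sifields dst = { .fault = { .addr = 0 } };

	assert(sizeof(struct demo_kill) == sizeof(struct demo_fault));
	assert(offsetof(union demo_sifields, kill) ==
	       offsetof(union demo_sifields, fault));
	demo_copy_fault_only(&dst, &src);
	assert(dst.kill.pid == 1234 && dst.kill.uid == 1000);
}
```

With a 4 byte address the same copy would carry only the pid, which is
the situation the whole-structure copy avoids on 32bit.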
Problematic architectures still need to fix their ABI so that signalfd
and 32bit compat code will work properly.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Pull general security layer updates from James Morris:
- Convert security hooks from list to hlist, a nice cleanup, saving
about 50% of space, from Sargun Dhillon.
- Only pass the cred, not the secid, to kill_pid_info_as_cred and
security_task_kill (as the secid can be determined from the cred),
from Stephen Smalley.
- Close a potential race in kernel_read_file(), by making the file
unwritable before calling the LSM check (vs after), from Kees Cook.
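The space saving from the list to hlist conversion presumably comes
from the list heads: a doubly linked list head holds two pointers while
an hlist head holds one. A minimal sketch, assuming one head per hook;
the demo_* names and the hook count are hypothetical.

```c
#include <assert.h>

/* Same shapes as the kernel's list_head, hlist_node and hlist_head. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

struct demo_hlist_node {
	struct demo_hlist_node *next, **pprev;
};

struct demo_hlist_head {
	struct demo_hlist_node *first;
};

/* Arbitrary number of hooks, chosen only for the example. */
#define DEMO_NR_HOOKS 200

/* One head per hook, before and after the conversion. */
struct demo_heads_list {
	struct demo_list_head heads[DEMO_NR_HOOKS];
};

struct demo_heads_hlist {
	struct demo_hlist_head heads[DEMO_NR_HOOKS];
};

/* Usage: the hlist table of heads is half the size of the list one. */
void demo_hlist_usage(void)
{
	assert(sizeof(struct demo_heads_hlist) * 2 ==
	       sizeof(struct demo_heads_list));
}
```

The nodes stay at two pointers each in both schemes, so only the heads
shrink.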
* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
security: convert security hooks to use hlist
exec: Set file unwritable before LSM check
usb, signal, security: only pass the cred, not the secid, to kill_pid_info_as_cred and security_task_kill