kernel_xiaomi_sm8250

Author	SHA1	Message	Date
Igor Druzhinin	2bc2a90a08	xen/pci: reserve MCFG areas earlier [ Upstream commit a4098bc6eed5e31e0391bcc068e61804c98138df ] If MCFG area is not reserved in E820, Xen by default will defer its usage until Dom0 registers it explicitly after ACPI parser recognizes it as a reserved resource in DSDT. Having it reserved in E820 is not mandatory according to "PCI Firmware Specification, rev 3.2" (par. 4.1.2) and firmware is free to keep a hole in E820 in that place. Xen doesn't know what exactly is inside this hole since it lacks full ACPI view of the platform therefore it's potentially harmful to access MCFG region without additional checks as some machines are known to provide inconsistent information on the size of the region. Now xen_mcfg_late() runs after acpi_init() which is too late as some basic PCI enumeration starts exactly there as well. Trying to register a device prior to MCFG reservation causes multiple problems with PCIe extended capability initializations in Xen (e.g. SR-IOV VF BAR sizing). There are no convenient hooks for us to subscribe to so register MCFG areas earlier upon the first invocation of xen_add_device(). It should be safe to do once since all the boot time buses must have their MCFG areas in MCFG table already and we don't support PCI bus hot-plug. Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-10-11 18:21:13 +02:00
Chengguang Xu	18dd2b05f3	9p: avoid attaching writeback_fid on mmap with type PRIVATE [ Upstream commit c87a37ebd40b889178664c2c09cc187334146292 ] Currently on mmap cache policy, we always attach writeback_fid whether mmap type is SHARED or PRIVATE. However, in the use case of kata-container which combines 9p(Guest OS) with overlayfs(Host OS), this behavior will trigger overlayfs' copy-up when excute command inside container. Link: http://lkml.kernel.org/r/20190820100325.10313-1-cgxu519@zoho.com.cn Signed-off-by: Chengguang Xu <cgxu519@zoho.com.cn> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-10-11 18:21:13 +02:00
Lu Shuaibing	07f3596ce3	9p: Transport error uninitialized [ Upstream commit 0ce772fe79b68f83df40f07f28207b292785c677 ] The p9_tag_alloc() does not initialize the transport error t_err field. The struct p9_req_t *req is allocated and stored in a struct p9_client variable. The field t_err is never initialized before p9_conn_cancel() checks its value. KUMSAN(KernelUninitializedMemorySantizer, a new error detection tool) reports this bug. ================================================================== BUG: KUMSAN: use of uninitialized memory in p9_conn_cancel+0x2d9/0x3b0 Read of size 4 at addr ffff88805f9b600c by task kworker/1:2/1216 CPU: 1 PID: 1216 Comm: kworker/1:2 Not tainted 5.2.0-rc4+ #28 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Workqueue: events p9_write_work Call Trace: dump_stack+0x75/0xae __kumsan_report+0x17c/0x3e6 kumsan_report+0xe/0x20 p9_conn_cancel+0x2d9/0x3b0 p9_write_work+0x183/0x4a0 process_one_work+0x4d1/0x8c0 worker_thread+0x6e/0x780 kthread+0x1ca/0x1f0 ret_from_fork+0x35/0x40 Allocated by task 1979: save_stack+0x19/0x80 __kumsan_kmalloc.constprop.3+0xbc/0x120 kmem_cache_alloc+0xa7/0x170 p9_client_prepare_req.part.9+0x3b/0x380 p9_client_rpc+0x15e/0x880 p9_client_create+0x3d0/0xac0 v9fs_session_init+0x192/0xc80 v9fs_mount+0x67/0x470 legacy_get_tree+0x70/0xd0 vfs_get_tree+0x4a/0x1c0 do_mount+0xba9/0xf90 ksys_mount+0xa8/0x120 __x64_sys_mount+0x62/0x70 do_syscall_64+0x6d/0x1e0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 0: (stack is not available) The buggy address belongs to the object at ffff88805f9b6008 which belongs to the cache p9_req_t of size 144 The buggy address is located 4 bytes inside of 144-byte region [ffff88805f9b6008, ffff88805f9b6098) The buggy address belongs to the page: page:ffffea00017e6d80 refcount:1 mapcount:0 mapping:ffff888068b63740 index:0xffff88805f9b7d90 compound_mapcount: 0 flags: 0x100000000010200(slab\|head) raw: 0100000000010200 ffff888068b66450 ffff888068b66450 ffff888068b63740 raw: ffff88805f9b7d90 0000000000100001 00000001ffffffff 0000000000000000 page dumped because: kumsan: bad access detected ================================================================== Link: http://lkml.kernel.org/r/20190613070854.10434-1-shuaibinglu@126.com Signed-off-by: Lu Shuaibing <shuaibinglu@126.com> [dominique.martinet@cea.fr: grouped the added init with the others] Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-10-11 18:21:12 +02:00
Jia-Ju Bai	448deb13ab	fs: nfs: Fix possible null-pointer dereferences in encode_attrs() [ Upstream commit e2751463eaa6f9fec8fea80abbdc62dbc487b3c5 ] In encode_attrs(), there is an if statement on line 1145 to check whether label is NULL: if (label && (attrmask[2] & FATTR4_WORD2_SECURITY_LABEL)) When label is NULL, it is used on lines 1178-1181: p++ = cpu_to_be32(label->lfs); p++ = cpu_to_be32(label->pi); *p++ = cpu_to_be32(label->len); p = xdr_encode_opaque_fixed(p, label->label, label->len); To fix these bugs, label is checked before being used. These bugs are found by a static analysis tool STCheck written by us. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-10-11 18:21:11 +02:00
Sascha Hauer	4753e7a824	ima: fix freeing ongoing ahash_request [ Upstream commit 4ece3125f21b1d42b84896c5646dbf0e878464e1 ] integrity_kernel_read() can fail in which case we forward to call ahash_request_free() on a currently running request. We have to wait for its completion before we can free the request. This was observed by interrupting a "find / -type f -xdev -print0 \| xargs -0 cat 1>/dev/null" with ctrl-c on an IMA enabled filesystem. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-10-11 18:21:11 +02:00
Sascha Hauer	b69c3085fc	ima: always return negative code for error [ Upstream commit f5e1040196dbfe14c77ce3dfe3b7b08d2d961e88 ] integrity_kernel_read() returns the number of bytes read. If this is a short read then this positive value is returned from ima_calc_file_hash_atfm(). Currently this is only indirectly called from ima_calc_file_hash() and this function only tests for the return value being zero or nonzero and also doesn't forward the return value. Nevertheless there's no point in returning a positive value as an error, so translate a short read into -EINVAL. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-10-11 18:21:10 +02:00
Will Deacon	6df3c66de0	arm64: cpufeature: Detect SSBS and advertise to userspace commit d71be2b6c0e19180b5f80a6d42039cc074a693a2 upstream. Armv8.5 introduces a new PSTATE bit known as Speculative Store Bypass Safe (SSBS) which can be used as a mitigation against Spectre variant 4. Additionally, a CPU may provide instructions to manipulate PSTATE.SSBS directly, so that userspace can toggle the SSBS control without trapping to the kernel. This patch probes for the existence of SSBS and advertise the new instructions to userspace if they exist. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:10 +02:00
Johannes Berg	3a0e673305	cfg80211: initialize on-stack chandefs commit f43e5210c739fe76a4b0ed851559d6902f20ceb1 upstream. In a few places we don't properly initialize on-stack chandefs, resulting in EDMG data to be non-zero, which broke things. Additionally, in a few places we rely on the driver to init the data completely, but perhaps we shouldn't as non-EDMG drivers may not initialize the EDMG data, also initialize it there. Cc: stable@vger.kernel.org Fixes: 2a38075cd0be ("nl80211: Add support for EDMG channels") Reported-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Link: https://lore.kernel.org/r/1569239475-I2dcce394ecf873376c386a78f31c2ec8b538fa25@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:09 +02:00
Vasily Gorbik	16c75eb13a	s390/cio: avoid calling strlen on null pointer commit ea298e6ee8b34b3ed4366be7eb799d0650ebe555 upstream. Fix the following kasan finding: BUG: KASAN: global-out-of-bounds in ccwgroup_create_dev+0x850/0x1140 Read of size 1 at addr 0000000000000000 by task systemd-udevd.r/561 CPU: 30 PID: 561 Comm: systemd-udevd.r Tainted: G B Hardware name: IBM 3906 M04 704 (LPAR) Call Trace: ([<0000000231b3db7e>] show_stack+0x14e/0x1a8) [<0000000233826410>] dump_stack+0x1d0/0x218 [<000000023216fac4>] print_address_description+0x64/0x380 [<000000023216f5a8>] __kasan_report+0x138/0x168 [<00000002331b8378>] ccwgroup_create_dev+0x850/0x1140 [<00000002332b618a>] group_store+0x3a/0x50 [<00000002323ac706>] kernfs_fop_write+0x246/0x3b8 [<00000002321d409a>] vfs_write+0x132/0x450 [<00000002321d47da>] ksys_write+0x122/0x208 [<0000000233877102>] system_call+0x2a6/0x2c8 Triggered by: openat(AT_FDCWD, "/sys/bus/ccwgroup/drivers/qeth/group", O_WRONLY\|O_CREAT\|O_TRUNC\|O_CLOEXEC, 0666) = 16 write(16, "0.0.bd00,0.0.bd01,0.0.bd02", 26) = 26 The problem is that __get_next_id in ccwgroup_create_dev might set "buf" buffer pointer to NULL and explicit check for that is required. Cc: stable@vger.kernel.org Reviewed-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:08 +02:00
Johan Hovold	3f41e88f4b	ieee802154: atusb: fix use-after-free at disconnect commit 7fd25e6fc035f4b04b75bca6d7e8daa069603a76 upstream. The disconnect callback was accessing the hardware-descriptor private data after having having freed it. Fixes: `7490b008d1` ("ieee802154: add support for atusb transceiver") Cc: stable <stable@vger.kernel.org> # 4.2 Cc: Alexander Aring <alex.aring@gmail.com> Reported-by: syzbot+f4509a9138a1472e7e80@syzkaller.appspotmail.com Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:07 +02:00
Juergen Gross	975859bb69	xen/xenbus: fix self-deadlock after killing user process commit a8fabb38525c51a094607768bac3ba46b3f4a9d5 upstream. In case a user process using xenbus has open transactions and is killed e.g. via ctrl-C the following cleanup of the allocated resources might result in a deadlock due to trying to end a transaction in the xenbus worker thread: [ 2551.474706] INFO: task xenbus:37 blocked for more than 120 seconds. [ 2551.492215] Tainted: P OE 5.0.0-29-generic #5 [ 2551.510263] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 2551.528585] xenbus D 0 37 2 0x80000080 [ 2551.528590] Call Trace: [ 2551.528603] __schedule+0x2c0/0x870 [ 2551.528606] ? _cond_resched+0x19/0x40 [ 2551.528632] schedule+0x2c/0x70 [ 2551.528637] xs_talkv+0x1ec/0x2b0 [ 2551.528642] ? wait_woken+0x80/0x80 [ 2551.528645] xs_single+0x53/0x80 [ 2551.528648] xenbus_transaction_end+0x3b/0x70 [ 2551.528651] xenbus_file_free+0x5a/0x160 [ 2551.528654] xenbus_dev_queue_reply+0xc4/0x220 [ 2551.528657] xenbus_thread+0x7de/0x880 [ 2551.528660] ? wait_woken+0x80/0x80 [ 2551.528665] kthread+0x121/0x140 [ 2551.528667] ? xb_read+0x1d0/0x1d0 [ 2551.528670] ? kthread_park+0x90/0x90 [ 2551.528673] ret_from_fork+0x35/0x40 Fix this by doing the cleanup via a workqueue instead. Reported-by: James Dingwall <james@dingwall.me.uk> Fixes: `fd8aa9095a` ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") Cc: <stable@vger.kernel.org> # 4.11 Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:06 +02:00
Wanpeng Li	e409b81d9d	Revert "locking/pvqspinlock: Don't wait if vCPU is preempted" commit 89340d0935c9296c7b8222b6eab30e67cb57ab82 upstream. This patch reverts commit `75437bb304` (locking/pvqspinlock: Don't wait if vCPU is preempted). A large performance regression was caused by this commit. on over-subscription scenarios. The test was run on a Xeon Skylake box, 2 sockets, 40 cores, 80 threads, with three VMs of 80 vCPUs each. The score of ebizzy -M is reduced from 13000-14000 records/s to 1700-1800 records/s: Host Guest score vanilla w/o kvm optimizations upstream 1700-1800 records/s vanilla w/o kvm optimizations revert 13000-14000 records/s vanilla w/ kvm optimizations upstream 4500-5000 records/s vanilla w/ kvm optimizations revert 14000-15500 records/s Exit from aggressive wait-early mechanism can result in premature yield and extra scheduling latency. Actually, only 6% of wait_early events are caused by vcpu_is_preempted() being true. However, when one vCPU voluntarily releases its vCPU, all the subsequently waiters in the queue will do the same and the cascading effect leads to bad performance. kvm optimizations: [1] commit d73eb57b80b (KVM: Boost vCPUs that are delivering interrupts) [2] commit 266e85a5ec9 (KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption) Tested-by: loobinliu@tencent.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: loobinliu@tencent.com Cc: stable@vger.kernel.org Fixes: `75437bb304` (locking/pvqspinlock: Don't wait if vCPU is preempted) Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:06 +02:00
Russell King	7ed2867ceb	mmc: sdhci-of-esdhc: set DMA snooping based on DMA coherence commit 121bd08b029e03404c451bb237729cdff76eafed upstream. We must not unconditionally set the DMA snoop bit; if the DMA API is assuming that the device is not DMA coherent, and the device snoops the CPU caches, the device can see stale cache lines brought in by speculative prefetch. This leads to the device seeing stale data, potentially resulting in corrupted data transfers. Commonly, this results in a descriptor fetch error such as: mmc0: ADMA error mmc0: sdhci: ============ SDHCI REGISTER DUMP =========== mmc0: sdhci: Sys addr: 0x00000000 \| Version: 0x00002202 mmc0: sdhci: Blk size: 0x00000008 \| Blk cnt: 0x00000001 mmc0: sdhci: Argument: 0x00000000 \| Trn mode: 0x00000013 mmc0: sdhci: Present: 0x01f50008 \| Host ctl: 0x00000038 mmc0: sdhci: Power: 0x00000003 \| Blk gap: 0x00000000 mmc0: sdhci: Wake-up: 0x00000000 \| Clock: 0x000040d8 mmc0: sdhci: Timeout: 0x00000003 \| Int stat: 0x00000001 mmc0: sdhci: Int enab: 0x037f108f \| Sig enab: 0x037f108b mmc0: sdhci: ACmd stat: 0x00000000 \| Slot int: 0x00002202 mmc0: sdhci: Caps: 0x35fa0000 \| Caps_1: 0x0000af00 mmc0: sdhci: Cmd: 0x0000333a \| Max curr: 0x00000000 mmc0: sdhci: Resp[0]: 0x00000920 \| Resp[1]: 0x001d8a33 mmc0: sdhci: Resp[2]: 0x325b5900 \| Resp[3]: 0x3f400e00 mmc0: sdhci: Host ctl2: 0x00000000 mmc0: sdhci: ADMA Err: 0x00000009 \| ADMA Ptr: 0x000000236d43820c mmc0: sdhci: ============================================ mmc0: error -5 whilst initialising SD card but can lead to other errors, and potentially direct the SDHCI controller to read/write data to other memory locations (e.g. if a valid descriptor is visible to the device in a stale cache line.) Fix this by ensuring that the DMA snoop bit corresponds with the behaviour of the DMA API. Since the driver currently only supports DT, use of_dma_is_coherent(). Note that device_get_dma_attr() can not be used as that risks re-introducing this bug if/when the driver is converted to ACPI. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:05 +02:00
Russell King	4509a19d50	mmc: sdhci: improve ADMA error reporting commit d1c536e3177390da43d99f20143b810c35433d1f upstream. ADMA errors are potentially data corrupting events; although we print the register state, we do not usefully print the ADMA descriptors. Worse than that, we print them by referencing their virtual address which is meaningless when the register state gives us the DMA address of the failing descriptor. Print the ADMA descriptors giving their DMA addresses rather than their virtual addresses, and print them using SDHCI_DUMP() rather than DBG(). We also do not show the correct value of the interrupt status register; the register dump shows the current value, after we have cleared the pending interrupts we are going to service. What is more useful is to print the interrupts that _were_ pending at the time the ADMA error was encountered. Fix that too. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:04 +02:00
Xiaolin Zhang	873f49d6a4	drm/i915/gvt: update vgpu workload head pointer correctly commit 0a3242bdb47713e09cb004a0ba4947d3edf82d8a upstream. when creating a vGPU workload, the guest context head pointer should be updated correctly by comparing with the exsiting workload in the guest worklod queue including the current running context. in some situation, there is a running context A and then received 2 new vGPU workload context B and A. in the new workload context A, it's head pointer should be updated with the running context A's tail. v2: walk through guest workload list in backward way. Cc: stable@vger.kernel.org Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com> Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:03 +02:00
Lyude Paul	198bc7040c	drm/nouveau/kms/nv50-: Don't create MSTMs for eDP connectors commit 698c1aa9f83b618de79e9e5e19a58f70a4a6ae0f upstream. On the ThinkPad P71, we have one eDP connector exposed along with 5 DP connectors, resulting in a total of 11 TMDS encoders. Since the GPU on this system is also capable of MST, we create an additional 4 fake MST encoders for each DP port. Unfortunately, we also do this for the eDP port as well, resulting in: 1 eDP port: +1 TMDS encoder +4 DPMST encoders 5 DP ports: +2 TMDS encoders +4 DPMST encoders *5 ports == 35 encoders Which breaks things, since DRM has a hard coded limit of 32 encoders. So, fix this by not creating MSTMs for any eDP connectors. This brings us down to 31 encoders, although we can do better. This fixes driver probing for nouveau on the ThinkPad P71. Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:03 +02:00
Sean Paul	7a85c86735	drm/msm/dsi: Fix return value check for clk_get_parent commit 5fb9b797d5ccf311ae4aba69e86080d47668b5f7 upstream. clk_get_parent returns an error pointer upon failure, not NULL. So the checks as they exist won't catch a failure. This patch changes the checks and the return values to properly handle an error pointer. Fixes: `c4d8cfe516` ("drm/msm/dsi: add implementation for helper functions") Cc: Sibi Sankar <sibis@codeaurora.org> Cc: Sean Paul <seanpaul@chromium.org> Cc: Rob Clark <robdclark@chromium.org> Cc: <stable@vger.kernel.org> # v4.19+ Signed-off-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:02 +02:00
Tomi Valkeinen	0e45633f49	drm/omap: fix max fclk divider for omap36xx commit e2c4ed148cf3ec8669a1d90dc66966028e5fad70 upstream. The OMAP36xx and AM/DM37x TRMs say that the maximum divider for DSS fclk (in CM_CLKSEL_DSS) is 32. Experimentation shows that this is not correct, and using divider of 32 breaks DSS with a flood or underflows and sync losts. Dividers up to 31 seem to work fine. There is another patch to the DT files to limit the divider correctly, but as the DSS driver also needs to know the maximum divider to be able to iteratively find good rates, we also need to do the fix in the DSS driver. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Adam Ford <aford173@gmail.com> Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20191002122542.8449-1-tomi.valkeinen@ti.com Tested-by: Adam Ford <aford173@gmail.com> Reviewed-by: Jyri Sarha <jsarha@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:01 +02:00
Srikar Dronamraju	90ac402873	perf stat: Fix a segmentation fault when using repeat forever commit 443f2d5ba13d65ccfd879460f77941875159d154 upstream. Observe a segmentation fault when 'perf stat' is asked to repeat forever with the interval option. Without fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.000211692 3,13,89,82,34,157 cycles 10.000380119 1,53,98,52,22,294 cycles 10.040467280 17,16,79,265 cycles Segmentation fault This problem was only observed when we use forever option aka -r 0 and works with limited repeats. Calling print_counter with ts being set to NULL, is not a correct option when interval is set. Hence avoid print_counter(NULL,..) if interval is set. With fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.019866622 3,15,14,43,08,697 cycles 10.039865756 3,15,16,31,95,261 cycles 10.059950628 1,26,05,47,158 cycles 5.009902655 3,14,52,62,33,932 cycles 10.019880228 3,14,52,22,89,154 cycles 10.030543876 66,90,18,333 cycles 5.009848281 3,14,51,98,25,437 cycles 10.029854402 3,15,14,93,04,918 cycles 5.009834177 3,14,51,95,92,316 cycles Committer notes: Did the 'git bisect' to find the cset introducing the problem to add the Fixes tag below, and at that time the problem reproduced as: (gdb) run stat -r0 -I500 sleep 1 <SNIP> Program received signal SIGSEGV, Segmentation fault. print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 866 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, csv_sep); (gdb) bt #0 print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 #1 0x000000000041860a in print_counters (ts=ts@entry=0x0, argc=argc@entry=2, argv=argv@entry=0x7fffffffd640) at builtin-stat.c:938 #2 0x0000000000419a7f in cmd_stat (argc=2, argv=0x7fffffffd640, prefix=<optimized out>) at builtin-stat.c:1411 #3 0x000000000045c65a in run_builtin (p=p@entry=0x6291b8 <commands+216>, argc=argc@entry=5, argv=argv@entry=0x7fffffffd640) at perf.c:370 #4 0x000000000045c893 in handle_internal_command (argc=5, argv=0x7fffffffd640) at perf.c:429 #5 0x000000000045c8f1 in run_argv (argcp=argcp@entry=0x7fffffffd4ac, argv=argv@entry=0x7fffffffd4a0) at perf.c:473 #6 0x000000000045cac9 in main (argc=<optimized out>, argv=<optimized out>) at perf.c:588 (gdb) Mostly the same as just before this patch: Program received signal SIGSEGV, Segmentation fault. 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 964 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, config->csv_sep); (gdb) bt #0 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 #1 0x0000000000588047 in perf_evlist__print_counters (evlist=0xbc9b90, config=0xa1f2a0 <stat_config>, _target=0xa1f0c0 <target>, ts=0x0, argc=2, argv=0x7fffffffd670) at util/stat-display.c:1172 #2 0x000000000045390f in print_counters (ts=0x0, argc=2, argv=0x7fffffffd670) at builtin-stat.c:656 #3 0x0000000000456bb5 in cmd_stat (argc=2, argv=0x7fffffffd670) at builtin-stat.c:1960 #4 0x00000000004dd2e0 in run_builtin (p=0xa30e00 <commands+288>, argc=5, argv=0x7fffffffd670) at perf.c:310 #5 0x00000000004dd54d in handle_internal_command (argc=5, argv=0x7fffffffd670) at perf.c:362 #6 0x00000000004dd694 in run_argv (argcp=0x7fffffffd4cc, argv=0x7fffffffd4c0) at perf.c:406 #7 0x00000000004dda11 in main (argc=5, argv=0x7fffffffd670) at perf.c:531 (gdb) Fixes: `d4f63a4741` ("perf stat: Introduce print_counters function") Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # v4.2+ Link: http://lore.kernel.org/lkml/20190904094738.9558-3-srikar@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:01 +02:00
Rasmus Villemoes	22f28afd3d	watchdog: imx2_wdt: fix min() calculation in imx2_wdt_set_timeout commit 144783a80cd2cbc45c6ce17db649140b65f203dd upstream. Converting from ms to s requires dividing by 1000, not multiplying. So this is currently taking the smaller of new_timeout and 1.28e8, i.e. effectively new_timeout. The driver knows what it set max_hw_heartbeat_ms to, so use that value instead of doing a division at run-time. FWIW, this can easily be tested by booting into a busybox shell and doing "watchdog -t 5 -T 130 /dev/watchdog" - without this patch, the watchdog fires after 130&127 == 2 seconds. Fixes: b07e228eee69 "watchdog: imx2_wdt: Fix set_timeout for big timeout values" Cc: stable@vger.kernel.org # 5.2 plus anything the above got backported to Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20190812131356.23039-1-linux@rasmusvillemoes.dk Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:00 +02:00
Sumit Saxena	e7cf8cc79f	PCI: Restore Resizable BAR size bits correctly for 1MB BARs commit d2182b2d4b71ff0549a07f414d921525fade707b upstream. In a Resizable BAR Control Register, bits 13:8 control the size of the BAR. The encoded values of these bits are as follows (see PCIe r5.0, sec 7.8.6.3): Value BAR size 0 1 MB (2^20 bytes) 1 2 MB (2^21 bytes) 2 4 MB (2^22 bytes) ... 43 8 EB (2^63 bytes) Previously we incorrectly set the BAR size bits for a 1 MB BAR to 0x1f instead of 0, so devices that support that size, e.g., new megaraid_sas and mpt3sas adapters, fail to initialize during resume from S3 sleep. Correctly calculate the BAR size bits for Resizable BAR control registers. Link: https://lore.kernel.org/r/20190725192552.24295-1-sumit.saxena@broadcom.com Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203939 Fixes: `d3252ace0b` ("PCI: Restore resized BAR state on resume") Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:21:00 +02:00
Jon Derrick	956ce989c4	PCI: vmd: Fix shadow offsets to reflect spec changes commit a1a30170138c9c5157bd514ccd4d76b47060f29b upstream. The shadow offset scratchpad was moved to 0x2000-0x2010. Update the location to get the correct shadow offset. Fixes: `6788958e4f` ("PCI: vmd: Assign membar addresses from shadow registers") Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:59 +02:00
Li RongQing	06f250215b	timer: Read jiffies once when forwarding base clk commit e430d802d6a3aaf61bd3ed03d9404888a29b9bf9 upstream. The timer delayed for more than 3 seconds warning was triggered during testing. Workqueue: events_unbound sched_tick_remote RIP: 0010:sched_tick_remote+0xee/0x100 ... Call Trace: process_one_work+0x18c/0x3a0 worker_thread+0x30/0x380 kthread+0x113/0x130 ret_from_fork+0x22/0x40 The reason is that the code in collect_expired_timers() uses jiffies unprotected: if (next_event > jiffies) base->clk = jiffies; As the compiler is allowed to reload the value base->clk can advance between the check and the store and in the worst case advance farther than next event. That causes the timer expiry to be delayed until the wheel pointer wraps around. Convert the code to use READ_ONCE() Fixes: `236968383c` ("timers: Optimize collect_expired_timers() for NOHZ") Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Liang ZhiCheng <liangzhicheng@baidu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1568894687-14499-1-git-send-email-lirongqing@baidu.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:59 +02:00
Kees Cook	12c6c4a50f	usercopy: Avoid HIGHMEM pfn warning commit 314eed30ede02fa925990f535652254b5bad6b65 upstream. When running on a system with >512MB RAM with a 32-bit kernel built with: CONFIG_DEBUG_VIRTUAL=y CONFIG_HIGHMEM=y CONFIG_HARDENED_USERCOPY=y all execve()s will fail due to argv copying into kmap()ed pages, and on usercopy checking the calls ultimately of virt_to_page() will be looking for "bad" kmap (highmem) pointers due to CONFIG_DEBUG_VIRTUAL=y: ------------[ cut here ]------------ kernel BUG at ../arch/x86/mm/physaddr.c:83! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc8 #6 Hardware name: Dell Inc. Inspiron 1318/0C236D, BIOS A04 01/15/2009 EIP: __phys_addr+0xaf/0x100 ... Call Trace: __check_object_size+0xaf/0x3c0 ? __might_sleep+0x80/0xa0 copy_strings+0x1c2/0x370 copy_strings_kernel+0x2b/0x40 __do_execve_file+0x4ca/0x810 ? kmem_cache_alloc+0x1c7/0x370 do_execve+0x1b/0x20 ... The check is from arch/x86/mm/physaddr.c: VIRTUAL_BUG_ON((phys_addr >> PAGE_SHIFT) > max_low_pfn); Due to the kmap() in fs/exec.c: kaddr = kmap(kmapped_page); ... if (copy_from_user(kaddr+offset, str, bytes_to_copy)) ... Now we can fetch the correct page to avoid the pfn check. In both cases, hardened usercopy will need to walk the page-span checker (if enabled) to do sanity checking. Reported-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Fixes: `f5509cc18d` ("mm: Hardened usercopy") Cc: Matthew Wilcox <willy@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/201909171056.7F2FFD17@keescook Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:58 +02:00
Tom Zanussi	e010c98351	tracing: Make sure variable reference alias has correct var_ref_idx commit 17f8607a1658a8e70415eef67909f990d13017b5 upstream. Original changelog from Steve Rostedt (except last sentence which explains the problem, and the Fixes: tag): I performed a three way histogram with the following commands: echo 'irq_lat u64 lat pid_t pid' > synthetic_events echo 'wake_lat u64 lat u64 irqlat pid_t pid' >> synthetic_events echo 'hist:keys=common_pid:irqts=common_timestamp.usecs if function == 0xffffffff81200580' > events/timer/hrtimer_start/trigger echo 'hist:keys=common_pid:lat=common_timestamp.usecs-$irqts:onmatch(timer.hrtimer_start).irq_lat($lat,pid) if common_flags & 1' > events/sched/sched_waking/trigger echo 'hist:keys=pid:wakets=common_timestamp.usecs,irqlat=lat' > events/synthetic/irq_lat/trigger echo 'hist:keys=next_pid:lat=common_timestamp.usecs-$wakets,irqlat=$irqlat:onmatch(synthetic.irq_lat).wake_lat($lat,$irqlat,next_pid)' > events/sched/sched_switch/trigger echo 1 > events/synthetic/wake_lat/enable Basically I wanted to see: hrtimer_start (calling function tick_sched_timer) Note: # grep tick_sched_timer /proc/kallsyms ffffffff81200580 t tick_sched_timer And save the time of that, and then record sched_waking if it is called in interrupt context and with the same pid as the hrtimer_start, it will record the latency between that and the waking event. I then look at when the task that is woken is scheduled in, and record the latency between the wakeup and the task running. At the end, the wake_lat synthetic event will show the wakeup to scheduled latency, as well as the irq latency in from hritmer_start to the wakeup. The problem is that I found this: <idle>-0 [007] d... 190.485261: wake_lat: lat=27 irqlat=190485230 pid=698 <idle>-0 [005] d... 190.485283: wake_lat: lat=40 irqlat=190485239 pid=10 <idle>-0 [002] d... 190.488327: wake_lat: lat=56 irqlat=190488266 pid=335 <idle>-0 [005] d... 190.489330: wake_lat: lat=64 irqlat=190489262 pid=10 <idle>-0 [003] d... 190.490312: wake_lat: lat=43 irqlat=190490265 pid=77 <idle>-0 [005] d... 190.493322: wake_lat: lat=54 irqlat=190493262 pid=10 <idle>-0 [005] d... 190.497305: wake_lat: lat=35 irqlat=190497267 pid=10 <idle>-0 [005] d... 190.501319: wake_lat: lat=50 irqlat=190501264 pid=10 The irqlat seemed quite large! Investigating this further, if I had enabled the irq_lat synthetic event, I noticed this: <idle>-0 [002] d.s. 249.429308: irq_lat: lat=164968 pid=335 <idle>-0 [002] d... 249.429369: wake_lat: lat=55 irqlat=249429308 pid=335 Notice that the timestamp of the irq_lat "249.429308" is awfully similar to the reported irqlat variable. In fact, all instances were like this. It appeared that: irqlat=$irqlat Wasn't assigning the old $irqlat to the new irqlat variable, but instead was assigning the $irqts to it. The issue is that assigning the old $irqlat to the new irqlat variable creates a variable reference alias, but the alias creation code forgets to make sure the alias uses the same var_ref_idx to access the reference. Link: http://lkml.kernel.org/r/1567375321.5282.12.camel@kernel.org Cc: Linux Trace Devel <linux-trace-devel@vger.kernel.org> Cc: linux-rt-users <linux-rt-users@vger.kernel.org> Cc: stable@vger.kernel.org Fixes: `7e8b88a30b` ("tracing: Add hist trigger support for variable reference aliases") Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:57 +02:00
Michael Nosthoff	022ca58f10	power: supply: sbs-battery: only return health when battery present commit fe55e770327363304c4111423e6f7ff3c650136d upstream. when the battery is set to sbs-mode and no gpio detection is enabled "health" is always returning a value even when the battery is not present. All other fields return "not present". This leads to a scenario where the driver is constantly switching between "present" and "not present" state. This generates a lot of constant traffic on the i2c. This commit changes the response of "health" to an error when the battery is not responding leading to a consistent "not present" state. Fixes: `76b16f4cdf` ("power: supply: sbs-battery: don't assume MANUFACTURER_DATA formats") Cc: <stable@vger.kernel.org> Signed-off-by: Michael Nosthoff <committed@heine.so> Reviewed-by: Brian Norris <briannorris@chromium.org> Tested-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:56 +02:00
Michael Nosthoff	5cb6dd8231	power: supply: sbs-battery: use correct flags field commit 99956a9e08251a1234434b492875b1eaff502a12 upstream. the type flag is stored in the chip->flags field not in the client->flags field. This currently leads to never using the ti specific health function as client->flags doesn't use that bit. So it's always falling back to the general one. Fixes: `76b16f4cdf` ("power: supply: sbs-battery: don't assume MANUFACTURER_DATA formats") Cc: <stable@vger.kernel.org> Signed-off-by: Michael Nosthoff <committed@heine.so> Reviewed-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:56 +02:00
Jiaxun Yang	fb93ccde08	MIPS: Treat Loongson Extensions as ASEs commit d2f965549006acb865c4638f1f030ebcefdc71f6 upstream. Recently, binutils had split Loongson-3 Extensions into four ASEs: MMI, CAM, EXT, EXT2. This patch do the samething in kernel and expose them in cpuinfo so applications can probe supported ASEs at runtime. Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: Huacai Chen <chenhc@lemote.com> Cc: Yunqiang Su <ysu@wavecomp.com> Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: linux-mips@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:55 +02:00
Gilad Ben-Yossef	a0dc60ac6b	crypto: ccree - use the full crypt length value commit 7a4be6c113c1f721818d1e3722a9015fe393295c upstream. In case of AEAD decryption verifcation error we were using the wrong value to zero out the plaintext buffer leaving the end of the buffer with the false plaintext. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Fixes: `ff27e85a85` ("crypto: ccree - add AEAD support") CC: stable@vger.kernel.org # v4.17+ Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:55 +02:00
Gilad Ben-Yossef	f5c087a0d9	crypto: ccree - account for TEE not ready to report commit 76a95bd8f9e10cade9c4c8df93b5c20ff45dc0f5 upstream. When ccree driver runs it checks the state of the Trusted Execution Environment CryptoCell driver before proceeding. We did not account for cases where the TEE side is not ready or not available at all. Fix it by only considering TEE error state after sync with the TEE side driver. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Fixes: `ab8ec9658f` ("crypto: ccree - add FIPS support") CC: stable@vger.kernel.org # v4.17+ Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:54 +02:00
Horia Geantă	561bf93092	crypto: caam - fix concurrency issue in givencrypt descriptor commit 48f89d2a2920166c35b1c0b69917dbb0390ebec7 upstream. IV transfer from ofifo to class2 (set up at [29][30]) is not guaranteed to be scheduled before the data transfer from ofifo to external memory (set up at [38]: [29] 10FA0004 ld: ind-nfifo (len=4) imm [30] 81F00010 <nfifo_entry: ofifo->class2 type=msg len=16> [31] 14820004 ld: ccb2-datasz len=4 offs=0 imm [32] 00000010 data:0x00000010 [33] 8210010D operation: cls1-op aes cbc init-final enc [34] A8080B04 math: (seqin + math0)->vseqout len=4 [35] 28000010 seqfifold: skip len=16 [36] A8080A04 math: (seqin + math0)->vseqin len=4 [37] 2F1E0000 seqfifold: both msg1->2-last2-last1 len=vseqinsz [38] 69300000 seqfifostr: msg len=vseqoutsz [39] 5C20000C seqstr: ccb2 ctx len=12 offs=0 If ofifo -> external memory transfer happens first, DECO will hang (issuing a Watchdog Timeout error, if WDOG is enabled) waiting for data availability in ofifo for the ofifo -> c2 ififo transfer. Make sure IV transfer happens first by waiting for all CAAM internal transfers to end before starting payload transfer. New descriptor with jump command inserted at [37]: [..] [36] A8080A04 math: (seqin + math0)->vseqin len=4 [37] A1000401 jump: jsl1 all-match[!nfifopend] offset=[01] local->[38] [38] 2F1E0000 seqfifold: both msg1->2-last2-last1 len=vseqinsz [39] 69300000 seqfifostr: msg len=vseqoutsz [40] 5C20000C seqstr: ccb2 ctx len=12 offs=0 [Note: the issue is present in the descriptor from the very beginning (cf. Fixes tag). However I've marked it v4.19+ since it's the oldest maintained kernel that the patch applies clean against.] Cc: <stable@vger.kernel.org> # v4.19+ Fixes: `1acebad3d8` ("crypto: caam - faster aead implementation") Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:54 +02:00
Wei Yongjun	3683dd7074	crypto: cavium/zip - Add missing single_release() commit c552ffb5c93d9d65aaf34f5f001c4e7e8484ced1 upstream. When using single_open() for opening, single_release() should be used instead of seq_release(), otherwise there is a memory leak. Fixes: `09ae5d37e0` ("crypto: zip - Add Compression/Decompression statistics") Cc: <stable@vger.kernel.org> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:53 +02:00
Herbert Xu	cd8e0a5d94	crypto: skcipher - Unmap pages after an external error commit 0ba3c026e685573bd3534c17e27da7c505ac99c4 upstream. skcipher_walk_done may be called with an error by internal or external callers. For those internal callers we shouldn't unmap pages but for external callers we must unmap any pages that are in use. This patch distinguishes between the two cases by checking whether walk->nbytes is zero or not. For internal callers, we now set walk->nbytes to zero prior to the call. For external callers, walk->nbytes has always been non-zero (as zero is used to indicate the termination of a walk). Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Fixes: `5cde0af2a9` ("[CRYPTO] cipher: Added block cipher type") Cc: <stable@vger.kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:52 +02:00
Alexander Sverdlin	9349108ae4	crypto: qat - Silence smp_processor_id() warning commit 1b82feb6c5e1996513d0fb0bbb475417088b4954 upstream. It seems that smp_processor_id() is only used for a best-effort load-balancing, refer to qat_crypto_get_instance_node(). It's not feasible to disable preemption for the duration of the crypto requests. Therefore, just silence the warning. This commit is similar to `e7a9b05ca4` ("crypto: cavium - Fix smp_processor_id() warnings"). Silences the following splat: BUG: using smp_processor_id() in preemptible [00000000] code: cryptomgr_test/2904 caller is qat_alg_ablkcipher_setkey+0x300/0x4a0 [intel_qat] CPU: 1 PID: 2904 Comm: cryptomgr_test Tainted: P O 4.14.69 #1 ... Call Trace: dump_stack+0x5f/0x86 check_preemption_disabled+0xd3/0xe0 qat_alg_ablkcipher_setkey+0x300/0x4a0 [intel_qat] skcipher_setkey_ablkcipher+0x2b/0x40 __test_skcipher+0x1f3/0xb20 ? cpumask_next_and+0x26/0x40 ? find_busiest_group+0x10e/0x9d0 ? preempt_count_add+0x49/0xa0 ? try_module_get+0x61/0xf0 ? crypto_mod_get+0x15/0x30 ? __kmalloc+0x1df/0x1f0 ? __crypto_alloc_tfm+0x116/0x180 ? crypto_skcipher_init_tfm+0xa6/0x180 ? crypto_create_tfm+0x4b/0xf0 test_skcipher+0x21/0xa0 alg_test_skcipher+0x3f/0xa0 alg_test.part.6+0x126/0x2a0 ? finish_task_switch+0x21b/0x260 ? __schedule+0x1e9/0x800 ? __wake_up_common+0x8d/0x140 cryptomgr_test+0x40/0x50 kthread+0xff/0x130 ? cryptomgr_notify+0x540/0x540 ? kthread_create_on_node+0x70/0x70 ret_from_fork+0x24/0x50 Fixes: `ed8ccaef52` ("crypto: qat - Add support for SRIOV") Cc: stable@vger.kernel.org Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:52 +02:00
Steven Rostedt (VMware)	532920b266	tools lib traceevent: Fix "robust" test of do_generate_dynamic_list_file commit 82a2f88458d70704be843961e10b5cef9a6e95d3 upstream. The tools/lib/traceevent/Makefile had a test added to it to detect a failure of the "nm" when making the dynamic list file (whatever that is). The problem is that the test sorts the values "U W w" and some versions of sort will place "w" ahead of "W" (even though it has a higher ASCII value, and break the test. Add 'tr "w" "W"' to merge the two and not worry about the ordering. Reported-by: Tzvetomir Stoyanov <tstoyanov@vmware.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michal rarek <mmarek@suse.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Cc: stable@vger.kernel.org Fixes: `6467753d61` ("tools lib traceevent: Robustify do_generate_dynamic_list_file") Link: http://lkml.kernel.org/r/20190805130150.25acfeb1@gandalf.local.home Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:51 +02:00
Marc Kleine-Budde	4aaea17d3c	can: mcp251x: mcp251x_hw_reset(): allow more time after a reset commit d84ea2123f8d27144e3f4d58cd88c9c6ddc799de upstream. Some boards take longer than 5ms to power up after a reset, so allow some retries attempts before giving up. Fixes: `ff06d611a3` ("can: mcp251x: Improve mcp251x_hw_reset()") Cc: linux-stable <stable@vger.kernel.org> Tested-by: Sean Nyekjaer <sean@geanix.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:51 +02:00
Aneesh Kumar K.V	9124eac41a	powerpc/book3s64/mm: Don't do tlbie fixup for some hardware revisions commit 677733e296b5c7a37c47da391fc70a43dc40bd67 upstream. The store ordering vs tlbie issue mentioned in commit `a5d4b5891c` ("powerpc/mm: Fixup tlbie vs store ordering issue on POWER9") is fixed for Nimbus 2.3 and Cumulus 1.3 revisions. We don't need to apply the fixup if we are running on them We can only do this on PowerNV. On pseries guest with KVM we still don't support redoing the feature fixup after migration. So we should be enabling all the workarounds needed, because whe can possibly migrate between DD 2.3 and DD 2.2 Fixes: `a5d4b5891c` ("powerpc/mm: Fixup tlbie vs store ordering issue on POWER9") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190924035254.24612-1-aneesh.kumar@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:50 +02:00
Alexey Kardashevskiy	19c12f1209	powerpc/powernv/ioda: Fix race in TCE level allocation commit 56090a3902c80c296e822d11acdb6a101b322c52 upstream. pnv_tce() returns a pointer to a TCE entry and originally a TCE table would be pre-allocated. For the default case of 2GB window the table needs only a single level and that is fine. However if more levels are requested, it is possible to get a race when 2 threads want a pointer to a TCE entry from the same page of TCEs. This adds cmpxchg to handle the race. Note that once TCE is non-zero, it cannot become zero again. Fixes: `a68bd1267b` ("powerpc/powernv/ioda: Allocate indirect TCE levels on demand") CC: stable@vger.kernel.org # v4.19+ Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190718051139.74787-2-aik@ozlabs.ru Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:50 +02:00
Andrew Donnellan	032ce7d766	powerpc/powernv: Restrict OPAL symbol map to only be readable by root commit e7de4f7b64c23e503a8c42af98d56f2a7462bd6d upstream. Currently the OPAL symbol map is globally readable, which seems bad as it contains physical addresses. Restrict it to root. Fixes: `c8742f8512` ("powerpc/powernv: Expose OPAL firmware symbol map") Cc: stable@vger.kernel.org # v3.19+ Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190503075253.22798-1-ajd@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:50 +02:00
Santosh Sivaraj	ba3ca9fcb0	powerpc/mce: Schedule work from irq_work commit b5bda6263cad9a927e1a4edb7493d542da0c1410 upstream. schedule_work() cannot be called from MCE exception context as MCE can interrupt even in interrupt disabled context. Fixes: `733e4a4c44` ("powerpc/mce: hookup memory_failure for UE errors") Cc: stable@vger.kernel.org # v4.15+ Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190820081352.8641-2-santosh@fossix.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:49 +02:00
Balbir Singh	ee6eeeb88e	powerpc/mce: Fix MCE handling for huge pages commit 99ead78afd1128bfcebe7f88f3b102fb2da09aee upstream. The current code would fail on huge pages addresses, since the shift would be incorrect. Use the correct page shift value returned by __find_linux_pte() to get the correct physical address. The code is more generic and can handle both regular and compound pages. Fixes: `ba41e1e1cc` ("powerpc/mce: Hookup derror (load/store) UE errors") Signed-off-by: Balbir Singh <bsingharora@gmail.com> [arbab@linux.ibm.com: Fixup pseries_do_memory_failure()] Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Tested-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190820081352.8641-3-santosh@fossix.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:49 +02:00
Oleksandr Suvorov	1284f20734	ASoC: sgtl5000: Improve VAG power and mute control commit b1f373a11d25fc9a5f7679c9b85799fe09b0dc4a upstream. VAG power control is improved to fit the manual [1]. This patch fixes as minimum one bug: if customer muxes Headphone to Line-In right after boot, the VAG power remains off that leads to poor sound quality from line-in. I.e. after boot: - Connect sound source to Line-In jack; - Connect headphone to HP jack; - Run following commands: $ amixer set 'Headphone' 80% $ amixer set 'Headphone Mux' LINE_IN Change VAG power on/off control according to the following algorithm: - turn VAG power ON on the 1st incoming event. - keep it ON if there is any active VAG consumer (ADC/DAC/HP/Line-In). - turn VAG power OFF when there is the latest consumer's pre-down event come. - always delay after VAG power OFF to avoid pop. - delay after VAG power ON if the initiative consumer is Line-In, this prevents pop during line-in muxing. According to the data sheet [1], to avoid any pops/clicks, the outputs should be muted during input/output routing changes. [1] https://www.nxp.com/docs/en/data-sheet/SGTL5000.pdf Cc: stable@vger.kernel.org Fixes: `9b34e6cc3b` ("ASoC: Add Freescale SGTL5000 codec support") Signed-off-by: Oleksandr Suvorov <oleksandr.suvorov@toradex.com> Reviewed-by: Marcel Ziswiler <marcel.ziswiler@toradex.com> Reviewed-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Link: https://lore.kernel.org/r/20190719100524.23300-3-oleksandr.suvorov@toradex.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:47 +02:00
Oleksandr Suvorov	50090b75fa	ASoC: Define a set of DAPM pre/post-up events commit cfc8f568aada98f9608a0a62511ca18d647613e2 upstream. Prepare to use SND_SOC_DAPM_PRE_POST_PMU definition to reduce coming code size and make it more readable. Cc: stable@vger.kernel.org Signed-off-by: Oleksandr Suvorov <oleksandr.suvorov@toradex.com> Reviewed-by: Marcel Ziswiler <marcel.ziswiler@toradex.com> Reviewed-by: Igor Opaniuk <igor.opaniuk@toradex.com> Reviewed-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20190719100524.23300-2-oleksandr.suvorov@toradex.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:47 +02:00
Dmitry Osipenko	42b888f633	PM / devfreq: tegra: Fix kHz to Hz conversion commit 62bacb06b9f08965c4ef10e17875450490c948c0 upstream. The kHz to Hz is incorrectly converted in a few places in the code, this results in a wrong frequency being calculated because devfreq core uses OPP frequencies that are given in Hz to clamp the rate, while tegra-devfreq gives to the core value in kHz and then it also expects to receive value in kHz from the core. In a result memory freq is always set to a value which is close to ULONG_MAX because of the bug. Hence the EMC frequency is always capped to the maximum and the driver doesn't do anything useful. This patch was tested on Tegra30 and Tegra124 SoC's, EMC frequency scaling works properly now. Cc: <stable@vger.kernel.org> # 4.14+ Tested-by: Steev Klimaszewski <steev@kali.org> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:46 +02:00
Mike Christie	9f0f39c92e	nbd: fix max number of supported devs commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4 upstream. This fixes a bug added in 4.10 with commit: commit `9561a7ade0` Author: Josef Bacik <jbacik@fb.com> Date: Tue Nov 22 14:04:40 2016 -0500 nbd: add multi-connection support that limited the number of devices to 256. Before the patch we could create 1000s of devices, but the patch switched us from using our own thread to using a work queue which has a default limit of 256 active works. The problem is that our recv_work function sits in a loop until disconnection but only handles IO for one connection. The work is started when the connection is started/restarted, but if we end up creating 257 or more connections, the queue_work call just queues connection257+'s recv_work and that waits for connection 1 - 256's recv_work to be disconnected and that work instance completing. Instead of reverting back to kthreads, this has us allocate a workqueue_struct per device, so we can block in the work. Cc: stable@vger.kernel.org Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:46 +02:00
Jack Wang	eff3a54aae	KVM: nVMX: handle page fault in vmread fix During backport f7eea636c3d5 ("KVM: nVMX: handle page fault in vmread"), there was a mistake the exception reference should be passed to function kvm_write_guest_virt_system, instead of NULL, other wise, we will get NULL pointer deref, eg kvm-unit-test triggered a NULL pointer deref below: [ 948.518437] kvm [24114]: vcpu0, guest rIP: 0x407ef9 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x3, nop [ 949.106464] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 [ 949.106707] PGD 0 P4D 0 [ 949.106872] Oops: 0002 [#1] SMP [ 949.107038] CPU: 2 PID: 24126 Comm: qemu-2.7 Not tainted 4.19.77-pserver #4.19.77-1+feature+daily+update+20191005.1625+a4168bb~deb9 [ 949.107283] Hardware name: Dell Inc. Precision Tower 3620/09WH54, BIOS 2.7.3 01/31/2018 [ 949.107549] RIP: 0010:kvm_write_guest_virt_system+0x12/0x40 [kvm] [ 949.107719] Code: c0 5d 41 5c 41 5d 41 5e 83 f8 03 41 0f 94 c0 41 c1 e0 02 e9 b0 ed ff ff 0f 1f 44 00 00 48 89 f0 c6 87 59 56 00 00 01 48 89 d6 <49> c7 00 00 00 00 00 89 ca 49 c7 40 08 00 00 00 00 49 c7 40 10 00 [ 949.108044] RSP: 0018:ffffb31b0a953cb0 EFLAGS: 00010202 [ 949.108216] RAX: 000000000046b4d8 RBX: ffff9e9f415b0000 RCX: 0000000000000008 [ 949.108389] RDX: ffffb31b0a953cc0 RSI: ffffb31b0a953cc0 RDI: ffff9e9f415b0000 [ 949.108562] RBP: 00000000d2e14928 R08: 0000000000000000 R09: 0000000000000000 [ 949.108733] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffffffffc8 [ 949.108907] R13: 0000000000000002 R14: ffff9e9f4f26f2e8 R15: 0000000000000000 [ 949.109079] FS: 00007eff8694c700(0000) GS:ffff9e9f51a80000(0000) knlGS:0000000031415928 [ 949.109318] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 949.109495] CR2: 0000000000000000 CR3: 00000003be53b002 CR4: 00000000003626e0 [ 949.109671] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 949.109845] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 949.110017] Call Trace: [ 949.110186] handle_vmread+0x22b/0x2f0 [kvm_intel] [ 949.110356] ? vmexit_fill_RSB+0xc/0x30 [kvm_intel] [ 949.110549] kvm_arch_vcpu_ioctl_run+0xa98/0x1b30 [kvm] [ 949.110725] ? kvm_vcpu_ioctl+0x388/0x5d0 [kvm] [ 949.110901] kvm_vcpu_ioctl+0x388/0x5d0 [kvm] [ 949.111072] do_vfs_ioctl+0xa2/0x620 Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-11 18:20:45 +02:00
Wanpeng Li	21874027e1	KVM: X86: Fix userspace set invalid CR4 commit 3ca94192278ca8de169d78c085396c424be123b3 upstream. Reported by syzkaller: WARNING: CPU: 0 PID: 6544 at /home/kernel/data/kvm/arch/x86/kvm//vmx/vmx.c:4689 handle_desc+0x37/0x40 [kvm_intel] CPU: 0 PID: 6544 Comm: a.out Tainted: G OE 5.3.0-rc4+ #4 RIP: 0010:handle_desc+0x37/0x40 [kvm_intel] Call Trace: vmx_handle_exit+0xbe/0x6b0 [kvm_intel] vcpu_enter_guest+0x4dc/0x18d0 [kvm] kvm_arch_vcpu_ioctl_run+0x407/0x660 [kvm] kvm_vcpu_ioctl+0x3ad/0x690 [kvm] do_vfs_ioctl+0xa2/0x690 ksys_ioctl+0x6d/0x80 __x64_sys_ioctl+0x1a/0x20 do_syscall_64+0x74/0x720 entry_SYSCALL_64_after_hwframe+0x49/0xbe When CR4.UMIP is set, guest should have UMIP cpuid flag. Current kvm set_sregs function doesn't have such check when userspace inputs sregs values. SECONDARY_EXEC_DESC is enabled on writes to CR4.UMIP in vmx_set_cr4 though guest doesn't have UMIP cpuid flag. The testcast triggers handle_desc warning when executing ltr instruction since guest architectural CR4 doesn't set UMIP. This patch fixes it by adding valid CR4 and CPUID combination checking in __set_sregs. syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=138efb99600000 Reported-by: syzbot+0f1819555fbdce992df9@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:45 +02:00
Paul Mackerras	30fbe0d380	KVM: PPC: Book3S HV: Don't lose pending doorbell request on migration on P9 commit ff42df49e75f053a8a6b4c2533100cdcc23afe69 upstream. On POWER9, when userspace reads the value of the DPDES register on a vCPU, it is possible for 0 to be returned although there is a doorbell interrupt pending for the vCPU. This can lead to a doorbell interrupt being lost across migration. If the guest kernel uses doorbell interrupts for IPIs, then it could malfunction because of the lost interrupt. This happens because a newly-generated doorbell interrupt is signalled by setting vcpu->arch.doorbell_request to 1; the DPDES value in vcpu->arch.vcore->dpdes is not updated, because it can only be updated when holding the vcpu mutex, in order to avoid races. To fix this, we OR in vcpu->arch.doorbell_request when reading the DPDES value. Cc: stable@vger.kernel.org # v4.13+ Fixes: `579006944e` ("KVM: PPC: Book3S HV: Virtualize doorbell facility on POWER9") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:44 +02:00
Paul Mackerras	4faa7f05af	KVM: PPC: Book3S HV: Check for MMU ready on piggybacked virtual cores commit d28eafc5a64045c78136162af9d4ba42f8230080 upstream. When we are running multiple vcores on the same physical core, they could be from different VMs and so it is possible that one of the VMs could have its arch.mmu_ready flag cleared (for example by a concurrent HPT resize) when we go to run it on a physical core. We currently check the arch.mmu_ready flag for the primary vcore but not the flags for the other vcores that will be run alongside it. This adds that check, and also a check when we select the secondary vcores from the preempted vcores list. Cc: stable@vger.kernel.org # v4.14+ Fixes: `38c53af853` ("KVM: PPC: Book3S HV: Fix exclusion between HPT resizing and other HPT updates") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:44 +02:00
Paul Mackerras	577a5119d7	KVM: PPC: Book3S HV: Fix race in re-enabling XIVE escalation interrupts commit 959c5d5134786b4988b6fdd08e444aa67d1667ed upstream. Escalation interrupts are interrupts sent to the host by the XIVE hardware when it has an interrupt to deliver to a guest VCPU but that VCPU is not running anywhere in the system. Hence we disable the escalation interrupt for the VCPU being run when we enter the guest and re-enable it when the guest does an H_CEDE hypercall indicating it is idle. It is possible that an escalation interrupt gets generated just as we are entering the guest. In that case the escalation interrupt may be using a queue entry in one of the interrupt queues, and that queue entry may not have been processed when the guest exits with an H_CEDE. The existing entry code detects this situation and does not clear the vcpu->arch.xive_esc_on flag as an indication that there is a pending queue entry (if the queue entry gets processed, xive_esc_irq() will clear the flag). There is a comment in the code saying that if the flag is still set on H_CEDE, we have to abort the cede rather than re-enabling the escalation interrupt, lest we end up with two occurrences of the escalation interrupt in the interrupt queue. However, the exit code doesn't do that; it aborts the cede in the sense that vcpu->arch.ceded gets cleared, but it still enables the escalation interrupt by setting the source's PQ bits to 00. Instead we need to set the PQ bits to 10, indicating that an interrupt has been triggered. We also need to avoid setting vcpu->arch.xive_esc_on in this case (i.e. vcpu->arch.xive_esc_on seen to be set on H_CEDE) because xive_esc_irq() will run at some point and clear it, and if we race with that we may end up with an incorrect result (i.e. xive_esc_on set when the escalation interrupt has just been handled). It is extremely unlikely that having two queue entries would cause observable problems; theoretically it could cause queue overflow, but the CPU would have to have thousands of interrupts targetted to it for that to be possible. However, this fix will also make it possible to determine accurately whether there is an unhandled escalation interrupt in the queue, which will be needed by the following patch. Fixes: `9b9b13a6d1` ("KVM: PPC: Book3S HV: Keep XIVE escalation interrupt masked unless ceded") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190813100349.GD9567@blackberry Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-11 18:20:44 +02:00

1 2 3 4 5 ...

791504 Commits