diff --git a/Documentation/ABI/testing/procfs-concurrent_time b/Documentation/ABI/testing/procfs-concurrent_time new file mode 100644 index 000000000000..55b414289b40 --- /dev/null +++ b/Documentation/ABI/testing/procfs-concurrent_time @@ -0,0 +1,16 @@ +What: /proc/uid_concurrent_active_time +Date: December 2018 +Contact: Connor O'Brien +Description: + The /proc/uid_concurrent_active_time file displays aggregated cputime + numbers for each uid, broken down by the total number of cores that were + active while the uid's task was running. + +What: /proc/uid_concurrent_policy_time +Date: December 2018 +Contact: Connor O'Brien +Description: + The /proc/uid_concurrent_policy_time file displays aggregated cputime + numbers for each uid, broken down based on the cpufreq policy + of the core used by the uid's task and the number of cores associated + with that policy that were active while the uid's task was running. diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt index eef7d9d259e8..f20589841d2e 100644 --- a/Documentation/filesystems/overlayfs.txt +++ b/Documentation/filesystems/overlayfs.txt @@ -102,6 +102,29 @@ Only the lists of names from directories are merged. Other content such as metadata and extended attributes are reported for the upper directory only. These attributes of the lower directory are hidden. +credentials +----------- + +By default, all access to the upper, lower and work directories is the +recorded mounter's MAC and DAC credentials. The incoming accesses are +checked against the caller's credentials. + +In the case where caller MAC or DAC credentials do not overlap, a +use case available in older versions of the driver, the +override_creds mount flag can be turned off and help when the use +pattern has caller with legitimate credentials where the mounter +does not. Several unintended side effects will occur though. 
A +caller lacking certain key capabilities, or with lower privilege, will not +always be able to delete files or directories, create nodes, or +search some restricted directories. The ability to search and read +a directory entry is spotty because the cache mechanism does not +retest the credentials: a privileged caller can fill the cache, +then a lower-privileged caller can read the directory +cache. The uneven security model where cache, upperdir and workdir +are opened at privilege, but accessed without creating a form of +privilege escalation, should only be used with strict understanding +of the side effects and of the security policies. + whiteouts and opaque directories -------------------------------- diff --git a/Documentation/power/energy-model.txt b/Documentation/power/energy-model.txt new file mode 100644 index 000000000000..5a23c6f33d28 --- /dev/null +++ b/Documentation/power/energy-model.txt @@ -0,0 +1,169 @@ + ==================== + Energy Model of CPUs + ==================== + +1. Overview +----------- + +The Energy Model (EM) framework serves as an interface between drivers knowing +the power consumed by CPUs at various performance levels, and the kernel +subsystems willing to use that information to make energy-aware decisions. + +The source of the information about the power consumed by CPUs can vary greatly +from one platform to another. These power costs can be estimated using +devicetree data in some cases. In others, the firmware will know better. +Alternatively, userspace might be best positioned. And so on. To avoid +each and every client subsystem having to re-implement support for each and every +possible source of information on its own, the EM framework intervenes as an +abstraction layer which standardizes the format of power cost tables in the +kernel, hence avoiding redundant work. 
+ +The figure below depicts an example of drivers (Arm-specific here, but the +approach is applicable to any architecture) providing power costs to the EM +framework, and interested clients reading the data from it. + + +---------------+ +-----------------+ +---------------+ + | Thermal (IPA) | | Scheduler (EAS) | | Other | + +---------------+ +-----------------+ +---------------+ + | | em_pd_energy() | + | | em_cpu_get() | + +---------+ | +---------+ + | | | + v v v + +---------------------+ + | Energy Model | + | Framework | + +---------------------+ + ^ ^ ^ + | | | em_register_perf_domain() + +----------+ | +---------+ + | | | + +---------------+ +---------------+ +--------------+ + | cpufreq-dt | | arm_scmi | | Other | + +---------------+ +---------------+ +--------------+ + ^ ^ ^ + | | | + +--------------+ +---------------+ +--------------+ + | Device Tree | | Firmware | | ? | + +--------------+ +---------------+ +--------------+ + +The EM framework manages power cost tables per 'performance domain' in the +system. A performance domain is a group of CPUs whose performance is scaled +together. Performance domains generally have a 1-to-1 mapping with CPUFreq +policies. All CPUs in a performance domain are required to have the same +micro-architecture. CPUs in different performance domains can have different +micro-architectures. + + +2. Core APIs +------------ + + 2.1 Config options + +CONFIG_ENERGY_MODEL must be enabled to use the EM framework. + + + 2.2 Registration of performance domains + +Drivers are expected to register performance domains into the EM framework by +calling the following API: + + int em_register_perf_domain(cpumask_t *span, unsigned int nr_states, + struct em_data_callback *cb); + +Drivers must specify the CPUs of the performance domains using the cpumask +argument, and provide a callback function returning (frequency, power) tuples +for each capacity state. 
The callback function provided by the driver is free +to fetch data from any relevant location (DT, firmware, ...), and by any means +deemed necessary. See Section 3. for an example of a driver implementing this +callback, and kernel/power/energy_model.c for further documentation on this +API. + + + 2.3 Accessing performance domains + +Subsystems interested in the energy model of a CPU can retrieve it using the +em_cpu_get() API. The energy model tables are allocated once upon creation of +the performance domains, and kept in memory untouched. + +The energy consumed by a performance domain can be estimated using the +em_pd_energy() API. The estimation is performed assuming that the schedutil +CPUfreq governor is in use. + +More details about the above APIs can be found in include/linux/energy_model.h. + + +3. Example driver +----------------- + +This section provides a simple example of a CPUFreq driver registering a +performance domain in the Energy Model framework using the (fake) 'foo' +protocol. The driver implements an est_power() function to be provided to the +EM framework. + + -> drivers/cpufreq/foo_cpufreq.c + + static int est_power(unsigned long *mW, unsigned long *KHz, int cpu) + { + long freq, power; + + /* Use the 'foo' protocol to ceil the frequency */ + freq = foo_get_freq_ceil(cpu, *KHz); + if (freq < 0) + return freq; + + /* Estimate the power cost for the CPU at the relevant freq. */ + power = foo_estimate_power(cpu, freq); + if (power < 0) + return power; + + /* Return the values to the EM framework */ + *mW = power; + *KHz = freq; + + return 0; + } + + static int foo_cpufreq_init(struct cpufreq_policy *policy) + { + struct em_data_callback em_cb = EM_DATA_CB(est_power); + int nr_opp, ret; + + /* Do the actual CPUFreq init work ... 
*/ + ret = do_foo_cpufreq_init(policy); + if (ret) + return ret; + + /* Find the number of OPPs for this policy */ + nr_opp = foo_get_nr_opp(policy); + + /* And register the new performance domain */ + em_register_perf_domain(policy->cpus, nr_opp, &em_cb); + + return 0; + } + + +4. Support for legacy Energy Models (DEPRECATED) +------------------------------------------------ + +The Android kernel version 4.14 and before used a different type of EM for EAS, +referred to as the 'legacy' EM. The legacy EM relies on the out-of-tree +'sched-energy-costs' devicetree bindings to provide the kernel with power costs. +The usage of such bindings in Android has now been DEPRECATED in favour of the +mainline equivalents. + +The currently supported alternatives to populate the EM include: + - using a firmware-based solution such as Arm SCMI (supported in + drivers/cpufreq/scmi-cpufreq.c); + - using the 'dynamic-power-coefficient' devicetree binding together with + PM_OPP. See the of_dev_pm_opp_get_cpu_power() helper in PM_OPP, and the + reference implementation in drivers/cpufreq/cpufreq-dt.c. + +In order to ease the transition to the new EM format, Android 4.19 also provides +a compatibility driver able to load a legacy EM from DT into the EM framework. +*** Please note that THIS FEATURE WILL NOT BE AVAILABLE in future Android +kernels, and as such it must be considered only as a temporary workaround. *** + +If you know what you're doing and still want to use this driver, you need to set +CONFIG_LEGACY_ENERGY_MODEL_DT=y in your kernel configuration to enable it. diff --git a/Documentation/scheduler/sched-energy.txt b/Documentation/scheduler/sched-energy.txt new file mode 100644 index 000000000000..197d81f4b836 --- /dev/null +++ b/Documentation/scheduler/sched-energy.txt @@ -0,0 +1,425 @@ + ======================= + Energy Aware Scheduling + ======================= + +1. 
Introduction +--------------- + +Energy Aware Scheduling (or EAS) gives the scheduler the ability to predict +the impact of its decisions on the energy consumed by CPUs. EAS relies on an +Energy Model (EM) of the CPUs to select an energy efficient CPU for each task, +with a minimal impact on throughput. This document introduces +how EAS works, explains the main design decisions behind it, and +details what is needed to get it to run. + +Before going any further, please note that at the time of writing: + + /!\ EAS does not support platforms with symmetric CPU topologies /!\ + +EAS operates only on heterogeneous CPU topologies (such as Arm big.LITTLE) +because this is where the potential for saving energy through scheduling is +the highest. + +The actual EM used by EAS is _not_ maintained by the scheduler, but by a +dedicated framework. For details about this framework and what it provides, +please refer to its documentation (see Documentation/power/energy-model.txt). + + +2. Background and Terminology +----------------------------- + +To make it clear from the start: + - energy = [joule] (resource like a battery on powered devices) + - power = energy/time = [joule/second] = [watt] + +The goal of EAS is to minimize energy, while still getting the job done. That +is, we want to maximize: + + performance [inst/s] + -------------------- + power [W] + +which is equivalent to minimizing: + + energy [J] + ----------- + instruction + +while still getting 'good' performance. It is essentially an alternative +optimization objective to the current performance-only objective for the +scheduler. This alternative considers two objectives: energy-efficiency and +performance. + +The idea behind introducing an EM is to allow the scheduler to evaluate the +implications of its decisions rather than blindly applying energy-saving +techniques that may have positive effects only on some platforms. 
At the same +time, the EM must be as simple as possible to minimize the scheduler latency +impact. + +In short, EAS changes the way CFS tasks are assigned to CPUs. When it is time +for the scheduler to decide where a task should run (during wake-up), the EM +is used to break the tie between several good CPU candidates and pick the one +that is predicted to yield the best energy consumption without harming the +system's throughput. The predictions made by EAS rely on specific elements of +knowledge about the platform's topology, which include the 'capacity' of CPUs, +and their respective energy costs. + + +3. Topology information +----------------------- + +EAS (as well as the rest of the scheduler) uses the notion of 'capacity' to +differentiate CPUs with different computing throughput. The 'capacity' of a CPU +represents the amount of work it can absorb when running at its highest +frequency compared to the most capable CPU of the system. Capacity values are +normalized in a 1024 range, and are comparable with the utilization signals of +tasks and CPUs computed by the Per-Entity Load Tracking (PELT) mechanism. Thanks +to capacity and utilization values, EAS is able to estimate how big/busy a +task/CPU is, and to take this into consideration when evaluating performance vs +energy trade-offs. The capacity of CPUs is provided via arch-specific code +through the arch_scale_cpu_capacity() callback. + +The rest of platform knowledge used by EAS is directly read from the Energy +Model (EM) framework. The EM of a platform is composed of a power cost table +per 'performance domain' in the system (see Documentation/power/energy-model.txt +for further details about performance domains). + +The scheduler manages references to the EM objects in the topology code when the +scheduling domains are built, or re-built. For each root domain (rd), the +scheduler maintains a singly linked list of all performance domains intersecting +the current rd->span. 
Each node in the list contains a pointer to a struct +em_perf_domain as provided by the EM framework. + +The lists are attached to the root domains in order to cope with exclusive +cpuset configurations. Since the boundaries of exclusive cpusets do not +necessarily match those of performance domains, the lists of different root +domains can contain duplicate elements. + +Example 1. + Let us consider a platform with 12 CPUs, split in 3 performance domains + (pd0, pd4 and pd8), organized as follows: + + CPUs: 0 1 2 3 4 5 6 7 8 9 10 11 + PDs: |--pd0--|--pd4--|---pd8---| + RDs: |----rd1----|-----rd2-----| + + Now, consider that userspace decided to split the system with two + exclusive cpusets, hence creating two independent root domains, each + containing 6 CPUs. The two root domains are denoted rd1 and rd2 in the + above figure. Since pd4 intersects with both rd1 and rd2, it will be + present in the linked list '->pd' attached to each of them: + * rd1->pd: pd0 -> pd4 + * rd2->pd: pd4 -> pd8 + + Please note that the scheduler will create two duplicate list nodes for + pd4 (one for each list). However, both just hold a pointer to the same + shared data structure of the EM framework. + +Since the access to these lists can happen concurrently with hotplug and other +things, they are protected by RCU, like the rest of topology structures +manipulated by the scheduler. + +EAS also maintains a static key (sched_energy_present) which is enabled when at +least one root domain meets all conditions for EAS to start. Those conditions +are summarized in Section 6. + + +4. Energy-Aware task placement +------------------------------ + +EAS overrides the CFS task wake-up balancing code. It uses the EM of the +platform and the PELT signals to choose an energy-efficient target CPU during +wake-up balance. When EAS is enabled, select_task_rq_fair() calls +find_energy_efficient_cpu() to do the placement decision. 
This function looks +for the CPU with the highest spare capacity (CPU capacity - CPU utilization) in +each performance domain since it is the one which will allow us to keep the +frequency the lowest. Then, the function checks if placing the task there could +save energy compared to leaving it on prev_cpu, i.e. the CPU where the task ran +in its previous activation. + +find_energy_efficient_cpu() uses compute_energy() to estimate what will be the +energy consumed by the system if the waking task was migrated. compute_energy() +looks at the current utilization landscape of the CPUs and adjusts it to +'simulate' the task migration. The EM framework provides the em_pd_energy() API +which computes the expected energy consumption of each performance domain for +the given utilization landscape. + +An example of an energy-optimized task placement decision is detailed below. + +Example 2. + Let us consider a (fake) platform with 2 independent performance domains + composed of two CPUs each. CPU0 and CPU1 are little CPUs; CPU2 and CPU3 + are big. + + The scheduler must decide where to place a task P whose util_avg = 200 + and prev_cpu = 0. + + The current utilization landscape of the CPUs is depicted on the graph + below. CPUs 0-3 have a util_avg of 400, 100, 600 and 500 respectively. + Each performance domain has three Operating Performance Points (OPPs). + The CPU capacity and power cost associated with each OPP are listed in + the Energy Model table. The util_avg of P is shown on the figures + below as 'PP'. + + CPU util. 
+ 1024 - - - - - - - Energy Model + +-----------+-------------+ + | Little | Big | + 768 ============= +-----+-----+------+------+ + | Cap | Pwr | Cap | Pwr | + +-----+-----+------+------+ + 512 =========== - ##- - - - - | 170 | 50 | 512 | 400 | + ## ## | 341 | 150 | 768 | 800 | + 341 -PP - - - - ## ## | 512 | 300 | 1024 | 1700 | + PP ## ## +-----+-----+------+------+ + 170 -## - - - - ## ## + ## ## ## ## + ------------ ------------- + CPU0 CPU1 CPU2 CPU3 + + Current OPP: ===== Other OPP: - - - util_avg (100 each): ## + + + find_energy_efficient_cpu() will first look for the CPUs with the + maximum spare capacity in the two performance domains. In this example, + CPU1 and CPU3. Then it will estimate the energy of the system if P was + placed on either of them, and check if that would save some energy + compared to leaving P on CPU0. EAS assumes that OPPs follow utilization + (which is coherent with the behaviour of the schedutil CPUFreq + governor, see Section 6. for more details on this topic). + + Case 1. P is migrated to CPU1 + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + 1024 - - - - - - - + + Energy calculation: + 768 ============= * CPU0: 200 / 341 * 150 = 88 + * CPU1: 300 / 341 * 150 = 131 + * CPU2: 600 / 768 * 800 = 625 + 512 - - - - - - - ##- - - - - * CPU3: 500 / 768 * 800 = 520 + ## ## => total_energy = 1364 + 341 =========== ## ## + PP ## ## + 170 -## - - PP- ## ## + ## ## ## ## + ------------ ------------- + CPU0 CPU1 CPU2 CPU3 + + + Case 2. P is migrated to CPU3 + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + 1024 - - - - - - - + + Energy calculation: + 768 ============= * CPU0: 200 / 341 * 150 = 88 + * CPU1: 100 / 341 * 150 = 43 + PP * CPU2: 600 / 768 * 800 = 625 + 512 - - - - - - - ##- - -PP - * CPU3: 700 / 768 * 800 = 729 + ## ## => total_energy = 1485 + 341 =========== ## ## + ## ## + 170 -## - - - - ## ## + ## ## ## ## + ------------ ------------- + CPU0 CPU1 CPU2 CPU3 + + + Case 3. 
P stays on prev_cpu / CPU 0 + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + 1024 - - - - - - - + + Energy calculation: + 768 ============= * CPU0: 400 / 512 * 300 = 234 + * CPU1: 100 / 512 * 300 = 58 + * CPU2: 600 / 768 * 800 = 625 + 512 =========== - ##- - - - - * CPU3: 500 / 768 * 800 = 520 + ## ## => total_energy = 1437 + 341 -PP - - - - ## ## + PP ## ## + 170 -## - - - - ## ## + ## ## ## ## + ------------ ------------- + CPU0 CPU1 CPU2 CPU3 + + + From these calculations, Case 1 has the lowest total energy. So CPU 1 + is the best candidate from an energy-efficiency standpoint. + +Big CPUs are generally more power hungry than the little ones and are thus used +mainly when a task doesn't fit the littles. However, little CPUs aren't always +necessarily more energy-efficient than big CPUs. For some systems, the high OPPs +of the little CPUs can be less energy-efficient than the lowest OPPs of the +bigs, for example. So, if the little CPUs happen to have enough utilization at +a specific point in time, a small task waking up at that moment could be better +off executing on the big side in order to save energy, even though it would fit +on the little side. + +And even in the case where all OPPs of the big CPUs are less energy-efficient +than those of the little, using the big CPUs for a small task might still, under +specific conditions, save energy. Indeed, placing a task on a little CPU can +result in raising the OPP of the entire performance domain, and that will +increase the cost of the tasks already running there. If the waking task is +placed on a big CPU, its own execution cost might be higher than if it was +running on a little, but it won't impact the other tasks of the little CPUs +which will keep running at a lower OPP. So, when considering the total energy +consumed by CPUs, the extra cost of running that one task on a big core can be +smaller than the cost of raising the OPP on the little CPUs for all the other +tasks. 
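The energy calculations of Example 2 can be reproduced with a short, self-contained C sketch. It is only an illustration under stated assumptions, not the kernel implementation: struct opp, struct perf_domain, pick_opp(), pd_energy() and system_energy() are hypothetical names introduced here, the tables are copied from the Energy Model table of Example 2, and each domain is assumed to run at the lowest OPP whose capacity covers its busiest CPU (OPPs follow utilization, with no extra margin). The energy of a domain is estimated as its summed utilization scaled by power / capacity at that OPP, using integer division.

```c
#include <assert.h>
#include <stddef.h>

/* One row of a power cost table: CPU capacity and power cost at one OPP. */
struct opp {
	unsigned long cap;
	unsigned long power;
};

/* Hypothetical stand-in for a performance domain (not struct em_perf_domain). */
struct perf_domain {
	const struct opp *table;	/* sorted by ascending capacity */
	size_t nr_opp;
	int first_cpu;
	int nr_cpus;
};

/* Energy Model table of Example 2. */
static const struct opp little_opps[] = { { 170, 50 }, { 341, 150 }, { 512, 300 } };
static const struct opp big_opps[] = { { 512, 400 }, { 768, 800 }, { 1024, 1700 } };

static const struct perf_domain domains[] = {
	{ little_opps, 3, 0, 2 },	/* CPU0 and CPU1 */
	{ big_opps, 3, 2, 2 },		/* CPU2 and CPU3 */
};

/* Assumption: pick the lowest OPP whose capacity covers the busiest CPU. */
static const struct opp *pick_opp(const struct perf_domain *pd,
				  unsigned long max_util)
{
	size_t i;

	assert(pd->nr_opp > 0);
	for (i = 0; i < pd->nr_opp - 1; i++) {
		if (pd->table[i].cap >= max_util)
			break;
	}
	/* Falls through to the highest OPP when nothing else fits. */
	return &pd->table[i];
}

/* Estimate for one domain: sum(util) * power / capacity at the chosen OPP. */
static unsigned long pd_energy(const struct perf_domain *pd,
			       const unsigned long *util)
{
	unsigned long max_util = 0, sum_util = 0;
	const struct opp *opp;
	int cpu;

	for (cpu = pd->first_cpu; cpu < pd->first_cpu + pd->nr_cpus; cpu++) {
		sum_util += util[cpu];
		if (util[cpu] > max_util)
			max_util = util[cpu];
	}

	opp = pick_opp(pd, max_util);
	return sum_util * opp->power / opp->cap;
}

/* Total estimate over both domains for one utilization landscape. */
static unsigned long system_energy(const unsigned long util[4])
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < sizeof(domains) / sizeof(domains[0]); i++)
		total += pd_energy(&domains[i], util);

	return total;
}
```

Calling system_energy() with the utilization landscapes of the three cases ({200, 300, 600, 500}, {200, 100, 600, 700} and {400, 100, 600, 500} for CPU0-3) yields 1364, 1485 and 1437, which matches the totals listed in Example 2 and again singles out Case 1.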
+ +The examples above would be nearly impossible to get right in a generic way, and +for all platforms, without knowing the cost of running at different OPPs on all +CPUs of the system. Thanks to its EM-based design, EAS should cope with them +correctly without too much trouble. However, in order to ensure a minimal +impact on throughput for high-utilization scenarios, EAS also implements another +mechanism called 'over-utilization'. + + +5. Over-utilization +------------------- + +From a general standpoint, the use-cases where EAS can help the most are those +involving a light/medium CPU utilization. Whenever long CPU-bound tasks are +being run, they will require all of the available CPU capacity, and there isn't +much that can be done by the scheduler to save energy without severely harming +throughput. In order to avoid hurting performance with EAS, CPUs are flagged as +'over-utilized' as soon as they are used at more than 80% of their compute +capacity. As long as no CPUs are over-utilized in a root domain, load balancing +is disabled and EAS overrides the wake-up balancing code. EAS is likely to load +the most energy efficient CPUs of the system more than the others if that can be +done without harming throughput. So, the load-balancer is disabled to prevent +it from breaking the energy-efficient task placement found by EAS. It is safe to +do so when the system isn't overutilized since being below the 80% tipping point +implies that: + + a. there is some idle time on all CPUs, so the utilization signals used by + EAS are likely to accurately represent the 'size' of the various tasks + in the system; + b. all tasks should already be provided with enough CPU capacity, + regardless of their nice values; + c. since there is spare capacity all tasks must be blocking/sleeping + regularly and balancing at wake-up is sufficient. + +As soon as one CPU goes above the 80% tipping point, at least one of the three +assumptions above becomes incorrect. 
In this scenario, the 'overutilized' flag +is raised for the entire root domain, EAS is disabled, and the load-balancer is +re-enabled. By doing so, the scheduler falls back onto load-based algorithms for +wake-up and load balance under CPU-bound conditions. This provides a better +respect of the nice values of tasks. + +Since the notion of overutilization largely relies on detecting whether or not +there is some idle time in the system, the CPU capacity 'stolen' by higher +(than CFS) scheduling classes (as well as IRQ) must be taken into account. As +such, the detection of overutilization accounts for the capacity used not only +by CFS tasks, but also by the other scheduling classes and IRQ. + + +6. Dependencies and requirements for EAS +---------------------------------------- + +Energy Aware Scheduling depends on the CPUs of the system having specific +hardware properties and on other features of the kernel being enabled. This +section lists these dependencies and provides hints as to how they can be met. + + + 6.1 - Asymmetric CPU topology + +As mentioned in the introduction, EAS is only supported on platforms with +asymmetric CPU topologies for now. This requirement is checked at run-time by +looking for the presence of the SD_ASYM_CPUCAPACITY flag when the scheduling +domains are built. + +The flag is set/cleared automatically by the scheduler topology code whenever +there are CPUs with different capacities in a root domain. The capacities of +CPUs are provided by arch-specific code through the arch_scale_cpu_capacity() +callback. As an example, arm and arm64 share an implementation of this callback +which uses a combination of CPUFreq data and device-tree bindings to compute the +capacity of CPUs (see drivers/base/arch_topology.c for more details). + +So, in order to use EAS on your platform your architecture must implement the +arch_scale_cpu_capacity() callback, and some of the CPUs must have a lower +capacity than others. 
+ +Please note that EAS is not fundamentally incompatible with SMP, but no +significant savings on SMP platforms have been observed yet. This restriction +could be amended in the future if proven otherwise. + + + 6.2 - Energy Model presence + +EAS uses the EM of a platform to estimate the impact of scheduling decisions on +energy. So, your platform must provide power cost tables to the EM framework in +order to make EAS start. To do so, please refer to documentation of the +independent EM framework in Documentation/power/energy-model.txt. + +Please also note that the scheduling domains need to be re-built after the +EM has been registered in order to start EAS. + + + 6.3 - Energy Model complexity + +The task wake-up path is very latency-sensitive. When the EM of a platform is +too complex (too many CPUs, too many performance domains, too many performance +states, ...), the cost of using it in the wake-up path can become prohibitive. +The energy-aware wake-up algorithm has a complexity of: + + C = Nd * (Nc + Ns) + +with: Nd the number of performance domains; Nc the number of CPUs; and Ns the +total number of OPPs (ex: for two perf. domains with 4 OPPs each, Ns = 8). + +A complexity check is performed at the root domain level, when scheduling +domains are built. EAS will not start on a root domain if its C happens to be +higher than the completely arbitrary EM_MAX_COMPLEXITY threshold (2048 at the +time of writing). + +If you really want to use EAS but the complexity of your platform's Energy +Model is too high to be used with a single root domain, you're left with only +two possible options: + + 1. split your system into separate, smaller, root domains using exclusive + cpusets and enable EAS locally on each of them. This option has the + benefit to work out of the box but the drawback of preventing load + balance between root domains, which can result in an unbalanced system + overall; + 2. 
submit patches to reduce the complexity of the EAS wake-up algorithm, + hence enabling it to cope with larger EMs in reasonable time. + + + 6.4 - Schedutil governor + +EAS tries to predict at which OPP the CPUs will be running in the near future +in order to estimate their energy consumption. To do so, it is assumed that OPPs +of CPUs follow their utilization. + +Although it is very difficult to provide hard guarantees regarding the accuracy +of this assumption in practice (because the hardware might not do what it is +told to do, for example), schedutil as opposed to other CPUFreq governors at +least _requests_ frequencies calculated using the utilization signals. +Consequently, the only sane governor to use together with EAS is schedutil, +because it is the only one providing some degree of consistency between +frequency requests and energy predictions. + +Using EAS with any other governor than schedutil is not supported. + + + 6.5 Scale-invariant utilization signals + +In order to make accurate predictions across CPUs and for all performance +states, EAS needs frequency-invariant and CPU-invariant PELT signals. These can +be obtained using the architecture-defined arch_scale_{cpu,freq}_capacity() +callbacks. + +Using EAS on a platform that doesn't implement these two callbacks is not +supported. + + + 6.6 Multithreading (SMT) + +EAS in its current form is SMT unaware and is not able to leverage +multithreaded hardware to save energy. EAS considers threads as independent +CPUs, which can actually be counter-productive for both performance and energy. + +EAS on SMT is not supported. 
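As a rough illustration of the complexity bound described in Section 6.3, the check C = Nd * (Nc + Ns) against EM_MAX_COMPLEXITY can be sketched in a few lines of C. This is a minimal sketch under stated assumptions rather than the scheduler's actual code: eas_complexity() and eas_complexity_ok() are hypothetical helpers introduced here, and the threshold value 2048 is the one quoted in Section 6.3 "at the time of writing".

```c
#include <assert.h>
#include <stdbool.h>

/* Threshold quoted in Section 6.3; may differ in other kernel versions. */
#define EM_MAX_COMPLEXITY 2048

/*
 * Hypothetical helper: C = Nd * (Nc + Ns), with Nd the number of
 * performance domains, Nc the number of CPUs and Ns the total number
 * of OPPs across all performance domains of the root domain.
 */
static unsigned long eas_complexity(unsigned int nr_pd, unsigned int nr_cpus,
				    unsigned int nr_opp_total)
{
	/* A root domain is assumed to span at least one CPU. */
	assert(nr_cpus > 0);

	return (unsigned long)nr_pd * ((unsigned long)nr_cpus + nr_opp_total);
}

/* EAS is not started when C is higher than the threshold. */
static bool eas_complexity_ok(unsigned int nr_pd, unsigned int nr_cpus,
			      unsigned int nr_opp_total)
{
	return eas_complexity(nr_pd, nr_cpus, nr_opp_total) <= EM_MAX_COMPLEXITY;
}
```

For instance, a root domain with two performance domains of 4 OPPs each (Ns = 8) and 8 CPUs gives C = 2 * (8 + 8) = 32, far below the threshold, whereas 32 performance domains with 64 CPUs and 32 OPPs in total give C = 3072 and would be rejected.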
diff --git a/Makefile b/Makefile index c05200f92c6d..277bce19b629 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 VERSION = 4 PATCHLEVEL = 19 -SUBLEVEL = 27 +SUBLEVEL = 30 EXTRAVERSION = NAME = "People's Front" diff --git a/arch/arm/boot/dts/exynos3250.dtsi b/arch/arm/boot/dts/exynos3250.dtsi index 27a1ee28c3bb..94efca78c42f 100644 --- a/arch/arm/boot/dts/exynos3250.dtsi +++ b/arch/arm/boot/dts/exynos3250.dtsi @@ -168,6 +168,9 @@ interrupt-controller; #interrupt-cells = <3>; interrupt-parent = <&gic>; + clock-names = "clkout8"; + clocks = <&cmu CLK_FIN_PLL>; + #clock-cells = <1>; }; mipi_phy: video-phy { diff --git a/arch/arm/boot/dts/exynos4412-odroid-common.dtsi b/arch/arm/boot/dts/exynos4412-odroid-common.dtsi index a09e46c9dbc0..00820d239753 100644 --- a/arch/arm/boot/dts/exynos4412-odroid-common.dtsi +++ b/arch/arm/boot/dts/exynos4412-odroid-common.dtsi @@ -49,7 +49,7 @@ }; emmc_pwrseq: pwrseq { - pinctrl-0 = <&sd1_cd>; + pinctrl-0 = <&emmc_rstn>; pinctrl-names = "default"; compatible = "mmc-pwrseq-emmc"; reset-gpios = <&gpk1 2 GPIO_ACTIVE_LOW>; @@ -161,12 +161,6 @@ cpu0-supply = <&buck2_reg>; }; -/* RSTN signal for eMMC */ -&sd1_cd { - samsung,pin-pud = ; - samsung,pin-drv = ; -}; - &pinctrl_1 { gpio_power_key: power_key { samsung,pins = "gpx1-3"; @@ -184,6 +178,11 @@ samsung,pins = "gpx3-7"; samsung,pin-pud = ; }; + + emmc_rstn: emmc-rstn { + samsung,pins = "gpk1-2"; + samsung,pin-pud = ; + }; }; &ehci { diff --git a/arch/arm/boot/dts/exynos5422-odroid-core.dtsi b/arch/arm/boot/dts/exynos5422-odroid-core.dtsi index 2f4f40882dab..27214e6ebe4f 100644 --- a/arch/arm/boot/dts/exynos5422-odroid-core.dtsi +++ b/arch/arm/boot/dts/exynos5422-odroid-core.dtsi @@ -334,7 +334,7 @@ buck8_reg: BUCK8 { regulator-name = "vdd_1.8v_ldo"; regulator-min-microvolt = <800000>; - regulator-max-microvolt = <1500000>; + regulator-max-microvolt = <2000000>; regulator-always-on; regulator-boot-on; }; diff --git a/arch/arm/boot/dts/imx6sx.dtsi 
b/arch/arm/boot/dts/imx6sx.dtsi index 844caa39364f..50083cecc6c9 100644 --- a/arch/arm/boot/dts/imx6sx.dtsi +++ b/arch/arm/boot/dts/imx6sx.dtsi @@ -462,7 +462,7 @@ }; gpt: gpt@2098000 { - compatible = "fsl,imx6sx-gpt", "fsl,imx31-gpt"; + compatible = "fsl,imx6sx-gpt", "fsl,imx6dl-gpt"; reg = <0x02098000 0x4000>; interrupts = ; clocks = <&clks IMX6SX_CLK_GPT_BUS>, diff --git a/arch/arm/boot/dts/meson.dtsi b/arch/arm/boot/dts/meson.dtsi index 0d9faf1a51ea..a86b89086334 100644 --- a/arch/arm/boot/dts/meson.dtsi +++ b/arch/arm/boot/dts/meson.dtsi @@ -263,7 +263,7 @@ compatible = "amlogic,meson6-dwmac", "snps,dwmac"; reg = <0xc9410000 0x10000 0xc1108108 0x4>; - interrupts = ; + interrupts = ; interrupt-names = "macirq"; status = "disabled"; }; diff --git a/arch/arm/boot/dts/meson8b-odroidc1.dts b/arch/arm/boot/dts/meson8b-odroidc1.dts index ef3177d3da3d..8fdeeffecbdb 100644 --- a/arch/arm/boot/dts/meson8b-odroidc1.dts +++ b/arch/arm/boot/dts/meson8b-odroidc1.dts @@ -125,7 +125,6 @@ /* Realtek RTL8211F (0x001cc916) */ eth_phy: ethernet-phy@0 { reg = <0>; - eee-broken-1000t; interrupt-parent = <&gpio_intc>; /* GPIOH_3 */ interrupts = <17 IRQ_TYPE_LEVEL_LOW>; @@ -172,8 +171,7 @@ cap-sd-highspeed; disable-wp; - cd-gpios = <&gpio CARD_6 GPIO_ACTIVE_HIGH>; - cd-inverted; + cd-gpios = <&gpio CARD_6 GPIO_ACTIVE_LOW>; vmmc-supply = <&tflash_vdd>; vqmmc-supply = <&tf_io>; diff --git a/arch/arm/boot/dts/meson8m2-mxiii-plus.dts b/arch/arm/boot/dts/meson8m2-mxiii-plus.dts index f5853610b20b..6ac02beb5fa7 100644 --- a/arch/arm/boot/dts/meson8m2-mxiii-plus.dts +++ b/arch/arm/boot/dts/meson8m2-mxiii-plus.dts @@ -206,8 +206,7 @@ cap-sd-highspeed; disable-wp; - cd-gpios = <&gpio CARD_6 GPIO_ACTIVE_HIGH>; - cd-inverted; + cd-gpios = <&gpio CARD_6 GPIO_ACTIVE_LOW>; vmmc-supply = <&vcc_3v3>; }; diff --git a/arch/arm/boot/dts/motorola-cpcap-mapphone.dtsi b/arch/arm/boot/dts/motorola-cpcap-mapphone.dtsi index ddc7a7bb33c0..f57acf8f66b9 100644 --- 
a/arch/arm/boot/dts/motorola-cpcap-mapphone.dtsi +++ b/arch/arm/boot/dts/motorola-cpcap-mapphone.dtsi @@ -105,7 +105,7 @@ interrupts-extended = < &cpcap 15 0 &cpcap 14 0 &cpcap 28 0 &cpcap 19 0 &cpcap 18 0 &cpcap 17 0 &cpcap 16 0 &cpcap 49 0 - &cpcap 48 1 + &cpcap 48 0 >; interrupt-names = "id_ground", "id_float", "se0conn", "vbusvld", diff --git a/arch/arm/boot/dts/omap3-n950-n9.dtsi b/arch/arm/boot/dts/omap3-n950-n9.dtsi index 0d9b85317529..e142e6c70a59 100644 --- a/arch/arm/boot/dts/omap3-n950-n9.dtsi +++ b/arch/arm/boot/dts/omap3-n950-n9.dtsi @@ -370,6 +370,19 @@ compatible = "ti,omap2-onenand"; reg = <0 0 0x20000>; /* CS0, offset 0, IO size 128K */ + /* + * These timings are based on CONFIG_OMAP_GPMC_DEBUG=y reported + * bootloader set values when booted with v4.19 using both N950 + * and N9 devices (OneNAND Manufacturer: Samsung): + * + * gpmc cs0 before gpmc_cs_program_settings: + * cs0 GPMC_CS_CONFIG1: 0xfd001202 + * cs0 GPMC_CS_CONFIG2: 0x00181800 + * cs0 GPMC_CS_CONFIG3: 0x00030300 + * cs0 GPMC_CS_CONFIG4: 0x18001804 + * cs0 GPMC_CS_CONFIG5: 0x03171d1d + * cs0 GPMC_CS_CONFIG6: 0x97080000 + */ gpmc,sync-read; gpmc,sync-write; gpmc,burst-length = <16>; @@ -379,26 +392,27 @@ gpmc,device-width = <2>; gpmc,mux-add-data = <2>; gpmc,cs-on-ns = <0>; - gpmc,cs-rd-off-ns = <87>; - gpmc,cs-wr-off-ns = <87>; + gpmc,cs-rd-off-ns = <122>; + gpmc,cs-wr-off-ns = <122>; gpmc,adv-on-ns = <0>; - gpmc,adv-rd-off-ns = <10>; - gpmc,adv-wr-off-ns = <10>; - gpmc,oe-on-ns = <15>; - gpmc,oe-off-ns = <87>; + gpmc,adv-rd-off-ns = <15>; + gpmc,adv-wr-off-ns = <15>; + gpmc,oe-on-ns = <20>; + gpmc,oe-off-ns = <122>; gpmc,we-on-ns = <0>; - gpmc,we-off-ns = <87>; - gpmc,rd-cycle-ns = <112>; - gpmc,wr-cycle-ns = <112>; - gpmc,access-ns = <81>; + gpmc,we-off-ns = <122>; + gpmc,rd-cycle-ns = <148>; + gpmc,wr-cycle-ns = <148>; + gpmc,access-ns = <117>; gpmc,page-burst-access-ns = <15>; gpmc,bus-turnaround-ns = <0>; gpmc,cycle2cycle-delay-ns = <0>; gpmc,wait-monitoring-ns = <0>; - 
gpmc,clk-activation-ns = <5>; - gpmc,wr-data-mux-bus-ns = <30>; - gpmc,wr-access-ns = <81>; - gpmc,sync-clk-ps = <15000>; + gpmc,clk-activation-ns = <10>; + gpmc,wr-data-mux-bus-ns = <40>; + gpmc,wr-access-ns = <117>; + + gpmc,sync-clk-ps = <15000>; /* TBC; Where this value came? */ /* * MTD partition table corresponding to Nokia's MeeGo 1.2 diff --git a/arch/arm/boot/dts/sun8i-h3-beelink-x2.dts b/arch/arm/boot/dts/sun8i-h3-beelink-x2.dts index 5d23667dc2d2..25540b7694d5 100644 --- a/arch/arm/boot/dts/sun8i-h3-beelink-x2.dts +++ b/arch/arm/boot/dts/sun8i-h3-beelink-x2.dts @@ -53,7 +53,7 @@ aliases { serial0 = &uart0; - /* ethernet0 is the H3 emac, defined in sun8i-h3.dtsi */ + ethernet0 = &emac; ethernet1 = &sdiowifi; }; diff --git a/arch/arm/plat-pxa/ssp.c b/arch/arm/plat-pxa/ssp.c index ed36dcab80f1..f51919974183 100644 --- a/arch/arm/plat-pxa/ssp.c +++ b/arch/arm/plat-pxa/ssp.c @@ -190,8 +190,6 @@ static int pxa_ssp_remove(struct platform_device *pdev) if (ssp == NULL) return -ENODEV; - iounmap(ssp->mmio_base); - res = platform_get_resource(pdev, IORESOURCE_MEM, 0); release_mem_region(res->start, resource_size(res)); @@ -201,7 +199,6 @@ static int pxa_ssp_remove(struct platform_device *pdev) list_del(&ssp->node); mutex_unlock(&ssp_lock); - kfree(ssp); return 0; } diff --git a/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts b/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts index f4964bee6a1a..e80a792827ed 100644 --- a/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts +++ b/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts @@ -118,6 +118,7 @@ reset-gpios = <&gpio0 5 GPIO_ACTIVE_LOW>; clocks = <&pmic>; clock-names = "ext_clock"; + post-power-on-delay-ms = <10>; power-off-delay-us = <10>; }; @@ -300,7 +301,6 @@ dwmmc_0: dwmmc0@f723d000 { cap-mmc-highspeed; - mmc-hs200-1_8v; non-removable; bus-width = <0x8>; vmmc-supply = <&ldo19>; diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi index cd3865e7a270..8c86c41a0d25 100644 --- 
a/arch/arm64/boot/dts/qcom/msm8996.dtsi +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi @@ -399,7 +399,7 @@ }; intc: interrupt-controller@9bc0000 { - compatible = "arm,gic-v3"; + compatible = "qcom,msm8996-gic-v3", "arm,gic-v3"; #interrupt-cells = <3>; interrupt-controller; #redistributor-regions = <1>; diff --git a/arch/arm64/boot/dts/renesas/r8a7796.dtsi b/arch/arm64/boot/dts/renesas/r8a7796.dtsi index cbd35c00b4af..33cb0281c39c 100644 --- a/arch/arm64/boot/dts/renesas/r8a7796.dtsi +++ b/arch/arm64/boot/dts/renesas/r8a7796.dtsi @@ -1161,6 +1161,9 @@ <&cpg CPG_CORE R8A7796_CLK_S3D1>, <&scif_clk>; clock-names = "fck", "brg_int", "scif_clk"; + dmas = <&dmac1 0x13>, <&dmac1 0x12>, + <&dmac2 0x13>, <&dmac2 0x12>; + dma-names = "tx", "rx", "tx", "rx"; power-domains = <&sysc R8A7796_PD_ALWAYS_ON>; resets = <&cpg 310>; status = "disabled"; diff --git a/arch/arm64/boot/dts/renesas/r8a77965.dtsi b/arch/arm64/boot/dts/renesas/r8a77965.dtsi index 0cd44461a0bd..f60f08ba1a6f 100644 --- a/arch/arm64/boot/dts/renesas/r8a77965.dtsi +++ b/arch/arm64/boot/dts/renesas/r8a77965.dtsi @@ -951,6 +951,9 @@ <&cpg CPG_CORE R8A77965_CLK_S3D1>, <&scif_clk>; clock-names = "fck", "brg_int", "scif_clk"; + dmas = <&dmac1 0x13>, <&dmac1 0x12>, + <&dmac2 0x13>, <&dmac2 0x12>; + dma-names = "tx", "rx", "tx", "rx"; power-domains = <&sysc R8A77965_PD_ALWAYS_ON>; resets = <&cpg 310>; status = "disabled"; diff --git a/arch/arm64/boot/dts/xilinx/zynqmp-zcu100-revC.dts b/arch/arm64/boot/dts/xilinx/zynqmp-zcu100-revC.dts index eb5e8bddb610..8954c8c6f547 100644 --- a/arch/arm64/boot/dts/xilinx/zynqmp-zcu100-revC.dts +++ b/arch/arm64/boot/dts/xilinx/zynqmp-zcu100-revC.dts @@ -101,6 +101,7 @@ sdio_pwrseq: sdio_pwrseq { compatible = "mmc-pwrseq-simple"; reset-gpios = <&gpio 7 GPIO_ACTIVE_LOW>; /* WIFI_EN */ + post-power-on-delay-ms = <10>; }; }; diff --git a/arch/arm64/configs/cuttlefish_defconfig b/arch/arm64/configs/cuttlefish_defconfig index d1ea96f0ba3f..87a22fa4c3f6 100644 --- 
a/arch/arm64/configs/cuttlefish_defconfig +++ b/arch/arm64/configs/cuttlefish_defconfig @@ -58,6 +58,7 @@ CONFIG_ENERGY_MODEL=y CONFIG_CPU_IDLE=y CONFIG_ARM_CPUIDLE=y CONFIG_CPU_FREQ=y +CONFIG_CPU_FREQ_TIMES=y CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL=y CONFIG_CPU_FREQ_GOV_POWERSAVE=y CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y @@ -124,6 +125,7 @@ CONFIG_NF_CT_NETLINK=y CONFIG_NETFILTER_XT_TARGET_CLASSIFY=y CONFIG_NETFILTER_XT_TARGET_CONNMARK=y CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=y +CONFIG_NETFILTER_XT_TARGET_CT=y CONFIG_NETFILTER_XT_TARGET_IDLETIMER=y CONFIG_NETFILTER_XT_TARGET_MARK=y CONFIG_NETFILTER_XT_TARGET_NFLOG=y @@ -229,6 +231,7 @@ CONFIG_PPP_DEFLATE=y CONFIG_PPP_MPPE=y CONFIG_PPTP=y CONFIG_PPPOL2TP=y +CONFIG_USB_RTL8152=y CONFIG_USB_USBNET=y # CONFIG_USB_NET_AX8817X is not set # CONFIG_USB_NET_AX88179_178A is not set @@ -299,6 +302,12 @@ CONFIG_DRM=y CONFIG_DRM_VIRTIO_GPU=y CONFIG_SOUND=y CONFIG_SND=y +CONFIG_SND_HRTIMER=y +# CONFIG_SND_SUPPORT_OLD_API is not set +# CONFIG_SND_VERBOSE_PROCFS is not set +# CONFIG_SND_DRIVERS is not set +CONFIG_SND_INTEL8X0=y +# CONFIG_SND_USB is not set CONFIG_HIDRAW=y CONFIG_UHID=y CONFIG_HID_A4TECH=y diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c index b5a367d4bba6..30bb13797034 100644 --- a/arch/arm64/kernel/probes/kprobes.c +++ b/arch/arm64/kernel/probes/kprobes.c @@ -478,13 +478,13 @@ bool arch_within_kprobe_blacklist(unsigned long addr) addr < (unsigned long)__entry_text_end) || (addr >= (unsigned long)__idmap_text_start && addr < (unsigned long)__idmap_text_end) || + (addr >= (unsigned long)__hyp_text_start && + addr < (unsigned long)__hyp_text_end) || !!search_exception_tables(addr)) return true; if (!is_kernel_in_hyp_mode()) { - if ((addr >= (unsigned long)__hyp_text_start && - addr < (unsigned long)__hyp_text_end) || - (addr >= (unsigned long)__hyp_idmap_text_start && + if ((addr >= (unsigned long)__hyp_idmap_text_start && addr < (unsigned long)__hyp_idmap_text_end)) return true; } diff 
--git a/arch/mips/boot/dts/ingenic/ci20.dts b/arch/mips/boot/dts/ingenic/ci20.dts index 50cff3cbcc6d..4f7b1fa31cf5 100644 --- a/arch/mips/boot/dts/ingenic/ci20.dts +++ b/arch/mips/boot/dts/ingenic/ci20.dts @@ -76,7 +76,7 @@ status = "okay"; pinctrl-names = "default"; - pinctrl-0 = <&pins_uart2>; + pinctrl-0 = <&pins_uart3>; }; &uart4 { @@ -196,9 +196,9 @@ bias-disable; }; - pins_uart2: uart2 { - function = "uart2"; - groups = "uart2-data", "uart2-hwflow"; + pins_uart3: uart3 { + function = "uart3"; + groups = "uart3-data", "uart3-hwflow"; bias-disable; }; diff --git a/arch/mips/kernel/irq.c b/arch/mips/kernel/irq.c index ba150c755fcc..85b6c60f285d 100644 --- a/arch/mips/kernel/irq.c +++ b/arch/mips/kernel/irq.c @@ -52,6 +52,7 @@ asmlinkage void spurious_interrupt(void) void __init init_IRQ(void) { int i; + unsigned int order = get_order(IRQ_STACK_SIZE); for (i = 0; i < NR_IRQS; i++) irq_set_noprobe(i); @@ -62,8 +63,7 @@ void __init init_IRQ(void) arch_init_irq(); for_each_possible_cpu(i) { - int irq_pages = IRQ_STACK_SIZE / PAGE_SIZE; - void *s = (void *)__get_free_pages(GFP_KERNEL, irq_pages); + void *s = (void *)__get_free_pages(GFP_KERNEL, order); irq_stack[i] = s; pr_debug("CPU%d IRQ stack at 0x%p - 0x%p\n", i, diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c index d4f7fd4550e1..85522c137f19 100644 --- a/arch/mips/kernel/process.c +++ b/arch/mips/kernel/process.c @@ -371,7 +371,7 @@ static inline int is_sp_move_ins(union mips_instruction *ip, int *frame_size) static int get_frame_info(struct mips_frame_info *info) { bool is_mmips = IS_ENABLED(CONFIG_CPU_MICROMIPS); - union mips_instruction insn, *ip, *ip_end; + union mips_instruction insn, *ip; const unsigned int max_insns = 128; unsigned int last_insn_size = 0; unsigned int i; @@ -384,10 +384,9 @@ static int get_frame_info(struct mips_frame_info *info) if (!ip) goto err; - ip_end = (void *)ip + info->func_size; - - for (i = 0; i < max_insns && ip < ip_end; i++) { + for (i = 0; i < 
max_insns; i++) { ip = (void *)ip + last_insn_size; + if (is_mmips && mm_insn_16bit(ip->halfword[0])) { insn.word = ip->halfword[0] << 16; last_insn_size = 2; diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h index 3fe4af8147d2..c23578a37b44 100644 --- a/arch/riscv/include/asm/processor.h +++ b/arch/riscv/include/asm/processor.h @@ -22,7 +22,7 @@ * This decides where the kernel will search for a free chunk of vm * space during mmap's. */ -#define TASK_UNMAPPED_BASE PAGE_ALIGN(TASK_SIZE >> 1) +#define TASK_UNMAPPED_BASE PAGE_ALIGN(TASK_SIZE / 3) #define STACK_TOP TASK_SIZE #define STACK_TOP_MAX STACK_TOP diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c index b2d26d9d8489..9713d4e8c22b 100644 --- a/arch/riscv/kernel/setup.c +++ b/arch/riscv/kernel/setup.c @@ -186,7 +186,7 @@ static void __init setup_bootmem(void) BUG_ON(mem_size == 0); set_max_mapnr(PFN_DOWN(mem_size)); - max_low_pfn = memblock_end_of_DRAM(); + max_low_pfn = PFN_DOWN(memblock_end_of_DRAM()); #ifdef CONFIG_BLK_DEV_INITRD setup_initrd(); diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 58a522f9bcc3..200a4b315e15 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -29,7 +29,8 @@ static void __init zone_sizes_init(void) unsigned long max_zone_pfns[MAX_NR_ZONES] = { 0, }; #ifdef CONFIG_ZONE_DMA32 - max_zone_pfns[ZONE_DMA32] = PFN_DOWN(min(4UL * SZ_1G, max_low_pfn)); + max_zone_pfns[ZONE_DMA32] = PFN_DOWN(min(4UL * SZ_1G, + (unsigned long) PFN_PHYS(max_low_pfn))); #endif max_zone_pfns[ZONE_NORMAL] = max_low_pfn; diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index 64037895b085..f105ae8651c9 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -600,6 +600,14 @@ ENTRY(trampoline_32bit_src) leal TRAMPOLINE_32BIT_PGTABLE_OFFSET(%ecx), %eax movl %eax, %cr3 3: + /* Set EFER.LME=1 as a precaution in case hypervsior pulls the rug */ + pushl %ecx + movl 
$MSR_EFER, %ecx + rdmsr + btsl $_EFER_LME, %eax + wrmsr + popl %ecx + /* Enable PAE and LA57 (if required) paging modes */ movl $X86_CR4_PAE, %eax cmpl $0, %edx diff --git a/arch/x86/boot/compressed/pgtable.h b/arch/x86/boot/compressed/pgtable.h index 91f75638f6e6..6ff7e81b5628 100644 --- a/arch/x86/boot/compressed/pgtable.h +++ b/arch/x86/boot/compressed/pgtable.h @@ -6,7 +6,7 @@ #define TRAMPOLINE_32BIT_PGTABLE_OFFSET 0 #define TRAMPOLINE_32BIT_CODE_OFFSET PAGE_SIZE -#define TRAMPOLINE_32BIT_CODE_SIZE 0x60 +#define TRAMPOLINE_32BIT_CODE_SIZE 0x70 #define TRAMPOLINE_32BIT_STACK_END TRAMPOLINE_32BIT_SIZE diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c index 9e2157371491..f8debf7aeb4c 100644 --- a/arch/x86/boot/compressed/pgtable_64.c +++ b/arch/x86/boot/compressed/pgtable_64.c @@ -1,5 +1,7 @@ +#include #include #include +#include #include "pgtable.h" #include "../string.h" @@ -37,9 +39,10 @@ int cmdline_find_option_bool(const char *option); static unsigned long find_trampoline_placement(void) { - unsigned long bios_start, ebda_start; + unsigned long bios_start = 0, ebda_start = 0; unsigned long trampoline_start; struct boot_e820_entry *entry; + char *signature; int i; /* @@ -47,8 +50,18 @@ static unsigned long find_trampoline_placement(void) * This code is based on reserve_bios_regions(). */ - ebda_start = *(unsigned short *)0x40e << 4; - bios_start = *(unsigned short *)0x413 << 10; + /* + * EFI systems may not provide legacy ROM. The memory may not be mapped + * at all. + * + * Only look for values in the legacy ROM for non-EFI system. 
+ */ + signature = (char *)&boot_params->efi_info.efi_loader_signature; + if (strncmp(signature, EFI32_LOADER_SIGNATURE, 4) && + strncmp(signature, EFI64_LOADER_SIGNATURE, 4)) { + ebda_start = *(unsigned short *)0x40e << 4; + bios_start = *(unsigned short *)0x413 << 10; + } if (bios_start < BIOS_START_MIN || bios_start > BIOS_START_MAX) bios_start = BIOS_START_MAX; diff --git a/arch/x86/configs/x86_64_cuttlefish_defconfig b/arch/x86/configs/x86_64_cuttlefish_defconfig index 8229cf8e66ea..4b536d0063f7 100644 --- a/arch/x86/configs/x86_64_cuttlefish_defconfig +++ b/arch/x86/configs/x86_64_cuttlefish_defconfig @@ -58,6 +58,7 @@ CONFIG_ACPI_PROCFS_POWER=y # CONFIG_ACPI_FAN is not set # CONFIG_ACPI_THERMAL is not set # CONFIG_X86_PM_TIMER is not set +CONFIG_CPU_FREQ_TIMES=y CONFIG_CPU_FREQ_GOV_ONDEMAND=y CONFIG_X86_ACPI_CPUFREQ=y CONFIG_PCI_MSI=y @@ -96,6 +97,7 @@ CONFIG_SYN_COOKIES=y CONFIG_NET_IPVTI=y CONFIG_INET_ESP=y # CONFIG_INET_XFRM_MODE_BEET is not set +CONFIG_INET_UDP_DIAG=y CONFIG_INET_DIAG_DESTROY=y CONFIG_TCP_CONG_ADVANCED=y # CONFIG_TCP_CONG_BIC is not set @@ -128,6 +130,7 @@ CONFIG_NF_CT_NETLINK=y CONFIG_NETFILTER_XT_TARGET_CLASSIFY=y CONFIG_NETFILTER_XT_TARGET_CONNMARK=y CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=y +CONFIG_NETFILTER_XT_TARGET_CT=y CONFIG_NETFILTER_XT_TARGET_IDLETIMER=y CONFIG_NETFILTER_XT_TARGET_MARK=y CONFIG_NETFILTER_XT_TARGET_NFLOG=y @@ -234,6 +237,7 @@ CONFIG_PPP=y CONFIG_PPP_BSDCOMP=y CONFIG_PPP_DEFLATE=y CONFIG_PPP_MPPE=y +CONFIG_USB_RTL8152=y CONFIG_USB_USBNET=y # CONFIG_USB_NET_AX8817X is not set # CONFIG_USB_NET_AX88179_178A is not set @@ -311,6 +315,12 @@ CONFIG_DRM=y CONFIG_DRM_VIRTIO_GPU=y CONFIG_SOUND=y CONFIG_SND=y +CONFIG_SND_HRTIMER=y +# CONFIG_SND_SUPPORT_OLD_API is not set +# CONFIG_SND_VERBOSE_PROCFS is not set +# CONFIG_SND_DRIVERS is not set +CONFIG_SND_INTEL8X0=y +# CONFIG_SND_USB is not set CONFIG_HIDRAW=y CONFIG_UHID=y CONFIG_HID_A4TECH=y diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index 
c04a8813cff9..a41554350893 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -1970,7 +1970,7 @@ static int x86_pmu_commit_txn(struct pmu *pmu) */ static void free_fake_cpuc(struct cpu_hw_events *cpuc) { - kfree(cpuc->shared_regs); + intel_cpuc_finish(cpuc); kfree(cpuc); } @@ -1982,14 +1982,11 @@ static struct cpu_hw_events *allocate_fake_cpuc(void) cpuc = kzalloc(sizeof(*cpuc), GFP_KERNEL); if (!cpuc) return ERR_PTR(-ENOMEM); - - /* only needed, if we have extra_regs */ - if (x86_pmu.extra_regs) { - cpuc->shared_regs = allocate_shared_regs(cpu); - if (!cpuc->shared_regs) - goto error; - } cpuc->is_fake = 1; + + if (intel_cpuc_prepare(cpuc, cpu)) + goto error; + return cpuc; error: free_fake_cpuc(cpuc); diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index fbd7551a8d44..12453cf7c11b 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -1995,6 +1995,39 @@ static void intel_pmu_nhm_enable_all(int added) intel_pmu_enable_all(added); } +static void intel_set_tfa(struct cpu_hw_events *cpuc, bool on) +{ + u64 val = on ? MSR_TFA_RTM_FORCE_ABORT : 0; + + if (cpuc->tfa_shadow != val) { + cpuc->tfa_shadow = val; + wrmsrl(MSR_TSX_FORCE_ABORT, val); + } +} + +static void intel_tfa_commit_scheduling(struct cpu_hw_events *cpuc, int idx, int cntr) +{ + /* + * We're going to use PMC3, make sure TFA is set before we touch it. + */ + if (cntr == 3 && !cpuc->is_fake) + intel_set_tfa(cpuc, true); +} + +static void intel_tfa_pmu_enable_all(int added) +{ + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); + + /* + * If we find PMC3 is no longer used when we enable the PMU, we can + * clear TFA. 
+ */ + if (!test_bit(3, cpuc->active_mask)) + intel_set_tfa(cpuc, false); + + intel_pmu_enable_all(added); +} + static inline u64 intel_pmu_get_status(void) { u64 status; @@ -2652,6 +2685,35 @@ intel_stop_scheduling(struct cpu_hw_events *cpuc) raw_spin_unlock(&excl_cntrs->lock); } +static struct event_constraint * +dyn_constraint(struct cpu_hw_events *cpuc, struct event_constraint *c, int idx) +{ + WARN_ON_ONCE(!cpuc->constraint_list); + + if (!(c->flags & PERF_X86_EVENT_DYNAMIC)) { + struct event_constraint *cx; + + /* + * grab pre-allocated constraint entry + */ + cx = &cpuc->constraint_list[idx]; + + /* + * initialize dynamic constraint + * with static constraint + */ + *cx = *c; + + /* + * mark constraint as dynamic + */ + cx->flags |= PERF_X86_EVENT_DYNAMIC; + c = cx; + } + + return c; +} + static struct event_constraint * intel_get_excl_constraints(struct cpu_hw_events *cpuc, struct perf_event *event, int idx, struct event_constraint *c) @@ -2682,27 +2744,7 @@ intel_get_excl_constraints(struct cpu_hw_events *cpuc, struct perf_event *event, * only needed when constraint has not yet * been cloned (marked dynamic) */ - if (!(c->flags & PERF_X86_EVENT_DYNAMIC)) { - struct event_constraint *cx; - - /* - * grab pre-allocated constraint entry - */ - cx = &cpuc->constraint_list[idx]; - - /* - * initialize dynamic constraint - * with static constraint - */ - *cx = *c; - - /* - * mark constraint as dynamic, so we - * can free it later on - */ - cx->flags |= PERF_X86_EVENT_DYNAMIC; - c = cx; - } + c = dyn_constraint(cpuc, c, idx); /* * From here on, the constraint is dynamic. @@ -3229,6 +3271,26 @@ glp_get_event_constraints(struct cpu_hw_events *cpuc, int idx, return c; } +static bool allow_tsx_force_abort = true; + +static struct event_constraint * +tfa_get_event_constraints(struct cpu_hw_events *cpuc, int idx, + struct perf_event *event) +{ + struct event_constraint *c = hsw_get_event_constraints(cpuc, idx, event); + + /* + * Without TFA we must not use PMC3. 
+ */ + if (!allow_tsx_force_abort && test_bit(3, c->idxmsk) && idx >= 0) { + c = dyn_constraint(cpuc, c, idx); + c->idxmsk64 &= ~(1ULL << 3); + c->weight--; + } + + return c; +} + /* * Broadwell: * @@ -3282,7 +3344,7 @@ ssize_t intel_event_sysfs_show(char *page, u64 config) return x86_event_sysfs_show(page, config, event); } -struct intel_shared_regs *allocate_shared_regs(int cpu) +static struct intel_shared_regs *allocate_shared_regs(int cpu) { struct intel_shared_regs *regs; int i; @@ -3314,23 +3376,24 @@ static struct intel_excl_cntrs *allocate_excl_cntrs(int cpu) return c; } -static int intel_pmu_cpu_prepare(int cpu) -{ - struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu); +int intel_cpuc_prepare(struct cpu_hw_events *cpuc, int cpu) +{ if (x86_pmu.extra_regs || x86_pmu.lbr_sel_map) { cpuc->shared_regs = allocate_shared_regs(cpu); if (!cpuc->shared_regs) goto err; } - if (x86_pmu.flags & PMU_FL_EXCL_CNTRS) { + if (x86_pmu.flags & (PMU_FL_EXCL_CNTRS | PMU_FL_TFA)) { size_t sz = X86_PMC_IDX_MAX * sizeof(struct event_constraint); - cpuc->constraint_list = kzalloc(sz, GFP_KERNEL); + cpuc->constraint_list = kzalloc_node(sz, GFP_KERNEL, cpu_to_node(cpu)); if (!cpuc->constraint_list) goto err_shared_regs; + } + if (x86_pmu.flags & PMU_FL_EXCL_CNTRS) { cpuc->excl_cntrs = allocate_excl_cntrs(cpu); if (!cpuc->excl_cntrs) goto err_constraint_list; @@ -3352,6 +3415,11 @@ static int intel_pmu_cpu_prepare(int cpu) return -ENOMEM; } +static int intel_pmu_cpu_prepare(int cpu) +{ + return intel_cpuc_prepare(&per_cpu(cpu_hw_events, cpu), cpu); +} + static void flip_smm_bit(void *data) { unsigned long set = *(unsigned long *)data; @@ -3423,9 +3491,8 @@ static void intel_pmu_cpu_starting(int cpu) } } -static void free_excl_cntrs(int cpu) +static void free_excl_cntrs(struct cpu_hw_events *cpuc) { - struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu); struct intel_excl_cntrs *c; c = cpuc->excl_cntrs; @@ -3433,9 +3500,10 @@ static void free_excl_cntrs(int cpu) if 
(c->core_id == -1 || --c->refcnt == 0) kfree(c); cpuc->excl_cntrs = NULL; - kfree(cpuc->constraint_list); - cpuc->constraint_list = NULL; } + + kfree(cpuc->constraint_list); + cpuc->constraint_list = NULL; } static void intel_pmu_cpu_dying(int cpu) @@ -3443,9 +3511,8 @@ static void intel_pmu_cpu_dying(int cpu) fini_debug_store_on_cpu(cpu); } -static void intel_pmu_cpu_dead(int cpu) +void intel_cpuc_finish(struct cpu_hw_events *cpuc) { - struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu); struct intel_shared_regs *pc; pc = cpuc->shared_regs; @@ -3455,7 +3522,12 @@ static void intel_pmu_cpu_dead(int cpu) cpuc->shared_regs = NULL; } - free_excl_cntrs(cpu); + free_excl_cntrs(cpuc); +} + +static void intel_pmu_cpu_dead(int cpu) +{ + intel_cpuc_finish(&per_cpu(cpu_hw_events, cpu)); } static void intel_pmu_sched_task(struct perf_event_context *ctx, @@ -3917,8 +3989,11 @@ static struct attribute *intel_pmu_caps_attrs[] = { NULL }; +static DEVICE_BOOL_ATTR(allow_tsx_force_abort, 0644, allow_tsx_force_abort); + static struct attribute *intel_pmu_attrs[] = { &dev_attr_freeze_on_smi.attr, + NULL, /* &dev_attr_allow_tsx_force_abort.attr.attr */ NULL, }; @@ -4374,6 +4449,15 @@ __init int intel_pmu_init(void) x86_pmu.cpu_events = get_hsw_events_attrs(); intel_pmu_pebs_data_source_skl( boot_cpu_data.x86_model == INTEL_FAM6_SKYLAKE_X); + + if (boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)) { + x86_pmu.flags |= PMU_FL_TFA; + x86_pmu.get_event_constraints = tfa_get_event_constraints; + x86_pmu.enable_all = intel_tfa_pmu_enable_all; + x86_pmu.commit_scheduling = intel_tfa_commit_scheduling; + intel_pmu_attrs[1] = &dev_attr_allow_tsx_force_abort.attr.attr; + } + pr_cont("Skylake events, "); name = "skylake"; break; @@ -4515,7 +4599,7 @@ static __init int fixup_ht_bug(void) hardlockup_detector_perf_restart(); for_each_online_cpu(c) - free_excl_cntrs(c); + free_excl_cntrs(&per_cpu(cpu_hw_events, c)); cpus_read_unlock(); pr_info("PMU erratum BJ122, BV98, HSD29 workaround disabled, HT 
off\n"); diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index 0ee3a441ad79..42a36280d168 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -242,6 +242,11 @@ struct cpu_hw_events { struct intel_excl_cntrs *excl_cntrs; int excl_thread_id; /* 0 or 1 */ + /* + * SKL TSX_FORCE_ABORT shadow + */ + u64 tfa_shadow; + /* * AMD specific bits */ @@ -679,6 +684,7 @@ do { \ #define PMU_FL_EXCL_CNTRS 0x4 /* has exclusive counter requirements */ #define PMU_FL_EXCL_ENABLED 0x8 /* exclusive counter active */ #define PMU_FL_PEBS_ALL 0x10 /* all events are valid PEBS events */ +#define PMU_FL_TFA 0x20 /* deal with TSX force abort */ #define EVENT_VAR(_id) event_attr_##_id #define EVENT_PTR(_id) &event_attr_##_id.attr.attr @@ -887,7 +893,8 @@ struct event_constraint * x86_get_event_constraints(struct cpu_hw_events *cpuc, int idx, struct perf_event *event); -struct intel_shared_regs *allocate_shared_regs(int cpu); +extern int intel_cpuc_prepare(struct cpu_hw_events *cpuc, int cpu); +extern void intel_cpuc_finish(struct cpu_hw_events *cpuc); int intel_pmu_init(void); @@ -1023,9 +1030,13 @@ static inline int intel_pmu_init(void) return 0; } -static inline struct intel_shared_regs *allocate_shared_regs(int cpu) +static inline int intel_cpuc_prepare(struct cpu_hw_events *cpuc, int cpu) +{ + return 0; +} + +static inline void intel_cpuc_finish(struct cpu_hw_events *cpuc) { - return NULL; } static inline int is_ht_workaround_enabled(void) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 89a048c2faec..7b31ee5223fc 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -340,6 +340,7 @@ /* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */ #define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */ #define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */ +#define 
X86_FEATURE_TSX_FORCE_ABORT (18*32+13) /* "" TSX_FORCE_ABORT */ #define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */ #define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */ #define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */ diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 1f9de7635bcb..f14ca0be1e3f 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -629,6 +629,12 @@ #define MSR_IA32_TSC_DEADLINE 0x000006E0 + +#define MSR_TSX_FORCE_ABORT 0x0000010F + +#define MSR_TFA_RTM_FORCE_ABORT_BIT 0 +#define MSR_TFA_RTM_FORCE_ABORT BIT_ULL(MSR_TFA_RTM_FORCE_ABORT_BIT) + /* P4/Xeon+ specific */ #define MSR_IA32_MCG_EAX 0x00000180 #define MSR_IA32_MCG_EBX 0x00000181 diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h index b99d497e342d..0b6352aabbd3 100644 --- a/arch/x86/include/asm/page_64_types.h +++ b/arch/x86/include/asm/page_64_types.h @@ -7,7 +7,11 @@ #endif #ifdef CONFIG_KASAN +#ifdef CONFIG_KASAN_EXTRA +#define KASAN_STACK_ORDER 2 +#else #define KASAN_STACK_ORDER 1 +#endif #else #define KASAN_STACK_ORDER 0 #endif diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index eeea634bee0a..6a25278e0092 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -818,11 +818,9 @@ static void init_amd_bd(struct cpuinfo_x86 *c) static void init_amd_zn(struct cpuinfo_x86 *c) { set_cpu_cap(c, X86_FEATURE_ZEN); - /* - * Fix erratum 1076: CPB feature bit not being set in CPUID. It affects - * all up to and including B1. - */ - if (c->x86_model <= 1 && c->x86_stepping <= 1) + + /* Fix erratum 1076: CPB feature bit not being set in CPUID. 
*/ + if (!cpu_has(c, X86_FEATURE_CPB)) set_cpu_cap(c, X86_FEATURE_CPB); } diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c index 07b5fc00b188..a4e7e100ed26 100644 --- a/arch/x86/kernel/cpu/microcode/amd.c +++ b/arch/x86/kernel/cpu/microcode/amd.c @@ -707,7 +707,7 @@ load_microcode_amd(bool save, u8 family, const u8 *data, size_t size) if (!p) { return ret; } else { - if (boot_cpu_data.microcode == p->patch_id) + if (boot_cpu_data.microcode >= p->patch_id) return ret; ret = UCODE_NEW; diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c index 278cd07228dd..9490a2845f14 100644 --- a/arch/x86/kernel/kexec-bzimage64.c +++ b/arch/x86/kernel/kexec-bzimage64.c @@ -167,6 +167,9 @@ setup_efi_state(struct boot_params *params, unsigned long params_load_addr, struct efi_info *current_ei = &boot_params.efi_info; struct efi_info *ei = ¶ms->efi_info; + if (!efi_enabled(EFI_RUNTIME_SERVICES)) + return 0; + if (!current_ei->efi_memmap_size) return 0; diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c index 13f4485ca388..bd372e896557 100644 --- a/arch/x86/pci/fixup.c +++ b/arch/x86/pci/fixup.c @@ -641,6 +641,22 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x334b, quirk_no_aersid); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x334c, quirk_no_aersid); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x334d, quirk_no_aersid); +static void quirk_intel_th_dnv(struct pci_dev *dev) +{ + struct resource *r = &dev->resource[4]; + + /* + * Denverton reports 2k of RTIT_BAR (intel_th resource 4), which + * appears to be 4 MB in reality. 
+ */ + if (r->end == r->start + 0x7ff) { + r->start = 0; + r->end = 0x3fffff; + r->flags |= IORESOURCE_UNSET; + } +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x19e1, quirk_intel_th_dnv); + #ifdef CONFIG_PHYS_ADDR_T_64BIT #define AMD_141b_MMIO_BASE(x) (0x80 + (x) * 0x8) diff --git a/arch/xtensa/configs/smp_lx200_defconfig b/arch/xtensa/configs/smp_lx200_defconfig index 11fed6c06a7c..b5938160fb3d 100644 --- a/arch/xtensa/configs/smp_lx200_defconfig +++ b/arch/xtensa/configs/smp_lx200_defconfig @@ -33,6 +33,7 @@ CONFIG_SMP=y CONFIG_HOTPLUG_CPU=y # CONFIG_INITIALIZE_XTENSA_MMU_INSIDE_VMLINUX is not set # CONFIG_PCI is not set +CONFIG_VECTORS_OFFSET=0x00002000 CONFIG_XTENSA_PLATFORM_XTFPGA=y CONFIG_CMDLINE_BOOL=y CONFIG_CMDLINE="earlycon=uart8250,mmio32native,0xfd050020,115200n8 console=ttyS0,115200n8 ip=dhcp root=/dev/nfs rw debug memmap=96M@0" diff --git a/arch/xtensa/kernel/head.S b/arch/xtensa/kernel/head.S index 9053a5622d2c..5bd38ea2da38 100644 --- a/arch/xtensa/kernel/head.S +++ b/arch/xtensa/kernel/head.S @@ -280,12 +280,13 @@ should_never_return: movi a2, cpu_start_ccount 1: + memw l32i a3, a2, 0 beqi a3, 0, 1b movi a3, 0 s32i a3, a2, 0 - memw 1: + memw l32i a3, a2, 0 beqi a3, 0, 1b wsr a3, ccount @@ -321,11 +322,13 @@ ENTRY(cpu_restart) rsr a0, prid neg a2, a0 movi a3, cpu_start_id + memw s32i a2, a3, 0 #if XCHAL_DCACHE_IS_WRITEBACK dhwbi a3, 0 #endif 1: + memw l32i a2, a3, 0 dhi a3, 0 bne a2, a0, 1b diff --git a/arch/xtensa/kernel/process.c b/arch/xtensa/kernel/process.c index 4bb68133a72a..5a0e0bd68b76 100644 --- a/arch/xtensa/kernel/process.c +++ b/arch/xtensa/kernel/process.c @@ -320,8 +320,8 @@ unsigned long get_wchan(struct task_struct *p) /* Stack layout: sp-4: ra, sp-3: sp' */ - pc = MAKE_PC_FROM_RA(*(unsigned long*)sp - 4, sp); - sp = *(unsigned long *)sp - 3; + pc = MAKE_PC_FROM_RA(SPILL_SLOT(sp, 0), sp); + sp = SPILL_SLOT(sp, 1); } while (count++ < 16); return 0; } diff --git a/arch/xtensa/kernel/smp.c b/arch/xtensa/kernel/smp.c index 
932d64689bac..be1f280c322c 100644 --- a/arch/xtensa/kernel/smp.c +++ b/arch/xtensa/kernel/smp.c @@ -83,7 +83,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus) { unsigned i; - for (i = 0; i < max_cpus; ++i) + for_each_possible_cpu(i) set_cpu_present(i, true); } @@ -96,6 +96,11 @@ void __init smp_init_cpus(void) pr_info("%s: Core Count = %d\n", __func__, ncpus); pr_info("%s: Core Id = %d\n", __func__, core_id); + if (ncpus > NR_CPUS) { + ncpus = NR_CPUS; + pr_info("%s: limiting core count by %d\n", __func__, ncpus); + } + for (i = 0; i < ncpus; ++i) set_cpu_possible(i, true); } @@ -195,9 +200,11 @@ static int boot_secondary(unsigned int cpu, struct task_struct *ts) int i; #ifdef CONFIG_HOTPLUG_CPU - cpu_start_id = cpu; - system_flush_invalidate_dcache_range( - (unsigned long)&cpu_start_id, sizeof(cpu_start_id)); + WRITE_ONCE(cpu_start_id, cpu); + /* Pairs with the third memw in the cpu_restart */ + mb(); + system_flush_invalidate_dcache_range((unsigned long)&cpu_start_id, + sizeof(cpu_start_id)); #endif smp_call_function_single(0, mx_cpu_start, (void *)cpu, 1); @@ -206,18 +213,21 @@ static int boot_secondary(unsigned int cpu, struct task_struct *ts) ccount = get_ccount(); while (!ccount); - cpu_start_ccount = ccount; + WRITE_ONCE(cpu_start_ccount, ccount); - while (time_before(jiffies, timeout)) { + do { + /* + * Pairs with the first two memws in the + * .Lboot_secondary. 
+ */ mb(); - if (!cpu_start_ccount) - break; - } + ccount = READ_ONCE(cpu_start_ccount); + } while (ccount && time_before(jiffies, timeout)); - if (cpu_start_ccount) { + if (ccount) { smp_call_function_single(0, mx_cpu_stop, - (void *)cpu, 1); - cpu_start_ccount = 0; + (void *)cpu, 1); + WRITE_ONCE(cpu_start_ccount, 0); return -EIO; } } @@ -237,6 +247,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle) pr_debug("%s: Calling wakeup_secondary(cpu:%d, idle:%p, sp: %08lx)\n", __func__, cpu, idle, start_info.stack); + init_completion(&cpu_running); ret = boot_secondary(cpu, idle); if (ret == 0) { wait_for_completion_timeout(&cpu_running, @@ -298,8 +309,10 @@ void __cpu_die(unsigned int cpu) unsigned long timeout = jiffies + msecs_to_jiffies(1000); while (time_before(jiffies, timeout)) { system_invalidate_dcache_range((unsigned long)&cpu_start_id, - sizeof(cpu_start_id)); - if (cpu_start_id == -cpu) { + sizeof(cpu_start_id)); + /* Pairs with the second memw in the cpu_restart */ + mb(); + if (READ_ONCE(cpu_start_id) == -cpu) { platform_cpu_kill(cpu); return; } diff --git a/arch/xtensa/kernel/time.c b/arch/xtensa/kernel/time.c index fd524a54d2ab..378186b5eb40 100644 --- a/arch/xtensa/kernel/time.c +++ b/arch/xtensa/kernel/time.c @@ -89,7 +89,7 @@ static int ccount_timer_shutdown(struct clock_event_device *evt) container_of(evt, struct ccount_timer, evt); if (timer->irq_enabled) { - disable_irq(evt->irq); + disable_irq_nosync(evt->irq); timer->irq_enabled = 0; } return 0; diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c index 19923f8a029d..b154e057ca67 100644 --- a/block/blk-iolatency.c +++ b/block/blk-iolatency.c @@ -72,6 +72,7 @@ #include #include #include +#include #include "blk-rq-qos.h" #include "blk-stat.h" @@ -568,6 +569,9 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio) return; enabled = blk_iolatency_enabled(iolat->blkiolat); + if (!enabled) + return; + while (blkg && blkg->parent) { iolat = blkg_to_lat(blkg); if 
(!iolat) { @@ -577,7 +581,7 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio) rqw = &iolat->rq_wait; atomic_dec(&rqw->inflight); - if (!enabled || iolat->min_lat_nsec == 0) + if (iolat->min_lat_nsec == 0) goto next; iolatency_record_time(iolat, &bio->bi_issue, now, issue_as_root); @@ -721,10 +725,13 @@ int blk_iolatency_init(struct request_queue *q) return 0; } -static void iolatency_set_min_lat_nsec(struct blkcg_gq *blkg, u64 val) +/* + * return 1 for enabling iolatency, return -1 for disabling iolatency, otherwise + * return 0. + */ +static int iolatency_set_min_lat_nsec(struct blkcg_gq *blkg, u64 val) { struct iolatency_grp *iolat = blkg_to_lat(blkg); - struct blk_iolatency *blkiolat = iolat->blkiolat; u64 oldval = iolat->min_lat_nsec; iolat->min_lat_nsec = val; @@ -733,9 +740,10 @@ static void iolatency_set_min_lat_nsec(struct blkcg_gq *blkg, u64 val) BLKIOLATENCY_MAX_WIN_SIZE); if (!oldval && val) - atomic_inc(&blkiolat->enabled); + return 1; if (oldval && !val) - atomic_dec(&blkiolat->enabled); + return -1; + return 0; } static void iolatency_clear_scaling(struct blkcg_gq *blkg) @@ -768,6 +776,7 @@ static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf, u64 lat_val = 0; u64 oldval; int ret; + int enable = 0; ret = blkg_conf_prep(blkcg, &blkcg_policy_iolatency, buf, &ctx); if (ret) @@ -803,7 +812,12 @@ static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf, blkg = ctx.blkg; oldval = iolat->min_lat_nsec; - iolatency_set_min_lat_nsec(blkg, lat_val); + enable = iolatency_set_min_lat_nsec(blkg, lat_val); + if (enable) { + WARN_ON_ONCE(!blk_get_queue(blkg->q)); + blkg_get(blkg); + } + if (oldval != iolat->min_lat_nsec) { iolatency_clear_scaling(blkg); } @@ -811,6 +825,24 @@ static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf, ret = 0; out: blkg_conf_finish(&ctx); + if (ret == 0 && enable) { + struct iolatency_grp *tmp = blkg_to_lat(blkg); + struct blk_iolatency *blkiolat = 
tmp->blkiolat; + + blk_mq_freeze_queue(blkg->q); + + if (enable == 1) + atomic_inc(&blkiolat->enabled); + else if (enable == -1) + atomic_dec(&blkiolat->enabled); + else + WARN_ON_ONCE(1); + + blk_mq_unfreeze_queue(blkg->q); + + blkg_put(blkg); + blk_put_queue(blkg->q); + } return ret ?: nbytes; } @@ -910,8 +942,14 @@ static void iolatency_pd_offline(struct blkg_policy_data *pd) { struct iolatency_grp *iolat = pd_to_lat(pd); struct blkcg_gq *blkg = lat_to_blkg(iolat); + struct blk_iolatency *blkiolat = iolat->blkiolat; + int ret; - iolatency_set_min_lat_nsec(blkg, 0); + ret = iolatency_set_min_lat_nsec(blkg, 0); + if (ret == 1) + atomic_inc(&blkiolat->enabled); + if (ret == -1) + atomic_dec(&blkiolat->enabled); iolatency_clear_scaling(blkg); } diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 467c43cc9371..e378af5382ca 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -984,9 +984,9 @@ static void __device_release_driver(struct device *dev, struct device *parent) drv->remove(dev); device_links_driver_cleanup(dev); - dma_deconfigure(dev); devres_release_all(dev); + dma_deconfigure(dev); dev->driver = NULL; dev_set_drvdata(dev, NULL); if (dev->pm_domain && dev->pm_domain->dismiss) diff --git a/drivers/bluetooth/btrtl.c b/drivers/bluetooth/btrtl.c index 7f9ea8e4c1b2..1342f8e6025c 100644 --- a/drivers/bluetooth/btrtl.c +++ b/drivers/bluetooth/btrtl.c @@ -544,10 +544,9 @@ struct btrtl_device_info *btrtl_initialize(struct hci_dev *hdev, hdev->bus); if (!btrtl_dev->ic_info) { - rtl_dev_err(hdev, "rtl: unknown IC info, lmp subver %04x, hci rev %04x, hci ver %04x", + rtl_dev_info(hdev, "rtl: unknown IC info, lmp subver %04x, hci rev %04x, hci ver %04x", lmp_subver, hci_rev, hci_ver); - ret = -EINVAL; - goto err_free; + return btrtl_dev; } if (btrtl_dev->ic_info->has_rom_version) { @@ -602,6 +601,11 @@ int btrtl_download_firmware(struct hci_dev *hdev, * standard btusb. Once that firmware is uploaded, the subver changes * to a different value. 
*/ + if (!btrtl_dev->ic_info) { + rtl_dev_info(hdev, "rtl: assuming no firmware upload needed\n"); + return 0; + } + switch (btrtl_dev->ic_info->lmp_subver) { case RTL_ROM_LMP_8723A: case RTL_ROM_LMP_3499: diff --git a/drivers/char/applicom.c b/drivers/char/applicom.c index c0a5b1f3a986..4ccc39e00ced 100644 --- a/drivers/char/applicom.c +++ b/drivers/char/applicom.c @@ -32,6 +32,7 @@ #include #include #include +#include #include #include @@ -386,7 +387,11 @@ static ssize_t ac_write(struct file *file, const char __user *buf, size_t count, TicCard = st_loc.tic_des_from_pc; /* tic number to send */ IndexCard = NumCard - 1; - if((NumCard < 1) || (NumCard > MAX_BOARD) || !apbs[IndexCard].RamIO) + if (IndexCard >= MAX_BOARD) + return -EINVAL; + IndexCard = array_index_nospec(IndexCard, MAX_BOARD); + + if (!apbs[IndexCard].RamIO) return -EINVAL; #ifdef DEBUG @@ -697,6 +702,7 @@ static long ac_ioctl(struct file *file, unsigned int cmd, unsigned long arg) unsigned char IndexCard; void __iomem *pmem; int ret = 0; + static int warncount = 10; volatile unsigned char byte_reset_it; struct st_ram_io *adgl; void __user *argp = (void __user *)arg; @@ -711,16 +717,12 @@ static long ac_ioctl(struct file *file, unsigned int cmd, unsigned long arg) mutex_lock(&ac_mutex); IndexCard = adgl->num_card-1; - if(cmd != 6 && ((IndexCard >= MAX_BOARD) || !apbs[IndexCard].RamIO)) { - static int warncount = 10; - if (warncount) { - printk( KERN_WARNING "APPLICOM driver IOCTL, bad board number %d\n",(int)IndexCard+1); - warncount--; - } - kfree(adgl); - mutex_unlock(&ac_mutex); - return -EINVAL; - } + if (cmd != 6 && IndexCard >= MAX_BOARD) + goto err; + IndexCard = array_index_nospec(IndexCard, MAX_BOARD); + + if (cmd != 6 && !apbs[IndexCard].RamIO) + goto err; switch (cmd) { @@ -838,5 +840,16 @@ static long ac_ioctl(struct file *file, unsigned int cmd, unsigned long arg) kfree(adgl); mutex_unlock(&ac_mutex); return 0; + +err: + if (warncount) { + pr_warn("APPLICOM driver IOCTL, bad board number 
%d\n", + (int)IndexCard + 1); + warncount--; + } + kfree(adgl); + mutex_unlock(&ac_mutex); + return -EINVAL; + } diff --git a/drivers/clk/qcom/gcc-sdm845.c b/drivers/clk/qcom/gcc-sdm845.c index fa1a196350f1..3bf11a620094 100644 --- a/drivers/clk/qcom/gcc-sdm845.c +++ b/drivers/clk/qcom/gcc-sdm845.c @@ -131,8 +131,8 @@ static const char * const gcc_parent_names_6[] = { "core_bi_pll_test_se", }; -static const char * const gcc_parent_names_7[] = { - "bi_tcxo", +static const char * const gcc_parent_names_7_ao[] = { + "bi_tcxo_ao", "gpll0", "gpll0_out_even", "core_bi_pll_test_se", @@ -144,6 +144,12 @@ static const char * const gcc_parent_names_8[] = { "core_bi_pll_test_se", }; +static const char * const gcc_parent_names_8_ao[] = { + "bi_tcxo_ao", + "gpll0", + "core_bi_pll_test_se", +}; + static const struct parent_map gcc_parent_map_10[] = { { P_BI_TCXO, 0 }, { P_GPLL0_OUT_MAIN, 1 }, @@ -226,7 +232,7 @@ static struct clk_rcg2 gcc_cpuss_ahb_clk_src = { .freq_tbl = ftbl_gcc_cpuss_ahb_clk_src, .clkr.hw.init = &(struct clk_init_data){ .name = "gcc_cpuss_ahb_clk_src", - .parent_names = gcc_parent_names_7, + .parent_names = gcc_parent_names_7_ao, .num_parents = 4, .ops = &clk_rcg2_ops, }, @@ -245,7 +251,7 @@ static struct clk_rcg2 gcc_cpuss_rbcpr_clk_src = { .freq_tbl = ftbl_gcc_cpuss_rbcpr_clk_src, .clkr.hw.init = &(struct clk_init_data){ .name = "gcc_cpuss_rbcpr_clk_src", - .parent_names = gcc_parent_names_8, + .parent_names = gcc_parent_names_8_ao, .num_parents = 3, .ops = &clk_rcg2_ops, }, diff --git a/drivers/clk/ti/divider.c b/drivers/clk/ti/divider.c index ccfb4d9a152a..079f0beda8b6 100644 --- a/drivers/clk/ti/divider.c +++ b/drivers/clk/ti/divider.c @@ -367,8 +367,10 @@ int ti_clk_parse_divider_data(int *div_table, int num_dividers, int max_div, num_dividers = i; tmp = kcalloc(valid_div + 1, sizeof(*tmp), GFP_KERNEL); - if (!tmp) + if (!tmp) { + *table = ERR_PTR(-ENOMEM); return -ENOMEM; + } valid_div = 0; *width = 0; @@ -403,6 +405,7 @@ struct clk_hw 
*ti_clk_build_component_div(struct ti_clk_divider *setup) { struct clk_omap_divider *div; struct clk_omap_reg *reg; + int ret; if (!setup) return NULL; @@ -422,6 +425,12 @@ struct clk_hw *ti_clk_build_component_div(struct ti_clk_divider *setup) div->flags |= CLK_DIVIDER_POWER_OF_TWO; div->table = _get_div_table_from_setup(setup, &div->width); + if (IS_ERR(div->table)) { + ret = PTR_ERR(div->table); + kfree(div); + return ERR_PTR(ret); + } + div->shift = setup->bit_shift; div->latch = -EINVAL; diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c index ed5e42461094..ad48fd52cb53 100644 --- a/drivers/connector/cn_proc.c +++ b/drivers/connector/cn_proc.c @@ -250,6 +250,7 @@ void proc_coredump_connector(struct task_struct *task) { struct cn_msg *msg; struct proc_event *ev; + struct task_struct *parent; __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8); if (atomic_read(&proc_event_num_listeners) < 1) @@ -262,8 +263,14 @@ void proc_coredump_connector(struct task_struct *task) ev->what = PROC_EVENT_COREDUMP; ev->event_data.coredump.process_pid = task->pid; ev->event_data.coredump.process_tgid = task->tgid; - ev->event_data.coredump.parent_pid = task->real_parent->pid; - ev->event_data.coredump.parent_tgid = task->real_parent->tgid; + + rcu_read_lock(); + if (pid_alive(task)) { + parent = rcu_dereference(task->real_parent); + ev->event_data.coredump.parent_pid = parent->pid; + ev->event_data.coredump.parent_tgid = parent->tgid; + } + rcu_read_unlock(); memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id)); msg->ack = 0; /* not used */ @@ -276,6 +283,7 @@ void proc_exit_connector(struct task_struct *task) { struct cn_msg *msg; struct proc_event *ev; + struct task_struct *parent; __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8); if (atomic_read(&proc_event_num_listeners) < 1) @@ -290,8 +298,14 @@ void proc_exit_connector(struct task_struct *task) ev->event_data.exit.process_tgid = task->tgid; ev->event_data.exit.exit_code = task->exit_code; ev->event_data.exit.exit_signal 
= task->exit_signal; - ev->event_data.exit.parent_pid = task->real_parent->pid; - ev->event_data.exit.parent_tgid = task->real_parent->tgid; + + rcu_read_lock(); + if (pid_alive(task)) { + parent = rcu_dereference(task->real_parent); + ev->event_data.exit.parent_pid = parent->pid; + ev->event_data.exit.parent_tgid = parent->tgid; + } + rcu_read_unlock(); memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id)); msg->ack = 0; /* not used */ diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index c01c683b5e86..52798398d566 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -358,7 +358,7 @@ static void cpufreq_notify_transition(struct cpufreq_policy *policy, } cpufreq_stats_record_transition(policy, freqs->new); - cpufreq_times_record_transition(freqs); + cpufreq_times_record_transition(policy, freqs->new); policy->cur = freqs->new; } } @@ -555,13 +555,13 @@ EXPORT_SYMBOL_GPL(cpufreq_policy_transition_delay_us); * SYSFS INTERFACE * *********************************************************************/ static ssize_t show_boost(struct kobject *kobj, - struct attribute *attr, char *buf) + struct kobj_attribute *attr, char *buf) { return sprintf(buf, "%d\n", cpufreq_driver->boost_enabled); } -static ssize_t store_boost(struct kobject *kobj, struct attribute *attr, - const char *buf, size_t count) +static ssize_t store_boost(struct kobject *kobj, struct kobj_attribute *attr, + const char *buf, size_t count) { int ret, enable; @@ -1869,9 +1869,15 @@ EXPORT_SYMBOL(cpufreq_unregister_notifier); unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy, unsigned int target_freq) { + int ret; + target_freq = clamp_val(target_freq, policy->min, policy->max); - return cpufreq_driver->fast_switch(policy, target_freq); + ret = cpufreq_driver->fast_switch(policy, target_freq); + if (ret) + cpufreq_times_record_transition(policy, ret); + + return ret; } EXPORT_SYMBOL_GPL(cpufreq_driver_fast_switch); diff --git 
a/drivers/cpufreq/cpufreq_times.c b/drivers/cpufreq/cpufreq_times.c index a43eeee30e8e..2883d675b1eb 100644 --- a/drivers/cpufreq/cpufreq_times.c +++ b/drivers/cpufreq/cpufreq_times.c @@ -32,11 +32,17 @@ static DECLARE_HASHTABLE(uid_hash_table, UID_HASH_BITS); static DEFINE_SPINLOCK(task_time_in_state_lock); /* task->time_in_state */ static DEFINE_SPINLOCK(uid_lock); /* uid_hash_table */ +struct concurrent_times { + atomic64_t active[NR_CPUS]; + atomic64_t policy[NR_CPUS]; +}; + struct uid_entry { uid_t uid; unsigned int max_state; struct hlist_node hash; struct rcu_head rcu; + struct concurrent_times *concurrent_times; u64 time_in_state[0]; }; @@ -87,6 +93,7 @@ static struct uid_entry *find_uid_entry_locked(uid_t uid) static struct uid_entry *find_or_register_uid_locked(uid_t uid) { struct uid_entry *uid_entry, *temp; + struct concurrent_times *times; unsigned int max_state = READ_ONCE(next_offset); size_t alloc_size = sizeof(*uid_entry) + max_state * sizeof(uid_entry->time_in_state[0]); @@ -115,9 +122,15 @@ static struct uid_entry *find_or_register_uid_locked(uid_t uid) uid_entry = kzalloc(alloc_size, GFP_ATOMIC); if (!uid_entry) return NULL; + times = kzalloc(sizeof(*times), GFP_ATOMIC); + if (!times) { + kfree(uid_entry); + return NULL; + } uid_entry->uid = uid; uid_entry->max_state = max_state; + uid_entry->concurrent_times = times; hash_add_rcu(uid_hash_table, &uid_entry->hash, uid); @@ -180,10 +193,12 @@ static void *uid_seq_start(struct seq_file *seq, loff_t *pos) static void *uid_seq_next(struct seq_file *seq, void *v, loff_t *pos) { - (*pos)++; + do { + (*pos)++; - if (*pos >= HASH_SIZE(uid_hash_table)) - return NULL; + if (*pos >= HASH_SIZE(uid_hash_table)) + return NULL; + } while (hlist_empty(&uid_hash_table[*pos])); return &uid_hash_table[*pos]; } @@ -207,7 +222,8 @@ static int uid_time_in_state_seq_show(struct seq_file *m, void *v) if (freqs->freq_table[i] == CPUFREQ_ENTRY_INVALID) continue; - seq_printf(m, " %d", freqs->freq_table[i]); + 
seq_put_decimal_ull(m, " ", + freqs->freq_table[i]); } } seq_putc(m, '\n'); @@ -216,13 +232,16 @@ static int uid_time_in_state_seq_show(struct seq_file *m, void *v) rcu_read_lock(); hlist_for_each_entry_rcu(uid_entry, (struct hlist_head *)v, hash) { - if (uid_entry->max_state) - seq_printf(m, "%d:", uid_entry->uid); + if (uid_entry->max_state) { + seq_put_decimal_ull(m, "", uid_entry->uid); + seq_putc(m, ':'); + } for (i = 0; i < uid_entry->max_state; ++i) { + u64 time; if (freq_index_invalid(i)) continue; - seq_printf(m, " %lu", (unsigned long)nsec_to_clock_t( - uid_entry->time_in_state[i])); + time = nsec_to_clock_t(uid_entry->time_in_state[i]); + seq_put_decimal_ull(m, " ", time); } if (uid_entry->max_state) seq_putc(m, '\n'); @@ -232,6 +251,86 @@ static int uid_time_in_state_seq_show(struct seq_file *m, void *v) return 0; } +static int concurrent_time_seq_show(struct seq_file *m, void *v, + atomic64_t *(*get_times)(struct concurrent_times *)) +{ + struct uid_entry *uid_entry; + int i, num_possible_cpus = num_possible_cpus(); + + rcu_read_lock(); + + hlist_for_each_entry_rcu(uid_entry, (struct hlist_head *)v, hash) { + atomic64_t *times = get_times(uid_entry->concurrent_times); + + seq_put_decimal_ull(m, "", (u64)uid_entry->uid); + seq_putc(m, ':'); + + for (i = 0; i < num_possible_cpus; ++i) { + u64 time = nsec_to_clock_t(atomic64_read(&times[i])); + + seq_put_decimal_ull(m, " ", time); + } + seq_putc(m, '\n'); + } + + rcu_read_unlock(); + + return 0; +} + +static inline atomic64_t *get_active_times(struct concurrent_times *times) +{ + return times->active; +} + +static int concurrent_active_time_seq_show(struct seq_file *m, void *v) +{ + if (v == uid_hash_table) { + seq_put_decimal_ull(m, "cpus: ", num_possible_cpus()); + seq_putc(m, '\n'); + } + + return concurrent_time_seq_show(m, v, get_active_times); +} + +static inline atomic64_t *get_policy_times(struct concurrent_times *times) +{ + return times->policy; +} + +static int concurrent_policy_time_seq_show(struct
seq_file *m, void *v) +{ + int i; + struct cpu_freqs *freqs, *last_freqs = NULL; + + if (v == uid_hash_table) { + int cnt = 0; + + for_each_possible_cpu(i) { + freqs = all_freqs[i]; + if (!freqs) + continue; + if (freqs != last_freqs) { + if (last_freqs) { + seq_put_decimal_ull(m, ": ", cnt); + seq_putc(m, ' '); + cnt = 0; + } + seq_put_decimal_ull(m, "policy", i); + + last_freqs = freqs; + } + cnt++; + } + if (last_freqs) { + seq_put_decimal_ull(m, ": ", cnt); + seq_putc(m, '\n'); + } + } + + return concurrent_time_seq_show(m, v, get_policy_times); +} + void cpufreq_task_times_init(struct task_struct *p) { unsigned long flags; @@ -326,11 +425,16 @@ void cpufreq_acct_update_power(struct task_struct *p, u64 cputime) { unsigned long flags; unsigned int state; + unsigned int active_cpu_cnt = 0; + unsigned int policy_cpu_cnt = 0; + unsigned int policy_first_cpu; struct uid_entry *uid_entry; struct cpu_freqs *freqs = all_freqs[task_cpu(p)]; + struct cpufreq_policy *policy; uid_t uid = from_kuid_munged(current_user_ns(), task_uid(p)); + int cpu = 0; - if (!freqs || p->flags & PF_EXITING) + if (!freqs || is_idle_task(p) || p->flags & PF_EXITING) return; state = freqs->offset + READ_ONCE(freqs->last_index); @@ -346,6 +450,42 @@ void cpufreq_acct_update_power(struct task_struct *p, u64 cputime) if (uid_entry && state < uid_entry->max_state) uid_entry->time_in_state[state] += cputime; spin_unlock_irqrestore(&uid_lock, flags); + + rcu_read_lock(); + uid_entry = find_uid_entry_rcu(uid); + if (!uid_entry) { + rcu_read_unlock(); + return; + } + + for_each_possible_cpu(cpu) + if (!idle_cpu(cpu)) + ++active_cpu_cnt; + + atomic64_add(cputime, + &uid_entry->concurrent_times->active[active_cpu_cnt - 1]); + + policy = cpufreq_cpu_get(task_cpu(p)); + if (!policy) { + /* + * This CPU may have just come up and not have a cpufreq policy + * yet. 
+ */ + rcu_read_unlock(); + return; + } + + for_each_cpu(cpu, policy->related_cpus) + if (!idle_cpu(cpu)) + ++policy_cpu_cnt; + + policy_first_cpu = cpumask_first(policy->related_cpus); + cpufreq_cpu_put(policy); + + atomic64_add(cputime, + &uid_entry->concurrent_times->policy[policy_first_cpu + + policy_cpu_cnt - 1]); + rcu_read_unlock(); } void cpufreq_times_create_policy(struct cpufreq_policy *policy) @@ -387,6 +527,14 @@ void cpufreq_times_create_policy(struct cpufreq_policy *policy) all_freqs[cpu] = freqs; } +static void uid_entry_reclaim(struct rcu_head *rcu) +{ + struct uid_entry *uid_entry = container_of(rcu, struct uid_entry, rcu); + + kfree(uid_entry->concurrent_times); + kfree(uid_entry); +} + void cpufreq_task_times_remove_uids(uid_t uid_start, uid_t uid_end) { struct uid_entry *uid_entry; @@ -400,7 +548,7 @@ void cpufreq_task_times_remove_uids(uid_t uid_start, uid_t uid_end) hash, uid_start) { if (uid_start == uid_entry->uid) { hash_del_rcu(&uid_entry->hash); - kfree_rcu(uid_entry, rcu); + call_rcu(&uid_entry->rcu, uid_entry_reclaim); } } } @@ -408,24 +556,17 @@ void cpufreq_task_times_remove_uids(uid_t uid_start, uid_t uid_end) spin_unlock_irqrestore(&uid_lock, flags); } -void cpufreq_times_record_transition(struct cpufreq_freqs *freq) +void cpufreq_times_record_transition(struct cpufreq_policy *policy, + unsigned int new_freq) { int index; - struct cpu_freqs *freqs = all_freqs[freq->cpu]; - struct cpufreq_policy *policy; - + struct cpu_freqs *freqs = all_freqs[policy->cpu]; if (!freqs) return; - policy = cpufreq_cpu_get(freq->cpu); - if (!policy) - return; - - index = cpufreq_frequency_table_get_index(policy, freq->new); + index = cpufreq_frequency_table_get_index(policy, new_freq); if (index >= 0) WRITE_ONCE(freqs->last_index, index); - - cpufreq_cpu_put(policy); } static const struct seq_operations uid_time_in_state_seq_ops = { @@ -453,11 +594,55 @@ static const struct file_operations uid_time_in_state_fops = { .release = seq_release, }; +static 
const struct seq_operations concurrent_active_time_seq_ops = { + .start = uid_seq_start, + .next = uid_seq_next, + .stop = uid_seq_stop, + .show = concurrent_active_time_seq_show, +}; + +static int concurrent_active_time_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &concurrent_active_time_seq_ops); +} + +static const struct file_operations concurrent_active_time_fops = { + .open = concurrent_active_time_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + +static const struct seq_operations concurrent_policy_time_seq_ops = { + .start = uid_seq_start, + .next = uid_seq_next, + .stop = uid_seq_stop, + .show = concurrent_policy_time_seq_show, +}; + +static int concurrent_policy_time_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &concurrent_policy_time_seq_ops); +} + +static const struct file_operations concurrent_policy_time_fops = { + .open = concurrent_policy_time_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + static int __init cpufreq_times_init(void) { proc_create_data("uid_time_in_state", 0444, NULL, &uid_time_in_state_fops, NULL); + proc_create_data("uid_concurrent_active_time", 0444, NULL, + &concurrent_active_time_fops, NULL); + + proc_create_data("uid_concurrent_policy_time", 0444, NULL, + &concurrent_policy_time_fops, NULL); + return 0; } diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c index b6a1aadaff9f..a005711f909e 100644 --- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -833,7 +833,7 @@ static void intel_pstate_update_policies(void) /************************** sysfs begin ************************/ #define show_one(file_name, object) \ static ssize_t show_##file_name \ - (struct kobject *kobj, struct attribute *attr, char *buf) \ + (struct kobject *kobj, struct kobj_attribute *attr, char *buf) \ { \ return sprintf(buf, "%u\n", global.object); \ } @@ -842,7 +842,7 @@ static ssize_t 
intel_pstate_show_status(char *buf); static int intel_pstate_update_status(const char *buf, size_t size); static ssize_t show_status(struct kobject *kobj, - struct attribute *attr, char *buf) + struct kobj_attribute *attr, char *buf) { ssize_t ret; @@ -853,7 +853,7 @@ static ssize_t show_status(struct kobject *kobj, return ret; } -static ssize_t store_status(struct kobject *a, struct attribute *b, +static ssize_t store_status(struct kobject *a, struct kobj_attribute *b, const char *buf, size_t count) { char *p = memchr(buf, '\n', count); @@ -867,7 +867,7 @@ static ssize_t store_status(struct kobject *a, struct attribute *b, } static ssize_t show_turbo_pct(struct kobject *kobj, - struct attribute *attr, char *buf) + struct kobj_attribute *attr, char *buf) { struct cpudata *cpu; int total, no_turbo, turbo_pct; @@ -893,7 +893,7 @@ static ssize_t show_turbo_pct(struct kobject *kobj, } static ssize_t show_num_pstates(struct kobject *kobj, - struct attribute *attr, char *buf) + struct kobj_attribute *attr, char *buf) { struct cpudata *cpu; int total; @@ -914,7 +914,7 @@ static ssize_t show_num_pstates(struct kobject *kobj, } static ssize_t show_no_turbo(struct kobject *kobj, - struct attribute *attr, char *buf) + struct kobj_attribute *attr, char *buf) { ssize_t ret; @@ -936,7 +936,7 @@ static ssize_t show_no_turbo(struct kobject *kobj, return ret; } -static ssize_t store_no_turbo(struct kobject *a, struct attribute *b, +static ssize_t store_no_turbo(struct kobject *a, struct kobj_attribute *b, const char *buf, size_t count) { unsigned int input; @@ -983,7 +983,7 @@ static ssize_t store_no_turbo(struct kobject *a, struct attribute *b, return count; } -static ssize_t store_max_perf_pct(struct kobject *a, struct attribute *b, +static ssize_t store_max_perf_pct(struct kobject *a, struct kobj_attribute *b, const char *buf, size_t count) { unsigned int input; @@ -1013,7 +1013,7 @@ static ssize_t store_max_perf_pct(struct kobject *a, struct attribute *b, return count; } 
-static ssize_t store_min_perf_pct(struct kobject *a, struct attribute *b, +static ssize_t store_min_perf_pct(struct kobject *a, struct kobj_attribute *b, const char *buf, size_t count) { unsigned int input; @@ -1045,12 +1045,13 @@ static ssize_t store_min_perf_pct(struct kobject *a, struct attribute *b, } static ssize_t show_hwp_dynamic_boost(struct kobject *kobj, - struct attribute *attr, char *buf) + struct kobj_attribute *attr, char *buf) { return sprintf(buf, "%u\n", hwp_boost); } -static ssize_t store_hwp_dynamic_boost(struct kobject *a, struct attribute *b, +static ssize_t store_hwp_dynamic_boost(struct kobject *a, + struct kobj_attribute *b, const char *buf, size_t count) { unsigned int input; diff --git a/drivers/dma/at_xdmac.c b/drivers/dma/at_xdmac.c index 4bf72561667c..a75b95fac3bd 100644 --- a/drivers/dma/at_xdmac.c +++ b/drivers/dma/at_xdmac.c @@ -203,6 +203,7 @@ struct at_xdmac_chan { u32 save_cim; u32 save_cnda; u32 save_cndc; + u32 irq_status; unsigned long status; struct tasklet_struct tasklet; struct dma_slave_config sconfig; @@ -1580,8 +1581,8 @@ static void at_xdmac_tasklet(unsigned long data) struct at_xdmac_desc *desc; u32 error_mask; - dev_dbg(chan2dev(&atchan->chan), "%s: status=0x%08lx\n", - __func__, atchan->status); + dev_dbg(chan2dev(&atchan->chan), "%s: status=0x%08x\n", + __func__, atchan->irq_status); error_mask = AT_XDMAC_CIS_RBEIS | AT_XDMAC_CIS_WBEIS @@ -1589,15 +1590,15 @@ static void at_xdmac_tasklet(unsigned long data) if (at_xdmac_chan_is_cyclic(atchan)) { at_xdmac_handle_cyclic(atchan); - } else if ((atchan->status & AT_XDMAC_CIS_LIS) - || (atchan->status & error_mask)) { + } else if ((atchan->irq_status & AT_XDMAC_CIS_LIS) + || (atchan->irq_status & error_mask)) { struct dma_async_tx_descriptor *txd; - if (atchan->status & AT_XDMAC_CIS_RBEIS) + if (atchan->irq_status & AT_XDMAC_CIS_RBEIS) dev_err(chan2dev(&atchan->chan), "read bus error!!!"); - if (atchan->status & AT_XDMAC_CIS_WBEIS) + if (atchan->irq_status & 
AT_XDMAC_CIS_WBEIS) dev_err(chan2dev(&atchan->chan), "write bus error!!!"); - if (atchan->status & AT_XDMAC_CIS_ROIS) + if (atchan->irq_status & AT_XDMAC_CIS_ROIS) dev_err(chan2dev(&atchan->chan), "request overflow error!!!"); spin_lock_bh(&atchan->lock); @@ -1652,7 +1653,7 @@ static irqreturn_t at_xdmac_interrupt(int irq, void *dev_id) atchan = &atxdmac->chan[i]; chan_imr = at_xdmac_chan_read(atchan, AT_XDMAC_CIM); chan_status = at_xdmac_chan_read(atchan, AT_XDMAC_CIS); - atchan->status = chan_status & chan_imr; + atchan->irq_status = chan_status & chan_imr; dev_vdbg(atxdmac->dma.dev, "%s: chan%d: imr=0x%x, status=0x%x\n", __func__, i, chan_imr, chan_status); @@ -1666,7 +1667,7 @@ static irqreturn_t at_xdmac_interrupt(int irq, void *dev_id) at_xdmac_chan_read(atchan, AT_XDMAC_CDA), at_xdmac_chan_read(atchan, AT_XDMAC_CUBC)); - if (atchan->status & (AT_XDMAC_CIS_RBEIS | AT_XDMAC_CIS_WBEIS)) + if (atchan->irq_status & (AT_XDMAC_CIS_RBEIS | AT_XDMAC_CIS_WBEIS)) at_xdmac_write(atxdmac, AT_XDMAC_GD, atchan->mask); tasklet_schedule(&atchan->tasklet); diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c index aa1712beb0cc..7b7fba0c9253 100644 --- a/drivers/dma/dmatest.c +++ b/drivers/dma/dmatest.c @@ -642,11 +642,9 @@ static int dmatest_func(void *data) srcs[i] = um->addr[i] + src_off; ret = dma_mapping_error(dev->dev, um->addr[i]); if (ret) { - dmaengine_unmap_put(um); result("src mapping error", total_tests, src_off, dst_off, len, ret); - failed_tests++; - continue; + goto error_unmap_continue; } um->to_cnt++; } @@ -661,11 +659,9 @@ static int dmatest_func(void *data) DMA_BIDIRECTIONAL); ret = dma_mapping_error(dev->dev, dsts[i]); if (ret) { - dmaengine_unmap_put(um); result("dst mapping error", total_tests, src_off, dst_off, len, ret); - failed_tests++; - continue; + goto error_unmap_continue; } um->bidi_cnt++; } @@ -693,12 +689,10 @@ static int dmatest_func(void *data) } if (!tx) { - dmaengine_unmap_put(um); result("prep error", total_tests, src_off, dst_off, 
len, ret); msleep(100); - failed_tests++; - continue; + goto error_unmap_continue; } done->done = false; @@ -707,12 +701,10 @@ static int dmatest_func(void *data) cookie = tx->tx_submit(tx); if (dma_submit_error(cookie)) { - dmaengine_unmap_put(um); result("submit error", total_tests, src_off, dst_off, len, ret); msleep(100); - failed_tests++; - continue; + goto error_unmap_continue; } dma_async_issue_pending(chan); @@ -725,16 +717,14 @@ static int dmatest_func(void *data) dmaengine_unmap_put(um); result("test timed out", total_tests, src_off, dst_off, len, 0); - failed_tests++; - continue; + goto error_unmap_continue; } else if (status != DMA_COMPLETE) { dmaengine_unmap_put(um); result(status == DMA_ERROR ? "completion error status" : "completion busy status", total_tests, src_off, dst_off, len, ret); - failed_tests++; - continue; + goto error_unmap_continue; } dmaengine_unmap_put(um); @@ -779,6 +769,12 @@ static int dmatest_func(void *data) verbose_result("test passed", total_tests, src_off, dst_off, len, 0); } + + continue; + +error_unmap_continue: + dmaengine_unmap_put(um); + failed_tests++; } ktime = ktime_sub(ktime_get(), ktime); ktime = ktime_sub(ktime, comparetime); diff --git a/drivers/firmware/iscsi_ibft.c b/drivers/firmware/iscsi_ibft.c index 6bc8e6640d71..c51462f5aa1e 100644 --- a/drivers/firmware/iscsi_ibft.c +++ b/drivers/firmware/iscsi_ibft.c @@ -542,6 +542,7 @@ static umode_t __init ibft_check_tgt_for(void *data, int type) case ISCSI_BOOT_TGT_NIC_ASSOC: case ISCSI_BOOT_TGT_CHAP_TYPE: rc = S_IRUGO; + break; case ISCSI_BOOT_TGT_NAME: if (tgt->tgt_name_len) rc = S_IRUGO; diff --git a/drivers/gnss/sirf.c b/drivers/gnss/sirf.c index 2c22836d3ffd..4596fde16dfe 100644 --- a/drivers/gnss/sirf.c +++ b/drivers/gnss/sirf.c @@ -310,30 +310,26 @@ static int sirf_probe(struct serdev_device *serdev) ret = -ENODEV; goto err_put_device; } + + ret = regulator_enable(data->vcc); + if (ret) + goto err_put_device; + + /* Wait for chip to boot into hibernate mode. 
*/ + msleep(SIRF_BOOT_DELAY); } if (data->wakeup) { ret = gpiod_to_irq(data->wakeup); if (ret < 0) - goto err_put_device; - + goto err_disable_vcc; data->irq = ret; - ret = devm_request_threaded_irq(dev, data->irq, NULL, - sirf_wakeup_handler, + ret = request_threaded_irq(data->irq, NULL, sirf_wakeup_handler, IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING | IRQF_ONESHOT, "wakeup", data); if (ret) - goto err_put_device; - } - - if (data->on_off) { - ret = regulator_enable(data->vcc); - if (ret) - goto err_put_device; - - /* Wait for chip to boot into hibernate mode */ - msleep(SIRF_BOOT_DELAY); + goto err_disable_vcc; } if (IS_ENABLED(CONFIG_PM)) { @@ -342,7 +338,7 @@ static int sirf_probe(struct serdev_device *serdev) } else { ret = sirf_runtime_resume(dev); if (ret < 0) - goto err_disable_vcc; + goto err_free_irq; } ret = gnss_register_device(gdev); @@ -356,6 +352,9 @@ static int sirf_probe(struct serdev_device *serdev) pm_runtime_disable(dev); else sirf_runtime_suspend(dev); +err_free_irq: + if (data->wakeup) + free_irq(data->irq, data); err_disable_vcc: if (data->on_off) regulator_disable(data->vcc); @@ -376,6 +375,9 @@ static void sirf_remove(struct serdev_device *serdev) else sirf_runtime_suspend(&serdev->dev); + if (data->wakeup) + free_irq(data->irq, data); + if (data->on_off) regulator_disable(data->vcc); diff --git a/drivers/gpio/gpio-vf610.c b/drivers/gpio/gpio-vf610.c index d4ad6d0e02a2..7e09ce75ffb2 100644 --- a/drivers/gpio/gpio-vf610.c +++ b/drivers/gpio/gpio-vf610.c @@ -259,6 +259,7 @@ static int vf610_gpio_probe(struct platform_device *pdev) struct vf610_gpio_port *port; struct resource *iores; struct gpio_chip *gc; + int i; int ret; port = devm_kzalloc(&pdev->dev, sizeof(*port), GFP_KERNEL); @@ -298,6 +299,10 @@ static int vf610_gpio_probe(struct platform_device *pdev) if (ret < 0) return ret; + /* Mask all GPIO interrupts */ + for (i = 0; i < gc->ngpio; i++) + vf610_gpio_writel(0, port->base + PORT_PCR(i)); + /* Clear the interrupt status register for 
all GPIO's */ vf610_gpio_writel(~0, port->base + PORT_ISFR); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c index 7b4e657a95c7..c3df75a9f65d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c @@ -1443,7 +1443,8 @@ static umode_t hwmon_attributes_visible(struct kobject *kobj, effective_mode &= ~S_IWUSR; if ((adev->flags & AMD_IS_APU) && - (attr == &sensor_dev_attr_power1_cap_max.dev_attr.attr || + (attr == &sensor_dev_attr_power1_average.dev_attr.attr || + attr == &sensor_dev_attr_power1_cap_max.dev_attr.attr || attr == &sensor_dev_attr_power1_cap_min.dev_attr.attr|| attr == &sensor_dev_attr_power1_cap.dev_attr.attr)) return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c index 1c5d97f4b4dd..8dcf6227ab99 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c @@ -37,6 +37,7 @@ #include "amdgpu_display.h" #include #include +#include static const struct dma_buf_ops amdgpu_dmabuf_ops; @@ -188,6 +189,48 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev, return ERR_PTR(ret); } +static int +__reservation_object_make_exclusive(struct reservation_object *obj) +{ + struct dma_fence **fences; + unsigned int count; + int r; + + if (!reservation_object_get_list(obj)) /* no shared fences to convert */ + return 0; + + r = reservation_object_get_fences_rcu(obj, NULL, &count, &fences); + if (r) + return r; + + if (count == 0) { + /* Now that was unexpected. 
*/ + } else if (count == 1) { + reservation_object_add_excl_fence(obj, fences[0]); + dma_fence_put(fences[0]); + kfree(fences); + } else { + struct dma_fence_array *array; + + array = dma_fence_array_create(count, fences, + dma_fence_context_alloc(1), 0, + false); + if (!array) + goto err_fences_put; + + reservation_object_add_excl_fence(obj, &array->base); + dma_fence_put(&array->base); + } + + return 0; + +err_fences_put: + while (count--) + dma_fence_put(fences[count]); + kfree(fences); + return -ENOMEM; +} + /** * amdgpu_gem_map_attach - &dma_buf_ops.attach implementation * @dma_buf: shared DMA buffer @@ -219,16 +262,16 @@ static int amdgpu_gem_map_attach(struct dma_buf *dma_buf, if (attach->dev->driver != adev->dev->driver) { /* - * Wait for all shared fences to complete before we switch to future - * use of exclusive fence on this prime shared bo. + * We only create shared fences for internal use, but importers + * of the dmabuf rely on exclusive fences for implicitly + * tracking write hazards. As any of the current fences may + * correspond to a write, we need to convert all existing + * fences on the reservation object into a single exclusive + * fence. 
*/ - r = reservation_object_wait_timeout_rcu(bo->tbo.resv, - true, false, - MAX_SCHEDULE_TIMEOUT); - if (unlikely(r < 0)) { - DRM_DEBUG_PRIME("Fence wait failed: %li\n", r); + r = __reservation_object_make_exclusive(bo->tbo.resv); + if (r) goto error_unreserve; - } } /* pin buffer into GTT */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 6a84526e20e0..49fe5084c53d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -3011,14 +3011,15 @@ void amdgpu_vm_get_task_info(struct amdgpu_device *adev, unsigned int pasid, struct amdgpu_task_info *task_info) { struct amdgpu_vm *vm; + unsigned long flags; - spin_lock(&adev->vm_manager.pasid_lock); + spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags); vm = idr_find(&adev->vm_manager.pasid_idr, pasid); if (vm) *task_info = vm->task_info; - spin_unlock(&adev->vm_manager.pasid_lock); + spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags); } /** diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index 1d74aed7e471..94f5c3646cb7 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -1573,6 +1573,15 @@ int drm_atomic_helper_async_check(struct drm_device *dev, if (old_plane_state->fb != new_plane_state->fb) return -EINVAL; + /* + * FIXME: Since prepare_fb and cleanup_fb are always called on + * the new_plane_state for async updates we need to block framebuffer + * changes. This prevents use of a fb that's been cleaned up and + * double cleanups from occurring.
+ */ + if (old_plane_state->fb != new_plane_state->fb) + return -EINVAL; + funcs = plane->helper_private; if (!funcs->atomic_async_update) return -EINVAL; diff --git a/drivers/gpu/drm/radeon/ci_dpm.c b/drivers/gpu/drm/radeon/ci_dpm.c index d587779a80b4..a97294ac96d5 100644 --- a/drivers/gpu/drm/radeon/ci_dpm.c +++ b/drivers/gpu/drm/radeon/ci_dpm.c @@ -5676,7 +5676,7 @@ int ci_dpm_init(struct radeon_device *rdev) u16 data_offset, size; u8 frev, crev; struct ci_power_info *pi; - enum pci_bus_speed speed_cap; + enum pci_bus_speed speed_cap = PCI_SPEED_UNKNOWN; struct pci_dev *root = rdev->pdev->bus->self; int ret; @@ -5685,7 +5685,8 @@ int ci_dpm_init(struct radeon_device *rdev) return -ENOMEM; rdev->pm.dpm.priv = pi; - speed_cap = pcie_get_speed_cap(root); + if (!pci_is_root_bus(rdev->pdev->bus)) + speed_cap = pcie_get_speed_cap(root); if (speed_cap == PCI_SPEED_UNKNOWN) { pi->sys_pcie_mask = 0; } else { diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c index 8fb60b3af015..0a785ef0ab66 100644 --- a/drivers/gpu/drm/radeon/si_dpm.c +++ b/drivers/gpu/drm/radeon/si_dpm.c @@ -6899,7 +6899,7 @@ int si_dpm_init(struct radeon_device *rdev) struct ni_power_info *ni_pi; struct si_power_info *si_pi; struct atom_clock_dividers dividers; - enum pci_bus_speed speed_cap; + enum pci_bus_speed speed_cap = PCI_SPEED_UNKNOWN; struct pci_dev *root = rdev->pdev->bus->self; int ret; @@ -6911,7 +6911,8 @@ int si_dpm_init(struct radeon_device *rdev) eg_pi = &ni_pi->eg; pi = &eg_pi->rv7xx; - speed_cap = pcie_get_speed_cap(root); + if (!pci_is_root_bus(rdev->pdev->bus)) + speed_cap = pcie_get_speed_cap(root); if (speed_cap == PCI_SPEED_UNKNOWN) { si_pi->sys_pcie_mask = 0; } else { diff --git a/drivers/gpu/drm/sun4i/sun4i_tcon.c b/drivers/gpu/drm/sun4i/sun4i_tcon.c index 3fb084f802e2..8c31c9ab06f8 100644 --- a/drivers/gpu/drm/sun4i/sun4i_tcon.c +++ b/drivers/gpu/drm/sun4i/sun4i_tcon.c @@ -672,6 +672,7 @@ static int sun4i_tcon_init_clocks(struct device *dev, return 
PTR_ERR(tcon->sclk0); } } + clk_prepare_enable(tcon->sclk0); if (tcon->quirks->has_channel_1) { tcon->sclk1 = devm_clk_get(dev, "tcon-ch1"); @@ -686,6 +687,7 @@ static int sun4i_tcon_init_clocks(struct device *dev, static void sun4i_tcon_free_clocks(struct sun4i_tcon *tcon) { + clk_disable_unprepare(tcon->sclk0); clk_disable_unprepare(tcon->clk); } diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c index 65d06a819307..2ac86096ddd9 100644 --- a/drivers/i2c/busses/i2c-omap.c +++ b/drivers/i2c/busses/i2c-omap.c @@ -1498,8 +1498,7 @@ static int omap_i2c_remove(struct platform_device *pdev) return 0; } -#ifdef CONFIG_PM -static int omap_i2c_runtime_suspend(struct device *dev) +static int __maybe_unused omap_i2c_runtime_suspend(struct device *dev) { struct omap_i2c_dev *omap = dev_get_drvdata(dev); @@ -1525,7 +1524,7 @@ static int omap_i2c_runtime_suspend(struct device *dev) return 0; } -static int omap_i2c_runtime_resume(struct device *dev) +static int __maybe_unused omap_i2c_runtime_resume(struct device *dev) { struct omap_i2c_dev *omap = dev_get_drvdata(dev); @@ -1540,20 +1539,18 @@ static int omap_i2c_runtime_resume(struct device *dev) } static const struct dev_pm_ops omap_i2c_pm_ops = { + SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, + pm_runtime_force_resume) SET_RUNTIME_PM_OPS(omap_i2c_runtime_suspend, omap_i2c_runtime_resume, NULL) }; -#define OMAP_I2C_PM_OPS (&omap_i2c_pm_ops) -#else -#define OMAP_I2C_PM_OPS NULL -#endif /* CONFIG_PM */ static struct platform_driver omap_i2c_driver = { .probe = omap_i2c_probe, .remove = omap_i2c_remove, .driver = { .name = "omap_i2c", - .pm = OMAP_I2C_PM_OPS, + .pm = &omap_i2c_pm_ops, .of_match_table = of_match_ptr(omap_i2c_of_match), }, }; diff --git a/drivers/infiniband/hw/hfi1/ud.c b/drivers/infiniband/hw/hfi1/ud.c index 70d39fc450a1..54eb69564264 100644 --- a/drivers/infiniband/hw/hfi1/ud.c +++ b/drivers/infiniband/hw/hfi1/ud.c @@ -980,7 +980,6 @@ void hfi1_ud_rcv(struct hfi1_packet *packet) 
opcode == IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE) { wc.ex.imm_data = packet->ohdr->u.ud.imm_data; wc.wc_flags = IB_WC_WITH_IMM; - tlen -= sizeof(u32); } else if (opcode == IB_OPCODE_UD_SEND_ONLY) { wc.ex.imm_data = 0; wc.wc_flags = 0; diff --git a/drivers/infiniband/hw/qib/qib_ud.c b/drivers/infiniband/hw/qib/qib_ud.c index f8d029a2390f..bce2b5cd3c7b 100644 --- a/drivers/infiniband/hw/qib/qib_ud.c +++ b/drivers/infiniband/hw/qib/qib_ud.c @@ -513,7 +513,6 @@ void qib_ud_rcv(struct qib_ibport *ibp, struct ib_header *hdr, opcode == IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE) { wc.ex.imm_data = ohdr->u.ud.imm_data; wc.wc_flags = IB_WC_WITH_IMM; - tlen -= sizeof(u32); } else if (opcode == IB_OPCODE_UD_SEND_ONLY) { wc.ex.imm_data = 0; wc.wc_flags = 0; diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index 1abe3c62f106..b22d02c9de90 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -248,7 +248,6 @@ struct ipoib_cm_tx { struct list_head list; struct net_device *dev; struct ipoib_neigh *neigh; - struct ipoib_path *path; struct ipoib_tx_buf *tx_ring; unsigned int tx_head; unsigned int tx_tail; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 0428e01e8f69..aa9dcfc36cd3 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -1312,7 +1312,6 @@ struct ipoib_cm_tx *ipoib_cm_create_tx(struct net_device *dev, struct ipoib_path neigh->cm = tx; tx->neigh = neigh; - tx->path = path; tx->dev = dev; list_add(&tx->list, &priv->cm.start_list); set_bit(IPOIB_FLAG_INITIALIZED, &tx->flags); @@ -1371,7 +1370,7 @@ static void ipoib_cm_tx_start(struct work_struct *work) neigh->daddr + QPN_AND_OPTIONS_OFFSET); goto free_neigh; } - memcpy(&pathrec, &p->path->pathrec, sizeof(pathrec)); + memcpy(&pathrec, &path->pathrec, sizeof(pathrec)); spin_unlock_irqrestore(&priv->lock, flags); netif_tx_unlock_bh(dev); diff --git 
a/drivers/input/mouse/elan_i2c_core.c b/drivers/input/mouse/elan_i2c_core.c index 225ae6980182..628ef617bb2f 100644 --- a/drivers/input/mouse/elan_i2c_core.c +++ b/drivers/input/mouse/elan_i2c_core.c @@ -1337,6 +1337,7 @@ static const struct acpi_device_id elan_acpi_id[] = { { "ELAN0000", 0 }, { "ELAN0100", 0 }, { "ELAN0600", 0 }, + { "ELAN0601", 0 }, { "ELAN0602", 0 }, { "ELAN0605", 0 }, { "ELAN0608", 0 }, diff --git a/drivers/input/tablet/wacom_serial4.c b/drivers/input/tablet/wacom_serial4.c index 38bfaca48eab..150f9eecaca7 100644 --- a/drivers/input/tablet/wacom_serial4.c +++ b/drivers/input/tablet/wacom_serial4.c @@ -187,6 +187,7 @@ enum { MODEL_DIGITIZER_II = 0x5544, /* UD */ MODEL_GRAPHIRE = 0x4554, /* ET */ MODEL_PENPARTNER = 0x4354, /* CT */ + MODEL_ARTPAD_II = 0x4B54, /* KT */ }; static void wacom_handle_model_response(struct wacom *wacom) @@ -245,6 +246,7 @@ static void wacom_handle_model_response(struct wacom *wacom) wacom->flags = F_HAS_STYLUS2 | F_HAS_SCROLLWHEEL; break; + case MODEL_ARTPAD_II: case MODEL_DIGITIZER_II: wacom->dev->name = "Wacom Digitizer II"; wacom->dev->id.version = MODEL_DIGITIZER_II; diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index 34c9aa76a7bd..27500abe8ca7 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -1929,16 +1929,13 @@ static void do_attach(struct iommu_dev_data *dev_data, static void do_detach(struct iommu_dev_data *dev_data) { + struct protection_domain *domain = dev_data->domain; struct amd_iommu *iommu; u16 alias; iommu = amd_iommu_rlookup_table[dev_data->devid]; alias = dev_data->alias; - /* decrease reference counters */ - dev_data->domain->dev_iommu[iommu->index] -= 1; - dev_data->domain->dev_cnt -= 1; - /* Update data structures */ dev_data->domain = NULL; list_del(&dev_data->list); @@ -1948,6 +1945,16 @@ static void do_detach(struct iommu_dev_data *dev_data) /* Flush the DTE entry */ device_flush_dte(dev_data); + + /* Flush IOTLB */ + domain_flush_tlb_pde(domain); + 
+ /* Wait for the flushes to finish */ + domain_flush_complete(domain); + + /* decrease reference counters - needs to happen after the flushes */ + domain->dev_iommu[iommu->index] -= 1; + domain->dev_cnt -= 1; } /* @@ -2555,13 +2562,13 @@ static int map_sg(struct device *dev, struct scatterlist *sglist, bus_addr = address + s->dma_address + (j << PAGE_SHIFT); iommu_unmap_page(domain, bus_addr, PAGE_SIZE); - if (--mapped_pages) + if (--mapped_pages == 0) goto out_free_iova; } } out_free_iova: - free_iova_fast(&dma_dom->iovad, address, npages); + free_iova_fast(&dma_dom->iovad, address >> PAGE_SHIFT, npages); out_err: return 0; diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index 4c2246fe5dbe..15579cba1a88 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -1581,6 +1581,9 @@ static unsigned long *its_lpi_alloc(int nr_irqs, u32 *base, int *nr_ids) nr_irqs /= 2; } while (nr_irqs > 0); + if (!nr_irqs) + err = -ENOSPC; + if (err) goto out; @@ -1951,6 +1954,29 @@ static void its_free_pending_table(struct page *pt) get_order(max_t(u32, LPI_PENDBASE_SZ, SZ_64K))); } +static u64 its_clear_vpend_valid(void __iomem *vlpi_base) +{ + u32 count = 1000000; /* 1s! 
*/ + bool clean; + u64 val; + + val = gits_read_vpendbaser(vlpi_base + GICR_VPENDBASER); + val &= ~GICR_VPENDBASER_Valid; + gits_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER); + + do { + val = gits_read_vpendbaser(vlpi_base + GICR_VPENDBASER); + clean = !(val & GICR_VPENDBASER_Dirty); + if (!clean) { + count--; + cpu_relax(); + udelay(1); + } + } while (!clean && count); + + return val; +} + static void its_cpu_init_lpis(void) { void __iomem *rbase = gic_data_rdist_rd_base(); @@ -2024,6 +2050,30 @@ static void its_cpu_init_lpis(void) val |= GICR_CTLR_ENABLE_LPIS; writel_relaxed(val, rbase + GICR_CTLR); + if (gic_rdists->has_vlpis) { + void __iomem *vlpi_base = gic_data_rdist_vlpi_base(); + + /* + * It's possible for CPU to receive VLPIs before it is + * scheduled as a vPE, especially for the first CPU, and the + * VLPI with INTID larger than 2^(IDbits+1) will be considered + * as out of range and dropped by GIC. + * So we initialize IDbits to known value to avoid VLPI drop. + */ + val = (LPI_NRBITS - 1) & GICR_VPROPBASER_IDBITS_MASK; + pr_debug("GICv4: CPU%d: Init IDbits to 0x%llx for GICR_VPROPBASER\n", + smp_processor_id(), val); + gits_write_vpropbaser(val, vlpi_base + GICR_VPROPBASER); + + /* + * Also clear Valid bit of GICR_VPENDBASER, in case some + * ancient programming gets left in and has possibility of + * corrupting memory. + */ + val = its_clear_vpend_valid(vlpi_base); + WARN_ON(val & GICR_VPENDBASER_Dirty); + } + /* Make sure the GIC has seen the above */ dsb(sy); } @@ -2644,26 +2694,11 @@ static void its_vpe_schedule(struct its_vpe *vpe) static void its_vpe_deschedule(struct its_vpe *vpe) { void __iomem *vlpi_base = gic_data_rdist_vlpi_base(); - u32 count = 1000000; /* 1s!
*/ - bool clean; u64 val; - /* We're being scheduled out */ - val = gits_read_vpendbaser(vlpi_base + GICR_VPENDBASER); - val &= ~GICR_VPENDBASER_Valid; - gits_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER); + val = its_clear_vpend_valid(vlpi_base); - do { - val = gits_read_vpendbaser(vlpi_base + GICR_VPENDBASER); - clean = !(val & GICR_VPENDBASER_Dirty); - if (!clean) { - count--; - cpu_relax(); - udelay(1); - } - } while (!clean && count); - - if (unlikely(!clean && !count)) { + if (unlikely(val & GICR_VPENDBASER_Dirty)) { pr_err_ratelimited("ITS virtual pending table not cleaning\n"); vpe->idai = false; vpe->pending_last = true; diff --git a/drivers/irqchip/irq-mmp.c b/drivers/irqchip/irq-mmp.c index 25f32e1d7764..3496b61a312a 100644 --- a/drivers/irqchip/irq-mmp.c +++ b/drivers/irqchip/irq-mmp.c @@ -34,6 +34,9 @@ #define SEL_INT_PENDING (1 << 6) #define SEL_INT_NUM_MASK 0x3f +#define MMP2_ICU_INT_ROUTE_PJ4_IRQ (1 << 5) +#define MMP2_ICU_INT_ROUTE_PJ4_FIQ (1 << 6) + struct icu_chip_data { int nr_irqs; unsigned int virq_base; @@ -190,7 +193,8 @@ static const struct mmp_intc_conf mmp_conf = { static const struct mmp_intc_conf mmp2_conf = { .conf_enable = 0x20, .conf_disable = 0x0, - .conf_mask = 0x7f, + .conf_mask = MMP2_ICU_INT_ROUTE_PJ4_IRQ | + MMP2_ICU_INT_ROUTE_PJ4_FIQ, }; static void __exception_irq_entry mmp_handle_irq(struct pt_regs *regs) diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 7033a2880771..9df1334608b7 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -4630,7 +4630,6 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, atomic_inc(&r10_bio->remaining); read_bio->bi_next = NULL; generic_make_request(read_bio); - sector_nr += nr_sectors; sectors_done += nr_sectors; if (sector_nr <= last) goto read_more; diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c index 361abbc00486..6f1fd40fce10 100644 --- a/drivers/media/usb/uvc/uvc_driver.c +++ 
b/drivers/media/usb/uvc/uvc_driver.c @@ -1065,11 +1065,19 @@ static int uvc_parse_standard_control(struct uvc_device *dev, return -EINVAL; } - /* Make sure the terminal type MSB is not null, otherwise it - * could be confused with a unit. + /* + * Reject invalid terminal types that would cause issues: + * + * - The high byte must be non-zero, otherwise it would be + * confused with a unit. + * + * - Bit 15 must be 0, as we use it internally as a terminal + * direction flag. + * + * Other unknown types are accepted. */ type = get_unaligned_le16(&buffer[4]); - if ((type & 0xff00) == 0) { + if ((type & 0x7f00) == 0 || (type & 0x8000) != 0) { uvc_trace(UVC_TRACE_DESCR, "device %d videocontrol " "interface %d INPUT_TERMINAL %d has invalid " "type 0x%04x, skipping\n", udev->devnum, diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index a6fcc5c96070..b2c42cae3081 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1171,29 +1171,22 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb) } } - /* Link-local multicast packets should be passed to the - * stack on the link they arrive as well as pass them to the - * bond-master device. These packets are mostly usable when - * stack receives it with the link on which they arrive - * (e.g. LLDP) they also must be available on master. Some of - * the use cases include (but are not limited to): LLDP agents - * that must be able to operate both on enslaved interfaces as - * well as on bonds themselves; linux bridges that must be able - * to process/pass BPDUs from attached bonds when any kind of - * STP version is enabled on the network. + /* + * For packets determined by bond_should_deliver_exact_match() call to + * be suppressed we want to make an exception for link-local packets. + * This is necessary for e.g. LLDP daemons to be able to monitor + * inactive slave links without being forced to bind to them + * explicitly. 
+ * + * At the same time, packets that are passed to the bonding master + * (including link-local ones) can have their originating interface + * determined via PACKET_ORIGDEV socket option. */ - if (is_link_local_ether_addr(eth_hdr(skb)->h_dest)) { - struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC); - - if (nskb) { - nskb->dev = bond->dev; - nskb->queue_mapping = 0; - netif_rx(nskb); - } - return RX_HANDLER_PASS; - } - if (bond_should_deliver_exact_match(skb, slave, bond)) + if (bond_should_deliver_exact_match(skb, slave, bond)) { + if (is_link_local_ether_addr(eth_hdr(skb)->h_dest)) + return RX_HANDLER_PASS; return RX_HANDLER_EXACT; + } skb->dev = bond->dev; diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 9f697a5b8e3d..c078c791f481 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -884,7 +884,7 @@ static uint64_t _mv88e6xxx_get_ethtool_stat(struct mv88e6xxx_chip *chip, default: return U64_MAX; } - value = (((u64)high) << 16) | low; + value = (((u64)high) << 32) | low; return value; } @@ -3070,7 +3070,7 @@ static const struct mv88e6xxx_ops mv88e6161_ops = { .port_disable_pri_override = mv88e6xxx_port_disable_pri_override, .port_link_state = mv88e6352_port_link_state, .port_get_cmode = mv88e6185_port_get_cmode, - .stats_snapshot = mv88e6320_g1_stats_snapshot, + .stats_snapshot = mv88e6xxx_g1_stats_snapshot, .stats_set_histogram = mv88e6095_g1_stats_set_histogram, .stats_get_sset_count = mv88e6095_stats_get_sset_count, .stats_get_strings = mv88e6095_stats_get_strings, @@ -4188,7 +4188,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = { .name = "Marvell 88E6190", .num_databases = 4096, .num_ports = 11, /* 10 + Z80 */ - .num_internal_phys = 11, + .num_internal_phys = 9, .num_gpio = 16, .max_vid = 8191, .port_base_addr = 0x0, @@ -4211,7 +4211,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = { .name = "Marvell 88E6190X", .num_databases = 4096, .num_ports = 11, /* 10 + Z80 
*/ - .num_internal_phys = 11, + .num_internal_phys = 9, .num_gpio = 16, .max_vid = 8191, .port_base_addr = 0x0, @@ -4234,7 +4234,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = { .name = "Marvell 88E6191", .num_databases = 4096, .num_ports = 11, /* 10 + Z80 */ - .num_internal_phys = 11, + .num_internal_phys = 9, .max_vid = 8191, .port_base_addr = 0x0, .phy_base_addr = 0x0, @@ -4281,7 +4281,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = { .name = "Marvell 88E6290", .num_databases = 4096, .num_ports = 11, /* 10 + Z80 */ - .num_internal_phys = 11, + .num_internal_phys = 9, .num_gpio = 16, .max_vid = 8191, .port_base_addr = 0x0, @@ -4443,7 +4443,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = { .name = "Marvell 88E6390", .num_databases = 4096, .num_ports = 11, /* 10 + Z80 */ - .num_internal_phys = 11, + .num_internal_phys = 9, .num_gpio = 16, .max_vid = 8191, .port_base_addr = 0x0, @@ -4466,7 +4466,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = { .name = "Marvell 88E6390X", .num_databases = 4096, .num_ports = 11, /* 10 + Z80 */ - .num_internal_phys = 11, + .num_internal_phys = 9, .num_gpio = 16, .max_vid = 8191, .port_base_addr = 0x0, @@ -4561,6 +4561,14 @@ static int mv88e6xxx_smi_init(struct mv88e6xxx_chip *chip, return 0; } +static void mv88e6xxx_ports_cmode_init(struct mv88e6xxx_chip *chip) +{ + int i; + + for (i = 0; i < mv88e6xxx_num_ports(chip); i++) + chip->ports[i].cmode = MV88E6XXX_PORT_STS_CMODE_INVALID; +} + static enum dsa_tag_protocol mv88e6xxx_get_tag_protocol(struct dsa_switch *ds, int port) { @@ -4597,6 +4605,8 @@ static const char *mv88e6xxx_drv_probe(struct device *dsa_dev, if (err) goto free; + mv88e6xxx_ports_cmode_init(chip); + mutex_lock(&chip->reg_lock); err = mv88e6xxx_switch_reset(chip); mutex_unlock(&chip->reg_lock); diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c index 92945841c8e8..7fffce734f0a 100644 --- a/drivers/net/dsa/mv88e6xxx/port.c +++ 
b/drivers/net/dsa/mv88e6xxx/port.c @@ -190,7 +190,7 @@ int mv88e6xxx_port_set_duplex(struct mv88e6xxx_chip *chip, int port, int dup) /* normal duplex detection */ break; default: - return -EINVAL; + return -EOPNOTSUPP; } err = mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_MAC_CTL, reg); @@ -374,6 +374,10 @@ int mv88e6390x_port_set_cmode(struct mv88e6xxx_chip *chip, int port, cmode = 0; } + /* cmode doesn't change, nothing to do for us */ + if (cmode == chip->ports[port].cmode) + return 0; + lane = mv88e6390x_serdes_get_lane(chip, port); if (lane < 0) return lane; @@ -384,7 +388,7 @@ int mv88e6390x_port_set_cmode(struct mv88e6xxx_chip *chip, int port, return err; } - err = mv88e6390_serdes_power(chip, port, false); + err = mv88e6390x_serdes_power(chip, port, false); if (err) return err; @@ -400,7 +404,7 @@ int mv88e6390x_port_set_cmode(struct mv88e6xxx_chip *chip, int port, if (err) return err; - err = mv88e6390_serdes_power(chip, port, true); + err = mv88e6390x_serdes_power(chip, port, true); if (err) return err; diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h index b31910023bb6..95b59f5eb393 100644 --- a/drivers/net/dsa/mv88e6xxx/port.h +++ b/drivers/net/dsa/mv88e6xxx/port.h @@ -52,6 +52,7 @@ #define MV88E6185_PORT_STS_CMODE_1000BASE_X 0x0005 #define MV88E6185_PORT_STS_CMODE_PHY 0x0006 #define MV88E6185_PORT_STS_CMODE_DISABLED 0x0007 +#define MV88E6XXX_PORT_STS_CMODE_INVALID 0xff /* Offset 0x01: MAC (or PCS or Physical) Control Register */ #define MV88E6XXX_PORT_MAC_CTL 0x01 diff --git a/drivers/net/ethernet/altera/altera_msgdma.c b/drivers/net/ethernet/altera/altera_msgdma.c index 0fb986ba3290..0ae723f75341 100644 --- a/drivers/net/ethernet/altera/altera_msgdma.c +++ b/drivers/net/ethernet/altera/altera_msgdma.c @@ -145,7 +145,8 @@ u32 msgdma_tx_completions(struct altera_tse_private *priv) & 0xffff; if (inuse) { /* Tx FIFO is not empty */ - ready = priv->tx_prod - priv->tx_cons - inuse - 1; + ready = max_t(int, + priv->tx_prod - 
priv->tx_cons - inuse - 1, 0); } else { /* Check for buffered last packet */ status = csrrd32(priv->tx_dma_csr, msgdma_csroffs(status)); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 034f57500f00..1fdaf86bbe8f 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -463,6 +463,12 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev) } length >>= 9; + if (unlikely(length >= ARRAY_SIZE(bnxt_lhint_arr))) { + dev_warn_ratelimited(&pdev->dev, "Dropped oversize %d bytes TX packet.\n", + skb->len); + i = 0; + goto tx_dma_error; + } flags |= bnxt_lhint_arr[length]; txbd->tx_bd_len_flags_type = cpu_to_le32(flags); diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h index 3d45f4c92cf6..9bbaad9f3d63 100644 --- a/drivers/net/ethernet/cadence/macb.h +++ b/drivers/net/ethernet/cadence/macb.h @@ -643,6 +643,7 @@ #define MACB_CAPS_JUMBO 0x00000020 #define MACB_CAPS_GEM_HAS_PTP 0x00000040 #define MACB_CAPS_BD_RD_PREFETCH 0x00000080 +#define MACB_CAPS_NEEDS_RSTONUBR 0x00000100 #define MACB_CAPS_FIFO_MODE 0x10000000 #define MACB_CAPS_GIGABIT_MODE_AVAILABLE 0x20000000 #define MACB_CAPS_SG_DISABLED 0x40000000 @@ -1214,6 +1215,8 @@ struct macb { int rx_bd_rd_prefetch; int tx_bd_rd_prefetch; + + u32 rx_intr_mask; }; #ifdef CONFIG_MACB_USE_HWSTAMP diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index 8f4b2f9a8e07..8abea1c3844f 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -56,8 +56,7 @@ /* level of occupied TX descriptors under which we wake up TX process */ #define MACB_TX_WAKEUP_THRESH(bp) (3 * (bp)->tx_ring_size / 4) -#define MACB_RX_INT_FLAGS (MACB_BIT(RCOMP) | MACB_BIT(RXUBR) \ - | MACB_BIT(ISR_ROVR)) +#define MACB_RX_INT_FLAGS (MACB_BIT(RCOMP) | MACB_BIT(ISR_ROVR)) #define MACB_TX_ERR_FLAGS 
(MACB_BIT(ISR_TUND) \ | MACB_BIT(ISR_RLE) \ | MACB_BIT(TXERR)) @@ -1271,7 +1270,7 @@ static int macb_poll(struct napi_struct *napi, int budget) queue_writel(queue, ISR, MACB_BIT(RCOMP)); napi_reschedule(napi); } else { - queue_writel(queue, IER, MACB_RX_INT_FLAGS); + queue_writel(queue, IER, bp->rx_intr_mask); } } @@ -1289,7 +1288,7 @@ static void macb_hresp_error_task(unsigned long data) u32 ctrl; for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) { - queue_writel(queue, IDR, MACB_RX_INT_FLAGS | + queue_writel(queue, IDR, bp->rx_intr_mask | MACB_TX_INT_FLAGS | MACB_BIT(HRESP)); } @@ -1319,7 +1318,7 @@ static void macb_hresp_error_task(unsigned long data) /* Enable interrupts */ queue_writel(queue, IER, - MACB_RX_INT_FLAGS | + bp->rx_intr_mask | MACB_TX_INT_FLAGS | MACB_BIT(HRESP)); } @@ -1373,14 +1372,14 @@ static irqreturn_t macb_interrupt(int irq, void *dev_id) (unsigned int)(queue - bp->queues), (unsigned long)status); - if (status & MACB_RX_INT_FLAGS) { + if (status & bp->rx_intr_mask) { /* There's no point taking any more interrupts * until we have processed the buffers. The * scheduling call may fail if the poll routine * is already scheduled, so disable interrupts * now. */ - queue_writel(queue, IDR, MACB_RX_INT_FLAGS); + queue_writel(queue, IDR, bp->rx_intr_mask); if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE) queue_writel(queue, ISR, MACB_BIT(RCOMP)); @@ -1413,8 +1412,9 @@ static irqreturn_t macb_interrupt(int irq, void *dev_id) /* There is a hardware issue under heavy load where DMA can * stop, this causes endless "used buffer descriptor read" * interrupts but it can be cleared by re-enabling RX. See - * the at91 manual, section 41.3.1 or the Zynq manual - * section 16.7.4 for details. + * the at91rm9200 manual, section 41.3.1 or the Zynq manual + * section 16.7.4 for details. RXUBR is only enabled for + * these two versions. 
*/ if (status & MACB_BIT(RXUBR)) { ctrl = macb_readl(bp, NCR); @@ -2264,7 +2264,7 @@ static void macb_init_hw(struct macb *bp) /* Enable interrupts */ queue_writel(queue, IER, - MACB_RX_INT_FLAGS | + bp->rx_intr_mask | MACB_TX_INT_FLAGS | MACB_BIT(HRESP)); } @@ -3912,6 +3912,7 @@ static const struct macb_config sama5d4_config = { }; static const struct macb_config emac_config = { + .caps = MACB_CAPS_NEEDS_RSTONUBR, .clk_init = at91ether_clk_init, .init = at91ether_init, }; @@ -3933,7 +3934,8 @@ static const struct macb_config zynqmp_config = { }; static const struct macb_config zynq_config = { - .caps = MACB_CAPS_GIGABIT_MODE_AVAILABLE | MACB_CAPS_NO_GIGABIT_HALF, + .caps = MACB_CAPS_GIGABIT_MODE_AVAILABLE | MACB_CAPS_NO_GIGABIT_HALF | + MACB_CAPS_NEEDS_RSTONUBR, .dma_burst_length = 16, .clk_init = macb_clk_init, .init = macb_init, @@ -4088,6 +4090,10 @@ static int macb_probe(struct platform_device *pdev) macb_dma_desc_get_size(bp); } + bp->rx_intr_mask = MACB_RX_INT_FLAGS; + if (bp->caps & MACB_CAPS_NEEDS_RSTONUBR) + bp->rx_intr_mask |= MACB_BIT(RXUBR); + mac = of_get_mac_address(np); if (mac) { ether_addr_copy(bp->dev->dev_addr, mac); diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c b/drivers/net/ethernet/hisilicon/hns/hns_enet.c index 6242249c9f4c..b043370c2685 100644 --- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c +++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c @@ -2419,6 +2419,8 @@ static int hns_nic_dev_probe(struct platform_device *pdev) out_notify_fail: (void)cancel_work_sync(&priv->service_task); out_read_prop_fail: + /* safe for ACPI FW */ + of_node_put(to_of_node(priv->fwnode)); free_netdev(ndev); return ret; } @@ -2448,6 +2450,9 @@ static int hns_nic_dev_remove(struct platform_device *pdev) set_bit(NIC_STATE_REMOVING, &priv->state); (void)cancel_work_sync(&priv->service_task); + /* safe for ACPI FW */ + of_node_put(to_of_node(priv->fwnode)); + free_netdev(ndev); return 0; } diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c 
b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c index 774beda040a1..e2710ff48fb0 100644 --- a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c +++ b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c @@ -1157,16 +1157,18 @@ static int hns_get_regs_len(struct net_device *net_dev) */ static int hns_nic_nway_reset(struct net_device *netdev) { - int ret = 0; struct phy_device *phy = netdev->phydev; - if (netif_running(netdev)) { - /* if autoneg is disabled, don't restart auto-negotiation */ - if (phy && phy->autoneg == AUTONEG_ENABLE) - ret = genphy_restart_aneg(phy); - } + if (!netif_running(netdev)) + return 0; - return ret; + if (!phy) + return -EOPNOTSUPP; + + if (phy->autoneg != AUTONEG_ENABLE) + return -EINVAL; + + return genphy_restart_aneg(phy); } static u32 diff --git a/drivers/net/ethernet/hisilicon/hns_mdio.c b/drivers/net/ethernet/hisilicon/hns_mdio.c index 017e08452d8c..baf5cc251f32 100644 --- a/drivers/net/ethernet/hisilicon/hns_mdio.c +++ b/drivers/net/ethernet/hisilicon/hns_mdio.c @@ -321,7 +321,7 @@ static int hns_mdio_read(struct mii_bus *bus, int phy_id, int regnum) } hns_mdio_cmd_write(mdio_dev, is_c45, - MDIO_C45_WRITE_ADDR, phy_id, devad); + MDIO_C45_READ, phy_id, devad); } /* Step 5: waitting for MDIO_COMMAND_REG 's mdio_start==0,*/ diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 41fa22c562c1..f81ad0aa8b09 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -424,9 +424,9 @@ static void i40e_get_netdev_stats_struct(struct net_device *netdev, struct rtnl_link_stats64 *stats) { struct i40e_netdev_priv *np = netdev_priv(netdev); - struct i40e_ring *tx_ring, *rx_ring; struct i40e_vsi *vsi = np->vsi; struct rtnl_link_stats64 *vsi_stats = i40e_get_vsi_stats_struct(vsi); + struct i40e_ring *ring; int i; if (test_bit(__I40E_VSI_DOWN, vsi->state)) @@ -440,24 +440,26 @@ static void i40e_get_netdev_stats_struct(struct net_device *netdev, 
u64 bytes, packets; unsigned int start; - tx_ring = READ_ONCE(vsi->tx_rings[i]); - if (!tx_ring) + ring = READ_ONCE(vsi->tx_rings[i]); + if (!ring) continue; - i40e_get_netdev_stats_struct_tx(tx_ring, stats); + i40e_get_netdev_stats_struct_tx(ring, stats); - rx_ring = &tx_ring[1]; + if (i40e_enabled_xdp_vsi(vsi)) { + ring++; + i40e_get_netdev_stats_struct_tx(ring, stats); + } + ring++; do { - start = u64_stats_fetch_begin_irq(&rx_ring->syncp); - packets = rx_ring->stats.packets; - bytes = rx_ring->stats.bytes; - } while (u64_stats_fetch_retry_irq(&rx_ring->syncp, start)); + start = u64_stats_fetch_begin_irq(&ring->syncp); + packets = ring->stats.packets; + bytes = ring->stats.bytes; + } while (u64_stats_fetch_retry_irq(&ring->syncp, start)); stats->rx_packets += packets; stats->rx_bytes += bytes; - if (i40e_enabled_xdp_vsi(vsi)) - i40e_get_netdev_stats_struct_tx(&rx_ring[1], stats); } rcu_read_unlock(); diff --git a/drivers/net/ethernet/marvell/sky2.c b/drivers/net/ethernet/marvell/sky2.c index ae2f35039343..1485f66cf7b0 100644 --- a/drivers/net/ethernet/marvell/sky2.c +++ b/drivers/net/ethernet/marvell/sky2.c @@ -46,6 +46,7 @@ #include #include #include +#include #include @@ -93,7 +94,7 @@ static int copybreak __read_mostly = 128; module_param(copybreak, int, 0); MODULE_PARM_DESC(copybreak, "Receive copy threshold"); -static int disable_msi = 0; +static int disable_msi = -1; module_param(disable_msi, int, 0); MODULE_PARM_DESC(disable_msi, "Disable Message Signaled Interrupt (MSI)"); @@ -4931,6 +4932,24 @@ static const char *sky2_name(u8 chipid, char *buf, int sz) return buf; } +static const struct dmi_system_id msi_blacklist[] = { + { + .ident = "Dell Inspiron 1545", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "Inspiron 1545"), + }, + }, + { + .ident = "Gateway P-79", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Gateway"), + DMI_MATCH(DMI_PRODUCT_NAME, "P-79"), + }, + }, + {} +}; + static int sky2_probe(struct pci_dev 
*pdev, const struct pci_device_id *ent) { struct net_device *dev, *dev1; @@ -5042,6 +5061,9 @@ static int sky2_probe(struct pci_dev *pdev, const struct pci_device_id *ent) goto err_out_free_pci; } + if (disable_msi == -1) + disable_msi = !!dmi_check_system(msi_blacklist); + if (!disable_msi && pci_enable_msi(pdev) == 0) { err = sky2_test_msi(hw); if (err) { diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c index e65bc3c95630..857588e2488d 100644 --- a/drivers/net/ethernet/mellanox/mlx4/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c @@ -2645,6 +2645,8 @@ int mlx4_cmd_use_events(struct mlx4_dev *dev) if (!priv->cmd.context) return -ENOMEM; + if (mlx4_is_mfunc(dev)) + mutex_lock(&priv->cmd.slave_cmd_mutex); down_write(&priv->cmd.switch_sem); for (i = 0; i < priv->cmd.max_cmds; ++i) { priv->cmd.context[i].token = i; @@ -2670,6 +2672,8 @@ int mlx4_cmd_use_events(struct mlx4_dev *dev) down(&priv->cmd.poll_sem); priv->cmd.use_events = 1; up_write(&priv->cmd.switch_sem); + if (mlx4_is_mfunc(dev)) + mutex_unlock(&priv->cmd.slave_cmd_mutex); return err; } @@ -2682,6 +2686,8 @@ void mlx4_cmd_use_polling(struct mlx4_dev *dev) struct mlx4_priv *priv = mlx4_priv(dev); int i; + if (mlx4_is_mfunc(dev)) + mutex_lock(&priv->cmd.slave_cmd_mutex); down_write(&priv->cmd.switch_sem); priv->cmd.use_events = 0; @@ -2689,9 +2695,12 @@ void mlx4_cmd_use_polling(struct mlx4_dev *dev) down(&priv->cmd.event_sem); kfree(priv->cmd.context); + priv->cmd.context = NULL; up(&priv->cmd.poll_sem); up_write(&priv->cmd.switch_sem); + if (mlx4_is_mfunc(dev)) + mutex_unlock(&priv->cmd.slave_cmd_mutex); } struct mlx4_cmd_mailbox *mlx4_alloc_cmd_mailbox(struct mlx4_dev *dev) diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c index 31bd56727022..676428a57662 100644 --- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c +++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c 
@@ -2719,13 +2719,13 @@ static int qp_get_mtt_size(struct mlx4_qp_context *qpc) int total_pages; int total_mem; int page_offset = (be32_to_cpu(qpc->params2) >> 6) & 0x3f; + int tot; sq_size = 1 << (log_sq_size + log_sq_sride + 4); rq_size = (srq|rss|xrc) ? 0 : (1 << (log_rq_size + log_rq_stride + 4)); total_mem = sq_size + rq_size; - total_pages = - roundup_pow_of_two((total_mem + (page_offset << 6)) >> - page_shift); + tot = (total_mem + (page_offset << 6)) >> page_shift; + total_pages = !tot ? 1 : roundup_pow_of_two(tot); return total_pages; } diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c index 42f5bfa33694..208341541087 100644 --- a/drivers/net/ethernet/microchip/lan743x_main.c +++ b/drivers/net/ethernet/microchip/lan743x_main.c @@ -585,8 +585,7 @@ static int lan743x_intr_open(struct lan743x_adapter *adapter) if (adapter->csr.flags & LAN743X_CSR_FLAG_SUPPORTS_INTR_AUTO_SET_CLR) { - flags = LAN743X_VECTOR_FLAG_VECTOR_ENABLE_AUTO_CLEAR | - LAN743X_VECTOR_FLAG_VECTOR_ENABLE_AUTO_SET | + flags = LAN743X_VECTOR_FLAG_VECTOR_ENABLE_AUTO_SET | LAN743X_VECTOR_FLAG_SOURCE_ENABLE_AUTO_SET | LAN743X_VECTOR_FLAG_SOURCE_ENABLE_AUTO_CLEAR | LAN743X_VECTOR_FLAG_SOURCE_STATUS_AUTO_CLEAR; @@ -599,12 +598,6 @@ static int lan743x_intr_open(struct lan743x_adapter *adapter) /* map TX interrupt to vector */ int_vec_map1 |= INT_VEC_MAP1_TX_VEC_(index, vector); lan743x_csr_write(adapter, INT_VEC_MAP1, int_vec_map1); - if (flags & - LAN743X_VECTOR_FLAG_VECTOR_ENABLE_AUTO_CLEAR) { - int_vec_en_auto_clr |= INT_VEC_EN_(vector); - lan743x_csr_write(adapter, INT_VEC_EN_AUTO_CLR, - int_vec_en_auto_clr); - } /* Remove TX interrupt from shared mask */ intr->vector_list[0].int_mask &= ~int_bit; @@ -1403,7 +1396,8 @@ static int lan743x_tx_frame_start(struct lan743x_tx *tx, } static void lan743x_tx_frame_add_lso(struct lan743x_tx *tx, - unsigned int frame_length) + unsigned int frame_length, + int nr_frags) { /* called only from within 
lan743x_tx_xmit_frame. * assuming tx->ring_lock has already been acquired. @@ -1413,6 +1407,10 @@ static void lan743x_tx_frame_add_lso(struct lan743x_tx *tx, /* wrap up previous descriptor */ tx->frame_data0 |= TX_DESC_DATA0_EXT_; + if (nr_frags <= 0) { + tx->frame_data0 |= TX_DESC_DATA0_LS_; + tx->frame_data0 |= TX_DESC_DATA0_IOC_; + } tx_descriptor = &tx->ring_cpu_ptr[tx->frame_tail]; tx_descriptor->data0 = tx->frame_data0; @@ -1517,8 +1515,11 @@ static void lan743x_tx_frame_end(struct lan743x_tx *tx, u32 tx_tail_flags = 0; /* wrap up previous descriptor */ - tx->frame_data0 |= TX_DESC_DATA0_LS_; - tx->frame_data0 |= TX_DESC_DATA0_IOC_; + if ((tx->frame_data0 & TX_DESC_DATA0_DTYPE_MASK_) == + TX_DESC_DATA0_DTYPE_DATA_) { + tx->frame_data0 |= TX_DESC_DATA0_LS_; + tx->frame_data0 |= TX_DESC_DATA0_IOC_; + } tx_descriptor = &tx->ring_cpu_ptr[tx->frame_tail]; buffer_info = &tx->buffer_info[tx->frame_tail]; @@ -1603,7 +1604,7 @@ static netdev_tx_t lan743x_tx_xmit_frame(struct lan743x_tx *tx, } if (gso) - lan743x_tx_frame_add_lso(tx, frame_length); + lan743x_tx_frame_add_lso(tx, frame_length, nr_frags); if (nr_frags <= 0) goto finish; @@ -1897,7 +1898,17 @@ static int lan743x_rx_next_index(struct lan743x_rx *rx, int index) return ((++index) % rx->ring_size); } -static int lan743x_rx_allocate_ring_element(struct lan743x_rx *rx, int index) +static struct sk_buff *lan743x_rx_allocate_skb(struct lan743x_rx *rx) +{ + int length = 0; + + length = (LAN743X_MAX_FRAME_SIZE + ETH_HLEN + 4 + RX_HEAD_PADDING); + return __netdev_alloc_skb(rx->adapter->netdev, + length, GFP_ATOMIC | GFP_DMA); +} + +static int lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index, + struct sk_buff *skb) { struct lan743x_rx_buffer_info *buffer_info; struct lan743x_rx_descriptor *descriptor; @@ -1906,9 +1917,7 @@ static int lan743x_rx_allocate_ring_element(struct lan743x_rx *rx, int index) length = (LAN743X_MAX_FRAME_SIZE + ETH_HLEN + 4 + RX_HEAD_PADDING); descriptor = &rx->ring_cpu_ptr[index]; 
buffer_info = &rx->buffer_info[index]; - buffer_info->skb = __netdev_alloc_skb(rx->adapter->netdev, - length, - GFP_ATOMIC | GFP_DMA); + buffer_info->skb = skb; if (!(buffer_info->skb)) return -ENOMEM; buffer_info->dma_ptr = dma_map_single(&rx->adapter->pdev->dev, @@ -2055,8 +2064,19 @@ static int lan743x_rx_process_packet(struct lan743x_rx *rx) /* packet is available */ if (first_index == last_index) { /* single buffer packet */ + struct sk_buff *new_skb = NULL; int packet_length; + new_skb = lan743x_rx_allocate_skb(rx); + if (!new_skb) { + /* failed to allocate next skb. + * Memory is very low. + * Drop this packet and reuse buffer. + */ + lan743x_rx_reuse_ring_element(rx, first_index); + goto process_extension; + } + buffer_info = &rx->buffer_info[first_index]; skb = buffer_info->skb; descriptor = &rx->ring_cpu_ptr[first_index]; @@ -2076,7 +2096,7 @@ static int lan743x_rx_process_packet(struct lan743x_rx *rx) skb_put(skb, packet_length - 4); skb->protocol = eth_type_trans(skb, rx->adapter->netdev); - lan743x_rx_allocate_ring_element(rx, first_index); + lan743x_rx_init_ring_element(rx, first_index, new_skb); } else { int index = first_index; @@ -2089,26 +2109,23 @@ static int lan743x_rx_process_packet(struct lan743x_rx *rx) if (first_index <= last_index) { while ((index >= first_index) && (index <= last_index)) { - lan743x_rx_release_ring_element(rx, - index); - lan743x_rx_allocate_ring_element(rx, - index); + lan743x_rx_reuse_ring_element(rx, + index); index = lan743x_rx_next_index(rx, index); } } else { while ((index >= first_index) || (index <= last_index)) { - lan743x_rx_release_ring_element(rx, - index); - lan743x_rx_allocate_ring_element(rx, - index); + lan743x_rx_reuse_ring_element(rx, + index); index = lan743x_rx_next_index(rx, index); } } } +process_extension: if (extension_index >= 0) { descriptor = &rx->ring_cpu_ptr[extension_index]; buffer_info = &rx->buffer_info[extension_index]; @@ -2285,7 +2302,9 @@ static int lan743x_rx_ring_init(struct lan743x_rx 
*rx) rx->last_head = 0; for (index = 0; index < rx->ring_size; index++) { - ret = lan743x_rx_allocate_ring_element(rx, index); + struct sk_buff *new_skb = lan743x_rx_allocate_skb(rx); + + ret = lan743x_rx_init_ring_element(rx, index, new_skb); if (ret) goto cleanup; } diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c index 2f69ee9221c6..4dd82a1612aa 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_dev.c +++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c @@ -473,19 +473,19 @@ static void qed_init_qm_pq(struct qed_hwfn *p_hwfn, /* get pq index according to PQ_FLAGS */ static u16 *qed_init_qm_get_idx_from_flags(struct qed_hwfn *p_hwfn, - u32 pq_flags) + unsigned long pq_flags) { struct qed_qm_info *qm_info = &p_hwfn->qm_info; /* Can't have multiple flags set here */ - if (bitmap_weight((unsigned long *)&pq_flags, + if (bitmap_weight(&pq_flags, sizeof(pq_flags) * BITS_PER_BYTE) > 1) { - DP_ERR(p_hwfn, "requested multiple pq flags 0x%x\n", pq_flags); + DP_ERR(p_hwfn, "requested multiple pq flags 0x%lx\n", pq_flags); goto err; } if (!(qed_get_pq_flags(p_hwfn) & pq_flags)) { - DP_ERR(p_hwfn, "pq flag 0x%x is not set\n", pq_flags); + DP_ERR(p_hwfn, "pq flag 0x%lx is not set\n", pq_flags); goto err; } diff --git a/drivers/net/ethernet/qlogic/qed/qed_l2.c b/drivers/net/ethernet/qlogic/qed/qed_l2.c index 67c02ea93906..64ac95ca4df2 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_l2.c +++ b/drivers/net/ethernet/qlogic/qed/qed_l2.c @@ -609,6 +609,10 @@ qed_sp_update_accept_mode(struct qed_hwfn *p_hwfn, (!!(accept_filter & QED_ACCEPT_MCAST_MATCHED) && !!(accept_filter & QED_ACCEPT_MCAST_UNMATCHED))); + SET_FIELD(state, ETH_VPORT_TX_MODE_UCAST_ACCEPT_ALL, + (!!(accept_filter & QED_ACCEPT_UCAST_MATCHED) && + !!(accept_filter & QED_ACCEPT_UCAST_UNMATCHED))); + SET_FIELD(state, ETH_VPORT_TX_MODE_BCAST_ACCEPT_ALL, !!(accept_filter & QED_ACCEPT_BCAST)); @@ -744,6 +748,11 @@ int qed_sp_vport_update(struct qed_hwfn *p_hwfn, return rc; } + if 
(p_params->update_ctl_frame_check) { + p_cmn->ctl_frame_mac_check_en = p_params->mac_chk_en; + p_cmn->ctl_frame_ethtype_check_en = p_params->ethtype_chk_en; + } + /* Update mcast bins for VFs, PF doesn't use this functionality */ qed_sp_update_mcast_bin(p_hwfn, p_ramrod, p_params); @@ -2207,7 +2216,7 @@ static int qed_fill_eth_dev_info(struct qed_dev *cdev, u16 num_queues = 0; /* Since the feature controls only queue-zones, - * make sure we have the contexts [rx, tx, xdp] to + * make sure we have the contexts [rx, xdp, tcs] to * match. */ for_each_hwfn(cdev, i) { @@ -2217,7 +2226,8 @@ static int qed_fill_eth_dev_info(struct qed_dev *cdev, u16 cids; cids = hwfn->pf_params.eth_pf_params.num_cons; - num_queues += min_t(u16, l2_queues, cids / 3); + cids /= (2 + info->num_tc); + num_queues += min_t(u16, l2_queues, cids); } /* queues might theoretically be >256, but interrupts' @@ -2688,7 +2698,8 @@ static int qed_configure_filter_rx_mode(struct qed_dev *cdev, if (type == QED_FILTER_RX_MODE_TYPE_PROMISC) { accept_flags.rx_accept_filter |= QED_ACCEPT_UCAST_UNMATCHED | QED_ACCEPT_MCAST_UNMATCHED; - accept_flags.tx_accept_filter |= QED_ACCEPT_MCAST_UNMATCHED; + accept_flags.tx_accept_filter |= QED_ACCEPT_UCAST_UNMATCHED | + QED_ACCEPT_MCAST_UNMATCHED; } else if (type == QED_FILTER_RX_MODE_TYPE_MULTI_PROMISC) { accept_flags.rx_accept_filter |= QED_ACCEPT_MCAST_UNMATCHED; accept_flags.tx_accept_filter |= QED_ACCEPT_MCAST_UNMATCHED; diff --git a/drivers/net/ethernet/qlogic/qed/qed_l2.h b/drivers/net/ethernet/qlogic/qed/qed_l2.h index 8d80f1095d17..7127d5aaac42 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_l2.h +++ b/drivers/net/ethernet/qlogic/qed/qed_l2.h @@ -219,6 +219,9 @@ struct qed_sp_vport_update_params { struct qed_rss_params *rss_params; struct qed_filter_accept_flags accept_flags; struct qed_sge_tpa_params *sge_tpa_params; + u8 update_ctl_frame_check; + u8 mac_chk_en; + u8 ethtype_chk_en; }; int qed_sp_vport_update(struct qed_hwfn *p_hwfn, diff --git 
a/drivers/net/ethernet/qlogic/qed/qed_ll2.c b/drivers/net/ethernet/qlogic/qed/qed_ll2.c index 92cd8abeb41d..015de1e0addd 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_ll2.c +++ b/drivers/net/ethernet/qlogic/qed/qed_ll2.c @@ -2430,19 +2430,24 @@ static int qed_ll2_start_xmit(struct qed_dev *cdev, struct sk_buff *skb, { struct qed_ll2_tx_pkt_info pkt; const skb_frag_t *frag; + u8 flags = 0, nr_frags; int rc = -EINVAL, i; dma_addr_t mapping; u16 vlan = 0; - u8 flags = 0; if (unlikely(skb->ip_summed != CHECKSUM_NONE)) { DP_INFO(cdev, "Cannot transmit a checksummed packet\n"); return -EINVAL; } - if (1 + skb_shinfo(skb)->nr_frags > CORE_LL2_TX_MAX_BDS_PER_PACKET) { + /* Cache number of fragments from SKB since SKB may be freed by + * the completion routine after calling qed_ll2_prepare_tx_packet() + */ + nr_frags = skb_shinfo(skb)->nr_frags; + + if (1 + nr_frags > CORE_LL2_TX_MAX_BDS_PER_PACKET) { DP_ERR(cdev, "Cannot transmit a packet with %d fragments\n", - 1 + skb_shinfo(skb)->nr_frags); + 1 + nr_frags); return -EINVAL; } @@ -2464,7 +2469,7 @@ static int qed_ll2_start_xmit(struct qed_dev *cdev, struct sk_buff *skb, } memset(&pkt, 0, sizeof(pkt)); - pkt.num_of_bds = 1 + skb_shinfo(skb)->nr_frags; + pkt.num_of_bds = 1 + nr_frags; pkt.vlan = vlan; pkt.bd_flags = flags; pkt.tx_dest = QED_LL2_TX_DEST_NW; @@ -2475,12 +2480,17 @@ static int qed_ll2_start_xmit(struct qed_dev *cdev, struct sk_buff *skb, test_bit(QED_LL2_XMIT_FLAGS_FIP_DISCOVERY, &xmit_flags)) pkt.remove_stag = true; + /* qed_ll2_prepare_tx_packet() may actually send the packet if + * there are no fragments in the skb and subsequently the completion + * routine may run and free the SKB, so no dereferencing the SKB + * beyond this point unless skb has any fragments. 
+ */ rc = qed_ll2_prepare_tx_packet(&cdev->hwfns[0], cdev->ll2->handle, &pkt, 1); if (rc) goto err; - for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { + for (i = 0; i < nr_frags; i++) { frag = &skb_shinfo(skb)->frags[i]; mapping = skb_frag_dma_map(&cdev->pdev->dev, frag, 0, diff --git a/drivers/net/ethernet/qlogic/qed/qed_sp.h b/drivers/net/ethernet/qlogic/qed/qed_sp.h index 3157c0d99441..dae2896e1d8e 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_sp.h +++ b/drivers/net/ethernet/qlogic/qed/qed_sp.h @@ -380,6 +380,7 @@ void qed_consq_setup(struct qed_hwfn *p_hwfn); * @param p_hwfn */ void qed_consq_free(struct qed_hwfn *p_hwfn); +int qed_spq_pend_post(struct qed_hwfn *p_hwfn); /** * @file diff --git a/drivers/net/ethernet/qlogic/qed/qed_spq.c b/drivers/net/ethernet/qlogic/qed/qed_spq.c index 7106ad17afe2..a0ee847f379b 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_spq.c +++ b/drivers/net/ethernet/qlogic/qed/qed_spq.c @@ -402,6 +402,11 @@ int qed_eq_completion(struct qed_hwfn *p_hwfn, void *cookie) qed_eq_prod_update(p_hwfn, qed_chain_get_prod_idx(p_chain)); + /* Attempt to post pending requests */ + spin_lock_bh(&p_hwfn->p_spq->lock); + rc = qed_spq_pend_post(p_hwfn); + spin_unlock_bh(&p_hwfn->p_spq->lock); + return rc; } @@ -745,7 +750,7 @@ static int qed_spq_post_list(struct qed_hwfn *p_hwfn, return 0; } -static int qed_spq_pend_post(struct qed_hwfn *p_hwfn) +int qed_spq_pend_post(struct qed_hwfn *p_hwfn) { struct qed_spq *p_spq = p_hwfn->p_spq; struct qed_spq_entry *p_ent = NULL; @@ -883,7 +888,6 @@ int qed_spq_completion(struct qed_hwfn *p_hwfn, struct qed_spq_entry *p_ent = NULL; struct qed_spq_entry *tmp; struct qed_spq_entry *found = NULL; - int rc; if (!p_hwfn) return -EINVAL; @@ -941,12 +945,7 @@ int qed_spq_completion(struct qed_hwfn *p_hwfn, */ qed_spq_return_entry(p_hwfn, found); - /* Attempt to post pending requests */ - spin_lock_bh(&p_spq->lock); - rc = qed_spq_pend_post(p_hwfn); - spin_unlock_bh(&p_spq->lock); - - return rc; + return 0; } int 
qed_consq_alloc(struct qed_hwfn *p_hwfn) diff --git a/drivers/net/ethernet/qlogic/qed/qed_sriov.c b/drivers/net/ethernet/qlogic/qed/qed_sriov.c index ca6290fa0f30..71a7af134dd8 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_sriov.c +++ b/drivers/net/ethernet/qlogic/qed/qed_sriov.c @@ -1969,7 +1969,9 @@ static void qed_iov_vf_mbx_start_vport(struct qed_hwfn *p_hwfn, params.vport_id = vf->vport_id; params.max_buffers_per_cqe = start->max_buffers_per_cqe; params.mtu = vf->mtu; - params.check_mac = true; + + /* Non trusted VFs should enable control frame filtering */ + params.check_mac = !vf->p_vf_info.is_trusted_configured; rc = qed_sp_eth_vport_start(p_hwfn, &params); if (rc) { @@ -5130,6 +5132,9 @@ static void qed_iov_handle_trust_change(struct qed_hwfn *hwfn) params.opaque_fid = vf->opaque_fid; params.vport_id = vf->vport_id; + params.update_ctl_frame_check = 1; + params.mac_chk_en = !vf_info->is_trusted_configured; + if (vf_info->rx_accept_mode & mask) { flags->update_rx_mode_config = 1; flags->rx_accept_filter = vf_info->rx_accept_mode; @@ -5147,7 +5152,8 @@ static void qed_iov_handle_trust_change(struct qed_hwfn *hwfn) } if (flags->update_rx_mode_config || - flags->update_tx_mode_config) + flags->update_tx_mode_config || + params.update_ctl_frame_check) qed_sp_vport_update(hwfn, &params, QED_SPQ_MODE_EBLOCK, NULL); } diff --git a/drivers/net/ethernet/qlogic/qed/qed_vf.c b/drivers/net/ethernet/qlogic/qed/qed_vf.c index be118d057b92..6ab3fb008139 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_vf.c +++ b/drivers/net/ethernet/qlogic/qed/qed_vf.c @@ -261,6 +261,7 @@ static int qed_vf_pf_acquire(struct qed_hwfn *p_hwfn) struct pfvf_acquire_resp_tlv *resp = &p_iov->pf2vf_reply->acquire_resp; struct pf_vf_pfdev_info *pfdev_info = &resp->pfdev_info; struct vf_pf_resc_request *p_resc; + u8 retry_cnt = VF_ACQUIRE_THRESH; bool resources_acquired = false; struct vfpf_acquire_tlv *req; int rc = 0, attempts = 0; @@ -314,6 +315,15 @@ static int qed_vf_pf_acquire(struct qed_hwfn 
*p_hwfn) /* send acquire request */ rc = qed_send_msg2pf(p_hwfn, &resp->hdr.status, sizeof(*resp)); + + /* Re-try acquire in case of vf-pf hw channel timeout */ + if (retry_cnt && rc == -EBUSY) { + DP_VERBOSE(p_hwfn, QED_MSG_IOV, + "VF retrying to acquire due to VPC timeout\n"); + retry_cnt--; + continue; + } + if (rc) goto exit; diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h index 6a4d266fb8e2..d242a5724069 100644 --- a/drivers/net/ethernet/qlogic/qede/qede.h +++ b/drivers/net/ethernet/qlogic/qede/qede.h @@ -489,6 +489,9 @@ struct qede_reload_args { /* Datapath functions definition */ netdev_tx_t qede_start_xmit(struct sk_buff *skb, struct net_device *ndev); +u16 qede_select_queue(struct net_device *dev, struct sk_buff *skb, + struct net_device *sb_dev, + select_queue_fallback_t fallback); netdev_features_t qede_features_check(struct sk_buff *skb, struct net_device *dev, netdev_features_t features); diff --git a/drivers/net/ethernet/qlogic/qede/qede_fp.c b/drivers/net/ethernet/qlogic/qede/qede_fp.c index 1a78027de071..a96da16f3404 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_fp.c +++ b/drivers/net/ethernet/qlogic/qede/qede_fp.c @@ -1695,6 +1695,19 @@ netdev_tx_t qede_start_xmit(struct sk_buff *skb, struct net_device *ndev) return NETDEV_TX_OK; } +u16 qede_select_queue(struct net_device *dev, struct sk_buff *skb, + struct net_device *sb_dev, + select_queue_fallback_t fallback) +{ + struct qede_dev *edev = netdev_priv(dev); + int total_txq; + + total_txq = QEDE_TSS_COUNT(edev) * edev->dev_info.num_tc; + + return QEDE_TSS_COUNT(edev) ? 
+ fallback(dev, skb, NULL) % total_txq : 0; +} + /* 8B udp header + 8B base tunnel header + 32B option length */ #define QEDE_MAX_TUN_HDR_LEN 48 diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c index 46d0f2eaa0c0..f3d9c40c4115 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_main.c +++ b/drivers/net/ethernet/qlogic/qede/qede_main.c @@ -631,6 +631,7 @@ static const struct net_device_ops qede_netdev_ops = { .ndo_open = qede_open, .ndo_stop = qede_close, .ndo_start_xmit = qede_start_xmit, + .ndo_select_queue = qede_select_queue, .ndo_set_rx_mode = qede_set_rx_mode, .ndo_set_mac_address = qede_set_mac_addr, .ndo_validate_addr = eth_validate_addr, @@ -666,6 +667,7 @@ static const struct net_device_ops qede_netdev_vf_ops = { .ndo_open = qede_open, .ndo_stop = qede_close, .ndo_start_xmit = qede_start_xmit, + .ndo_select_queue = qede_select_queue, .ndo_set_rx_mode = qede_set_rx_mode, .ndo_set_mac_address = qede_set_mac_addr, .ndo_validate_addr = eth_validate_addr, @@ -684,6 +686,7 @@ static const struct net_device_ops qede_netdev_vf_xdp_ops = { .ndo_open = qede_open, .ndo_stop = qede_close, .ndo_start_xmit = qede_start_xmit, + .ndo_select_queue = qede_select_queue, .ndo_set_rx_mode = qede_set_rx_mode, .ndo_set_mac_address = qede_set_mac_addr, .ndo_validate_addr = eth_validate_addr, diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c index 8441c86d9f3b..5f092bbd0514 100644 --- a/drivers/net/ethernet/renesas/ravb_main.c +++ b/drivers/net/ethernet/renesas/ravb_main.c @@ -459,7 +459,7 @@ static int ravb_dmac_init(struct net_device *ndev) RCR_EFFS | RCR_ENCF | RCR_ETS0 | RCR_ESF | 0x18000000, RCR); /* Set FIFO size */ - ravb_write(ndev, TGC_TQP_AVBMODE1 | 0x00222200, TGC); + ravb_write(ndev, TGC_TQP_AVBMODE1 | 0x00112200, TGC); /* Timestamp enable */ ravb_write(ndev, TCCR_TFEN, TCCR); diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c 
b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c index 7b923362ee55..3b174eae77c1 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c @@ -1342,8 +1342,10 @@ static int rk_gmac_powerup(struct rk_priv_data *bsp_priv) } ret = phy_power_on(bsp_priv, true); - if (ret) + if (ret) { + gmac_clk_enable(bsp_priv, false); return ret; + } pm_runtime_enable(dev); pm_runtime_get_sync(dev); diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c index 9caf79ba5ef1..4d5fb4b51cc4 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c @@ -719,8 +719,11 @@ static u32 stmmac_usec2riwt(u32 usec, struct stmmac_priv *priv) { unsigned long clk = clk_get_rate(priv->plat->stmmac_clk); - if (!clk) - return 0; + if (!clk) { + clk = priv->plat->clk_ref_rate; + if (!clk) + return 0; + } return (usec * (clk / 1000000)) / 256; } @@ -729,8 +732,11 @@ static u32 stmmac_riwt2usec(u32 riwt, struct stmmac_priv *priv) { unsigned long clk = clk_get_rate(priv->plat->stmmac_clk); - if (!clk) - return 0; + if (!clk) { + clk = priv->plat->clk_ref_rate; + if (!clk) + return 0; + } return (riwt * 256) / (clk / 1000000); } diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 123b74e25ed8..43ab9e905bed 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -3028,10 +3028,22 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev) tx_q = &priv->tx_queue[queue]; + if (priv->tx_path_in_lpi_mode) + stmmac_disable_eee_mode(priv); + /* Manage oversized TCP frames for GMAC4 device */ if (skb_is_gso(skb) && priv->tso) { - if (skb_shinfo(skb)->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)) + if (skb_shinfo(skb)->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)) { + /* + * There is 
no way to determine the number of TSO + * capable Queues. Let's use always the Queue 0 + * because if TSO is supported then at least this + * one will be capable. + */ + skb_set_queue_mapping(skb, 0); + return stmmac_tso_xmit(skb, dev); + } } if (unlikely(stmmac_tx_avail(priv, queue) < nfrags + 1)) { @@ -3046,9 +3058,6 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev) return NETDEV_TX_BUSY; } - if (priv->tx_path_in_lpi_mode) - stmmac_disable_eee_mode(priv); - entry = tx_q->cur_tx; first_entry = entry; WARN_ON(tx_q->tx_skbuff[first_entry]); diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index 01711e6e9a39..e1427b56a073 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -636,15 +636,20 @@ static int geneve_sock_add(struct geneve_dev *geneve, bool ipv6) static int geneve_open(struct net_device *dev) { struct geneve_dev *geneve = netdev_priv(dev); - bool ipv6 = !!(geneve->info.mode & IP_TUNNEL_INFO_IPV6); bool metadata = geneve->collect_md; + bool ipv4, ipv6; int ret = 0; + ipv6 = geneve->info.mode & IP_TUNNEL_INFO_IPV6 || metadata; + ipv4 = !ipv6 || metadata; #if IS_ENABLED(CONFIG_IPV6) - if (ipv6 || metadata) + if (ipv6) { ret = geneve_sock_add(geneve, true); + if (ret < 0 && ret != -EAFNOSUPPORT) + ipv4 = false; + } #endif - if (!ret && (!ipv6 || metadata)) + if (ipv4) ret = geneve_sock_add(geneve, false); if (ret < 0) geneve_sock_release(geneve); diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index c9e2a986ccb7..c8320405c8f1 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -743,6 +743,14 @@ void netvsc_linkstatus_callback(struct net_device *net, schedule_delayed_work(&ndev_ctx->dwork, 0); } +static void netvsc_comp_ipcsum(struct sk_buff *skb) +{ + struct iphdr *iph = (struct iphdr *)skb->data; + + iph->check = 0; + iph->check = ip_fast_csum(iph, iph->ihl); +} + static struct sk_buff *netvsc_alloc_recv_skb(struct net_device *net, struct 
napi_struct *napi, const struct ndis_tcp_ip_checksum_info *csum_info, @@ -766,9 +774,17 @@ static struct sk_buff *netvsc_alloc_recv_skb(struct net_device *net, /* skb is already created with CHECKSUM_NONE */ skb_checksum_none_assert(skb); - /* - * In Linux, the IP checksum is always checked. - * Do L4 checksum offload if enabled and present. + /* Incoming packets may have IP header checksum verified by the host. + * They may not have IP header checksum computed after coalescing. + * We compute it here if the flags are set, because on Linux, the IP + * checksum is always checked. + */ + if (csum_info && csum_info->receive.ip_checksum_value_invalid && + csum_info->receive.ip_checksum_succeeded && + skb->protocol == htons(ETH_P_IP)) + netvsc_comp_ipcsum(skb); + + /* Do L4 checksum offload if enabled and present. */ if (csum_info && (net->features & NETIF_F_RXCSUM)) { if (csum_info->receive.tcp_checksum_succeeded || diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c index 5fb541897863..68b8007da82b 100644 --- a/drivers/net/ipvlan/ipvlan_main.c +++ b/drivers/net/ipvlan/ipvlan_main.c @@ -494,6 +494,8 @@ static int ipvlan_nl_changelink(struct net_device *dev, if (!data) return 0; + if (!ns_capable(dev_net(ipvlan->phy_dev)->user_ns, CAP_NET_ADMIN)) + return -EPERM; if (data[IFLA_IPVLAN_MODE]) { u16 nmode = nla_get_u16(data[IFLA_IPVLAN_MODE]); @@ -596,6 +598,8 @@ int ipvlan_link_new(struct net *src_net, struct net_device *dev, struct ipvl_dev *tmp = netdev_priv(phy_dev); phy_dev = tmp->phy_dev; + if (!ns_capable(dev_net(phy_dev)->user_ns, CAP_NET_ADMIN)) + return -EPERM; } else if (!netif_is_ipvlan_port(phy_dev)) { /* Exit early if the underlying link is invalid or busy */ if (phy_dev->type != ARPHRD_ETHER || diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c index 15c5586d74ff..c5588d4508f9 100644 --- a/drivers/net/phy/mdio_bus.c +++ b/drivers/net/phy/mdio_bus.c @@ -380,7 +380,6 @@ int __mdiobus_register(struct mii_bus *bus, 
struct module *owner) err = device_register(&bus->dev); if (err) { pr_err("mii_bus %s failed to register\n", bus->id); - put_device(&bus->dev); return -EINVAL; } diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c index 3db06b40580d..05a6ae32ff65 100644 --- a/drivers/net/phy/micrel.c +++ b/drivers/net/phy/micrel.c @@ -339,6 +339,17 @@ static int ksz8041_config_aneg(struct phy_device *phydev) return genphy_config_aneg(phydev); } +static int ksz8061_config_init(struct phy_device *phydev) +{ + int ret; + + ret = phy_write_mmd(phydev, MDIO_MMD_PMAPMD, MDIO_DEVID1, 0xB61A); + if (ret) + return ret; + + return kszphy_config_init(phydev); +} + static int ksz9021_load_values_from_of(struct phy_device *phydev, const struct device_node *of_node, u16 reg, @@ -934,7 +945,7 @@ static struct phy_driver ksphy_driver[] = { .phy_id_mask = MICREL_PHY_ID_MASK, .features = PHY_BASIC_FEATURES, .flags = PHY_HAS_INTERRUPT, - .config_init = kszphy_config_init, + .config_init = ksz8061_config_init, .ack_interrupt = kszphy_ack_interrupt, .config_intr = kszphy_config_intr, .suspend = genphy_suspend, diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c index 2787e8b1d668..f6e70f2dfd12 100644 --- a/drivers/net/phy/phylink.c +++ b/drivers/net/phy/phylink.c @@ -348,6 +348,10 @@ static int phylink_get_mac_state(struct phylink *pl, struct phylink_link_state * linkmode_zero(state->lp_advertising); state->interface = pl->link_config.interface; state->an_enabled = pl->link_config.an_enabled; + state->speed = SPEED_UNKNOWN; + state->duplex = DUPLEX_UNKNOWN; + state->pause = MLO_PAUSE_NONE; + state->an_complete = 0; state->link = 1; return pl->ops->mac_link_state(ndev, state); diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c index 8f09edd811e9..50c60550f295 100644 --- a/drivers/net/ppp/pptp.c +++ b/drivers/net/ppp/pptp.c @@ -532,6 +532,7 @@ static void pptp_sock_destruct(struct sock *sk) pppox_unbind_sock(sk); } skb_queue_purge(&sk->sk_receive_queue); + 
dst_release(rcu_dereference_protected(sk->sk_dst_cache, 1)); } static int pptp_create(struct net *net, struct socket *sock, int kern) diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c index 723814d84b7d..95ee9d815d76 100644 --- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -1259,7 +1259,7 @@ static int team_port_add(struct team *team, struct net_device *port_dev, list_add_tail_rcu(&port->list, &team->port_list); team_port_enable(team, port); __team_compute_features(team); - __team_port_change_port_added(port, !!netif_carrier_ok(port_dev)); + __team_port_change_port_added(port, !!netif_oper_up(port_dev)); __team_options_change_check(team); netdev_info(dev, "Port device %s added\n", portname); @@ -2918,7 +2918,7 @@ static int team_device_event(struct notifier_block *unused, switch (event) { case NETDEV_UP: - if (netif_carrier_ok(dev)) + if (netif_oper_up(dev)) team_port_change_check(port, true); break; case NETDEV_DOWN: diff --git a/drivers/net/team/team_mode_loadbalance.c b/drivers/net/team/team_mode_loadbalance.c index a5ef97010eb3..5541e1c19936 100644 --- a/drivers/net/team/team_mode_loadbalance.c +++ b/drivers/net/team/team_mode_loadbalance.c @@ -325,6 +325,20 @@ static int lb_bpf_func_set(struct team *team, struct team_gsetter_ctx *ctx) return 0; } +static void lb_bpf_func_free(struct team *team) +{ + struct lb_priv *lb_priv = get_lb_priv(team); + struct bpf_prog *fp; + + if (!lb_priv->ex->orig_fprog) + return; + + __fprog_destroy(lb_priv->ex->orig_fprog); + fp = rcu_dereference_protected(lb_priv->fp, + lockdep_is_held(&team->lock)); + bpf_prog_destroy(fp); +} + static int lb_tx_method_get(struct team *team, struct team_gsetter_ctx *ctx) { struct lb_priv *lb_priv = get_lb_priv(team); @@ -639,6 +653,7 @@ static void lb_exit(struct team *team) team_options_unregister(team, lb_options, ARRAY_SIZE(lb_options)); + lb_bpf_func_free(team); cancel_delayed_work_sync(&lb_priv->ex->stats.refresh_dw); free_percpu(lb_priv->pcpu_stats); 
kfree(lb_priv->ex); diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 0baade235c83..ee4f901864bb 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -2126,9 +2126,9 @@ static void *tun_ring_recv(struct tun_file *tfile, int noblock, int *err) } add_wait_queue(&tfile->wq.wait, &wait); - current->state = TASK_INTERRUPTIBLE; while (1) { + set_current_state(TASK_INTERRUPTIBLE); ptr = ptr_ring_consume(&tfile->tx_ring); if (ptr) break; @@ -2144,7 +2144,7 @@ static void *tun_ring_recv(struct tun_file *tfile, int noblock, int *err) schedule(); } - current->state = TASK_RUNNING; + __set_current_state(TASK_RUNNING); remove_wait_queue(&tfile->wq.wait, &wait); out: diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index 735ad838e2ba..6e381354f658 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -976,6 +976,13 @@ static const struct usb_device_id products[] = { 0xff), .driver_info = (unsigned long)&qmi_wwan_info_quirk_dtr, }, + { /* Quectel EG12/EM12 */ + USB_DEVICE_AND_INTERFACE_INFO(0x2c7c, 0x0512, + USB_CLASS_VENDOR_SPEC, + USB_SUBCLASS_VENDOR_SPEC, + 0xff), + .driver_info = (unsigned long)&qmi_wwan_info_quirk_dtr, + }, /* 3. 
Combined interface devices matching on interface number */ {QMI_FIXED_INTF(0x0408, 0xea42, 4)}, /* Yota / Megafon M100-1 */ @@ -1343,17 +1350,20 @@ static bool quectel_ec20_detected(struct usb_interface *intf) return false; } -static bool quectel_ep06_diag_detected(struct usb_interface *intf) +static bool quectel_diag_detected(struct usb_interface *intf) { struct usb_device *dev = interface_to_usbdev(intf); struct usb_interface_descriptor intf_desc = intf->cur_altsetting->desc; + u16 id_vendor = le16_to_cpu(dev->descriptor.idVendor); + u16 id_product = le16_to_cpu(dev->descriptor.idProduct); - if (le16_to_cpu(dev->descriptor.idVendor) == 0x2c7c && - le16_to_cpu(dev->descriptor.idProduct) == 0x0306 && - intf_desc.bNumEndpoints == 2) + if (id_vendor != 0x2c7c || intf_desc.bNumEndpoints != 2) + return false; + + if (id_product == 0x0306 || id_product == 0x0512) return true; - - return false; + else + return false; } static int qmi_wwan_probe(struct usb_interface *intf, @@ -1390,13 +1400,13 @@ static int qmi_wwan_probe(struct usb_interface *intf, return -ENODEV; } - /* Quectel EP06/EM06/EG06 supports dynamic interface configuration, so + /* Several Quectel modems supports dynamic interface configuration, so * we need to match on class/subclass/protocol. These values are * identical for the diagnostic- and QMI-interface, but bNumEndpoints is * different. Ignore the current interface if the number of endpoints * the number for the diag interface (two). 
*/ - if (quectel_ep06_diag_detected(intf)) + if (quectel_diag_detected(intf)) return -ENODEV; return usbnet_probe(intf, id); diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index 9fc9aed6ca9a..52387f7f12ed 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -1469,6 +1469,14 @@ static int vxlan_rcv(struct sock *sk, struct sk_buff *skb) goto drop; } + rcu_read_lock(); + + if (unlikely(!(vxlan->dev->flags & IFF_UP))) { + rcu_read_unlock(); + atomic_long_inc(&vxlan->dev->rx_dropped); + goto drop; + } + stats = this_cpu_ptr(vxlan->dev->tstats); u64_stats_update_begin(&stats->syncp); stats->rx_packets++; @@ -1476,6 +1484,9 @@ static int vxlan_rcv(struct sock *sk, struct sk_buff *skb) u64_stats_update_end(&stats->syncp); gro_cells_receive(&vxlan->gro_cells, skb); + + rcu_read_unlock(); + return 0; drop: @@ -2460,6 +2471,8 @@ static void vxlan_uninit(struct net_device *dev) { struct vxlan_dev *vxlan = netdev_priv(dev); + gro_cells_destroy(&vxlan->gro_cells); + vxlan_fdb_delete_default(vxlan, vxlan->cfg.vni); free_percpu(dev->tstats); @@ -3526,7 +3539,6 @@ static void vxlan_dellink(struct net_device *dev, struct list_head *head) vxlan_flush(vxlan, true); - gro_cells_destroy(&vxlan->gro_cells); list_del(&vxlan->next); unregister_netdevice_queue(dev, head); } diff --git a/drivers/net/wireless/ath/ath9k/init.c b/drivers/net/wireless/ath/ath9k/init.c index c070a9e51ebf..fae572b38416 100644 --- a/drivers/net/wireless/ath/ath9k/init.c +++ b/drivers/net/wireless/ath/ath9k/init.c @@ -636,15 +636,15 @@ static int ath9k_of_init(struct ath_softc *sc) ret = ath9k_eeprom_request(sc, eeprom_name); if (ret) return ret; + + ah->ah_flags &= ~AH_USE_EEPROM; + ah->ah_flags |= AH_NO_EEP_SWAP; } mac = of_get_mac_address(np); if (mac) ether_addr_copy(common->macaddr, mac); - ah->ah_flags &= ~AH_USE_EEPROM; - ah->ah_flags |= AH_NO_EEP_SWAP; - return 0; } diff --git a/drivers/net/wireless/ti/wlcore/sdio.c b/drivers/net/wireless/ti/wlcore/sdio.c index 750bea3574ee..627df164b7b6 
100644 --- a/drivers/net/wireless/ti/wlcore/sdio.c +++ b/drivers/net/wireless/ti/wlcore/sdio.c @@ -164,6 +164,12 @@ static int wl12xx_sdio_power_on(struct wl12xx_sdio_glue *glue) } sdio_claim_host(func); + /* + * To guarantee that the SDIO card is power cycled, as required to make + * the FW programming to succeed, let's do a brute force HW reset. + */ + mmc_hw_reset(card->host); + sdio_enable_func(func); sdio_release_host(func); @@ -174,20 +180,13 @@ static int wl12xx_sdio_power_off(struct wl12xx_sdio_glue *glue) { struct sdio_func *func = dev_to_sdio_func(glue->dev); struct mmc_card *card = func->card; - int error; sdio_claim_host(func); sdio_disable_func(func); sdio_release_host(func); /* Let runtime PM know the card is powered off */ - error = pm_runtime_put(&card->dev); - if (error < 0 && error != -EBUSY) { - dev_err(&card->dev, "%s failed: %i\n", __func__, error); - - return error; - } - + pm_runtime_put(&card->dev); return 0; } diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen-netback/hash.c index 0ccb021f1e78..10d580c3dea3 100644 --- a/drivers/net/xen-netback/hash.c +++ b/drivers/net/xen-netback/hash.c @@ -454,6 +454,8 @@ void xenvif_init_hash(struct xenvif *vif) if (xenvif_hash_cache_size == 0) return; + BUG_ON(vif->hash.cache.count); + spin_lock_init(&vif->hash.cache.lock); INIT_LIST_HEAD(&vif->hash.cache.list); } diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index f6ae23fc3f6b..82add0ac4a5f 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -153,6 +153,13 @@ static u16 xenvif_select_queue(struct net_device *dev, struct sk_buff *skb, { struct xenvif *vif = netdev_priv(dev); unsigned int size = vif->hash.size; + unsigned int num_queues; + + /* If queues are not set up internally - always return 0 + * as the packet going to be dropped anyway */ + num_queues = READ_ONCE(vif->num_queues); + if (num_queues < 1) + return 0; if (vif->hash.alg == 
XEN_NETIF_CTRL_HASH_ALGORITHM_NONE) return fallback(dev, skb, NULL) % dev->real_num_tx_queues; diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 3621e05a7494..d5081ffdc8f0 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -1072,11 +1072,6 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s skb_frag_size_set(&frags[i], len); } - /* Copied all the bits from the frag list -- free it. */ - skb_frag_list_init(skb); - xenvif_skb_zerocopy_prepare(queue, nskb); - kfree_skb(nskb); - /* Release all the original (foreign) frags. */ for (f = 0; f < skb_shinfo(skb)->nr_frags; f++) skb_frag_unref(skb, f); @@ -1145,6 +1140,8 @@ static int xenvif_tx_submit(struct xenvif_queue *queue) xenvif_fill_frags(queue, skb); if (unlikely(skb_has_frag_list(skb))) { + struct sk_buff *nskb = skb_shinfo(skb)->frag_list; + xenvif_skb_zerocopy_prepare(queue, nskb); if (xenvif_handle_frag_list(queue, skb)) { if (net_ratelimit()) netdev_err(queue->vif->dev, @@ -1153,6 +1150,9 @@ static int xenvif_tx_submit(struct xenvif_queue *queue) kfree_skb(skb); continue; } + /* Copied all the bits from the frag list -- free it. */ + skb_frag_list_init(skb); + kfree_skb(nskb); } skb->dev = queue->vif->dev; diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index e0d2b7473901..2cdb3032ca0f 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1182,6 +1182,7 @@ static u32 nvme_passthru_start(struct nvme_ctrl *ctrl, struct nvme_ns *ns, * effects say only one namespace is affected. 
*/ if (effects & (NVME_CMD_EFFECTS_LBCC | NVME_CMD_EFFECTS_CSE_MASK)) { + mutex_lock(&ctrl->scan_lock); nvme_start_freeze(ctrl); nvme_wait_freeze(ctrl); } @@ -1210,8 +1211,10 @@ static void nvme_passthru_end(struct nvme_ctrl *ctrl, u32 effects) */ if (effects & NVME_CMD_EFFECTS_LBCC) nvme_update_formats(ctrl); - if (effects & (NVME_CMD_EFFECTS_LBCC | NVME_CMD_EFFECTS_CSE_MASK)) + if (effects & (NVME_CMD_EFFECTS_LBCC | NVME_CMD_EFFECTS_CSE_MASK)) { nvme_unfreeze(ctrl); + mutex_unlock(&ctrl->scan_lock); + } if (effects & NVME_CMD_EFFECTS_CCC) nvme_init_identify(ctrl); if (effects & (NVME_CMD_EFFECTS_NIC | NVME_CMD_EFFECTS_NCC)) @@ -3292,6 +3295,7 @@ static void nvme_scan_work(struct work_struct *work) if (nvme_identify_ctrl(ctrl, &id)) return; + mutex_lock(&ctrl->scan_lock); nn = le32_to_cpu(id->nn); if (ctrl->vs >= NVME_VS(1, 1, 0) && !(ctrl->quirks & NVME_QUIRK_IDENTIFY_CNS)) { @@ -3300,6 +3304,7 @@ static void nvme_scan_work(struct work_struct *work) } nvme_scan_ns_sequential(ctrl, nn); out_free_id: + mutex_unlock(&ctrl->scan_lock); kfree(id); down_write(&ctrl->namespaces_rwsem); list_sort(NULL, &ctrl->namespaces, ns_cmp); @@ -3535,6 +3540,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev, ctrl->state = NVME_CTRL_NEW; spin_lock_init(&ctrl->lock); + mutex_init(&ctrl->scan_lock); INIT_LIST_HEAD(&ctrl->namespaces); init_rwsem(&ctrl->namespaces_rwsem); ctrl->dev = dev; diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 60220de2db52..e82cdaec81c9 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -148,6 +148,7 @@ struct nvme_ctrl { enum nvme_ctrl_state state; bool identified; spinlock_t lock; + struct mutex scan_lock; const struct nvme_ctrl_ops *ops; struct request_queue *admin_q; struct request_queue *connect_q; diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index f46313f441ec..7b9ef8e734e7 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2260,6 +2260,27 @@ static void 
nvme_reset_work(struct work_struct *work) if (dev->ctrl.ctrl_config & NVME_CC_ENABLE) nvme_dev_disable(dev, false); + mutex_lock(&dev->shutdown_lock); + result = nvme_pci_enable(dev); + if (result) + goto out_unlock; + + result = nvme_pci_configure_admin_queue(dev); + if (result) + goto out_unlock; + + result = nvme_alloc_admin_tags(dev); + if (result) + goto out_unlock; + + /* + * Limit the max command size to prevent iod->sg allocations going + * over a single page. + */ + dev->ctrl.max_hw_sectors = NVME_MAX_KB_SZ << 1; + dev->ctrl.max_segments = NVME_MAX_SEGS; + mutex_unlock(&dev->shutdown_lock); + /* * Introduce CONNECTING state from nvme-fc/rdma transports to mark the * initializing procedure here. @@ -2270,25 +2291,6 @@ static void nvme_reset_work(struct work_struct *work) goto out; } - result = nvme_pci_enable(dev); - if (result) - goto out; - - result = nvme_pci_configure_admin_queue(dev); - if (result) - goto out; - - result = nvme_alloc_admin_tags(dev); - if (result) - goto out; - - /* - * Limit the max command size to prevent iod->sg allocations going - * over a single page. 
- */ - dev->ctrl.max_hw_sectors = NVME_MAX_KB_SZ << 1; - dev->ctrl.max_segments = NVME_MAX_SEGS; - result = nvme_init_identify(&dev->ctrl); if (result) goto out; @@ -2352,6 +2354,8 @@ static void nvme_reset_work(struct work_struct *work) nvme_start_ctrl(&dev->ctrl); return; + out_unlock: + mutex_unlock(&dev->shutdown_lock); out: nvme_remove_dead_ctrl(dev, result); } diff --git a/drivers/pinctrl/pinctrl-mcp23s08.c b/drivers/pinctrl/pinctrl-mcp23s08.c index cf73a403d22d..cecbce21d01f 100644 --- a/drivers/pinctrl/pinctrl-mcp23s08.c +++ b/drivers/pinctrl/pinctrl-mcp23s08.c @@ -832,8 +832,13 @@ static int mcp23s08_probe_one(struct mcp23s08 *mcp, struct device *dev, break; case MCP_TYPE_S18: + one_regmap_config = + devm_kmemdup(dev, &mcp23x17_regmap, + sizeof(struct regmap_config), GFP_KERNEL); + if (!one_regmap_config) + return -ENOMEM; mcp->regmap = devm_regmap_init(dev, &mcp23sxx_spi_regmap, mcp, - &mcp23x17_regmap); + one_regmap_config); mcp->reg_shift = 1; mcp->chip.ngpio = 16; mcp->chip.label = "mcp23s18"; diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig index 0c1aa6c314f5..7563c07e14e4 100644 --- a/drivers/platform/x86/Kconfig +++ b/drivers/platform/x86/Kconfig @@ -856,6 +856,7 @@ config TOSHIBA_WMI config ACPI_CMPC tristate "CMPC Laptop Extras" depends on ACPI && INPUT + depends on BACKLIGHT_LCD_SUPPORT depends on RFKILL || RFKILL=n select BACKLIGHT_CLASS_DEVICE help @@ -1077,6 +1078,7 @@ config INTEL_OAKTRAIL config SAMSUNG_Q10 tristate "Samsung Q10 Extras" depends on ACPI + depends on BACKLIGHT_LCD_SUPPORT select BACKLIGHT_CLASS_DEVICE ---help--- This driver provides support for backlight control on Samsung Q10 diff --git a/drivers/s390/net/qeth_core.h b/drivers/s390/net/qeth_core.h index 970654fcc48d..2d1f6a583641 100644 --- a/drivers/s390/net/qeth_core.h +++ b/drivers/s390/net/qeth_core.h @@ -22,6 +22,7 @@ #include #include #include +#include #include #include diff --git a/drivers/s390/net/qeth_core_main.c 
b/drivers/s390/net/qeth_core_main.c index b03515d43745..56aacf32f71b 100644 --- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -565,6 +565,7 @@ static int __qeth_issue_next_read(struct qeth_card *card) QETH_DBF_MESSAGE(2, "%s error in starting next read ccw! " "rc=%i\n", dev_name(&card->gdev->dev), rc); atomic_set(&channel->irq_pending, 0); + qeth_release_buffer(channel, iob); card->read_or_write_problem = 1; qeth_schedule_recovery(card); wake_up(&card->wait_q); @@ -1187,6 +1188,8 @@ static void qeth_irq(struct ccw_device *cdev, unsigned long intparm, rc = qeth_get_problem(cdev, irb); if (rc) { card->read_or_write_problem = 1; + if (iob) + qeth_release_buffer(iob->channel, iob); qeth_clear_ipacmd_list(card); qeth_schedule_recovery(card); goto out; @@ -1852,6 +1855,7 @@ static int qeth_idx_activate_get_answer(struct qeth_channel *channel, QETH_DBF_MESSAGE(2, "Error2 in activating channel rc=%d\n", rc); QETH_DBF_TEXT_(SETUP, 2, "2err%d", rc); atomic_set(&channel->irq_pending, 0); + qeth_release_buffer(channel, iob); wake_up(&card->wait_q); return rc; } @@ -1923,6 +1927,7 @@ static int qeth_idx_activate_channel(struct qeth_channel *channel, rc); QETH_DBF_TEXT_(SETUP, 2, "1err%d", rc); atomic_set(&channel->irq_pending, 0); + qeth_release_buffer(channel, iob); wake_up(&card->wait_q); return rc; } @@ -2110,6 +2115,7 @@ int qeth_send_control_data(struct qeth_card *card, int len, } reply = qeth_alloc_reply(card); if (!reply) { + qeth_release_buffer(channel, iob); return -ENOMEM; } reply->callback = reply_cb; @@ -2448,11 +2454,12 @@ static int qeth_init_qdio_out_buf(struct qeth_qdio_out_q *q, int bidx) return 0; } -static void qeth_free_qdio_out_buf(struct qeth_qdio_out_q *q) +static void qeth_free_output_queue(struct qeth_qdio_out_q *q) { if (!q) return; + qeth_clear_outq_buffers(q, 1); qdio_free_buffers(q->qdio_bufs, QDIO_MAX_BUFFERS_PER_Q); kfree(q); } @@ -2526,10 +2533,8 @@ static int qeth_alloc_qdio_buffers(struct qeth_card *card) 
card->qdio.out_qs[i]->bufs[j] = NULL; } out_freeoutq: - while (i > 0) { - qeth_free_qdio_out_buf(card->qdio.out_qs[--i]); - qeth_clear_outq_buffers(card->qdio.out_qs[i], 1); - } + while (i > 0) + qeth_free_output_queue(card->qdio.out_qs[--i]); kfree(card->qdio.out_qs); card->qdio.out_qs = NULL; out_freepool: @@ -2562,10 +2567,8 @@ static void qeth_free_qdio_buffers(struct qeth_card *card) qeth_free_buffer_pool(card); /* free outbound qdio_qs */ if (card->qdio.out_qs) { - for (i = 0; i < card->qdio.no_out_queues; ++i) { - qeth_clear_outq_buffers(card->qdio.out_qs[i], 1); - qeth_free_qdio_out_buf(card->qdio.out_qs[i]); - } + for (i = 0; i < card->qdio.no_out_queues; i++) + qeth_free_output_queue(card->qdio.out_qs[i]); kfree(card->qdio.out_qs); card->qdio.out_qs = NULL; } diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c index 76b2fba5fba2..b7513c5848cf 100644 --- a/drivers/s390/net/qeth_l2_main.c +++ b/drivers/s390/net/qeth_l2_main.c @@ -854,6 +854,8 @@ static void qeth_l2_remove_device(struct ccwgroup_device *cgdev) if (cgdev->state == CCWGROUP_ONLINE) qeth_l2_set_offline(cgdev); + + cancel_work_sync(&card->close_dev_work); if (qeth_netdev_is_registered(card->dev)) unregister_netdev(card->dev); } diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c index b7f6a8384543..7f71ca0d08e7 100644 --- a/drivers/s390/net/qeth_l3_main.c +++ b/drivers/s390/net/qeth_l3_main.c @@ -2611,6 +2611,7 @@ static void qeth_l3_remove_device(struct ccwgroup_device *cgdev) if (cgdev->state == CCWGROUP_ONLINE) qeth_l3_set_offline(cgdev); + cancel_work_sync(&card->close_dev_work); if (qeth_netdev_is_registered(card->dev)) unregister_netdev(card->dev); qeth_l3_clear_ip_htable(card, 0); diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c index 6be77b3aa8a5..ac79f2088b31 100644 --- a/drivers/scsi/53c700.c +++ b/drivers/scsi/53c700.c @@ -295,7 +295,7 @@ NCR_700_detect(struct scsi_host_template *tpnt, if(tpnt->sdev_attrs == NULL) 
tpnt->sdev_attrs = NCR_700_dev_attrs; - memory = dma_alloc_attrs(hostdata->dev, TOTAL_MEM_SIZE, &pScript, + memory = dma_alloc_attrs(dev, TOTAL_MEM_SIZE, &pScript, GFP_KERNEL, DMA_ATTR_NON_CONSISTENT); if(memory == NULL) { printk(KERN_ERR "53c700: Failed to allocate memory for driver, detaching\n"); diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c index 6e1b022a823d..3236240a4edd 100644 --- a/drivers/scsi/aacraid/commsup.c +++ b/drivers/scsi/aacraid/commsup.c @@ -1304,8 +1304,9 @@ static void aac_handle_aif(struct aac_dev * dev, struct fib * fibptr) ADD : DELETE; break; } - case AifBuManagerEvent: - aac_handle_aif_bu(dev, aifcmd); + break; + case AifBuManagerEvent: + aac_handle_aif_bu(dev, aifcmd); break; } diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c index 350257c13a5b..bc9f2a2365f4 100644 --- a/drivers/scsi/bnx2fc/bnx2fc_io.c +++ b/drivers/scsi/bnx2fc/bnx2fc_io.c @@ -240,6 +240,7 @@ struct bnx2fc_cmd_mgr *bnx2fc_cmd_mgr_alloc(struct bnx2fc_hba *hba) return NULL; } + cmgr->hba = hba; cmgr->free_list = kcalloc(arr_sz, sizeof(*cmgr->free_list), GFP_KERNEL); if (!cmgr->free_list) { @@ -256,7 +257,6 @@ struct bnx2fc_cmd_mgr *bnx2fc_cmd_mgr_alloc(struct bnx2fc_hba *hba) goto mem_err; } - cmgr->hba = hba; cmgr->cmds = (struct bnx2fc_cmd **)(cmgr + 1); for (i = 0; i < arr_sz; i++) { @@ -295,7 +295,7 @@ struct bnx2fc_cmd_mgr *bnx2fc_cmd_mgr_alloc(struct bnx2fc_hba *hba) /* Allocate pool of io_bdts - one for each bnx2fc_cmd */ mem_size = num_ios * sizeof(struct io_bdt *); - cmgr->io_bdt_pool = kmalloc(mem_size, GFP_KERNEL); + cmgr->io_bdt_pool = kzalloc(mem_size, GFP_KERNEL); if (!cmgr->io_bdt_pool) { printk(KERN_ERR PFX "failed to alloc io_bdt_pool\n"); goto mem_err; diff --git a/drivers/scsi/libfc/fc_lport.c b/drivers/scsi/libfc/fc_lport.c index be83590ed955..ff943f477d6f 100644 --- a/drivers/scsi/libfc/fc_lport.c +++ b/drivers/scsi/libfc/fc_lport.c @@ -1726,14 +1726,14 @@ void fc_lport_flogi_resp(struct 
fc_seq *sp, struct fc_frame *fp, fc_frame_payload_op(fp) != ELS_LS_ACC) { FC_LPORT_DBG(lport, "FLOGI not accepted or bad response\n"); fc_lport_error(lport, fp); - goto err; + goto out; } flp = fc_frame_payload_get(fp, sizeof(*flp)); if (!flp) { FC_LPORT_DBG(lport, "FLOGI bad response\n"); fc_lport_error(lport, fp); - goto err; + goto out; } mfs = ntohs(flp->fl_csp.sp_bb_data) & @@ -1743,7 +1743,7 @@ void fc_lport_flogi_resp(struct fc_seq *sp, struct fc_frame *fp, FC_LPORT_DBG(lport, "FLOGI bad mfs:%hu response, " "lport->mfs:%hu\n", mfs, lport->mfs); fc_lport_error(lport, fp); - goto err; + goto out; } if (mfs <= lport->mfs) { diff --git a/drivers/scsi/libfc/fc_rport.c b/drivers/scsi/libfc/fc_rport.c index 372387a450df..1797e47fab38 100644 --- a/drivers/scsi/libfc/fc_rport.c +++ b/drivers/scsi/libfc/fc_rport.c @@ -184,7 +184,6 @@ void fc_rport_destroy(struct kref *kref) struct fc_rport_priv *rdata; rdata = container_of(kref, struct fc_rport_priv, kref); - WARN_ON(!list_empty(&rdata->peers)); kfree_rcu(rdata, rcu); } EXPORT_SYMBOL(fc_rport_destroy); diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 60bcc6df97a9..65305b3848bc 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -62,7 +62,7 @@ /* make sure inq_product_rev string corresponds to this version */ #define SDEBUG_VERSION "0188" /* format to fit INQUIRY revision field */ -static const char *sdebug_version_date = "20180128"; +static const char *sdebug_version_date = "20190125"; #define MY_NAME "scsi_debug" @@ -735,7 +735,7 @@ static inline bool scsi_debug_lbp(void) (sdebug_lbpu || sdebug_lbpws || sdebug_lbpws10); } -static void *fake_store(unsigned long long lba) +static void *lba2fake_store(unsigned long long lba) { lba = do_div(lba, sdebug_store_sectors); @@ -2514,8 +2514,8 @@ static int do_device_access(struct scsi_cmnd *scmd, u32 sg_skip, u64 lba, return ret; } -/* If fake_store(lba,num) compares equal to arr(num), then copy top half of - * arr into 
fake_store(lba,num) and return true. If comparison fails then +/* If lba2fake_store(lba,num) compares equal to arr(num), then copy top half of + * arr into lba2fake_store(lba,num) and return true. If comparison fails then * return false. */ static bool comp_write_worker(u64 lba, u32 num, const u8 *arr) { @@ -2643,7 +2643,7 @@ static int prot_verify_read(struct scsi_cmnd *SCpnt, sector_t start_sec, if (sdt->app_tag == cpu_to_be16(0xffff)) continue; - ret = dif_verify(sdt, fake_store(sector), sector, ei_lba); + ret = dif_verify(sdt, lba2fake_store(sector), sector, ei_lba); if (ret) { dif_errors++; return ret; @@ -3261,10 +3261,12 @@ static int resp_write_scat(struct scsi_cmnd *scp, static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num, u32 ei_lba, bool unmap, bool ndob) { + int ret; unsigned long iflags; unsigned long long i; - int ret; - u64 lba_off; + u32 lb_size = sdebug_sector_size; + u64 block, lbaa; + u8 *fs1p; ret = check_device_access_params(scp, lba, num); if (ret) @@ -3276,31 +3278,30 @@ static int resp_write_same(struct scsi_cmnd *scp, u64 lba, u32 num, unmap_region(lba, num); goto out; } - - lba_off = lba * sdebug_sector_size; + lbaa = lba; + block = do_div(lbaa, sdebug_store_sectors); /* if ndob then zero 1 logical block, else fetch 1 logical block */ + fs1p = fake_storep + (block * lb_size); if (ndob) { - memset(fake_storep + lba_off, 0, sdebug_sector_size); + memset(fs1p, 0, lb_size); ret = 0; } else - ret = fetch_to_dev_buffer(scp, fake_storep + lba_off, - sdebug_sector_size); + ret = fetch_to_dev_buffer(scp, fs1p, lb_size); if (-1 == ret) { write_unlock_irqrestore(&atomic_rw, iflags); return DID_ERROR << 16; - } else if (sdebug_verbose && !ndob && (ret < sdebug_sector_size)) + } else if (sdebug_verbose && !ndob && (ret < lb_size)) sdev_printk(KERN_INFO, scp->device, "%s: %s: lb size=%u, IO sent=%d bytes\n", - my_name, "write same", - sdebug_sector_size, ret); + my_name, "write same", lb_size, ret); /* Copy first sector to remaining 
blocks */ - for (i = 1 ; i < num ; i++) - memcpy(fake_storep + ((lba + i) * sdebug_sector_size), - fake_storep + lba_off, - sdebug_sector_size); - + for (i = 1 ; i < num ; i++) { + lbaa = lba + i; + block = do_div(lbaa, sdebug_store_sectors); + memmove(fake_storep + (block * lb_size), fs1p, lb_size); + } if (scsi_debug_lbp()) map_region(lba, num); out: diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index ffeac4b07722..c678bf9c4d0a 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -761,6 +761,7 @@ static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result) set_host_byte(cmd, DID_OK); return BLK_STS_TARGET; case DID_NEXUS_FAILURE: + set_host_byte(cmd, DID_OK); return BLK_STS_NEXUS; case DID_ALLOC_FAILURE: set_host_byte(cmd, DID_OK); diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c index 8cc015183043..a4ac6073c555 100644 --- a/drivers/soc/fsl/qbman/qman.c +++ b/drivers/soc/fsl/qbman/qman.c @@ -1081,18 +1081,19 @@ static void qm_mr_process_task(struct work_struct *work); static irqreturn_t portal_isr(int irq, void *ptr) { struct qman_portal *p = ptr; - - u32 clear = QM_DQAVAIL_MASK | p->irq_sources; u32 is = qm_in(&p->p, QM_REG_ISR) & p->irq_sources; + u32 clear = 0; if (unlikely(!is)) return IRQ_NONE; /* DQRR-handling if it's interrupt-driven */ - if (is & QM_PIRQ_DQRI) + if (is & QM_PIRQ_DQRI) { __poll_portal_fast(p, QMAN_POLL_LIMIT); + clear = QM_DQAVAIL_MASK | QM_PIRQ_DQRI; + } /* Handling of anything else that's interrupt-driven */ - clear |= __poll_portal_slow(p, is); + clear |= __poll_portal_slow(p, is) & QM_PIRQ_SLOW; qm_out(&p->p, QM_REG_ISR, clear); return IRQ_HANDLED; } diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c index a880b5c6c6c3..be815330ed95 100644 --- a/drivers/staging/android/ashmem.c +++ b/drivers/staging/android/ashmem.c @@ -75,6 +75,9 @@ struct ashmem_range { /* LRU list of unpinned pages, protected by ashmem_mutex */ static 
LIST_HEAD(ashmem_lru_list); +static atomic_t ashmem_shrink_inflight = ATOMIC_INIT(0); +static DECLARE_WAIT_QUEUE_HEAD(ashmem_shrink_wait); + /* * long lru_count - The count of pages on our LRU list. * @@ -168,19 +171,15 @@ static inline void lru_del(struct ashmem_range *range) * @end: The ending page (inclusive) * * This function is protected by ashmem_mutex. - * - * Return: 0 if successful, or -ENOMEM if there is an error */ -static int range_alloc(struct ashmem_area *asma, - struct ashmem_range *prev_range, unsigned int purged, - size_t start, size_t end) +static void range_alloc(struct ashmem_area *asma, + struct ashmem_range *prev_range, unsigned int purged, + size_t start, size_t end, + struct ashmem_range **new_range) { - struct ashmem_range *range; - - range = kmem_cache_zalloc(ashmem_range_cachep, GFP_KERNEL); - if (!range) - return -ENOMEM; + struct ashmem_range *range = *new_range; + *new_range = NULL; range->asma = asma; range->pgstart = start; range->pgend = end; @@ -190,8 +189,6 @@ static int range_alloc(struct ashmem_area *asma, if (range_on_lru(range)) lru_add(range); - - return 0; } /** @@ -438,7 +435,6 @@ static int ashmem_mmap(struct file *file, struct vm_area_struct *vma) static unsigned long ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) { - struct ashmem_range *range, *next; unsigned long freed = 0; /* We might recurse into filesystem code, so bail out if necessary */ @@ -448,21 +444,33 @@ ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) if (!mutex_trylock(&ashmem_mutex)) return -1; - list_for_each_entry_safe(range, next, &ashmem_lru_list, lru) { + while (!list_empty(&ashmem_lru_list)) { + struct ashmem_range *range = + list_first_entry(&ashmem_lru_list, typeof(*range), lru); loff_t start = range->pgstart * PAGE_SIZE; loff_t end = (range->pgend + 1) * PAGE_SIZE; + struct file *f = range->asma->file; - range->asma->file->f_op->fallocate(range->asma->file, - FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, 
- start, end - start); + get_file(f); + atomic_inc(&ashmem_shrink_inflight); range->purged = ASHMEM_WAS_PURGED; lru_del(range); freed += range_size(range); + mutex_unlock(&ashmem_mutex); + f->f_op->fallocate(f, + FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, + start, end - start); + fput(f); + if (atomic_dec_and_test(&ashmem_shrink_inflight)) + wake_up_all(&ashmem_shrink_wait); + if (!mutex_trylock(&ashmem_mutex)) + goto out; if (--sc->nr_to_scan <= 0) break; } mutex_unlock(&ashmem_mutex); +out: return freed; } @@ -582,7 +590,8 @@ static int get_name(struct ashmem_area *asma, void __user *name) * * Caller must hold ashmem_mutex. */ -static int ashmem_pin(struct ashmem_area *asma, size_t pgstart, size_t pgend) +static int ashmem_pin(struct ashmem_area *asma, size_t pgstart, size_t pgend, + struct ashmem_range **new_range) { struct ashmem_range *range, *next; int ret = ASHMEM_NOT_PURGED; @@ -635,7 +644,7 @@ static int ashmem_pin(struct ashmem_area *asma, size_t pgstart, size_t pgend) * second half and adjust the first chunk's endpoint. */ range_alloc(asma, range, range->purged, - pgend + 1, range->pgend); + pgend + 1, range->pgend, new_range); range_shrink(range, range->pgstart, pgstart - 1); break; } @@ -649,7 +658,8 @@ static int ashmem_pin(struct ashmem_area *asma, size_t pgstart, size_t pgend) * * Caller must hold ashmem_mutex. 
*/ -static int ashmem_unpin(struct ashmem_area *asma, size_t pgstart, size_t pgend) +static int ashmem_unpin(struct ashmem_area *asma, size_t pgstart, size_t pgend, + struct ashmem_range **new_range) { struct ashmem_range *range, *next; unsigned int purged = ASHMEM_NOT_PURGED; @@ -675,7 +685,8 @@ static int ashmem_unpin(struct ashmem_area *asma, size_t pgstart, size_t pgend) } } - return range_alloc(asma, range, purged, pgstart, pgend); + range_alloc(asma, range, purged, pgstart, pgend, new_range); + return 0; } /* @@ -708,11 +719,19 @@ static int ashmem_pin_unpin(struct ashmem_area *asma, unsigned long cmd, struct ashmem_pin pin; size_t pgstart, pgend; int ret = -EINVAL; + struct ashmem_range *range = NULL; if (copy_from_user(&pin, p, sizeof(pin))) return -EFAULT; + if (cmd == ASHMEM_PIN || cmd == ASHMEM_UNPIN) { + range = kmem_cache_zalloc(ashmem_range_cachep, GFP_KERNEL); + if (!range) + return -ENOMEM; + } + mutex_lock(&ashmem_mutex); + wait_event(ashmem_shrink_wait, !atomic_read(&ashmem_shrink_inflight)); if (!asma->file) goto out_unlock; @@ -735,10 +754,10 @@ static int ashmem_pin_unpin(struct ashmem_area *asma, unsigned long cmd, switch (cmd) { case ASHMEM_PIN: - ret = ashmem_pin(asma, pgstart, pgend); + ret = ashmem_pin(asma, pgstart, pgend, &range); break; case ASHMEM_UNPIN: - ret = ashmem_unpin(asma, pgstart, pgend); + ret = ashmem_unpin(asma, pgstart, pgend, &range); break; case ASHMEM_GET_PIN_STATUS: ret = ashmem_get_pin_status(asma, pgstart, pgend); @@ -747,6 +766,8 @@ static int ashmem_pin_unpin(struct ashmem_area *asma, unsigned long cmd, out_unlock: mutex_unlock(&ashmem_mutex); + if (range) + kmem_cache_free(ashmem_range_cachep, range); return ret; } diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index 9453d6fe06e1..35355e5d4bd0 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -619,6 +619,7 @@ static int 
ion_system_heap_create_pools(struct ion_page_pool **pools,
 					bool cached)
 {
 	int i;
+
 	for (i = 0; i < NUM_ORDERS; i++) {
 		struct ion_page_pool *pool;
 		gfp_t gfp_flags = low_order_gfp_flags;
diff --git a/drivers/staging/comedi/drivers/ni_660x.c b/drivers/staging/comedi/drivers/ni_660x.c
index e521ed9d0887..35bd4d2efe16 100644
--- a/drivers/staging/comedi/drivers/ni_660x.c
+++ b/drivers/staging/comedi/drivers/ni_660x.c
@@ -602,6 +602,7 @@ static int ni_660x_set_pfi_routing(struct comedi_device *dev,
 		case NI_660X_PFI_OUTPUT_DIO:
 			if (chan > 31)
 				return -EINVAL;
+			break;
 		default:
 			return -EINVAL;
 		}
diff --git a/drivers/staging/erofs/inode.c b/drivers/staging/erofs/inode.c
index 9e7815f55a17..7448744cc515 100644
--- a/drivers/staging/erofs/inode.c
+++ b/drivers/staging/erofs/inode.c
@@ -184,16 +184,16 @@ static int fill_inode(struct inode *inode, int isdir)
 		/* setup the new inode */
 		if (S_ISREG(inode->i_mode)) {
 #ifdef CONFIG_EROFS_FS_XATTR
-			if (vi->xattr_isize)
-				inode->i_op = &erofs_generic_xattr_iops;
+			inode->i_op = &erofs_generic_xattr_iops;
 #endif
 			inode->i_fop = &generic_ro_fops;
 		} else if (S_ISDIR(inode->i_mode)) {
 			inode->i_op =
 #ifdef CONFIG_EROFS_FS_XATTR
-				vi->xattr_isize ?
&erofs_dir_xattr_iops : -#endif + &erofs_dir_xattr_iops; +#else &erofs_dir_iops; +#endif inode->i_fop = &erofs_dir_fops; } else if (S_ISLNK(inode->i_mode)) { /* by default, page_get_link is used for symlink */ diff --git a/drivers/staging/erofs/internal.h b/drivers/staging/erofs/internal.h index 9f44ed8f0023..58d8cbc3f921 100644 --- a/drivers/staging/erofs/internal.h +++ b/drivers/staging/erofs/internal.h @@ -260,6 +260,7 @@ static inline bool erofs_workgroup_get(struct erofs_workgroup *grp, int *ocnt) } #define __erofs_workgroup_get(grp) atomic_inc(&(grp)->refcount) +#define __erofs_workgroup_put(grp) atomic_dec(&(grp)->refcount) extern int erofs_workgroup_put(struct erofs_workgroup *grp); @@ -327,12 +328,17 @@ static inline erofs_off_t iloc(struct erofs_sb_info *sbi, erofs_nid_t nid) return blknr_to_addr(sbi->meta_blkaddr) + (nid << sbi->islotbits); } -#define inode_set_inited_xattr(inode) (EROFS_V(inode)->flags |= 1) -#define inode_has_inited_xattr(inode) (EROFS_V(inode)->flags & 1) +/* atomic flag definitions */ +#define EROFS_V_EA_INITED_BIT 0 + +/* bitlock definitions (arranged in reverse order) */ +#define EROFS_V_BL_XATTR_BIT (BITS_PER_LONG - 1) struct erofs_vnode { erofs_nid_t nid; - unsigned int flags; + + /* atomic flags (including bitlocks) */ + unsigned long flags; unsigned char data_mapping_mode; /* inline size in bytes */ @@ -485,8 +491,9 @@ struct erofs_map_blocks_iter { }; -static inline struct page *erofs_get_inline_page(struct inode *inode, - erofs_blk_t blkaddr) +static inline struct page * +erofs_get_inline_page(struct inode *inode, + erofs_blk_t blkaddr) { return erofs_get_meta_page(inode->i_sb, blkaddr, S_ISDIR(inode->i_mode)); diff --git a/drivers/staging/erofs/namei.c b/drivers/staging/erofs/namei.c index 546a47156101..023f64fa2c87 100644 --- a/drivers/staging/erofs/namei.c +++ b/drivers/staging/erofs/namei.c @@ -15,74 +15,77 @@ #include -/* based on the value of qn->len is accurate */ -static inline int dirnamecmp(struct qstr *qn, - struct 
qstr *qd, unsigned *matched) +struct erofs_qstr { + const unsigned char *name; + const unsigned char *end; +}; + +/* based on the end of qn is accurate and it must have the trailing '\0' */ +static inline int dirnamecmp(const struct erofs_qstr *qn, + const struct erofs_qstr *qd, + unsigned int *matched) { - unsigned i = *matched, len = min(qn->len, qd->len); -loop: - if (unlikely(i >= len)) { - *matched = i; - if (qn->len < qd->len) { - /* - * actually (qn->len == qd->len) - * when qd->name[i] == '\0' - */ - return qd->name[i] == '\0' ? 0 : -1; + unsigned int i = *matched; + + /* + * on-disk error, let's only BUG_ON in the debugging mode. + * otherwise, it will return 1 to just skip the invalid name + * and go on (in consideration of the lookup performance). + */ + DBG_BUGON(qd->name > qd->end); + + /* qd could not have trailing '\0' */ + /* However it is absolutely safe if < qd->end */ + while (qd->name + i < qd->end && qd->name[i] != '\0') { + if (qn->name[i] != qd->name[i]) { + *matched = i; + return qn->name[i] > qd->name[i] ? 1 : -1; } - return (qn->len > qd->len); + ++i; } - - if (qn->name[i] != qd->name[i]) { - *matched = i; - return qn->name[i] > qd->name[i] ? 1 : -1; - } - - ++i; - goto loop; + *matched = i; + /* See comments in __d_alloc on the terminating NUL character */ + return qn->name[i] == '\0' ? 
0 : 1; } -static struct erofs_dirent *find_target_dirent( - struct qstr *name, - u8 *data, int maxsize) +#define nameoff_from_disk(off, sz) (le16_to_cpu(off) & ((sz) - 1)) + +static struct erofs_dirent *find_target_dirent(struct erofs_qstr *name, + u8 *data, + unsigned int dirblksize, + const int ndirents) { - unsigned ndirents, head, back; - unsigned startprfx, endprfx; + int head, back; + unsigned int startprfx, endprfx; struct erofs_dirent *const de = (struct erofs_dirent *)data; - /* make sure that maxsize is valid */ - BUG_ON(maxsize < sizeof(struct erofs_dirent)); - - ndirents = le16_to_cpu(de->nameoff) / sizeof(*de); - - /* corrupted dir (may be unnecessary...) */ - BUG_ON(!ndirents); - - head = 0; + /* since the 1st dirent has been evaluated previously */ + head = 1; back = ndirents - 1; startprfx = endprfx = 0; while (head <= back) { - unsigned mid = head + (back - head) / 2; - unsigned nameoff = le16_to_cpu(de[mid].nameoff); - unsigned matched = min(startprfx, endprfx); - - struct qstr dname = QSTR_INIT(data + nameoff, - unlikely(mid >= ndirents - 1) ? - maxsize - nameoff : - le16_to_cpu(de[mid + 1].nameoff) - nameoff); + const int mid = head + (back - head) / 2; + const int nameoff = nameoff_from_disk(de[mid].nameoff, + dirblksize); + unsigned int matched = min(startprfx, endprfx); + struct erofs_qstr dname = { + .name = data + nameoff, + .end = unlikely(mid >= ndirents - 1) ? 
+ data + dirblksize : + data + nameoff_from_disk(de[mid + 1].nameoff, + dirblksize) + }; /* string comparison without already matched prefix */ int ret = dirnamecmp(name, &dname, &matched); - if (unlikely(!ret)) + if (unlikely(!ret)) { return de + mid; - else if (ret > 0) { + } else if (ret > 0) { head = mid + 1; startprfx = matched; - } else if (unlikely(mid < 1)) /* fix "mid" overflow */ - break; - else { + } else { back = mid - 1; endprfx = matched; } @@ -91,12 +94,12 @@ static struct erofs_dirent *find_target_dirent( return ERR_PTR(-ENOENT); } -static struct page *find_target_block_classic( - struct inode *dir, - struct qstr *name, int *_diff) +static struct page *find_target_block_classic(struct inode *dir, + struct erofs_qstr *name, + int *_ndirents) { - unsigned startprfx, endprfx; - unsigned head, back; + unsigned int startprfx, endprfx; + int head, back; struct address_space *const mapping = dir->i_mapping; struct page *candidate = ERR_PTR(-ENOENT); @@ -105,41 +108,43 @@ static struct page *find_target_block_classic( back = inode_datablocks(dir) - 1; while (head <= back) { - unsigned mid = head + (back - head) / 2; + const int mid = head + (back - head) / 2; struct page *page = read_mapping_page(mapping, mid, NULL); - if (IS_ERR(page)) { -exact_out: - if (!IS_ERR(candidate)) /* valid candidate */ - put_page(candidate); - return page; - } else { - int diff; - unsigned ndirents, matched; - struct qstr dname; + if (!IS_ERR(page)) { struct erofs_dirent *de = kmap_atomic(page); - unsigned nameoff = le16_to_cpu(de->nameoff); + const int nameoff = nameoff_from_disk(de->nameoff, + EROFS_BLKSIZ); + const int ndirents = nameoff / sizeof(*de); + int diff; + unsigned int matched; + struct erofs_qstr dname; - ndirents = nameoff / sizeof(*de); - - /* corrupted dir (should have one entry at least) */ - BUG_ON(!ndirents || nameoff > PAGE_SIZE); + if (unlikely(!ndirents)) { + DBG_BUGON(1); + kunmap_atomic(de); + put_page(page); + page = ERR_PTR(-EIO); + goto out; + } 
matched = min(startprfx, endprfx); dname.name = (u8 *)de + nameoff; - dname.len = ndirents == 1 ? - /* since the rest of the last page is 0 */ - EROFS_BLKSIZ - nameoff - : le16_to_cpu(de[1].nameoff) - nameoff; + if (ndirents == 1) + dname.end = (u8 *)de + EROFS_BLKSIZ; + else + dname.end = (u8 *)de + + nameoff_from_disk(de[1].nameoff, + EROFS_BLKSIZ); /* string comparison without already matched prefix */ diff = dirnamecmp(name, &dname, &matched); kunmap_atomic(de); if (unlikely(!diff)) { - *_diff = 0; - goto exact_out; + *_ndirents = 0; + goto out; } else if (diff > 0) { head = mid + 1; startprfx = matched; @@ -147,45 +152,51 @@ static struct page *find_target_block_classic( if (likely(!IS_ERR(candidate))) put_page(candidate); candidate = page; + *_ndirents = ndirents; } else { put_page(page); - if (unlikely(mid < 1)) /* fix "mid" overflow */ - break; - back = mid - 1; endprfx = matched; } + continue; } +out: /* free if the candidate is valid */ + if (!IS_ERR(candidate)) + put_page(candidate); + return page; } - *_diff = 1; return candidate; } int erofs_namei(struct inode *dir, - struct qstr *name, - erofs_nid_t *nid, unsigned *d_type) + struct qstr *name, + erofs_nid_t *nid, unsigned int *d_type) { - int diff; + int ndirents; struct page *page; - u8 *data; + void *data; struct erofs_dirent *de; + struct erofs_qstr qn; if (unlikely(!dir->i_size)) return -ENOENT; - diff = 1; - page = find_target_block_classic(dir, name, &diff); + qn.name = name->name; + qn.end = name->name + name->len; + + ndirents = 0; + page = find_target_block_classic(dir, &qn, &ndirents); if (unlikely(IS_ERR(page))) return PTR_ERR(page); data = kmap_atomic(page); /* the target page has been mapped */ - de = likely(diff) ? 
- /* since the rest of the last page is 0 */ - find_target_dirent(name, data, EROFS_BLKSIZ) : - (struct erofs_dirent *)data; + if (ndirents) + de = find_target_dirent(&qn, data, EROFS_BLKSIZ, ndirents); + else + de = (struct erofs_dirent *)data; if (likely(!IS_ERR(de))) { *nid = le64_to_cpu(de->nid); diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c index 1279241449f4..f44662dd795c 100644 --- a/drivers/staging/erofs/unzip_vle.c +++ b/drivers/staging/erofs/unzip_vle.c @@ -57,15 +57,30 @@ enum z_erofs_vle_work_role { Z_EROFS_VLE_WORK_SECONDARY, Z_EROFS_VLE_WORK_PRIMARY, /* - * The current work has at least been linked with the following - * processed chained works, which means if the processing page - * is the tail partial page of the work, the current work can - * safely use the whole page, as illustrated below: - * +--------------+-------------------------------------------+ - * | tail page | head page (of the previous work) | - * +--------------+-------------------------------------------+ - * /\ which belongs to the current work - * [ (*) this page can be used for the current work itself. ] + * The current work was the tail of an exist chain, and the previous + * processed chained works are all decided to be hooked up to it. 
+ * A new chain should be created for the remaining unprocessed works, + * therefore different from Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED, + * the next work cannot reuse the whole page in the following scenario: + * ________________________________________________________________ + * | tail (partial) page | head (partial) page | + * | (belongs to the next work) | (belongs to the current work) | + * |_______PRIMARY_FOLLOWED_______|________PRIMARY_HOOKED___________| + */ + Z_EROFS_VLE_WORK_PRIMARY_HOOKED, + /* + * The current work has been linked with the processed chained works, + * and could be also linked with the potential remaining works, which + * means if the processing page is the tail partial page of the work, + * the current work can safely use the whole page (since the next work + * is under control) for in-place decompression, as illustrated below: + * ________________________________________________________________ + * | tail (partial) page | head (partial) page | + * | (of the current work) | (of the previous work) | + * | PRIMARY_FOLLOWED or | | + * |_____PRIMARY_HOOKED____|____________PRIMARY_FOLLOWED____________| + * + * [ (*) the above page can be used for the current work itself. ] */ Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED, Z_EROFS_VLE_WORK_MAX @@ -234,10 +249,10 @@ static int z_erofs_vle_work_add_page( return ret ? 
0 : -EAGAIN; } -static inline bool try_to_claim_workgroup( - struct z_erofs_vle_workgroup *grp, - z_erofs_vle_owned_workgrp_t *owned_head, - bool *hosted) +static enum z_erofs_vle_work_role +try_to_claim_workgroup(struct z_erofs_vle_workgroup *grp, + z_erofs_vle_owned_workgrp_t *owned_head, + bool *hosted) { DBG_BUGON(*hosted == true); @@ -251,6 +266,9 @@ static inline bool try_to_claim_workgroup( *owned_head = grp; *hosted = true; + /* lucky, I am the followee :) */ + return Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED; + } else if (grp->next == Z_EROFS_VLE_WORKGRP_TAIL) { /* * type 2, link to the end of a existing open chain, @@ -260,12 +278,11 @@ static inline bool try_to_claim_workgroup( if (Z_EROFS_VLE_WORKGRP_TAIL != cmpxchg(&grp->next, Z_EROFS_VLE_WORKGRP_TAIL, *owned_head)) goto retry; - *owned_head = Z_EROFS_VLE_WORKGRP_TAIL; - } else - return false; /* :( better luck next time */ + return Z_EROFS_VLE_WORK_PRIMARY_HOOKED; + } - return true; /* lucky, I am the followee :) */ + return Z_EROFS_VLE_WORK_PRIMARY; /* :( better luck next time */ } static struct z_erofs_vle_work * @@ -337,12 +354,8 @@ z_erofs_vle_work_lookup(struct super_block *sb, *hosted = false; if (!primary) *role = Z_EROFS_VLE_WORK_SECONDARY; - /* claim the workgroup if possible */ - else if (try_to_claim_workgroup(grp, owned_head, hosted)) - *role = Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED; - else - *role = Z_EROFS_VLE_WORK_PRIMARY; - + else /* claim the workgroup if possible */ + *role = try_to_claim_workgroup(grp, owned_head, hosted); return work; } @@ -419,6 +432,9 @@ static inline void __update_workgrp_llen(struct z_erofs_vle_workgroup *grp, } } +#define builder_is_hooked(builder) \ + ((builder)->role >= Z_EROFS_VLE_WORK_PRIMARY_HOOKED) + #define builder_is_followed(builder) \ ((builder)->role >= Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED) @@ -583,7 +599,7 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe, struct z_erofs_vle_work_builder *const builder = &fe->builder; const loff_t offset = 
page_offset(page); - bool tight = builder_is_followed(builder); + bool tight = builder_is_hooked(builder); struct z_erofs_vle_work *work = builder->work; #ifdef EROFS_FS_HAS_MANAGED_CACHE @@ -606,8 +622,12 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe, /* lucky, within the range of the current map_blocks */ if (offset + cur >= map->m_la && - offset + cur < map->m_la + map->m_llen) + offset + cur < map->m_la + map->m_llen) { + /* didn't get a valid unzip work previously (very rare) */ + if (!builder->work) + goto restart_now; goto hitted; + } /* go ahead the next map_blocks */ debugln("%s: [out-of-range] pos %llu", __func__, offset + cur); @@ -621,6 +641,7 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe, if (unlikely(err)) goto err_out; +restart_now: if (unlikely(!(map->m_flags & EROFS_MAP_MAPPED))) goto hitted; @@ -646,7 +667,7 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe, builder->role = Z_EROFS_VLE_WORK_PRIMARY; #endif - tight &= builder_is_followed(builder); + tight &= builder_is_hooked(builder); work = builder->work; hitted: cur = end - min_t(unsigned, offset + end - map->m_la, end); @@ -661,6 +682,9 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe, (tight ? 
Z_EROFS_PAGE_TYPE_EXCLUSIVE : Z_EROFS_VLE_PAGE_TYPE_TAIL_SHARED)); + if (cur) + tight &= builder_is_followed(builder); + retry: err = z_erofs_vle_work_add_page(builder, page, page_type); /* should allocate an additional staging page for pagevec */ @@ -901,11 +925,10 @@ static int z_erofs_vle_unzip(struct super_block *sb, if (llen > grp->llen) llen = grp->llen; - err = z_erofs_vle_unzip_fast_percpu(compressed_pages, - clusterpages, pages, llen, work->pageofs, - z_erofs_onlinepage_endio); + err = z_erofs_vle_unzip_fast_percpu(compressed_pages, clusterpages, + pages, llen, work->pageofs); if (err != -ENOTSUPP) - goto out_percpu; + goto out; if (sparsemem_pages >= nr_pages) goto skip_allocpage; @@ -926,21 +949,7 @@ static int z_erofs_vle_unzip(struct super_block *sb, erofs_vunmap(vout, nr_pages); out: - for (i = 0; i < nr_pages; ++i) { - page = pages[i]; - DBG_BUGON(page->mapping == NULL); - - /* recycle all individual staging pages */ - if (z_erofs_gather_if_stagingpage(page_pool, page)) - continue; - - if (unlikely(err < 0)) - SetPageError(page); - - z_erofs_onlinepage_endio(page); - } - -out_percpu: + /* must handle all compressed pages before endding pages */ for (i = 0; i < clusterpages; ++i) { page = compressed_pages[i]; @@ -954,6 +963,23 @@ static int z_erofs_vle_unzip(struct super_block *sb, WRITE_ONCE(compressed_pages[i], NULL); } + for (i = 0; i < nr_pages; ++i) { + page = pages[i]; + if (!page) + continue; + + DBG_BUGON(page->mapping == NULL); + + /* recycle all individual staging pages */ + if (z_erofs_gather_if_stagingpage(page_pool, page)) + continue; + + if (unlikely(err < 0)) + SetPageError(page); + + z_erofs_onlinepage_endio(page); + } + if (pages == z_pagemap_global) mutex_unlock(&z_pagemap_global_lock); else if (unlikely(pages != pages_onstack)) diff --git a/drivers/staging/erofs/unzip_vle.h b/drivers/staging/erofs/unzip_vle.h index 3316bc36965d..684ff06fc7bf 100644 --- a/drivers/staging/erofs/unzip_vle.h +++ b/drivers/staging/erofs/unzip_vle.h @@ 
-218,8 +218,7 @@ extern int z_erofs_vle_plain_copy(struct page **compressed_pages, extern int z_erofs_vle_unzip_fast_percpu(struct page **compressed_pages, unsigned clusterpages, struct page **pages, - unsigned outlen, unsigned short pageofs, - void (*endio)(struct page *)); + unsigned int outlen, unsigned short pageofs); extern int z_erofs_vle_unzip_vmap(struct page **compressed_pages, unsigned clusterpages, void *vaddr, unsigned llen, diff --git a/drivers/staging/erofs/unzip_vle_lz4.c b/drivers/staging/erofs/unzip_vle_lz4.c index 9cb35cd33365..055420e8af2c 100644 --- a/drivers/staging/erofs/unzip_vle_lz4.c +++ b/drivers/staging/erofs/unzip_vle_lz4.c @@ -105,8 +105,7 @@ int z_erofs_vle_unzip_fast_percpu(struct page **compressed_pages, unsigned clusterpages, struct page **pages, unsigned outlen, - unsigned short pageofs, - void (*endio)(struct page *)) + unsigned short pageofs) { void *vin, *vout; unsigned nr_pages, i, j; @@ -128,31 +127,30 @@ int z_erofs_vle_unzip_fast_percpu(struct page **compressed_pages, ret = z_erofs_unzip_lz4(vin, vout + pageofs, clusterpages * PAGE_SIZE, outlen); - if (ret >= 0) { - outlen = ret; - ret = 0; - } + if (ret < 0) + goto out; + ret = 0; for (i = 0; i < nr_pages; ++i) { j = min((unsigned)PAGE_SIZE - pageofs, outlen); if (pages[i] != NULL) { - if (ret < 0) - SetPageError(pages[i]); - else if (clusterpages == 1 && pages[i] == compressed_pages[0]) + if (clusterpages == 1 && + pages[i] == compressed_pages[0]) { memcpy(vin + pageofs, vout + pageofs, j); - else { + } else { void *dst = kmap_atomic(pages[i]); memcpy(dst + pageofs, vout + pageofs, j); kunmap_atomic(dst); } - endio(pages[i]); } vout += PAGE_SIZE; outlen -= j; pageofs = 0; } + +out: preempt_enable(); if (clusterpages == 1) diff --git a/drivers/staging/erofs/utils.c b/drivers/staging/erofs/utils.c index dd2ac9dbc4b4..2d96820da62e 100644 --- a/drivers/staging/erofs/utils.c +++ b/drivers/staging/erofs/utils.c @@ -87,12 +87,21 @@ int erofs_register_workgroup(struct super_block 
*sb, grp = (void *)((unsigned long)grp | 1UL << RADIX_TREE_EXCEPTIONAL_SHIFT); - err = radix_tree_insert(&sbi->workstn_tree, - grp->index, grp); + /* + * Bump up reference count before making this workgroup + * visible to other users in order to avoid potential UAF + * without serialized by erofs_workstn_lock. + */ + __erofs_workgroup_get(grp); - if (!err) { - __erofs_workgroup_get(grp); - } + err = radix_tree_insert(&sbi->workstn_tree, + grp->index, grp); + if (unlikely(err)) + /* + * it's safe to decrease since the workgroup isn't visible + * and refcount >= 2 (cannot be freezed). + */ + __erofs_workgroup_put(grp); erofs_workstn_unlock(sbi); radix_tree_preload_end(); @@ -101,19 +110,99 @@ int erofs_register_workgroup(struct super_block *sb, extern void erofs_workgroup_free_rcu(struct erofs_workgroup *grp); +static void __erofs_workgroup_free(struct erofs_workgroup *grp) +{ + atomic_long_dec(&erofs_global_shrink_cnt); + erofs_workgroup_free_rcu(grp); +} + int erofs_workgroup_put(struct erofs_workgroup *grp) { int count = atomic_dec_return(&grp->refcount); if (count == 1) atomic_long_inc(&erofs_global_shrink_cnt); - else if (!count) { - atomic_long_dec(&erofs_global_shrink_cnt); - erofs_workgroup_free_rcu(grp); - } + else if (!count) + __erofs_workgroup_free(grp); return count; } +#ifdef EROFS_FS_HAS_MANAGED_CACHE +/* for cache-managed case, customized reclaim paths exist */ +static void erofs_workgroup_unfreeze_final(struct erofs_workgroup *grp) +{ + erofs_workgroup_unfreeze(grp, 0); + __erofs_workgroup_free(grp); +} + +bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi, + struct erofs_workgroup *grp, + bool cleanup) +{ + void *entry; + + /* + * for managed cache enabled, the refcount of workgroups + * themselves could be < 0 (freezed). So there is no guarantee + * that all refcount > 0 if managed cache is enabled. 
+ */ + if (!erofs_workgroup_try_to_freeze(grp, 1)) + return false; + + /* + * note that all cached pages should be unlinked + * before delete it from the radix tree. + * Otherwise some cached pages of an orphan old workgroup + * could be still linked after the new one is available. + */ + if (erofs_try_to_free_all_cached_pages(sbi, grp)) { + erofs_workgroup_unfreeze(grp, 1); + return false; + } + + /* + * it is impossible to fail after the workgroup is freezed, + * however in order to avoid some race conditions, add a + * DBG_BUGON to observe this in advance. + */ + entry = radix_tree_delete(&sbi->workstn_tree, grp->index); + DBG_BUGON((void *)((unsigned long)entry & + ~RADIX_TREE_EXCEPTIONAL_ENTRY) != grp); + + /* + * if managed cache is enable, the last refcount + * should indicate the related workstation. + */ + erofs_workgroup_unfreeze_final(grp); + return true; +} + +#else +/* for nocache case, no customized reclaim path at all */ +bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi, + struct erofs_workgroup *grp, + bool cleanup) +{ + int cnt = atomic_read(&grp->refcount); + void *entry; + + DBG_BUGON(cnt <= 0); + DBG_BUGON(cleanup && cnt != 1); + + if (cnt > 1) + return false; + + entry = radix_tree_delete(&sbi->workstn_tree, grp->index); + DBG_BUGON((void *)((unsigned long)entry & + ~RADIX_TREE_EXCEPTIONAL_ENTRY) != grp); + + /* (rarely) could be grabbed again when freeing */ + erofs_workgroup_put(grp); + return true; +} + +#endif + unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi, unsigned long nr_shrink, bool cleanup) @@ -130,44 +219,16 @@ unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi, batch, first_index, PAGEVEC_SIZE); for (i = 0; i < found; ++i) { - int cnt; struct erofs_workgroup *grp = (void *) ((unsigned long)batch[i] & ~RADIX_TREE_EXCEPTIONAL_ENTRY); first_index = grp->index + 1; - cnt = atomic_read(&grp->refcount); - BUG_ON(cnt <= 0); - - if (cleanup) - BUG_ON(cnt != 1); - -#ifndef 
EROFS_FS_HAS_MANAGED_CACHE - else if (cnt > 1) -#else - if (!erofs_workgroup_try_to_freeze(grp, 1)) -#endif + /* try to shrink each valid workgroup */ + if (!erofs_try_to_release_workgroup(sbi, grp, cleanup)) continue; - if (radix_tree_delete(&sbi->workstn_tree, - grp->index) != grp) { -#ifdef EROFS_FS_HAS_MANAGED_CACHE -skip: - erofs_workgroup_unfreeze(grp, 1); -#endif - continue; - } - -#ifdef EROFS_FS_HAS_MANAGED_CACHE - if (erofs_try_to_free_all_cached_pages(sbi, grp)) - goto skip; - - erofs_workgroup_unfreeze(grp, 1); -#endif - /* (rarely) grabbed again when freeing */ - erofs_workgroup_put(grp); - ++freed; if (unlikely(!--nr_shrink)) break; diff --git a/drivers/staging/erofs/xattr.c b/drivers/staging/erofs/xattr.c index 0e9cfeccdf99..2db99cff3c99 100644 --- a/drivers/staging/erofs/xattr.c +++ b/drivers/staging/erofs/xattr.c @@ -24,36 +24,77 @@ struct xattr_iter { static inline void xattr_iter_end(struct xattr_iter *it, bool atomic) { - /* only init_inode_xattrs use non-atomic once */ + /* the only user of kunmap() is 'init_inode_xattrs' */ if (unlikely(!atomic)) kunmap(it->page); else kunmap_atomic(it->kaddr); + unlock_page(it->page); put_page(it->page); } -static void init_inode_xattrs(struct inode *inode) +static inline void xattr_iter_end_final(struct xattr_iter *it) { + if (!it->page) + return; + + xattr_iter_end(it, true); +} + +static int init_inode_xattrs(struct inode *inode) +{ + struct erofs_vnode *const vi = EROFS_V(inode); struct xattr_iter it; unsigned i; struct erofs_xattr_ibody_header *ih; struct erofs_sb_info *sbi; - struct erofs_vnode *vi; bool atomic_map; + int ret = 0; - if (likely(inode_has_inited_xattr(inode))) - return; + /* the most case is that xattrs of this inode are initialized. */ + if (test_bit(EROFS_V_EA_INITED_BIT, &vi->flags)) + return 0; - vi = EROFS_V(inode); - BUG_ON(!vi->xattr_isize); + if (wait_on_bit_lock(&vi->flags, EROFS_V_BL_XATTR_BIT, TASK_KILLABLE)) + return -ERESTARTSYS; + + /* someone has initialized xattrs for us? 
*/ + if (test_bit(EROFS_V_EA_INITED_BIT, &vi->flags)) + goto out_unlock; + + /* + * bypass all xattr operations if ->xattr_isize is not greater than + * sizeof(struct erofs_xattr_ibody_header), in detail: + * 1) it is not enough to contain erofs_xattr_ibody_header then + * ->xattr_isize should be 0 (it means no xattr); + * 2) it is just to contain erofs_xattr_ibody_header, which is on-disk + * undefined right now (maybe use later with some new sb feature). + */ + if (vi->xattr_isize == sizeof(struct erofs_xattr_ibody_header)) { + errln("xattr_isize %d of nid %llu is not supported yet", + vi->xattr_isize, vi->nid); + ret = -ENOTSUPP; + goto out_unlock; + } else if (vi->xattr_isize < sizeof(struct erofs_xattr_ibody_header)) { + if (unlikely(vi->xattr_isize)) { + DBG_BUGON(1); + ret = -EIO; + goto out_unlock; /* xattr ondisk layout error */ + } + ret = -ENOATTR; + goto out_unlock; + } sbi = EROFS_I_SB(inode); it.blkaddr = erofs_blknr(iloc(sbi, vi->nid) + vi->inode_isize); it.ofs = erofs_blkoff(iloc(sbi, vi->nid) + vi->inode_isize); it.page = erofs_get_inline_page(inode, it.blkaddr); - BUG_ON(IS_ERR(it.page)); + if (IS_ERR(it.page)) { + ret = PTR_ERR(it.page); + goto out_unlock; + } /* read in shared xattr array (non-atomic, see kmalloc below) */ it.kaddr = kmap(it.page); @@ -62,9 +103,13 @@ static void init_inode_xattrs(struct inode *inode) ih = (struct erofs_xattr_ibody_header *)(it.kaddr + it.ofs); vi->xattr_shared_count = ih->h_shared_count; - vi->xattr_shared_xattrs = (unsigned *)kmalloc_array( - vi->xattr_shared_count, sizeof(unsigned), - GFP_KERNEL | __GFP_NOFAIL); + vi->xattr_shared_xattrs = kmalloc_array(vi->xattr_shared_count, + sizeof(uint), GFP_KERNEL); + if (!vi->xattr_shared_xattrs) { + xattr_iter_end(&it, atomic_map); + ret = -ENOMEM; + goto out_unlock; + } /* let's skip ibody header */ it.ofs += sizeof(struct erofs_xattr_ibody_header); @@ -77,7 +122,12 @@ static void init_inode_xattrs(struct inode *inode) it.page = erofs_get_meta_page(inode->i_sb, 
++it.blkaddr, S_ISDIR(inode->i_mode)); - BUG_ON(IS_ERR(it.page)); + if (IS_ERR(it.page)) { + kfree(vi->xattr_shared_xattrs); + vi->xattr_shared_xattrs = NULL; + ret = PTR_ERR(it.page); + goto out_unlock; + } it.kaddr = kmap_atomic(it.page); atomic_map = true; @@ -89,7 +139,11 @@ static void init_inode_xattrs(struct inode *inode) } xattr_iter_end(&it, atomic_map); - inode_set_inited_xattr(inode); + set_bit(EROFS_V_EA_INITED_BIT, &vi->flags); + +out_unlock: + clear_and_wake_up_bit(EROFS_V_BL_XATTR_BIT, &vi->flags); + return ret; } struct xattr_iter_handlers { @@ -99,18 +153,25 @@ struct xattr_iter_handlers { void (*value)(struct xattr_iter *, unsigned, char *, unsigned); }; -static void xattr_iter_fixup(struct xattr_iter *it) +static inline int xattr_iter_fixup(struct xattr_iter *it) { - if (unlikely(it->ofs >= EROFS_BLKSIZ)) { - xattr_iter_end(it, true); + if (it->ofs < EROFS_BLKSIZ) + return 0; - it->blkaddr += erofs_blknr(it->ofs); - it->page = erofs_get_meta_page(it->sb, it->blkaddr, false); - BUG_ON(IS_ERR(it->page)); + xattr_iter_end(it, true); - it->kaddr = kmap_atomic(it->page); - it->ofs = erofs_blkoff(it->ofs); + it->blkaddr += erofs_blknr(it->ofs); + it->page = erofs_get_meta_page(it->sb, it->blkaddr, false); + if (IS_ERR(it->page)) { + int err = PTR_ERR(it->page); + + it->page = NULL; + return err; } + + it->kaddr = kmap_atomic(it->page); + it->ofs = erofs_blkoff(it->ofs); + return 0; } static int inline_xattr_iter_begin(struct xattr_iter *it, @@ -132,21 +193,24 @@ static int inline_xattr_iter_begin(struct xattr_iter *it, it->ofs = erofs_blkoff(iloc(sbi, vi->nid) + inline_xattr_ofs); it->page = erofs_get_inline_page(inode, it->blkaddr); - BUG_ON(IS_ERR(it->page)); - it->kaddr = kmap_atomic(it->page); + if (IS_ERR(it->page)) + return PTR_ERR(it->page); + it->kaddr = kmap_atomic(it->page); return vi->xattr_isize - xattr_header_sz; } static int xattr_foreach(struct xattr_iter *it, - struct xattr_iter_handlers *op, unsigned *tlimit) + const struct 
xattr_iter_handlers *op, unsigned int *tlimit) { struct erofs_xattr_entry entry; unsigned value_sz, processed, slice; int err; /* 0. fixup blkaddr, ofs, ipage */ - xattr_iter_fixup(it); + err = xattr_iter_fixup(it); + if (err) + return err; /* * 1. read xattr entry to the memory, @@ -178,7 +242,9 @@ static int xattr_foreach(struct xattr_iter *it, if (it->ofs >= EROFS_BLKSIZ) { BUG_ON(it->ofs > EROFS_BLKSIZ); - xattr_iter_fixup(it); + err = xattr_iter_fixup(it); + if (err) + goto out; it->ofs = 0; } @@ -210,7 +276,10 @@ static int xattr_foreach(struct xattr_iter *it, while (processed < value_sz) { if (it->ofs >= EROFS_BLKSIZ) { BUG_ON(it->ofs > EROFS_BLKSIZ); - xattr_iter_fixup(it); + + err = xattr_iter_fixup(it); + if (err) + goto out; it->ofs = 0; } @@ -270,7 +339,7 @@ static void xattr_copyvalue(struct xattr_iter *_it, memcpy(it->buffer + processed, buf, len); } -static struct xattr_iter_handlers find_xattr_handlers = { +static const struct xattr_iter_handlers find_xattr_handlers = { .entry = xattr_entrymatch, .name = xattr_namematch, .alloc_buffer = xattr_checkbuffer, @@ -291,8 +360,11 @@ static int inline_getxattr(struct inode *inode, struct getxattr_iter *it) ret = xattr_foreach(&it->it, &find_xattr_handlers, &remaining); if (ret >= 0) break; + + if (ret != -ENOATTR) /* -ENOMEM, -EIO, etc. */ + break; } - xattr_iter_end(&it->it, true); + xattr_iter_end_final(&it->it); return ret < 0 ? 
ret : it->buffer_size; } @@ -315,8 +387,10 @@ static int shared_getxattr(struct inode *inode, struct getxattr_iter *it) xattr_iter_end(&it->it, true); it->it.page = erofs_get_meta_page(inode->i_sb, - blkaddr, false); - BUG_ON(IS_ERR(it->it.page)); + blkaddr, false); + if (IS_ERR(it->it.page)) + return PTR_ERR(it->it.page); + it->it.kaddr = kmap_atomic(it->it.page); it->it.blkaddr = blkaddr; } @@ -324,9 +398,12 @@ static int shared_getxattr(struct inode *inode, struct getxattr_iter *it) ret = xattr_foreach(&it->it, &find_xattr_handlers, NULL); if (ret >= 0) break; + + if (ret != -ENOATTR) /* -ENOMEM, -EIO, etc. */ + break; } if (vi->xattr_shared_count) - xattr_iter_end(&it->it, true); + xattr_iter_end_final(&it->it); return ret < 0 ? ret : it->buffer_size; } @@ -351,7 +428,9 @@ int erofs_getxattr(struct inode *inode, int index, if (unlikely(name == NULL)) return -EINVAL; - init_inode_xattrs(inode); + ret = init_inode_xattrs(inode); + if (ret) + return ret; it.index = index; @@ -374,7 +453,6 @@ static int erofs_xattr_generic_get(const struct xattr_handler *handler, struct dentry *unused, struct inode *inode, const char *name, void *buffer, size_t size) { - struct erofs_vnode *const vi = EROFS_V(inode); struct erofs_sb_info *const sbi = EROFS_I_SB(inode); switch (handler->flags) { @@ -392,9 +470,6 @@ static int erofs_xattr_generic_get(const struct xattr_handler *handler, return -EINVAL; } - if (!vi->xattr_isize) - return -ENOATTR; - return erofs_getxattr(inode, handler->flags, name, buffer, size); } @@ -494,7 +569,7 @@ static int xattr_skipvalue(struct xattr_iter *_it, return 1; } -static struct xattr_iter_handlers list_xattr_handlers = { +static const struct xattr_iter_handlers list_xattr_handlers = { .entry = xattr_entrylist, .name = xattr_namelist, .alloc_buffer = xattr_skipvalue, @@ -516,7 +591,7 @@ static int inline_listxattr(struct listxattr_iter *it) if (ret < 0) break; } - xattr_iter_end(&it->it, true); + xattr_iter_end_final(&it->it); return ret < 0 ? 
ret : it->buffer_ofs; } @@ -538,8 +613,10 @@ static int shared_listxattr(struct listxattr_iter *it) xattr_iter_end(&it->it, true); it->it.page = erofs_get_meta_page(inode->i_sb, - blkaddr, false); - BUG_ON(IS_ERR(it->it.page)); + blkaddr, false); + if (IS_ERR(it->it.page)) + return PTR_ERR(it->it.page); + it->it.kaddr = kmap_atomic(it->it.page); it->it.blkaddr = blkaddr; } @@ -549,7 +626,7 @@ static int shared_listxattr(struct listxattr_iter *it) break; } if (vi->xattr_shared_count) - xattr_iter_end(&it->it, true); + xattr_iter_end_final(&it->it); return ret < 0 ? ret : it->buffer_ofs; } @@ -560,7 +637,9 @@ ssize_t erofs_listxattr(struct dentry *dentry, int ret; struct listxattr_iter it; - init_inode_xattrs(d_inode(dentry)); + ret = init_inode_xattrs(d_inode(dentry)); + if (ret) + return ret; it.dentry = dentry; it.buffer = buffer; diff --git a/drivers/staging/wilc1000/linux_wlan.c b/drivers/staging/wilc1000/linux_wlan.c index 3b8d237decbf..649caae2b603 100644 --- a/drivers/staging/wilc1000/linux_wlan.c +++ b/drivers/staging/wilc1000/linux_wlan.c @@ -1090,8 +1090,8 @@ int wilc_netdev_init(struct wilc **wilc, struct device *dev, int io_type, vif->wilc = *wilc; vif->ndev = ndev; wl->vif[i] = vif; - wl->vif_num = i; - vif->idx = wl->vif_num; + wl->vif_num = i + 1; + vif->idx = i; ndev->netdev_ops = &wilc_netdev_ops; diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c index 09bf6b4b741b..1493d0fdf5ad 100644 --- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -187,6 +187,7 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci) xhci->quirks |= XHCI_SSIC_PORT_UNUSED; if (pdev->vendor == PCI_VENDOR_ID_INTEL && (pdev->device == PCI_DEVICE_ID_INTEL_CHERRYVIEW_XHCI || + pdev->device == PCI_DEVICE_ID_INTEL_SUNRISEPOINT_LP_XHCI || pdev->device == PCI_DEVICE_ID_INTEL_APL_XHCI)) xhci->quirks |= XHCI_INTEL_USB_ROLE_SW; if (pdev->vendor == PCI_VENDOR_ID_INTEL && diff --git a/drivers/usb/phy/Kconfig b/drivers/usb/phy/Kconfig 
index 1a0cf5dcab77..f87c991b179c 100644 --- a/drivers/usb/phy/Kconfig +++ b/drivers/usb/phy/Kconfig @@ -21,7 +21,7 @@ config AB8500_USB config FSL_USB2_OTG bool "Freescale USB OTG Transceiver Driver" - depends on USB_EHCI_FSL && USB_FSL_USB2 && USB_OTG_FSM && PM + depends on USB_EHCI_FSL && USB_FSL_USB2 && USB_OTG_FSM=y && PM depends on USB_GADGET || !USB_GADGET # if USB_GADGET=m, this can't be 'y' select USB_PHY help diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c index c0777a374a88..4c66edf533fe 100644 --- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -61,6 +61,7 @@ static const struct usb_device_id id_table[] = { { USB_DEVICE(0x08e6, 0x5501) }, /* Gemalto Prox-PU/CU contactless smartcard reader */ { USB_DEVICE(0x08FD, 0x000A) }, /* Digianswer A/S , ZigBee/802.15.4 MAC Device */ { USB_DEVICE(0x0908, 0x01FF) }, /* Siemens RUGGEDCOM USB Serial Console */ + { USB_DEVICE(0x0B00, 0x3070) }, /* Ingenico 3070 */ { USB_DEVICE(0x0BED, 0x1100) }, /* MEI (TM) Cashflow-SC Bill/Voucher Acceptor */ { USB_DEVICE(0x0BED, 0x1101) }, /* MEI series 2000 Combo Acceptor */ { USB_DEVICE(0x0FCF, 0x1003) }, /* Dynastream ANT development board */ @@ -1353,8 +1354,13 @@ static int cp210x_gpio_get(struct gpio_chip *gc, unsigned int gpio) if (priv->partnum == CP210X_PARTNUM_CP2105) req_type = REQTYPE_INTERFACE_TO_HOST; + result = usb_autopm_get_interface(serial->interface); + if (result) + return result; + result = cp210x_read_vendor_block(serial, req_type, CP210X_READ_LATCH, &buf, sizeof(buf)); + usb_autopm_put_interface(serial->interface); if (result < 0) return result; @@ -1375,6 +1381,10 @@ static void cp210x_gpio_set(struct gpio_chip *gc, unsigned int gpio, int value) buf.mask = BIT(gpio); + result = usb_autopm_get_interface(serial->interface); + if (result) + goto out; + if (priv->partnum == CP210X_PARTNUM_CP2105) { result = cp210x_write_vendor_block(serial, REQTYPE_HOST_TO_INTERFACE, @@ -1392,6 +1402,8 @@ static void cp210x_gpio_set(struct 
gpio_chip *gc, unsigned int gpio, int value) NULL, 0, USB_CTRL_SET_TIMEOUT); } + usb_autopm_put_interface(serial->interface); +out: if (result < 0) { dev_err(&serial->interface->dev, "failed to set GPIO value: %d\n", result); diff --git a/drivers/usb/serial/ftdi_sio.c b/drivers/usb/serial/ftdi_sio.c index b5cef322826f..1d8077e880a0 100644 --- a/drivers/usb/serial/ftdi_sio.c +++ b/drivers/usb/serial/ftdi_sio.c @@ -1015,6 +1015,8 @@ static const struct usb_device_id id_table_combined[] = { { USB_DEVICE(CYPRESS_VID, CYPRESS_WICED_BT_USB_PID) }, { USB_DEVICE(CYPRESS_VID, CYPRESS_WICED_WL_USB_PID) }, { USB_DEVICE(AIRBUS_DS_VID, AIRBUS_DS_P8GR) }, + /* EZPrototypes devices */ + { USB_DEVICE(EZPROTOTYPES_VID, HJELMSLUND_USB485_ISO_PID) }, { } /* Terminating entry */ }; diff --git a/drivers/usb/serial/ftdi_sio_ids.h b/drivers/usb/serial/ftdi_sio_ids.h index 975d02666c5a..b863bedb55a1 100644 --- a/drivers/usb/serial/ftdi_sio_ids.h +++ b/drivers/usb/serial/ftdi_sio_ids.h @@ -1308,6 +1308,12 @@ #define IONICS_VID 0x1c0c #define IONICS_PLUGCOMPUTER_PID 0x0102 +/* + * EZPrototypes (PID reseller) + */ +#define EZPROTOTYPES_VID 0x1c40 +#define HJELMSLUND_USB485_ISO_PID 0x0477 + /* * Dresden Elektronik Sensor Terminal Board */ diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c index fb544340888b..faf833e8f557 100644 --- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -1148,6 +1148,8 @@ static const struct usb_device_id option_ids[] = { .driver_info = NCTRL(0) | RSVD(1) | RSVD(3) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_ME910_DUAL_MODEM), .driver_info = NCTRL(0) | RSVD(3) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1102, 0xff), /* Telit ME910 (ECM) */ + .driver_info = NCTRL(0) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_LE910), .driver_info = NCTRL(0) | RSVD(1) | RSVD(2) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_LE910_USBCFG4), diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index fa93f6711d8d..e440f87ae1d6 
100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -642,7 +642,7 @@ static int vhost_vsock_set_cid(struct vhost_vsock *vsock, u64 guest_cid) hash_del_rcu(&vsock->hash); vsock->guest_cid = guest_cid; - hash_add_rcu(vhost_vsock_hash, &vsock->hash, guest_cid); + hash_add_rcu(vhost_vsock_hash, &vsock->hash, vsock->guest_cid); spin_unlock_bh(&vhost_vsock_lock); return 0; diff --git a/fs/aio.c b/fs/aio.c index 44551d96eaa4..45d5ef8dd0a8 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1661,6 +1661,7 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync, struct poll_iocb *req = container_of(wait, struct poll_iocb, wait); struct aio_kiocb *iocb = container_of(req, struct aio_kiocb, poll); __poll_t mask = key_to_poll(key); + unsigned long flags; req->woken = true; @@ -1669,10 +1670,15 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync, if (!(mask & req->events)) return 0; - /* try to complete the iocb inline if we can: */ - if (spin_trylock(&iocb->ki_ctx->ctx_lock)) { + /* + * Try to complete the iocb inline if we can. Use + * irqsave/irqrestore because not all filesystems (e.g. fuse) + * call this function with IRQs disabled and because IRQs + * have to be disabled before ctx_lock is obtained. 
+ */ + if (spin_trylock_irqsave(&iocb->ki_ctx->ctx_lock, flags)) { list_del(&iocb->ki_list); - spin_unlock(&iocb->ki_ctx->ctx_lock); + spin_unlock_irqrestore(&iocb->ki_ctx->ctx_lock, flags); list_del_init(&req->wait.entry); aio_poll_complete(iocb, mask); diff --git a/fs/autofs/expire.c b/fs/autofs/expire.c index d441244b79df..28d9c2b1b3bb 100644 --- a/fs/autofs/expire.c +++ b/fs/autofs/expire.c @@ -596,7 +596,6 @@ int autofs_expire_run(struct super_block *sb, pkt.len = dentry->d_name.len; memcpy(pkt.name, dentry->d_name.name, pkt.len); pkt.name[pkt.len] = '\0'; - dput(dentry); if (copy_to_user(pkt_p, &pkt, sizeof(struct autofs_packet_expire))) ret = -EFAULT; @@ -609,6 +608,8 @@ int autofs_expire_run(struct super_block *sb, complete_all(&ino->expire_complete); spin_unlock(&sbi->fs_lock); + dput(dentry); + return ret; } diff --git a/fs/autofs/inode.c b/fs/autofs/inode.c index 846c052569dd..3c14a8e45ffb 100644 --- a/fs/autofs/inode.c +++ b/fs/autofs/inode.c @@ -255,8 +255,10 @@ int autofs_fill_super(struct super_block *s, void *data, int silent) } root_inode = autofs_get_inode(s, S_IFDIR | 0755); root = d_make_root(root_inode); - if (!root) + if (!root) { + ret = -ENOMEM; goto fail_ino; + } pipe = NULL; root->d_fsdata = ino; diff --git a/fs/buffer.c b/fs/buffer.c index 6f1ae3ac9789..c083c4b3c1e7 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -200,6 +200,7 @@ __find_get_block_slow(struct block_device *bdev, sector_t block) struct buffer_head *head; struct page *page; int all_mapped = 1; + static DEFINE_RATELIMIT_STATE(last_warned, HZ, 1); index = block >> (PAGE_SHIFT - bd_inode->i_blkbits); page = find_get_page_flags(bd_mapping, index, FGP_ACCESSED); @@ -227,15 +228,15 @@ __find_get_block_slow(struct block_device *bdev, sector_t block) * file io on the block device and getblk. It gets dealt with * elsewhere, don't buffer_error if we had some unmapped buffers */ - if (all_mapped) { - printk("__find_get_block_slow() failed. 
" - "block=%llu, b_blocknr=%llu\n", - (unsigned long long)block, - (unsigned long long)bh->b_blocknr); - printk("b_state=0x%08lx, b_size=%zu\n", - bh->b_state, bh->b_size); - printk("device %pg blocksize: %d\n", bdev, - 1 << bd_inode->i_blkbits); + ratelimit_set_flags(&last_warned, RATELIMIT_MSG_ON_RELEASE); + if (all_mapped && __ratelimit(&last_warned)) { + printk("__find_get_block_slow() failed. block=%llu, " + "b_blocknr=%llu, b_state=0x%08lx, b_size=%zu, " + "device %pg blocksize: %d\n", + (unsigned long long)block, + (unsigned long long)bh->b_blocknr, + bh->b_state, bh->b_size, bdev, + 1 << bd_inode->i_blkbits); } out_unlock: spin_unlock(&bd_mapping->private_lock); diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c index 1e5a1171212f..a2d701775c49 100644 --- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -2243,10 +2243,12 @@ SMB2_open_free(struct smb_rqst *rqst) { int i; - cifs_small_buf_release(rqst->rq_iov[0].iov_base); - for (i = 1; i < rqst->rq_nvec; i++) - if (rqst->rq_iov[i].iov_base != smb2_padding) - kfree(rqst->rq_iov[i].iov_base); + if (rqst && rqst->rq_iov) { + cifs_small_buf_release(rqst->rq_iov[0].iov_base); + for (i = 1; i < rqst->rq_nvec; i++) + if (rqst->rq_iov[i].iov_base != smb2_padding) + kfree(rqst->rq_iov[i].iov_base); + } } int @@ -2535,7 +2537,8 @@ SMB2_close_init(struct cifs_tcon *tcon, struct smb_rqst *rqst, void SMB2_close_free(struct smb_rqst *rqst) { - cifs_small_buf_release(rqst->rq_iov[0].iov_base); /* request */ + if (rqst && rqst->rq_iov) + cifs_small_buf_release(rqst->rq_iov[0].iov_base); /* request */ } int @@ -2685,7 +2688,8 @@ SMB2_query_info_init(struct cifs_tcon *tcon, struct smb_rqst *rqst, void SMB2_query_info_free(struct smb_rqst *rqst) { - cifs_small_buf_release(rqst->rq_iov[0].iov_base); /* request */ + if (rqst && rqst->rq_iov) + cifs_small_buf_release(rqst->rq_iov[0].iov_base); /* request */ } static int diff --git a/fs/cifs/smb2pdu.h b/fs/cifs/smb2pdu.h index 8fb7887f2b3d..437257d1116f 100644 --- 
a/fs/cifs/smb2pdu.h +++ b/fs/cifs/smb2pdu.h @@ -84,8 +84,8 @@ #define NUMBER_OF_SMB2_COMMANDS 0x0013 -/* 4 len + 52 transform hdr + 64 hdr + 56 create rsp */ -#define MAX_SMB2_HDR_SIZE 0x00b0 +/* 52 transform hdr + 64 hdr + 88 create rsp */ +#define MAX_SMB2_HDR_SIZE 204 #define SMB2_PROTO_NUMBER cpu_to_le32(0x424d53fe) #define SMB2_TRANSFORM_PROTO_NUM cpu_to_le32(0x424d53fd) diff --git a/fs/drop_caches.c b/fs/drop_caches.c index 82377017130f..d31b6c72b476 100644 --- a/fs/drop_caches.c +++ b/fs/drop_caches.c @@ -21,8 +21,13 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused) spin_lock(&sb->s_inode_list_lock); list_for_each_entry(inode, &sb->s_inodes, i_sb_list) { spin_lock(&inode->i_lock); + /* + * We must skip inodes in unusual state. We may also skip + * inodes without pages but we deliberately won't in case + * we need to reschedule to avoid softlockups. + */ if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) || - (inode->i_mapping->nrpages == 0)) { + (inode->i_mapping->nrpages == 0 && !need_resched())) { spin_unlock(&inode->i_lock); continue; } @@ -30,6 +35,7 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused) spin_unlock(&inode->i_lock); spin_unlock(&sb->s_inode_list_lock); + cond_resched(); invalidate_mapping_pages(inode->i_mapping, 0, -1); iput(toput_inode); toput_inode = inode; diff --git a/fs/exec.c b/fs/exec.c index c7e3417a10ae..77c03ceb3f3c 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -929,7 +929,7 @@ int kernel_read_file(struct file *file, void **buf, loff_t *size, bytes = kernel_read(file, *buf + pos, i_size - pos, &pos); if (bytes < 0) { ret = bytes; - goto out; + goto out_free; } if (bytes == 0) diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c index 4614ee25f621..9d566e62684c 100644 --- a/fs/gfs2/glock.c +++ b/fs/gfs2/glock.c @@ -107,7 +107,7 @@ static int glock_wake_function(wait_queue_entry_t *wait, unsigned int mode, static wait_queue_head_t *glock_waitqueue(struct lm_lockname *name) { - u32 hash = jhash2((u32 
*)name, sizeof(*name) / 4, 0); + u32 hash = jhash2((u32 *)name, ht_parms.key_len / 4, 0); return glock_wait_table + hash_32(hash, GLOCK_WAIT_TABLE_BITS); } diff --git a/fs/iomap.c b/fs/iomap.c index e57fb1e534c5..fac45206418a 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -117,6 +117,12 @@ iomap_page_create(struct inode *inode, struct page *page) atomic_set(&iop->read_count, 0); atomic_set(&iop->write_count, 0); bitmap_zero(iop->uptodate, PAGE_SIZE / SECTOR_SIZE); + + /* + * migrate_page_move_mapping() assumes that pages with private data have + * their count elevated by 1. + */ + get_page(page); set_page_private(page, (unsigned long)iop); SetPagePrivate(page); return iop; @@ -133,6 +139,7 @@ iomap_page_release(struct page *page) WARN_ON_ONCE(atomic_read(&iop->write_count)); ClearPagePrivate(page); set_page_private(page, 0); + put_page(page); kfree(iop); } @@ -565,8 +572,10 @@ iomap_migrate_page(struct address_space *mapping, struct page *newpage, if (page_has_private(page)) { ClearPagePrivate(page); + get_page(newpage); set_page_private(newpage, page_private(page)); set_page_private(page, 0); + put_page(page); SetPagePrivate(newpage); } @@ -1778,6 +1787,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, loff_t pos = iocb->ki_pos, start = pos; loff_t end = iocb->ki_pos + count - 1, ret = 0; unsigned int flags = IOMAP_DIRECT; + bool wait_for_completion = is_sync_kiocb(iocb); struct blk_plug plug; struct iomap_dio *dio; @@ -1797,7 +1807,6 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, dio->end_io = end_io; dio->error = 0; dio->flags = 0; - dio->wait_for_completion = is_sync_kiocb(iocb); dio->submit.iter = iter; dio->submit.waiter = current; @@ -1852,7 +1861,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, dio_warn_stale_pagecache(iocb->ki_filp); ret = 0; - if (iov_iter_rw(iter) == WRITE && !dio->wait_for_completion && + if (iov_iter_rw(iter) == WRITE && !wait_for_completion && !inode->i_sb->s_dio_done_wq) { ret = 
sb_init_dio_done_wq(inode->i_sb); if (ret < 0) @@ -1868,7 +1877,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, if (ret <= 0) { /* magic error code to fall back to buffered I/O */ if (ret == -ENOTBLK) { - dio->wait_for_completion = true; + wait_for_completion = true; ret = 0; } break; @@ -1890,8 +1899,24 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, if (dio->flags & IOMAP_DIO_WRITE_FUA) dio->flags &= ~IOMAP_DIO_NEED_SYNC; + /* + * We are about to drop our additional submission reference, which + * might be the last reference to the dio. There are three + * different ways we can progress here: + * + * (a) If this is the last reference we will always complete and free + * the dio ourselves. + * (b) If this is not the last reference, and we serve an asynchronous + * iocb, we must never touch the dio after the decrement, the + * I/O completion handler will complete and free it. + * (c) If this is not the last reference, but we serve a synchronous + * iocb, the I/O completion handler will wake us up on the drop + * of the final reference, and we will complete and free it here + * after we got woken by the I/O completion handler. + */ + dio->wait_for_completion = wait_for_completion; if (!atomic_dec_and_test(&dio->ref)) { - if (!dio->wait_for_completion) + if (!wait_for_completion) return -EIOCBQUEUED; for (;;) { @@ -1908,9 +1933,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, __set_current_state(TASK_RUNNING); } - ret = iomap_dio_complete(dio); - - return ret; + return iomap_dio_complete(dio); out_free_dio: kfree(dio); diff --git a/fs/nfs/super.c b/fs/nfs/super.c index 5ef2c71348bd..6b666d187907 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -1906,6 +1906,11 @@ static int nfs_parse_devname(const char *dev_name, size_t len; char *end; + if (unlikely(!dev_name || !*dev_name)) { + dfprintk(MOUNT, "NFS: device name not specified\n"); + return -EINVAL; + } + /* Is the host name protected with square brakcets? 
*/ if (*dev_name == '[') { end = strchr(++dev_name, ']'); diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c index 1cc797a08a5b..e6b5d6216302 100644 --- a/fs/overlayfs/copy_up.c +++ b/fs/overlayfs/copy_up.c @@ -829,7 +829,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags) dput(parent); dput(next); } - revert_creds(old_cred); + ovl_revert_creds(old_cred); return err; } diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c index b2aadd3e1fec..2e4af5f46195 100644 --- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -567,7 +567,8 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode, override_cred->fsgid = inode->i_gid; if (!attr->hardlink) { err = security_dentry_create_files_as(dentry, - attr->mode, &dentry->d_name, old_cred, + attr->mode, &dentry->d_name, + old_cred ? old_cred : current_cred(), override_cred); if (err) { put_cred(override_cred); @@ -583,7 +584,7 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode, err = ovl_create_over_whiteout(dentry, inode, attr); } out_revert_creds: - revert_creds(old_cred); + ovl_revert_creds(old_cred); return err; } @@ -659,7 +660,7 @@ static int ovl_set_link_redirect(struct dentry *dentry) old_cred = ovl_override_creds(dentry->d_sb); err = ovl_set_redirect(dentry, false); - revert_creds(old_cred); + ovl_revert_creds(old_cred); return err; } @@ -857,7 +858,7 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir) err = ovl_remove_upper(dentry, is_dir, &list); else err = ovl_remove_and_whiteout(dentry, &list); - revert_creds(old_cred); + ovl_revert_creds(old_cred); if (!err) { if (is_dir) clear_nlink(dentry->d_inode); @@ -1225,7 +1226,7 @@ static int ovl_rename(struct inode *olddir, struct dentry *old, out_unlock: unlock_rename(new_upperdir, old_upperdir); out_revert_creds: - revert_creds(old_cred); + ovl_revert_creds(old_cred); ovl_nlink_end(new, locked); out_drop_write: ovl_drop_write(old); diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c index 
986313da0c88..da7d7856fcda 100644 --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -33,7 +33,7 @@ static struct file *ovl_open_realfile(const struct file *file, old_cred = ovl_override_creds(inode->i_sb); realfile = open_with_fake_path(&file->f_path, file->f_flags | O_NOATIME, realinode, current_cred()); - revert_creds(old_cred); + ovl_revert_creds(old_cred); pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n", file, file, ovl_whatisit(inode, realinode), file->f_flags, @@ -208,7 +208,7 @@ static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter) old_cred = ovl_override_creds(file_inode(file)->i_sb); ret = vfs_iter_read(real.file, iter, &iocb->ki_pos, ovl_iocb_to_rwf(iocb)); - revert_creds(old_cred); + ovl_revert_creds(old_cred); ovl_file_accessed(file); @@ -244,7 +244,7 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter) ret = vfs_iter_write(real.file, iter, &iocb->ki_pos, ovl_iocb_to_rwf(iocb)); file_end_write(real.file); - revert_creds(old_cred); + ovl_revert_creds(old_cred); /* Update size */ ovl_copyattr(ovl_inode_real(inode), inode); @@ -271,7 +271,7 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync) if (file_inode(real.file) == ovl_inode_upper(file_inode(file))) { old_cred = ovl_override_creds(file_inode(file)->i_sb); ret = vfs_fsync_range(real.file, start, end, datasync); - revert_creds(old_cred); + ovl_revert_creds(old_cred); } fdput(real); @@ -295,7 +295,7 @@ static int ovl_mmap(struct file *file, struct vm_area_struct *vma) old_cred = ovl_override_creds(file_inode(file)->i_sb); ret = call_mmap(vma->vm_file, vma); - revert_creds(old_cred); + ovl_revert_creds(old_cred); if (ret) { /* Drop reference count from new vm_file value */ @@ -323,7 +323,7 @@ static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len old_cred = ovl_override_creds(file_inode(file)->i_sb); ret = vfs_fallocate(real.file, mode, offset, len); - revert_creds(old_cred); + 
ovl_revert_creds(old_cred); /* Update size */ ovl_copyattr(ovl_inode_real(inode), inode); @@ -345,7 +345,7 @@ static int ovl_fadvise(struct file *file, loff_t offset, loff_t len, int advice) old_cred = ovl_override_creds(file_inode(file)->i_sb); ret = vfs_fadvise(real.file, offset, len, advice); - revert_creds(old_cred); + ovl_revert_creds(old_cred); fdput(real); @@ -365,7 +365,7 @@ static long ovl_real_ioctl(struct file *file, unsigned int cmd, old_cred = ovl_override_creds(file_inode(file)->i_sb); ret = vfs_ioctl(real.file, cmd, arg); - revert_creds(old_cred); + ovl_revert_creds(old_cred); fdput(real); @@ -470,7 +470,7 @@ static ssize_t ovl_copyfile(struct file *file_in, loff_t pos_in, real_out.file, pos_out, len); break; } - revert_creds(old_cred); + ovl_revert_creds(old_cred); /* Update size */ ovl_copyattr(ovl_inode_real(inode_out), inode_out); diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c index 3b7ed5d2279c..b3c61267c8b5 100644 --- a/fs/overlayfs/inode.c +++ b/fs/overlayfs/inode.c @@ -64,7 +64,7 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr) inode_lock(upperdentry->d_inode); old_cred = ovl_override_creds(dentry->d_sb); err = notify_change(upperdentry, attr, NULL); - revert_creds(old_cred); + ovl_revert_creds(old_cred); if (!err) ovl_copyattr(upperdentry->d_inode, dentry->d_inode); inode_unlock(upperdentry->d_inode); @@ -260,7 +260,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat, stat->nlink = dentry->d_inode->i_nlink; out: - revert_creds(old_cred); + ovl_revert_creds(old_cred); return err; } @@ -294,7 +294,7 @@ int ovl_permission(struct inode *inode, int mask) mask |= MAY_READ; } err = inode_permission(realinode, mask); - revert_creds(old_cred); + ovl_revert_creds(old_cred); return err; } @@ -311,7 +311,7 @@ static const char *ovl_get_link(struct dentry *dentry, old_cred = ovl_override_creds(dentry->d_sb); p = vfs_get_link(ovl_dentry_real(dentry), done); - revert_creds(old_cred); + ovl_revert_creds(old_cred); return 
p; } @@ -354,7 +354,7 @@ int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char *name, WARN_ON(flags != XATTR_REPLACE); err = vfs_removexattr(realdentry, name); } - revert_creds(old_cred); + ovl_revert_creds(old_cred); /* copy c/mtime */ ovl_copyattr(d_inode(realdentry), inode); @@ -375,7 +375,7 @@ int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name, old_cred = ovl_override_creds(dentry->d_sb); res = vfs_getxattr(realdentry, name, value, size); - revert_creds(old_cred); + ovl_revert_creds(old_cred); return res; } @@ -399,7 +399,7 @@ ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size) old_cred = ovl_override_creds(dentry->d_sb); res = vfs_listxattr(realdentry, list, size); - revert_creds(old_cred); + ovl_revert_creds(old_cred); if (res <= 0 || size == 0) return res; @@ -434,7 +434,7 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type) old_cred = ovl_override_creds(inode->i_sb); acl = get_acl(realinode, type); - revert_creds(old_cred); + ovl_revert_creds(old_cred); return acl; } @@ -472,7 +472,7 @@ static int ovl_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, filemap_write_and_wait(realinode->i_mapping); err = realinode->i_op->fiemap(realinode, fieinfo, start, len); - revert_creds(old_cred); + ovl_revert_creds(old_cred); return err; } diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c index efd372312ef1..2fd199e9be97 100644 --- a/fs/overlayfs/namei.c +++ b/fs/overlayfs/namei.c @@ -1069,7 +1069,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, goto out_free_oe; } - revert_creds(old_cred); + ovl_revert_creds(old_cred); if (origin_path) { dput(origin_path->dentry); kfree(origin_path); @@ -1096,7 +1096,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, kfree(upperredirect); out: kfree(d.redirect); - revert_creds(old_cred); + ovl_revert_creds(old_cred); return ERR_PTR(err); } @@ -1150,7 +1150,7 @@ bool ovl_lower_positive(struct 
dentry *dentry) dput(this); } } - revert_creds(old_cred); + ovl_revert_creds(old_cred); return positive; } diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h index a3c0d9584312..552a19abc155 100644 --- a/fs/overlayfs/overlayfs.h +++ b/fs/overlayfs/overlayfs.h @@ -208,6 +208,7 @@ int ovl_want_write(struct dentry *dentry); void ovl_drop_write(struct dentry *dentry); struct dentry *ovl_workdir(struct dentry *dentry); const struct cred *ovl_override_creds(struct super_block *sb); +void ovl_revert_creds(const struct cred *oldcred); struct super_block *ovl_same_sb(struct super_block *sb); int ovl_can_decode_fh(struct super_block *sb); struct dentry *ovl_indexdir(struct super_block *sb); diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h index ec237035333a..e38eea8104be 100644 --- a/fs/overlayfs/ovl_entry.h +++ b/fs/overlayfs/ovl_entry.h @@ -20,6 +20,7 @@ struct ovl_config { bool nfs_export; int xino; bool metacopy; + bool override_creds; }; struct ovl_sb { diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c index cc8303a806b4..ec591b49e902 100644 --- a/fs/overlayfs/readdir.c +++ b/fs/overlayfs/readdir.c @@ -289,7 +289,7 @@ static int ovl_check_whiteouts(struct dentry *dir, struct ovl_readdir_data *rdd) } inode_unlock(dir->d_inode); } - revert_creds(old_cred); + ovl_revert_creds(old_cred); return err; } @@ -921,7 +921,7 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list) old_cred = ovl_override_creds(dentry->d_sb); err = ovl_dir_read_merged(dentry, list, &root); - revert_creds(old_cred); + ovl_revert_creds(old_cred); if (err) return err; diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c index 0fb0a59a5e5c..df7706218987 100644 --- a/fs/overlayfs/super.c +++ b/fs/overlayfs/super.c @@ -56,6 +56,11 @@ module_param_named(xino_auto, ovl_xino_auto_def, bool, 0644); MODULE_PARM_DESC(ovl_xino_auto_def, "Auto enable xino feature"); +static bool __read_mostly ovl_override_creds_def = true; 
+module_param_named(override_creds, ovl_override_creds_def, bool, 0644); +MODULE_PARM_DESC(ovl_override_creds_def, + "Use mounter's credentials for accesses"); + static void ovl_entry_stack_free(struct ovl_entry *oe) { unsigned int i; @@ -362,6 +367,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry) if (ofs->config.metacopy != ovl_metacopy_def) seq_printf(m, ",metacopy=%s", ofs->config.metacopy ? "on" : "off"); + if (ofs->config.override_creds != ovl_override_creds_def) + seq_show_option(m, "override_creds", + ofs->config.override_creds ? "on" : "off"); return 0; } @@ -401,6 +409,8 @@ enum { OPT_XINO_AUTO, OPT_METACOPY_ON, OPT_METACOPY_OFF, + OPT_OVERRIDE_CREDS_ON, + OPT_OVERRIDE_CREDS_OFF, OPT_ERR, }; @@ -419,6 +429,8 @@ static const match_table_t ovl_tokens = { {OPT_XINO_AUTO, "xino=auto"}, {OPT_METACOPY_ON, "metacopy=on"}, {OPT_METACOPY_OFF, "metacopy=off"}, + {OPT_OVERRIDE_CREDS_ON, "override_creds=on"}, + {OPT_OVERRIDE_CREDS_OFF, "override_creds=off"}, {OPT_ERR, NULL} }; @@ -477,6 +489,7 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config) config->redirect_mode = kstrdup(ovl_redirect_mode_def(), GFP_KERNEL); if (!config->redirect_mode) return -ENOMEM; + config->override_creds = ovl_override_creds_def; while ((p = ovl_next_opt(&opt)) != NULL) { int token; @@ -557,6 +570,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config) config->metacopy = false; break; + case OPT_OVERRIDE_CREDS_ON: + config->override_creds = true; + break; + + case OPT_OVERRIDE_CREDS_OFF: + config->override_creds = false; + break; + default: pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p); return -EINVAL; @@ -1521,7 +1542,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent) ovl_dentry_lower(root_dentry), NULL); sb->s_root = root_dentry; - return 0; out_free_oe: diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c index ace4fe4c39a9..470310e68558 100644 --- a/fs/overlayfs/util.c +++ 
b/fs/overlayfs/util.c @@ -40,9 +40,17 @@ const struct cred *ovl_override_creds(struct super_block *sb) { struct ovl_fs *ofs = sb->s_fs_info; + if (!ofs->config.override_creds) + return NULL; return override_creds(ofs->creator_cred); } +void ovl_revert_creds(const struct cred *old_cred) +{ + if (old_cred) + revert_creds(old_cred); +} + struct super_block *ovl_same_sb(struct super_block *sb) { struct ovl_fs *ofs = sb->s_fs_info; @@ -783,7 +791,7 @@ int ovl_nlink_start(struct dentry *dentry, bool *locked) * value relative to the upper inode nlink in an upper inode xattr. */ err = ovl_set_nlink_upper(dentry); - revert_creds(old_cred); + ovl_revert_creds(old_cred); out: if (err) @@ -803,7 +811,7 @@ void ovl_nlink_end(struct dentry *dentry, bool locked) old_cred = ovl_override_creds(dentry->d_sb); ovl_cleanup_index(dentry); - revert_creds(old_cred); + ovl_revert_creds(old_cred); } mutex_unlock(&OVL_I(d_inode(dentry))->lock); diff --git a/fs/proc/Kconfig b/fs/proc/Kconfig index 4d96a7cc7ea8..cad2c60b8656 100644 --- a/fs/proc/Kconfig +++ b/fs/proc/Kconfig @@ -100,7 +100,6 @@ config PROC_CHILDREN config PROC_UID bool "Include /proc/uid/ files" - default y depends on PROC_FS && RT_MUTEXES help Provides aggregated per-uid information under /proc/uid. 
diff --git a/fs/proc/generic.c b/fs/proc/generic.c index 8ae109429a88..e39bac94dead 100644 --- a/fs/proc/generic.c +++ b/fs/proc/generic.c @@ -256,7 +256,7 @@ struct dentry *proc_lookup_de(struct inode *dir, struct dentry *dentry, inode = proc_get_inode(dir->i_sb, de); if (!inode) return ERR_PTR(-ENOMEM); - d_set_d_op(dentry, &proc_misc_dentry_ops); + d_set_d_op(dentry, de->proc_dops); return d_splice_alias(inode, dentry); } read_unlock(&proc_subdir_lock); @@ -429,6 +429,8 @@ static struct proc_dir_entry *__proc_create(struct proc_dir_entry **parent, INIT_LIST_HEAD(&ent->pde_openers); proc_set_user(ent, (*parent)->uid, (*parent)->gid); + ent->proc_dops = &proc_misc_dentry_ops; + out: return ent; } diff --git a/fs/proc/internal.h b/fs/proc/internal.h index c0c7abbd50fa..bacad3e6009b 100644 --- a/fs/proc/internal.h +++ b/fs/proc/internal.h @@ -44,6 +44,7 @@ struct proc_dir_entry { struct completion *pde_unload_completion; const struct inode_operations *proc_iops; const struct file_operations *proc_fops; + const struct dentry_operations *proc_dops; union { const struct seq_operations *seq_ops; int (*single_show)(struct seq_file *, void *); diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c index d5e0fcb3439e..a7b12435519e 100644 --- a/fs/proc/proc_net.c +++ b/fs/proc/proc_net.c @@ -38,6 +38,22 @@ static struct net *get_proc_net(const struct inode *inode) return maybe_get_net(PDE_NET(PDE(inode))); } +static int proc_net_d_revalidate(struct dentry *dentry, unsigned int flags) +{ + return 0; +} + +static const struct dentry_operations proc_net_dentry_ops = { + .d_revalidate = proc_net_d_revalidate, + .d_delete = always_delete_dentry, +}; + +static void pde_force_lookup(struct proc_dir_entry *pde) +{ + /* /proc/net/ entries can be changed under us by setns(CLONE_NEWNET) */ + pde->proc_dops = &proc_net_dentry_ops; +} + static int seq_open_net(struct inode *inode, struct file *file) { unsigned int state_size = PDE(inode)->state_size; @@ -90,6 +106,7 @@ struct 
proc_dir_entry *proc_create_net_data(const char *name, umode_t mode, p = proc_create_reg(name, mode, &parent, data); if (!p) return NULL; + pde_force_lookup(p); p->proc_fops = &proc_net_seq_fops; p->seq_ops = ops; p->state_size = state_size; @@ -133,6 +150,7 @@ struct proc_dir_entry *proc_create_net_data_write(const char *name, umode_t mode p = proc_create_reg(name, mode, &parent, data); if (!p) return NULL; + pde_force_lookup(p); p->proc_fops = &proc_net_seq_fops; p->seq_ops = ops; p->state_size = state_size; @@ -181,6 +199,7 @@ struct proc_dir_entry *proc_create_net_single(const char *name, umode_t mode, p = proc_create_reg(name, mode, &parent, data); if (!p) return NULL; + pde_force_lookup(p); p->proc_fops = &proc_net_single_fops; p->single_show = show; return proc_register(parent, p); @@ -223,6 +242,7 @@ struct proc_dir_entry *proc_create_net_single_write(const char *name, umode_t mo p = proc_create_reg(name, mode, &parent, data); if (!p) return NULL; + pde_force_lookup(p); p->proc_fops = &proc_net_single_fops; p->single_show = show; p->write = write; diff --git a/include/drm/drm_cache.h b/include/drm/drm_cache.h index bfe1639df02d..97fc498dc767 100644 --- a/include/drm/drm_cache.h +++ b/include/drm/drm_cache.h @@ -47,6 +47,24 @@ static inline bool drm_arch_can_wc_memory(void) return false; #elif defined(CONFIG_MIPS) && defined(CONFIG_CPU_LOONGSON3) return false; +#elif defined(CONFIG_ARM) || defined(CONFIG_ARM64) + /* + * The DRM driver stack is designed to work with cache coherent devices + * only, but permits an optimization to be enabled in some cases, where + * for some buffers, both the CPU and the GPU use uncached mappings, + * removing the need for DMA snooping and allocation in the CPU caches. + * + * The use of uncached GPU mappings relies on the correct implementation + * of the PCIe NoSnoop TLP attribute by the platform, otherwise the GPU + * will use cached mappings nonetheless. 
On x86 platforms, this does not + * seem to matter, as uncached CPU mappings will snoop the caches in any + * case. However, on ARM and arm64, enabling this optimization on a + * platform where NoSnoop is ignored results in loss of coherency, which + * breaks correct operation of the device. Since we have no way of + * detecting whether NoSnoop works or not, just disable this + * optimization entirely for ARM and arm64. + */ + return false; #else return true; #endif diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index 68819730da2f..dae98633349e 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -254,20 +254,12 @@ __ATTR(_name, 0644, show_##_name, store_##_name) static struct freq_attr _name = \ __ATTR(_name, 0200, NULL, store_##_name) -struct global_attr { - struct attribute attr; - ssize_t (*show)(struct kobject *kobj, - struct attribute *attr, char *buf); - ssize_t (*store)(struct kobject *a, struct attribute *b, - const char *c, size_t count); -}; - #define define_one_global_ro(_name) \ -static struct global_attr _name = \ +static struct kobj_attribute _name = \ __ATTR(_name, 0444, show_##_name, NULL) #define define_one_global_rw(_name) \ -static struct global_attr _name = \ +static struct kobj_attribute _name = \ __ATTR(_name, 0644, show_##_name, store_##_name) diff --git a/include/linux/cpufreq_times.h b/include/linux/cpufreq_times.h index 757bf0cb6070..0eb6dc9d0fe2 100644 --- a/include/linux/cpufreq_times.h +++ b/include/linux/cpufreq_times.h @@ -27,7 +27,8 @@ int proc_time_in_state_show(struct seq_file *m, struct pid_namespace *ns, struct pid *pid, struct task_struct *p); void cpufreq_acct_update_power(struct task_struct *p, u64 cputime); void cpufreq_times_create_policy(struct cpufreq_policy *policy); -void cpufreq_times_record_transition(struct cpufreq_freqs *freq); +void cpufreq_times_record_transition(struct cpufreq_policy *policy, + unsigned int new_freq); void cpufreq_task_times_remove_uids(uid_t uid_start, uid_t 
uid_end);
 int single_uid_time_in_state_open(struct inode *inode, struct file *file);
 #else
@@ -38,7 +39,7 @@ static inline void cpufreq_acct_update_power(struct task_struct *p,
 	u64 cputime) {}
 static inline void cpufreq_times_create_policy(struct cpufreq_policy *policy) {}
 static inline void cpufreq_times_record_transition(
-	struct cpufreq_freqs *freq) {}
+	struct cpufreq_policy *policy, unsigned int new_freq) {}
 static inline void cpufreq_task_times_remove_uids(uid_t uid_start,
 	uid_t uid_end) {}
 #endif /* CONFIG_CPU_FREQ_TIMES */
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 8bdbb5f29494..3188c0bef3e7 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -319,7 +319,7 @@
 #define GITS_TYPER_PLPIS (1UL << 0)
 #define GITS_TYPER_VLPIS (1UL << 1)
 #define GITS_TYPER_ITT_ENTRY_SIZE_SHIFT 4
-#define GITS_TYPER_ITT_ENTRY_SIZE(r) ((((r) >> GITS_TYPER_ITT_ENTRY_SIZE_SHIFT) & 0x1f) + 1)
+#define GITS_TYPER_ITT_ENTRY_SIZE(r) ((((r) >> GITS_TYPER_ITT_ENTRY_SIZE_SHIFT) & 0xf) + 1)
 #define GITS_TYPER_IDBITS_SHIFT 8
 #define GITS_TYPER_DEVBITS_SHIFT 13
 #define GITS_TYPER_DEVBITS(r) ((((r) >> GITS_TYPER_DEVBITS_SHIFT) & 0x1f) + 1)
diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
index 7ddfc65586b0..4335bd771ce5 100644
--- a/include/linux/stmmac.h
+++ b/include/linux/stmmac.h
@@ -184,6 +184,7 @@ struct plat_stmmacenet_data {
 	struct clk *pclk;
 	struct clk *clk_ptp_ref;
 	unsigned int clk_ptp_rate;
+	unsigned int clk_ref_rate;
 	struct reset_control *stmmac_rst;
 	struct stmmac_axi *axi;
 	int has_gmac4;
diff --git a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h
index ec9d6bc65855..fabee6db0abb 100644
--- a/include/net/bluetooth/bluetooth.h
+++ b/include/net/bluetooth/bluetooth.h
@@ -276,7 +276,7 @@ int bt_sock_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg);
 int bt_sock_wait_state(struct sock *sk, int state, unsigned long timeo);
 int bt_sock_wait_ready(struct
sock *sk, unsigned long flags);
-void bt_accept_enqueue(struct sock *parent, struct sock *sk);
+void bt_accept_enqueue(struct sock *parent, struct sock *sk, bool bh);
 void bt_accept_unlink(struct sock *sk);
 struct sock *bt_accept_dequeue(struct sock *parent, struct socket *newsock);
diff --git a/include/net/icmp.h b/include/net/icmp.h
index 3ef2743a8eec..8665bf24e3b7 100644
--- a/include/net/icmp.h
+++ b/include/net/icmp.h
@@ -22,6 +22,7 @@
 #include
 #include
+#include
 struct icmp_err {
 	int errno;
@@ -39,7 +40,13 @@ struct net_proto_family;
 struct sk_buff;
 struct net;
-void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info);
+void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
+		 const struct ip_options *opt);
+static inline void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info)
+{
+	__icmp_send(skb_in, type, code, info, &IPCB(skb_in)->opt);
+}
+
 int icmp_rcv(struct sk_buff *skb);
 void icmp_err(struct sk_buff *skb, u32 info);
 int icmp_init(void);
diff --git a/include/net/ip.h b/include/net/ip.h
index ddaa2bb55655..0693b82e8ae8 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -641,6 +641,8 @@ static inline int ip_options_echo(struct net *net, struct ip_options *dopt,
 }
 void ip_options_fragment(struct sk_buff *skb);
+int __ip_options_compile(struct net *net, struct ip_options *opt,
+			 struct sk_buff *skb, __be32 *info);
 int ip_options_compile(struct net *net, struct ip_options *opt,
 		       struct sk_buff *skb);
 int ip_options_get(struct net *net, struct ip_options_rcu **optp,
@@ -690,7 +692,7 @@ extern int sysctl_icmp_msgs_burst;
 int ip_misc_proc_init(void);
 #endif
-int rtm_getroute_parse_ip_proto(struct nlattr *attr, u8 *ip_proto,
+int rtm_getroute_parse_ip_proto(struct nlattr *attr, u8 *ip_proto, u8 family,
 				struct netlink_ext_ack *extack);
 #endif /* _IP_H */
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index a6d00093f35e..c44da48de7df 100644
--- a/include/net/sch_generic.h
+++
b/include/net/sch_generic.h
@@ -47,7 +47,10 @@ struct qdisc_size_table {
 struct qdisc_skb_head {
 	struct sk_buff *head;
 	struct sk_buff *tail;
-	__u32 qlen;
+	union {
+		u32 qlen;
+		atomic_t atomic_qlen;
+	};
 	spinlock_t lock;
 };
@@ -384,27 +387,19 @@ static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
 	BUILD_BUG_ON(sizeof(qcb->data) < sz);
 }
-static inline int qdisc_qlen_cpu(const struct Qdisc *q)
-{
-	return this_cpu_ptr(q->cpu_qstats)->qlen;
-}
-
 static inline int qdisc_qlen(const struct Qdisc *q)
 {
 	return q->q.qlen;
 }
-static inline int qdisc_qlen_sum(const struct Qdisc *q)
+static inline u32 qdisc_qlen_sum(const struct Qdisc *q)
 {
-	__u32 qlen = q->qstats.qlen;
-	int i;
+	u32 qlen = q->qstats.qlen;
-	if (q->flags & TCQ_F_NOLOCK) {
-		for_each_possible_cpu(i)
-			qlen += per_cpu_ptr(q->cpu_qstats, i)->qlen;
-	} else {
+	if (q->flags & TCQ_F_NOLOCK)
+		qlen += atomic_read(&q->q.atomic_qlen);
+	else
 		qlen += q->q.qlen;
-	}
 	return qlen;
 }
@@ -776,14 +771,14 @@ static inline void qdisc_qstats_cpu_backlog_inc(struct Qdisc *sch,
 	this_cpu_add(sch->cpu_qstats->backlog, qdisc_pkt_len(skb));
 }
-static inline void qdisc_qstats_cpu_qlen_inc(struct Qdisc *sch)
+static inline void qdisc_qstats_atomic_qlen_inc(struct Qdisc *sch)
 {
-	this_cpu_inc(sch->cpu_qstats->qlen);
+	atomic_inc(&sch->q.atomic_qlen);
 }
-static inline void qdisc_qstats_cpu_qlen_dec(struct Qdisc *sch)
+static inline void qdisc_qstats_atomic_qlen_dec(struct Qdisc *sch)
 {
-	this_cpu_dec(sch->cpu_qstats->qlen);
+	atomic_dec(&sch->q.atomic_qlen);
 }
 static inline void qdisc_qstats_cpu_requeues_inc(struct Qdisc *sch)
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 03cc59ee9c95..cebadd6af4d9 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -677,7 +677,7 @@ static void free_htab_elem(struct bpf_htab *htab, struct htab_elem *l)
 	}
 	if (htab_is_prealloc(htab)) {
-		pcpu_freelist_push(&htab->freelist, &l->fnode);
+		__pcpu_freelist_push(&htab->freelist, &l->fnode);
 	} else {
 		atomic_dec(&htab->count);
 		l->htab = htab;
@@ -739,7 +739,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 		} else {
 			struct pcpu_freelist_node *l;
-			l = pcpu_freelist_pop(&htab->freelist);
+			l = __pcpu_freelist_pop(&htab->freelist);
 			if (!l)
 				return ERR_PTR(-E2BIG);
 			l_new = container_of(l, struct htab_elem, fnode);
diff --git a/kernel/bpf/percpu_freelist.c b/kernel/bpf/percpu_freelist.c
index 673fa6fe2d73..0c1b4ba9e90e 100644
--- a/kernel/bpf/percpu_freelist.c
+++ b/kernel/bpf/percpu_freelist.c
@@ -28,8 +28,8 @@ void pcpu_freelist_destroy(struct pcpu_freelist *s)
 	free_percpu(s->freelist);
 }
-static inline void __pcpu_freelist_push(struct pcpu_freelist_head *head,
-					struct pcpu_freelist_node *node)
+static inline void ___pcpu_freelist_push(struct pcpu_freelist_head *head,
+					 struct pcpu_freelist_node *node)
 {
 	raw_spin_lock(&head->lock);
 	node->next = head->first;
@@ -37,12 +37,22 @@ static inline void __pcpu_freelist_push(struct pcpu_freelist_head *head,
 	raw_spin_unlock(&head->lock);
 }
-void pcpu_freelist_push(struct pcpu_freelist *s,
+void __pcpu_freelist_push(struct pcpu_freelist *s,
 			struct pcpu_freelist_node *node)
 {
 	struct pcpu_freelist_head *head = this_cpu_ptr(s->freelist);
-	__pcpu_freelist_push(head, node);
+	___pcpu_freelist_push(head, node);
+}
+
+void pcpu_freelist_push(struct pcpu_freelist *s,
+			struct pcpu_freelist_node *node)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	__pcpu_freelist_push(s, node);
+	local_irq_restore(flags);
 }
 void pcpu_freelist_populate(struct pcpu_freelist *s, void *buf, u32 elem_size,
@@ -63,7 +73,7 @@ void pcpu_freelist_populate(struct pcpu_freelist *s, void *buf, u32 elem_size,
 	for_each_possible_cpu(cpu) {
 again:
 		head = per_cpu_ptr(s->freelist, cpu);
-		__pcpu_freelist_push(head, buf);
+		___pcpu_freelist_push(head, buf);
 		i++;
 		buf += elem_size;
 		if (i == nr_elems)
@@ -74,14 +84,12 @@ void pcpu_freelist_populate(struct pcpu_freelist *s, void *buf, u32 elem_size,
 	local_irq_restore(flags);
 }
-struct
pcpu_freelist_node *pcpu_freelist_pop(struct pcpu_freelist *s)
+struct pcpu_freelist_node *__pcpu_freelist_pop(struct pcpu_freelist *s)
 {
 	struct pcpu_freelist_head *head;
 	struct pcpu_freelist_node *node;
-	unsigned long flags;
 	int orig_cpu, cpu;
-	local_irq_save(flags);
 	orig_cpu = cpu = raw_smp_processor_id();
 	while (1) {
 		head = per_cpu_ptr(s->freelist, cpu);
@@ -89,16 +97,25 @@ struct pcpu_freelist_node *pcpu_freelist_pop(struct pcpu_freelist *s)
 		node = head->first;
 		if (node) {
 			head->first = node->next;
-			raw_spin_unlock_irqrestore(&head->lock, flags);
+			raw_spin_unlock(&head->lock);
 			return node;
 		}
 		raw_spin_unlock(&head->lock);
 		cpu = cpumask_next(cpu, cpu_possible_mask);
 		if (cpu >= nr_cpu_ids)
 			cpu = 0;
-		if (cpu == orig_cpu) {
-			local_irq_restore(flags);
+		if (cpu == orig_cpu)
 			return NULL;
-		}
 	}
 }
+
+struct pcpu_freelist_node *pcpu_freelist_pop(struct pcpu_freelist *s)
+{
+	struct pcpu_freelist_node *ret;
+	unsigned long flags;
+
+	local_irq_save(flags);
+	ret = __pcpu_freelist_pop(s);
+	local_irq_restore(flags);
+	return ret;
+}
diff --git a/kernel/bpf/percpu_freelist.h b/kernel/bpf/percpu_freelist.h
index 3049aae8ea1e..c3960118e617 100644
--- a/kernel/bpf/percpu_freelist.h
+++ b/kernel/bpf/percpu_freelist.h
@@ -22,8 +22,12 @@ struct pcpu_freelist_node {
 	struct pcpu_freelist_node *next;
 };
+/* pcpu_freelist_* do spin_lock_irqsave. */
 void pcpu_freelist_push(struct pcpu_freelist *, struct pcpu_freelist_node *);
 struct pcpu_freelist_node *pcpu_freelist_pop(struct pcpu_freelist *);
+/* __pcpu_freelist_* do spin_lock only. caller must disable irqs.
*/
+void __pcpu_freelist_push(struct pcpu_freelist *, struct pcpu_freelist_node *);
+struct pcpu_freelist_node *__pcpu_freelist_pop(struct pcpu_freelist *);
 void pcpu_freelist_populate(struct pcpu_freelist *s, void *buf, u32 elem_size,
 			    u32 nr_elems);
 int pcpu_freelist_init(struct pcpu_freelist *);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 382c09dddf93..cc40b8be1171 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -701,8 +701,13 @@ static int map_lookup_elem(union bpf_attr *attr)
 	if (bpf_map_is_dev_bound(map)) {
 		err = bpf_map_offload_lookup_elem(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
-		   map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
+		goto done;
+	}
+
+	preempt_disable();
+	this_cpu_inc(bpf_prog_active);
+	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
+	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
 		err = bpf_percpu_hash_copy(map, key, value);
 	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
 		err = bpf_percpu_array_copy(map, key, value);
@@ -722,7 +727,10 @@ static int map_lookup_elem(union bpf_attr *attr)
 		rcu_read_unlock();
 		err = ptr ?
0 : -ENOENT;
 	}
+	this_cpu_dec(bpf_prog_active);
+	preempt_enable();
+done:
 	if (err)
 		goto free_value;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4d81be2d0739..bcb42aaf1b3a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6035,7 +6035,8 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
 			u32 off_reg;
 			aux = &env->insn_aux_data[i + delta];
-			if (!aux->alu_state)
+			if (!aux->alu_state ||
+			    aux->alu_state == BPF_ALU_NON_POINTER)
 				continue;
 			isneg = aux->alu_state & BPF_ALU_NEG_VALUE;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0709f8584a66..c89f8ea5c8d1 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -469,18 +469,18 @@ int perf_proc_update_handler(struct ctl_table *table, int write,
 		void __user *buffer, size_t *lenp,
 		loff_t *ppos)
 {
-	int ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
-
-	if (ret || !write)
-		return ret;
-
+	int ret;
+	int perf_cpu = sysctl_perf_cpu_time_max_percent;
 	/*
 	 * If throttling is disabled don't allow the write:
 	 */
-	if (sysctl_perf_cpu_time_max_percent == 100 ||
-	    sysctl_perf_cpu_time_max_percent == 0)
+	if (write && (perf_cpu == 100 || perf_cpu == 0))
 		return -EINVAL;
+	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+	if (ret || !write)
+		return ret;
+
 	max_samples_per_tick = DIV_ROUND_UP(sysctl_perf_event_sample_rate, HZ);
 	perf_sample_period_ns = NSEC_PER_SEC / sysctl_perf_event_sample_rate;
 	update_perf_cpu_limits();
diff --git a/kernel/relay.c b/kernel/relay.c
index 04f248644e06..9e0f52375487 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -428,6 +428,8 @@ static struct dentry *relay_create_buf_file(struct rchan *chan,
 	dentry = chan->cb->create_buf_file(tmpname, chan->parent,
 					   S_IRUSR, buf,
 					   &chan->is_global);
+	if (IS_ERR(dentry))
+		dentry = NULL;
 	kfree(tmpname);
@@ -461,7 +463,7 @@ static struct rchan_buf *relay_open_buf(struct rchan *chan, unsigned int cpu)
 		dentry = chan->cb->create_buf_file(NULL, NULL,
 						   S_IRUSR, buf,
&chan->is_global);
-		if (WARN_ON(dentry))
+		if (IS_ERR_OR_NULL(dentry))
 			goto free_buf;
 	}
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 9864a35c8bb5..6c28d519447d 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1158,22 +1158,12 @@ static int __bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *
 int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog)
 {
-	int err;
-
-	mutex_lock(&bpf_event_mutex);
-	err = __bpf_probe_register(btp, prog);
-	mutex_unlock(&bpf_event_mutex);
-	return err;
+	return __bpf_probe_register(btp, prog);
 }
 int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog)
 {
-	int err;
-
-	mutex_lock(&bpf_event_mutex);
-	err = tracepoint_probe_unregister(btp->tp, (void *)btp->bpf_func, prog);
-	mutex_unlock(&bpf_event_mutex);
-	return err;
+	return tracepoint_probe_unregister(btp->tp, (void *)btp->bpf_func, prog);
 }
 int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id,
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 5574e862de8d..5a1c64a26e81 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -1301,7 +1301,7 @@ static int parse_pred(const char *str, void *data,
 		/* go past the last quote */
 		i++;
-	} else if (isdigit(str[i])) {
+	} else if (isdigit(str[i]) || str[i] == '-') {
 		/* Make sure the field is not a string */
 		if (is_string_field(field)) {
@@ -1314,6 +1314,9 @@ static int parse_pred(const char *str, void *data,
 			goto err_free;
 		}
+		if (str[i] == '-')
+			i++;
+
 		/* We allow 0xDEADBEEF */
 		while (isalnum(str[i]))
 			i++;
diff --git a/lib/test_kmod.c b/lib/test_kmod.c
index d82d022111e0..9cf77628fc91 100644
--- a/lib/test_kmod.c
+++ b/lib/test_kmod.c
@@ -632,7 +632,7 @@ static void __kmod_config_free(struct test_config *config)
 	config->test_driver = NULL;
 	kfree_const(config->test_fs);
-	config->test_driver = NULL;
+	config->test_fs = NULL;
 }
 static void
kmod_config_free(struct kmod_test_device *test_dev)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 37c5c519e90d..2cf470a76866 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1287,11 +1287,13 @@ static inline int pageblock_free(struct page *page)
 	return PageBuddy(page) && page_order(page) >= pageblock_order;
 }
-/* Return the start of the next active pageblock after a given page */
-static struct page *next_active_pageblock(struct page *page)
+/* Return the pfn of the start of the next active pageblock after a given pfn */
+static unsigned long next_active_pageblock(unsigned long pfn)
 {
+	struct page *page = pfn_to_page(pfn);
+
 	/* Ensure the starting page is pageblock-aligned */
-	BUG_ON(page_to_pfn(page) & (pageblock_nr_pages - 1));
+	BUG_ON(pfn & (pageblock_nr_pages - 1));
 	/* If the entire pageblock is free, move to the end of free page */
 	if (pageblock_free(page)) {
@@ -1299,16 +1301,16 @@ static struct page *next_active_pageblock(struct page *page)
 		/* be careful. we don't have locks, page_order can be changed.*/
 		order = page_order(page);
 		if ((order < MAX_ORDER) && (order >= pageblock_order))
-			return page + (1 << order);
+			return pfn + (1 << order);
 	}
-	return page + pageblock_nr_pages;
+	return pfn + pageblock_nr_pages;
 }
-static bool is_pageblock_removable_nolock(struct page *page)
+static bool is_pageblock_removable_nolock(unsigned long pfn)
 {
+	struct page *page = pfn_to_page(pfn);
 	struct zone *zone;
-	unsigned long pfn;
 	/*
 	 * We have to be careful here because we are iterating over memory
@@ -1331,12 +1333,14 @@ static bool is_pageblock_removable_nolock(struct page *page)
 /* Checks if this range of memory is likely to be hot-removable.
*/
 bool is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
 {
-	struct page *page = pfn_to_page(start_pfn);
-	struct page *end_page = page + nr_pages;
+	unsigned long end_pfn, pfn;
+
+	end_pfn = min(start_pfn + nr_pages,
+			zone_end_pfn(page_zone(pfn_to_page(start_pfn))));
 	/* Check the starting page of each pageblock within the range */
-	for (; page < end_page; page = next_active_pageblock(page)) {
-		if (!is_pageblock_removable_nolock(page))
+	for (pfn = start_pfn; pfn < end_pfn; pfn = next_active_pageblock(pfn)) {
+		if (!is_pageblock_removable_nolock(pfn))
 			return false;
 		cond_resched();
 	}
@@ -1372,6 +1376,9 @@ int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn,
 				i++;
 			if (i == MAX_ORDER_NR_PAGES || pfn + i >= end_pfn)
 				continue;
+			/* Check if we got outside of the zone */
+			if (zone && !zone_spans_pfn(zone, pfn + i))
+				return 0;
 			page = pfn_to_page(pfn + i);
 			if (zone && page_zone(page) != zone)
 				return 0;
diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c
index e8090f099eb8..ef0dec20c7d8 100644
--- a/net/batman-adv/bat_v_elp.c
+++ b/net/batman-adv/bat_v_elp.c
@@ -104,6 +104,9 @@ static u32 batadv_v_elp_get_throughput(struct batadv_hardif_neigh_node *neigh)
 		ret = cfg80211_get_station(real_netdev, neigh->addr, &sinfo);
+		/* free the TID stats immediately */
+		cfg80211_sinfo_release_content(&sinfo);
+
 		dev_put(real_netdev);
 		if (ret == -ENOENT) {
 			/* Node is not associated anymore!
It would be
diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index 6fa61b875b69..a4d6d7774105 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -183,15 +183,25 @@ void bt_sock_unlink(struct bt_sock_list *l, struct sock *sk)
 }
 EXPORT_SYMBOL(bt_sock_unlink);
-void bt_accept_enqueue(struct sock *parent, struct sock *sk)
+void bt_accept_enqueue(struct sock *parent, struct sock *sk, bool bh)
 {
 	BT_DBG("parent %p, sk %p", parent, sk);
 	sock_hold(sk);
-	lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
+
+	if (bh)
+		bh_lock_sock_nested(sk);
+	else
+		lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
+
 	list_add_tail(&bt_sk(sk)->accept_q, &bt_sk(parent)->accept_q);
 	bt_sk(sk)->parent = parent;
-	release_sock(sk);
+
+	if (bh)
+		bh_unlock_sock(sk);
+	else
+		release_sock(sk);
+
 	parent->sk_ack_backlog++;
 }
 EXPORT_SYMBOL(bt_accept_enqueue);
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index 686bdc6b35b0..a3a2cd55e23a 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1252,7 +1252,7 @@ static struct l2cap_chan *l2cap_sock_new_connection_cb(struct l2cap_chan *chan)
 	l2cap_sock_init(sk, parent);
-	bt_accept_enqueue(parent, sk);
+	bt_accept_enqueue(parent, sk, false);
 	release_sock(parent);
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index d606e9212291..c044ff2f73e6 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -988,7 +988,7 @@ int rfcomm_connect_ind(struct rfcomm_session *s, u8 channel, struct rfcomm_dlc *
 	rfcomm_pi(sk)->channel = channel;
 	sk->sk_state = BT_CONFIG;
-	bt_accept_enqueue(parent, sk);
+	bt_accept_enqueue(parent, sk, true);
 	/* Accept connection and return socket DLC */
 	*d = rfcomm_pi(sk)->dlc;
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
index 8f0f9279eac9..a4ca55df7390 100644
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -193,7 +193,7 @@ static void __sco_chan_add(struct sco_conn *conn, struct sock *sk,
 	conn->sk = sk;
 	if (parent)
-		bt_accept_enqueue(parent, sk);
+		bt_accept_enqueue(parent, sk, true);
 }
 static int sco_chan_add(struct sco_conn *conn, struct sock *sk,
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 5e55cef0cec3..6693e209efe8 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -2293,9 +2293,12 @@ static int compat_do_replace(struct net *net, void __user *user,
 	xt_compat_lock(NFPROTO_BRIDGE);
-	ret = xt_compat_init_offsets(NFPROTO_BRIDGE, tmp.nentries);
-	if (ret < 0)
-		goto out_unlock;
+	if (tmp.nentries) {
+		ret = xt_compat_init_offsets(NFPROTO_BRIDGE, tmp.nentries);
+		if (ret < 0)
+			goto out_unlock;
+	}
+
 	ret = compat_copy_entries(entries_tmp, tmp.entries_size, &state);
 	if (ret < 0)
 		goto out_unlock;
diff --git a/net/core/filter.c b/net/core/filter.c
index fb0080e84bd4..bed9061102f4 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3909,10 +3909,12 @@ BPF_CALL_5(bpf_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
 		/* Only some socketops are supported */
 		switch (optname) {
 		case SO_RCVBUF:
+			val = min_t(u32, val, sysctl_rmem_max);
 			sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
 			sk->sk_rcvbuf = max_t(int, val * 2, SOCK_MIN_RCVBUF);
 			break;
 		case SO_SNDBUF:
+			val = min_t(u32, val, sysctl_wmem_max);
 			sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
 			sk->sk_sndbuf = max_t(int, val * 2, SOCK_MIN_SNDBUF);
 			break;
diff --git a/net/core/gen_stats.c b/net/core/gen_stats.c
index 188d693cb251..e2fd8baec65f 100644
--- a/net/core/gen_stats.c
+++ b/net/core/gen_stats.c
@@ -256,7 +256,6 @@ __gnet_stats_copy_queue_cpu(struct gnet_stats_queue *qstats,
 	for_each_possible_cpu(i) {
 		const struct gnet_stats_queue *qcpu = per_cpu_ptr(q, i);
-		qstats->qlen = 0;
 		qstats->backlog += qcpu->backlog;
 		qstats->drops += qcpu->drops;
 		qstats->requeues += qcpu->requeues;
@@ -272,7 +271,6 @@ void __gnet_stats_copy_queue(struct gnet_stats_queue *qstats,
 	if (cpu) {
 		__gnet_stats_copy_queue_cpu(qstats, cpu);
 	} else {
-		qstats->qlen =
q->qlen;
 		qstats->backlog = q->backlog;
 		qstats->drops = q->drops;
 		qstats->requeues = q->requeues;
diff --git a/net/core/gro_cells.c b/net/core/gro_cells.c
index acf45ddbe924..e095fb871d91 100644
--- a/net/core/gro_cells.c
+++ b/net/core/gro_cells.c
@@ -13,22 +13,36 @@ int gro_cells_receive(struct gro_cells *gcells, struct sk_buff *skb)
 {
 	struct net_device *dev = skb->dev;
 	struct gro_cell *cell;
+	int res;
-	if (!gcells->cells || skb_cloned(skb) || netif_elide_gro(dev))
-		return netif_rx(skb);
+	rcu_read_lock();
+	if (unlikely(!(dev->flags & IFF_UP)))
+		goto drop;
+
+	if (!gcells->cells || skb_cloned(skb) || netif_elide_gro(dev)) {
+		res = netif_rx(skb);
+		goto unlock;
+	}
 	cell = this_cpu_ptr(gcells->cells);
 	if (skb_queue_len(&cell->napi_skbs) > netdev_max_backlog) {
+drop:
 		atomic_long_inc(&dev->rx_dropped);
 		kfree_skb(skb);
-		return NET_RX_DROP;
+		res = NET_RX_DROP;
+		goto unlock;
 	}
 	__skb_queue_tail(&cell->napi_skbs, skb);
 	if (skb_queue_len(&cell->napi_skbs) == 1)
 		napi_schedule(&cell->napi);
-	return NET_RX_SUCCESS;
+
+	res = NET_RX_SUCCESS;
+
+unlock:
+	rcu_read_unlock();
+	return res;
 }
 EXPORT_SYMBOL(gro_cells_receive);
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index bd67c4d0fcfd..2aabb7eb0854 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1547,6 +1547,9 @@ static int register_queue_kobjects(struct net_device *dev)
 error:
 	netdev_queue_update_kobjects(dev, txq, 0);
 	net_rx_queue_update_kobjects(dev, rxq, 0);
+#ifdef CONFIG_SYSFS
+	kset_unregister(dev->queues_kset);
+#endif
 	return error;
 }
diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c
index b8cd43c9ed5b..a97bf326b231 100644
--- a/net/hsr/hsr_device.c
+++ b/net/hsr/hsr_device.c
@@ -94,9 +94,8 @@ static void hsr_check_announce(struct net_device *hsr_dev,
 	    && (old_operstate != IF_OPER_UP)) {
 		/* Went up */
 		hsr->announce_count = 0;
-		hsr->announce_timer.expires = jiffies +
-				msecs_to_jiffies(HSR_ANNOUNCE_INTERVAL);
-		add_timer(&hsr->announce_timer);
+
mod_timer(&hsr->announce_timer,
+			  jiffies + msecs_to_jiffies(HSR_ANNOUNCE_INTERVAL));
 	}
 	if ((hsr_dev->operstate != IF_OPER_UP) && (old_operstate == IF_OPER_UP))
@@ -332,6 +331,7 @@ static void hsr_announce(struct timer_list *t)
 {
 	struct hsr_priv *hsr;
 	struct hsr_port *master;
+	unsigned long interval;
 	hsr = from_timer(hsr, t, announce_timer);
@@ -343,18 +343,16 @@ static void hsr_announce(struct timer_list *t)
 				hsr->protVersion);
 		hsr->announce_count++;
-		hsr->announce_timer.expires = jiffies +
-				msecs_to_jiffies(HSR_ANNOUNCE_INTERVAL);
+		interval = msecs_to_jiffies(HSR_ANNOUNCE_INTERVAL);
 	} else {
 		send_hsr_supervision_frame(master, HSR_TLV_LIFE_CHECK,
 				hsr->protVersion);
-		hsr->announce_timer.expires = jiffies +
-				msecs_to_jiffies(HSR_LIFE_CHECK_INTERVAL);
+		interval = msecs_to_jiffies(HSR_LIFE_CHECK_INTERVAL);
 	}
 	if (is_admin_up(master->dev))
-		add_timer(&hsr->announce_timer);
+		mod_timer(&hsr->announce_timer, jiffies + interval);
 	rcu_read_unlock();
 }
@@ -486,7 +484,7 @@ int hsr_dev_finalize(struct net_device *hsr_dev, struct net_device *slave[2],
 	res = hsr_add_port(hsr, hsr_dev, HSR_PT_MASTER);
 	if (res)
-		return res;
+		goto err_add_port;
 	res = register_netdevice(hsr_dev);
 	if (res)
@@ -506,6 +504,8 @@ int hsr_dev_finalize(struct net_device *hsr_dev, struct net_device *slave[2],
 fail:
 	hsr_for_each_port(hsr, port)
 		hsr_del_port(port);
+err_add_port:
+	hsr_del_node(&hsr->self_node_db);
 	return res;
 }
diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c
index 286ceb41ac0c..9af16cb68f76 100644
--- a/net/hsr/hsr_framereg.c
+++ b/net/hsr/hsr_framereg.c
@@ -124,6 +124,18 @@ int hsr_create_self_node(struct list_head *self_node_db,
 	return 0;
 }
+void hsr_del_node(struct list_head *self_node_db)
+{
+	struct hsr_node *node;
+
+	rcu_read_lock();
+	node = list_first_or_null_rcu(self_node_db, struct hsr_node, mac_list);
+	rcu_read_unlock();
+	if (node) {
+		list_del_rcu(&node->mac_list);
+		kfree(node);
+	}
+}
 /* Allocate an hsr_node and add it to node_db.
'addr' is the node's AddressA;
  * seq_out is used to initialize filtering of outgoing duplicate frames
diff --git a/net/hsr/hsr_framereg.h b/net/hsr/hsr_framereg.h
index 370b45998121..531fd3dfcac1 100644
--- a/net/hsr/hsr_framereg.h
+++ b/net/hsr/hsr_framereg.h
@@ -16,6 +16,7 @@
 struct hsr_node;
+void hsr_del_node(struct list_head *self_node_db);
 struct hsr_node *hsr_add_node(struct list_head *node_db, unsigned char addr[],
 			      u16 seq_out);
 struct hsr_node *hsr_get_node(struct hsr_port *port, struct sk_buff *skb,
diff --git a/net/ipv4/cipso_ipv4.c b/net/ipv4/cipso_ipv4.c
index 777fa3b7fb13..f0165c5f376b 100644
--- a/net/ipv4/cipso_ipv4.c
+++ b/net/ipv4/cipso_ipv4.c
@@ -667,7 +667,8 @@ static int cipso_v4_map_lvl_valid(const struct cipso_v4_doi *doi_def, u8 level)
 	case CIPSO_V4_MAP_PASS:
 		return 0;
 	case CIPSO_V4_MAP_TRANS:
-		if (doi_def->map.std->lvl.cipso[level] < CIPSO_V4_INV_LVL)
+		if ((level < doi_def->map.std->lvl.cipso_size) &&
+		    (doi_def->map.std->lvl.cipso[level] < CIPSO_V4_INV_LVL))
 			return 0;
 		break;
 	}
@@ -1735,13 +1736,26 @@ int cipso_v4_validate(const struct sk_buff *skb, unsigned char **option)
  */
 void cipso_v4_error(struct sk_buff *skb, int error, u32 gateway)
 {
+	unsigned char optbuf[sizeof(struct ip_options) + 40];
+	struct ip_options *opt = (struct ip_options *)optbuf;
+
 	if (ip_hdr(skb)->protocol == IPPROTO_ICMP || error != -EACCES)
 		return;
+	/*
+	 * We might be called above the IP layer,
+	 * so we can not use icmp_send and IPCB here.
+	 */
+
+	memset(opt, 0, sizeof(struct ip_options));
+	opt->optlen = ip_hdr(skb)->ihl*4 - sizeof(struct iphdr);
+	if (__ip_options_compile(dev_net(skb->dev), opt, skb, NULL))
+		return;
+
 	if (gateway)
-		icmp_send(skb, ICMP_DEST_UNREACH, ICMP_NET_ANO, 0);
+		__icmp_send(skb, ICMP_DEST_UNREACH, ICMP_NET_ANO, 0, opt);
 	else
-		icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_ANO, 0);
+		__icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_ANO, 0, opt);
 }
 /**
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 958e185a8e8d..dae743b649c1 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -700,6 +700,10 @@ static int rtm_to_fib_config(struct net *net, struct sk_buff *skb,
 		case RTA_GATEWAY:
 			cfg->fc_gw = nla_get_be32(attr);
 			break;
+		case RTA_VIA:
+			NL_SET_ERR_MSG(extack, "IPv4 does not support RTA_VIA attribute");
+			err = -EINVAL;
+			goto errout;
 		case RTA_PRIORITY:
 			cfg->fc_priority = nla_get_u32(attr);
 			break;
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 695979b7ef6d..ad75c468ecfb 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -570,7 +570,8 @@ static struct rtable *icmp_route_lookup(struct net *net,
  * MUST reply to only the first fragment.
  */
-void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info)
+void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
+		 const struct ip_options *opt)
 {
 	struct iphdr *iph;
 	int room;
@@ -691,7 +692,7 @@ void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info)
 					  iph->tos;
 	mark = IP4_REPLY_MARK(net, skb_in->mark);
-	if (ip_options_echo(net, &icmp_param.replyopts.opt.opt, skb_in))
+	if (__ip_options_echo(net, &icmp_param.replyopts.opt.opt, skb_in, opt))
 		goto out_unlock;
@@ -742,7 +743,7 @@ void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info)
 	local_bh_enable();
 out:;
 }
-EXPORT_SYMBOL(icmp_send);
+EXPORT_SYMBOL(__icmp_send);
 static void icmp_socket_deliver(struct sk_buff *skb, u32 info)
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 797c4ffecf9b..0680f872895f 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -307,11 +307,10 @@ static inline bool ip_rcv_options(struct sk_buff *skb)
 }
 static int ip_rcv_finish_core(struct net *net, struct sock *sk,
-			      struct sk_buff *skb)
+			      struct sk_buff *skb, struct net_device *dev)
 {
 	const struct iphdr *iph = ip_hdr(skb);
 	int (*edemux)(struct sk_buff *skb);
-	struct net_device *dev = skb->dev;
 	struct rtable *rt;
 	int err;
@@ -400,6 +399,7 @@ static int ip_rcv_finish_core(struct net *net, struct sock *sk,
 static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
+	struct net_device *dev = skb->dev;
 	int ret;
 	/* if ingress device is enslaved to an L3 master device pass the
@@ -409,7 +409,7 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 	if (!skb)
 		return NET_RX_SUCCESS;
-	ret = ip_rcv_finish_core(net, sk, skb);
+	ret = ip_rcv_finish_core(net, sk, skb, dev);
 	if (ret != NET_RX_DROP)
 		ret = dst_input(skb);
 	return ret;
@@ -549,6 +549,7 @@ static void ip_list_rcv_finish(struct net *net, struct sock *sk,
 	INIT_LIST_HEAD(&sublist);
 	list_for_each_entry_safe(skb, next, head, list) {
+		struct net_device *dev
= skb->dev;
 		struct dst_entry *dst;
 		skb_list_del_init(skb);
@@ -558,7 +559,7 @@ static void ip_list_rcv_finish(struct net *net, struct sock *sk,
 		skb = l3mdev_ip_rcv(skb);
 		if (!skb)
 			continue;
-		if (ip_rcv_finish_core(net, sk, skb) == NET_RX_DROP)
+		if (ip_rcv_finish_core(net, sk, skb, dev) == NET_RX_DROP)
 			continue;
 		dst = skb_dst(skb);
diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index ed194d46c00e..32a35043c9f5 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -251,8 +251,9 @@ static void spec_dst_fill(__be32 *spec_dst, struct sk_buff *skb)
  * If opt == NULL, then skb->data should point to IP header.
  */
-int ip_options_compile(struct net *net,
-		       struct ip_options *opt, struct sk_buff *skb)
+int __ip_options_compile(struct net *net,
+			 struct ip_options *opt, struct sk_buff *skb,
+			 __be32 *info)
 {
 	__be32 spec_dst = htonl(INADDR_ANY);
 	unsigned char *pp_ptr = NULL;
@@ -468,11 +469,22 @@ int ip_options_compile(struct net *net,
 	return 0;
 error:
-	if (skb) {
-		icmp_send(skb, ICMP_PARAMETERPROB, 0, htonl((pp_ptr-iph)<<24));
-	}
+	if (info)
+		*info = htonl((pp_ptr-iph)<<24);
 	return -EINVAL;
 }
+
+int ip_options_compile(struct net *net,
+		       struct ip_options *opt, struct sk_buff *skb)
+{
+	int ret;
+	__be32 info;
+
+	ret = __ip_options_compile(net, opt, skb, &info);
+	if (ret != 0 && skb)
+		icmp_send(skb, ICMP_PARAMETERPROB, 0, info);
+	return ret;
+}
 EXPORT_SYMBOL(ip_options_compile);
 /*
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 7f56944b020f..40a7cd56e008 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -74,6 +74,33 @@ static int vti_input(struct sk_buff *skb, int nexthdr, __be32 spi,
 	return 0;
 }
+static int vti_input_ipip(struct sk_buff *skb, int nexthdr, __be32 spi,
+			  int encap_type)
+{
+	struct ip_tunnel *tunnel;
+	const struct iphdr *iph = ip_hdr(skb);
+	struct net *net = dev_net(skb->dev);
+	struct ip_tunnel_net *itn = net_generic(net, vti_net_id);
+
+	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
+
iph->saddr, iph->daddr, 0); + if (tunnel) { + if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb)) + goto drop; + + XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip4 = tunnel; + + skb->dev = tunnel->dev; + + return xfrm_input(skb, nexthdr, spi, encap_type); + } + + return -EINVAL; +drop: + kfree_skb(skb); + return 0; +} + static int vti_rcv(struct sk_buff *skb) { XFRM_SPI_SKB_CB(skb)->family = AF_INET; @@ -82,6 +109,14 @@ static int vti_rcv(struct sk_buff *skb) return vti_input(skb, ip_hdr(skb)->protocol, 0, 0); } +static int vti_rcv_ipip(struct sk_buff *skb) +{ + XFRM_SPI_SKB_CB(skb)->family = AF_INET; + XFRM_SPI_SKB_CB(skb)->daddroff = offsetof(struct iphdr, daddr); + + return vti_input_ipip(skb, ip_hdr(skb)->protocol, ip_hdr(skb)->saddr, 0); +} + static int vti_rcv_cb(struct sk_buff *skb, int err) { unsigned short family; @@ -435,6 +470,12 @@ static struct xfrm4_protocol vti_ipcomp4_protocol __read_mostly = { .priority = 100, }; +static struct xfrm_tunnel ipip_handler __read_mostly = { + .handler = vti_rcv_ipip, + .err_handler = vti4_err, + .priority = 0, +}; + static int __net_init vti_init_net(struct net *net) { int err; @@ -603,6 +644,13 @@ static int __init vti_init(void) if (err < 0) goto xfrm_proto_comp_failed; + msg = "ipip tunnel"; + err = xfrm4_tunnel_register(&ipip_handler, AF_INET); + if (err < 0) { + pr_info("%s: cant't register tunnel\n",__func__); + goto xfrm_tunnel_failed; + } + msg = "netlink interface"; err = rtnl_link_register(&vti_link_ops); if (err < 0) @@ -612,6 +660,8 @@ static int __init vti_init(void) rtnl_link_failed: xfrm4_protocol_deregister(&vti_ipcomp4_protocol, IPPROTO_COMP); +xfrm_tunnel_failed: + xfrm4_tunnel_deregister(&ipip_handler, AF_INET); xfrm_proto_comp_failed: xfrm4_protocol_deregister(&vti_ah4_protocol, IPPROTO_AH); xfrm_proto_ah_failed: diff --git a/net/ipv4/netlink.c b/net/ipv4/netlink.c index f86bb4f06609..d8e3a1fb8e82 100644 --- a/net/ipv4/netlink.c +++ b/net/ipv4/netlink.c @@ -3,9 +3,10 @@ #include #include #include +#include 
#include -int rtm_getroute_parse_ip_proto(struct nlattr *attr, u8 *ip_proto, +int rtm_getroute_parse_ip_proto(struct nlattr *attr, u8 *ip_proto, u8 family, struct netlink_ext_ack *extack) { *ip_proto = nla_get_u8(attr); @@ -13,11 +14,19 @@ int rtm_getroute_parse_ip_proto(struct nlattr *attr, u8 *ip_proto, switch (*ip_proto) { case IPPROTO_TCP: case IPPROTO_UDP: - case IPPROTO_ICMP: return 0; - default: - NL_SET_ERR_MSG(extack, "Unsupported ip proto"); - return -EOPNOTSUPP; + case IPPROTO_ICMP: + if (family != AF_INET) + break; + return 0; +#if IS_ENABLED(CONFIG_IPV6) + case IPPROTO_ICMPV6: + if (family != AF_INET6) + break; + return 0; +#endif } + NL_SET_ERR_MSG(extack, "Unsupported ip proto"); + return -EOPNOTSUPP; } EXPORT_SYMBOL_GPL(rtm_getroute_parse_ip_proto); diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 436b46c0e687..7a556e459375 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -1308,6 +1308,10 @@ static void ip_del_fnhe(struct fib_nh *nh, __be32 daddr) if (fnhe->fnhe_daddr == daddr) { rcu_assign_pointer(*fnhe_p, rcu_dereference_protected( fnhe->fnhe_next, lockdep_is_held(&fnhe_lock))); + /* set fnhe_daddr to 0 to ensure it won't bind with + * new dsts in rt_bind_exception(). 
+ */ + fnhe->fnhe_daddr = 0; fnhe_flush_routes(fnhe); kfree_rcu(fnhe, rcu); break; @@ -2155,12 +2159,13 @@ int ip_route_input_rcu(struct sk_buff *skb, __be32 daddr, __be32 saddr, int our = 0; int err = -EINVAL; - if (in_dev) - our = ip_check_mc_rcu(in_dev, daddr, saddr, - ip_hdr(skb)->protocol); + if (!in_dev) + return err; + our = ip_check_mc_rcu(in_dev, daddr, saddr, + ip_hdr(skb)->protocol); /* check l3 master if no match yet */ - if ((!in_dev || !our) && netif_is_l3_slave(dev)) { + if (!our && netif_is_l3_slave(dev)) { struct in_device *l3_in_dev; l3_in_dev = __in_dev_get_rcu(skb->dev); @@ -2814,7 +2819,7 @@ static int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh, if (tb[RTA_IP_PROTO]) { err = rtm_getroute_parse_ip_proto(tb[RTA_IP_PROTO], - &ip_proto, extack); + &ip_proto, AF_INET, extack); if (err) return err; } diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index c3387dfd725b..f66b2e6d97a7 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb, refcount_set(&req->rsk_refcnt, 1); tcp_sk(child)->tsoffset = tsoff; sock_rps_save_rxhash(child, skb); - inet_csk_reqsk_queue_add(sk, req, child); + if (!inet_csk_reqsk_queue_add(sk, req, child)) { + bh_unlock_sock(child); + sock_put(child); + child = NULL; + reqsk_put(req); + } } else { reqsk_free(req); } diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index d561464db7ce..ca38aca78d84 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1901,6 +1901,11 @@ static int tcp_inq_hint(struct sock *sk) inq = tp->rcv_nxt - tp->copied_seq; release_sock(sk); } + /* After receiving a FIN, tell the user-space to continue reading + * by returning a non-zero inq. 
+ */ + if (inq == 0 && sock_flag(sk, SOCK_DONE)) + inq = 1; return inq; } diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 0e9fbdf24aed..16f2c84ad0d8 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -6493,7 +6493,13 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, af_ops->send_synack(fastopen_sk, dst, &fl, req, &foc, TCP_SYNACK_FASTOPEN); /* Add the child socket directly into the accept queue */ - inet_csk_reqsk_queue_add(sk, req, fastopen_sk); + if (!inet_csk_reqsk_queue_add(sk, req, fastopen_sk)) { + reqsk_fastopen_remove(fastopen_sk, req, false); + bh_unlock_sock(fastopen_sk); + sock_put(fastopen_sk); + reqsk_put(req); + goto drop; + } sk->sk_data_ready(sk); bh_unlock_sock(fastopen_sk); sock_put(fastopen_sk); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 5f880b01243d..ce66c233fd12 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1646,15 +1646,8 @@ EXPORT_SYMBOL(tcp_add_backlog); int tcp_filter(struct sock *sk, struct sk_buff *skb) { struct tcphdr *th = (struct tcphdr *)skb->data; - unsigned int eaten = skb->len; - int err; - err = sk_filter_trim_cap(sk, skb, th->doff * 4); - if (!err) { - eaten -= skb->len; - TCP_SKB_CB(skb)->end_seq -= eaten; - } - return err; + return sk_filter_trim_cap(sk, skb, th->doff * 4); } EXPORT_SYMBOL(tcp_filter); diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c index 10aafea3af0f..35e7092eceb3 100644 --- a/net/ipv6/ip6mr.c +++ b/net/ipv6/ip6mr.c @@ -1954,10 +1954,10 @@ int ip6mr_compat_ioctl(struct sock *sk, unsigned int cmd, void __user *arg) static inline int ip6mr_forward2_finish(struct net *net, struct sock *sk, struct sk_buff *skb) { - __IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)), - IPSTATS_MIB_OUTFORWDATAGRAMS); - __IP6_ADD_STATS(net, ip6_dst_idev(skb_dst(skb)), - IPSTATS_MIB_OUTOCTETS, skb->len); + IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)), + IPSTATS_MIB_OUTFORWDATAGRAMS); + IP6_ADD_STATS(net, ip6_dst_idev(skb_dst(skb)), + IPSTATS_MIB_OUTOCTETS, 
skb->len); return dst_output(net, sk, skb); } diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 7b832c3e0297..509a49f3aa33 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1282,18 +1282,29 @@ static DEFINE_SPINLOCK(rt6_exception_lock); static void rt6_remove_exception(struct rt6_exception_bucket *bucket, struct rt6_exception *rt6_ex) { + struct fib6_info *from; struct net *net; if (!bucket || !rt6_ex) return; net = dev_net(rt6_ex->rt6i->dst.dev); + net->ipv6.rt6_stats->fib_rt_cache--; + + /* purge completely the exception to allow releasing the held resources: + * some [sk] cache may keep the dst around for unlimited time + */ + from = rcu_dereference_protected(rt6_ex->rt6i->from, + lockdep_is_held(&rt6_exception_lock)); + rcu_assign_pointer(rt6_ex->rt6i->from, NULL); + fib6_info_release(from); + dst_dev_put(&rt6_ex->rt6i->dst); + hlist_del_rcu(&rt6_ex->hlist); dst_release(&rt6_ex->rt6i->dst); kfree_rcu(rt6_ex, rcu); WARN_ON_ONCE(!bucket->depth); bucket->depth--; - net->ipv6.rt6_stats->fib_rt_cache--; } /* Remove oldest rt6_ex in bucket and free the memory @@ -1612,15 +1623,15 @@ static int rt6_remove_exception_rt(struct rt6_info *rt) static void rt6_update_exception_stamp_rt(struct rt6_info *rt) { struct rt6_exception_bucket *bucket; - struct fib6_info *from = rt->from; struct in6_addr *src_key = NULL; struct rt6_exception *rt6_ex; - - if (!from || - !(rt->rt6i_flags & RTF_CACHE)) - return; + struct fib6_info *from; rcu_read_lock(); + from = rcu_dereference(rt->from); + if (!from || !(rt->rt6i_flags & RTF_CACHE)) + goto unlock; + bucket = rcu_dereference(from->rt6i_exception_bucket); #ifdef CONFIG_IPV6_SUBTREES @@ -1639,6 +1650,7 @@ static void rt6_update_exception_stamp_rt(struct rt6_info *rt) if (rt6_ex) rt6_ex->stamp = jiffies; +unlock: rcu_read_unlock(); } @@ -2796,20 +2808,24 @@ static int ip6_route_check_nh_onlink(struct net *net, u32 tbid = l3mdev_fib_table(dev) ? 
: RT_TABLE_MAIN; const struct in6_addr *gw_addr = &cfg->fc_gateway; u32 flags = RTF_LOCAL | RTF_ANYCAST | RTF_REJECT; + struct fib6_info *from; struct rt6_info *grt; int err; err = 0; grt = ip6_nh_lookup_table(net, cfg, gw_addr, tbid, 0); if (grt) { + rcu_read_lock(); + from = rcu_dereference(grt->from); if (!grt->dst.error && /* ignore match if it is the default route */ - grt->from && !ipv6_addr_any(&grt->from->fib6_dst.addr) && + from && !ipv6_addr_any(&from->fib6_dst.addr) && (grt->rt6i_flags & flags || dev != grt->dst.dev)) { NL_SET_ERR_MSG(extack, "Nexthop has invalid gateway or device mismatch"); err = -EINVAL; } + rcu_read_unlock(); ip6_rt_put(grt); } @@ -4189,6 +4205,10 @@ static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh, cfg->fc_gateway = nla_get_in6_addr(tb[RTA_GATEWAY]); cfg->fc_flags |= RTF_GATEWAY; } + if (tb[RTA_VIA]) { + NL_SET_ERR_MSG(extack, "IPv6 does not support RTA_VIA attribute"); + goto errout; + } if (tb[RTA_DST]) { int plen = (rtm->rtm_dst_len + 7) >> 3; @@ -4682,7 +4702,7 @@ static int rt6_fill_node(struct net *net, struct sk_buff *skb, table = rt->fib6_table->tb6_id; else table = RT6_TABLE_UNSPEC; - rtm->rtm_table = table; + rtm->rtm_table = table < 256 ? 
table : RT_TABLE_COMPAT; if (nla_put_u32(skb, RTA_TABLE, table)) goto nla_put_failure; @@ -4883,7 +4903,8 @@ static int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh, if (tb[RTA_IP_PROTO]) { err = rtm_getroute_parse_ip_proto(tb[RTA_IP_PROTO], - &fl6.flowi6_proto, extack); + &fl6.flowi6_proto, AF_INET6, + extack); if (err) goto errout; } diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c index da6d5a3f5399..de9aa5cb295c 100644 --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -778,8 +778,9 @@ static bool check_6rd(struct ip_tunnel *tunnel, const struct in6_addr *v6dst, pbw0 = tunnel->ip6rd.prefixlen >> 5; pbi0 = tunnel->ip6rd.prefixlen & 0x1f; - d = (ntohl(v6dst->s6_addr32[pbw0]) << pbi0) >> - tunnel->ip6rd.relay_prefixlen; + d = tunnel->ip6rd.relay_prefixlen < 32 ? + (ntohl(v6dst->s6_addr32[pbw0]) << pbi0) >> + tunnel->ip6rd.relay_prefixlen : 0; pbi1 = pbi0 - tunnel->ip6rd.relay_prefixlen; if (pbi1 > 0) @@ -1873,6 +1874,7 @@ static int __net_init sit_init_net(struct net *net) err_reg_dev: ipip6_dev_free(sitn->fb_tunnel_dev); + free_netdev(sitn->fb_tunnel_dev); err_alloc_dev: return err; } diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c index 0ae6899edac0..37a69df17cab 100644 --- a/net/l2tp/l2tp_ip6.c +++ b/net/l2tp/l2tp_ip6.c @@ -674,9 +674,6 @@ static int l2tp_ip6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, if (flags & MSG_OOB) goto out; - if (addr_len) - *addr_len = sizeof(*lsa); - if (flags & MSG_ERRQUEUE) return ipv6_recv_error(sk, msg, len, addr_len); @@ -706,6 +703,7 @@ static int l2tp_ip6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, lsa->l2tp_conn_id = 0; if (ipv6_addr_type(&lsa->l2tp_addr) & IPV6_ADDR_LINKLOCAL) lsa->l2tp_scope_id = inet6_iif(skb); + *addr_len = sizeof(*lsa); } if (np->rxopt.all) diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c index 8fbe6cdbe255..d5a4db5b3fe7 100644 --- a/net/mpls/af_mpls.c +++ b/net/mpls/af_mpls.c @@ -1822,6 +1822,9 @@ static int rtm_to_route_config(struct sk_buff *skb, goto 
errout; break; } + case RTA_GATEWAY: + NL_SET_ERR_MSG(extack, "MPLS does not support RTA_GATEWAY attribute"); + goto errout; case RTA_VIA: { if (nla_get_via(nla, &cfg->rc_via_alen, diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index 518364f4abcc..55a77314340a 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -2220,6 +2220,18 @@ static int ip_vs_set_timeout(struct netns_ipvs *ipvs, struct ip_vs_timeout_user u->tcp_fin_timeout, u->udp_timeout); +#ifdef CONFIG_IP_VS_PROTO_TCP + if (u->tcp_timeout < 0 || u->tcp_timeout > (INT_MAX / HZ) || + u->tcp_fin_timeout < 0 || u->tcp_fin_timeout > (INT_MAX / HZ)) { + return -EINVAL; + } +#endif + +#ifdef CONFIG_IP_VS_PROTO_UDP + if (u->udp_timeout < 0 || u->udp_timeout > (INT_MAX / HZ)) + return -EINVAL; +#endif + #ifdef CONFIG_IP_VS_PROTO_TCP if (u->tcp_timeout) { pd = ip_vs_proto_data_get(ipvs, IPPROTO_TCP); diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 277d02a8cac8..895171a2e1f1 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -1007,6 +1007,22 @@ nf_conntrack_tuple_taken(const struct nf_conntrack_tuple *tuple, } if (nf_ct_key_equal(h, tuple, zone, net)) { + /* Tuple is taken already, so caller will need to find + * a new source port to use. + * + * Only exception: + * If the *original tuples* are identical, then both + * conntracks refer to the same flow. + * This is a rare situation, it can occur e.g. when + * more than one UDP packet is sent from same socket + * in different threads. + * + * Let nf_ct_resolve_clash() deal with this later. 
+ */ + if (nf_ct_tuple_equal(&ignored_conntrack->tuplehash[IP_CT_DIR_ORIGINAL].tuple, + &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple)) + continue; + NF_CT_STAT_INC_ATOMIC(net, found); rcu_read_unlock(); return 1; diff --git a/net/netlabel/netlabel_kapi.c b/net/netlabel/netlabel_kapi.c index ea7c67050792..ee3e5b6471a6 100644 --- a/net/netlabel/netlabel_kapi.c +++ b/net/netlabel/netlabel_kapi.c @@ -903,7 +903,8 @@ int netlbl_bitmap_walk(const unsigned char *bitmap, u32 bitmap_len, (state == 0 && (byte & bitmask) == 0)) return bit_spot; - bit_spot++; + if (++bit_spot >= bitmap_len) + return -1; bitmask >>= 1; if (bitmask == 0) { byte = bitmap[++byte_offset]; diff --git a/net/nfc/llcp_commands.c b/net/nfc/llcp_commands.c index 6a196e438b6c..d1fc019e932e 100644 --- a/net/nfc/llcp_commands.c +++ b/net/nfc/llcp_commands.c @@ -419,6 +419,10 @@ int nfc_llcp_send_connect(struct nfc_llcp_sock *sock) sock->service_name, sock->service_name_len, &service_name_tlv_length); + if (!service_name_tlv) { + err = -ENOMEM; + goto error_tlv; + } size += service_name_tlv_length; } @@ -429,9 +433,17 @@ int nfc_llcp_send_connect(struct nfc_llcp_sock *sock) miux_tlv = nfc_llcp_build_tlv(LLCP_TLV_MIUX, (u8 *)&miux, 0, &miux_tlv_length); + if (!miux_tlv) { + err = -ENOMEM; + goto error_tlv; + } size += miux_tlv_length; rw_tlv = nfc_llcp_build_tlv(LLCP_TLV_RW, &rw, 0, &rw_tlv_length); + if (!rw_tlv) { + err = -ENOMEM; + goto error_tlv; + } size += rw_tlv_length; pr_debug("SKB size %d SN length %zu\n", size, sock->service_name_len); @@ -484,9 +496,17 @@ int nfc_llcp_send_cc(struct nfc_llcp_sock *sock) miux_tlv = nfc_llcp_build_tlv(LLCP_TLV_MIUX, (u8 *)&miux, 0, &miux_tlv_length); + if (!miux_tlv) { + err = -ENOMEM; + goto error_tlv; + } size += miux_tlv_length; rw_tlv = nfc_llcp_build_tlv(LLCP_TLV_RW, &rw, 0, &rw_tlv_length); + if (!rw_tlv) { + err = -ENOMEM; + goto error_tlv; + } size += rw_tlv_length; skb = llcp_allocate_pdu(sock, LLCP_PDU_CC, size); diff --git a/net/nfc/llcp_core.c 
b/net/nfc/llcp_core.c index ef4026a23e80..4fa015208aab 100644 --- a/net/nfc/llcp_core.c +++ b/net/nfc/llcp_core.c @@ -532,10 +532,10 @@ static u8 nfc_llcp_reserve_sdp_ssap(struct nfc_llcp_local *local) static int nfc_llcp_build_gb(struct nfc_llcp_local *local) { - u8 *gb_cur, *version_tlv, version, version_length; - u8 *lto_tlv, lto_length; - u8 *wks_tlv, wks_length; - u8 *miux_tlv, miux_length; + u8 *gb_cur, version, version_length; + u8 lto_length, wks_length, miux_length; + u8 *version_tlv = NULL, *lto_tlv = NULL, + *wks_tlv = NULL, *miux_tlv = NULL; __be16 wks = cpu_to_be16(local->local_wks); u8 gb_len = 0; int ret = 0; @@ -543,17 +543,33 @@ static int nfc_llcp_build_gb(struct nfc_llcp_local *local) version = LLCP_VERSION_11; version_tlv = nfc_llcp_build_tlv(LLCP_TLV_VERSION, &version, 1, &version_length); + if (!version_tlv) { + ret = -ENOMEM; + goto out; + } gb_len += version_length; lto_tlv = nfc_llcp_build_tlv(LLCP_TLV_LTO, &local->lto, 1, &lto_length); + if (!lto_tlv) { + ret = -ENOMEM; + goto out; + } gb_len += lto_length; pr_debug("Local wks 0x%lx\n", local->local_wks); wks_tlv = nfc_llcp_build_tlv(LLCP_TLV_WKS, (u8 *)&wks, 2, &wks_length); + if (!wks_tlv) { + ret = -ENOMEM; + goto out; + } gb_len += wks_length; miux_tlv = nfc_llcp_build_tlv(LLCP_TLV_MIUX, (u8 *)&local->miux, 0, &miux_length); + if (!miux_tlv) { + ret = -ENOMEM; + goto out; + } gb_len += miux_length; gb_len += ARRAY_SIZE(llcp_magic); diff --git a/net/rxrpc/conn_client.c b/net/rxrpc/conn_client.c index 521189f4b666..6e419b15a9f8 100644 --- a/net/rxrpc/conn_client.c +++ b/net/rxrpc/conn_client.c @@ -353,7 +353,7 @@ static int rxrpc_get_client_conn(struct rxrpc_sock *rx, * normally have to take channel_lock but we do this before anyone else * can see the connection. 
*/ - list_add_tail(&call->chan_wait_link, &candidate->waiting_calls); + list_add(&call->chan_wait_link, &candidate->waiting_calls); if (cp->exclusive) { call->conn = candidate; @@ -432,7 +432,7 @@ static int rxrpc_get_client_conn(struct rxrpc_sock *rx, call->conn = conn; call->security_ix = conn->security_ix; call->service_id = conn->service_id; - list_add(&call->chan_wait_link, &conn->waiting_calls); + list_add_tail(&call->chan_wait_link, &conn->waiting_calls); spin_unlock(&conn->channel_lock); _leave(" = 0 [extant %d]", conn->debug_id); return 0; diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c index 8525de811616..334f3a057671 100644 --- a/net/sched/act_ipt.c +++ b/net/sched/act_ipt.c @@ -199,8 +199,7 @@ static int __tcf_ipt_init(struct net *net, unsigned int id, struct nlattr *nla, err2: kfree(tname); err1: - if (ret == ACT_P_CREATED) - tcf_idr_release(*a, bind); + tcf_idr_release(*a, bind); return err; } diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c index 73e44ce2a883..86d90fc5e97e 100644 --- a/net/sched/act_skbedit.c +++ b/net/sched/act_skbedit.c @@ -191,8 +191,7 @@ static int tcf_skbedit_init(struct net *net, struct nlattr *nla, params_new = kzalloc(sizeof(*params_new), GFP_KERNEL); if (unlikely(!params_new)) { - if (ret == ACT_P_CREATED) - tcf_idr_release(*a, bind); + tcf_idr_release(*a, bind); return -ENOMEM; } diff --git a/net/sched/act_tunnel_key.c b/net/sched/act_tunnel_key.c index 0f6601fdf889..72d9c432e8b4 100644 --- a/net/sched/act_tunnel_key.c +++ b/net/sched/act_tunnel_key.c @@ -377,7 +377,8 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla, return ret; release_tun_meta: - dst_release(&metadata->dst); + if (metadata) + dst_release(&metadata->dst); err_out: if (exists) diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index 84893bc67531..09b359784629 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -1213,6 +1213,24 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, if 
(err < 0) goto errout; + if (tb[TCA_FLOWER_FLAGS]) { + fnew->flags = nla_get_u32(tb[TCA_FLOWER_FLAGS]); + + if (!tc_flags_valid(fnew->flags)) { + err = -EINVAL; + goto errout; + } + } + + err = fl_set_parms(net, tp, fnew, mask, base, tb, tca[TCA_RATE], ovr, + tp->chain->tmplt_priv, extack); + if (err) + goto errout; + + err = fl_check_assign_mask(head, fnew, fold, mask); + if (err) + goto errout; + if (!handle) { handle = 1; err = idr_alloc_u32(&head->handle_idr, fnew, &handle, @@ -1223,37 +1241,19 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, handle, GFP_KERNEL); } if (err) - goto errout; + goto errout_mask; fnew->handle = handle; - if (tb[TCA_FLOWER_FLAGS]) { - fnew->flags = nla_get_u32(tb[TCA_FLOWER_FLAGS]); - - if (!tc_flags_valid(fnew->flags)) { - err = -EINVAL; - goto errout_idr; - } - } - - err = fl_set_parms(net, tp, fnew, mask, base, tb, tca[TCA_RATE], ovr, - tp->chain->tmplt_priv, extack); - if (err) - goto errout_idr; - - err = fl_check_assign_mask(head, fnew, fold, mask); - if (err) - goto errout_idr; - if (!tc_skip_sw(fnew->flags)) { if (!fold && fl_lookup(fnew->mask, &fnew->mkey)) { err = -EEXIST; - goto errout_mask; + goto errout_idr; } err = rhashtable_insert_fast(&fnew->mask->ht, &fnew->ht_node, fnew->mask->filter_ht_params); if (err) - goto errout_mask; + goto errout_idr; } if (!tc_skip_hw(fnew->flags)) { @@ -1290,12 +1290,13 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, kfree(mask); return 0; -errout_mask: - fl_mask_put(head, fnew->mask, false); - errout_idr: if (!fold) idr_remove(&head->handle_idr, fnew->handle); + +errout_mask: + fl_mask_put(head, fnew->mask, false); + errout: tcf_exts_destroy(&fnew->exts); kfree(fnew); diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index 69078c82963e..77b289da7763 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -68,7 +68,7 @@ static inline struct sk_buff *__skb_dequeue_bad_txq(struct Qdisc *q) skb = __skb_dequeue(&q->skb_bad_txq); if 
(qdisc_is_percpu_stats(q)) { qdisc_qstats_cpu_backlog_dec(q, skb); - qdisc_qstats_cpu_qlen_dec(q); + qdisc_qstats_atomic_qlen_dec(q); } else { qdisc_qstats_backlog_dec(q, skb); q->q.qlen--; @@ -108,7 +108,7 @@ static inline void qdisc_enqueue_skb_bad_txq(struct Qdisc *q, if (qdisc_is_percpu_stats(q)) { qdisc_qstats_cpu_backlog_inc(q, skb); - qdisc_qstats_cpu_qlen_inc(q); + qdisc_qstats_atomic_qlen_inc(q); } else { qdisc_qstats_backlog_inc(q, skb); q->q.qlen++; @@ -147,7 +147,7 @@ static inline int dev_requeue_skb_locked(struct sk_buff *skb, struct Qdisc *q) qdisc_qstats_cpu_requeues_inc(q); qdisc_qstats_cpu_backlog_inc(q, skb); - qdisc_qstats_cpu_qlen_inc(q); + qdisc_qstats_atomic_qlen_inc(q); skb = next; } @@ -252,7 +252,7 @@ static struct sk_buff *dequeue_skb(struct Qdisc *q, bool *validate, skb = __skb_dequeue(&q->gso_skb); if (qdisc_is_percpu_stats(q)) { qdisc_qstats_cpu_backlog_dec(q, skb); - qdisc_qstats_cpu_qlen_dec(q); + qdisc_qstats_atomic_qlen_dec(q); } else { qdisc_qstats_backlog_dec(q, skb); q->q.qlen--; @@ -633,7 +633,7 @@ static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc *qdisc, if (unlikely(err)) return qdisc_drop_cpu(skb, qdisc, to_free); - qdisc_qstats_cpu_qlen_inc(qdisc); + qdisc_qstats_atomic_qlen_inc(qdisc); /* Note: skb can not be used after skb_array_produce(), * so we better not use qdisc_qstats_cpu_backlog_inc() */ @@ -658,7 +658,7 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc) if (likely(skb)) { qdisc_qstats_cpu_backlog_dec(qdisc, skb); qdisc_bstats_cpu_update(qdisc, skb); - qdisc_qstats_cpu_qlen_dec(qdisc); + qdisc_qstats_atomic_qlen_dec(qdisc); } return skb; @@ -702,7 +702,6 @@ static void pfifo_fast_reset(struct Qdisc *qdisc) struct gnet_stats_queue *q = per_cpu_ptr(qdisc->cpu_qstats, i); q->backlog = 0; - q->qlen = 0; } } diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c index 74c0f656f28c..4dfe10b9f96c 100644 --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -440,6 +440,7 @@ static int 
netem_enqueue(struct sk_buff *skb, struct Qdisc *sch, int nb = 0; int count = 1; int rc = NET_XMIT_SUCCESS; + int rc_drop = NET_XMIT_DROP; /* Do not fool qdisc_drop_all() */ skb->prev = NULL; @@ -479,6 +480,7 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch, q->duplicate = 0; rootq->enqueue(skb2, rootq, to_free); q->duplicate = dupsave; + rc_drop = NET_XMIT_SUCCESS; } /* @@ -491,7 +493,7 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch, if (skb_is_gso(skb)) { segs = netem_segment(skb, sch, to_free); if (!segs) - return NET_XMIT_DROP; + return rc_drop; } else { segs = skb; } @@ -514,8 +516,10 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch, 1<<(prandom_u32() % 8); } - if (unlikely(sch->q.qlen >= sch->limit)) - return qdisc_drop_all(skb, sch, to_free); + if (unlikely(sch->q.qlen >= sch->limit)) { + qdisc_drop_all(skb, sch, to_free); + return rc_drop; + } qdisc_qstats_backlog_inc(sch, skb); diff --git a/net/sctp/socket.c b/net/sctp/socket.c index e5e70cff5bb3..1b16250c5718 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -1884,6 +1884,7 @@ static int sctp_sendmsg_check_sflags(struct sctp_association *asoc, pr_debug("%s: aborting association:%p\n", __func__, asoc); sctp_primitive_ABORT(net, asoc, chunk); + iov_iter_revert(&msg->msg_iter, msg_len); return 0; } diff --git a/net/sctp/stream.c b/net/sctp/stream.c index 2936ed17bf9e..3b47457862cc 100644 --- a/net/sctp/stream.c +++ b/net/sctp/stream.c @@ -230,8 +230,6 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt, for (i = 0; i < stream->outcnt; i++) SCTP_SO(stream, i)->state = SCTP_STREAM_OPEN; - sched->init(stream); - in: sctp_stream_interleave_init(stream); if (!incnt) diff --git a/net/smc/smc.h b/net/smc/smc.h index 5721416d0605..adbdf195eb08 100644 --- a/net/smc/smc.h +++ b/net/smc/smc.h @@ -113,9 +113,9 @@ struct smc_host_cdc_msg { /* Connection Data Control message */ } __aligned(8); enum smc_urg_state { - SMC_URG_VALID, /* 
data present */ - SMC_URG_NOTYET, /* data pending */ - SMC_URG_READ /* data was already read */ + SMC_URG_VALID = 1, /* data present */ + SMC_URG_NOTYET = 2, /* data pending */ + SMC_URG_READ = 3, /* data was already read */ }; struct smc_connection { diff --git a/net/socket.c b/net/socket.c index 7d2703f781fe..7a0ddf86620f 100644 --- a/net/socket.c +++ b/net/socket.c @@ -587,6 +587,7 @@ static void __sock_release(struct socket *sock, struct inode *inode) if (inode) inode_lock(inode); sock->ops->release(sock); + sock->sk = NULL; if (inode) inode_unlock(inode); sock->ops = NULL; diff --git a/net/tipc/socket.c b/net/tipc/socket.c index e1bdaf056c8f..88c307ef1318 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -377,11 +377,13 @@ static int tipc_sk_sock_err(struct socket *sock, long *timeout) #define tipc_wait_for_cond(sock_, timeo_, condition_) \ ({ \ + DEFINE_WAIT_FUNC(wait_, woken_wake_function); \ struct sock *sk_; \ int rc_; \ \ while ((rc_ = !(condition_))) { \ - DEFINE_WAIT_FUNC(wait_, woken_wake_function); \ + /* coupled with smp_wmb() in tipc_sk_proto_rcv() */ \ + smp_rmb(); \ sk_ = (sock_)->sk; \ rc_ = tipc_sk_sock_err((sock_), timeo_); \ if (rc_) \ @@ -1318,7 +1320,7 @@ static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dlen) if (unlikely(!dest)) { dest = &tsk->peer; - if (!syn || dest->family != AF_TIPC) + if (!syn && dest->family != AF_TIPC) return -EDESTADDRREQ; } @@ -1961,6 +1963,8 @@ static void tipc_sk_proto_rcv(struct sock *sk, return; case SOCK_WAKEUP: tipc_dest_del(&tsk->cong_links, msg_orignode(hdr), 0); + /* coupled with smp_rmb() in tipc_wait_for_cond() */ + smp_wmb(); tsk->cong_link_cnt--; wakeup = true; break; diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index c754f3a90a2e..f601933ad728 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -888,7 +888,7 @@ static int unix_autobind(struct socket *sock) addr->hash ^= sk->sk_type; __unix_remove_socket(sk); - u->addr = addr; + smp_store_release(&u->addr, 
addr); __unix_insert_socket(&unix_socket_table[addr->hash], sk); spin_unlock(&unix_table_lock); err = 0; @@ -1058,7 +1058,7 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) err = 0; __unix_remove_socket(sk); - u->addr = addr; + smp_store_release(&u->addr, addr); __unix_insert_socket(list, sk); out_unlock: @@ -1329,15 +1329,29 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, RCU_INIT_POINTER(newsk->sk_wq, &newu->peer_wq); otheru = unix_sk(other); - /* copy address information from listening to new sock*/ - if (otheru->addr) { - refcount_inc(&otheru->addr->refcnt); - newu->addr = otheru->addr; - } + /* copy address information from listening to new sock + * + * The contents of *(otheru->addr) and otheru->path + * are seen fully set up here, since we have found + * otheru in hash under unix_table_lock. Insertion + * into the hash chain we'd found it in had been done + * in an earlier critical area protected by unix_table_lock, + * the same one where we'd set *(otheru->addr) contents, + * as well as otheru->path and otheru->addr itself. + * + * Using smp_store_release() here to set newu->addr + * is enough to make those stores, as well as stores + * to newu->path visible to anyone who gets newu->addr + * by smp_load_acquire(). IOW, the same warranties + * as for unix_sock instances bound in unix_bind() or + * in unix_autobind(). 
+ */ if (otheru->path.dentry) { path_get(&otheru->path); newu->path = otheru->path; } + refcount_inc(&otheru->addr->refcnt); + smp_store_release(&newu->addr, otheru->addr); /* Set credentials */ copy_peercred(sk, other); @@ -1451,7 +1465,7 @@ static int unix_accept(struct socket *sock, struct socket *newsock, int flags, static int unix_getname(struct socket *sock, struct sockaddr *uaddr, int peer) { struct sock *sk = sock->sk; - struct unix_sock *u; + struct unix_address *addr; DECLARE_SOCKADDR(struct sockaddr_un *, sunaddr, uaddr); int err = 0; @@ -1466,19 +1480,15 @@ static int unix_getname(struct socket *sock, struct sockaddr *uaddr, int peer) sock_hold(sk); } - u = unix_sk(sk); - unix_state_lock(sk); - if (!u->addr) { + addr = smp_load_acquire(&unix_sk(sk)->addr); + if (!addr) { sunaddr->sun_family = AF_UNIX; sunaddr->sun_path[0] = 0; err = sizeof(short); } else { - struct unix_address *addr = u->addr; - err = addr->len; memcpy(sunaddr, addr->name, addr->len); } - unix_state_unlock(sk); sock_put(sk); out: return err; @@ -2071,11 +2081,11 @@ static int unix_seqpacket_recvmsg(struct socket *sock, struct msghdr *msg, static void unix_copy_addr(struct msghdr *msg, struct sock *sk) { - struct unix_sock *u = unix_sk(sk); + struct unix_address *addr = smp_load_acquire(&unix_sk(sk)->addr); - if (u->addr) { - msg->msg_namelen = u->addr->len; - memcpy(msg->msg_name, u->addr->name, u->addr->len); + if (addr) { + msg->msg_namelen = addr->len; + memcpy(msg->msg_name, addr->name, addr->len); } } @@ -2579,15 +2589,14 @@ static int unix_open_file(struct sock *sk) if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) return -EPERM; - unix_state_lock(sk); - path = unix_sk(sk)->path; - if (!path.dentry) { - unix_state_unlock(sk); + if (!smp_load_acquire(&unix_sk(sk)->addr)) + return -ENOENT; + + path = unix_sk(sk)->path; + if (!path.dentry) return -ENOENT; - } path_get(&path); - unix_state_unlock(sk); fd = get_unused_fd_flags(O_CLOEXEC); if (fd < 0) @@ -2828,7 +2837,7 @@ static 
int unix_seq_show(struct seq_file *seq, void *v) (s->sk_state == TCP_ESTABLISHED ? SS_CONNECTING : SS_DISCONNECTING), sock_i_ino(s)); - if (u->addr) { + if (u->addr) { // under unix_table_lock here int i, len; seq_putc(seq, ' '); diff --git a/net/unix/diag.c b/net/unix/diag.c index 384c84e83462..3183d9b8ab33 100644 --- a/net/unix/diag.c +++ b/net/unix/diag.c @@ -10,7 +10,8 @@ static int sk_diag_dump_name(struct sock *sk, struct sk_buff *nlskb) { - struct unix_address *addr = unix_sk(sk)->addr; + /* might or might not have unix_table_lock */ + struct unix_address *addr = smp_load_acquire(&unix_sk(sk)->addr); if (!addr) return 0; diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index 5d3cce9e8744..15eb5d3d4750 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -75,6 +75,9 @@ static u32 virtio_transport_get_local_cid(void) { struct virtio_vsock *vsock = virtio_vsock_get(); + if (!vsock) + return VMADDR_CID_ANY; + return vsock->guest_cid; } @@ -584,10 +587,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev) virtio_vsock_update_guest_cid(vsock); - ret = vsock_core_init(&virtio_transport.transport); - if (ret < 0) - goto out_vqs; - vsock->rx_buf_nr = 0; vsock->rx_buf_max_nr = 0; atomic_set(&vsock->queued_replies, 0); @@ -618,8 +617,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev) mutex_unlock(&the_virtio_vsock_mutex); return 0; -out_vqs: - vsock->vdev->config->del_vqs(vsock->vdev); out: kfree(vsock); mutex_unlock(&the_virtio_vsock_mutex); @@ -637,6 +634,9 @@ static void virtio_vsock_remove(struct virtio_device *vdev) flush_work(&vsock->event_work); flush_work(&vsock->send_pkt_work); + /* Reset all connected sockets when the device disappear */ + vsock_for_each_connected_socket(virtio_vsock_reset_sock); + vdev->config->reset(vdev); mutex_lock(&vsock->rx_lock); @@ -669,7 +669,6 @@ static void virtio_vsock_remove(struct virtio_device *vdev) mutex_lock(&the_virtio_vsock_mutex); 
the_virtio_vsock = NULL; - vsock_core_exit(); mutex_unlock(&the_virtio_vsock_mutex); vdev->config->del_vqs(vdev); @@ -702,14 +701,28 @@ static int __init virtio_vsock_init(void) virtio_vsock_workqueue = alloc_workqueue("virtio_vsock", 0, 0); if (!virtio_vsock_workqueue) return -ENOMEM; + ret = register_virtio_driver(&virtio_vsock_driver); if (ret) - destroy_workqueue(virtio_vsock_workqueue); + goto out_wq; + + ret = vsock_core_init(&virtio_transport.transport); + if (ret) + goto out_vdr; + + return 0; + +out_vdr: + unregister_virtio_driver(&virtio_vsock_driver); +out_wq: + destroy_workqueue(virtio_vsock_workqueue); return ret; + } static void __exit virtio_vsock_exit(void) { + vsock_core_exit(); unregister_virtio_driver(&virtio_vsock_driver); destroy_workqueue(virtio_vsock_workqueue); } diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c index fef473c736fa..f7f53f9ae7ef 100644 --- a/net/x25/af_x25.c +++ b/net/x25/af_x25.c @@ -679,8 +679,7 @@ static int x25_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) struct sockaddr_x25 *addr = (struct sockaddr_x25 *)uaddr; int len, i, rc = 0; - if (!sock_flag(sk, SOCK_ZAPPED) || - addr_len != sizeof(struct sockaddr_x25) || + if (addr_len != sizeof(struct sockaddr_x25) || addr->sx25_family != AF_X25) { rc = -EINVAL; goto out; @@ -695,9 +694,13 @@ static int x25_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) } lock_sock(sk); - x25_sk(sk)->source_addr = addr->sx25_addr; - x25_insert_socket(sk); - sock_reset_flag(sk, SOCK_ZAPPED); + if (sock_flag(sk, SOCK_ZAPPED)) { + x25_sk(sk)->source_addr = addr->sx25_addr; + x25_insert_socket(sk); + sock_reset_flag(sk, SOCK_ZAPPED); + } else { + rc = -EINVAL; + } release_sock(sk); SOCK_DEBUG(sk, "x25_bind: socket is bound\n"); out: @@ -813,8 +816,13 @@ static int x25_connect(struct socket *sock, struct sockaddr *uaddr, sock->state = SS_CONNECTED; rc = 0; out_put_neigh: - if (rc) + if (rc) { + read_lock_bh(&x25_list_lock); x25_neigh_put(x25->neighbour); + 
x25->neighbour = NULL; + read_unlock_bh(&x25_list_lock); + x25->state = X25_STATE_0; + } out_put_route: x25_route_put(rt); out: diff --git a/security/apparmor/domain.c b/security/apparmor/domain.c index 08c88de0ffda..11975ec8d566 100644 --- a/security/apparmor/domain.c +++ b/security/apparmor/domain.c @@ -1444,7 +1444,10 @@ int aa_change_profile(const char *fqname, int flags) new = aa_label_merge(label, target, GFP_KERNEL); if (IS_ERR_OR_NULL(new)) { info = "failed to build target label"; - error = PTR_ERR(new); + if (!new) + error = -ENOMEM; + else + error = PTR_ERR(new); new = NULL; perms.allow = 0; goto audit; diff --git a/security/lsm_audit.c b/security/lsm_audit.c index f84001019356..33028c098ef3 100644 --- a/security/lsm_audit.c +++ b/security/lsm_audit.c @@ -321,6 +321,7 @@ static void dump_common_audit_data(struct audit_buffer *ab, if (a->u.net->sk) { struct sock *sk = a->u.net->sk; struct unix_sock *u; + struct unix_address *addr; int len = 0; char *p = NULL; @@ -351,14 +352,15 @@ static void dump_common_audit_data(struct audit_buffer *ab, #endif case AF_UNIX: u = unix_sk(sk); + addr = smp_load_acquire(&u->addr); + if (!addr) + break; if (u->path.dentry) { audit_log_d_path(ab, " path=", &u->path); break; } - if (!u->addr) - break; - len = u->addr->len-sizeof(short); - p = &u->addr->name->sun_path[0]; + len = addr->len-sizeof(short); + p = &addr->name->sun_path[0]; audit_log_format(ab, " path="); if (*p) audit_log_untrustedstring(ab, p); diff --git a/sound/firewire/bebob/bebob.c b/sound/firewire/bebob/bebob.c index de4af8a41ff0..5636e89ce5c7 100644 --- a/sound/firewire/bebob/bebob.c +++ b/sound/firewire/bebob/bebob.c @@ -474,7 +474,19 @@ static const struct ieee1394_device_id bebob_id_table[] = { /* Focusrite, SaffirePro 26 I/O */ SND_BEBOB_DEV_ENTRY(VEN_FOCUSRITE, 0x00000003, &saffirepro_26_spec), /* Focusrite, SaffirePro 10 I/O */ - SND_BEBOB_DEV_ENTRY(VEN_FOCUSRITE, 0x00000006, &saffirepro_10_spec), + { + // The combination of vendor_id and model_id is 
the + // same as the one of Liquid Saffire 56. + .match_flags = IEEE1394_MATCH_VENDOR_ID | + IEEE1394_MATCH_MODEL_ID | + IEEE1394_MATCH_SPECIFIER_ID | + IEEE1394_MATCH_VERSION, + .vendor_id = VEN_FOCUSRITE, + .model_id = 0x000006, + .specifier_id = 0x00a02d, + .version = 0x010001, + .driver_data = (kernel_ulong_t)&saffirepro_10_spec, + }, /* Focusrite, Saffire(no label and LE) */ SND_BEBOB_DEV_ENTRY(VEN_FOCUSRITE, MODEL_FOCUSRITE_SAFFIRE_BOTH, &saffire_spec), diff --git a/sound/firewire/motu/amdtp-motu.c b/sound/firewire/motu/amdtp-motu.c index f0555a24d90e..6c9b743ea74b 100644 --- a/sound/firewire/motu/amdtp-motu.c +++ b/sound/firewire/motu/amdtp-motu.c @@ -136,7 +136,9 @@ static void read_pcm_s32(struct amdtp_stream *s, byte = (u8 *)buffer + p->pcm_byte_offset; for (c = 0; c < channels; ++c) { - *dst = (byte[0] << 24) | (byte[1] << 16) | byte[2]; + *dst = (byte[0] << 24) | + (byte[1] << 16) | + (byte[2] << 8); byte += 3; dst++; } diff --git a/sound/hda/hdac_i915.c b/sound/hda/hdac_i915.c index 617ff1aa818f..27eb0270a711 100644 --- a/sound/hda/hdac_i915.c +++ b/sound/hda/hdac_i915.c @@ -144,9 +144,9 @@ int snd_hdac_i915_init(struct hdac_bus *bus) return -ENODEV; if (!acomp->ops) { request_module("i915"); - /* 10s timeout */ + /* 60s timeout */ wait_for_completion_timeout(&bind_complete, - msecs_to_jiffies(10 * 1000)); + msecs_to_jiffies(60 * 1000)); } if (!acomp->ops) { dev_info(bus->dev, "couldn't bind with audio component\n"); diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c index fead0acb29f7..3cbd2119e148 100644 --- a/sound/pci/hda/patch_conexant.c +++ b/sound/pci/hda/patch_conexant.c @@ -936,6 +936,9 @@ static const struct snd_pci_quirk cxt5066_fixups[] = { SND_PCI_QUIRK(0x103c, 0x8299, "HP 800 G3 SFF", CXT_FIXUP_HP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x103c, 0x829a, "HP 800 G3 DM", CXT_FIXUP_HP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x103c, 0x8455, "HP Z2 G4", CXT_FIXUP_HP_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x103c, 0x8456, "HP Z2 
G4 SFF", CXT_FIXUP_HP_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x103c, 0x8457, "HP Z2 G4 mini", CXT_FIXUP_HP_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x103c, 0x8458, "HP Z2 G4 mini premium", CXT_FIXUP_HP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x138d, "Asus", CXT_FIXUP_HEADPHONE_MIC_PIN), SND_PCI_QUIRK(0x152d, 0x0833, "OLPC XO-1.5", CXT_FIXUP_OLPC_XO), SND_PCI_QUIRK(0x17aa, 0x20f2, "Lenovo T400", CXT_PINCFG_LENOVO_TP410), diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index bf1ffcaab23f..877293149e3a 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -118,6 +118,7 @@ struct alc_spec { unsigned int has_alc5505_dsp:1; unsigned int no_depop_delay:1; unsigned int done_hp_init:1; + unsigned int no_shutup_pins:1; /* for PLL fix */ hda_nid_t pll_nid; @@ -476,6 +477,14 @@ static void alc_auto_setup_eapd(struct hda_codec *codec, bool on) set_eapd(codec, *p, on); } +static void alc_shutup_pins(struct hda_codec *codec) +{ + struct alc_spec *spec = codec->spec; + + if (!spec->no_shutup_pins) + snd_hda_shutup_pins(codec); +} + /* generic shutup callback; * just turning off EAPD and a little pause for avoiding pop-noise */ @@ -486,7 +495,7 @@ static void alc_eapd_shutup(struct hda_codec *codec) alc_auto_setup_eapd(codec, false); if (!spec->no_depop_delay) msleep(200); - snd_hda_shutup_pins(codec); + alc_shutup_pins(codec); } /* generic EAPD initialization */ @@ -814,7 +823,7 @@ static inline void alc_shutup(struct hda_codec *codec) if (spec && spec->shutup) spec->shutup(codec); else - snd_hda_shutup_pins(codec); + alc_shutup_pins(codec); } static void alc_reboot_notify(struct hda_codec *codec) @@ -2950,7 +2959,7 @@ static void alc269_shutup(struct hda_codec *codec) (alc_get_coef0(codec) & 0x00ff) == 0x018) { msleep(150); } - snd_hda_shutup_pins(codec); + alc_shutup_pins(codec); } static struct coef_fw alc282_coefs[] = { @@ -3053,14 +3062,15 @@ static void alc282_shutup(struct hda_codec *codec) if (hp_pin_sense) msleep(85); - 
snd_hda_codec_write(codec, hp_pin, 0, - AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0); + if (!spec->no_shutup_pins) + snd_hda_codec_write(codec, hp_pin, 0, + AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0); if (hp_pin_sense) msleep(100); alc_auto_setup_eapd(codec, false); - snd_hda_shutup_pins(codec); + alc_shutup_pins(codec); alc_write_coef_idx(codec, 0x78, coef78); } @@ -3166,15 +3176,16 @@ static void alc283_shutup(struct hda_codec *codec) if (hp_pin_sense) msleep(100); - snd_hda_codec_write(codec, hp_pin, 0, - AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0); + if (!spec->no_shutup_pins) + snd_hda_codec_write(codec, hp_pin, 0, + AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0); alc_update_coef_idx(codec, 0x46, 0, 3 << 12); if (hp_pin_sense) msleep(100); alc_auto_setup_eapd(codec, false); - snd_hda_shutup_pins(codec); + alc_shutup_pins(codec); alc_write_coef_idx(codec, 0x43, 0x9614); } @@ -3240,14 +3251,15 @@ static void alc256_shutup(struct hda_codec *codec) /* NOTE: call this before clearing the pin, otherwise codec stalls */ alc_update_coef_idx(codec, 0x46, 0, 3 << 12); - snd_hda_codec_write(codec, hp_pin, 0, - AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0); + if (!spec->no_shutup_pins) + snd_hda_codec_write(codec, hp_pin, 0, + AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0); if (hp_pin_sense) msleep(100); alc_auto_setup_eapd(codec, false); - snd_hda_shutup_pins(codec); + alc_shutup_pins(codec); } static void alc225_init(struct hda_codec *codec) @@ -3334,7 +3346,7 @@ static void alc225_shutup(struct hda_codec *codec) msleep(100); alc_auto_setup_eapd(codec, false); - snd_hda_shutup_pins(codec); + alc_shutup_pins(codec); } static void alc_default_init(struct hda_codec *codec) @@ -3388,14 +3400,15 @@ static void alc_default_shutup(struct hda_codec *codec) if (hp_pin_sense) msleep(85); - snd_hda_codec_write(codec, hp_pin, 0, - AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0); + if (!spec->no_shutup_pins) + snd_hda_codec_write(codec, hp_pin, 0, + AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0); if (hp_pin_sense) msleep(100); 
alc_auto_setup_eapd(codec, false); - snd_hda_shutup_pins(codec); + alc_shutup_pins(codec); } static void alc294_hp_init(struct hda_codec *codec) @@ -3412,8 +3425,9 @@ static void alc294_hp_init(struct hda_codec *codec) msleep(100); - snd_hda_codec_write(codec, hp_pin, 0, - AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0); + if (!spec->no_shutup_pins) + snd_hda_codec_write(codec, hp_pin, 0, + AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0); alc_update_coef_idx(codec, 0x6f, 0x000f, 0);/* Set HP depop to manual mode */ alc_update_coefex_idx(codec, 0x58, 0x00, 0x8000, 0x8000); /* HP depop procedure start */ @@ -5007,16 +5021,12 @@ static void alc_fixup_auto_mute_via_amp(struct hda_codec *codec, } } -static void alc_no_shutup(struct hda_codec *codec) -{ -} - static void alc_fixup_no_shutup(struct hda_codec *codec, const struct hda_fixup *fix, int action) { if (action == HDA_FIXUP_ACT_PRE_PROBE) { struct alc_spec *spec = codec->spec; - spec->shutup = alc_no_shutup; + spec->no_shutup_pins = 1; } } @@ -5602,6 +5612,7 @@ enum { ALC294_FIXUP_ASUS_SPK, ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE, ALC285_FIXUP_LENOVO_PC_BEEP_IN_NOISE, + ALC255_FIXUP_ACER_HEADSET_MIC, }; static const struct hda_fixup alc269_fixups[] = { @@ -6546,6 +6557,16 @@ static const struct hda_fixup alc269_fixups[] = { .chained = true, .chain_id = ALC285_FIXUP_LENOVO_HEADPHONE_NOISE }, + [ALC255_FIXUP_ACER_HEADSET_MIC] = { + .type = HDA_FIXUP_PINS, + .v.pins = (const struct hda_pintbl[]) { + { 0x19, 0x03a11130 }, + { 0x1a, 0x90a60140 }, /* use as internal mic */ + { } + }, + .chained = true, + .chain_id = ALC255_FIXUP_HEADSET_MODE_NO_HP_MIC + }, }; static const struct snd_pci_quirk alc269_fixup_tbl[] = { @@ -6565,6 +6586,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1025, 0x128f, "Acer Veriton Z6860G", ALC286_FIXUP_ACER_AIO_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1025, 0x1290, "Acer Veriton Z4860G", ALC286_FIXUP_ACER_AIO_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1025, 0x1291, "Acer Veriton Z4660G", 
ALC286_FIXUP_ACER_AIO_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x1025, 0x1330, "Acer TravelMate X514-51T", ALC255_FIXUP_ACER_HEADSET_MIC), SND_PCI_QUIRK(0x1028, 0x0470, "Dell M101z", ALC269_FIXUP_DELL_M101Z), SND_PCI_QUIRK(0x1028, 0x054b, "Dell XPS one 2710", ALC275_FIXUP_DELL_XPS), SND_PCI_QUIRK(0x1028, 0x05bd, "Dell Latitude E6440", ALC292_FIXUP_DELL_E7X), @@ -6596,6 +6618,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1028, 0x0704, "Dell XPS 13 9350", ALC256_FIXUP_DELL_XPS_13_HEADPHONE_NOISE), SND_PCI_QUIRK(0x1028, 0x0706, "Dell Inspiron 7559", ALC256_FIXUP_DELL_INSPIRON_7559_SUBWOOFER), SND_PCI_QUIRK(0x1028, 0x0725, "Dell Inspiron 3162", ALC255_FIXUP_DELL_SPK_NOISE), + SND_PCI_QUIRK(0x1028, 0x0738, "Dell Precision 5820", ALC269_FIXUP_NO_SHUTUP), SND_PCI_QUIRK(0x1028, 0x075b, "Dell XPS 13 9360", ALC256_FIXUP_DELL_XPS_13_HEADPHONE_NOISE), SND_PCI_QUIRK(0x1028, 0x075c, "Dell XPS 27 7760", ALC298_FIXUP_SPK_VOLUME), SND_PCI_QUIRK(0x1028, 0x075d, "Dell AIO", ALC298_FIXUP_SPK_VOLUME), @@ -6670,11 +6693,13 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x2336, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC1), SND_PCI_QUIRK(0x103c, 0x2337, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC1), SND_PCI_QUIRK(0x103c, 0x221c, "HP EliteBook 755 G2", ALC280_FIXUP_HP_HEADSET_MIC), + SND_PCI_QUIRK(0x103c, 0x802e, "HP Z240 SFF", ALC221_FIXUP_HP_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x103c, 0x802f, "HP Z240", ALC221_FIXUP_HP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x103c, 0x820d, "HP Pavilion 15", ALC269_FIXUP_HP_MUTE_LED_MIC3), SND_PCI_QUIRK(0x103c, 0x8256, "HP", ALC221_FIXUP_HP_FRONT_MIC), SND_PCI_QUIRK(0x103c, 0x827e, "HP x360", ALC295_FIXUP_HP_X360), - SND_PCI_QUIRK(0x103c, 0x82bf, "HP", ALC221_FIXUP_HP_MIC_NO_PRESENCE), - SND_PCI_QUIRK(0x103c, 0x82c0, "HP", ALC221_FIXUP_HP_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x103c, 0x82bf, "HP G3 mini", ALC221_FIXUP_HP_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x103c, 0x82c0, "HP G3 mini premium", ALC221_FIXUP_HP_MIC_NO_PRESENCE), 
SND_PCI_QUIRK(0x103c, 0x83b9, "HP Spectre x360", ALC269_FIXUP_HP_MUTE_LED_MIC3), SND_PCI_QUIRK(0x1043, 0x103e, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC), SND_PCI_QUIRK(0x1043, 0x103f, "ASUS TX300", ALC282_FIXUP_ASUS_TX300), @@ -6690,7 +6715,6 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1043, 0x12e0, "ASUS X541SA", ALC256_FIXUP_ASUS_MIC), SND_PCI_QUIRK(0x1043, 0x13b0, "ASUS Z550SA", ALC256_FIXUP_ASUS_MIC), SND_PCI_QUIRK(0x1043, 0x1427, "Asus Zenbook UX31E", ALC269VB_FIXUP_ASUS_ZENBOOK), - SND_PCI_QUIRK(0x1043, 0x14a1, "ASUS UX533FD", ALC294_FIXUP_ASUS_SPK), SND_PCI_QUIRK(0x1043, 0x1517, "Asus Zenbook UX31A", ALC269VB_FIXUP_ASUS_ZENBOOK_UX31A), SND_PCI_QUIRK(0x1043, 0x16e3, "ASUS UX50", ALC269_FIXUP_STEREO_DMIC), SND_PCI_QUIRK(0x1043, 0x1a13, "Asus G73Jw", ALC269_FIXUP_ASUS_G73JW), @@ -7303,6 +7327,10 @@ static const struct snd_hda_pin_quirk alc269_pin_fixup_tbl[] = { {0x14, 0x90170110}, {0x1b, 0x90a70130}, {0x21, 0x04211020}), + SND_HDA_PIN_QUIRK(0x10ec0294, 0x1043, "ASUS", ALC294_FIXUP_ASUS_SPK, + {0x12, 0x90a60130}, + {0x17, 0x90170110}, + {0x21, 0x03211020}), SND_HDA_PIN_QUIRK(0x10ec0294, 0x1043, "ASUS", ALC294_FIXUP_ASUS_SPK, {0x12, 0x90a60130}, {0x17, 0x90170110}, diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c index b455930a3eaf..ec73d83d0d31 100644 --- a/tools/bpf/bpftool/map.c +++ b/tools/bpf/bpftool/map.c @@ -370,6 +370,20 @@ static char **parse_bytes(char **argv, const char *name, unsigned char *val, return argv + i; } +/* on per cpu maps we must copy the provided value on all value instances */ +static void fill_per_cpu_value(struct bpf_map_info *info, void *value) +{ + unsigned int i, n, step; + + if (!map_is_per_cpu(info->type)) + return; + + n = get_possible_cpus(); + step = round_up(info->value_size, 8); + for (i = 1; i < n; i++) + memcpy(value + i * step, value, info->value_size); +} + static int parse_elem(char **argv, struct bpf_map_info *info, void *key, void *value, __u32 key_size, __u32 value_size, 
__u32 *flags, __u32 **value_fd) @@ -449,6 +463,8 @@ static int parse_elem(char **argv, struct bpf_map_info *info, argv = parse_bytes(argv, "value", value, value_size); if (!argv) return -1; + + fill_per_cpu_value(info, value); } return parse_elem(argv, info, key, NULL, key_size, value_size, diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index 0de024a6cc2b..bbba0d61570f 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -109,13 +109,14 @@ static void print_boot_time(__u64 nsecs, char *buf, unsigned int size) static int prog_fd_by_tag(unsigned char *tag) { - struct bpf_prog_info info = {}; - __u32 len = sizeof(info); unsigned int id = 0; int err; int fd; while (true) { + struct bpf_prog_info info = {}; + __u32 len = sizeof(info); + err = bpf_prog_get_next_id(id, &id); if (err) { p_err("%s", strerror(errno)); diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c index 6c1e7ceedcf3..53c11fc0855e 100644 --- a/tools/perf/builtin-script.c +++ b/tools/perf/builtin-script.c @@ -1589,13 +1589,8 @@ static void perf_sample__fprint_metric(struct perf_script *script, .force_header = false, }; struct perf_evsel *ev2; - static bool init; u64 val; - if (!init) { - perf_stat__init_shadow_stats(); - init = true; - } if (!evsel->stats) perf_evlist__alloc_stats(script->session->evlist, false); if (evsel_script(evsel->leader)->gnum++ == 0) @@ -1658,7 +1653,7 @@ static void process_event(struct perf_script *script, return; } - if (PRINT_FIELD(TRACE)) { + if (PRINT_FIELD(TRACE) && sample->raw_data) { event_format__fprintf(evsel->tp_format, sample->cpu, sample->raw_data, sample->raw_size, fp); } @@ -2214,6 +2209,8 @@ static int __cmd_script(struct perf_script *script) signal(SIGINT, sig_handler); + perf_stat__init_shadow_stats(); + /* override event processing functions */ if (script->show_task_events) { script->tool.comm = process_comm_event; diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index 
22ab8e67c760..3f43aedb384d 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -2263,19 +2263,30 @@ static size_t trace__fprintf_thread_summary(struct trace *trace, FILE *fp); static bool perf_evlist__add_vfs_getname(struct perf_evlist *evlist) { - struct perf_evsel *evsel = perf_evsel__newtp("probe", "vfs_getname"); + bool found = false; + struct perf_evsel *evsel, *tmp; + struct parse_events_error err = { .idx = 0, }; + int ret = parse_events(evlist, "probe:vfs_getname*", &err); - if (IS_ERR(evsel)) + if (ret) return false; - if (perf_evsel__field(evsel, "pathname") == NULL) { + evlist__for_each_entry_safe(evlist, evsel, tmp) { + if (!strstarts(perf_evsel__name(evsel), "probe:vfs_getname")) + continue; + + if (perf_evsel__field(evsel, "pathname")) { + evsel->handler = trace__vfs_getname; + found = true; + continue; + } + + list_del_init(&evsel->node); + evsel->evlist = NULL; perf_evsel__delete(evsel); - return false; } - evsel->handler = trace__vfs_getname; - perf_evlist__add(evlist, evsel); - return true; + return found; } static struct perf_evsel *perf_evsel__new_pgfault(u64 config) diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c index 1ccbd3342069..383674f448fc 100644 --- a/tools/perf/util/cpumap.c +++ b/tools/perf/util/cpumap.c @@ -134,7 +134,12 @@ struct cpu_map *cpu_map__new(const char *cpu_list) if (!cpu_list) return cpu_map__read_all_cpu_map(); - if (!isdigit(*cpu_list)) + /* + * must handle the case of empty cpumap to cover + * TOPOLOGY header for NUMA nodes with no CPU + * ( e.g., because of CPU hotplug) + */ + if (!isdigit(*cpu_list) && *cpu_list != '\0') goto out; while (isdigit(*cpu_list)) { @@ -181,8 +186,10 @@ struct cpu_map *cpu_map__new(const char *cpu_list) if (nr_cpus > 0) cpus = cpu_map__trim_new(nr_cpus, tmp_cpus); - else + else if (*cpu_list != '\0') cpus = cpu_map__default_new(); + else + cpus = cpu_map__dummy_new(); invalid: free(tmp_cpus); out: diff --git a/tools/perf/util/symbol-elf.c 
b/tools/perf/util/symbol-elf.c index 6e70cc00c161..a701a8a48f00 100644 --- a/tools/perf/util/symbol-elf.c +++ b/tools/perf/util/symbol-elf.c @@ -87,6 +87,11 @@ static inline uint8_t elf_sym__type(const GElf_Sym *sym) return GELF_ST_TYPE(sym->st_info); } +static inline uint8_t elf_sym__visibility(const GElf_Sym *sym) +{ + return GELF_ST_VISIBILITY(sym->st_other); +} + #ifndef STT_GNU_IFUNC #define STT_GNU_IFUNC 10 #endif @@ -111,7 +116,9 @@ static inline int elf_sym__is_label(const GElf_Sym *sym) return elf_sym__type(sym) == STT_NOTYPE && sym->st_name != 0 && sym->st_shndx != SHN_UNDEF && - sym->st_shndx != SHN_ABS; + sym->st_shndx != SHN_ABS && + elf_sym__visibility(sym) != STV_HIDDEN && + elf_sym__visibility(sym) != STV_INTERNAL; } static bool elf_sym__filter(GElf_Sym *sym) diff --git a/tools/testing/selftests/bpf/bpf_util.h b/tools/testing/selftests/bpf/bpf_util.h index 315a44fa32af..84fd6f1bf33e 100644 --- a/tools/testing/selftests/bpf/bpf_util.h +++ b/tools/testing/selftests/bpf/bpf_util.h @@ -13,7 +13,7 @@ static inline unsigned int bpf_num_possible_cpus(void) unsigned int start, end, possible_cpus = 0; char buff[128]; FILE *fp; - int n; + int len, n, i, j = 0; fp = fopen(fcpu, "r"); if (!fp) { @@ -21,17 +21,27 @@ static inline unsigned int bpf_num_possible_cpus(void) exit(1); } - while (fgets(buff, sizeof(buff), fp)) { - n = sscanf(buff, "%u-%u", &start, &end); - if (n == 0) { - printf("Failed to retrieve # possible CPUs!\n"); - exit(1); - } else if (n == 1) { - end = start; - } - possible_cpus = start == 0 ? 
end + 1 : 0; - break; + if (!fgets(buff, sizeof(buff), fp)) { + printf("Failed to read %s!\n", fcpu); + exit(1); } + + len = strlen(buff); + for (i = 0; i <= len; i++) { + if (buff[i] == ',' || buff[i] == '\0') { + buff[i] = '\0'; + n = sscanf(&buff[j], "%u-%u", &start, &end); + if (n <= 0) { + printf("Failed to retrieve # possible CPUs!\n"); + exit(1); + } else if (n == 1) { + end = start; + } + possible_cpus += end - start + 1; + j = i + 1; + } + } + fclose(fp); return possible_cpus; diff --git a/tools/testing/selftests/cpu-hotplug/cpu-on-off-test.sh b/tools/testing/selftests/cpu-hotplug/cpu-on-off-test.sh index bab13dd025a6..0d26b5e3f966 100755 --- a/tools/testing/selftests/cpu-hotplug/cpu-on-off-test.sh +++ b/tools/testing/selftests/cpu-hotplug/cpu-on-off-test.sh @@ -37,6 +37,10 @@ prerequisite() exit $ksft_skip fi + present_cpus=`cat $SYSFS/devices/system/cpu/present` + present_max=${present_cpus##*-} + echo "present_cpus = $present_cpus present_max = $present_max" + echo -e "\t Cpus in online state: $online_cpus" offline_cpus=`cat $SYSFS/devices/system/cpu/offline` @@ -151,6 +155,8 @@ online_cpus=0 online_max=0 offline_cpus=0 offline_max=0 +present_cpus=0 +present_max=0 while getopts e:ahp: opt; do case $opt in @@ -190,9 +196,10 @@ if [ $allcpus -eq 0 ]; then online_cpu_expect_success $online_max if [[ $offline_cpus -gt 0 ]]; then - echo -e "\t offline to online to offline: cpu $offline_max" - online_cpu_expect_success $offline_max - offline_cpu_expect_success $offline_max + echo -e "\t offline to online to offline: cpu $present_max" + online_cpu_expect_success $present_max + offline_cpu_expect_success $present_max + online_cpu $present_max fi exit 0 else diff --git a/tools/testing/selftests/firmware/fw_lib.sh b/tools/testing/selftests/firmware/fw_lib.sh index 6c5f1b2ffb74..1cbb12e284a6 100755 --- a/tools/testing/selftests/firmware/fw_lib.sh +++ b/tools/testing/selftests/firmware/fw_lib.sh @@ -91,7 +91,7 @@ verify_reqs() if [ "$TEST_REQS_FW_SYSFS_FALLBACK" = 
"yes" ]; then if [ ! "$HAS_FW_LOADER_USER_HELPER" = "yes" ]; then echo "usermode helper disabled so ignoring test" - exit $ksft_skip + exit 0 fi fi } diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index 919aa2ac00af..9a3764a1084e 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -18,6 +18,6 @@ TEST_GEN_PROGS += reuseport_dualstack reuseaddr_conflict tls KSFT_KHDR_INSTALL := 1 include ../lib.mk -$(OUTPUT)/reuseport_bpf_numa: LDFLAGS += -lnuma +$(OUTPUT)/reuseport_bpf_numa: LDLIBS += -lnuma $(OUTPUT)/tcp_mmap: LDFLAGS += -lpthread $(OUTPUT)/tcp_inq: LDFLAGS += -lpthread diff --git a/tools/testing/selftests/netfilter/Makefile b/tools/testing/selftests/netfilter/Makefile index 47ed6cef93fb..c9ff2b47bd1c 100644 --- a/tools/testing/selftests/netfilter/Makefile +++ b/tools/testing/selftests/netfilter/Makefile @@ -1,6 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 # Makefile for netfilter selftests -TEST_PROGS := nft_trans_stress.sh +TEST_PROGS := nft_trans_stress.sh nft_nat.sh include ../lib.mk diff --git a/tools/testing/selftests/netfilter/config b/tools/testing/selftests/netfilter/config index 1017313e41a8..59caa8f71cd8 100644 --- a/tools/testing/selftests/netfilter/config +++ b/tools/testing/selftests/netfilter/config @@ -1,2 +1,2 @@ CONFIG_NET_NS=y -NF_TABLES_INET=y +CONFIG_NF_TABLES_INET=y diff --git a/tools/testing/selftests/netfilter/nft_nat.sh b/tools/testing/selftests/netfilter/nft_nat.sh new file mode 100755 index 000000000000..8ec76681605c --- /dev/null +++ b/tools/testing/selftests/netfilter/nft_nat.sh @@ -0,0 +1,762 @@ +#!/bin/bash +# +# This test is for basic NAT functionality: snat, dnat, redirect, masquerade. +# + +# Kselftest framework requirement - SKIP code is 4. +ksft_skip=4 +ret=0 + +nft --version > /dev/null 2>&1 +if [ $? -ne 0 ];then + echo "SKIP: Could not run test without nft tool" + exit $ksft_skip +fi + +ip -Version > /dev/null 2>&1 +if [ $? 
-ne 0 ];then + echo "SKIP: Could not run test without ip tool" + exit $ksft_skip +fi + +ip netns add ns0 +ip netns add ns1 +ip netns add ns2 + +ip link add veth0 netns ns0 type veth peer name eth0 netns ns1 +ip link add veth1 netns ns0 type veth peer name eth0 netns ns2 + +ip -net ns0 link set lo up +ip -net ns0 link set veth0 up +ip -net ns0 addr add 10.0.1.1/24 dev veth0 +ip -net ns0 addr add dead:1::1/64 dev veth0 + +ip -net ns0 link set veth1 up +ip -net ns0 addr add 10.0.2.1/24 dev veth1 +ip -net ns0 addr add dead:2::1/64 dev veth1 + +for i in 1 2; do + ip -net ns$i link set lo up + ip -net ns$i link set eth0 up + ip -net ns$i addr add 10.0.$i.99/24 dev eth0 + ip -net ns$i route add default via 10.0.$i.1 + ip -net ns$i addr add dead:$i::99/64 dev eth0 + ip -net ns$i route add default via dead:$i::1 +done + +bad_counter() +{ + local ns=$1 + local counter=$2 + local expect=$3 + + echo "ERROR: $counter counter in $ns has unexpected value (expected $expect)" 1>&2 + ip netns exec $ns nft list counter inet filter $counter 1>&2 +} + +check_counters() +{ + ns=$1 + local lret=0 + + cnt=$(ip netns exec $ns nft list counter inet filter ns0in | grep -q "packets 1 bytes 84") + if [ $? -ne 0 ]; then + bad_counter $ns ns0in "packets 1 bytes 84" + lret=1 + fi + cnt=$(ip netns exec $ns nft list counter inet filter ns0out | grep -q "packets 1 bytes 84") + if [ $? -ne 0 ]; then + bad_counter $ns ns0out "packets 1 bytes 84" + lret=1 + fi + + expect="packets 1 bytes 104" + cnt=$(ip netns exec $ns nft list counter inet filter ns0in6 | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter $ns ns0in6 "$expect" + lret=1 + fi + cnt=$(ip netns exec $ns nft list counter inet filter ns0out6 | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter $ns ns0out6 "$expect" + lret=1 + fi + + return $lret +} + +check_ns0_counters() +{ + local ns=$1 + local lret=0 + + cnt=$(ip netns exec ns0 nft list counter inet filter ns0in | grep -q "packets 0 bytes 0") + if [ $? 
-ne 0 ]; then + bad_counter ns0 ns0in "packets 0 bytes 0" + lret=1 + fi + + cnt=$(ip netns exec ns0 nft list counter inet filter ns0in6 | grep -q "packets 0 bytes 0") + if [ $? -ne 0 ]; then + bad_counter ns0 ns0in6 "packets 0 bytes 0" + lret=1 + fi + + cnt=$(ip netns exec ns0 nft list counter inet filter ns0out | grep -q "packets 0 bytes 0") + if [ $? -ne 0 ]; then + bad_counter ns0 ns0out "packets 0 bytes 0" + lret=1 + fi + cnt=$(ip netns exec ns0 nft list counter inet filter ns0out6 | grep -q "packets 0 bytes 0") + if [ $? -ne 0 ]; then + bad_counter ns0 ns0out6 "packets 0 bytes 0" + lret=1 + fi + + for dir in "in" "out" ; do + expect="packets 1 bytes 84" + cnt=$(ip netns exec ns0 nft list counter inet filter ${ns}${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns0 $ns$dir "$expect" + lret=1 + fi + + expect="packets 1 bytes 104" + cnt=$(ip netns exec ns0 nft list counter inet filter ${ns}${dir}6 | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns0 $ns$dir6 "$expect" + lret=1 + fi + done + + return $lret +} + +reset_counters() +{ + for i in 0 1 2;do + ip netns exec ns$i nft reset counters inet > /dev/null + done +} + +test_local_dnat6() +{ + local lret=0 +ip netns exec ns0 nft -f - < /dev/null + if [ $? -ne 0 ]; then + lret=1 + echo "ERROR: ping6 failed" + return $lret + fi + + expect="packets 0 bytes 0" + for dir in "in6" "out6" ; do + cnt=$(ip netns exec ns0 nft list counter inet filter ns1${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns0 ns1$dir "$expect" + lret=1 + fi + done + + expect="packets 1 bytes 104" + for dir in "in6" "out6" ; do + cnt=$(ip netns exec ns0 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns0 ns2$dir "$expect" + lret=1 + fi + done + + # expect 0 count in ns1 + expect="packets 0 bytes 0" + for dir in "in6" "out6" ; do + cnt=$(ip netns exec ns1 nft list counter inet filter ns0${dir} | grep -q "$expect") + if [ $? 
-ne 0 ]; then + bad_counter ns1 ns0$dir "$expect" + lret=1 + fi + done + + # expect 1 packet in ns2 + expect="packets 1 bytes 104" + for dir in "in6" "out6" ; do + cnt=$(ip netns exec ns2 nft list counter inet filter ns0${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns2 ns0$dir "$expect" + lret=1 + fi + done + + test $lret -eq 0 && echo "PASS: ipv6 ping to ns1 was NATted to ns2" + ip netns exec ns0 nft flush chain ip6 nat output + + return $lret +} + +test_local_dnat() +{ + local lret=0 +ip netns exec ns0 nft -f - < /dev/null + if [ $? -ne 0 ]; then + lret=1 + echo "ERROR: ping failed" + return $lret + fi + + expect="packets 0 bytes 0" + for dir in "in" "out" ; do + cnt=$(ip netns exec ns0 nft list counter inet filter ns1${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns0 ns1$dir "$expect" + lret=1 + fi + done + + expect="packets 1 bytes 84" + for dir in "in" "out" ; do + cnt=$(ip netns exec ns0 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns0 ns2$dir "$expect" + lret=1 + fi + done + + # expect 0 count in ns1 + expect="packets 0 bytes 0" + for dir in "in" "out" ; do + cnt=$(ip netns exec ns1 nft list counter inet filter ns0${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns1 ns0$dir "$expect" + lret=1 + fi + done + + # expect 1 packet in ns2 + expect="packets 1 bytes 84" + for dir in "in" "out" ; do + cnt=$(ip netns exec ns2 nft list counter inet filter ns0${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns2 ns0$dir "$expect" + lret=1 + fi + done + + test $lret -eq 0 && echo "PASS: ping to ns1 was NATted to ns2" + + ip netns exec ns0 nft flush chain ip nat output + + reset_counters + ip netns exec ns0 ping -q -c 1 10.0.1.99 > /dev/null + if [ $? 
-ne 0 ]; then + lret=1 + echo "ERROR: ping failed" + return $lret + fi + + expect="packets 1 bytes 84" + for dir in "in" "out" ; do + cnt=$(ip netns exec ns0 nft list counter inet filter ns1${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns1 ns1$dir "$expect" + lret=1 + fi + done + expect="packets 0 bytes 0" + for dir in "in" "out" ; do + cnt=$(ip netns exec ns0 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns0 ns2$dir "$expect" + lret=1 + fi + done + + # expect 1 count in ns1 + expect="packets 1 bytes 84" + for dir in "in" "out" ; do + cnt=$(ip netns exec ns1 nft list counter inet filter ns0${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns0 ns0$dir "$expect" + lret=1 + fi + done + + # expect 0 packet in ns2 + expect="packets 0 bytes 0" + for dir in "in" "out" ; do + cnt=$(ip netns exec ns2 nft list counter inet filter ns0${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns2 ns2$dir "$expect" + lret=1 + fi + done + + test $lret -eq 0 && echo "PASS: ping to ns1 OK after nat output chain flush" + + return $lret +} + + +test_masquerade6() +{ + local lret=0 + + ip netns exec ns0 sysctl net.ipv6.conf.all.forwarding=1 > /dev/null + + ip netns exec ns2 ping -q -c 1 dead:1::99 > /dev/null # ping ns2->ns1 + if [ $? -ne 0 ] ; then + echo "ERROR: cannot ping ns1 from ns2 via ipv6" + return 1 + lret=1 + fi + + expect="packets 1 bytes 104" + for dir in "in6" "out6" ; do + cnt=$(ip netns exec ns1 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns1 ns2$dir "$expect" + lret=1 + fi + + cnt=$(ip netns exec ns2 nft list counter inet filter ns1${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns2 ns1$dir "$expect" + lret=1 + fi + done + + reset_counters + +# add masquerading rule +ip netns exec ns0 nft -f - < /dev/null # ping ns2->ns1 + if [ $? 
-ne 0 ] ; then + echo "ERROR: cannot ping ns1 from ns2 with active ipv6 masquerading" + lret=1 + fi + + # ns1 should have seen packets from ns0, due to masquerade + expect="packets 1 bytes 104" + for dir in "in6" "out6" ; do + + cnt=$(ip netns exec ns1 nft list counter inet filter ns0${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns1 ns0$dir "$expect" + lret=1 + fi + + cnt=$(ip netns exec ns2 nft list counter inet filter ns1${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns2 ns1$dir "$expect" + lret=1 + fi + done + + # ns1 should not have seen packets from ns2, due to masquerade + expect="packets 0 bytes 0" + for dir in "in6" "out6" ; do + cnt=$(ip netns exec ns1 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns1 ns0$dir "$expect" + lret=1 + fi + + cnt=$(ip netns exec ns1 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns2 ns1$dir "$expect" + lret=1 + fi + done + + ip netns exec ns0 nft flush chain ip6 nat postrouting + if [ $? -ne 0 ]; then + echo "ERROR: Could not flush ip6 nat postrouting" 1>&2 + lret=1 + fi + + test $lret -eq 0 && echo "PASS: IPv6 masquerade for ns2" + + return $lret +} + +test_masquerade() +{ + local lret=0 + + ip netns exec ns0 sysctl net.ipv4.conf.veth0.forwarding=1 > /dev/null + ip netns exec ns0 sysctl net.ipv4.conf.veth1.forwarding=1 > /dev/null + + ip netns exec ns2 ping -q -c 1 10.0.1.99 > /dev/null # ping ns2->ns1 + if [ $? -ne 0 ] ; then + echo "ERROR: cannot ping ns1 from ns2" + lret=1 + fi + + expect="packets 1 bytes 84" + for dir in "in" "out" ; do + cnt=$(ip netns exec ns1 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns1 ns2$dir "$expect" + lret=1 + fi + + cnt=$(ip netns exec ns2 nft list counter inet filter ns1${dir} | grep -q "$expect") + if [ $? 
-ne 0 ]; then + bad_counter ns2 ns1$dir "$expect" + lret=1 + fi + done + + reset_counters + +# add masquerading rule +ip netns exec ns0 nft -f - < /dev/null # ping ns2->ns1 + if [ $? -ne 0 ] ; then + echo "ERROR: cannot ping ns1 from ns2 with active ip masquerading" + lret=1 + fi + + # ns1 should have seen packets from ns0, due to masquerade + expect="packets 1 bytes 84" + for dir in "in" "out" ; do + cnt=$(ip netns exec ns1 nft list counter inet filter ns0${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns1 ns0$dir "$expect" + lret=1 + fi + + cnt=$(ip netns exec ns2 nft list counter inet filter ns1${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns2 ns1$dir "$expect" + lret=1 + fi + done + + # ns1 should not have seen packets from ns2, due to masquerade + expect="packets 0 bytes 0" + for dir in "in" "out" ; do + cnt=$(ip netns exec ns1 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns1 ns0$dir "$expect" + lret=1 + fi + + cnt=$(ip netns exec ns1 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns2 ns1$dir "$expect" + lret=1 + fi + done + + ip netns exec ns0 nft flush chain ip nat postrouting + if [ $? -ne 0 ]; then + echo "ERROR: Could not flush nat postrouting" 1>&2 + lret=1 + fi + + test $lret -eq 0 && echo "PASS: IP masquerade for ns2" + + return $lret +} + +test_redirect6() +{ + local lret=0 + + ip netns exec ns0 sysctl net.ipv6.conf.all.forwarding=1 > /dev/null + + ip netns exec ns2 ping -q -c 1 dead:1::99 > /dev/null # ping ns2->ns1 + if [ $? -ne 0 ] ; then + echo "ERROR: cannot ping ns1 from ns2 via ipv6" + lret=1 + fi + + expect="packets 1 bytes 104" + for dir in "in6" "out6" ; do + cnt=$(ip netns exec ns1 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? 
-ne 0 ]; then + bad_counter ns1 ns2$dir "$expect" + lret=1 + fi + + cnt=$(ip netns exec ns2 nft list counter inet filter ns1${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns2 ns1$dir "$expect" + lret=1 + fi + done + + reset_counters + +# add redirect rule +ip netns exec ns0 nft -f - < /dev/null # ping ns2->ns1 + if [ $? -ne 0 ] ; then + echo "ERROR: cannot ping ns1 from ns2 with active ip6 redirect" + lret=1 + fi + + # ns1 should have seen no packets from ns2, due to redirection + expect="packets 0 bytes 0" + for dir in "in6" "out6" ; do + cnt=$(ip netns exec ns1 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns1 ns0$dir "$expect" + lret=1 + fi + done + + # ns0 should have seen packets from ns2, due to masquerade + expect="packets 1 bytes 104" + for dir in "in6" "out6" ; do + cnt=$(ip netns exec ns0 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns1 ns0$dir "$expect" + lret=1 + fi + done + + ip netns exec ns0 nft delete table ip6 nat + if [ $? -ne 0 ]; then + echo "ERROR: Could not delete ip6 nat table" 1>&2 + lret=1 + fi + + test $lret -eq 0 && echo "PASS: IPv6 redirection for ns2" + + return $lret +} + +test_redirect() +{ + local lret=0 + + ip netns exec ns0 sysctl net.ipv4.conf.veth0.forwarding=1 > /dev/null + ip netns exec ns0 sysctl net.ipv4.conf.veth1.forwarding=1 > /dev/null + + ip netns exec ns2 ping -q -c 1 10.0.1.99 > /dev/null # ping ns2->ns1 + if [ $? -ne 0 ] ; then + echo "ERROR: cannot ping ns1 from ns2" + lret=1 + fi + + expect="packets 1 bytes 84" + for dir in "in" "out" ; do + cnt=$(ip netns exec ns1 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns1 ns2$dir "$expect" + lret=1 + fi + + cnt=$(ip netns exec ns2 nft list counter inet filter ns1${dir} | grep -q "$expect") + if [ $? 
-ne 0 ]; then + bad_counter ns2 ns1$dir "$expect" + lret=1 + fi + done + + reset_counters + +# add redirect rule +ip netns exec ns0 nft -f - < /dev/null # ping ns2->ns1 + if [ $? -ne 0 ] ; then + echo "ERROR: cannot ping ns1 from ns2 with active ip redirect" + lret=1 + fi + + # ns1 should have seen no packets from ns2, due to redirection + expect="packets 0 bytes 0" + for dir in "in" "out" ; do + + cnt=$(ip netns exec ns1 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns1 ns0$dir "$expect" + lret=1 + fi + done + + # ns0 should have seen packets from ns2, due to masquerade + expect="packets 1 bytes 84" + for dir in "in" "out" ; do + cnt=$(ip netns exec ns0 nft list counter inet filter ns2${dir} | grep -q "$expect") + if [ $? -ne 0 ]; then + bad_counter ns1 ns0$dir "$expect" + lret=1 + fi + done + + ip netns exec ns0 nft delete table ip nat + if [ $? -ne 0 ]; then + echo "ERROR: Could not delete nat table" 1>&2 + lret=1 + fi + + test $lret -eq 0 && echo "PASS: IP redirection for ns2" + + return $lret +} + + +# ip netns exec ns0 ping -c 1 -q 10.0.$i.99 +for i in 0 1 2; do +ip netns exec ns$i nft -f - < /dev/null + if [ $? -ne 0 ];then + echo "ERROR: Could not reach other namespace(s)" 1>&2 + ret=1 + fi + + ip netns exec ns0 ping -c 1 -q dead:$i::99 > /dev/null + if [ $? -ne 0 ];then + echo "ERROR: Could not reach other namespace(s) via ipv6" 1>&2 + ret=1 + fi + check_counters ns$i + if [ $? -ne 0 ]; then + ret=1 + fi + + check_ns0_counters ns$i + if [ $? 
-ne 0 ]; then + ret=1 + fi + reset_counters +done + +if [ $ret -eq 0 ];then + echo "PASS: netns routing/connectivity: ns0 can reach ns1 and ns2" +fi + +reset_counters +test_local_dnat +test_local_dnat6 + +reset_counters +test_masquerade +test_masquerade6 + +reset_counters +test_redirect +test_redirect6 + +for i in 0 1 2; do ip netns del ns$i;done + +exit $ret diff --git a/tools/testing/selftests/proc/.gitignore b/tools/testing/selftests/proc/.gitignore index 82121a81681f..29bac5ef9a93 100644 --- a/tools/testing/selftests/proc/.gitignore +++ b/tools/testing/selftests/proc/.gitignore @@ -10,4 +10,5 @@ /proc-uptime-002 /read /self +/setns-dcache /thread-self diff --git a/tools/testing/selftests/proc/Makefile b/tools/testing/selftests/proc/Makefile index 1c12c34cf85d..434d033ee067 100644 --- a/tools/testing/selftests/proc/Makefile +++ b/tools/testing/selftests/proc/Makefile @@ -14,6 +14,7 @@ TEST_GEN_PROGS += proc-uptime-001 TEST_GEN_PROGS += proc-uptime-002 TEST_GEN_PROGS += read TEST_GEN_PROGS += self +TEST_GEN_PROGS += setns-dcache TEST_GEN_PROGS += thread-self include ../lib.mk diff --git a/tools/testing/selftests/proc/setns-dcache.c b/tools/testing/selftests/proc/setns-dcache.c new file mode 100644 index 000000000000..60ab197a73fc --- /dev/null +++ b/tools/testing/selftests/proc/setns-dcache.c @@ -0,0 +1,129 @@ +/* + * Copyright © 2019 Alexey Dobriyan + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ +/* + * Test that setns(CLONE_NEWNET) points to new /proc/net content even + * if old one is in dcache. + * + * FIXME /proc/net/unix is under CONFIG_UNIX which can be disabled. + */ +#undef NDEBUG +#include <assert.h> +#include <errno.h> +#include <sched.h> +#include <signal.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <fcntl.h> +#include <sys/socket.h> + +static pid_t pid = -1; + +static void f(void) +{ + if (pid > 0) { + kill(pid, SIGTERM); + } +} + +int main(void) +{ + int fd[2]; + char _ = 0; + int nsfd; + + atexit(f); + + /* Check for privileges and syscall availability straight away. */ + if (unshare(CLONE_NEWNET) == -1) { + if (errno == ENOSYS || errno == EPERM) { + return 4; + } + return 1; + } + /* Distinguisher between two otherwise empty net namespaces. */ + if (socket(AF_UNIX, SOCK_STREAM, 0) == -1) { + return 1; + } + + if (pipe(fd) == -1) { + return 1; + } + + pid = fork(); + if (pid == -1) { + return 1; + } + + if (pid == 0) { + if (unshare(CLONE_NEWNET) == -1) { + return 1; + } + + if (write(fd[1], &_, 1) != 1) { + return 1; + } + + pause(); + + return 0; + } + + if (read(fd[0], &_, 1) != 1) { + return 1; + } + + { + char buf[64]; + snprintf(buf, sizeof(buf), "/proc/%u/ns/net", pid); + nsfd = open(buf, O_RDONLY); + if (nsfd == -1) { + return 1; + } + } + + /* Reliably pin dentry into dcache. 
*/ + (void)open("/proc/net/unix", O_RDONLY); + + if (setns(nsfd, CLONE_NEWNET) == -1) { + return 1; + } + + kill(pid, SIGTERM); + pid = 0; + + { + char buf[4096]; + ssize_t rv; + int fd; + + fd = open("/proc/net/unix", O_RDONLY); + if (fd == -1) { + return 1; + } + +#define S "Num RefCount Protocol Flags Type St Inode Path\n" + rv = read(fd, buf, sizeof(buf)); + + assert(rv == strlen(S)); + assert(memcmp(buf, S, strlen(S)) == 0); + } + + return 0; +} diff --git a/tools/testing/selftests/timers/Makefile b/tools/testing/selftests/timers/Makefile index c02683cfb6c9..7656c7ce79d9 100644 --- a/tools/testing/selftests/timers/Makefile +++ b/tools/testing/selftests/timers/Makefile @@ -1,6 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 CFLAGS += -O3 -Wl,-no-as-needed -Wall -LDFLAGS += -lrt -lpthread -lm +LDLIBS += -lrt -lpthread -lm # these are all "safe" tests that don't modify # system time or require escalated privileges