Commit Graph

239 Commits

Author SHA1 Message Date
Ravi Joshi
b7dc3b5420 WLAN subsystem: Sysctl support for key TCP/IP parameters
Default values for some key TCP/IP parameters have been observed to
affect the throughput and performance of the system. Hence, extend the
configuration capabilities of the TCP/IP stack through the sysctl
interface.

Change-Id: I4287e9103769535f43e0934bac08435a524ee6a4
CRs-Fixed: 507581
Signed-off-by: Ravi Joshi <ravij@codeaurora.org>
Signed-off-by: Ganesh Babu Kumaravel <kganesh@codeaurora.org>
Signed-off-by: Mohit Khanna <mkhannaqca@codeaurora.org>
Signed-off-by: Manjunathappa Prakash <prakashpm@codeaurora.org>
Signed-off-by: Sandeep Singh <sandsing@codeaurora.org>
2020-03-04 18:35:36 +05:30
Subash Abhinov Kasiviswanathan
f094f0c406 net: ipv4: Remove tcp_default_init_rwnd
This reverts commit 1e9a9ffbbd ("RFC: ANDROID: net: ipv4: tcp:
Namespace-ify sysctl_tcp_default_init_rwnd") and
commit 89e8c70306 ("ANDROID: net: ipv4: tcp: add a sysctl to config
the tcp_default_init_rwnd")

This feature has been deprecated.

CRs-Fixed: 2500636
Change-Id: Ie831694ab897e2b267b8208833f5a2b28c1c2abe
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
2019-08-02 00:42:21 -06:00
Ivaylo Georgiev
1d0720259a Merge android-4.19.52 (d9bd265) into msm-4.19
* refs/heads/tmp-d9bd265:
  Linux 4.19.52
  tcp: enforce tcp_min_snd_mss in tcp_mtu_probing()
  tcp: add tcp_min_snd_mss sysctl
  tcp: tcp_fragment() should apply sane memory limits
  tcp: limit payload size of sacked skbs

Change-Id: I01c304645f63cde6b87926bb01c7b2112db2042d
Signed-off-by: Ivaylo Georgiev <irgeorgiev@codeaurora.org>
2019-07-18 07:44:02 -07:00
Eric Dumazet
7f9f8a37e5 tcp: add tcp_min_snd_mss sysctl
commit 5f3e2bf008c2221478101ee72f5cb4654b9fc363 upstream.

Some TCP peers announce a very small MSS option in their SYN and/or
SYN/ACK messages.

This forces the stack to send packets with a very high network/cpu
overhead.

Linux has enforced a minimal value of 48. Since this value includes
the size of TCP options, and the options can consume up to 40 bytes,
each segment can include as little as 8 bytes of payload.

In some cases, it can be useful to increase the minimal value
to a saner value.

We still leave the default at 48 (TCP_MIN_SND_MSS), for compatibility
reasons.

Note that TCP_MAXSEG socket option enforces a minimal value
of (TCP_MIN_MSS). David Miller increased this minimal value
in commit c39508d6f1 ("tcp: Make TCP_MAXSEG minimum more correct.")
from 64 to 88.

We might in the future merge TCP_MIN_SND_MSS and TCP_MIN_MSS.

CVE-2019-11479 -- tcp mss hardcoded to 48

Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Jonathan Looney <jtl@netflix.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Bruce Curtis <brucec@netflix.com>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-17 19:51:56 +02:00
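
The payload arithmetic in the tcp_min_snd_mss commit message above can be checked with a short sketch. This is illustrative Python, not kernel code: the values 48 and 40 come from the commit message, while the function names and the 1000-byte and 536-byte figures are assumptions chosen for the example.

```python
# Illustrative arithmetic only; not taken from the kernel sources.
TCP_MIN_SND_MSS = 48        # default floor named in the commit message
MAX_TCP_OPTION_BYTES = 40   # options can consume up to 40 bytes


def payload_per_segment(mss: int, option_bytes: int = MAX_TCP_OPTION_BYTES) -> int:
    """Bytes of payload left in a segment once TCP options are subtracted."""
    return max(mss - option_bytes, 0)


def segments_needed(total_bytes: int, mss: int) -> int:
    """Segments required to carry total_bytes when options are at their maximum."""
    payload = payload_per_segment(mss)
    if payload == 0:
        raise ValueError("MSS leaves no room for payload")
    return -(-total_bytes // payload)  # ceiling division


if __name__ == "__main__":
    # Worst case at the default floor versus a hypothetical larger floor.
    print(payload_per_segment(TCP_MIN_SND_MSS))
    print(segments_needed(1000, TCP_MIN_SND_MSS))
    print(segments_needed(1000, 536))
```

Under these assumptions, raising the floor sharply reduces the number of segments (and therefore the per-packet overhead) needed for the same amount of data.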
Ivaylo Georgiev
437cc3700f Merge android-4.19.38 (5e7b4fb) into msm-4.19
* refs/heads/tmp-5e7b4fb:
  Linux 4.19.38
  powerpc/fsl: Add FSL_PPC_BOOK3E as supported arch for nospectre_v2 boot arg
  net/tls: don't leak IV and record seq when offload fails
  net/tls: avoid potential deadlock in tls_set_device_offload_rx()
  net/mlx5e: Fix use-after-free after xdp_return_frame
  net/mlx5e: Fix the max MTU check in case of XDP
  mlxsw: spectrum: Put MC TCs into DWRR mode
  mlxsw: pci: Reincrease PCI reset timeout
  net: hns: Fix WARNING when hns modules installed
  team: fix possible recursive locking when add slaves
  stmmac: pci: Adjust IOT2000 matching
  net/tls: fix refcount adjustment in fallback
  net: stmmac: move stmmac_check_ether_addr() to driver probe
  net/rose: fix unbound loop in rose_loopback_timer()
  net: rds: exchange of 8K and 1M pool
  net/mlx5e: ethtool, Remove unsupported SFP EEPROM high pages query
  mlxsw: spectrum: Fix autoneg status in ethtool
  ipv4: set the tcp_min_rtt_wlen range from 0 to one day
  ipv4: add sanity checks in ipv4_link_failure()
  x86/fpu: Don't export __kernel_fpu_{begin,end}()
  mm: Fix warning in insert_pfn()
  x86/retpolines: Disable switch jump tables when retpolines are enabled
  x86, retpolines: Raise limit for generating indirect calls from switch-case
  Fix aio_poll() races
  aio: store event at final iocb_put()
  aio: keep io_event in aio_kiocb
  aio: fold lookup_kiocb() into its sole caller
  pin iocb through aio.
  aio: simplify - and fix - fget/fput for io_submit()
  aio: initialize kiocb private in case any filesystems expect it.
  aio: abstract out io_event filler helper
  aio: split out iocb copy from io_submit_one()
  aio: use iocb_put() instead of open coding it
  aio: don't zero entire aio_kiocb aio_get_req()
  aio: separate out ring reservation from req allocation
  aio: use assigned completion handler
  aio: clear IOCB_HIPRI
  rxrpc: fix race condition in rxrpc_input_packet()
  net/rds: Check address length before reading address family
  net: netrom: Fix error cleanup path of nr_proto_init
  tipc: check link name with right length in tipc_nl_compat_link_set
  tipc: check bearer name with right length in tipc_nl_compat_bearer_enable
  fm10k: Fix a potential NULL pointer dereference
  netfilter: ebtables: CONFIG_COMPAT: drop a bogus WARN_ON
  NFS: Forbid setting AF_INET6 to "struct sockaddr_in"->sin_family.
  sched/deadline: Correctly handle active 0-lag timers
  binder: fix handling of misaligned binder object
  workqueue: Try to catch flush_work() without INIT_WORK().
  fs/proc/proc_sysctl.c: Fix a NULL pointer dereference
  intel_th: gth: Fix an off-by-one in output unassigning
  slip: make slhc_free() silently accept an error pointer
  USB: Consolidate LPM checks to avoid enabling LPM twice
  USB: Add new USB LPM helpers
  drm/vc4: Fix compilation error reported by kbuild test bot
  Revert "drm/i915/fbdev: Actually configure untiled displays"
  drm/vc4: Fix memory leak during gpu reset.
  powerpc/mm/radix: Make Radix require HUGETLB_PAGE
  ARM: 8857/1: efi: enable CP15 DMB instructions before cleaning the cache
  dmaengine: sh: rcar-dmac: Fix glitch in dmaengine_tx_status
  dmaengine: sh: rcar-dmac: With cyclic DMA residue 0 is valid
  vfio/type1: Limit DMA mappings per container
  Input: synaptics-rmi4 - write config register values to the right offset
  perf/x86/intel: Update KBL Package C-state events to also include PC8/PC9/PC10 counters
  sunrpc: don't mark uninitialised items as VALID.
  nfsd: Don't release the callback slot unless it was actually held
  ceph: fix ci->i_head_snapc leak
  ceph: ensure d_name stability in ceph_dentry_hash()
  ceph: only use d_name directly when parent is locked
  sched/numa: Fix a possible divide-by-zero
  RDMA/mlx5: Do not allow the user to write to the clock page
  IB/rdmavt: Fix frwr memory registration
  trace: Fix preempt_enable_no_resched() abuse
  MIPS: scall64-o32: Fix indirect syscall number load
  lib/Kconfig.debug: fix build error without CONFIG_BLOCK
  zram: pass down the bvec we need to read into in the work struct
  gpio: eic: sprd: Fix incorrect irq type setting for the sync EIC
  tracing: Fix buffer_ref pipe ops
  tracing: Fix a memory leak by early error exit in trace_pid_write()
  cifs: do not attempt cifs operation on smb2+ rename error
  cifs: fix memory leak in SMB2_read
  net: dsa: mv88e6xxx: add call to mv88e6xxx_ports_cmode_init to probe for new DSA framework
  ALSA: hda/ca0132 - Fix build error without CONFIG_PCI
  powerpc/vdso32: fix CLOCK_MONOTONIC on PPC64
  ipvs: fix warning on unused variable
  vsock/virtio: fix kernel panic from virtio_transport_reset_no_sock
  drm/rockchip: fix for mailbox read validation.
  loop: do not print warn message if partition scan is successful
  tipc: handle the err returned from cmd header function
  ext4: fix some error pointer dereferences
  net: mvpp2: fix validate for PPv2.1
  net/ibmvnic: Fix RTNL deadlock during device reset
  netfilter: nf_tables: bogus EBUSY in helper removal from transaction
  netfilter: nf_tables: bogus EBUSY when deleting set after flush
  netfilter: nf_tables: fix set double-free in abort path
  netfilter: nft_compat: use .release_ops and remove list of extension
  netfilter: nft_compat: don't use refcount_inc on newly allocated entry
  netfilter: nf_tables: unbind set in rule from commit path
  netfilter: nf_tables: warn when expr implements only one of activate/deactivate
  netfilter: nft_compat: destroy function must not have side effects
  netfilter: nf_tables: split set destruction in deactivate and destroy phase
  netfilter: nft_compat: make lists per netns
  netfilter: nft_compat: use refcnt_t type for nft_xt reference count

Change-Id: I5ac7e5185c3b9f2264d850549df4978946ffcd50
Signed-off-by: Ivaylo Georgiev <irgeorgiev@codeaurora.org>
2019-05-31 06:00:28 -07:00
ZhangXiaoxu
250e51f856 ipv4: set the tcp_min_rtt_wlen range from 0 to one day
[ Upstream commit 19fad20d15a6494f47f85d869f00b11343ee5c78 ]

There is a UBSAN report as below:
UBSAN: Undefined behaviour in net/ipv4/tcp_input.c:2877:56
signed integer overflow:
2147483647 * 1000 cannot be represented in type 'int'
CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.1.0-rc4-00058-g582549e #1
Call Trace:
 <IRQ>
 dump_stack+0x8c/0xba
 ubsan_epilogue+0x11/0x60
 handle_overflow+0x12d/0x170
 ? ttwu_do_wakeup+0x21/0x320
 __ubsan_handle_mul_overflow+0x12/0x20
 tcp_ack_update_rtt+0x76c/0x780
 tcp_clean_rtx_queue+0x499/0x14d0
 tcp_ack+0x69e/0x1240
 ? __wake_up_sync_key+0x2c/0x50
 ? update_group_capacity+0x50/0x680
 tcp_rcv_established+0x4e2/0xe10
 tcp_v4_do_rcv+0x22b/0x420
 tcp_v4_rcv+0xfe8/0x1190
 ip_protocol_deliver_rcu+0x36/0x180
 ip_local_deliver+0x15b/0x1a0
 ip_rcv+0xac/0xd0
 __netif_receive_skb_one_core+0x7f/0xb0
 __netif_receive_skb+0x33/0xc0
 netif_receive_skb_internal+0x84/0x1c0
 napi_gro_receive+0x2a0/0x300
 receive_buf+0x3d4/0x2350
 ? detach_buf_split+0x159/0x390
 virtnet_poll+0x198/0x840
 ? reweight_entity+0x243/0x4b0
 net_rx_action+0x25c/0x770
 __do_softirq+0x19b/0x66d
 irq_exit+0x1eb/0x230
 do_IRQ+0x7a/0x150
 common_interrupt+0xf/0xf
 </IRQ>

It can be reproduced by:
  echo 2147483647 > /proc/sys/net/ipv4/tcp_min_rtt_wlen

Fixes: f672258391 ("tcp: track min RTT using windowed min-filter")
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-02 09:58:59 +02:00
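
The overflow reported in the tcp_min_rtt_wlen commit above can be reproduced as plain arithmetic. The sketch below is a Python model, not the kernel fix: the factor 1000 is taken from the UBSAN report, treating it as the tick rate `HZ` is an assumption, and the helper names are hypothetical.

```python
# Model of the signed 32-bit multiplication flagged by UBSAN.
INT_MIN = -2**31
INT_MAX = 2**31 - 1
HZ = 1000                       # assumption: the "* 1000" in the report
ONE_DAY_SECS = 24 * 60 * 60     # upper bound named in the commit subject


def fits_in_int(value: int) -> bool:
    """True if value is representable as a signed 32-bit integer."""
    return INT_MIN <= value <= INT_MAX


def wlen_overflows(wlen_secs: int, hz: int = HZ) -> bool:
    """True if wlen_secs * hz would overflow a signed 32-bit int."""
    return not fits_in_int(wlen_secs * hz)


if __name__ == "__main__":
    print(wlen_overflows(2147483647))    # value from the reproducer
    print(wlen_overflows(ONE_DAY_SECS))  # largest value in the new range
```

With the sysctl limited to the range 0 to one day, the largest product is 86,400,000, which stays well inside the signed 32-bit range.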
Subash Abhinov Kasiviswanathan
0cbb5c42e1 net: Fail explicit bind to local reserved ports
Reserved ports may have some special use cases which are not suitable
for use by general userspace applications. Currently, ports specified
in ip_local_reserved_ports are excluded only from automatic port
assignment; an explicit bind to them still succeeds.

Add a boolean sysctl flag 'reserved_port_bind'. Default value is 1
which preserves the existing behavior. Setting the value to 0 will
prevent userspace applications from binding to these ports even when
they are explicitly requested.

CRs-Fixed: 2333588
Change-Id: Ib1071ca5bd437cd3c4f71b56147e4858f3b9ebec
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
2018-10-29 16:31:38 -06:00
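
The behavior described in the reserved_port_bind commit above can be modeled as a small decision function. This is a hypothetical Python model of the semantics stated in the commit message, not the kernel implementation; the function name, parameters, and example port numbers are assumptions.

```python
# Hypothetical model of the 'reserved_port_bind' semantics.
def bind_allowed(port: int, explicit: bool, reserved_ports: set,
                 reserved_port_bind: int = 1) -> bool:
    """Return True if a bind to `port` would be permitted in this model.

    explicit=False models automatic port assignment, which never hands
    out a reserved port. explicit=True models a caller asking for the
    port by number, which is governed by the sysctl flag.
    """
    if port not in reserved_ports:
        return True
    if not explicit:
        return False
    # Default 1 preserves existing behavior; 0 blocks explicit binds too.
    return reserved_port_bind != 0


if __name__ == "__main__":
    reserved = {8080, 9000}  # example contents of ip_local_reserved_ports
    print(bind_allowed(8080, explicit=True, reserved_ports=reserved))
    print(bind_allowed(8080, explicit=True, reserved_ports=reserved,
                       reserved_port_bind=0))
```

Only the last branch changes with the new flag: non-reserved ports and automatic assignment behave the same for either value.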
Rishabh Bhatnagar
b3953ede2a Merge remote-tracking branch 'origin/tmp-35a7f35' into msm-kona
* origin/tmp-35a7f35:
  Linux 4.19-rc8
  KVM: vmx: hyper-v: don't pass EPT configuration info to vmx_hv_remote_flush_tlb()
  ubifs: Fix WARN_ON logic in exit path
  fs/fat/fatent.c: add cond_resched() to fat_count_free_clusters()
  mm/thp: fix call to mmu_notifier in set_pmd_migration_entry() v2
  mm/mmap.c: don't clobber partially overlapping VMA with MAP_FIXED_NOREPLACE
  ocfs2: fix a GCC warning
  afs: Fix afs_server struct leak
  MAINTAINERS: use the correct location for dt-bindings includes for mux
  mux: adgs1408: use the correct MODULE_LICENSE
  gfs2: Fix iomap buffered write support for journaled files (2)
  arm64: perf: Reject stand-alone CHAIN events for PMUv3
  arm64: Fix /proc/iomem for reserved but not memory regions
  afs: Fix cell proc list
  lib/bch: fix possible stack overrun
  net: dsa: bcm_sf2: Call setup during switch resume
  net: dsa: bcm_sf2: Fix unbind ordering
  vmlinux.lds.h: Fix linker warnings about orphan .LPBX sections
  vmlinux.lds.h: Fix incomplete .text.exit discards
  i2c: Fix kerneldoc for renamed i2c dma put function
  blk-wbt: wake up all when we scale up, not down
  net: phy: sfp: remove sfp_mutex's definition
  r8169: set RX_MULTI_EN bit in RxConfig for 8168F-family chips
  net: socionext: clear rx irq correctly
  net/mlx4_core: Fix warnings during boot on driverinit param set failures
  tipc: eliminate possible recursive locking detected by LOCKDEP
  selftests: udpgso_bench.sh explicitly requires bash
  selftests: rtnetlink.sh explicitly requires bash.
  qmi_wwan: Added support for Gemalto's Cinterion ALASxx WWAN interface
  tipc: queue socket protocol error messages into socket receive buffer
  tipc: set link tolerance correctly in broadcast link
  net: ipv4: don't let PMTU updates increase route MTU
  net: ipv4: update fnhe_pmtu when first hop's MTU changes
  net/ipv6: stop leaking percpu memory in fib6 info
  rds: RDS (tcp) hangs on sendto() to unresponding address
  dm linear: fix linear_end_io conditional definition
  IB/mlx5: Unmap DMA addr from HCA before IOMMU
  net: make skb_partial_csum_set() more robust against overflows
  devlink: Add helper function for safely copy string param
  devlink: Fix param cmode driverinit for string type
  devlink: Fix param set handling for string type
  samples: disable CONFIG_SAMPLES for UML
  dm linear: eliminate linear_end_io call if CONFIG_DM_ZONED disabled
  pinctrl: mcp23s08: fix irq and irqchip setup order
  gpio: Assign gpio_irq_chip::parents to non-stack pointer
  libertas: call into generic suspend code before turning off power
  of: unittest: Disable interrupt node tests for old world MAC systems
  mfd: cros-ec: copy the whole event in get_next_event_xfer
  mm: Preserve _PAGE_DEVMAP across mprotect() calls
  dm: fix report zone remapping to account for partition offset
  dm cache: destroy migration_cache if cache target registration failed
  net: ena: fix auto casting to boolean
  net: ena: fix NULL dereference due to untimely napi initialization
  net: ena: fix rare bug when failed restart/resume is followed by driver removal
  net: ena: fix warning in rmmod caused by double iounmap
  KVM: x86: support CONFIG_KVM_AMD=y with CONFIG_CRYPTO_DEV_CCP_DD=m
  gfs2: Fix iomap buffered write support for journaled files
  ARM: KVM: Correctly order SGI register entries in the cp15 array
  mmc: block: avoid multiblock reads for the last sector in SPI mode
  x86/mm: Avoid VLA in pgd_alloc()
  mm, sched/numa: Remove remaining traces of NUMA rate-limiting
  x86/intel_rdt: Fix out-of-bounds memory access in CBM tests
  rxrpc: Fix the packet reception routine
  rxrpc: Fix the rxrpc_tx_packet trace line
  rxrpc: Fix connection-level abort handling
  rxrpc: Only take the rwind and mtu values from latest ACK
  filesystem-dax: Fix dax_layout_busy_page() livelock
  rxrpc: Carry call state out of locked section in rxrpc_rotate_tx_window()
  rxrpc: Don't check RXRPC_CALL_TX_LAST after calling rxrpc_rotate_tx_window()
  rxrpc: Don't need to take the RCU read lock in the packet receiver
  rxrpc: Use the UDP encap_rcv hook
  sparc64: fix fall-through annotation
  sparc32: fix fall-through annotation
  sparc: vdso: clean-up vdso Makefile
  oradax: remove redundant null check before kfree
  sparc64: viohs: Remove VLA usage
  sbus: Use of_get_child_by_name helper
  sparc: Convert to using %pOFn instead of device_node.name
  mach64: detect the dot clock divider correctly on sparc
  net/smc: retain old name for diag_mode field
  net/smc: use __aligned_u64 for 64-bit smc_diag fields
  net: sched: cls_u32: fix hnode refcounting
  udp: Unbreak modules that rely on external __skb_recv_udp() availability
  percpu: stop leaking bitmap metadata blocks
  Linux 4.19-rc7
  xfs: fix data corruption w/ unaligned reflink ranges
  xfs: fix data corruption w/ unaligned dedupe ranges
  treewide: Replace more open-coded allocation size multiplications
  mm: madvise(MADV_DODUMP): allow hugetlbfs pages
  ocfs2: fix locking for res->tracking and dlm->tracking_list
  mm/vmscan.c: fix int overflow in callers of do_shrink_slab()
  mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly
  mm/vmstat.c: fix outdated vmstat_text
  proc: restrict kernel stack dumps to root
  mm/hugetlb: add mmap() encodings for 32MB and 512MB page sizes
  mm/migrate.c: split only transparent huge pages when allocation fails
  ipc/shm.c: use ERR_CAST() for shm_lock() error return
  mm/gup_benchmark: fix unsigned comparison to zero in __gup_benchmark_ioctl
  mm, thp: fix mlocking THP page with migration enabled
  ocfs2: fix crash in ocfs2_duplicate_clusters_by_page()
  hugetlb: take PMD sharing into account when flushing tlb/caches
  mm: migration: fix migration of huge PMD shared pages
  net: mvpp2: Extract the correct ethtype from the skb for tx csum offload
  ipv6: take rcu lock in rawv6_send_hdrinc()
  ARC: clone syscall to setp r25 as thread pointer
  net: sched: Add policy validation for tc attributes
  rtnetlink: fix rtnl_fdb_dump() for ndmsg header
  yam: fix a missing-check bug
  net: bpfilter: Fix type cast and pointer warnings
  net: cxgb3_main: fix a missing-check bug
  Input: uinput - add a schedule point in uinput_inject_events()
  Input: evdev - add a schedule point in evdev_write()
  bpf: 32-bit RSH verification must truncate input before the ALU op
  MIPS: memset: Fix CPU_DADDI_WORKAROUNDS `small_fixup' regression
  perf record: Use unmapped IP for inline callchain cursors
  vsprintf: Fix off-by-one bug in bstr_printf() processing dereferenced pointers
  perf python: Use -Wno-redundant-decls to build with PYTHON=python3
  rxrpc: Fix the data_ready handler
  rxrpc: Fix some missed refs to init_net
  powerpc/numa: Skip onlining a offline node in kdump path
  powerpc: Don't print kernel instructions in show_user_instructions()
  i2c: designware: Call i2c_dw_clk_rate() only when calculating timings
  xfs: update ctime and remove suid before cloning files
  xfs: zero posteof blocks when cloning above eof
  xfs: refactor clonerange preparation into a separate helper
  iommu/amd: Clear memory encryption mask from physical address
  net: phy: phylink: fix SFP interface autodetection
  be2net: don't flip hw_features when VXLANs are added/deleted
  drm/nouveau/drm/nouveau: Grab runtime PM ref in nv50_mstc_detect()
  net/packet: fix packet drop as of virtio gso
  net: dsa: b53: Keep CPU port as tagged in all VLANs
  openvswitch: load NAT helper
  bnxt_en: get the reduced max_irqs by the ones used by RDMA
  bnxt_en: free hwrm resources, if driver probe fails.
  bnxt_en: Fix enables field in HWRM_QUEUE_COS2BW_CFG request
  bnxt_en: Fix VNIC reservations on the PF.
  Input: mousedev - add a schedule point in mousedev_write()
  team: Forbid enslaving team device to itself
  net/usb: cancel pending work when unbinding smsc75xx
  cgroup: Fix dom_cgrp propagation when enabling threaded mode
  dm cache: fix resize crash if user doesn't reload cache table
  dm cache metadata: ignore hints array being too small during resize
  PM / core: Clear the direct_complete flag on errors
  mlxsw: spectrum: Delete RIF when VLAN device is removed
  mlxsw: pci: Derive event type from event queue number
  drm/amdkfd: Fix incorrect use of process->mm
  drm/amd/display: Signal hw_done() after waiting for flip_done()
  kvm: nVMX: fix entry with pending interrupt if APICv is enabled
  ovl: fix format of setxattr debug
  ovl: fix access beyond unterminated strings
  KVM: VMX: hide flexpriority from guest when disabled at the module level
  KVM: VMX: check for existence of secondary exec controls before accessing
  x86/vdso: Fix vDSO syscall fallback asm constraint regression
  ALSA: hda/realtek - Cannot adjust speaker's volume on Dell XPS 27 7760
  KVM: PPC: Book3S HV: Avoid crash from THP collapse during radix page fault
  ixgbe: check return value of napi_complete_done()
  sctp: fix fall-through annotation
  drm/i915: Handle incomplete Z_FINISH for compressed error states
  media: v4l: event: Prevent freeing event subscriptions while accessed
  locking/ww_mutex: Fix runtime warning in the WW mutex selftest
  x86/cpu/amd: Remove unnecessary parentheses
  x86/vdso: Only enable vDSO retpolines when enabled and supported
  r8169: always autoneg on resume
  ipv4: fix use-after-free in ip_cmsg_recv_dstaddr()
  net: qualcomm: rmnet: Fix incorrect allocation flag in receive path
  net: qualcomm: rmnet: Fix incorrect allocation flag in transmit
  net: qualcomm: rmnet: Skip processing loopback packets
  net: systemport: Fix wake-up interrupt race during resume
  smb3: fix lease break problem introduced by compounding
  cifs: only wake the thread for the very last PDU in a compound
  cifs: add a warning if we try to to dequeue a deleted mid
  smb2: fix missing files in root share directory listing
  rtnl: limit IFLA_NUM_TX_QUEUES and IFLA_NUM_RX_QUEUES to 4096
  bonding: fix warning message
  inet: make sure to grab rcu_read_lock before using ireq->ireq_opt
  Revert "serial: sh-sci: Allow for compressed SCIF address"
  Revert "serial: sh-sci: Remove SCIx_RZ_SCIFA_REGTYPE"
  Revert "serial: 8250_dw: Fix runtime PM handling"
  RISCV: Fix end PFN for low memory
  x86/tsc: Fix UV TSC initialization
  x86/platform/uv: Provide is_early_uv_system()
  nfp: avoid soft lockups under control message storm
  declance: Fix continuation with the adapter identification message
  net: fec: fix rare tx timeout
  thunderbolt: Initialize after IOMMUs
  thunderbolt: Do not handle ICM events after domain is stopped
  powerpc/lib: fix book3s/32 boot failure due to code patching
  bpf: don't accept cgroup local storage with zero value size
  drm/cma-helper: Fix crash in fbdev error path
  sched/numa: Migrate pages to local nodes quicker early in the lifetime of a task
  mm, sched/numa: Remove rate-limiting of automatic NUMA balancing migration
  MAINTAINERS: Remove dead path from LOCKING PRIMITIVES entry
  drm: fix use-after-free read in drm_mode_create_lease_ioctl()
  s390/cio: Fix how vfio-ccw checks pinned pages
  sched/numa: Avoid task migration for small NUMA improvement
  mm/migrate: Use spin_trylock() while resetting rate limit
  sched/numa: Limit the conditions where scan period is reset
  sched/numa: Reset scan rate whenever task moves across nodes
  sched/numa: Pass destination CPU as a parameter to migrate_task_rq
  sched/numa: Stop multiple tasks from moving to the CPU at the same time
  perf/x86/amd/uncore: Set ThreadMask and SliceMask for L3 Cache perf events
  perf/x86/intel/uncore: Fix PCI BDF address of M3UPI on SKX
  perf/ring_buffer: Prevent concurent ring buffer access
  perf/x86/intel/uncore: Use boot_cpu_data.phys_proc_id instead of hardcorded physical package ID 0
  perf/core: Fix perf_pmu_unregister() locking
  selftests/x86: Add clock_gettime() tests to test_vdso
  r8169: fix network stalls due to missing bit TXCFG_AUTO_FIFO
  x86/vdso: Fix asm constraints on vDSO syscall fallbacks
  tun: napi flags belong to tfile
  tun: initialize napi_mutex unconditionally
  tun: remove unused parameters
  bond: take rcu lock in netpoll_send_skb_on_dev
  rtnetlink: Fail dump if target netnsid is invalid
  Revert "openvswitch: Fix template leak in error cases."
  tipc: ignore STATE_MSG on wrong link session
  net: sched: act_ipt: check for underflow in __tcf_ipt_init()
  usb: xhci-mtk: resume USB3 roothub first
  xhci: Add missing CAS workaround for Intel Sunrise Point xHCI
  usb: cdc_acm: Do not leak URB buffers
  Input: i8042 - enable keyboard wakeups by default when s2idle is used
  Input: xpad - add support for Xbox1 PDP Camo series gamepad
  soc: fsl: qman_portals: defer probe after qman's probe
  lib/xz: Put CRC32_POLY_LE in xz_private.h
  tcp/dccp: fix lockdep issue when SYN is backlogged
  PCI: mvebu: Fix PCI I/O mapping creation sequence
  net/mlx5e: Set vlan masks for all offloaded TC rules
  net/mlx5: E-Switch, Fix out of bound access when setting vport rate
  net/mlx5e: Avoid unbounded peer devices when unpairing TC hairpin rules
  drm/i915: Avoid compiler warning for maybe unused gu_misc_iir
  drm/i915: Do not redefine the has_csr parameter.
  MAINTAINERS: MIPS/LOONGSON2 ARCHITECTURE - Use the normal wildcard style
  KVM: x86: fix L1TF's MMIO GFN calculation
  tools/kvm_stat: cut down decimal places in update interval dialog
  KVM: nVMX: Fix emulation of VM_ENTRY_LOAD_BNDCFGS
  KVM: x86: Do not use kvm_x86_ops->mpx_supported() directly
  KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled
  arm64: KVM: Sanitize PSTATE.M when being set from userspace
  arm64: KVM: Tighten guest core register access from userspace
  cfg80211: fix use-after-free in reg_process_hint()
  mac80211: fix setting IEEE80211_KEY_FLAG_RX_MGMT for AP mode keys
  cfg80211: fix wext-compat memory leak
  drm/exynos: Use selected dma_dev default iommu domain instead of a fake one
  i2c: i2c-scmi: fix for i2c_smbus_write_block_data
  xfs: fix error handling in xfs_bmap_extents_to_btree
  pstore/ram: Fix failure-path memory leak in ramoops_init
  firmware: Always initialize the fw_priv list object
  docs: fpga: document fpga manager flags
  fpga: bridge: fix obvious function documentation error
  tools: hv: fcopy: set 'error' in case an unknown operation was requested
  fpga: do not access region struct after fpga_region_unregister
  Drivers: hv: vmbus: Use get/put_cpu() in vmbus_connect()
  netlink: fix typo in nla_parse_nested() comment
  r8169: Disable clk during suspend / resume
  qlcnic: fix Tx descriptor corruption on 82xx devices
  tipc: fix failover problem
  smsc95xx: Check for Wake-on-LAN modes
  smsc75xx: Check for Wake-on-LAN modes
  r8152: Check for supported Wake-on-LAN Modes
  sr9800: Check for supported Wake-on-LAN modes
  lan78xx: Check for supported Wake-on-LAN modes
  ax88179_178a: Check for supported Wake-on-LAN modes
  asix: Check for supported Wake-on-LAN modes
  iomap: set page dirty after partial delalloc on mkwrite
  xfs: remove invalid log recovery first/last cycle check
  xfs: validate inode di_forkoff
  xfs: skip delalloc COW blocks in xfs_reflink_end_cow
  xfs: don't treat unknown di_flags2 as corruption in scrub
  xfs: remove duplicated include from alloc.c
  xfs: don't bring in extents in xfs_bmap_punch_delalloc_range
  xfs: fix transaction leak in xfs_reflink_allocate_cow()
  xfs: avoid lockdep false positives in xfs_trans_alloc
  xfs: refactor xfs_buf_log_item reference count handling
  xfs: clean up xfs_trans_brelse()
  xfs: don't unlock invalidated buf on aborted tx commit
  xfs: remove last of unnecessary xfs_defer_cancel() callers
  xfs: don't crash the vfs on a garbage inline symlink
  MAINTAINERS: Remove obsolete drivers/pci pattern from ACPI section
  MIPS: Fix CONFIG_CMDLINE handling
  MIPS: VDSO: Always map near top of user memory
  ibmvnic: remove ndo_poll_controller
  sfc-falcon: remove ndo_poll_controller
  sfc: remove ndo_poll_controller
  net: ena: remove ndo_poll_controller
  qlogic: netxen: remove ndo_poll_controller
  qlcnic: remove ndo_poll_controller
  virtio_net: remove ndo_poll_controller
  net: hns: remove ndo_poll_controller
  ehea: remove ndo_poll_controller
  hinic: remove ndo_poll_controller
  netpoll: do not test NAPI_STATE_SCHED in poll_one_napi()
  qed: Fix shmem structure inconsistency between driver and the mfw.
  Update maintainers for bnx2/bnx2x/qlge/qlcnic drivers.
  MAINTAINERS: change bridge maintainers
  s390: qeth: Fix potential array overrun in cmd/rc lookup
  s390: qeth_core_mpc: Use ARRAY_SIZE instead of reimplementing its function
  mmc: slot-gpio: Fix debounce time to use miliseconds again
  bpf: harden flags check in cgroup_storage_update_elem()
  netfilter: xt_socket: check sk before checking for netns.
  netfilter: avoid erronous array bounds warning
  netfilter: nft_set_rbtree: add missing rb_erase() in GC routine
  rxrpc: Fix error distribution
  rxrpc: Fix transport sockopts to get IPv4 errors on an IPv6 socket
  rxrpc: Make service call handling more robust
  rxrpc: Improve up-front incoming packet checking
  rxrpc: Emit BUSY packets when supposed to rather than ABORTs
  rxrpc: Fix RTT gathering
  rxrpc: Fix checks as to whether we should set up a new call
  scsi: qedi: Initialize the stats mutex lock
  crypto: qat - Fix KASAN stack-out-of-bounds bug in adf_probe()
  crypto: mxs-dcp - Fix wait logic on chan threads
  crypto: chelsio - Fix memory corruption in DMA Mapped buffers.
  PCI: Reprogram bridge prefetch registers on resume
  soc: fsl: qbman: add APIs to retrieve the probing status
  perf report: Don't try to map ip to invalid map
  rseq/selftests: fix parametrized test with -fpie
  iwlwifi: 1000: set the TFD queue size
  ieee802154: mcr20a: Replace magic number with constants
  s390/cio: Refactor alloc of ccw_io_region
  s390/cio: Convert ccw_io_region to pointer
  rxrpc: Remove dup code from rxrpc_find_connection_rcu()
  ieee802154: ca8210: remove redundant condition check before debugfs_remove
  nl80211: Fix possible Spectre-v1 for CQM RSSI thresholds
  net-tcp: /proc/sys/net/ipv4/tcp_probe_interval is a u32 not int
  bnxt_en: Fix TX timeout during netpoll.
  vxlan: fill ttl inherit info
  net: phy: sfp: Fix unregistering of HWMON SFP device
  qed: Avoid implicit enum conversion in qed_iwarp_parse_rx_pkt
  qed: Avoid constant logical operation warning in qed_vf_pf_acquire
  bonding: avoid possible dead-lock
  bonding: pass link-local packets to bonding master also.
  qed: Avoid implicit enum conversion in qed_roce_mode_to_flavor
  qed: Fix mask parameter in qed_vf_prep_tunn_req_tlv
  qed: Avoid implicit enum conversion in qed_set_tunn_cls_info
  wimax/i2400m: fix spelling mistake "not unitialized" -> "uninitialized"
  qed: fix spelling mistake "toogle" -> "toggle"
  net: phy: fix WoL handling when suspending the PHY
  net: core: add member wol_enabled to struct net_device
  Revert "net: phy: fix WoL handling when suspending the PHY"
  net: phy: fix WoL handling when suspending the PHY
  net/ipv6: Remove extra call to ip6_convert_metrics for multipath case
  mmc: core: Fix debounce time to use microseconds
  video/fbdev/stifb: Fix spelling mistake in fall-through annotation
  uvesafb: Fix URLs in the documentation
  efifb: BGRT: Add nobgrt option
  fbdev/omapfb: fix omapfb_memory_read infoleak
  pxa168fb: prepare the clock
  Bluetooth: SMP: fix crash in unpairing
  mac80211_hwsim: do not omit multicast announce of first added radio
  mac80211_hwsim: fix race in radio destruction from netlink notifier
  mac80211_hwsim: fix locking when iterating radios during ns exit
  nl80211: Fix possible Spectre-v1 for NL80211_TXRATE_HT
  cfg80211: fix reg_query_regdb_wmm kernel-doc
  mac80211: allocate TXQs for active monitor interfaces
  tipc: lock wakeup & inputq at tipc_link_reset()
  tipc: reset bearer if device carrier not ok
  ARM: dts: stm32: update SPI6 dmas property on stm32mp157c
  soc: fsl: qe: Fix copy/paste bug in ucc_get_tdm_sync_shift()
  soc: fsl: qbman: qman: avoid allocating from non existing gen_pool
  ovl: make symbol 'ovl_aops' static
  tipc: fix flow control accounting for implicit connect
  net: hns: fix for unmapping problem when SMMU is on
  xen-netback: handle page straddling in xenvif_set_hash_mapping()
  xen-netback: validate queue numbers in xenvif_set_hash_mapping()
  xen-netback: fix input validation in xenvif_set_hash_mapping()
  net: macb: Clean 64b dma addresses if they are not detected
  perf script python: Fix export-to-sqlite.py sample columns
  perf script python: Fix export-to-postgresql.py occasional failure
  i2c: i2c-isch: fix spelling mistake "unitialized" -> "uninitialized"
  i2c: i2c-qcom-geni: Properly handle DMA safe buffers
  ARM: dts: BCM63xx: Fix incorrect interrupt specifiers
  arm64: hugetlb: Avoid unnecessary clearing in huge_ptep_set_access_flags
  arm64: hugetlb: Fix handling of young ptes
  KVM: x86: never trap MSR_KERNEL_GS_BASE
  USB: serial: simple: add Motorola Tetra MTP6550 id
  HID: intel-ish-hid: Enable Ice Lake mobile
  HID: i2c-hid: Remove RESEND_REPORT_DESCR quirk and its handling
  vfs: swap names of {do,vfs}_clone_file_range()
  ovl: fix freeze protection bypass in ovl_clone_file_range()
  ovl: fix freeze protection bypass in ovl_write_iter()
  ovl: fix memory leak on unlink of indexed file
  MAINTAINERS: update the Annapurna Labs maintainer email
  ieee802154: remove unecessary condition check before debugfs_remove_recursive
  ieee802154: Use kmemdup instead of duplicating it in ca8210_test_int_driver_write
  crypto: caam/jr - fix ablkcipher_edesc pointer arithmetic
  netfilter: conntrack: get rid of double sizeof
  netfilter: nft_osf: use enum nft_data_types for nft_validate_register_store
  netfilter: bridge: Don't sabotage nf_hook calls from an l3mdev
  drm/i2c: tda9950: set MAX_RETRIES for errors only
  drm/i2c: tda9950: fix timeout counter check
  b43: fix DMA error related regression with proprietary firmware
  s390/hibernate: fix error handling when suspend cpu != resume cpu
  ALSA: hda: Fix the audio-component completion timeout
  xfrm: validate template mode
  ARM: dts: sun8i: drop A64 HDMI PHY fallback compatible from R40 DT
  kbuild: allow to use GCC toolchain not in Clang search path
  ftrace: Build with CPPFLAGS to get -Qunused-arguments
  ARM: 8799/1: mm: fix pci_ioremap_io() offset check
  ARM: 8787/1: wire up io_pgetevents syscall
  gpiolib: Free the last requested descriptor
  ARC: build: Don't set CROSS_COMPILE in arch's Makefile
  sysfs: Do not return POSIX ACL xattrs via listxattr
  dm raid: remove bogus const from decipher_sync_action() return type
  dm mpath: fix attached_handler_name leak and dangling hw_handler_name pointer
  mmc: sdhi: sys_dmac: check for all Gen3 types when whitelisting
  dm thin metadata: fix __udivdi3 undefined on 32-bit
  mt76x0: fix remove_interface
  ARC: fix spelling mistake "entires" -> "entries"
  USB: serial: option: add two-endpoints device-id flag
  USB: serial: option: improve Quectel EP06 detection
  HID: i2c-hid: disable runtime PM operations on hantick touchpad
  ARC: build: Get rid of toolchain check
  ARM: dts: at91: sama5d2_ptc_ek: fix nand pinctrl
  ARM: dts: imx53-qsb: disable 1.2GHz OPP
  xfrm: Fix NULL pointer dereference when skb_dst_force clears the dst_entry.
  ARCv2: build: use mcpu=hs38 iso generic mcpu=archs
  mac80211: fix TX status reporting for ieee80211s
  mac80211: TDLS: fix skb queue/priority assignment
  cfg80211: Address some corner cases in scan result channel updating
  mac80211: fix pending queue hang due to TX_DROP
  cfg80211: reg: Init wiphy_idx in regulatory_hint_core()
  mac80211: Don't wake up from PS for offchannel TX
  mac80211: Always report TX status
  xfrm: reset crypto_done when iterating over multiple input xfrms
  xfrm: reset transport header back to network header after all input transforms ahave been applied
  xfrm6: call kfree_skb when skb is toobig
  xfrm: Validate address prefix lengths in the xfrm selector.

[rishabhb@codeaurora.org: resolved some minor conflicts]
Signed-off-by: Rishabh Bhatnagar <rishabhb@codeaurora.org>
Change-Id: Ic3fb7f2c090b32694426ab160416f6a59cca8126
2018-10-15 11:47:50 -07:00
Maciej Żenczykowski
d4ce58082f net-tcp: /proc/sys/net/ipv4/tcp_probe_interval is a u32 not int
(fix documentation and sysctl access to treat it as such)

Tested:
  # zcat /proc/config.gz | egrep ^CONFIG_HZ
  CONFIG_HZ_1000=y
  CONFIG_HZ=1000
  # echo $[(1<<32)/1000 + 1] | tee /proc/sys/net/ipv4/tcp_probe_interval
  4294968
  tee: /proc/sys/net/ipv4/tcp_probe_interval: Invalid argument
  # echo $[(1<<32)/1000] | tee /proc/sys/net/ipv4/tcp_probe_interval
  4294967
  # echo 0 | tee /proc/sys/net/ipv4/tcp_probe_interval
  # echo -1 | tee /proc/sys/net/ipv4/tcp_probe_interval
  -1
  tee: /proc/sys/net/ipv4/tcp_probe_interval: Invalid argument
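
The boundary seen in the test above is consistent with the interval being
held as a u32 after scaling by HZ. A minimal Python sketch of that arithmetic,
assuming a write is rejected whenever seconds * HZ would exceed UINT_MAX (the
function names are illustrative, not kernel symbols):

```python
UINT_MAX = (1 << 32) - 1  # largest value a u32 can hold


def max_probe_interval(hz: int) -> int:
    """Largest interval in seconds whose scaled value (seconds * hz) fits in a u32."""
    return UINT_MAX // hz


def is_valid_probe_interval(seconds: int, hz: int) -> bool:
    """Assumed acceptance rule: non-negative and no u32 overflow after scaling."""
    return 0 <= seconds <= max_probe_interval(hz)
```

With hz=1000 the limit is 4294967, so 4294968 and -1 are rejected while 0 and
4294967 are accepted, matching the transcript.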

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26 20:33:21 -07:00
Amit Pundir
1e9a9ffbbd RFC: ANDROID: net: ipv4: tcp: Namespace-ify sysctl_tcp_default_init_rwnd
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2018-08-28 17:15:17 +05:30
JP Abgrall
89e8c70306 ANDROID: net: ipv4: tcp: add a sysctl to config the tcp_default_init_rwnd
The default initial rwnd is hardcoded to 10.

Now we allow it to be controlled via
  /proc/sys/net/ipv4/tcp_default_init_rwnd
which limits the values from 3 to 100

This is somewhat needed because ipv6 routes are
autoconfigured by the kernel.

See "An Argument for Increasing TCP's Initial Congestion Window"
in https://developers.google.com/speed/articles/tcp_initcwnd_paper.pdf

Change-Id: I386b2a9d62de0ebe05c1ebe1b4bd91b314af5c54
Signed-off-by: JP Abgrall <jpa@google.com>

Conflicts:
	net/ipv4/sysctl_net_ipv4.c
	net/ipv4/tcp_input.c

[AmitP: Folded following android-4.9 commit changes into this patch
        3823c8b26e6e ("ANDROID: tcp: fix tcp_default_init_rwnd() for 4.1")]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2018-08-28 17:10:42 +05:30
Petr Machata
d18c5d1995 net: ipv4: Notify about changes to ip_forward_update_priority
Drivers may make offloading decisions based on whether
ip_forward_update_priority is enabled or not. Therefore distribute
netevent notifications to give them a chance to react to a change.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01 09:52:30 -07:00
Petr Machata
432e05d328 net: ipv4: Control SKB reprioritization after forwarding
After IPv4 packets are forwarded, the priority of the corresponding SKB
is updated according to the TOS field of IPv4 header. This overrides any
prioritization done earlier by e.g. an skbedit action or ingress-qos-map
defined at a vlan device.

Such overriding may not always be desirable. Even if the packet ends up
being routed, which implies this is an L3 network node, an administrator
may wish to preserve whatever prioritization was done earlier on in the
pipeline.

Therefore introduce a sysctl that controls this behavior. Keep the
default value at 1 to maintain backward-compatible behavior.
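
The effect of the new sysctl can be sketched in a few lines of Python. This is
only a model of the decision described above, not the kernel code: the function
name is hypothetical, and tos_to_priority stands in for whatever TOS-to-priority
mapping the stack applies.

```python
from typing import Callable


def priority_after_forward(
    current_priority: int,
    tos: int,
    update_priority: bool,
    tos_to_priority: Callable[[int], int],
) -> int:
    """Return the SKB priority after forwarding.

    update_priority models the sysctl: when true (the default), the priority
    is recomputed from the TOS field; when false, the earlier prioritization
    (e.g. from skbedit or an ingress-qos-map) is preserved.
    """
    if update_priority:
        return tos_to_priority(tos)
    return current_priority
```

Passing update_priority=False leaves current_priority untouched regardless of
the TOS value.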

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01 09:52:30 -07:00
Tyler Hicks
70ba5b6db9 ipv4: Return EINVAL when ping_group_range sysctl doesn't map to user ns
The low and high values of the net.ipv4.ping_group_range sysctl were
being silently forced to the default disabled state when a write to the
sysctl contained GIDs that didn't map to the associated user namespace.
Confusingly, the sysctl's write operation would return success and then
a subsequent read of the sysctl would indicate that the low and high
values are the overflowgid.

This patch changes the behavior by clearly returning an error when the
sysctl write operation receives a GID range that doesn't map to the
associated user namespace. In such a situation, the previous value of
the sysctl is preserved and that range will be returned in a subsequent
read of the sysctl.
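
A minimal Python model of the new write behavior, assuming a hypothetical
map_gid callback that returns None for a GID with no mapping in the user
namespace. The class and method names are illustrative, and other checks
done on write (such as range ordering) are not modeled:

```python
import errno
from typing import Callable, Optional, Tuple


class PingGroupRange:
    """Toy model of the sysctl value: a (low, high) GID pair."""

    def __init__(self, low: int, high: int) -> None:
        self.range: Tuple[int, int] = (low, high)

    def write(self, low: int, high: int,
              map_gid: Callable[[int], Optional[int]]) -> int:
        """Return 0 on success, or -EINVAL if either GID does not map.

        On failure the previous range is left untouched.
        """
        mapped_low = map_gid(low)
        mapped_high = map_gid(high)
        if mapped_low is None or mapped_high is None:
            return -errno.EINVAL
        self.range = (mapped_low, mapped_high)
        return 0
```

A failed write returns -EINVAL and a later read of range still shows the
old pair, which is the behavior the patch describes.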

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-06 11:51:18 +09:00
Yuchung Cheng
c860e997e9 tcp: fix Fast Open key endianness
The Fast Open key could be stored with a different byte order depending
on the CPU. Previously, hosts of different endianness in a server farm
using the same key config (sysctl value) would produce different cookies.
This patch fixes that by always storing the key as little endian, which
keeps the same API for LE hosts.
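
To illustrate the idea, here is a small Python sketch that turns a key given
as four dash-separated 32-bit hex words into bytes with a fixed little-endian
layout, independent of the host CPU. The function name is hypothetical and the
sketch does not reproduce the kernel's parsing code:

```python
import struct


def fastopen_key_bytes(sysctl_value: str) -> bytes:
    """Serialize four 32-bit hex words as little endian, regardless of host order."""
    words = [int(part, 16) for part in sysctl_value.strip().split("-")]
    if len(words) != 4:
        raise ValueError("expected four dash-separated 32-bit hex words")
    # "<" forces little-endian packing, so BE and LE hosts produce the same bytes.
    return struct.pack("<4I", *words)
```

Because the byte order is explicit, two hosts given the same string derive the
same 16-byte key and therefore the same cookies.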

Reported-by: Daniele Iamartino <danielei@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-30 18:40:46 +09:00
Maciej Żenczykowski
79e9fed460 net-tcp: extend tcp_tw_reuse sysctl to enable loopback only optimization
This changes the /proc/sys/net/ipv4/tcp_tw_reuse from a boolean
to an integer.

It now takes the values 0, 1 and 2, where 0 and 1 behave as before,
while 2 enables timewait socket reuse only for sockets that we can
prove are loopback connections:
  i.e. bound to 'lo' interface or where one of source or destination
  IPs is 127.0.0.0/8, ::ffff:127.0.0.0/104 or ::1.

This enables quicker reuse of ephemeral ports for loopback connections
- where tcp_tw_reuse is 100% safe from a protocol perspective
(this assumes no artificially induced packet loss on 'lo').

This also makes establishing many loopback connections *much* faster
(allocating ports out of the first half of the ephemeral port range
is significantly faster than allocating from the second half).

Without this change in a 32K ephemeral port space my sample program
(it just establishes and closes [::1]:ephemeral -> [::1]:server_port
connections in a tight loop) fails after 32765 connections in 24 seconds.
With it enabled 50000 connections only take 4.7 seconds.

This is particularly problematic for IPv6 where we only have one local
address and cannot play tricks with varying source IP from 127.0.0.0/8
pool.
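
The three settings can be modeled with a short Python sketch. It covers only
the sysctl gate and the loopback test described above; the other conditions
the stack applies before reusing a timewait socket are ignored, and all names
are illustrative:

```python
import ipaddress

# Address ranges the commit message treats as loopback.
_LOOPBACK_NETS = (
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("::ffff:127.0.0.0/104"),
    ipaddress.ip_network("::1/128"),
)


def is_loopback_addr(addr: str) -> bool:
    """True if addr falls in one of the loopback ranges listed above."""
    ip = ipaddress.ip_address(addr)
    return any(ip.version == net.version and ip in net for net in _LOOPBACK_NETS)


def reuse_allowed_by_sysctl(mode: int, src: str, dst: str,
                            bound_to_lo: bool = False) -> bool:
    """Model of tcp_tw_reuse: 0 = off, 1 = on, 2 = loopback connections only."""
    if mode == 0:
        return False
    if mode == 1:
        return True
    if mode == 2:
        return bound_to_lo or is_loopback_addr(src) or is_loopback_addr(dst)
    raise ValueError("tcp_tw_reuse takes the values 0, 1 and 2")
```

With mode 2, a [::1] -> [::1] connection qualifies while a connection between
two non-loopback addresses does not.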

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Wei Wang <weiwan@google.com>
Change-Id: I0377961749979d0301b7b62871a32a4b34b654e1
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:13:35 -04:00
Eric Dumazet
9c21d2fc41 tcp: add tcp_comp_sack_nr sysctl
This per netns sysctl allows for TCP SACK compression fine-tuning.

This limits the number of SACKs that can be compressed.
Using 0 disables SACK compression.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:40:27 -04:00
Eric Dumazet
6d82aa2420 tcp: add tcp_comp_sack_delay_ns sysctl
This per netns sysctl allows for TCP SACK compression fine-tuning.

Its default value is 1,000,000 ns, or 1 ms, to meet the TSO autosizing period.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:40:27 -04:00
Kirill Tkhai
2f635ceeb2 net: Drop pernet_operations::async
Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27 13:18:09 -04:00
Tonghao Zhang
1e80295158 udp: Move the udp sysctl to namespace.
This patch moves udp_rmem_min and udp_wmem_min
to the namespace and initializes udp_l3mdev_accept explicitly.

udp_rmem_min and udp_wmem_min affect the udp rx/tx queues;
with this patch, namespaces can set them differently.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-16 12:03:30 -04:00
David Ahern
3192dac64c net: Rename NETEVENT_MULTIPATH_HASH_UPDATE
Rename NETEVENT_MULTIPATH_HASH_UPDATE to
NETEVENT_IPV4_MPATH_HASH_UPDATE to denote it relates to a change
in the IPv4 hash policy.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04 13:04:22 -05:00
Kirill Tkhai
22769a2a6e net: Convert ipv4_sysctl_ops
These pernet_operations create and destroy sysctl,
which are not touched by anybody else.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-13 10:36:09 -05:00
Stephen Hemminger
6670e15244 tcp: Namespace-ify sysctl_tcp_default_congestion_control
Make the default TCP congestion control a per-namespace
value. This changes the default congestion control to a pointer to congestion ops
(rather than implicit as the first element of the available list).

The congestion control setting of new namespaces is inherited
from the current setting of the root namespace.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15 14:09:52 +09:00
Eric Dumazet
356d1833b6 tcp: Namespace-ify sysctl_tcp_rmem and sysctl_tcp_wmem
Note that when a new netns is created, it inherits its
sysctl_tcp_rmem and sysctl_tcp_wmem from initial netns.

This change is needed so that we can refine TCP rcvbuf autotuning,
to take RTT into consideration.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-10 14:34:58 +09:00
David S. Miller
2a171788ba Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Files removed in 'net-next' had their license header updated
in 'net'.  We take the remove from 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-04 09:26:51 +09:00
Ido Schimmel
3ae6ec0829 ipv4: Send a netevent whenever multipath hash policy is changed
Devices performing IPv4 forwarding need to update their multipath hash
policy whenever it is changed.

Inform these devices by generating a netevent.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03 15:40:41 +09:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information in it,
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information.

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging were:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.
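
The script itself is not shown here. As a rough Python sketch of the
header/source distinction it had to make, assuming the kernel convention of a
`//` comment in .c files and a `/* */` comment in .h files (the function name
is hypothetical):

```python
import os


def spdx_line(path: str, license_id: str) -> str:
    """Return the SPDX tag line in the comment style the file type expects."""
    ext = os.path.splitext(path)[1]
    if ext == ".c":
        # C source files take a C++-style line comment.
        return f"// SPDX-License-Identifier: {license_id}"
    if ext == ".h":
        # Header files take a classic C block comment.
        return f"/* SPDX-License-Identifier: {license_id} */"
    raise ValueError(f"no comment style known for {path!r}")
```

Other file types (Makefiles, assembly, scripts) need their own comment syntax
and are left out of this sketch.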

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Eric Dumazet
c26e91f8b9 tcp: Namespace-ify sysctl_tcp_pacing_ca_ratio
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:24:39 +09:00
Eric Dumazet
23a7102a2d tcp: Namespace-ify sysctl_tcp_pacing_ss_ratio
Also remove an obsolete comment about TCP pacing.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:24:39 +09:00
Eric Dumazet
4170ba6b58 tcp: Namespace-ify sysctl_tcp_invalid_ratelimit
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:24:39 +09:00
Eric Dumazet
790f00e19f tcp: Namespace-ify sysctl_tcp_autocorking
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:24:39 +09:00
Eric Dumazet
bd23970429 tcp: Namespace-ify sysctl_tcp_min_rtt_wlen
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:24:39 +09:00
Eric Dumazet
26e9596e5b tcp: Namespace-ify sysctl_tcp_min_tso_segs
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:24:38 +09:00
Eric Dumazet
b530b68148 tcp: Namespace-ify sysctl_tcp_challenge_ack_limit
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:24:38 +09:00
Eric Dumazet
9184d8bb44 tcp: Namespace-ify sysctl_tcp_limit_output_bytes
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:24:38 +09:00
Eric Dumazet
ceef9ab6be tcp: Namespace-ify sysctl_tcp_workaround_signed_windows
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:24:38 +09:00
Eric Dumazet
d06a990458 tcp: Namespace-ify sysctl_tcp_tso_win_divisor
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:24:38 +09:00
Eric Dumazet
4540c0cf98 tcp: Namespace-ify sysctl_tcp_moderate_rcvbuf
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:24:38 +09:00
Eric Dumazet
ec36e416f0 tcp: Namespace-ify sysctl_tcp_nometrics_save
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:24:38 +09:00
Eric Dumazet
af9b69a7a6 tcp: Namespace-ify sysctl_tcp_frto
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 16:35:43 +09:00
Eric Dumazet
94f0893e0c tcp: Namespace-ify sysctl_tcp_adv_win_scale
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 16:35:43 +09:00
Eric Dumazet
0c12654ac6 tcp: Namespace-ify sysctl_tcp_app_win
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 16:35:43 +09:00
Eric Dumazet
6496f6bde0 tcp: Namespace-ify sysctl_tcp_dsack
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 16:35:43 +09:00
Eric Dumazet
c6e2180359 tcp: Namespace-ify sysctl_tcp_max_reordering
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 16:35:43 +09:00
Eric Dumazet
0bc65a28ae tcp: Namespace-ify sysctl_tcp_fack
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 16:35:42 +09:00
Eric Dumazet
65c9410cf5 tcp: Namespace-ify sysctl_tcp_abort_on_overflow
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 16:35:42 +09:00
Eric Dumazet
625357aa17 tcp: Namespace-ify sysctl_tcp_rfc1337
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 16:35:42 +09:00
Eric Dumazet
3f4c7c6f6a tcp: Namespace-ify sysctl_tcp_stdurg
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 16:35:42 +09:00
Eric Dumazet
e0a1e5b519 tcp: Namespace-ify sysctl_tcp_retrans_collapse
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 16:35:42 +09:00
Eric Dumazet
b510f0d23a tcp: Namespace-ify sysctl_tcp_slow_start_after_idle
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 16:35:42 +09:00