
A suite of utilities simplifying Linux networking stack performance troubleshooting and tuning.

Home Page: https://pypi.python.org/pypi/netutils-linux

License: MIT License

Languages: Python 84.57%, Shell 12.29%, Makefile 2.32%, Dockerfile 0.82%
Topics: shell, linux, python, network, interrupts, utils, cpufreq, cpu, monitoring, networking-stack

netutils-linux's Introduction

netutils-linux


Project is frozen

The project is in a licensing and intellectual property gray zone: my previous employer and I talked about it, but we never fixed anything "on paper". I will not upgrade the project, accept patches, etc. Feel free to fork it for your own usage, drop py2 support, etc.

These are utilities that simplify Linux network troubleshooting and performance tuning. They were developed to help Carbon Reductor tech support and to automate the whole Linux performance tuning process out of the box (ok, except the best RSS layout detection with multiple network devices). These utils may be useful for datacenters and internet service providers with heavy network workloads (you probably wouldn't see an effect on a desktop computer). They are now in production usage with 2000+ deployments and save us a lot of time debugging hardware and software settings. Inspired by packagecloud's blog post.

Installation

You'll need pip.

pip install netutils-linux

Usage

Check this guide about usage.

Utils

Monitoring

None of these top-like utils require root privileges or sudo, so you can install and use them as a non-privileged user if you care about security.

pip install --user netutils-linux

A brief explanation of the highlighting colors for CPU and device groups: green and red are for NUMA nodes, blue and yellow are for CPU sockets. Screenshots are taken from different hosts with different hardware.

network-top

The most useful util in this repo: it includes almost all Linux network stack performance metrics and allows monitoring interrupts, soft interrupts, and per-device/per-CPU network processing statistics. Based on the following files:

  • /proc/interrupts (vectors with a small number of IRQs/second are hidden by default)
  • /proc/net/softnet_stat - packet distribution and error/squeeze rates between CPUs.
  • /proc/softirqs (only NET_RX and NET_TX values).
  • /sys/class/net/<NET_DEVICE>/statistics/<METRIC> files (you can specify units; mbits are the default)
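For reference, a minimal sketch of how the per-CPU counters in /proc/net/softnet_stat can be read (this is an illustration, not the project's actual parser; column meanings beyond the first three are omitted):

def read_softnet_stat(path='/proc/net/softnet_stat'):
    """One line per CPU; columns are hexadecimal counters."""
    stats = []
    with open(path) as fd:
        for cpu, line in enumerate(fd):
            columns = [int(column, 16) for column in line.split()]
            stats.append({'cpu': cpu, 'total': columns[0],
                          'dropped': columns[1], 'squeezed': columns[2]})
    return stats

The top-like utils read such files periodically and display the deltas between consecutive reads.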

https://cloud.githubusercontent.com/assets/3813830/26570951/acacf18c-452c-11e7-8fe7-5d0952f39d8b.gif

There are also separate utils if you want to look at only specific metrics: irqtop, softirq-top, softnet-stat-top, link-rate.

snmptop

A basic /proc/net/snmp file watcher.

https://user-images.githubusercontent.com/3813830/28242466-b51f27dc-69c5-11e7-8076-52819b9b7450.gif

Tuning

rss-ladder

Automatically sets smp_affinity_list for the IRQs of a NIC's rx/tx queues (which usually all land on CPU0 out of the box).

Based on lscpu's output.

It also supports a double/quad ladder on multiprocessor systems (but you'd better explicitly set the queue count equal to the number of cores per socket via the NIC driver's parameters). Example output:

# rss-ladder eth1 0
- distributing interrupts of eth1 (-TxRx-) on socket 0
  - eth1: irq 67 eth1-TxRx-0 -> 0
  - eth1: irq 68 eth1-TxRx-1 -> 1
  - eth1: irq 69 eth1-TxRx-2 -> 2
  - eth1: irq 70 eth1-TxRx-3 -> 3
  - eth1: irq 71 eth1-TxRx-4 -> 8
  - eth1: irq 72 eth1-TxRx-5 -> 9
  - eth1: irq 73 eth1-TxRx-6 -> 10
  - eth1: irq 74 eth1-TxRx-7 -> 11
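In essence, the ladder above boils down to writing one CPU number per queue IRQ into procfs. A minimal illustrative sketch (not the actual rss-ladder code; the IRQ-to-CPU mapping is assumed to be built from /proc/interrupts and lscpu output):

def apply_ladder(irq_to_cpu):
    # pin each rx/tx queue IRQ of a NIC to its own CPU
    for irq, cpu in irq_to_cpu.items():
        with open('/proc/irq/{0}/smp_affinity_list'.format(irq), 'w') as fd:
            fd.write(str(cpu))

# for the example above:
# apply_ladder({67: 0, 68: 1, 69: 2, 70: 3, 71: 8, 72: 9, 73: 10, 74: 11})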

autorps

Enables RPS for all of the NIC's rx queues on all available CPUs of the NUMA node local to the NIC. It may be good for small servers with cheap network cards. You can also explicitly pass --cpus or --cpu-mask. Example output:

# autorps eth0
Using mask 'fc0' for eth0-rx-0.
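Under the hood this amounts to writing a hex CPU mask into each rx queue's rps_cpus file. A minimal sketch of the idea (simplified, not the actual autorps code):

import glob

def enable_rps(device, cpus):
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    for rps_cpus in glob.glob('/sys/class/net/{0}/queues/rx-*/rps_cpus'.format(device)):
        with open(rps_cpus, 'w') as fd:
            fd.write('{0:x}'.format(mask))

# enable_rps('eth0', [6, 7, 8, 9, 10, 11]) writes the mask 'fc0' shown above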

maximize-cpu-freq

Sets every CPU's scaling governor to performance and raises the minimum scaling frequency to the maximum one, so you can use the full power of your processor (useful for latency-sensitive systems).
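A minimal sketch of the idea (illustrative only, not the actual implementation), using the cpufreq sysfs interface:

import glob

def maximize_cpu_freq():
    for cpufreq in glob.glob('/sys/devices/system/cpu/cpu[0-9]*/cpufreq'):
        with open(cpufreq + '/scaling_governor', 'w') as fd:
            fd.write('performance')
        with open(cpufreq + '/scaling_max_freq') as fd:
            max_freq = fd.read().strip()
        with open(cpufreq + '/scaling_min_freq', 'w') as fd:
            fd.write(max_freq)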

rx-buffers-increase

rx-buffers-increase finds and sets a compromise value between avoiding dropped/missed packets and keeping latency low.

Example output:

# ethtool -g eth1

Ring parameters for eth1:
Pre-set maximums:
RX:           4096
...
Current hardware settings:
RX:           256

# rx-buffers-increase eth1

run: ethtool -G eth1 rx 2048

# rx-buffers-increase eth1

eth1's rx ring buffer already has fine size.

# ethtool -g eth1

Ring parameters for eth1:
Pre-set maximums:
RX:           4096
...
Current hardware settings:
RX:           2048
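A minimal sketch of the compromise heuristic (an assumption consistent with the "rx-buffers tune" issue notes further below, max(cur, min(2048, hw_max / 2)); the real util parses ethtool -g output itself):

import subprocess

def increase_rx_buffers(device, cur, hw_max):
    target = max(cur, min(2048, hw_max // 2))
    if target <= cur:
        print("{0}'s rx ring buffer already has fine size.".format(device))
        return
    print('run: ethtool -G {0} rx {1}'.format(device, target))
    subprocess.check_call(['ethtool', '-G', device, 'rx', str(target)])

# increase_rx_buffers('eth1', cur=256, hw_max=4096) runs: ethtool -G eth1 rx 2048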

server-info

Rates hardware and its configuration. Much like lshw, but designed for a server's network processing role.

Information about server

➜  vscale-vm git:(folding) ✗ server-info --show
cpu:
  info:
    Architecture: x86_64
    BogoMIPS: 4399
    Byte Order: Little Endian
    CPU MHz: 2199
    CPU family: 6
    CPU op-mode(s): 32-bit, 64-bit
    CPU(s): 1
    Core(s) per socket: 1
    Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36
      clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon
      rep_good nopl eagerfpu pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic
      movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm
      abm 3dnowprefetch tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust
      bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat
    Hypervisor vendor: KVM
    L1d cache: 32K
    L1i cache: 32K
    L2 cache: 256K
    L3 cache: 25600K
    Model: 79
    Model name: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
    NUMA node(s): 1
    NUMA node0 CPU(s): 0
    On-line CPU(s) list: 0
    Socket(s): 1
    Stepping: 1
    Thread(s) per core: 1
    Vendor ID: GenuineIntel
    Virtualization: VT-x
    Virtualization type: full
  layout:
    '0': '0'
disk:
  vda:
    model: null
    size: 21474836480
    type: HDD
memory:
  devices:
    '0x1100':
      size: '512'
      speed: 0
      type: RAM
  size:
    MemFree: 78272
    MemTotal: 500196
    SwapFree: 0
    SwapTotal: 0
net:
  eth0:
    buffers:
      cur: 256
      max: 256
    conf:
      ip: ''
      vlan: false
    driver:
      driver: virtio_net
      version: 1.0.0
    queues:
      own: []
      rx: []
      rxtx: []
      shared: []
      tx: []
      unknown: []

Overall server rating

➜  vscale-vm git:(folding) ✗ server-info --rate --server
server: 1.7666666666666664

Subsystems rating

➜  vscale-vm git:(folding) ✗ server-info --rate --subsystem
cpu: 4.5
disk: 1.0
memory: 1.0
net: 1.3333333333333333
system: 1.0

Devices rating

➜  vscale-vm git:(folding) ✗ server-info --rate --device
cpu:
  BogoMIPS: 2
  CPU MHz: 2
  CPU(s): 1
  Core(s) per socket: 1
  L3 cache: 9
  Socket(s): 1
  Thread(s) per core: 10
  Vendor ID: 10
disk:
  vda: 1.0
memory:
  devices:
    '0x1100': 1.0
  size: 1.0
net:
  eth0: 1.3333333333333333
system:
  Hypervisor vendor: 1
  Virtualization type: 1

Device's detailed rating

➜  vscale-vm git:(folding) ✗ server-info --rate
cpu:
  BogoMIPS: 2
  CPU MHz: 2
  CPU(s): 1
  Core(s) per socket: 1
  L3 cache: 9
  Socket(s): 1
  Thread(s) per core: 10
  Vendor ID: 10
disk:
  vda:
    size: 1
    type: 1
memory:
  devices:
    '0x1100':
      size: 1
      speed: 1
      type: 1
  size:
    MemTotal: 1
    SwapTotal: 1
net:
  eth0:
    buffers:
      cur: 1
      max: 1
    driver: 2
    queues: 1
system:
  Hypervisor vendor: 1
  Virtualization type: 1

FAQ

Q: I see that the workload is distributed fine, but there is a lot of it. How do I go deeper and understand what my system is doing right now?

A: Try

perf top

How to contribute?

Close issues

Any help is welcome. Just comment on an issue with "I want to help, how can I solve this issue?" to start.

netutils-linux's People

Contributors

nick-carbonsoft, strizhechenko


netutils-linux's Issues

Automatically detect the queue prefix for a NIC.

It may differ from the driver's common prefix, and there may be multiple prefixes (separate rx/tx).

grep -o eth1-.* /proc/interrupts | tr -d '[0-9]' | sed -e 's/eth//g' | sort -u
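A hypothetical Python equivalent of the one-liner above (illustrative only): collect the distinct queue-name prefixes of a device from /proc/interrupts.

import re

def queue_prefixes(device, path='/proc/interrupts'):
    prefixes = set()
    with open(path) as fd:
        for line in fd:
            match = re.search(r'({0}-\D*)\d+'.format(re.escape(device)), line)
            if match:
                prefixes.add(match.group(1))
    return prefixes

# queue_prefixes('eth1') might return {'eth1-rx-', 'eth1-tx-'} or {'eth1-TxRx-'}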

meta-rss-ladder.

4 CPUs, 2 NICs, 4 queues in total.

eth0-rx-0
eth0-tx-0
eth0
eth1-rx-0
eth1-tx-0
eth1

The problem is that rss-ladder works with one device at a time.
Maybe some higher-layer utility should decide how to distribute the interrupts of multiple NICs.

rx-buffers tune

  1. If there are explicit ethtool options in the config - skip.
  2. Else, if ethtool returns an error on -G - skip.
  3. Else, set max(cur, min(2048, max(rx)/2)).

Queues for i40e

unknown:

  • i40e-eth0-TxRx-0
  • i40e-eth0-TxRx-1
  • i40e-eth0-TxRx-2
  • i40e-eth0-TxRx-3

Rework option parsing

Before #36:

  1. No more environment variables.
  2. https://github.com/strizhechenko/optparse-modular-example based parsing.
Proposed options (short option, long option, owners, default, help):

  • -i, --interval (basetop; default: 1) - interval between screen refreshes in seconds
  • -n, --iterations-limit (basetop; default: 60 or -1) - count of screen refreshes, -1 means an infinite loop
  • --no-delta-mode (basetop; default: False) - show metrics' values instead of growth
  • --delta-small-hide (basetop; default: True) - hide lines with only small changes or no changes at all
  • -l, --delta-small-hide-limit (basetop; default: True) - hide lines whose changes are less than this limit
  • -c, --(no-)color (basetop; default: True) - highlight NUMA nodes or sockets
  • --assert, --assert-mode (softnet-stat-top, link-rate; default: False) - stop running after errors are detected
  • --interrupt-file (irqtop; default: /proc/interrupts) - option for testing purposes on macOS
  • --softirqs-file (softirq-net-rx-top; default: /proc/softirqs) - option for testing purposes on macOS
  • --softnet-stat-file (softnet-stat-top; default: /proc/net/softnet_stat) - option for testing purposes on macOS
  • --dev, --devices (link-rate; default: []) - comma-separated list of devices to monitor
  • --dev-regex, --devices-regex (link-rate; default: ^.*$) - regex mask for devices to monitor
  • -s, --simple (link-rate; default: False) - hide the different kinds of errors, showing only general counters
  • --rx, --rx-only (link-rate, and maybe softirq-net-top; default: False) - hide tx counters
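A hypothetical sketch of how such modular option parsing could look with optparse (the names and structure here are illustrative assumptions, not the project's real API): each util declares only the options it owns and merges them with the shared base ones.

from optparse import OptionParser

BASETOP_OPTIONS = [
    (('-i', '--interval'), dict(type='int', default=1,
     help='interval between screen refreshes in seconds')),
    (('-n', '--iterations-limit'), dict(type='int', default=60,
     help='count of screen refreshes, -1 means an infinite loop')),
]

def make_parser(extra_options=()):
    parser = OptionParser()
    for args, kwargs in list(BASETOP_OPTIONS) + list(extra_options):
        parser.add_option(*args, **kwargs)
    return parser

# link-rate, for example, would pass its own options such as --devices-regex here.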

Implement pretty layout for network-top

irqtop - irqtop.diff_total

cpu0 cpu1 cpu2 cpu3
10   0    0    0     eth1-rx-0
0    20   0    0     eth1-rx-1
0    0    30   0     eth1-rx-2

softnet-stat-top + softirq-net-top + irqtop.diff_total

     total dropped squeezed irq net-rx net-tx rps collisions
cpu0 0     0       0        0   0      0      0   0
cpu1 0     0       0        0   0      0      0   0
cpu2 0     0       0        0   0      0      0   0

link-rate

                        RX          RX          RX          RX          RX          RX          RX          RX          RX          RX          TX          TX          TX
                   packets       bytes      errors     dropped      missed        fifo      length     overrun         crc       frame     packets       bytes      errors
eth0                    91       85672           0           0           0           0           0           0           0           0          91       11380           0
eth1                     0           0           0           0           0           0           0           0           0           0           0           0           0

Irqtop layout improvements: compact view while running on systems with a lot of cpu.

It's not so good how network-top / irqtop looks when running on 32+ cpus systems. Even 4k monitor doesn't help here. Scaling terminal to 0.2x size is weird too. Irqtop in these cases should look like:

# /proc/interrupts: /1000 mode

0  1  2  3  4  5  6  7

2  0  0  0  0  0  0  0 eth2-0
0  3  0  0  0  0  0  0 eth2-1
0  0  4  0  0  0  0  0 eth2-2
0  0  0  1  0  0  0  0 eth2-3
0  0  0  0  3  0  0  0 eth2-4
0  0  0  0  0  2  0  0 eth2-5
0  0  0  0  0  0  3  0 eth2-6
0  0  0  0  0  0  0  5 eth2-7

We can get rid of the IRQ number and the interrupt type, and strip the CPU names down to their numbers.
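A minimal sketch of such a compact row renderer (the divisor and formatting are illustrative assumptions):

def compact_row(queue_name, per_cpu_deltas, divisor=1000):
    cells = ['{0:2d}'.format(delta // divisor) for delta in per_cpu_deltas]
    return ' '.join(cells) + ' ' + queue_name

# compact_row('eth2-0', [2000, 0, 0, 0, 0, 0, 0, 0]) -> ' 2  0  0  0  0  0  0  0 eth2-0'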

--command support

It may be a good idea to provide a way to show some custom data while using network-top by passing commands.

for example:

--command 'uptime'
--command 'date'
--command 'myscript.sh'
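A hypothetical sketch of how the output of such commands could be gathered and appended below the usual tables (illustrative only; this feature does not exist yet):

import subprocess

def command_output(commands):
    lines = []
    for command in commands:
        try:
            output = subprocess.check_output(command, shell=True).decode().strip()
        except subprocess.CalledProcessError as err:
            output = str(err)
        lines.append('$ {0}\n{1}'.format(command, output))
    return '\n'.join(lines)

# command_output(['uptime', 'date']) would be rendered below network-top's own tables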

Unify -top utils' code

  • irqtop
  • softirq-net-rx-top
  • softnet-stat-top
  • link-rate

They should share one parent class, something like:

class Top(object):
    """Shared parent class for the top-like utils."""
    def __init__(self, filename):
        self.filename = filename
        self.cur, self.prev = None, None  # current and previous snapshots
    def parse(self):  # read self.filename into a snapshot
        raise NotImplementedError
    def diff(self):  # per-interval deltas between cur and prev
        raise NotImplementedError
    def tick(self):  # rotate snapshots: prev <- cur, cur <- parse()
        self.prev, self.cur = self.cur, self.parse()
    def run(self):  # main loop: tick, diff, print, sleep
        raise NotImplementedError
    def __repr__(self):
        raise NotImplementedError
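For illustration, a hypothetical subclass (names and parsing details are assumptions, not existing code) showing how softnet-stat-top could reuse the base class:

class SoftnetStatTop(Top):
    def parse(self):
        # one line per CPU, hexadecimal counters
        with open(self.filename) as fd:
            return [[int(column, 16) for column in line.split()] for line in fd]
    def diff(self):
        return [[c - p for c, p in zip(cur, prev)]
                for cur, prev in zip(self.cur, self.prev)]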

Bug in network-top with python 2.6

Traceback (most recent call last):
  File "/usr/bin/network-top", line 7, in <module>
    NetworkTop().run()
  File "/usr/lib/python2.6/site-packages/netutils_linux/network_top.py", line 23, in __init__
    self.numa = Numa(devices=self.options.devices, fake=self.options.random)
  File "/usr/lib/python2.6/site-packages/netutils_linux/numa.py", line 20, in __init__
    self.nodes = self.FAKE if fake else self.node_cpu_dict()
  File "/usr/lib/python2.6/site-packages/netutils_linux/numa.py", line 38, in node_cpu_dict
    return dict((int(node.strip('node')), self.cpulist_read(node)) for node in self.node_list())
  File "/usr/lib/python2.6/site-packages/netutils_linux/numa.py", line 38, in <genexpr>
    return dict((int(node.strip('node')), self.cpulist_read(node)) for node in self.node_list())
  File "/usr/lib/python2.6/site-packages/netutils_linux/numa.py", line 30, in cpulist_read
    return list(self.cpulist_parse(fd.read()))
  File "/usr/lib/python2.6/site-packages/netutils_linux/numa.py", line 33, in cpulist_parse
    for _min, _max in [map(int, r.split('-')) for r in cpulist.split(',')]:
ValueError: need more than 1 value to unpack
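The traceback comes from a cpulist entry without a dash (e.g. '0' on a single-CPU host): map(int, '0'.split('-')) yields only one value, so unpacking into (_min, _max) fails. A minimal sketch of a tolerant parser (not the project's actual fix):

def cpulist_parse(cpulist):
    for chunk in cpulist.strip().split(','):
        boundaries = [int(cpu) for cpu in chunk.split('-')]
        for cpu in range(boundaries[0], boundaries[-1] + 1):
            yield cpu

# list(cpulist_parse('0')) -> [0]; list(cpulist_parse('0-3,8')) -> [0, 1, 2, 3, 8]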

autotune-network tests long term bugs

1xE5200.82571EB_and_AR8121.L2.manual

net:

  • eth0
  • eth1
  • eth2

later: rss-ladder:

  • eth0: cpu0
  • eth1: cpu1
  • eth2: cpu0

1xE31240.2x82576.L2.manual

Add eth2/eth3 later.

  • eth0, eth1 - ok
  • eth2, eth3 - cpu0
  • eth2-rx: cpu1
  • eth2-tx: cpu2
  • eth3-rx: cpu3
  • eth3-tx: cpu4

1xi3.single_i350.l2_mixed.manual

Investigate on client's machine if CPU1/CPU3 may be used at all.

if so:

  • eth0 - SKIP.
  • eth6 - standard ladder.

2xE5-2640.i350_and_82599ES.l2_mixed.masterconf

  • eth0: cpu0..7
  • eth1: cpu: 8..15
  • eth2: cpu: 0..7 + 16..23
  • eth3: cpu: 8..15 + 24..31
  • eth4: cpu: 0..7 + 16..23
  • eth5: cpu: 8..15 + 24..31

But maybe I'm wrong; we need to look at per-NIC load.

1xi7-4770K.1x82541PI.L2.manual

  • eth0: cpu1
  • eth1: cpu2

Repo structure: separate top-like, tuning and rate utils.

  • utils/ - everything stays the same, just import X; if __name__ == '__main__': X().run(), plus the bash utils
  • netutils-linux-monitoring/ - top-like utils
  • netutils-linux-tuning/ - rx-buffers-increase, autorps, autotune-reductor, maximize-cpu-freq, rss-ladder
  • netutils-linux-hardware/ - server-info, server-info-show, server-info-collect, server-info-rate
  • netutils-linux-misc/ - pcap-tcp-stream, subnet2iplist

Global network top

Once #35 is done it will be easy to implement a console dashboard with the unified Top-based classes.
