sonic-net / sonic-swss

SONiC Switch State Service (SwSS)

Home Page: https://azure.github.io/SONiC

License: Other

sonic-swss's Introduction

SONiC - SWitch State Service - SWSS

Description

The SWitch State Service (SWSS) is a collection of software that provides a database interface for communication with and state representation of network applications and network switch hardware.

Getting Started

Install

Before installing, add key and package sources:

sudo apt-key adv --keyserver apt-mo.trafficmanager.net --recv-keys 417A0893
echo 'deb http://apt-mo.trafficmanager.net/repos/sonic/ trusty main' | sudo tee -a /etc/apt/sources.list.d/sonic.list
sudo apt-get update

Install dependencies:

sudo apt-get install redis-server -t trusty
sudo apt-get install libhiredis0.13 -t trusty
sudo apt-get install libzmq5 libzmq3-dev

Install building dependencies:

sudo apt-get install libtool
sudo apt-get install autoconf automake
sudo apt-get install dh-exec

There are a few different ways you can install SONiC-SWSS.

Install from Debian Repo

For your convenience, you can install prepared packages on Debian Jessie:

sudo apt-get install swss

Install from Source

Check out the source with git clone https://github.com/sonic-net/sonic-swss.git and install it yourself.

Put the SAI header files that you used to compile libsairedis into /usr/include/sai.

Install prerequisite packages:

sudo apt-get install libswsscommon libswsscommon-dev libsairedis libsairedis-dev

You can compile and install from source using:

./autogen.sh
./configure
make && sudo make install

You can also build a Debian package using:

./autogen.sh
fakeroot debian/rules binary

Need Help?

For general questions, setup help, or troubleshooting:

For bug reports or feature requests, please open an Issue.

Contribution guide

See the contributors guide for information about how to contribute.

GitHub Workflow

We're following basic GitHub Flow. If you have no idea what we're talking about, check out GitHub's official guide. Note that merge is only performed by the repository maintainer.

Guide for performing commits:

  • Isolate each commit to one component/bugfix/issue/feature
  • Use a standard commit message format:
[component/folder touched]: Description of the intent of your changes

[List of changes]

Signed-off-by: Your Name [email protected]

For example:

swss-common: Stabilize the ConsumerTable

* Fixing autoreconf
* Fixing unit-tests by adding checkers and initialize the DB before start
* Adding the ability to select from multiple channels
* Health-Monitor - The idea of the patch is that if something went wrong with the notification channel,
  we will have the option to know about it (Query the LLEN table length).

  Signed-off-by: [email protected]
  • Each developer should fork this repository and add the team as a Contributor
  • Push your changes to your private fork and open a pull request against this repository
  • Use a pull request to do code review
  • Use issues to keep track of what is going on
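
The commit message format described in this guide can be checked mechanically. The following is a minimal sketch in Python, not a tool shipped with this repository: the function name `check_commit_message` and the exact rules it enforces (a `component: description` subject line and a `Signed-off-by:` line) are assumptions drawn from the guide, and the email address is a placeholder.

```python
import re

# Subject line: "<component/folder>: <description>" (assumed reading of the guide).
SUBJECT_RE = re.compile(r"^[^\s:][^:]*: \S.*$")


def check_commit_message(message):
    """Return a list of problems found in a commit message (empty if none)."""
    problems = []
    lines = message.strip().splitlines()
    if not lines or not SUBJECT_RE.match(lines[0]):
        problems.append("subject must look like 'component: description'")
    if not any(line.strip().startswith("Signed-off-by:") for line in lines[1:]):
        problems.append("missing 'Signed-off-by:' line")
    return problems


example = """swss-common: Stabilize the ConsumerTable

* Fixing autoreconf

Signed-off-by: Your Name <you@example.com>
"""
print(check_commit_message(example))
print(check_commit_message("fix stuff"))
```

A check like this could run as a local commit hook; whether the maintainers enforce the format automatically is not stated here.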

sonic-swss's People

Contributors

akhileshsamineni, andriymoroz-mlnx, bingwang-ms, daall, dgsudharsan, eladraz, jipanyang, junchao-mellanox, kcudnik, lguohan, liushilongbuaa, marian-pritsak, nazariig, ndancejic, oleksandrivantsiv, pavel-shirshov, prsunny, pterosaur, qiluo-msft, shi-su, sihuihan88, stcheng, stepanblyschak, stephenxs, theasianpianist, vivekrnv, wendani, yakiv-huryk, yxieca, zhenggen-xu

sonic-swss's Issues

GPF in orchagent. ACL counters related code

Reading symbols from /usr/bin/orchagent...Reading symbols from /usr/lib/debug/.build-id/ef/36e12d55ea2b4540a35abab394a233817622dc.debug...done.
done.
[New LWP 108]
[New LWP 97]
[New LWP 107]
[New LWP 105]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `orchagent -m 90:b1:1c:f4:a8:51'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000000000046a139 in pair (this=0x7f1b1fffee50)
at /usr/include/c++/4.9/bits/stl_pair.h:127
127 /usr/include/c++/4.9/bits/stl_pair.h: No such file or directory.
(gdb) bt
#0 0x000000000046a139 in pair (this=0x7f1b1fffee50)
at /usr/include/c++/4.9/bits/stl_pair.h:127
#1 AclOrch::collectCountersThread (pAclOrch=0x2387690) at aclorch.cpp:1315
#2 0x00007f1b25a7f970 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007f1b26825064 in start_thread (arg=0x7f1b1ffff700) at pthread_create.c:309
#4 0x00007f1b251ef62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

ECMP routes are not handled correctly

Steps to reproduce:

  1. Bring up two interfaces connected to hosts.
  2. Ping host1 to create entry in neighbor table:
root@arc-switch1028:/home/admin# ifconfig Ethernet0
Ethernet0 Link encap:Ethernet  HWaddr 00:02:03:04:05:00
          inet addr:20.0.1.100  Bcast:20.0.1.255  Mask:255.255.255.0
          inet6 addr: fe80::202:3ff:fe04:500/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9216  Metric:1
          RX packets:130 errors:0 dropped:0 overruns:0 frame:0
          TX packets:138 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:10692 (10.4 KiB)  TX bytes:13052 (12.7 KiB)

root@arc-switch1028:/home/admin# ping 20.0.1.1
PING 20.0.1.1 (20.0.1.1) 56(84) bytes of data.
64 bytes from 20.0.1.1: icmp_seq=1 ttl=64 time=0.181 ms
^C
--- 20.0.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.181/0.181/0.181/0.000 ms
  3. Ping host2 to create entry in neighbor table:
root@arc-switch1028:/home/admin# ifconfig Ethernet4
Ethernet4 Link encap:Ethernet  HWaddr 00:02:03:04:05:00
          inet addr:20.0.2.100  Bcast:20.0.2.255  Mask:255.255.255.0
          inet6 addr: fe80::202:3ff:fe04:500/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9216  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:648 (648.0 B)

root@arc-switch1028:/home/admin# ping 20.0.2.1
PING 20.0.2.1 (20.0.2.1) 56(84) bytes of data.
64 bytes from 20.0.2.1: icmp_seq=1 ttl=64 time=0.370 ms
64 bytes from 20.0.2.1: icmp_seq=2 ttl=64 time=0.238 ms
64 bytes from 20.0.2.1: icmp_seq=3 ttl=64 time=0.256 ms
^C
--- 20.0.2.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.238/0.288/0.370/0.058 ms
  4. Verify neighbor table:
root@arc-switch1028:/home/admin# ip neigh
fe80::7efe:90ff:fe5e:6b12 dev eth0 lladdr 7c:fe:90:5e:6b:12 STALE
10.224.23.254 dev eth0 lladdr 40:b4:f0:cb:e9:81 REACHABLE
20.0.2.1 dev Ethernet4 lladdr 24:8a:07:1e:de:e3 REACHABLE
20.0.1.1 dev Ethernet0 lladdr 24:8a:07:1e:de:cf REACHABLE
  5. Create route with two next hops with different metrics:
root@arc-switch1028:/home/admin# ip route add 2.2.2.0/24  via 20.0.1.1 metric  1
root@arc-switch1028:/home/admin# ip route add 2.2.2.0/24  via 20.0.2.1 metric  10
root@arc-switch1028:/home/admin# ip route
default via 10.224.23.254 dev eth0  proto zebra
2.2.2.0/24 via 20.0.1.1 dev Ethernet0  metric 1
2.2.2.0/24 via 20.0.2.1 dev Ethernet4  metric 10
10.224.23.0/24 dev eth0  proto kernel  scope link  src 10.224.23.82
20.0.1.0/24 dev Ethernet0  proto kernel  scope link  src 20.0.1.100
20.0.2.0/24 dev Ethernet4  proto kernel  scope link  src 20.0.2.100
240.127.1.0/24 dev docker0  proto kernel  scope link  src 240.127.1.1
  6. Verify that route entry with two next hops is created in APP DB:
root@arc-switch1028:/home/admin# docker exec -it database redis-cli
127.0.0.1:6379> HGETALL "ROUTE_TABLE:2.2.2.0/24"
1) "nexthop"
2) "20.0.2.1"
3) "ifname"
4) "Ethernet4"
127.0.0.1:6379>

Actual result: An entry with only one next hop is created in the APP DB table.
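
To make the difference between the observed and the expected entry concrete, the sketch below parses the flat field/value list that `HGETALL` returns into next-hop pairs. It assumes that a multi-path `ROUTE_TABLE` entry stores its next hops and interfaces as comma-separated lists in the `nexthop` and `ifname` fields. That convention and the helper name `parse_route_entry` are assumptions for illustration, not something stated in this report.

```python
def parse_route_entry(flat):
    """Turn a flat [field, value, field, value, ...] list into (nexthop, ifname) pairs."""
    fields = dict(zip(flat[0::2], flat[1::2]))
    nexthops = fields.get("nexthop", "").split(",")
    ifnames = fields.get("ifname", "").split(",")
    return list(zip(nexthops, ifnames))


# Entry observed in the report: only one next hop is present.
observed = ["nexthop", "20.0.2.1", "ifname", "Ethernet4"]
print(parse_route_entry(observed))

# Hypothetical entry carrying both next hops (assumed comma-separated format).
expected = ["nexthop", "20.0.1.1,20.0.2.1", "ifname", "Ethernet0,Ethernet4"]
print(parse_route_entry(expected))
```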

lots of doTask issues that flood the /var/log/syslog

Aug 12 06:38:08 str-s6000-on-1 DEBUG orchagent: :> doTask: enter
Aug 12 06:38:08 str-s6000-on-1 DEBUG orchagent: :< doTask: exit
Aug 12 06:38:08 str-s6000-on-1 DEBUG orchagent: :> doTask: enter
Aug 12 06:38:08 str-s6000-on-1 DEBUG orchagent: :< doTask: exit
Aug 12 06:38:08 str-s6000-on-1 DEBUG orchagent: :> doTask: enter
Aug 12 06:38:08 str-s6000-on-1 DEBUG orchagent: :< doTask: exit
Aug 12 06:38:08 str-s6000-on-1 DEBUG orchagent: :> doTask: enter
Aug 12 06:38:08 str-s6000-on-1 DEBUG orchagent: :< doTask: exit
Aug 12 06:38:08 str-s6000-on-1 DEBUG orchagent: :> doTask: enter

"systemctl restart swss" causes PORT_TABLE ConfigDone set prematurely

Because all Ethernet ports were already created by the previous swss instance, the kernel sends an RTM_NEWLINK notification for every port as soon as portsyncd registers for netlink events. Portsyncd sets PORT_TABLE ConfigDone after seeing all the physical ports.

Orchagent later deletes and recreates those Ethernet ports, which causes problems for any configuration that depends on the Ethernet ports existing.

Aug 24 23:27:57.0 sonic INFO supervisord: portsyncd Read port configuration file...
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet44 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:768 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet19 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:769 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet10 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:770 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet31 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:771 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet38 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:772 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet45 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:773 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet23 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:774 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet34 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:775 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet6 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:776 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet18 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:777 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet0 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:778 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet39 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:779 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet35 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:780 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet1 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:781 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet5 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:782 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet17 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:783 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet12 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:784 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet25 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:785 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet22 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:786 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet41 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:787 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet32 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:788 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet2 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:789 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet4 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:790 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet24 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:791 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet14 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:792 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet26 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:793 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet16 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:794 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet40 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:795 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet60 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:796 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet36 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:797 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet42 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:798 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet29 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:799 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet64 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:800 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet15 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:801 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet47 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:802 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet11 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:803 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:PortChannel46 admin:1 oper:1 addr:00:05:64:30:73:c0 ifindex:693 master:0 type:team
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet68 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:750 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet27 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:751 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet46 admin:1 oper:1 addr:00:05:64:30:73:c0 ifindex:752 master:693
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet20 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:753 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet43 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:754 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet8 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:755 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet3 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:756 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet9 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:757 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet33 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:758 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet30 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:759 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet37 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:760 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet7 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:761 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet56 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:762 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet13 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:763 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet48 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:764 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet21 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:765 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet28 admin:0 oper:0 addr:00:05:64:30:73:c0 ifindex:766 master:0
Aug 24 23:27:57.0 sonic INFO portsyncd: :- onMsg: nlmsg type:16 key:Ethernet52 admin:1 oper:1 addr:00:05:64:30:73:c0 ifindex:767 master:0

Separate LAG/LAG_MEMBER and VLAN/VLAN_MEMBER tables

We should separate LAG_TABLE from LAG_MEMBER_TABLE; they are different and have different schemas.

acsadmin@str-s6000-acs-11:~$ redis-cli keys LAG_TABLE:PortChannel*

  1. "LAG_TABLE:PortChannel8:Ethernet8"
  2. "LAG_TABLE:PortChannel0:Ethernet4"
  3. "LAG_TABLE:PortChannel32"
  4. "LAG_TABLE:PortChannel32:Ethernet32"
  5. "LAG_TABLE:PortChannel24:Ethernet28"
  6. "LAG_TABLE:PortChannel16"
  7. "LAG_TABLE:PortChannel24:Ethernet24"
  8. "LAG_TABLE:PortChannel40:Ethernet44"
  9. "LAG_TABLE:PortChannel48:Ethernet48"
  10. "LAG_TABLE:PortChannel56"
  11. "LAG_TABLE:PortChannel0:Ethernet0"
  12. "LAG_TABLE:PortChannel48"
  13. "LAG_TABLE:PortChannel56:Ethernet56"
  14. "LAG_TABLE:PortChannel8:Ethernet12"
  15. "LAG_TABLE:PortChannel16:Ethernet20"
  16. "LAG_TABLE:PortChannel32:Ethernet36"
  17. "LAG_TABLE:PortChannel40"
  18. "LAG_TABLE:PortChannel40:Ethernet40"
  19. "LAG_TABLE:PortChannel0"
  20. "LAG_TABLE:PortChannel48:Ethernet52"
  21. "LAG_TABLE:PortChannel8"
  22. "LAG_TABLE:PortChannel16:Ethernet16"
  23. "LAG_TABLE:PortChannel24"
  24. "LAG_TABLE:PortChannel56:Ethernet60"
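
The key listing shows why the two kinds of entry are hard to tell apart: they share one prefix and differ only in the number of colon-separated components. A minimal sketch of the proposed split follows. The target name `LAG_MEMBER_TABLE` comes from the proposal in this issue, while the helper `split_lag_keys` and the rewritten key layout are assumptions.

```python
def split_lag_keys(keys):
    """Partition mixed LAG_TABLE keys into LAG keys and LAG member keys."""
    lags, members = [], []
    for key in keys:
        parts = key.split(":")
        if len(parts) == 2:  # LAG_TABLE:<lag>
            lags.append(key)
        elif len(parts) == 3:  # LAG_TABLE:<lag>:<member port>
            members.append("LAG_MEMBER_TABLE:" + ":".join(parts[1:]))
        else:
            raise ValueError("unexpected key: " + key)
    return lags, members


keys = [
    "LAG_TABLE:PortChannel8:Ethernet8",
    "LAG_TABLE:PortChannel32",
    "LAG_TABLE:PortChannel32:Ethernet32",
]
lags, members = split_lag_keys(keys)
print(lags)
print(members)
```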

Kernel routes get lost after SwSS restart

127.0.0.1:6379> keys ROUTE_TABLE:240*
1) "ROUTE_TABLE:240.127.1.0/24"
127.0.0.1:6379> hgetall ROUTE_TABLE:240.127.1.0/24
1) "nexthop"
2) ""
3) "ifname"
4) "docker0"

The first time Quagga starts, it will insert all the kernel routes into the database.
However, after SwSS restarts, the database gets flushed and all the previous information is gone.
As all the front panel interfaces are recreated, all the routes pointing to front panel ports are re-inserted into the database. However, the routes that are not pointing to front panel ports are not changed, and this part of the information gets lost.

Currently this is not a risk, because routes that point to eth0, lo, or docker0 are ignored unless they previously pointed to front panel ports.

For now, the workaround is to restart Quagga.

Add VLAN support

VLAN information will be appended to the configuration file port_config.ini. While reading the file, portsyncd will set up the VLANs. VLAN logic will be handled in the portsOrch class.

Failed to set default drop route to a new next hop

To have deterministic behavior, when orchagent starts, it creates IPv4 and IPv6 default routes and sets them to DROP.

Later, when the default routes are learned, orchagent resets these routes to new next hop IDs and sets the packet action to FORWARD.

However, Mellanox SAI has an issue changing a route from DROP to FORWARD with a next hop ID using the set_attribute API.

ACL range parsing get wrong data

Apr 21 07:03:40 str-s6000-acs-11 NOTICE swssconfig: :- main: Loading config from JSON file:acltb_test_rules.json...
Apr 21 07:03:40 str-s6000-acs-11 NOTICE orchagent: :- doAclRuleTask: OP: SET, TABLE_ID: ACL_Testbed_Test_Table, RULE_ID: Rule0A1_DST_port_range_test
Apr 21 07:03:40 str-s6000-acs-11 NOTICE orchagent: :- validateAddMatch: 4656 --> 4671
Apr 21 07:03:40 str-s6000-acs-11 NOTICE orchagent: :- doAclRuleTask: Added match attribute 'L4_DST_PORT_RANGE'
Apr 21 07:03:40 str-s6000-acs-11 NOTICE orchagent: :- create: Creating range object 4609..4671

I have "l4_dst_port_range" : "4656-4671" in the ACL config JSON file.
After enabling some debug logs, I noticed that 4656 becomes 4609, and the same value appears in the SAI function calls:
2017-04-21.05:08:39.132453|c|SAI_OBJECT_TYPE_ACL_RANGE:oid:0xa0000000005fc|SAI_ACL_RANGE_ATTR_TYPE=SAI_ACL_RANGE_TYPE_L4_DST_PORT_RANGE|SAI_ACL_RANGE_ATTR_LIMIT=4609,4671

I added one output line to the validateAddMatch function to show the value read after https://github.com/Azure/sonic-swss/blob/master/orchagent/aclorch.cpp#L215:

SWSS_LOG_NOTICE("%d --> %d", value.u32range.min, value.u32range.max);

It shows that the value is 4656 at that point, but by the time the range object is created the number has become 4609.
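
Looking at the two values in hexadecimal may help narrow the search. The sketch below only does arithmetic on the numbers quoted in this report. It shows that 4609 equals 4656 with its low byte replaced by 0x01, which would be consistent with a one-byte write landing on the same storage as the range minimum. That interpretation is a hypothesis, not a confirmed diagnosis.

```python
configured_min = 4656  # value logged in validateAddMatch
observed_min = 4609    # value seen when the range object is created

print(hex(configured_min), hex(observed_min))

# Compare the two values byte by byte.
high_bytes_equal = (configured_min >> 8) == (observed_min >> 8)
low_byte_after = observed_min & 0xFF

# Replace the low byte of the configured value with 0x01 and compare.
reconstructed = (configured_min & ~0xFF) | 0x01
print(high_bytes_equal, low_byte_after, reconstructed)
```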

use SAI_HOSTIF_USER_DEFINED_TRAP_TYPE_NEIGH for SAI 1.0

In SAI 1.0, you can configure a trap group for user-defined packets, so I think we can add it to the CoPP definitions and extend the SONiC code to configure user-defined traps in addition to regular traps.

netdev link status does not match the physical port status

The knet netdev link statuses are all up, but the actual physical links are not up.

acsadmin@CCPSCH01030BBLF:~$ cat /proc/bcm/knet/link 
Software link status:
  Ethernet0      up
  Ethernet4      up
  Ethernet8      up
  Ethernet12     up
  Ethernet16     up
  Ethernet20     up
  Ethernet24     up
  Ethernet28     up
  Ethernet32     up
  Ethernet36     up
  Ethernet40     up
  Ethernet44     up
  Ethernet48     up
  Ethernet52     up
  Ethernet56     up
  Ethernet60     up
  Ethernet64     up
  Ethernet68     up
  Ethernet72     up
  Ethernet76     up
  Ethernet80     up
  Ethernet84     up
  Ethernet88     up
  Ethernet92     up
  Ethernet96     up
  Ethernet100    up
  Ethernet104    up
  Ethernet108    up
  Ethernet112    up
  Ethernet116    up
  Ethernet120    up
  Ethernet124    up
acsadmin@CCPSCH01030BBLF:~$  

#1 AclOrch::collectCountersThread (pAclOrch=0x1614da0) at aclorch.cpp:1279

not sure if it is related to #188

Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/orchagent...Reading symbols from /usr/lib/debug/.build-id/f2/1ca493a357d807d2e38833574d6bdc00518151.debug...done.
done.
[New LWP 137]
[New LWP 132]
[New LWP 133]
[New LWP 124]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `orchagent -m ec:f4:bb:fe:80:90'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000000000046885e in pair (this=0x7f769e386e50) at /usr/include/c++/4.9/bits/stl_pair.h:127
127 /usr/include/c++/4.9/bits/stl_pair.h: No such file or directory.
(gdb) bt
#0 0x000000000046885e in pair (this=0x7f769e386e50) at /usr/include/c++/4.9/bits/stl_pair.h:127
#1 AclOrch::collectCountersThread (pAclOrch=0x1614da0) at aclorch.cpp:1279
#2 0x00007f769fd02970 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007f76a0aa8064 in start_thread (arg=0x7f769e387700) at pthread_create.c:309
#4 0x00007f769f47262d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) q

Repeated error messages for 'still referenced neighbor' by orchagent

Orchagent repeatedly prints the following message in /var/log/syslog, at intervals of approximately 1 second.

Example:
Dec 1 22:16:29 str-msn2700-04 ERR orchagent: :- removeNeighbor: Neighbor is still referenced ip:10.0.0.31
Dec 1 22:16:30 str-msn2700-04 ERR orchagent: :- removeNeighbor: Neighbor is still referenced ip:10.0.0.31
Dec 1 22:16:30 str-msn2700-04 ERR orchagent: :- removeNeighbor: Neighbor is still referenced ip:10.0.0.31
Dec 1 22:16:31 str-msn2700-04 ERR orchagent: :- removeNeighbor: Neighbor is still referenced ip:10.0.0.31

The error needs to be investigated.
A separate problem is that the repeated messages make the syslog grow constantly.

unresolved neighbor causes a flood of error messages

Aug  4 17:58:13 str-msn2700-05 DEBUG orchagent: :< addRoute: exit
Aug  4 17:58:13 str-msn2700-05 DEBUG orchagent: :< doTask: exit
Aug  4 17:58:13 str-msn2700-05 DEBUG orchagent: :> doTask: enter
Aug  4 17:58:13 str-msn2700-05 DEBUG orchagent: :> addRoute: enter
Aug  4 17:58:13 str-msn2700-05 NOTICE orchagent: :- addRoute: Failed to get next hop entry ip:10.0.0.35

add lag member failed on mellanox platform

Apr 14 19:40:46 str-msn2700-04 NOTICE orchagent: :- addLagMember: Add member Ethernet120 to LAG PortChannel03 lid:20000000005a1 pid:100000000000f
Apr 14 19:40:46 str-msn2700-04 ERR syncd: :- handle_generic: failed to create -1
Apr 14 19:40:46 str-msn2700-04 ERR syncd: :- processEvent: failed to execute api: create, key: SAI_OBJECT_TYPE_LAG_MEMBER:oid:0x1a0000000005b8, status: SAI_STATUS_FAILURE
Apr 14 19:40:46 str-msn2700-04 ERR syncd: :- processEvent: field: SAI_LAG_MEMBER_ATTR_LAG_ID, value: oid:0x20000000005a1
Apr 14 19:40:46 str-msn2700-04 ERR syncd: :- processEvent: field: SAI_LAG_MEMBER_ATTR_PORT_ID, value: oid:0x100000000000f
Apr 14 19:40:46 str-msn2700-04 NOTICE syncd: :- exit_and_notify: sending switch_shutdown_request notification to OA
Apr 14 19:40:46 str-msn2700-04 NOTICE orchagent: :- handle_switch_shutdown_request: switch shutdown request
Apr 14 19:40:46 str-msn2700-04 ERR orchagent: :- on_switch_shutdown_request: Syncd stopped
Apr 14 19:40:46 str-msn2700-04 INFO swss.sh[2648]: terminate called without an active exception

default route isn't removed from ASIC DB

root@str-s-acs-1:/# ip route show
default via 10.3.146.1 dev eth0  proto zebra 

127.0.0.1:6379> keys ROUTE_TABLE*
1) "ROUTE_TABLE:0.0.0.0/0"
2) "ROUTE_TABLE:10.0.0.8/31"
3) "ROUTE_TABLE:fc00::18/126"
4) "ROUTE_TABLE:fc00::8/126"
5) "ROUTE_TABLE:10.0.0.12/31"
6) "ROUTE_TABLE:10.0.0.4/31"
7) "ROUTE_TABLE:fc00::10/126"
8) "ROUTE_TABLE:fc00::/126"
9) "ROUTE_TABLE:10.0.0.0/31"

127.0.0.1:6379> hgetall ROUTE_TABLE:0.0.0.0/0
1) "nexthop"
2) "10.3.146.1"
3) "ifname"
4) "eth0"

127.0.0.1:6379[1]> hgetall "ASIC_STATE:SAI_OBJECT_TYPE_ROUTE_ENTRY:{\"dest\":\"0.0.0.0/0\",\"vr\":\"oid:0x3000000000002\"}"
1) "SAI_ROUTE_ATTR_PACKET_ACTION"
2) "SAI_PACKET_ACTION_FORWARD"
3) "SAI_ROUTE_ATTR_NEXT_HOP_ID"
4) "oid:0x5000000000ade"
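
When comparing APP DB and ASIC DB by hand, it helps to pull the prefix and virtual router out of the ASIC DB key. The sketch below assumes the key layout visible in the output above (`ASIC_STATE:<object type>:<JSON>`); the helper name `parse_asic_route_key` is hypothetical.

```python
import json


def parse_asic_route_key(key):
    """Split an ASIC_STATE route key into (object type, destination prefix, virtual router)."""
    # Split only on the first two colons: the JSON payload contains colons itself.
    table, object_type, payload = key.split(":", 2)
    if table != "ASIC_STATE":
        raise ValueError("not an ASIC_STATE key: " + key)
    entry = json.loads(payload)
    return object_type, entry["dest"], entry["vr"]


key = 'ASIC_STATE:SAI_OBJECT_TYPE_ROUTE_ENTRY:{"dest":"0.0.0.0/0","vr":"oid:0x3000000000002"}'
print(parse_asic_route_key(key))
```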

IP2ME and Subnet (local) routes are not created for VLAN interface.

Steps to reproduce:

  • Prepare a minigraph file with a VLAN interface and at least one port attached to that interface
  • Load the minigraph
  • Reboot the device
  • Try to ping the VLAN interface IP address

Observed behavior:
ARP request packets are trapped to the CPU, and the VLAN interface sends an ARP reply.
ICMP packets with destination IP address equal to VLAN interface IP address are not trapped.
No notice messages about subnet and IP2ME routes creation for VLAN interface in logs.

[v1.0.3] Bridge port remove error

I hit this issue and noticed this comment in the code:
"
/* Flush FDB entries pointing to this bridge port */
// TODO: Remove all FDB entries associated with this bridge port before
// removing the bridge port itself
"
Is this enhancement being worked on? Thanks!

Aug 22 00:25:23.0 sonic ERR portsyncd: :- readMe: netlink reports an error=-4 on reading a netlink socket
Aug 22 00:25:23.0 sonic ERR teamsyncd: :- readMe: netlink reports an error=-4 on reading a netlink socket
Aug 22 00:25:23.0 sonic NOTICE orchagent: :- removeVlanMember: Remove member PortChannel46 from VLAN Vlan2000 lid:7d0 vmid:270000000009cb
Aug 22 00:25:23.955989 sonic INFO kernel: [271770.105867] device Ethernet46 left promiscuous mode
Aug 22 00:25:23.956003 sonic INFO kernel: [271770.105875] Bridge: port 1(PortChannel46) entered disabled state
Aug 22 00:25:23.956005 sonic INFO kernel: [271770.106632] device Bridge entered promiscuous mode
Aug 22 00:25:23.0 sonic INFO syncd: brcm_sai_remove_vlan_member:552 SAI Enter brcm_sai_remove_vlan_member
Aug 22 00:25:23.0 sonic INFO orchagent: :- setPortPvid: Set pvid 1 to port pid:1000000000031
Aug 22 00:25:23.0 sonic INFO syncd: brcm_sai_remove_vlan_member:622 SAI Exit brcm_sai_remove_vlan_member
Aug 22 00:25:23.0 sonic NOTICE orchagent: :- setHostIntfsStripTag: Set SAI_HOSTIF_VLAN_TAG_STRIP to host interface Ethernet46
Aug 22 00:25:23.0 sonic ERR orchagent: :- meta_generic_validation_remove: object 0x3a0000000009b0 reference count is 2, can't remove
Aug 22 00:25:23.0 sonic ERR orchagent: :- removeBridgePort: Failed to remove bridge port PortChannel46 from default 1Q bridge, rv:-5

create, key: SAI_OBJECT_TYPE_LAG_MEMBER:oid:0x1a0000000005b7, status: SAI_STATUS_FAILURE

Apr 27 17:07:10.766478 str-msn2700-04 WARNING kernel: [ 2065.173842] sx_netdev_handle_pude_event: Called for logical port - 10F00 status DOWN
Apr 27 17:07:10.770758 str-msn2700-04 INFO kernel: [ 2065.174235] Vlan1000: port 5(Ethernet92) entered disabled state
Apr 27 17:07:10.0 str-msn2700-04 NOTICE orchagent: :- doPortTask: Set port Ethernet92 admin status to up
Apr 27 17:07:10.0 str-msn2700-04 NOTICE orchagent: :- doPortTask: Set port Ethernet92 admin status to up
Apr 27 17:07:10.0 str-msn2700-04 NOTICE orchagent: :- doPortTask: Set port Ethernet92 admin status to up
Apr 27 17:07:10.0 str-msn2700-04 NOTICE orchagent: :- on_port_state_change: Get port state change notification id:1000000000008 status:2
Apr 27 17:07:10.0 str-msn2700-04 NOTICE orchagent: :- setHostIntfsOperStatus: Set operation status DOWN to host interface Ethernet92
Apr 27 17:07:32.026510 str-msn2700-04 INFO ansible-<stdin>: Invoked with ip_path=/sbin/ip
Apr 27 17:07:53.0 str-msn2700-04 ERR syncd: :- handle_generic: failed to create -1
Apr 27 17:07:53.0 str-msn2700-04 NOTICE orchagent: :- handle_switch_shutdown_request: switch shutdown request
Apr 27 17:07:53.0 str-msn2700-04 ERR orchagent: :- on_switch_shutdown_request: Syncd stopped
Apr 27 17:07:53.0 str-msn2700-04 ERR syncd: :- processEvent: failed to execute api: create, key: SAI_OBJECT_TYPE_LAG_MEMBER:oid:0x1a0000000005b7, status: SAI_STATUS_FAILURE
Apr 27 17:07:53.753585 str-msn2700-04 INFO swss.sh[2631]: terminate called without an active exception

High CPU usage of orchagent due to unresolved next hops

When an interface comes up, BGP sessions are established and fpmsyncd sends thousands of messages. These messages are consumed by the routeorch class, and all the routes stay pending until the corresponding neighbors and next hops are created. During this time, every route that points to an unresolved next hop is retried continuously until the next hop is finally resolved. This approach wastes computing resources and should be replaced by logic that adds the corresponding routes once a next hop is added. In the meantime, users may wait a long time before routes are synced and may see high CPU usage from the orchagent daemon. The orchagent daemon already has an observer/subject module, which could be leveraged for this scenario.
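
The event-driven alternative proposed in this issue can be sketched as follows. This is an illustrative Python model, not orchagent code: the class `PendingRoutes` and its methods are hypothetical, and the real daemon is written in C++. Routes wait in a map keyed by next hop and are installed once, when that next hop is resolved, instead of being retried in a loop.

```python
from collections import defaultdict


class PendingRoutes:
    """Hold routes until their next hop resolves, then install them once."""

    def __init__(self):
        self.resolved = set()             # next hops known to be resolved
        self.waiting = defaultdict(list)  # next hop -> prefixes waiting on it
        self.installed = []               # (prefix, next hop) in install order

    def add_route(self, prefix, nexthop):
        if nexthop in self.resolved:
            self.installed.append((prefix, nexthop))
        else:
            # No retry loop: park the route until the next hop is announced.
            self.waiting[nexthop].append(prefix)

    def on_nexthop_resolved(self, nexthop):
        """Observer callback: flush every route that was waiting on this next hop."""
        self.resolved.add(nexthop)
        for prefix in self.waiting.pop(nexthop, []):
            self.installed.append((prefix, nexthop))


routes = PendingRoutes()
routes.add_route("10.1.0.0/24", "10.0.0.35")
routes.add_route("10.2.0.0/24", "10.0.0.35")
print(routes.installed)  # nothing is installed while the next hop is unresolved
routes.on_nexthop_resolved("10.0.0.35")
print(routes.installed)
```

The work done per event is proportional to the number of routes waiting on that next hop, rather than to the number of pending routes multiplied by the number of retries.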

Mirror feature issues

  1. When creating the mirror session, I noticed that the VLAN ID is 1. Where does this number come from? Why is the port VLAN ID 1, and why is it used here? Do we need a default VLAN ID here?
  2. TOS is zero even when DSCP is set.

[v1.0.3] How to get port out of default VLAN after removeRouterIntfs()?

At the construction of PortsOrch (PortsOrch::PortsOrch()),
removeDefaultVlanMembers();
removeDefaultBridgePorts();
are called to get all ports out of default VLAN and default 1Q bridge.

After calling "bool IntfsOrch::removeRouterIntfs(Port &port)", the specific port goes back to the default VLAN. How can that port be taken out of the default VLAN? Calling removeDefaultVlanMembers() again might have side effects on other ports that are in the default VLAN for some reason. Thanks!
