glb-director's Introduction

GitHub Load Balancer Director

The GitHub Load Balancer (GLB) Director is a set of components that provide a scalable set of stateless Layer 4 load balancer servers capable of line rate packet processing in bare metal datacenter environments, and is used in production to serve all traffic from GitHub's datacenters.

Design

GLB Director is designed to be used in datacenter environments where multiple servers can announce the same IP address via BGP and have network routers shard traffic amongst those servers using ECMP routing. While ECMP shards connections per-flow using consistent hashing, addition or removal of nodes will generally cause some disruption to traffic as state isn't stored for each flow. A split L4/L7 design is typically used to allow the L4 servers to redistribute these flows back to a consistent server in a flow-aware manner. GLB Director implements the L4 (director) tier of a split L4/L7 load balancer design.

[Figure: L4/L7 load balancer design]

Traditional solutions such as LVS have stored flow state on each director node and then shared flow state between nodes. GLB Director instead receives these flows and uses a derivative of rendezvous hashing to hash flows to a pair of servers with a pre-determined order, and leverages the state already stored on those servers to allow flows to complete after a server begins draining.

[Figure: GLB "second chance" packet flow]
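As a rough illustration of the idea of hashing a flow to an ordered pair of servers, here is a minimal sketch of plain rendezvous hashing. It is not GLB Director's actual algorithm (the README only says it uses a derivative of rendezvous hashing); the function name `rendezvous_pair`, the choice of BLAKE2b as the hash, and the server names are all assumptions made for the example.

```python
import hashlib
from typing import Sequence, Tuple


def rendezvous_pair(flow_key: bytes, servers: Sequence[str]) -> Tuple[str, str]:
    """Return (primary, secondary) for a flow using plain rendezvous hashing.

    Every server gets a score derived from hash(flow_key, server); the two
    highest scores win. BLAKE2b is a stand-in hash chosen for this sketch.
    """
    if len(servers) < 2:
        raise ValueError("need at least two servers to pick a pair")

    def score(server: str) -> int:
        digest = hashlib.blake2b(
            flow_key + b"|" + server.encode(), digest_size=8
        ).digest()
        return int.from_bytes(digest, "big")

    ranked = sorted(servers, key=score, reverse=True)
    return ranked[0], ranked[1]


# Hypothetical pool and flow key (source IP/port and destination IP/port).
pool = ["proxy1", "proxy2", "proxy3", "proxy4"]
flow = b"192.0.2.10:37924->10.10.10.10:80"
primary, secondary = rendezvous_pair(flow, pool)
```

The useful property is that each server's score is independent of the others: removing a server that is not in the chosen pair leaves the pair unchanged, and removing the primary promotes the secondary.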

GLB Director only processes packets on ingress, and encapsulates them inside an extended Generic UDP Encapsulation packet. Egress packets from proxy layer servers are sent directly to clients using Direct Server Return.

Getting started

GLB Director has a number of components that work together with other infrastructure components to create a complete load balancer. We've created an example Vagrant setup/guide which will create a local instance of GLB with all required components. The docs directory also contains additional documentation on the design and constraints. For details about the packages provided and how to install them, see the packages and quick start guide.

Contributing

Please check out our contributing guidelines.

License

Components in this repository are licensed under BSD 3-Clause, except where their dependencies and usage require GPL v2. See the license documentation for detailed information.

Authors

GLB Director has been an ongoing project designed, authored, reviewed and supported by various members of GitHub's Production Engineering organisation, including:

glb-director's People

Contributors

awlx, bobrik, chipitsine, dependabot[bot], dverbeir, hjmcnew, jmdelafe, lmb, massar, pavantc, poupas, ravisinghsfbay, sentinel, shelson, statik, synical, theojulienne, yeled, yzguy

glb-director's Issues

make error

[root@glb-director-xdp]# make
make -C xdp-root-shim/
make[1]: Entering directory `/root/glb-director/src/glb-director-xdp/xdp-root-shim'
cc xdp-root-shim.c -o xdp-root-shim -lbpf -lsystemd
xdp-root-shim.c:36:21: fatal error: bpf/bpf.h: No such file or directory
 #include <bpf/bpf.h>
                     ^
compilation terminated.
make[1]: *** [xdp-root-shim] Error 1
make[1]: Leaving directory `/root/glb-director/src/glb-director-xdp/xdp-root-shim'
make: *** [all] Error 2

Package for Debian buster ?

Hello!

I was looking into GLB Director to reproduce the setup described on the GitHub Blog, but couldn't install it because the package repository for Debian "buster" doesn't exist (maybe that's what happened in #51?).

Will there be any package for Debian - buster?

Before I can think about contributing, I have a few questions:

  • What is the reason behind enforcing dpdk=17.11.1-6 and not allowing newer versions?
  • Any ideas about what the upgrade from "stretch" to "buster" may entail?

Kind regards,
Eclion

glb-redirect: conntrack lookups might not be needed

This is a follow up to #44 (comment)

I dove into the 4.19 sources, and found that new connections are created in tcp_conn_request:

  1. Call to inet_reqsk_alloc which creates a new struct request_sock with sk_state equal to TCP_NEW_SYN_RECV
  2. Call to inet_csk_reqsk_queue_hash_add which ends up adding the struct request_sock into tcp_hashinfo

This would mean my conjecture in the original comment is almost correct. inet_lookup_established does return these sockets where SYNACK is outstanding, but in state TCP_NEW_SYN_RECV.

I think I've found the place where the request_sock is turned into a full socket: tcp_v4_rcv.

Maybe the whole conntrack part of the code is not required anymore?

iptables

I'm having a bit of a problem with setting up the second chance flows.
I keep running into this problem.

root@0fb610b8835e:/# sudo iptables -t raw -A INPUT -p udp -m udp --dport 19523 -j CT --notrack

iptables: No chain/target/match by that name.

Could anyone help me out?

Packets sent to primary when both primary and secondary destinations are unhealthy

While doing some testing, @ravisinghsfbay and @robloxrob and I ran into a case where glb-director forwards ingress packets to a known unhealthy destination. We're interested in submitting a patch to fix this case, but first I'm hoping to get a quick confirmation that the root-cause analysis below is correct.

The test setup was:

  • Set up glb-director in front of a pool of four backend servers.
  • Send some requests through glb-director to verify that the load balancing is working.
  • Then turn off two of the backend servers so they stop passing health checks.

Expected result:

  • glb-director forwards subsequent TCP connections to one of the healthy servers.

Observed result:

  • glb-director forwards subsequent TCP connections to one of the unhealthy servers. The metadata in the GUE header indicates that the other unhealthy server is that server's secondary.

I verified that glb-healthcheck detected the failed healthchecks and generated a forwarding_table.checked.bin that properly identified the affected servers as state=1 and health=0.

From a quick read through the code, I think the problem is in the construction of the forwarding table in glb-director/cli/main.c. There's some code that checks for the case where the primary is unhealthy and the secondary is healthy. But if both the primary and the secondary are unhealthy, that entry in the forwarding table ends up with two unhealthy destinations in it. In the data plane, if an incoming packet hashes to that entry in the forwarding table, glb-director then sends it to the (unhealthy) primary, with a GUE wrapper that lists the (unhealthy) secondary.

Assuming that analysis is correct, the first idea that springs to mind is to increase the number of destination IPs stored in each routing table slot from 2 to a configurable N. That wouldn't prevent the problem altogether, but it would reduce the probability of the problem as N increased.
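To make the proposal concrete, here is a minimal sketch of how a slot holding N ranked candidates could be resolved to the first healthy one. This is a hypothetical model only: `pick_destination`, the candidate lists and the health map are invented for illustration and do not reflect the forwarding-table format or the code in glb-director/cli/main.c.

```python
from typing import Dict, List, Optional


def pick_destination(candidates: List[str], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the first healthy server in a slot's ranked candidate list.

    `candidates` models one forwarding-table slot (hypothetical layout);
    servers missing from `healthy` are treated as unhealthy.
    """
    for server in candidates:
        if healthy.get(server, False):
            return server
    return None  # every candidate in this slot is unhealthy


# Two of four backends are down, as in the test setup described above.
health = {"b1": False, "b2": False, "b3": True, "b4": True}

# With N=2 a slot can contain only unhealthy servers...
slot_n2 = ["b1", "b2"]
# ...while N=3 for the same slot still reaches a healthy one.
slot_n3 = ["b1", "b2", "b3"]
```

As the issue notes, a larger N only lowers the probability of an all-unhealthy slot; a slot whose N candidates are all down would still need some fallback.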

Metrics of glb-director-xdp

We've been using the DPDK glb-director but had issues with driver support, so we're trialling using the XDP implementation instead and first impressions are excellent.

However, the DPDK version had dpdk-procinfo and statsd exporting from the glb-director process for understanding packets and bytes received and transmitted.

It seems that we can get some semblance of XDP network throughput statistics via ethtool -S $interface, but it is again driver-specific and doesn't give XDP-only traffic metrics. The xdp project acknowledges that metrics for XDP are inconsistent at this time, and suggests that each XDP program do its own accounting; one example of this is given here.

Is there already a source of traffic data for glb-director-xdp, or would a custom solution along the lines of the above example be necessary? Is metrics support planned, or would it be considered?

Failing build as of 5.4.0-rc2

Hello, this commit in 5.4.0-rc2 renamed nf_reset to nf_reset_ct, which broke the build:

Failure looks roughly like this:

[03:15:15]	make -C /huh/build/linux-5.4.0-rc2 ARCH=x86_64  O=/huh/build/amd64 M=/huh/build/glb-director/src/glb-redirect modules
[03:15:15]	make[1]: Entering directory '/huh/build/linux-5.4.0-rc2'
[03:15:15]	make[2]: Entering directory '/huh/build/amd64'
[03:15:16]	grep: dkms.conf: No such file or directory
[03:15:16]	  CC [M]  /huh/build/glb-director/src/glb-redirect/ipt_GLBREDIRECT.o
[03:15:17]	/huh/build/glb-director/src/glb-redirect/ipt_GLBREDIRECT.c: In function 'glbredirect_send_forwarded_skb':
[03:15:17]	/huh/build/glb-director/src/glb-redirect/ipt_GLBREDIRECT.c:94:2: error: implicit declaration of function 'nf_reset'; did you mean 'nf_ct_set'? [-Werror=implicit-function-declaration]
[03:15:17]	   94 |  nf_reset(skb);
[03:15:17]	      |  ^~~~~~~~
[03:15:17]	      |  nf_ct_set
[03:15:17]	cc1: some warnings being treated as errors
[03:15:17]	make[2]: Leaving directory '/huh/build/amd64'
[03:15:17]	make[3]: *** [/huh/build/linux-5.4.0-rc2/scripts/Makefile.build:266: /huh/build/glb-director/src/glb-redirect/ipt_GLBREDIRECT.o] Error 1
[03:15:17]	make[1]: Leaving directory '/huh/build/linux-5.4.0-rc2'
[03:15:17]	make[2]: *** [/huh/build/linux-5.4.0-rc2/Makefile:1650: /huh/build/glb-director/src/glb-redirect] Error 2
[03:15:17]	make[1]: *** [Makefile:179: sub-make] Error 2
[03:15:17]	make: *** [Makefile:195: build-glb-redirect-amd64] Error 2
[03:15:19]	Failure: 2

cibuild-create-packages fails to prepare the Docker build environment with a broken packages error

script/cibuild-create-packages prepares the Docker build environment:

docker build -t glb-director-build-stretch -f script/Dockerfile.stretch script

The Docker build fails on the llvm install step:

RUN wget https://apt.llvm.org/llvm.sh && chmod +x llvm.sh && sudo ./llvm.sh 9

The output from the llvm install step is:

+ apt-get install -y clang-9 lldb-9 lld-9 clangd-9
Reading package lists...
Building dependency tree...
Reading state information...
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 clang-9 : Depends: libclang-common-9-dev (= 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106) but it is not going to be installed
 clangd-9 : Depends: libclang-common-9-dev (= 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
The command '/bin/sh -c wget https://apt.llvm.org/llvm.sh && chmod +x llvm.sh && ./llvm.sh 9' returned a non-zero code: 100

The problem is likely a more general issue installing llvm in a debian:stretch container since this greatly simplified Dockerfile reproduces the exact same error:

FROM debian:stretch
RUN apt-get update && apt-get -y install apt-transport-https build-essential software-properties-common wget
RUN wget https://apt.llvm.org/llvm.sh && chmod +x llvm.sh && ./llvm.sh 9

It seems the implied llvm-9-tools dependency may be the core of the issue:

$ apt-get -o Debug::pkgProblemResolver=yes install clang-9 lldb-9 lld-9 clangd-9
Reading package lists... Done
Building dependency tree
Reading state information... Done
Starting pkgProblemResolver with broken count: 1
Starting 2 pkgProblemResolver with broken count: 1
Investigating (0) llvm-9-tools:amd64 < none -> 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106 @un uN Ib >
Broken llvm-9-tools:amd64 Breaks on libclang-common-9-dev:amd64 < none -> 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106 @un uN > (< 1:9.0.1~+rc2)
  Considering libclang-common-9-dev:amd64 0 as a solution to llvm-9-tools:amd64 2
  Added libclang-common-9-dev:amd64 to the remove list
  Fixing llvm-9-tools:amd64 via keep of libclang-common-9-dev:amd64
Investigating (1) clangd-9:amd64 < none -> 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106 @un puN Ib >
Broken clangd-9:amd64 Depends on libclang-common-9-dev:amd64 < none | 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106 @un uH > (= 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106)
  Considering libclang-common-9-dev:amd64 0 as a solution to clangd-9:amd64 9999
  Re-Instated libclang-common-9-dev:amd64
Investigating (1) llvm-9-tools:amd64 < none -> 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106 @un uN Ib >
Broken llvm-9-tools:amd64 Breaks on libclang-common-9-dev:amd64 < none -> 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106 @un uN > (< 1:9.0.1~+rc2)
  Considering libclang-common-9-dev:amd64 0 as a solution to llvm-9-tools:amd64 2
  Added libclang-common-9-dev:amd64 to the remove list
  Fixing llvm-9-tools:amd64 via keep of libclang-common-9-dev:amd64
Investigating (2) clangd-9:amd64 < none -> 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106 @un puN Ib >
Broken clangd-9:amd64 Depends on libclang-common-9-dev:amd64 < none | 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106 @un uH > (= 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106)
  Considering libclang-common-9-dev:amd64 0 as a solution to clangd-9:amd64 9999
Investigating (2) clang-9:amd64 < none -> 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106 @un puN Ib >
Broken clang-9:amd64 Depends on libclang-common-9-dev:amd64 < none | 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106 @un uH > (= 1:9~+20200403104614+c1a0a213378-1~exp1~20200403085153.106)
  Considering libclang-common-9-dev:amd64 0 as a solution to clang-9:amd64 9999
Done

However, I don't know LLVM, or glb-director's usage of LLVM, well enough to know the appropriate way forward to resolve the build failure.

Question about filling state

Hi there,

First off, many thanks for this project and making it public. It's quite interesting and valuable to outsiders.

I was wondering when/how it is decided a proxy node is no longer in the filling state and is transitioned to the active state. With a proxy that is draining it is safe to remove once it is no longer handling any connections, which makes sense. But for a proxy that is filling it may be passing some traffic to its secondary for an indeterminate period of time. Is there an easy way to tell that a node is no longer handling any traffic for the secondary or do you just make an assumption it is not after not having seen such traffic for some period of time?

building dep with dpdk versions

I get this after I build the .deb files (on ubuntu 18.04 machines)

root@t7810-mate:~/github/glb-director/src/glb-director# gdebi *.deb
Reading package lists... Done
Building dependency tree
Reading state information... Done
Reading state information... Done
This package is uninstallable
Dependency is not satisfiable: dpdk (= 17.11.1-6)

leisner@t7810-mate:~/github/glb-director/src/glb-director$ pkg-config --modversion libdpdk
17.11.5

I assumed that since 17.11.5 is later than 17.11.1 it would work, but it doesn't.
On a hunch, I changed 17.11.1-6 to 17.11.1, but it made no difference.

Dependency is not satisfiable: dpdk (= 17.11.1)

Tag releases please

Could you please tag the commit used to build each package published to PackageCloud?

It's tricky to uncover which version of the source corresponds to, e.g., v1.0.5, v1.0.7, etc.

I wouldn't expect everything to be tagged retrospectively but at least for future releases, and ideally the most recently published package.

Vagrant setup issue

I got errors running the vagrant up command. The cause is that the linux-headers-4.9.0-9-amd64 package is deprecated. Here are the error logs from my local environment:

...
==> router: Checking for guest additions in VM...
    router: No guest additions were detected on the base box for this VM! Guest
    router: additions are required for forwarded ports, shared folders, host only
    router: networking, and more. If SSH fails on this machine, please install
    router: the guest additions and repackage the box to continue.
    router:
    router: This is not an error message; everything may continue to work properly,
    router: in which case you may ignore this message.
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

deb [check-valid-until=no] http://snapshot.debian.org/archive/debian/20190801T025637Z/ stretch main
apt-get install -y linux-headers-`uname -r` dkms

Stdout from the command:

Reading package lists...
Building dependency tree...
Reading state information...


Stderr from the command:

E: Unable to locate package linux-headers-4.9.0-9-amd64
E: Couldn't find any package by glob 'linux-headers-4.9.0-9-amd64'
E: Couldn't find any package by regex 'linux-headers-4.9.0-9-amd64'

Workaround:
The error is actually caused by the vagrant-vbguest plugin, since it also installs that linux headers package.

What I did was uninstall that plugin locally:

vagrant plugin uninstall vagrant-vbguest

Then add the source list in the box (following this workaround).

racecondition() in posix platform

Team,

File:
glb-director/src/glb-director/glb_director_config.c

	if (access(config_file, F_OK) == -1) {
		glb_log_error_and_exit(
		    "Configuration filename must be provided: "
		    "--config-file <json-config-file>");
	}

	if (access(forwarding_table, F_OK) == -1) {
		glb_log_error_and_exit(
		    "Forwarding table filename must be provided: "
		    "--forwarding-table <binary-file>");
	}

I believe this indicates a security flaw. If an attacker can change anything along the path between the call to access() and the point where the files are actually used, they may be able to exploit a time-of-check, time-of-use (TOCTOU) race condition. Could the team please take a look and validate?

Reference: https://linux.die.net/man/2/access
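The usual way to avoid this kind of check-then-use gap is to skip the separate existence check and simply open the file, handling the failure at that point. The sketch below shows the pattern in Python rather than C, purely as an illustration; `load_file` and `ConfigError` are hypothetical names and not part of glb-director.

```python
class ConfigError(Exception):
    """Raised when a required input file cannot be opened (hypothetical)."""


def load_file(path: str) -> bytes:
    """Open and read a file in one step instead of checking first.

    There is no separate existence check: the open() call itself is the
    check, so there is no window between checking and using the file.
    """
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except OSError as exc:
        raise ConfigError(f"cannot read {path}: {exc}") from exc
```

A caller would then report the error once, for example by catching `ConfigError` around `load_file(config_path)` at startup.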

Can Ethernet support DPDK

If my host NIC does not support DPDK, can I still use GLB Director?
My NIC is not an Intel card; it is a
Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe

proposal: support path MTU discovery in glb-redirect

The current glb-redirect sinks all ICMP packets at the first hop, even if there are other hops specified. This breaks path MTU messages destined for the second hop.

I'd like to contribute getting the 4-tuple from the ICMP Packet Too Big message, and pushing that through is_valid_locally.
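For illustration, here is a minimal sketch of extracting that 4-tuple from an ICMPv4 "fragmentation needed" message (type 3, code 4), which is the IPv4 counterpart of Packet Too Big. It only models the parsing step in Python; the function name `inner_four_tuple` is an assumption, and how the result would be passed to is_valid_locally inside the kernel module is not shown.

```python
import socket
import struct
from typing import Optional, Tuple

ICMP_DEST_UNREACH = 3  # ICMPv4 type: destination unreachable
ICMP_FRAG_NEEDED = 4   # ICMPv4 code: fragmentation needed and DF set


def inner_four_tuple(icmp: bytes) -> Optional[Tuple[str, int, str, int]]:
    """Return (src_ip, src_port, dst_ip, dst_port) of the quoted packet.

    `icmp` starts at the ICMP header. The quoted packet is the one the
    server originally sent, so its source is the local (server) side.
    """
    if len(icmp) < 8 or (icmp[0], icmp[1]) != (ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED):
        return None
    inner = icmp[8:]  # quoted IPv4 header plus leading bytes of its payload
    if len(inner) < 20 or inner[0] >> 4 != 4:
        return None
    header_len = (inner[0] & 0x0F) * 4
    if header_len < 20 or len(inner) < header_len + 4:
        return None
    if inner[9] not in (socket.IPPROTO_TCP, socket.IPPROTO_UDP):
        return None
    src_ip = socket.inet_ntoa(inner[12:16])
    dst_ip = socket.inet_ntoa(inner[16:20])
    # TCP and UDP both start with source port then destination port.
    src_port, dst_port = struct.unpack("!HH", inner[header_len:header_len + 4])
    return src_ip, src_port, dst_ip, dst_port
```

With the tuple in hand, the first hop could check whether it owns the matching connection and otherwise forward the ICMP message to the next hop, which is the behaviour the proposal describes.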

ttyname failed: Inappropriate ioctl for device

I have this error (script/cibuild):

The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

/sbin/ifdown eth1 2> /dev/null

Stdout from the command:


Stderr from the command:

mesg: ttyname failed: Inappropriate ioctl for device

Ubuntu 16.04
How do I fix it?
Log:

%%%END FOLD%%%
%%%FOLD {Bringing up and syncing vagrant test environment}%%%
Bringing machine 'router' up with 'virtualbox' provider...
Bringing machine 'user' up with 'virtualbox' provider...
Bringing machine 'director-test' up with 'virtualbox' provider...
Bringing machine 'director1' up with 'virtualbox' provider...
Bringing machine 'director2' up with 'virtualbox' provider...
Bringing machine 'proxy1' up with 'virtualbox' provider...
Bringing machine 'proxy2' up with 'virtualbox' provider...
==> router: Checking if box 'debian/stretch64' is up to date...
==> router: VirtualBox VM is already running.
==> user: Importing base box 'debian/stretch64'...
==> user: Matching MAC address for NAT networking...
==> user: Checking if box 'debian/stretch64' is up to date...
==> user: Setting the name of the VM: glb-director_user_1534927675359_28
==> user: Fixed port collision for 22 => 2222. Now on port 2200.
==> user: Clearing any previously set network interfaces...
==> user: Preparing network interfaces based on configuration...
    user: Adapter 1: nat
    user: Adapter 2: intnet
==> user: Forwarding ports...
    user: 22 (guest) => 2200 (host) (adapter 1)
==> user: Running 'pre-boot' VM customizations...
==> user: Booting VM...
==> user: Waiting for machine to boot. This may take a few minutes...
    user: SSH address: 127.0.0.1:2200
    user: SSH username: vagrant
    user: SSH auth method: private key
    user:
    user: Vagrant insecure key detected. Vagrant will automatically replace
    user: this with a newly generated keypair for better security.
    user:
    user: Inserting generated public key within guest...
    user: Removing insecure key from the guest if it's present...
    user: Key inserted! Disconnecting and reconnecting using new SSH key...
==> user: Machine booted and ready!
==> user: Checking for guest additions in VM...
    user: No guest additions were detected on the base box for this VM! Guest
    user: additions are required for forwarded ports, shared folders, host only
    user: networking, and more. If SSH fails on this machine, please install
    user: the guest additions and repackage the box to continue.
    user:
    user: This is not an error message; everything may continue to work properly,
    user: in which case you may ignore this message.
==> user: Setting hostname...
==> user: Configuring and enabling network interfaces...

The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

/sbin/ifdown eth1 2> /dev/null

Stdout from the command:



Stderr from the command:

mesg: ttyname failed: Inappropriate ioctl for device

vagrant status:

Current machine states:

router                    running (virtualbox)
user                      not created (virtualbox)
director-test             not created (virtualbox)
director1                 not created (virtualbox)
director2                 not created (virtualbox)
proxy1                    not created (virtualbox)
proxy2                    not created (virtualbox)

cli programs need -ljansson at end instead of at the beginning

Not sure which system this was tested on, but libraries generally have to come after the sources that use them on the link line.
I'm using Ubuntu 18.04 with gcc 7.4.

-	make -C src/glb-healthcheck clean
-	make -C src/glb-director clean
-	make -C src/glb-director/cli clean
+	${MAKE} -C src/glb-redirect clean
+	${MAKE} -C src/glb-healthcheck clean
+	${MAKE} -C src/glb-director clean
+	${MAKE} -C src/glb-director/cli clean

diff --git a/src/glb-director/cli/Makefile b/src/glb-director/cli/Makefile
index d7433a7..ecfeeef 100644
--- a/src/glb-director/cli/Makefile
+++ b/src/glb-director/cli/Makefile
@@ -73,44 +73,44 @@ LDFLAGS += -ljansson

 glb-director-cli: main.c
 	gcc \
-		$(CFLAGS) $(LDFLAGS) \
+		$(CFLAGS)  \
 		-I`pwd`/.. \
 		main.c \
 		../siphash24.c \
-		-o glb-director-cli
+		-o glb-director-cli ${LDFLAGS}

 glb-config-check:
 	gcc \
-		$(CFLAGS) $(LDFLAGS) \
+		$(CFLAGS)  \
 		$(CHECK_SRCS) \
 		-o glb-config-check \
 		-I`pwd`/.. \
 		-I/usr/include/dpdk \
 		-I/usr/include/x86_64-linux-gnu \
-		-ldpdk -lpcap \
+		-ldpdk -lpcap  ${LDFLAGS} \
 		-m64 -mssse3

 glb-director-pcap:
 	gcc \
-		$(CFLAGS) $(LDFLAGS) \
+		$(CFLAGS)  \
 		$(PCAP_SRCS) \
 		-o glb-director-pcap \
 		-I`pwd`/.. \
 		-I/usr/include/dpdk \
 		-I/usr/include/x86_64-linux-gnu \
-		-lpcap \
-		-DPCAP_MODE \
+		-lpcap ${LDFLAGS}
+		-DPCAP_MODE  ${CFLAGS} \
 		-m64 -mssse3

 glb-director-stub-server:
 	gcc \
-		$(CFLAGS) $(LDFLAGS) \
+		$(CFLAGS)  \
 		$(STUB_SRCS) \
 		-o glb-director-stub-server \
 		-I`pwd`/.. \
 		-I/usr/include/dpdk \
 		-I/usr/include/x86_64-linux-gnu \
-		-DPCAP_MODE \
+		-DPCAP_MODE ${LDFLAGS} \
 		-m64 -mssse3

 clean:

vagrant setup not work as expected...

I've set everything up following https://github.com/github/glb-director/blob/master/docs/setup/example-setup-vagrant.md.

I've shut down director2, which uses XDP, so that only the DPDK director is in use.

When I run curl from the user machine, the request times out.

When I run curl from the router machine it works, but the source IP seen by the proxy is 192.168.50.1 instead of 192.168.50.2. Why?

When I try from the user machine, the same IP (192.168.50.1) is used as the source instead of a 192.168.40.x address, so the proxy cannot reply to the user machine because that machine is not on the 192.168.50 network.

What is missing? Why does glb-director not encapsulate the source IP correctly?

This IP is not configured anywhere (and there is no default route anywhere), but it exists on the 192.168.50 network because it is the host's interface IP on the virtual network.

vagrant@proxy1:~$ tshark -ni any port not 22 and not arp and port not 547 and not stp
Capturing on 'any'
    1 0.000000000 192.168.50.1 → 10.10.10.10  TCP 116 37924 → 80 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1 TSval=1513029862 TSecr=0 WS=64
    2 0.000000000 192.168.50.1 → 10.10.10.10  TCP 76 [TCP Out-Of-Order] 37924 → 80 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1 TSval=1513029862 TSecr=0 WS=64
    3 0.000051246  10.10.10.10 → 192.168.50.1 TCP 76 80 → 37924 [SYN, ACK] Seq=0 Ack=1 Win=65160 Len=0 MSS=1460 SACK_PERM=1 TSval=1575348520 TSecr=1513029862 WS=64
    4 0.000255767 192.168.50.1 → 10.10.10.10  TCP 108 37924 → 80 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=1513029862 TSecr=1575348520
    5 0.000255767 192.168.50.1 → 10.10.10.10  TCP 68 [TCP Dup ACK 4#1] 37924 → 80 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=1513029862 TSecr=1575348520
    6 0.000373384 192.168.50.1 → 10.10.10.10  HTTP 183 GET / HTTP/1.1
    7 0.000373384 192.168.50.1 → 10.10.10.10  TCP 143 [TCP Retransmission] 37924 → 80 [PSH, ACK] Seq=1 Ack=1 Win=64256 Len=75 TSval=1513029862 TSecr=1575348520
    8 0.000391393  10.10.10.10 → 192.168.50.1 TCP 68 80 → 37924 [ACK] Seq=1 Ack=76 Win=65088 Len=0 TSval=1575348520 TSecr=1513029862
    9 0.000574828  10.10.10.10 → 192.168.50.1 HTTP 336 HTTP/1.1 200 OK  (text/html)
   10 0.000756007 192.168.50.1 → 10.10.10.10  TCP 108 37924 → 80 [ACK] Seq=76 Ack=269 Win=64128 Len=0 TSval=1513029863 TSecr=1575348521
   11 0.000756007 192.168.50.1 → 10.10.10.10  TCP 68 [TCP Dup ACK 10#1] 37924 → 80 [ACK] Seq=76 Ack=269 Win=64128 Len=0 TSval=1513029863 TSecr=1575348521
   12 0.001064636 192.168.50.1 → 10.10.10.10  TCP 108 37924 → 80 [FIN, ACK] Seq=76 Ack=269 Win=64128 Len=0 TSval=1513029863 TSecr=1575348521
   13 0.001064636 192.168.50.1 → 10.10.10.10  TCP 68 [TCP Out-Of-Order] 37924 → 80 [FIN, ACK] Seq=76 Ack=269 Win=64128 Len=0 TSval=1513029863 TSecr=1575348521
   14 0.001100476  10.10.10.10 → 192.168.50.1 TCP 68 80 → 37924 [FIN, ACK] Seq=269 Ack=77 Win=65088 Len=0 TSval=1575348521 TSecr=1513029863
   15 0.001311342 192.168.50.1 → 10.10.10.10  TCP 108 37924 → 80 [ACK] Seq=77 Ack=270 Win=64128 Len=0 TSval=1513029863 TSecr=1575348521
   16 0.001311342 192.168.50.1 → 10.10.10.10  TCP 68 [TCP Dup ACK 15#1] 37924 → 80 [ACK] Seq=77 Ack=270 Win=64128 Len=0 TSval=1513029863 TSecr=1575348521

Denial of Service (DoS)

Denial of Service (DoS)
Vulnerable module: scapy
Introduced through: [email protected]
Detailed paths
Introduced through: github/glb-director@github/glb-director#5e1edd0a0fe057320fc30f6ad850c9878c607882 › [email protected]
Remediation: Upgrade to [email protected].
Overview
scapy is a Python-based interactive packet manipulation program and library.

Affected versions of this package are vulnerable to Denial of Service (DoS) due to a lack of input validation when reading the length field in the RADIUS packet’s Attribute Value Pairs (AVP). When Scapy parses a UDP Radius packet that has an AVP with a length byte equal to zero, the getfield function doesn’t shorten the remain value in the while loop. This causes the loop to continue forever, causing Scapy to crash.
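The failure mode described here is a general hazard of type-length-value parsing loops: if a length of zero is accepted, the loop never advances. The sketch below is a generic, simplified parser written for illustration, not Scapy's code; `parse_avps` is a hypothetical name, and it assumes the RADIUS convention that the length byte covers the two header bytes as well as the value.

```python
from typing import List, Tuple


def parse_avps(data: bytes) -> List[Tuple[int, bytes]]:
    """Parse (type, value) pairs from a RADIUS-style attribute list.

    Each attribute is: 1 byte type, 1 byte length (including these two
    header bytes), then the value. Invalid lengths are rejected so the
    loop always makes progress.
    """
    avps = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 2:
            raise ValueError("truncated attribute header")
        avp_type, length = data[offset], data[offset + 1]
        # A length below 2 (including 0) would leave `offset` unchanged
        # or barely advanced, so treat it as malformed input.
        if length < 2 or offset + length > len(data):
            raise ValueError(f"invalid attribute length {length} at offset {offset}")
        avps.append((avp_type, data[offset + 2:offset + length]))
        offset += length
    return avps
```

Validating the length before using it to advance the offset is what guarantees termination on malformed input.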

Destination Mac address mapping according to backend IP address

Hi,
I understand that GLB Director bypasses ARP and uses the gateway's MAC as the destination MAC. However, this may cause CPU load on the network device, since all packets addressed to the device itself are processed by its CPU.
Because our backend configuration is fully automated via CI/CD, the MAC address of each backend could also be included in the configuration file.
Do you see any obstacles, or have any thoughts on the problem?

We would like to implement this and contribute it to the project if it sounds good.

Unable to install native extensions for FPM

I was taking a look at this today and started to get the vagrant environment up. After running script/cibuild the Docker build failed due to the following error

Building native extensions.  This could take a while...
ERROR:  Error installing fpm:
	ERROR: Failed to build gem native extension.

    current directory: /var/lib/gems/2.3.0/gems/childprocess-1.0.0/ext
/usr/bin/ruby2.3 mkrf_conf.rb

current directory: /var/lib/gems/2.3.0/gems/childprocess-1.0.0/ext
/usr/bin/ruby2.3 -rubygems /usr/share/rubygems-integration/all/gems/rake-10.5.0/bin/rake RUBYARCHDIR=/var/lib/gems/2.3.0/extensions/x86_64-linux/2.3.0/childprocess-1.0.0 RUBYLIBDIR=/var/lib/gems/2.3.0/extensions/x86_64-linux/2.3.0/childprocess-1.0.0
/usr/bin/ruby2.3: No such file or directory -- /usr/share/rubygems-integration/all/gems/rake-10.5.0/bin/rake (LoadError)

rake failed, exit code 1

Gem files will remain installed in /var/lib/gems/2.3.0/gems/childprocess-1.0.0 for inspection.
Results logged to /var/lib/gems/2.3.0/extensions/x86_64-linux/2.3.0/childprocess-1.0.0/gem_make.out

It looks like rake is a dependency of FPM and should be installed so that the native extensions can be built.

duplicate strlcpy definitions on ubuntu 18.04.2

When building on Ubuntu (after I tracked down all the dependencies; a configure stage would be nice), I get the following error. I'm using kernel 4.20.11-042011-generic:
gcc -Wp,-MD,./.main.o.d.tmp -m64 -pthread -fPIC -march=corei7 -DRTE_MACHINE_CPUFLAG_SSE -DRTE_MACHINE_CPUFLAG_SSE2 -DRTE_MACHINE_CPUFLAG_SSE3 -DRTE_MACHINE_CPUFLAG_SSSE3 -DRTE_MACHINE_CPUFLAG_SSE4_1 -DRTE_MACHINE_CPUFLAG_SSE4_2 -I/home/leisner/github/glb-director/src/glb-director/build/include -I/usr/share/dpdk/x86_64-default-linuxapp-gcc/include -include /usr/share/dpdk/x86_64-default-linuxapp-gcc/include/rte_config.h -O3 -g -W -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-align -Wnested-externs -Wcast-qual -Wformat-nonliteral -Wformat-security -Wundef -Wwrite-strings -Wimplicit-fallthrough=2 -Wno-format-truncation -Werror -pie -fPIE -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=1 -fstack-protector-strong -DSTATSD -o main.o -c /home/leisner/github/glb-director/src/glb-director/main.c
In file included from /home/leisner/github/glb-director/src/glb-director/main.c:79:0:
/usr/share/dpdk/x86_64-default-linuxapp-gcc/include/rte_string_fns.h:103:33: error: redefinition of ‘rte_strlcpy’
#define strlcpy(dst, src, size) rte_strlcpy(dst, src, size)
^
/home/leisner/github/glb-director/src/glb-director/strlcpy.h:25:1: note: in expansion of macro ‘strlcpy’
strlcpy(char * __restrict dst, const char * __restrict src, size_t siz)
^~~~~~~
/usr/share/dpdk/x86_64-default-linuxapp-gcc/include/rte_string_fns.h:85:1: note: previous definition of ‘rte_strlcpy’ was here
rte_strlcpy(char *dst, const char *src, size_t size)
^~~~~~~~~~~
/usr/share/dpdk/mk/internal/rte.compile-pre.mk:138: recipe for target 'main.o' failed
make[2]: *** [main.o] Error 1
/usr/share/dpdk/mk/rte.extapp.mk:42: recipe for target 'all' failed
make[1]: *** [all] Error 2
make[1]: Leaving directory '/home/leisner/github/glb-director/src/glb-director'
Makefile:2: recipe for target 'mkdeb' failed
make: *** [mkdeb] Error 2

I modified rte_string_fns.h (for now) to get past this problem.

vagrant setup fails with libvirt provider

I'm on Ubuntu 20.04 with Vagrant 2.2.6 and libvirt.

I tried to use Vagrant to set up the test environment described in https://github.com/github/glb-director/blob/master/docs/setup/example-setup-vagrant.md.

My first problem was with /vagrant, which is configured to use NFS: the server could not find the VM's IP address to configure the NFS exports. I added the following line to override the default NFS configuration (after the src/glb-wireshark-dissector synced folder):

config.vm.synced_folder ".", "/vagrant", type: 'rsync'

Then the installation of dpdk-rte-kni-dkms and dpdk-igb-uio-dkms failed because the packages are not authenticated, so I added "--allow-unauthenticated" to the apt install command:

sudo apt install -y --allow-unauthenticated dpdk-rte-kni-dkms dpdk-igb-uio-dkms

But afterwards the rte_kni and igb_uio modules were still not in the kernel module directory.

Then I found that memory and CPU are sized only for the VirtualBox provider, which explained the many fork failures I had seen.

I added the right sizes for the libvirt provider as well:

v.vm.provider "libvirt" do |vb|
  vb.cpus = 3
  vb.memory = "2048"
end

After that, the kernel modules loaded correctly.

Next, the interface names are no longer ethX in the latest Debian image; they are now ensX.

I changed eth1 to ens6 in the Vagrantfile and in ./script/helpers/configure-vagrant-director.sh.

Now I'm blocked on director2, where XDP fails with this error message:

Mar 26 00:45:51 director2 xdp-root-shim[11671]: libbpf: load bpf program failed: Argument list too long
Mar 26 00:45:51 director2 xdp-root-shim[11671]: libbpf: failed to load program 'xdp-root'
Mar 26 00:45:51 director2 xdp-root-shim[11671]: libbpf: failed to load object '/usr/share/xdp-root-shim/tailcall.o'
Mar 26 00:45:51 director2 xdp-root-shim[11671]: Could not load '/usr/share/xdp-root-shim/tailcall.o'
Mar 26 00:45:51 director2 systemd[1]: [email protected]: Main process exited, code=exited, status=1/FAILURE
Mar 26 00:45:51 director2 systemd[1]: Failed to start XDP Root Shim provides a root array to bind and replace XDP programs on a given interface.
Mar 26 00:45:51 director2 systemd[1]: [email protected]: Unit entered failed state.
Mar 26 00:45:51 director2 systemd[1]: [email protected]: Failed with result 'exit-code'.
Mar 26 00:45:51 director2 systemd[1]: [email protected]: Service hold-off time over, scheduling restart.
Mar 26 00:45:51 director2 systemd[1]: Stopped XDP Root Shim provides a root array to bind and replace XDP programs on a given interface.
Mar 26 00:45:51 director2 systemd[1]: [email protected]: Start request repeated too quickly.
Mar 26 00:45:51 director2 systemd[1]: Failed to start XDP Root Shim provides a root array to bind and replace XDP programs on a given interface.
Mar 26 00:45:51 director2 systemd[1]: [email protected]: Unit entered failed state.
Mar 26 00:45:51 director2 systemd[1]: [email protected]: Failed with result 'exit-code'.

glb-director failing host + ecmp/ibgp

Trying to learn as much as possible before trying it out.

How does glb-director handle announcements when one of its hosts is down? Is there any automation with BIRD or other BGP software? And how is ECMP set up?

Thanks!

conntrack lookup removal in ipt_GLBREDIRECT breaks with network namespaces

The change to ipt_GLBREDIRECT implemented in PR #67 and discussed in issue #50 breaks deployments where the listening socket is in a different network namespace to where the -j GLBREDIRECT iptables rule is installed.

The observed behaviour is that GUE-encapsulated TCP SYN packets are accepted but all subsequent GUE packets for the same TCP session are then forwarded to the next-hop specified in the GUE private data, instead of being accepted locally.

Taking current master (commit 5387908) and reverting just the PR #67 merge commit 5e1edd0, i.e. git revert -m1 5e1edd0 corrects the behaviour. The behaviour is also mitigated by configuring the GLB with only a single backend since there is no next-hop to forward to but this is not very useful in practice.

The assumption is that the inet_lookup_established call is only considering ESTABLISHED sockets in the host network namespace and the now deleted conntrack lookup code does not exist to discover the conntrack entries related to having directed the connection to another network namespace.

One example where this occurs is on a Kubernetes node with the ip fou tunnel and GLBREDIRECT iptables rule configured on the host network namespace, while an nginx-ingress controller Pod listens on TCP sockets 80 and 443 inside the Pod's network namespace and traffic is routed from the host to the Pod via DNAT iptables rules added by the Kubernetes CNI. I expect the same behaviour can be reproduced without Kubernetes, such as with a Docker container's network namespace, or even just with ip netns add, ip netns exec and appropriate NAT rules.

The problem was experienced on Ubuntu 18.04.5 with kernel 5.4.0-42-generic.

I have not confirmed but I suspect that configuring the fou tunnel and the GLBREDIRECT iptables rule inside the Pod network namespace would also resolve the fault but this is less maintainable in a Kubernetes ingress controller context.

Possible options to fix ipt_GLBREDIRECT:

  • Just revert PR #67
  • Revert PR #67 and make it either a conditional compilation option, or enabled at module load with a module parameter, or as an additional iptables argument for -j GLBREDIRECT.
  • Introduce a module/iptable parameter to specify the network namespace to use for inet_lookup_established calls (not sure if feasible, or even friendly to use).
  • Other??

question about proxy failure

Glad to read your blog post about the GitHub Load Balancer. The post said GLB will not disrupt any connection under any condition. In my opinion, adding and removing a proxy is fine, because existing connections will be handled by the old primary (the new secondary). But what if the primary fails? I don't see how the primary syncs connection state information with the secondary. Will the connection be terminated on the secondary?
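
For context, the primary/secondary selection this question refers to can be sketched with plain rendezvous hashing. This is not the GLB implementation: the project describes its scheme as a derivative of rendezvous hashing, and the function names and the use of SHA-256 here are assumptions chosen to keep the example self-contained. The sketch only illustrates why adding or removing other proxies leaves a flow's pair intact; it does not model state sharing between proxies and does not settle what happens when the primary fails outright.

```python
import hashlib
from typing import List, Tuple


def _score(flow_key: bytes, server: str) -> int:
    """Score one (flow, server) combination; the highest score wins."""
    digest = hashlib.sha256(flow_key + b"|" + server.encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_pair(flow_key: bytes, servers: List[str]) -> Tuple[str, str]:
    """Return (primary, secondary) for a flow using rendezvous hashing."""
    if len(servers) < 2:
        raise ValueError("need at least two servers to form a pair")
    ranked = sorted(servers, key=lambda s: _score(flow_key, s), reverse=True)
    return ranked[0], ranked[1]
```

Because every server is scored independently, removing a server that is not in a flow's pair leaves that pair unchanged, and removing the primary promotes the former secondary to first place.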

ignoring implicit fallthrough on 4.20.11/ubuntu 18.04.2

Builds generate a warning that aborts the build.

There's a -Wimplicit-fallthrough flag in kernel builds.

This gets around the problem:

diff --git a/src/glb-director/shared_opt.c b/src/glb-director/shared_opt.c
index 46e9307..0c93355 100644
--- a/src/glb-director/shared_opt.c
+++ b/src/glb-director/shared_opt.c
@@ -48,6 +48,8 @@ void get_options(char *config_file, char *forwarding_table, int argc,
{"debug", no_argument, NULL, 'v'},
{NULL, 0, NULL, 0}};

+#pragma GCC diagnostic ignored "-Wimplicit-fallthrough"
+
while ((opt = getopt_long(argc, argv, ":c:t:v", long_options, NULL)) !=
-1)
switch (opt) {
diff --git a/src/glb-director/siphash24.c b/src/glb-director/siphash24.c
index ebe785c..e3cda35 100644
--- a/src/glb-director/siphash24.c
+++ b/src/glb-director/siphash24.c
@@ -110,7 +110,7 @@ int siphash(uint8_t *out, const uint8_t *in, uint64_t inlen, const uint8_t *k)

            v0 ^= m;
    }

+#pragma GCC diagnostic ignored "-Wimplicit-fallthrough"
switch (left) {
case 7:
b |= ((uint64_t)in[6]) << 48;

Denial of Service (DoS) # 2

Denial of Service (DoS)
Vulnerable module: scapy
Introduced through: [email protected]
Detailed paths
Introduced through: github/glb-director@github/glb-director#5e1edd0a0fe057320fc30f6ad850c9878c607882 › [email protected]
Remediation: Upgrade to [email protected].
Overview
scapy is a Python-based interactive packet manipulation program and library.

Affected versions of this package are vulnerable to Denial of Service (DoS) due to a lack of input validation when reading the length field in the RADIUS packet’s Attribute Value Pairs (AVP). When Scapy parses a UDP Radius packet that has an AVP with a length byte equal to zero, the getfield function doesn’t shorten the remain value in the while loop.

6to4 traffic on sit tunnel is dropped as spoofed

We're seeing the following in dmesg:

[Thu Apr 11 13:22:44 2019] sit: Src spoofed 108.X.X.X/2002::Z -> 108.Y.Y.Y/2606::Q

As far as we can tell, this is because the encapsulated source IP uses the 6to4 prefix; presumably this is a legitimate client. We traced this to https://elixir.bootlin.com/linux/v4.19.6/source/net/ipv6/sit.c#L622, which means that the packet is dropped after this message is logged.
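
The check we believe is being hit can be sketched as follows. This is our reading of the kernel logic with the default 2002::/16 prefix, not kernel code, and the function name `is_spoofed_6to4` is made up for this example. A 6to4 address embeds an IPv4 address in the 32 bits after the 2002 prefix; the packet counts as spoofed when that embedded address differs from the outer IPv4 source.

```python
import ipaddress


def is_spoofed_6to4(outer_src: str, inner_src: str) -> bool:
    """Return True if inner_src is a 6to4 address whose embedded IPv4
    address differs from the outer IPv4 source address."""
    embedded = ipaddress.IPv6Address(inner_src).sixtofour
    if embedded is None:
        return False  # inner source is not in 2002::/16, nothing to compare
    return embedded != ipaddress.IPv4Address(outer_src)
```

With an encapsulating tier in front, the outer source is no longer the client's own IPv4 address, so a client with a genuine 6to4 address would fail this comparison.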

The configuration for our tunnel:

11: sit1@ethX: <NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/sit 108.Y.Y.Y brd 0.0.0.0

It seems like it might be possible to work around this using ip tunnel 6rd dev sit1 6rd-prefix fc00::/32 (see https://elixir.bootlin.com/linux/v4.19.6/source/net/ipv6/sit.c#L771), but that seems hacky.

cc @arthurfabre who did the debugging on this.

Have you encountered this issue as well? Is there a better way to fix this / are we doing something wrong?

Using the glb Wireshark dissector doesn't work out of the box

Hi,

I did the naive thing and copied the contents of the glb dissector directory to ~/.config/wireshark/plugins/glb-..., but Wireshark then complains that it can't find glb_fou.lua. It seems like the dissector expects that file to be in ~?

I've fixed that by just moving gue_fou.lua to init.lua, but that seems like a hack.

glb-director fails to build with kernel linux-5.9.1

Only linux-5.9.1 tested, but probably any linux-5.9.x fails. Kernel linux-5.8.1 works fine.

Ninja build

Fails with:

.../workspace/dpdk-stable-19.11.5/kernel/linux/kni/kni_dev.h: In function ‘iova_to_phys’:
.../workspace/dpdk-stable-19.11.5/kernel/linux/kni/kni_dev.h:104:30: error: passing argument 1 of ‘get_user_pages_remote’ from incompatible pointer type [-Werror=incompatible-pointer-types]
  ret = get_user_pages_remote(tsk, tsk->mm, iova, 1,
                              ^~~

Make build

Fails with:

.../workspace/dpdk-stable-19.11.5/kernel/linux/kni/kni_dev.h: In function ‘iova_to_phys’:
.../workspace/dpdk-stable-19.11.5/kernel/linux/kni/kni_dev.h:104:30: error: passing argument 1 of ‘get_user_pages_remote’ from incompatible pointer type [-Werror=incompatible-pointer-types]
  ret = get_user_pages_remote(tsk, tsk->mm, iova, 1,
                              ^~~
In file included from /home/uablrek/tmp/linux/linux-5.9.1/include/linux/bvec.h:13:0,
                 from /home/uablrek/tmp/linux/linux-5.9.1/include/linux/skbuff.h:17,
                 from /home/uablrek/tmp/linux/linux-5.9.1/include/linux/if_ether.h:19,
                 from /home/uablrek/tmp/linux/linux-5.9.1/include/uapi/linux/ethtool.h:19,
                 from /home/uablrek/tmp/linux/linux-5.9.1/include/linux/ethtool.h:18,
                 from /home/uablrek/tmp/linux/linux-5.9.1/include/linux/netdevice.h:37,
                 from /home/uablrek/tmp/xcluster/workspace/dpdk-stable-19.11.5/x86_64-native-linuxapp-gcc/build/kernel/linux/kni/kni_net.c:14:
/home/uablrek/tmp/linux/linux-5.9.1/include/linux/mm.h:1714:6: note: expected ‘struct mm_struct *’ but argument is of type ‘struct task_struct *’
 long get_user_pages_remote(struct mm_struct *mm,
      ^~~~~~~~~~~~~~~~~~~~~

(more errors follow)

glb-director-pcap needs -lrte-acl to link

Not sure how it links without it.

Without -lrte-acl I get:
gcc
-Wall -O3 -g -Werror -pie -fPIE -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-all -DCLI_MODE
pcap_mode.c ../glb_fwd_config.c ../glb_director_config.c ../glb_encap.c ../cmdline_parse.c ../cmdline_parse_etheraddr.c ../glb_encap_pcap.c ../siphash24.c ../shared_opt.c
-o glb-director-pcap
-Ipwd/..
-I/usr/include/dpdk
-I/usr/include/x86_64-linux-gnu
-lpcap -z relro -z now -ljansson
/tmp/ccSt1u9w.o: In function `glb_fwd_config_ctx_decref':
/home/leisner/github/glb-director/src/glb-director/cli/../glb_fwd_config.c:157: undefined reference to `rte_acl_free'
/home/leisner/github/glb-director/src/glb-director/cli/../glb_fwd_config.c:159: undefined reference to `rte_acl_free'
/tmp/ccSt1u9w.o: In function `create_glb_fwd_config':
/home/leisner/github/glb-director/src/glb-director/cli/../glb_fwd_config.c:113: undefined reference to `create_bind_classifier'
collect2: error: ld returned 1 exit status

How to install in Kubernetes?

I wonder if it is possible to run the glb-director in Kubernetes? I haven't found any documentation about this. Any documentation/tips would be highly appreciated.

Question: XDP Director status

Hi, I'm seeing commits in 2020 for the XDP director, which I think is super exciting! We're interested in using GLB Director in a large e-commerce environment, but I haven't seen any official announcements regarding the XDP director's availability. Before getting too deep into the XDP director, I was hoping to get an idea of its status. Is it being run in production? Are there any pending features, known bugs, etc.? Thanks so much!

(Sorry if this isn't the right forum for a question like this, but I didn't see any contact/chat info and I saw some precedent for questions in other GitHub issues.)

How to install in CentOS?

Are there any documents or steps available for installing and configuring GLB Director on CentOS/RHEL?
As far as I can see, it is only available for Ubuntu.

Also, does GLB support custom UDP applications?

glb-director-xdp on bonded nic

I have successfully deployed glb-director-xdp v1.0.6 on a Debian 10 server (a t1.small/x86 instance from Packet).

Packet servers have bonded nics:

$ ip -o link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000\    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp0s20f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 xdpgeneric qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000\    link/ether 0c:c4:7a:81:09:9c brd ff:ff:ff:ff:ff:ff\    prog/xdp id 368 tag 631855c97cb7abd1
3: enp0s20f1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq master bond0 state DOWN mode DEFAULT group default qlen 1000\    link/ether 0c:c4:7a:81:09:9d brd ff:ff:ff:ff:ff:ff
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 xdpgeneric qdisc noqueue state UP mode DEFAULT group default qlen 1000\    link/ether 0c:c4:7a:81:09:9c brd ff:ff:ff:ff:ff:ff\    prog/xdp id 351 tag 631855c97cb7abd1

I initially tried using the xdp-root-shim@bond0 service with /etc/default/glb-director-xdp containing:

GLB_DIRECTOR_XDP_ROOT_PATHS="--xdp-root-path=/sys/fs/bpf/xdp_root_array@bond0"

with no success.

Then I tried configuring xdp on the underlying nic instead of bond0 via:

GLB_DIRECTOR_XDP_ROOT_PATHS="--xdp-root-path=/sys/fs/bpf/xdp_root_array@enp0s20f0"

And this works.

However, I'd like to have glb-director-xdp handling traffic from both underlying nics. Is there a way to configure this?
