Unicast VXLAN: overlay network for multiple servers with dozens of containers

Introduction

Virtualization is a great technology, as it lets you run multiple virtual systems on a single server.

It's easy to create a "LAN" for containers on a single server - just attach them to the same bridge, use the same subnet (e.g. 10.10.10.0/24) - done. Containers can communicate with each other using their private IP addresses.

However, with more than one server *not* in the same LAN (e.g. two LXD servers in different datacentres, or even in the same datacentre, but with a hosting provider which doesn't offer private LANs), things get tricky. There are numerous examples of such providers, e.g. Amazon AWS or Hetzner.

Virtual Extensible LAN (VXLAN) can help us solve this issue. No more hassle with port redirections, adjusting IPs after containers are migrated, etc.

While this HOWTO is written primarily with LXD in mind, it is not tied to it. Apart from LXD, it should work fine with LXC, KVM, Xen and Docker. Make sure to read the MTU notes.

Network diagram

We will build a Virtual Extensible LAN as shown in the example diagrams below. Each server uses unicast VXLAN and is connected to every other server. In the examples, the containers use the 10.10.10.0/24 subnet, but of course you can use any other subnet.

Example 1:

Example 2:

Prerequisites

You will need a fairly modern kernel and userspace. Ubuntu 14.04 LTS is too old; Ubuntu 16.04 LTS is fine. CentOS 7 should also be fine.

Each container needs to be attached to the bridge created by the script below. We're using vxbr0 as the bridge for VXLAN devices.

Performance

VXLAN offers near wire speed. For example, if iperf between containers with public IPs shows 90 MB/s, traffic over VXLAN should show around 85 MB/s.

Drawbacks

VXLAN does not encrypt the traffic
VXLAN does not compress the traffic

While it's typically fine to run unencrypted traffic between servers in the same VPC / security group in AWS or, more generally, within a single datacentre, you may want to think about adding traffic encryption between servers in different geographical locations.

MTU issue

A VXLAN interface lowers the usable MTU to 1450 (with the usual 1500-byte MTU between the servers). This means that any container attached to a bridge using VXLAN needs the MTU of its network interface set to 1450 as well. Otherwise, any connection sending larger packets will hang.
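The 1450 figure comes from the 50 bytes of headers that VXLAN adds to every packet on an IPv4 underlay. A minimal sketch of the arithmetic, assuming a standard 1500-byte MTU between the servers (the helper name vxlan_mtu is made up for this example):

```shell
#!/bin/sh
# Per-packet VXLAN overhead on an IPv4 underlay, in bytes.
INNER_ETH=14   # Ethernet header of the encapsulated container frame
VXLAN_HDR=8    # VXLAN header
OUTER_UDP=8    # outer UDP header
OUTER_IP=20    # outer IPv4 header (an IPv6 underlay would need 40)

# vxlan_mtu UNDERLAY_MTU
# Prints the largest MTU that fits inside the overlay.
vxlan_mtu() {
    echo $(( $1 - INNER_ETH - VXLAN_HDR - OUTER_UDP - OUTER_IP ))
}

vxlan_mtu 1500   # standard Ethernet between the servers: prints 1450
vxlan_mtu 9000   # jumbo frames between the servers: prints 8950
```

If the network between your servers supports jumbo frames, the same calculation tells you how far the container MTU could be raised.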

If you're using LXD, you can run "lxc config edit your-container" and set the mtu of the container's network device to 1450.
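As a sketch, the same setting can also be applied from the command line. This assumes the container's network device is called eth0 and is defined on the container itself rather than only inherited from a profile; both are assumptions, so check with "lxc config show your-container" first. The run helper and the DRYRUN default belong to this example only, and the commands are just printed unless DRYRUN=0 is set:

```shell
#!/bin/sh
DRYRUN=${DRYRUN:-1}                     # 1 = only print the commands
CONTAINER=${CONTAINER:-your-container}  # placeholder name from the text
NIC=${NIC:-eth0}                        # assumed device name

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRYRUN" = "1" ]; then echo "$*"; else "$@"; fi
}

set_container_mtu() {
    # Set the MTU on the container's network device.
    run lxc config device set "$CONTAINER" "$NIC" mtu 1450
    # Show the interface as seen from inside the container.
    run lxc exec "$CONTAINER" -- ip link show "$NIC"
}

set_container_mtu
```

Depending on the LXD version, the container may need a restart before the new MTU shows up.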

Preventing bridge looping

If you use more than two servers, each server will have more than one VXLAN device attached to the bridge. This normally creates packet loops. To prevent them, we use ebtables to block the traffic between different VXLAN devices on the same server.

The script sets this up automatically.

Please note that the script assumes it's the only thing on the server which manipulates ebtables.
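The rule involved might look roughly like the sketch below. It is not the script's actual code: it assumes the VXLAN devices share a name prefix (vxlan1, vxlan2, ...) taken from VXLAN_DEV, and it relies on ebtables treating a trailing "+" in an interface name as a wildcard. By default it only prints the commands; set DRYRUN=0 to run them as root.

```shell
#!/bin/sh
DRYRUN=${DRYRUN:-1}             # 1 = only print the commands
VXLAN_DEV=${VXLAN_DEV:-vxlan}   # assumed device name prefix

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRYRUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Drop frames that arrive on one VXLAN device and would be bridged out
# through another one. Frames between containers and VXLAN devices are
# not affected.
block_vxlan_forwarding() {
    run ebtables -A FORWARD -i "${VXLAN_DEV}+" -o "${VXLAN_DEV}+" -j DROP
}

# Remove the rule again, e.g. for a "stop" action.
unblock_vxlan_forwarding() {
    run ebtables -D FORWARD -i "${VXLAN_DEV}+" -o "${VXLAN_DEV}+" -j DROP
}

block_vxlan_forwarding
```

Since every server talks to every other server directly, a frame never needs to be relayed from one VXLAN device to another, so dropping that traffic breaks the loop without losing connectivity. Note that the wildcard matches any interface whose name starts with the prefix.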

vxlansetup.sh script

The script creates VXLAN devices between every server.

Usage: the available actions are start, stop, restart and status.

Copy the script to every server running the containers (do not copy or use the script in the containers).
Modify the LOCALIP, REMOTEIPS, LOCAL_DEV and BRIDGE_DEV variables on every server.
DRYRUN: set it to 1 to see which commands would be run.
VXLAN_DEV, PORT: the defaults should be OK for most systems, but adjust them if needed.
You have to run the script on every server (remember to adjust the IPs accordingly).
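To give an idea of what has to happen on each server, here is a minimal sketch of a start action. It is not the original script: the variable names LOCALIP, LOCAL_DEV, BRIDGE_DEV, VXLAN_DEV, PORT and DRYRUN follow the list above, while REMOTE_LINKS, the VXLAN ID numbers and all IP addresses are made up for the example. Each link gets its own VXLAN ID, which must be the same on both ends of that link, because Linux refuses to create two VXLAN devices with the same ID and port on one host.

```shell
#!/bin/sh
DRYRUN=${DRYRUN:-1}                  # 1 = only print the commands
LOCALIP=${LOCALIP:-203.0.113.1}      # example address of this server
LOCAL_DEV=${LOCAL_DEV:-eth0}         # interface carrying LOCALIP
BRIDGE_DEV=${BRIDGE_DEV:-vxbr0}      # bridge the containers attach to
VXLAN_DEV=${VXLAN_DEV:-vxlan}        # device name prefix
PORT=${PORT:-4789}                   # IANA-assigned VXLAN UDP port
# Hypothetical format: "remote-ip:vxlan-id", one entry per remote server.
REMOTE_LINKS=${REMOTE_LINKS:-"203.0.113.2:12 203.0.113.3:13"}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRYRUN" = "1" ]; then echo "$*"; else "$@"; fi
}

start_vxlan() {
    run ip link add "$BRIDGE_DEV" type bridge
    run ip link set "$BRIDGE_DEV" up
    for link in $REMOTE_LINKS; do
        remote=${link%%:*}   # part before the colon
        vni=${link##*:}      # part after the colon
        dev="${VXLAN_DEV}${vni}"
        # One unicast VXLAN device per remote server.
        run ip link add "$dev" type vxlan id "$vni" \
            local "$LOCALIP" remote "$remote" dstport "$PORT" dev "$LOCAL_DEV"
        # Attach it to the bridge and bring it up.
        run ip link set "$dev" master "$BRIDGE_DEV" up
    done
}

start_vxlan
```

On the server at 203.0.113.2, the matching entry would point back at 203.0.113.1 with the same ID 12. The loop-prevention rule and the stop, restart and status actions are left out of this sketch.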

Example output

Here is an example output in "DRYRUN" mode:

