Letting Traefik run on Worker Nodes

NOTE: I've written a new post that updates this one with a better implementation. However, the explanation here of why you'd want to run Traefik on worker nodes is still valid. So, read this post first, then jump over to the new one.

Traefik (traefik.io) is a fantastic tool and one I've used on many projects. It works really well and is easy to configure. In Docker mode, it listens to Docker events and automatically reconfigures itself so traffic can be sent to new services and/or containers. Deploying a microservice-based application is a breeze.

However, in order for it to listen to those events, it needs access to the Docker socket, so you often see Docker Compose files that look like this…

version: "3.5"

services:
  traefik:
    image: traefik:1.7  # pinned: the --docker flags below are Traefik 1.x syntax
    command: --docker --docker.watch
    ports:
      - 80:80
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

While this works just fine when running locally, it's a terrible idea in a Swarm cluster. Why? In order to hear Swarm events, Traefik has to talk to the Docker API on a manager node. With the socket mounted directly, that means a placement constraint that pins Traefik to the managers, so all of your cluster's incoming traffic will run through a manager node!
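
To make the problem concrete, here is a rough sketch of what the socket-mounting approach tends to look like once it is moved into a Swarm stack file. It reuses the Traefik flags that appear later in this post, and the 1.7 image tag is my assumption, since those flags are Traefik 1.x syntax. Treat it as an illustration of the constraint, not a recommended setup.

```yaml
version: "3.5"

services:
  traefik:
    image: traefik:1.7  # assumption: a 1.x tag, to match the flags below
    # Swarm mode reads services from the manager's Docker API via the local socket
    command: --docker --docker.swarmMode --docker.watch
    ports:
      - 80:80
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      placement:
        constraints:
          # Pins the proxy, and every request it handles, to manager nodes
          - node.role == manager
```

The placement constraint is the part that hurts: the same nodes that keep the cluster's state are now also handling every incoming request.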

Using socat

Per the man page, "socat is a command line based utility that establishes two bidirectional byte streams and transfers data between them." Using this utility, we can "upgrade" the Docker socket (a Unix socket) to a TCP socket. Then, services can connect to the Docker socket using plain TCP from remote locations.

If we run socat in a container that has the Docker socket mounted, we can make the Docker socket available to any other containers on the same network. If you're using Docker EE, you can further secure the network by limiting who can access it by putting it into its own collection.

Why use socat rather than just enabling remote connections on the engine socket? Great question! By doing this, we can leverage Swarm's DNS-based service discovery (so we don't have to look up where the managers are located) and we can use network isolation to limit who can access it.

The Stack File

The following stack file will add the socat service and update Traefik to use the new service for its Docker endpoint.

version: "3.6"

services:
  socat:
    image: alpine/socat
    command: tcp-listen:2375,fork,reuseaddr unix-connect:/var/run/docker.sock
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - mgmt
    deploy:
      placement:
        constraints:
          - node.role == manager

  traefik:
    image: traefik:1.7  # pinned: the --docker flags below are Traefik 1.x syntax
    command: --docker --docker.endpoint=tcp://socat:2375 --docker.watch --docker.swarmMode
    ports:
      - 80:80
    networks:
      - mgmt
      - app-entry
    deploy:
      placement:
        constraints:
          - node.role == worker

networks:
  mgmt:
    external: true
  app-entry:
    external: true

A couple of things to note…

  • The Traefik service is configured with a docker.endpoint of tcp://socat:2375. Remember that with Docker's DNS-based service discovery, the socat hostname will resolve to the socat service.
  • There are two networks, which you'll notice are defined externally. I do this so that 1) they have exact names (rather than having the stack name prefixed to them) and 2) it's easier to have other services connect to them (since this is a reverse proxy after all). The app-entry network is used to communicate from Traefik to any other service (example below).

Deploying a Service

Now that we have the proxy stack, let's deploy a simple app. We'll use the ridiculous mikesir87/cats image.

version: "3.6"

services:
  cats:
    image: mikesir87/cats
    networks:
      - app-entry
    deploy:
      labels:
        traefik.docker.network: app-entry
        traefik.backend: cats
        traefik.frontend.rule: "Path: /"
        traefik.port: 5000
      placement:
        constraints:
          - node.role == worker

networks:
  app-entry:
    external: true
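
Because both networks were created outside of any stack, the file above can refer to app-entry by its plain name. As a point of comparison, here is a sketch of what the same app stack might look like if the proxy stack had created the network itself. It assumes the proxy stack is deployed under the name proxy (as in the next section), in which case Swarm would name the network proxy_app-entry. This is a hypothetical variant, not what the demo repo uses.

```yaml
version: "3.6"

services:
  cats:
    image: mikesir87/cats
    networks:
      - app-entry
    deploy:
      labels:
        # Traefik has to be given the real (prefixed) network name
        traefik.docker.network: proxy_app-entry
        traefik.backend: cats
        traefik.frontend.rule: "Path: /"
        traefik.port: 5000
      placement:
        constraints:
          - node.role == worker

networks:
  app-entry:
    external: true
    # "<stack name>_<network key>": this couples the file to the proxy stack's name
    name: proxy_app-entry
```

Every app stack would then break if the proxy stack were ever deployed under a different name, which is the main reason I prefer creating the networks up front.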

Running it!

To try it out, we'll spin up a quick Swarm cluster using Play with Docker.

  • Get a quick five-node cluster (three managers and two workers) by using the templates found by clicking on the wrench icon.
  • On a manager node, run the following commands:
git clone https://github.com/mikesir87/traefik-socat-demo.git
cd traefik-socat-demo
docker network create --attachable --driver overlay --opt encrypted=true app-entry
docker network create --attachable --driver overlay --opt encrypted=true mgmt
docker stack deploy -c proxy-stack.yml proxy
docker stack deploy -c app-stack.yml cats

Wait for everything to start, then open the badge for port 80. You should see some cats now!

[Image: Cat Gifs!]

For kicks, you can also run this command to get a quick swarm visualizer:

docker service create --constraint 'node.role == manager' --mount type=bind,source=/var/run/docker.sock,destination=/var/run/docker.sock --publish 3000:3000 mikesir87/swarm-viz

(Yes… this runs on a manager node because I haven't made it configurable to connect to a TCP socket yet. Doh!)

Wait for that to launch, then open the badge for port 3000. You should see the Swarm with only the visualizer and socat on a manager node and everything else, including Traefik, on worker nodes!

[Image: Swarm Visualizer in action]

Further explorations

While this works, there are a few obvious next steps to explore. Have any others to add? Feel free to comment and ask below.

  • We could run the socat service in global mode so that an instance runs on every manager node, hopefully spreading the work out more than it is right now.
  • We could still secure the Docker socket by setting up cert auth.
  • We could run multiple replicas of Traefik to spread the load across the cluster, or even consider running that as a global service too.
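
As a rough sketch of the first and third ideas, the deploy sections of the proxy stack could be changed along these lines. This is untested and only shows the keys that differ from the stack file above; the replica count of 2 is an arbitrary choice. As far as I know, Swarm will not switch an existing service between replicated and global mode, so the stack would likely need to be removed and redeployed for the socat change to take effect.

```yaml
services:
  socat:
    deploy:
      # One socat task on every node that matches the constraint (all managers)
      mode: global
      placement:
        constraints:
          - node.role == manager

  traefik:
    deploy:
      # Several proxy tasks spread across the workers; 2 is an arbitrary example
      replicas: 2
      placement:
        constraints:
          - node.role == worker
```

With more than one socat task, the socat service name still resolves to a single virtual IP that load-balances across the tasks, so Traefik's endpoint setting can stay the same.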

Thanks!