A briga entre Docker e firewalld

Categoria: Linux Publicado: Segunda, 16 Agosto 2021 Escrito por Helio Loureiro Imprimir

Tirei férias longas demais.   Na verdade foram apenas 3 semanas, mas eu meio que deixei de postar aqui durante o período de férias.  E a procrastinação voltou forte.   Então aqui vamos nós com uma tentativa de voltar a escrever semanalmente.

Hoje, fazendo um troubleshooting the um serviço que não funcionava como devia (um scan de aplicação com o OWASP ZAP), descobri que meus containers em docker não estavam acessando a rede.   O que mudou?   Minha máquina de trabalho é um Ubuntu 18.04.  O repositório bionic-update trouxe uma versão nova do docker que reiniciou o daemon, mas... a parte de rede não funcionando.  E só percebi isso hoje.

root@dell-latitude-7480 /u/local# apt show docker.io
Package: docker.io
Version: 20.10.7-0ubuntu1~18.04.1
Built-Using: glibc (= 2.27-3ubuntu1.2), golang-1.13 (= 1.13.8-1ubuntu1~18.04.3)
Priority: optional
Section: universe/admin
Origin: Ubuntu
Maintainer: Ubuntu Developers <Este endereço de email está sendo protegido de spambots. Você precisa do JavaScript ativado para vê-lo.>
Original-Maintainer: Paul Tagliamonte <Este endereço de email está sendo protegido de spambots. Você precisa do JavaScript ativado para vê-lo.>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Installed-Size: 193 MB
Depends: adduser, containerd (>= 1.2.6-0ubuntu1~), iptables, debconf (>= 0.5) | debconf-2.0, libc6 (>= 2.8), libdevmapper1.02.1 (>= 2:1.02.97), libsystemd0 (>= 209~)
Recommends: ca-certificates, git, pigz, ubuntu-fan, xz-utils, apparmor
Suggests: aufs-tools, btrfs-progs, cgroupfs-mount | cgroup-lite, debootstrap, docker-doc, rinse, zfs-fuse | zfsutils
Breaks: docker (<< 1.5~)
Replaces: docker (<< 1.5~)
Homepage: https://www.docker.com/community-edition
Download-Size: 36.9 MB
APT-Manual-Installed: yes
APT-Sources: mirror://mirrors.ubuntu.com/mirrors.txt bionic-updates/universe amd64 Packages
Description: Linux container runtime
 Docker complements kernel namespacing with a high-level API which operates at
 the process level. It runs unix processes with strong guarantees of isolation
 and repeatability across servers.
 .
 Docker is a great building block for automating distributed systems:
 large-scale web deployments, database clusters, continuous deployment systems,
 private PaaS, service-oriented architectures, etc.
 .
 This package contains the daemon and client. Using docker.io on non-amd64 hosts
 is not supported at this time. Please be careful when using it on anything
 besides amd64.
 .
 Also, note that kernel version 3.8 or above is required for proper operation of
 the daemon process, and that any lower versions may have subtle and/or glaring
 issues.

N: There is 1 additional record. Please use the '-a' switch to see it

Primeira coisa que tentei foi reiniciar o docker mesmo.

root@dell-latitude-7480 /u/local# systemctl restart --no-block docker; journalctl -u docker -f
[...]
Aug 12 10:29:25 dell-latitude-7480 dockerd[446605]: time="2021-08-12T10:29:25.203367946+02:00" level=info msg="Firewalld: docker zone already exists, returning"
Aug 12 10:29:25 dell-latitude-7480 dockerd[446605]: time="2021-08-12T10:29:25.549158535+02:00" level=warning msg="could not create bridge network for id 88bd200
b5bb27d3fd10d9e8bf86b1947b2190cf7be36cd7243eec55ac8089dc6 bridge name docker0 while booting up from persistent state: Failed to program NAT chain:
ZONE_CONFLICT: 'docker0' already bound to a zone" Aug 12 10:29:25 dell-latitude-7480 dockerd[446605]: time="2021-08-12T10:29:25.596805407+02:00" level=info msg="stopping event stream following graceful shutdown"
error="" module=libcontainerd namespace=moby Aug 12 10:29:25 dell-latitude-7480 dockerd[446605]: time="2021-08-12T10:29:25.596994440+02:00" level=info msg="stopping event stream following graceful shutdown"
error="context canceled" module=libcontainerd namespace=plugins.moby Aug 12 10:29:25 dell-latitude-7480 dockerd[446605]: failed to start daemon: Error initializing network controller: Error creating default "bridge" network:
Failed to program NAT chain: ZONE_CONFLICT: 'docker0' already bound to a zone Aug 12 10:29:25 dell-latitude-7480 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE Aug 12 10:29:25 dell-latitude-7480 systemd[1]: docker.service: Failed with result 'exit-code'. Aug 12 10:29:25 dell-latitude-7480 systemd[1]: Failed to start Docker Application Container Engine. Aug 12 10:29:27 dell-latitude-7480 systemd[1]: docker.service: Service hold-off time over, scheduling restart. Aug 12 10:29:27 dell-latitude-7480 systemd[1]: docker.service: Scheduled restart job, restart counter is at 3. Aug 12 10:29:27 dell-latitude-7480 systemd[1]: Stopped Docker Application Container Engine. Aug 12 10:29:27 dell-latitude-7480 systemd[1]: docker.service: Start request repeated too quickly. Aug 12 10:29:27 dell-latitude-7480 systemd[1]: docker.service: Failed with result 'exit-code'. Aug 12 10:29:27 dell-latitude-7480 systemd[1]: Failed to start Docker Application Container Engine.

As linhas estão editadas pra facilitar a visualização uma vez que o systemd usa linhas bem maiores que 120 colunas.  Mas o resultado foi... falha.

Parando o firewalld e somente reiniciando docker levava a uma condição em que o daemon iniciava, mas ao iniciar o container, novamente ficava sem acesso à rede.

root@dell-latitude-7480 /u/local# docker run -it --rm --init ubuntu:20.04 bash
root@f45dcbb1ecaa:/# ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
^C
--- 1.1.1.1 ping statistics ---
6 packets transmitted, 0 received, 100% packet loss, time 5153ms

root@f45dcbb1ecaa:/# exit

Olhando somente as regras do firewall eu pude ver que realmente o docker estava carregando a regra correta sem o firewalld:

root@dell-latitude-7480 /u/local# systemctl stop firewalld.service 
root@dell-latitude-7480 /u/local# iptables -L -n -t nat
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
root@dell-latitude-7480 /u/local# systemctl restart docker
root@dell-latitude-7480 /u/local# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-08-12 12:01:12 CEST; 4s ago
     Docs: https://docs.docker.com
 Main PID: 484649 (dockerd)
    Tasks: 27
   CGroup: /system.slice/docker.service
           └─484649 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

Aug 12 12:01:12 dell-latitude-7480 dockerd[484649]: time="2021-08-12T12:01:12.061383466+02:00" level=warning msg="Your kernel does not support swap 
 memory limit"
Aug 12 12:01:12 dell-latitude-7480 dockerd[484649]: time="2021-08-12T12:01:12.061414030+02:00" level=warning msg="Your kernel does not support CPU
 realtime scheduler"
Aug 12 12:01:12 dell-latitude-7480 dockerd[484649]: time="2021-08-12T12:01:12.061421558+02:00" level=warning msg="Your kernel does not support cgroup
 blkio weight"
Aug 12 12:01:12 dell-latitude-7480 dockerd[484649]: time="2021-08-12T12:01:12.061427194+02:00" level=warning msg="Your kernel does not support cgroup
 blkio weight_device"
Aug 12 12:01:12 dell-latitude-7480 dockerd[484649]: time="2021-08-12T12:01:12.061796106+02:00" level=info msg="Loading containers: start."
Aug 12 12:01:12 dell-latitude-7480 dockerd[484649]: time="2021-08-12T12:01:12.531851162+02:00" level=info msg="Loading containers: done."
Aug 12 12:01:12 dell-latitude-7480 dockerd[484649]: time="2021-08-12T12:01:12.549979768+02:00" level=info msg="Docker daemon"
 commit="20.10.7-0ubuntu1~18.04.1" graphdriver(s)=overlay2 version=20.10.7
Aug 12 12:01:12 dell-latitude-7480 dockerd[484649]: time="2021-08-12T12:01:12.550057275+02:00" level=info msg="Daemon has completed initialization"
Aug 12 12:01:12 dell-latitude-7480 dockerd[484649]: time="2021-08-12T12:01:12.558188106+02:00" level=info msg="API listen on /var/run/docker.sock"
Aug 12 12:01:12 dell-latitude-7480 systemd[1]: Started Docker Application Container Engine.
root@dell-latitude-7480 /u/local# iptables -L -n -t nat
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
DOCKER     all  --  0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
MASQUERADE  all  --  172.16.0.0/24       0.0.0.0/0           

Chain DOCKER (2 references)
target     prot opt source               destination         
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Claramente existia uma regra de MASQUERADE vinda da rede do docker (172.16.0.0/24).  E o firewalld estava sumindo com essa regra ao ser ativado (pra ficar menos poluído com várias regras peguei só a saída da cadeia do POSTROUTING.

root@dell-latitude-7480 /u/local# systemctl start firewalld.service 
root@dell-latitude-7480 /u/local# iptables -L POSTROUTING -n -t nat 
Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
POSTROUTING_direct  all  --  0.0.0.0/0            0.0.0.0/0           
POSTROUTING_ZONES_SOURCE  all  --  0.0.0.0/0            0.0.0.0/0           
POSTROUTING_ZONES  all  --  0.0.0.0/0            0.0.0.0/0

A minha primeira ideia: inserir à força uma regra de MASQUERADE direto na cadeia de POSTROUTING.

root@dell-latitude-7480 /u/local# iptables -I  POSTROUTING 1 -s 172.16.0.0/24 -j MASQUERADE -t nat
root@dell-latitude-7480 /u/local# iptables -L POSTROUTING --line-numbers -t nat
Chain POSTROUTING (policy ACCEPT)
num  target     prot opt source               destination         
1    MASQUERADE  all  --  172.16.0.0/24       anywhere            
2    POSTROUTING_direct  all  --  anywhere             anywhere            
3    POSTROUTING_ZONES_SOURCE  all  --  anywhere             anywhere            
4    POSTROUTING_ZONES  all  --  anywhere             anywhere            

E, claro, não deu certo.

Depois de procurar na Internet sobre docker e firewalld, encontrei o próprio site do Docker explicando como fazer isso em https://docs.docker.com/network/iptables/ com o seguinte comando:

# Please substitute the appropriate zone and docker interface
$ firewall-cmd --zone=trusted --remove-interface=docker0 --permanent
$ firewall-cmd --reload

Beleza.  Agora não teria como dar errado. E...

root@dell-latitude-7480 /u/local# firewall-cmd --get-zone-of-interface=docker0
public
root@dell-latitude-7480 /u/local# firewall-cmd --zone=public --remove-interface=docker0 --permanent
The interface is under control of NetworkManager and already bound to the default zone
The interface is under control of NetworkManager, setting zone to default.
success
root@dell-latitude-7480 /u/local# systemctl start docker
Job for docker.service failed because the control process exited with error code.
See "systemctl status docker.service" and "journalctl -xe" for details.

Caramba... algo de errado não estava certo.  Bom... se tivesse funcionado de primeira, eu provavelmente não teria escrito esse artigo.

Então vamos rever em qual zona está a interface docker0, remover essa interface dessa zona e adicionar na zona do docker.

root@dell-latitude-7480 /u/local# firewall-cmd --get-zone-of-interface=docker0
public
root@dell-latitude-7480 /u/local# firewall-cmd --zone=public --remove-interface=docker0 --permanent
The interface is under control of NetworkManager and already bound to the default zone
The interface is under control of NetworkManager, setting zone to default.
success
root@dell-latitude-7480 /u/local# firewall-cmd --get-zone-of-interface=docker0
public
root@dell-latitude-7480 /u/local# firewall-cmd --reload
success
root@dell-latitude-7480 /u/local# firewall-cmd --get-zone-of-interface=docker0
public

Mas que catzo... esse foi problema que encontrei.  Por mais que eu removesse ou tentasse remover a interface docker0 da zone public, sempre voltava.

Foram algumas horas nesse vai e vem, procurando na Internet o que fazer, lendo documentação do firewalld, até que finalmente acertei.

root@dell-latitude-7480 /u/local# firewall-cmd --zone=docker --add-interface=docker0 --permanent
The interface is under control of NetworkManager, setting zone to 'docker'.
success
root@dell-latitude-7480 /u/local# firewall-cmd --get-zone-of-interface=docker0
docker
root@dell-latitude-7480 /u/local# firewall-cmd --reload
success
root@dell-latitude-7480 /u/local# systemctl start docker

Então não precisava do comando pra remover.  Apenas adicionar diretamente na zona desejada.

root@dell-latitude-7480 /u/local# docker run -it --rm --init ubuntu:20.04 bash
root@e5d78d7f081b:/# ping -c 5 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=58 time=1.95 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=58 time=2.02 ms
64 bytes from 1.1.1.1: icmp_seq=3 ttl=58 time=1.68 ms
64 bytes from 1.1.1.1: icmp_seq=4 ttl=58 time=1.62 ms
64 bytes from 1.1.1.1: icmp_seq=5 ttl=58 time=1.76 ms

--- 1.1.1.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4003ms
rtt min/avg/max/mdev = 1.621/1.808/2.026/0.162 ms
root@e5d78d7f081b:/# exit

via GIPHY

Acessos: 235