Learning Docker, traefik and backing things up on a Proxmox host

January 7, 2024

Beginning to build a homelab has been an experience. Picking the underlying system on which to build took way too long. But now that I’ve settled on something (Proxmox with a simple VM as the Docker host; no Portainer, no k8s, no NAS; as few layers as possible for now), progress is being made.

I started by setting up a Debian LXC and, after figuring that all out, discovered that Docker on LXC is not supported on Proxmox since it doesn’t play nice with overlayfs. The interesting thing is that mileage varies across the setups and stories out there. Most people either run without problems altogether (not running ZFS, I believe) or run fine for a long time (2 years). Others who set up ZFS and try this same arrangement end up with problems. Apparently setting up fuse-overlayfs can help mitigate these issues, but I really don’t understand enough about these things as it is, so I’m sticking to a heavier VM build for now. LXD in theory should be an improvement, but when you look under the hood LXD appears to be another layer on top of LXC. If that includes working out the problems between overlayfs and Proxmox, that would be nice.

I thought I’d start simple, with one service: the Immich photo server. I could gush about Immich here, but you just have to look around on the web to see what people are saying about this software. Setting it up in Docker was amazingly simple using docker-compose. I looked up a recipe and ran through it without understanding exactly what I was doing, but it just came up and worked right away on my local network. The host VM has 16GB of RAM (fixed… that is, min and max match) and 2TB of disk. This is available to the containers. The containers for now are dumping all their data to the VM’s disk as volumes. In configuring Immich I moved the postgresql data from a named volume (which doesn’t live with the rest of the container data) to a directory alongside the rest of this service’s data. That puts all the contents for Immich in one place, so if I want to run file system backups instead of a VM backup, things are neatly together.

Docker start up and creating the first service

To prepare the VM to be the docker host I had to do a couple of things. 

Basics: bring the system up to date for everything already installed

apt update
apt upgrade

Then I installed Docker and docker-compose (on Debian the Docker engine package is named docker.io)

apt install docker.io
apt install docker-compose

Then I did something that isn’t what you’d call secure, but it saves a lot of workarounds and faffing about with Docker. I enabled root in Debian. This is NOT recommended; there are ways to run Docker “rootless”. But that’s another learning curve and another layer of complexity. I’ll get there but not today.

passwd root

Follow the prompts and set a password. This both sets a password and unlocks root for login. I repeat, this is NOT a best practice. This is a “get it working” practice. I then logged in as root and ran all the docker-compose commands from the directory where I’m storing the docker-compose.yml file for the service I’m working on. Each service gets its own directory, or I’d lose my mind trying to keep it all straight (or backed up, for that matter). Mostly that means ‘docker-compose up’. ‘docker-compose down’ takes a service down without removing volumes, which is nice, so things persist. ‘docker-compose down -v’ scorches the earth… sorta: it also removes the named volumes declared in the compose file and the data they hold (bind-mounted directories are left alone).

NOTE: ‘docker-compose down’ uses the docker-compose.yml file to take things down. It doesn’t look to see what’s running; it works from what the file says right now. SO if you change the .yml file BEFORE taking the service down, weirdness can ensue. Either take a copy of your .yml file before “fixing” it or take the service down first. I recommend making copies so you can see what you did.
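A minimal sketch of the "take a copy first" habit described above. The function name backup_compose_file and the .bak naming scheme are my own assumptions, not part of docker-compose.

```shell
# Hypothetical helper: keep a timestamped copy of a compose file before editing it.
backup_compose_file() {
    local file=${1:-docker-compose.yml}
    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 1
    fi
    local copy
    copy="$file.$(date +%Y%m%d-%H%M%S).bak"
    # -p keeps the original timestamps and permissions on the copy
    cp -p "$file" "$copy" && echo "$copy"
}

# Suggested order, run from the service's directory:
#   docker-compose down        # take the service down with the OLD file
#   backup_compose_file        # prints the name of the copy it made
#   (edit docker-compose.yml)
#   docker-compose up -d
```

The copies pile up next to the original, which doubles as a crude history of what changed.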

The docker-compose.yml that worked for me to get Immich up and running is this

version: "3.8"

#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#

#name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: [ "start.sh", "immich" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    # extends:
    #   file: hwaccel.yml
    #   service: hwaccel
    command: [ "start.sh", "microservices" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always

  redis:
    container_name: immich_redis
    image: redis:6.2-alpine@sha256:b6124ab2e45cc332e16398022a411d7e37181f21ff7874835e0180f56a09e82a
    restart: always

  database:
    container_name: immich_postgres
    image: tensorchord/pgvecto-rs:pg14-v0.1.11@sha256:0335a1a22f8c5dd1b697f14f079934f5152eaaa216c09b61e293be285491f8ee
    env_file:
      - .env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: always

volumes:
  pgdata:
  model-cache:
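The compose file above reads several values from the .env file sitting next to it (UPLOAD_LOCATION, DB_PASSWORD, DB_USERNAME and DB_DATABASE_NAME; IMMICH_VERSION has a default). As a rough sketch, a small check like the one below can catch a missing value before the containers start. The helper name missing_env_vars is my own invention.

```shell
# Hypothetical helper: print every variable name that is absent or empty in an env file.
missing_env_vars() {
    local env_file=$1
    shift
    local var
    for var in "$@"; do
        # Counts as set only if a line starts with NAME= followed by at least one character.
        grep -Eq "^${var}=.+" "$env_file" || echo "$var"
    done
}

# Usage, from the directory holding docker-compose.yml and .env:
#   missing_env_vars .env UPLOAD_LOCATION DB_PASSWORD DB_USERNAME DB_DATABASE_NAME
# No output means every listed variable has a value.
```

Running ‘docker-compose config’ in the same directory is a complementary check: it prints the compose file with the variables substituted, so an empty value is easy to spot.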

Traefik setup

Then I wanted to share with the world. For that I figured I didn’t want to expose the interface to the container directly to the wild internet. I figured I should get and use a proxy of some kind. Preferably one that provides only one point of entry from the world to anything I might want to expose. I did a pile of research and figured out that nginx is the most popular, most flexible, traefik is scalable and slightly simpler to configure and Caddy I didn’t play with at all but found it was another popular choice. The nice thing about all of these is that as a bonus, if you can figure out how to configure it, they will obtain SSL certificates for your web services automatically. This is dependent on a cooperative DNS host (like Cloudflare) and works like a charm if you set it up correctly. 

I chose traefik mostly because a wise and much more experienced infrastructure guy suggested I try it, and after comparing syntax between nginx and traefik it seemed simpler and more understandable to me (YMMV). I set up a traefik container using docker-compose again, and again it came right up. This time the recipe I used wasn’t really a recipe so much as a hodgepodge of whatever I could find online that made any sense to me. I fiddled a lot until I got it to correctly pass traffic through my barriers and allow internet traffic to reach the Traefik interface.

NOTE: the traefik examples come with a sample (whoami) service that uses traefik. It is much wiser to test with that, as using the traefik service itself as a test (by exposing it to the wild internet) isn’t a great idea. For the traefik container to know which containers are running (it automatically picks up what’s running in Docker and configures itself accordingly, which is nice) it has to have direct access to the Docker socket. Under volumes in the docker-compose.yml this line appears

- /var/run/docker.sock:/var/run/docker.sock

This is a bad thing, as a bad actor who compromises Traefik may use the socket to gain access to the VM host and thereby the rest of your homelab. So until you understand how to create a Docker socket proxy (there are some packages out there, or you can put nginx, which doesn’t have this vulnerability as I understand it, out in front) I wouldn’t expose traefik’s interface, which is intended for internal monitoring only, to the outside world.

Getting the traefik container up and running was a good start but nothing ran through it. To make that happen you have to do a few things.

  • create a docker network (external to the containers) – enabling any container on that network to see any other container on that network
  • ensure there’s an entry for logs under volumes so you can see from the VM host what’s happening in Traefik easily
  • for SSL, configure certificatesResolvers and the environment variables enabling login to your DNS provider (in my case Cloudflare, using dnschallenge)
  • configure a DNS entry at your DNS provider to support the service (in my first case photos.mypersonaldomain.ca), this is most easily done with a CNAME record entry pointing to an entry your router is updating with your home IP address (an A record for a sub domain you likely won’t be using for anything). Leaving the A record proxied by Cloudflare on a free account should work for you (YMMV), if you’re getting timeouts you can remove the proxy but there may be other configuration issues which can cause timeouts as well. 
  • Forward the port (443 unless you’re doing something exotic) to the system (the VM hosting docker containers and therefore your proxy, not proxmox) on your router. This will point to both the internal IP address for the VM and the port on that system. See port forwarding configuration documentation for your router.  
  • finally you have to add entries to the containers that need to be externally accessed (in the case of Immich, the proxy network needs to be added to every container, and immich-server needs traefik labels that enable and route traffic correctly as well as call for an SSL certificate).
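The first step in the list, creating the shared external network, can be sketched as below. It assumes the network is called proxy, as in the compose files in this post; the function name ensure_network and the DOCKER override variable are hypothetical conveniences of mine.

```shell
# DOCKER can be pointed at a stub for testing; by default it is the real docker CLI.
DOCKER=${DOCKER:-docker}

# Create the named network only if it does not exist yet, so the command is safe to re-run.
ensure_network() {
    local name=$1
    if "$DOCKER" network inspect "$name" >/dev/null 2>&1; then
        echo "network $name already exists"
    else
        "$DOCKER" network create "$name" >/dev/null && echo "network $name created"
    fi
}

# Usage on the Docker host VM:
#   ensure_network proxy
```

Because the network is created outside any compose file, each docker-compose.yml that wants to join it has to declare it with external: true.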

Traefik is configured like this for me using docker-compose.

version: '3.8'


services:

  traefik:
    image: 'traefik:v2.10.7'
    restart: always
    container_name: traefik
    deploy:
      resources:
        limits:
          cpus: '0.10'
          memory: 256M
    ports:
      - "443:443"
    command:
      - "--api.insecure=true"
      - "--api.dashboard=true"
      - "--api.debug=true"
      - "--log.level=DEBUG"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--providers.file.filename=/dynamic.yaml"
      - "--providers.docker.network=proxy"
      - "--entryPoints.sslproxy.address=:443"
      - "--certificatesResolvers.myresolver.acme.email=me@mypersonaldomain.ca"
      - "--certificatesResolvers.myresolver.acme.storage=/data/acme.json"
      - "--certificatesresolvers.myresolver.acme.caServer=https://acme-v02.api.letsencrypt.org/directory"
      - "--certificatesresolvers.myresolver.acme.dnschallenge=true"
      - "--certificatesresolvers.myresolver.acme.dnschallenge.provider=cloudflare"
      # - "--certificatesresolvers.myresolver.acme.httpchallenge.entrypoint=proxy"

    environment:
      - TIMEZONE=America/Vancouver
      - CONTAINER_NAME=traefik
      - CONTAINER_ENABLE_LOGSHIPPING=FALSE
      - CONTAINER_ENABLE_MONITORING=FALSE

      - LETSENCRYPT_EMAIL=me@mypersonaldomain.ca
      - LETSENCRYPT_CHALLENGE=DNS
      - LETSENCRYPT_DNS_PROVIDER=cloudflare

      - CF_API_EMAIL=me@mypersonaldomain.ca
      - CF_API_KEY=complicated ID from Cloudflare

      - LETSENCRYPT_DNS_DOMAIN1_MAIN=mypersonaldomain.ca

      - LOG_LEVEL=INFO

      - LOG_TYPE=FILE
      - ACCESS_LOG_TYPE=FILE
      - SERVER_TRANSPORT_INSECURE_SKIP_VERIFY=TRUE
    volumes:
      - ./certs:/data/certs
      - ./config:/data/config
      - ./logs:/data/logs
      - ./dynamic.yaml:/dynamic.yaml
      - /var/run/docker.sock:/var/run/docker.sock

    networks:
      - socket-proxy
      - proxy

networks:
  proxy:
    external: true

And the working docker-compose.yml for Immich now looks something like this

version: "3.8"


#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#

#name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: [ "start.sh", "immich" ]
    labels:
      - traefik.enable=true
      - traefik.http.routers.immich-server.rule=Host(`photos.mypersonaldomain.ca`)
      - traefik.docker.network=proxy
      - traefik.http.services.immich-server.loadbalancer.server.port=3001
      - traefik.http.routers.immich-server.tls.certresolver=myresolver

    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    networks:
      - immichproxy
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    # extends:
    #   file: hwaccel.yml
    #   service: hwaccel
    command: [ "start.sh", "microservices" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    networks:
      - immichproxy
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    volumes:
      - model-cache:/cache
    networks:
      - immichproxy
    env_file:
      - .env
    restart: always

  redis:
    container_name: immich_redis
    image: redis:6.2-alpine@sha256:b6124ab2e45cc332e16398022a411d7e37181f21ff7874835e0180f56a09e82a
    networks:
      - immichproxy
    restart: always

  database:
    container_name: immich_postgres
    image: tensorchord/pgvecto-rs:pg14-v0.1.11@sha256:0335a1a22f8c5dd1b697f14f079934f5152eaaa216c09b61e293be285491f8ee
    env_file:
      - .env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
    volumes:
      - ./db:/var/lib/postgresql/data
    networks:
      - immichproxy
    restart: always

volumes:
  # pgdata:
  model-cache:

networks:
  immichproxy:
    name: proxy
    external: true

The experiments I was conducting with the proxy network led to the alias for a named external network here. Note that each compose project creates its own default network; you can instead alias a network name to the existing external network, as I’ve done. Note also the addition of the network to all containers. This is necessary since all of these containers make up Immich and need to communicate with each other. Since traefik is on the same Docker network as Immich, it needs to be pointed to the internal port 3001, not 2283, which is the port published on the VM for the Immich container. With the certresolver label added, this also gets Traefik to retrieve an SSL certificate for this service for the hostname in the rule=Host label. This works because we set up certificate resolution in the traefik container’s docker-compose file.
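To make the 2283 versus 3001 distinction concrete, here is a tiny sketch that pulls apart a compose port mapping of the simple HOST:CONTAINER form. The function name describe_port_mapping is hypothetical.

```shell
# Split a compose "HOST:CONTAINER" port mapping into its two halves.
describe_port_mapping() {
    local mapping=$1
    local host_port=${mapping%%:*}        # everything before the first colon
    local container_port=${mapping##*:}   # everything after the last colon
    echo "host=$host_port container=$container_port"
}

describe_port_mapping 2283:3001
# prints: host=2283 container=3001
```

The host half (2283) is what a browser on the LAN uses against the VM’s address. The container half (3001) is what Traefik uses, since it talks to the container directly over the shared proxy network, and that is why the loadbalancer.server.port label says 3001.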

And then it just worked… It was really very gratifying to get here. But it’s not safe. Realistically I’m not a security expert and the possibility of getting in trouble is still real. The other thing is, this is holding all of my family photos from 2004 on a single spinning disk. I really DO NOT want to lose any. So backing this up needs to happen.

Backups

Setting up backups required a bit of homework as well. In fact there’s more homework to do, but for now I’m keeping it VERY simple.

Basically I’ve set up an external hard drive attached to Proxmox directly. I’ve configured it as a Directory with content type VZDump, which means it’s meant to hold Proxmox dumps of whole virtual machines. In the long run this won’t be granular enough for me, but for now it’s just fine and it works (yay). Getting here was interesting, as I learned a few things that I suspect I’ll be using in the future for doing backups, or at least recovery.

rsync is a powerful and simple tool. It is quite swift, which is nice, and you can use it to push (command run on the source system) or pull (command run on the target system) as you need to. This came up as I needed to move the fledgling system configured above from an LXC Docker host to a VM Docker host (since problems can arise running Docker on LXC). rsync is how I made this happen. First I tried to just rsync the directories from the LXC to the VM (I thought this would include all the volumes from the installs above) without realizing I was missing something: the postgres data for Immich, which was in a named volume rather than in the directories where everything else is. To get rsync working you need a couple of things. You CAN use password logins with rsync, but that’s impractical for automation (like backups), and public-private key pairs, when correctly set up, reduce friction by a lot. So I generated some key pairs (one on the source LXC and one on the Immich postgresql container), copied the public keys to the recipient machine (the new VM) and used rsync to move the files. Note, best practice is to generate one key pair and share the public key wherever…

To generate key pairs, log in as the user who will be doing the rsync; in this case it was root

ssh-keygen

This starts the process of generating a private/public key pair (RSA in my case; the default key type and size depend on your OpenSSH version). Follow the prompts, but when it asks for a passphrase do NOT enter anything; just hit the enter key, leaving it empty (this allows automation with the key pair). Next you have to copy the newly created public key, id_rsa.pub (the file is at ~/.ssh/id_rsa.pub, where ~/ is the home directory of the user you’re logged in as), to the other system. Specifically, its contents go into the authorized_keys file in the home directory of the user that rsync will log in as on that system.

~/.ssh/authorized_keys
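A minimal sketch of what "copying the public key" amounts to on the system being logged in to. The function name install_public_key is hypothetical; it appends the key only once and sets the restrictive permissions that sshd usually insists on.

```shell
# Hypothetical helper: add a public key to an authorized_keys file without duplicating it.
install_public_key() {
    local pubkey_file=$1
    local ssh_dir=${2:-$HOME/.ssh}
    local auth_file=$ssh_dir/authorized_keys
    local key
    key=$(cat "$pubkey_file") || return 1
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    touch "$auth_file" && chmod 600 "$auth_file"
    # -x whole-line match, -F fixed string: append only when this exact key is absent.
    grep -qxF -e "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
}

# Usage, after getting id_rsa.pub onto the machine you want to log in to:
#   install_public_key /tmp/id_rsa.pub
```

Where password login is still enabled, OpenSSH’s own ssh-copy-id (for example ssh-copy-id root@<ip of other system>) does much the same job in one step.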

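Now that the keys have been exchanged I logged into the receiving system as a non-root user and pulled the files across. The trailing slashes tell rsync to copy the contents of the source directory into the destination rather than nesting a second containers directory inside it.

sudo rsync -avh root@<ip of source system>:/var/local/containers/ /var/local/containers/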

This copied all of the container data I needed (docker-compose.yml files and all of the persistent data I wanted to keep) with one exception. When I fired up the new containers in the new system the database for Immich was missing. On review of the original .yml file I can see now that it was in a named volume. Named volumes aren’t kept with the rest of the persistent data for the container on a path I know. I have learned that named volumes end up at /var/lib/docker/volumes/<project_name>_<volume_name>, in this case /var/lib/docker/volumes/immich_pgdata. There are ways around this (bind mounts or docker plugins) but I just needed to get a copy of my data. To do this I had to get into the docker container itself and run rsync again from there. In this case I used
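The naming rule for named volumes can be written down as a one-liner. This is a sketch that assumes Docker’s default data directory (/var/lib/docker) and the default local volume driver, which keeps the files in a _data subdirectory; the function name named_volume_path is hypothetical.

```shell
# Build the on-disk path of a compose named volume: <project>_<volume> under Docker's volumes directory.
named_volume_path() {
    local project=$1 volume=$2
    echo "/var/lib/docker/volumes/${project}_${volume}/_data"
}

named_volume_path immich pgdata
# prints: /var/lib/docker/volumes/immich_pgdata/_data

# Rather than trusting the rule, Docker can be asked directly on the host:
#   docker volume inspect --format '{{ .Mountpoint }}' immich_pgdata
```

The project name is normally the name of the directory holding the docker-compose.yml, which is why the volume here comes out as immich_pgdata.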

docker exec -it immich_postgres bash

Then I had to do the usual things for a completely bare bones linux system

apt update
apt install ssh
apt install rsync

THEN I could copy the id_rsa.pub to the root .ssh/authorized_keys file and then I could run rsync (this time I’m running as root so no sudo)

rsync -avh /var/lib/docker/volumes/immich_pgdata root@<ip of target VM system>:/var/local/containers/immich/db

Note the change in destination of the database files. Having learned my lesson about the location of named volumes, I’ve chosen to put EVERYTHING for the service under one directory so it is easy to back up and easy to find.

So… in case you’re wondering that’s a pretty quick and relatively simple way to manually migrate between VM/LXC systems. 

These sorts of commands could be used to do file-level backups of container volumes or anything else you like, really. You would set these commands up as cron jobs on your backup system (which in my case is the host Proxmox server). In the end that is all you need if you’re looking at reinstalling after a disaster. I confess, my current plan is even simpler than this. I’m currently using the built-in Proxmox backups (configured in the Datacenter under “Backup”). These backups appear listed on my “Backup” storage item under the node running everything. A test restore to a completely new VM, with the backed-up (original) VM stopped, made it a drop-in replacement including the MAC address, so all networking and everything else simply worked in the new VM. At some point I’ll get to using pg_dump as well as file-level backups so snapshots and more granular time points can be recovered. But for now daily should be enough.
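As a rough sketch of what such a cron-driven, file-level backup might look like: the script below pulls the container directory into a dated folder. All paths, the 192.0.2.10 address, the script name and the function names are placeholders of mine, and by default it only prints the rsync command instead of running it.

```shell
# DRY_RUN=1 (the default here) only prints commands; set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy the contents of a source directory into <dest_root>/<date>/.
backup_containers() {
    local source=$1 dest_root=$2
    local stamp=${3:-$(date +%F)}   # defaults to today, e.g. 2024-01-07
    # Trailing slash on the source: copy its contents, not the directory itself.
    run rsync -a "$source/" "$dest_root/$stamp/"
}

backup_containers root@192.0.2.10:/var/local/containers /mnt/backup/containers 2024-01-07
# prints: rsync -a root@192.0.2.10:/var/local/containers/ /mnt/backup/containers/2024-01-07/

# Example crontab line (02:30 every night), assuming this is saved as
# /usr/local/bin/backup-containers.sh with DRY_RUN=0 set inside it:
#   30 2 * * * /usr/local/bin/backup-containers.sh
```

One caveat: copying a live Postgres data directory file by file may not give a consistent database, which is where a pg_dump taken alongside the file copy earns its keep.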
