12. Services with containers
12.1. Motivation and plan
The ways in which we install web services have become so varied that it is now very cumbersome to keep track of which mechanism each one uses.
Who controls port 80 on a computer? How does that program decide which backend to pass a request to? How do php, flask, django, node.js, and other specific services all fit together?
On 2020-08-14 I had a conversation at the Portland breakfast club with some of the hackers there, who said that nowadays maybe the simplest way to do it is to handle the web entry point with nginx (a relatively new and very modern web server), and to have it farm requests out to a variety of docker containers.
Meanwhile on the docker container side, they tell me that there is a feature called “Docker Compose” which allows a simple and lightweight way of managing several containers, each of which offers a different service. Docker Compose is apparently simpler than using kubernetes to manage a bunch of containers and sort out which ones get traffic on the various http ports.
This chapter describes the experiments I did in trying to set it up.
The plan is to install nginx, and then to have at least two services in containers. Let’s say:
One for jenkins, provided by the jenkins team.
One for a program of my own which does some fun calculation of stuff on the earth’s surface and then offers it up as a RESTful web service.
12.2. Prerequisites
A clean setup without existing web servers, like apache, so that we can experiment.
12.3. Concepts
In the earliest days of the web you just had static html pages and cgi scripts. A single web server served up all of that.
But for a long time there have been separate services offered, with special programs that the web server needs to load up. For example, some web sites use the php programming language and environment, so somebody has to start up php and pass requests to it. Similarly for the django python-based environment.
On top of that, you might have a single web server which serves web pages for different domains. This means that a request for http://host1.domain1/dude.html should go to a different place from a request for http://host2.domain2/dude.html, even though the same web server is being asked to serve up the same page name dude.html.
Even with the same host, you might have a setup where http://host.domain/jenkins/ is sent to one program to process, and http://host.domain/cloud/ is sent to another.
I think of this as a sort of “multiplexing” that the web server has to handle: a URL request has to go to the appropriate technological infrastructure (php, python+django, python+flask, java+tomcat, so many others…), and to the correct resolved address.
In the grand old apache web server this is done with a mechanism called “virtual hosts”. In nginx, which we explore here, the mechanism is called “server blocks”.
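To make the multiplexing idea concrete, here is a hedged sketch of an nginx server block that sends different URL paths to different backend programs; the hostname, ports, and backends are made up for illustration:
server {
    listen 80;
    server_name host.domain;

    # requests under /jenkins/ go to a jenkins process listening on port 8080
    location /jenkins/ {
        proxy_pass http://127.0.0.1:8080;
    }

    # requests under /cloud/ go to some other backend listening on port 5000
    location /cloud/ {
        proxy_pass http://127.0.0.1:5000;
    }
}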
12.4. Installing the infrastructure
I will start this out on an ubuntu 20.04 machine which has personal desktop software running on it.
12.4.1. nginx first time installation
Remember that nginx is pronounced “engine-x”.
Start with this tutorial: https://www.digitalocean.com/community/tutorials/how-to-install-nginx-on-ubuntu-20-04 and also https://www.linuxjournal.com/article/10108
Install nginx and set up its firewall with:
sudo apt install nginx
sudo ufw enable
sudo ufw app list
# make sure that Nginx is on there:
sudo ufw allow 'Nginx HTTP'
sudo ufw allow 'Nginx HTTPS'
sudo ufw allow 'Nginx Full'
# check the status and see that Nginx is on
sudo ufw status
sudo ufw app list
# make sure that nginx is running
sudo systemctl start nginx
Now a connection to the URL localhost:80 or http://localhost/ or http://YOUR_IP_ADDRESS/ should work. (Remember: to find your IP address you can type ip addr)
12.4.2. nginx ongoing management
The usual daemon commands; this is how they are run in modern times:
sudo systemctl status nginx
sudo systemctl stop nginx
sudo systemctl start nginx
sudo systemctl restart nginx
sudo systemctl reload nginx
sudo systemctl disable nginx
sudo systemctl enable nginx
(Remember that in the past it would have been something like
/etc/init.d/nginx restart
and so forth. Nowadays instead we use
systemctl for all that.)
12.4.3. nginx configuring server blocks
Server blocks are what allow you to run several separate activities on the same web server.
The tutorial at https://www.digitalocean.com/community/tutorials/how-to-install-nginx-on-ubuntu-20-04 explains the ideas reasonably well in step 5, so I will just give the instructions here:
sudo mkdir -p /var/www/static-web.galassi.org/html
sudo mkdir -p /var/www/playing-around.galassi.org/html
sudo mkdir -p /var/www/learning-flask.galassi.org/html
# then give yourself permission to write to these areas:
sudo chown -R $USER:$USER /var/www/*.galassi.org/html
sudo chmod -R 755 /var/www/*.galassi.org
Put a simple minimalist web page in
/var/www/static-web.galassi.org/html
cat <<INDEXEOF > /var/www/static-web.galassi.org/html/index.html
<html>
  <head>
    <title>Dude, this is a mock galassi.org web page!</title>
  </head>
  <body>
    <h1>Dude, this is a mock galassi.org web page!</h1>
  </body>
</html>
INDEXEOF
Now make the server block that points to this:
sudo bash # become root to paste this in automatically
cat <<AVAILABLEEOF > /etc/nginx/sites-available/static-web.galassi.org
server {
    listen 127.0.0.1:80;
    listen [::]:80;

    server_name www.static-web www.static-web.galassi.org static-web static-web.galassi.org;

    location / {
        root /var/www/static-web.galassi.org/html;
        index index.html index.htm;
    }
}
AVAILABLEEOF
exit # get out of sudo
Creating the site (in /etc/nginx/sites-available) is not enough: you need to enable it by symlinking it into /etc/nginx/sites-enabled:
sudo ln -s /etc/nginx/sites-available/static-web.galassi.org /etc/nginx/sites-enabled/
Now there’s a technical detail which I have not fully understood, but we need to uncomment the nginx config file line that says “server_names_hash_bucket_size 64;” (nginx keeps server names in hash tables, and several long server names can overflow the default bucket size). You can do that with:
sudo cp /etc/nginx/nginx.conf /etc/nginx/nginx.conf-SAVE-`date --iso=seconds`
sudo sed -i 's/# server_names_hash_bucket_size 64;/server_names_hash_bucket_size 64;/' /etc/nginx/nginx.conf
nginx comes with a handy sanity test feature:
sudo nginx -t
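Once the configuration passes that test, tell the running nginx to pick up the new server block:
sudo systemctl reload nginx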
12.5. Installing jenkins docker image
https://hub.docker.com/r/jenkins/jenkins
docker pull jenkins/jenkins:lts
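To actually run it, the command suggested on that docker hub page is roughly the following (jenkins_home is a named volume so the configuration survives container restarts; 8080 is the web UI and 50000 is the agent port):
docker run -d --name jenkins -p 8080:8080 -p 50000:50000 -v jenkins_home:/var/jenkins_home jenkins/jenkins:lts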
12.6. Install rocketchat docker image
https://hub.docker.com/r/rocketchat/rocket.chat
docker pull rocketchat/rocket.chat:latest
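Note that Rocket.Chat is not standalone: it needs a MongoDB instance it can reach, and recent releases also want that MongoDB to be a replica set, so the docker-compose example on the docker hub page is the easier route. As a very rough sketch of running it by hand (the container names and the mongo version here are just examples):
docker run -d --name db mongo:4.0 --replSet rs0 --oplogSize 128
# after initiating the replica set inside the mongo container:
docker run -d --name rocketchat -p 3000:3000 --link db \
    -e MONGO_URL=mongodb://db:27017/rocketchat \
    -e MONGO_OPLOG_URL=mongodb://db:27017/local \
    -e ROOT_URL=http://localhost:3000 \
    rocketchat/rocket.chat:latest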
12.7. Prepare a web app docker image using flask
The entire docker setup can be found in this book’s source directory
under the directory flask-docker-learn
. The flask app in
app.py
is tiny, and the Dockerfile is pretty small: it starts from
an ubuntu 20.04 image, installs python3, pip3, flask, and then starts
running the flask app.
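The actual files are in the book’s source tree; as a rough sketch (not necessarily identical to what is in flask-docker-learn), the two pieces could look like this:

# Dockerfile (sketch)
FROM ubuntu:20.04
# avoid interactive tzdata prompts during the image build
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y python3 python3-pip
RUN pip3 install flask
WORKDIR /app
COPY app.py /app/app.py
EXPOSE 5000
CMD ["python3", "app.py"]

# app.py (sketch)
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hey, we have Flask in a Docker container!'

if __name__ == '__main__':
    # bind to 0.0.0.0 so the container's published port is reachable from outside
    app.run(host='0.0.0.0', port=5000)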
12.7.1. Build and run
cd flask-docker-learn
docker build -t flask-docker-learn:latest .
docker run -d -p 5001:5000 flask-docker-learn
Test that it works by using a web browser to go to the URL localhost:5001; you should see the phrase “Hey, we have Flask in a Docker container!” produced by this python code:
@app.route('/')
def hello_world():
    return 'Hey, we have Flask in a Docker container!'
12.8. Putting it all together with docker-compose
docker-compose is one of those examples of “the dust settling” in the world of web development. With thousands of containers available for all sorts of web service configurations, it is possible to get quite lost in how to make a few of those coordinate with each other. docker-compose coordinates the building and running of a group of related containers, all described in a single configuration file. Install it with:
sudo apt install docker-compose
We have a compose-learn directory here. In it you will find a docker-compose configuration file called docker-compose-3containers.yml:
version: '3'
services:
  flask-docker-learn:
    build: ../flask-docker-learn
    ports:
      - "5001:5000"
  web:
    build: .
    ports:
      - "5002:5000"
  redis:
    image: "redis:alpine"
Looking closely you will see that it sets up three services:
- flask-docker-learn: A simple project with the flask web backend. This container is built from ../flask-docker-learn/ and it will be mapped to listen on port 5001.
- web: An even simpler project with the flask web backend. This container is built from the current directory, and it will be mapped to listen on port 5002 (a sketch of this kind of app appears just after this list).
- redis: redis is needed by the simple web server called web in this directory, but it is trivially available as a container from the standard curated collection of containers that the docker team offers. That’s why there is a single line in the configuration file which basically says “pick up the redis container built on the lightweight alpine distribution”.
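For reference, here is a minimal sketch of the kind of flask-plus-redis app that web could be; it is illustrative, not necessarily the exact code in the compose-learn directory. Note that inside the compose network the redis service is reachable simply under the hostname redis:

from flask import Flask
import redis

app = Flask(__name__)
# "redis" is the service name from the docker-compose file; docker-compose
# makes it resolvable as a hostname on the compose network
cache = redis.Redis(host='redis', port=6379)

@app.route('/')
def hit_counter():
    # count how many times this page has been requested
    count = cache.incr('hits')
    return f'This page has been seen {count} times.\n'

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)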
You can build this system with:
docker-compose --file docker-compose-3containers.yml build
and then run it with:
docker-compose --file docker-compose-3containers.yml up
You will see a bunch of output from docker-compose, but the important
thing to notice here is that the two web servers we have are listening
on ports 5001 and 5002. The docker-compose setup has coordinated the
ports they answer on! You can hook up to localhost:5001
and
localhost:5002
to see that they are separate.
12.9. A more complex application to animate updating of maps
Flask tutorial (just one of many) at:
https://www.tutorialspoint.com/flask/flask_quick_guide.htm
First, choose a javascript framework. Here’s a review of react vs. vue:
https://medium.com/@kylemh/yet-another-react-vs-vue-article-a47b5946f1eb
More cartopy tutorials at:
https://data-flair.training/blogs/python-geographic-maps-graph-data/
Two approaches to reverse geographical information from https://stackoverflow.com/questions/6159074/given-the-lat-long-coordinates-how-can-we-find-out-the-city-country
An approach to finding political map location names using openstreetmap with Nominatim:
pip3 install geopy
# then in python3:
from geopy.geocoders import Nominatim, GeoNames
# note that with Nominatim you can only do one query per second
# geolocator = Nominatim(user_agent="give-a-name")
geolocator = GeoNames(username="demo") ## NOTE: you should make yourself an account
# location = geolocator.reverse("48.8588443, 2.2943506")
location = geolocator.reverse((48.8588443, 2.2943506))
print(location)
print(location.address)
print(location.raw)
And an approach with a simple list of cities:
- Download the cities database from http://download.geonames.org/export/dump/
- Add each city as a lat/long -> City mapping to a spatial index such as an R-Tree (some DBs also have the functionality)
- Use nearest-neighbour search to find the closest city for any given point
Advantages:
- Does not depend on an external server to be available
- Very fast (easily does thousands of lookups per second)
Disadvantages:
- Not automatically up to date
- Requires extra code if you want to distinguish the case where the nearest city is dozens of miles away
- May give weird results near the poles and the international date line (though there aren't any cities in those places anyway)
Also note that http://geonames.org/ has a RESTful API: http://api.geonames.org/findNearbyJSON?lat=9.9271&lng=-84.082&username=demo
If you download cities500.txt (all cities with more than 500 people) from https://download.geonames.org/export/dump/ then you can parse it yourself.
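If you do parse it yourself, here is a hedged sketch (the column positions follow the geonames description of the dump format) that loads the city names and coordinates and does a brute-force nearest-city lookup; a real version would put the points in an R-tree or k-d tree instead of scanning linearly:

import math

def load_cities(path='cities500.txt'):
    """Load (name, lat, lon) tuples from a geonames dump file."""
    cities = []
    with open(path, encoding='utf-8') as f:
        for line in f:
            fields = line.rstrip('\n').split('\t')
            # geonames columns: 0=geonameid, 1=name, 2=asciiname,
            # 3=alternatenames, 4=latitude, 5=longitude, ...
            cities.append((fields[1], float(fields[4]), float(fields[5])))
    return cities

def nearest_city(cities, lat, lon):
    """Brute-force nearest neighbour using the haversine distance (km)."""
    def haversine(lat1, lon1, lat2, lon2):
        lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
        a = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 2 * 6371 * math.asin(math.sqrt(a))
    return min(cities, key=lambda c: haversine(lat, lon, c[1], c[2]))

# example: which city in the file is closest to the Eiffel tower coordinates?
# cities = load_cities()
# print(nearest_city(cities, 48.8588443, 2.2943506))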
12.10. Adding that fourth project to our docker-compose setup
I have a file called docker-compose-4containers.yml which you can look at. It pulls together the previous trivial web services on ports 5001 and 5002, like before, and adds, on port 5003, our containerized mapping web server:
version: '3'
services:
  flask-docker-learn:
    build: ../flask-docker-learn
    ports:
      - "5001:5000"
  web:
    build: .
    ports:
      - "5002:5000"
  flask-docker-map-app:
    build: ../map-app
    ports:
      - "5003:5000"
  redis:
    image: "redis:alpine"
As we did above with the 3 containers, you can build this 4-container system with:
docker-compose --file docker-compose-4containers.yml build
and then run it with:
docker-compose --file docker-compose-4containers.yml up
You will see a bunch of output from docker-compose, but the important
thing to notice here is that the three web servers we have are
listening on ports 5001 and 5002 and 5003. The docker-compose setup
has coordinated the ports they answer on! You can hook up to
localhost:5001
and localhost:5002
and much more interestingly
to localhost:5003
to see that they are separate.
12.11. Putting the flask containers behind nginx
This tutorial: https://dev.to/ishankhare07/nginx-as-reverse-proxy-for-a-flask-app-using-docker-3ajg seems to be just the right one.
They go even further and use nginx containers, although in this situation I did not do that because I wanted to serve content on my machine with nginx.
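As a hedged sketch of the host-side approach, the proxying server block could look like the following, dropped into /etc/nginx/sites-available/ and symlinked into sites-enabled just as we did for the static site; the server name is reused from our earlier mkdir experiments purely for illustration, and port 5001 is the one our compose setup publishes for flask-docker-learn:

server {
    listen 80;
    server_name learning-flask.galassi.org;

    location / {
        # hand requests off to the flask container published on port 5001
        proxy_pass http://127.0.0.1:5001;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}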
12.12. Additional resources
How do you clean up all those huge containers and images? There is a good writeup of this at:
https://medium.com/better-programming/docker-tips-clean-up-your-local-machine-35f370a01a78
and
https://linuxize.com/post/how-to-remove-docker-images-containers-volumes-and-networks/
https://docs.docker.com/engine/reference/commandline/rmi/
https://docs.docker.com/engine/reference/commandline/image_rm/
https://www.digitalocean.com/community/tutorials/how-to-remove-docker-images-containers-and-volumes
And how do you share disk space between the host and the container? At build time you can’t do much, because the build has to be very clearly segregated from anything except the “build context” (the directory from which you run “docker build”).
At run time you can use the -v option, which works quite well to map host and container paths. But there is a wealth of other suggestions at this stackoverflow answer:
https://stackoverflow.com/a/39382248/693429
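For example (the host path here is made up), you could publish the flask port and share a host directory into the container like this:
docker run -d -p 5001:5000 -v /home/myuser/shared-data:/data flask-docker-learn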
A nice discussion of how to keep your docker images small:
https://opensource.com/article/18/7/building-container-images
Discussions of “who is logged in”:
https://jtreminio.com/blog/running-docker-containers-as-current-host-user/
https://medium.com/redbubble/running-a-docker-container-as-a-non-root-user-7d2e00f8ee15
An article with a trick to avoid COPY and ADD, in favor of having an ad-hoc web server.
https://medium.com/ncr-edinburgh/docker-tips-tricks-516b9ba41aa2
An article with commands, tips, and tricks; it mentions docker-compose:
https://medium.com/@clasikas/docker-tips-tricks-or-just-useful-commands-6e1fd8220450