25. The Traveling Salesman

[status: content-mostly-written]


One of the most important things done by computer software is to find approximations to mathematical problems that are too complex to carry out exactly. We will examine one of these problems, the Traveling Salesman Problem (TSP), which is easy to formulate and has many practical applications.


  • The 10 hour “serious programming” course.

  • A GNU/Linux system with python3 and python3-tk installed (on an Ubuntu 16.04 system this can be done with sudo apt install python3-tk)

  • The mini-course on drawing on canvases, in Section 24


We will start by formulating what the problem looks like, then we will discuss the general idea of optimization, after which we will write programs in Python that find an easy (but not very good) solution with the “greedy” algorithm. Finally we will write a program which uses the “hill climbing” algorithm to find an approximate solution.

Throughout these examples we will use the simple drawing techniques we learned in Section 24 to show rather cute animations of our paths through the cities.

And this video snippet, which discusses Buddy Holly’s “Winter Dance Tour”, mentions the poorly chosen route through the midwest: https://youtu.be/NFdWrwbxNms?t=214 – the cities were, in this order: Milwaukee WI, Kenosha Wi, Mankato MN, Eau Claire Wi, Montevideo MN, St. Paul MN, Davenport IA, Fort Dodge IA, Duluth MN, Green Bay WI, Clear Lake IA, Moorhead MN, Sioux City IA, Des Moines IA, Cedar Rapids IA, Spring Valley IL, Chicago IL, Waterloo IA, Dubuque IA, Louisville KY, Canton OH, Youngstown OH, Peoria IL, Springfield IL.


25.1. Cities and path lengths

Our cities are a list of \((x, y)\) coordinates within the canvas size. A future improvement would be to use realistic latitude and longitude values and fit those into a canvas.

The length of the path is the sum of the space between each successive pair of cities, and the distance between two cities can be calculated with the Pythagoras theorem (IMPROVEME: a simple picture would be nice here). If the cities have coordinates \((x_1, y_1)\) and \((x_2, y_2)\), then the x and y distances between them are \((x_2-x_1)\) and \((y_2-y_1)\), and the distance “as the crow flies” is given by the Pythagorean formula:

\[d = \sqrt{(x_2-x_1)^2 + (y_2-y_1)^2}\]

If we were to represent the coordinates of a city in Python with a simple pair then we might represent two cities like this:

c1 = (14, 182)
c2 = (71, 50)

We might then write a function to calculate the distance between them like this:

import math
## ...
def distance(city1, city2):
    x1 = city1[0]   ## coordinates of the first city
    y1 = city1[1]
    x2 = city2[0]   ## coordinates of the second city
    y2 = city2[1]
    r = math.sqrt((x2-x1)**2 + (y2-y1)**2)
    return r

Once we have this function we could calculate the distance like this:

c1 = (14, 182)
c2 = (71, 50)
d = distance(c1, c2)
print('The distance between the cities (%d, %d) and (%d, %d) is %g'
      % (c1 + c2 + (d,)))

The output would look like this:

The distance between the cities (14, 182) and (71, 50) is 143.781

To calculate the distance of the entire path we would add together all the distances from the first city to the second, from the second to the third, and so on until the last one. Finally we would add the “return home” path length from the last city back to the first.

25.2. Solving the Traveling Salesman Problem

What does it mean to “solve” the TSP? It means to find the shortest path between the cities. We will see that for larger sizes this is not practically possible, but we will try to do better than a random path, and to take some steps to further improve our solutions.

25.2.1. A digression on optimization

Show pictures of a search in 1d and 2d, possibly with simple 1-d and 2-d gaussian pictures.

The optimization task we are looking at (the traveling salesman problem) is an interesting one, but let us digress to look at some simpler optimization tasks for which we can make interesting plots.

Let us examine the following gaussian function \(e^{(-(x-3)^2)}\) which you might remember from Section 4.7:

$ gnuplot
gnuplot> set samples 400
gnuplot> plot exp(-(x-3)**2)
gnuplot> set terminal svg
gnuplot> set output 'single-hill-1d.svg'
gnuplot> replot
gnuplot> quit

Note that you could have similar function that points downward:

$ gnuplot
gnuplot> plot -exp(-(x-3)**2)

In the first case our function has maximum, and in the second case it has a minimum. The word optimum can refer to one or the other.


Figure 25.2.1 An example of a one dimensional hill-shaped function. Note the single optimum: the top of the hill for \(x = 3\)

Now try a two dimensional function with a clear single hill and single optimum:

$ gnuplot
gnuplot> set pm3d
gnuplot> set samples 150
gnuplot> set isosamples 60
gnuplot> set hidden3d
gnuplot> splot exp(-(x-1)**2 - (y-0.5)**2)
gnuplot> set terminal svg
gnuplot> set output 'single-hill-2d.svg'
gnuplot> replot
gnuplot> quit

Figure 25.2.2 An example of a two dimensional hill-shaped function. Note the single optimum: the top of the hill for \((x, y) = (1, 0.5)\)

In Figure 25.2.1 and Figure 25.2.2 we see that some optimization problems are as simple as climbing to the top of a single nearby hill. There are no other hills to confuse us and it should be simple to find the optimal value of x (in one dimension) or of x and y (in two dimensions).

These figures give a clear idea of what we are looking for in optimization when we optimize a function of x or of x and y. The picture is not so pretty when we try to optimize something more complicated: there is no such visualization of the distance in the traveling salesman problem, since city routes cannot be plotted as an x axis or as x and y axes. When we get to it, we will visualize the improvement in the TSP by showing an animation of the path through the cities.

25.2.2. Generating and visualizing lists of cities

Examine and type in the program in Listing 25.2.1

Listing 25.2.1 Program which draws a simple set of random cities on a canvas, in file cities_simple.py.
#! /usr/bin/env python3

"""This program demonstrates basic generation of a canvas: it makes a
list of random city coordinates, then draws them (with paths) on the

import random

## we use the tkinter widget set; this seems to come automatically
## with python3 on ubuntu 16.04, but on some systems one might need to
## install a package with a name like python3-tk
from tkinter import *

canvas_width = 640
canvas_height = 480
n_cities = 25

def main():
    ## prepare a basic canvas
    root = Tk()
    w = Canvas(root, 
    w.pack()       # boiler-plate: we always call pack() on tk windows
    city_list = make_random_cities(0, canvas_width-1, 0, canvas_height-1, n_cities)
    draw_city_path(w, city_list)

def draw_city_path(w, city_list):
    """draws lines between the cities"""
    for city in city_list:
        draw_city(w, city[0], city[1])
    draw_city(w, city_list[0][0], city_list[0][1], color='green', name='Home')
    ## now draw lines between them
    for i in range(len(city_list)-1):
        w.create_line(city_list[i][0], city_list[i][1], 
                      city_list[i+1][0], city_list[i+1][1])
    ## now draw a line that goes from the last city back to our home
    w.create_line(city_list[-1][0], city_list[-1][1], 
                  city_list[0][0], city_list[0][1])

def make_random_cities(xmin, xmax, ymin, ymax, n_cities):
    """returns a list of randomly placed cities in the given rectangle"""
    city_list = []
    for i in range(n_cities):
        x = random.randint(xmin, xmax)
        y = random.randint(ymin, ymax)
        city_list.append((x, y))
    return city_list

def draw_city(w, x, y, color='yellow', name=None):
    """draws a city; if a name is given also writes the name of it"""
    w.create_oval(x-5, y-5, x+5, y+5, fill=color)
    ## if a name was given, write in the name
    if name:
        w.create_text(x, y+10, text=name)

Exercise 25.1

Modify cities_simple.py to print some information about the path of these cities at the bottom of the canvas. (An example can be downloaded in cities_simple_with_info.py.)

25.2.3. Animating the drawing of cities

We can modify the program in Listing 25.2.1 quite straightforwardly to modify some information about the list of cities and animate the city drawing to visualize those changes.

You can download this program cities_animated.py and try running it to see how it works.

Exercise 25.2

Modify cities_animated.py to keep the starting point of the search.

Exercise 25.3

Write a function which calculates the distance between two cities, then write a function which calculates the total length of a list of cities.

Exercise 25.4

Modify cities_simple.py and cities_animated.py to print the total length of the path and other interesting information at the bottom of the screen. (An example can be downloaded in this file: cities_animated_with_info.py.)

25.3. Improvements to the route

25.3.1. Before you start

25.3.2. Impossibile to compute the optimal solution

People will tell you that the traveling salesman problem cannot be solved by “direct attack” because there are too many possible paths for a computer to explore them all.

Let us look in to this with pencil and paper in hand and see if we agree.

Draw three cities on a sheet of paper, pick a first city. Draw all paths that start with that first city and eventually get you back home to it. There should be 2 different paths.

Now do the same with 4 cities. There should be 6 paths.

Try to do the same with 5 cities. There should be 24 possible paths.

In general the equations is:

\[n_{paths}(n_{cities}) = (n_{cities}-1)! = (n_{cities}-1)\times(n_{cities}-2)\times ... \times 3 \times 2 \times 1\]

where the -1 in \(n_{cities}-1\) comes in because at the end of the run you always return to the given home city.

Now try to argue with yourself and with your partners to convince everyone that this factorial formula works for larger \(n_{cities}\). And how rapidly does \((n_{cities}-1)!\) grow? Remember from Section 20.4.1 that it grows at an amazingly fast rate:

\[\begin{split}2! &= 2 \times 1 = 2 \\ 3! &= 3 \times 2 \times 1 = 6 \\ 4! &= 4 \times 3 \times 2 \times 1 = 24 \\ 5! &= 5 \times ... \times 2 \times 1 = 120 \\ 6! &= 6 \times ... \times 2 \times 1 = 720 \\ 7! &= 7 \times ... \times 2 \times 1 = 5040 \\ 8! &= 8 \times ... \times 2 \times 1 = 40320 \\ 9! &= 9 \times ... \times 2 \times 1 = 362880 \\ 10! &= 10 \times ... \times 2 \times 1 = 3628800 \\ &... \\ 20! &= 10 \times ... \times 2 \times 1 = 2432902008176640000 \\ &... \\ 32! &= 32 \times 31 \times ... \times 2 \times 1 = 263130836933693530167218012160000000\end{split}\]

So \(32!\) is a number with 35 digits, approximately \(2.63131\times 10^{35}\), so clearly we cannot hope for a computer to look at all possible paths. Thus we look for approximate ways of doing it.

25.3.3. Greedy algorithm

One algorithm that comes to mind is to always travel to the city that is closest to you and that you have not yet visited. This is not the best way, but it is usually not the worst way either.

Exercise 25.1

With pen and paper draw short routes (4 or 5 cities) and solve them with the greedy algorithm. Discuss with your partners if this is the optimal solution or not.

Exercise 25.2

You can download this program tsp_solution_greedy.py and try running it to see how it works. Change the number of cities to be quite small and quite big.

Exercise 25.3

Discuss with your partners what the term deterministic algorithm might mean, and whether the greedy algorithm is deterministic.

Exercise 25.4

With your partners try to figure out (with pen and paper) how many times the greedy program calculates the distance between two cities. This is called the computational complexity of the algorithm. This number should be a function of n_cities.

Exercise 25.5

Modify the program to count how many times it calculates the distance between two cities. Then run it with a variety of different values for n_cities.

Exercise 25.6

Try to come up with a layout of cities in which the greedy algorithm performs very poorly. After you have tried conjuring a set of your own, you may look at [Bir15] for some examples.

Exercise 25.7

Try swapping triplets of cities instead of pairs and see if it does better. A couple of things to consider as you try this:

  • Triplets might make it harder to reach an optimum because sometimes the best improvement in hill-climbing might come from a simple 2-point swap. So you might need to explore sometimes swapping 2, sometimes 3, maybe even more.

  • To compare 2-swaps or 3-swaps or even more, you might want to set the random number seed to a fixed value at the start so that you can reproduce the same layout of cities. In python you can do this with random.srandom(1234) near the start of the program.

Exercise 25.8

Learn about the “basin hopping” algorithm provided by the scipy python library, and see if it can be used to solve the traveling salesman problem. Write a program which uses this method and run it alongside the program we wrote which uses the hill climbing algorithm.

25.3.4. A digression on hill climbing Before you start

25.3.5. Hill climbing for the traveling salesman problem

So what is hill-climbing? It is a simple algorithm (expressed for when we seek a maximum):

  1. Start at a certain location (random, or you can pick it if you know something about the landscape).

  2. Try taking a step in a random direction.

  3. Calculate your function. If it is bigger, take that step; if it is smaller, go back to where you were.

If we try to climb up the hills shown in above in Figure 25.2.1 and Figure 25.2.2 we find it quite straightforward: there is a single peak and we will find it easily.

Now look at different type of hilly functions that we might need to optimize. For example in one dimension we might look at the function \(\sin(6*x)*e^{(x-3)^2}\)

We can generate a plot with:

gnuplot> set samples 400
gnuplot> plot[-4:] sin(6*x) * exp(-(x-3)**2)
gnuplot> set terminal svg
gnuplot> set output 'multiple-hills-1d.svg'
gnuplot> replot
gnuplot> quit

Figure 25.3.1 A one dimensional example of multiple hills with different heights. The global maximum is for a value of x near 3.3.

Now look at a 2-dimensional situation where the function could be \(e^{-(x^2+y^2)/7} \times \cos(2.5\times \sqrt{x^2 + y^2})\)

set pm3d
set samples 250
set isosamples 80
set hidden3d
splot exp(-(x**2 + y**2)/7.0) * cos(2.5*sqrt(x*x + y*y))
set terminal svg
set output 'multiple-hills-2d.svg'

Figure 25.3.2 A two dimensional example of multiple maxima. The global maximum is near \((x, y) = (0, 0)\).

In the functions show in Figure 25.3.1 and Figure 25.3.2 we will run in to the problem of climbing to a local maximum. This is similar to when you go hiking and it’s foggy and you think you reached the peak of the main mountain, but then the sky clears and you realize that you got stuck on one of the lesser peaks.

Another thing to say about the hill-climbing algorithm is that it is a stochastic approach. This means that some of the decisions made by the algorithm involve generating random numbers.

You can download and run the program tsp_solution_hillclimbing.py. Examine the source code, then run it a few times.

You will see that it starts out with a random and rather chaotic path through the cities. Then it will attempt to swap two cities picked randomly in the path and that swap will stick when the new path is shorter than the previous one. Exercises

  • Use the terminal output from tsp_solution_hillclimbing.py to make a plot of the path distance as a function of how many steps have been taken. You could do so by redirecting the output into a file called hill.out and using gnuplot to plot columns 2 and 3.

  • Discuss with your class how far apart are the cities get swapped successfully. Are they near each other on the list? Are they far apart? Are they near each other in distance? Does that change as we get further in the run?

  • Add more information to the print() statement to show how far apart the swapped cities are in the list and in distance. Plot how those distances change in the course of a search.

25.3.6. Further study

These lectures on youtube are at a more advanced level than what we do here.