1.2. Solving a Problem – Algorithms#
Let us return to the definition of operations research given in the previous handout. One aspect that we did not touch upon is that we require the mathematical model be analyzed, so it can provide guidance in the given decision-making setting. For an optimization model, the element is thereby an algorithm, a computation procedure that can solve the task at hand. And just as we have described a generic computational task, there should also be a generic procedure that can handle all possible inputs. One can think of this as a sufficiently robust software package that can handle any of the inputs you might “reasonably” want to solve.
For the traveling salesman problem, there is a quite simple algorithm that is guaranteed to produce an optimal solution. One might “simply” enumerate all possible permutations, one after the other, evaluate the cost associated with each permutation, and then output the permutation with the cheapest cost. This certainly works correctly for any input. So are we done? Actually, no. The problem is that there are too many permutations for this to be of much practical use. For \(n=100\), how many permutations are there? There are 100 choices for the first city, and for each of those there are 99 choices for the second, and then 98 for the third, and so forth. So it seems like there are \(100!\) (100 factorial). However, this is a bit overdone because recall that it does not matter which city we start at, so in fact the same tour is counted 100 times. But \(99!\) is still big enough. It is greater than \(10^{155}\), which is more than the number of atoms in our universe.
Therefore, the simple algorithm is too slow. There are a few options to consider. One approach is to develop much more sophisticated algorithms. Throughout this course, you will learn many of the elements used by the best solver for the traveling salesman problem, a package called Concorde (which you can install on your iPhone - you are encouraged to do so). Concorde can easily solve inputs with 1000’s or even 10,000 cities, but even that code has its limits.
Another approach is to design simple algorithms that do not necessarily produce an optimal solution but instead produce good solutions. One natural one is called the nearest neighbor rule: start at one city and iteratively choose the next city as the one not yet visited that is currently nearest. There are a number of simple algorithms such as this one that will be demonstrated below.
Heuristics#
A heuristic aims to find a good feasible solution to a problem although it is not guaranteed to be optimal. We will consider 4 different TSP heuristics. Before we move on, we will abstract the TSP to finding an optimal tour on a set of nodes rather than cities. These nodes can represent anything.
To compare the heuristics, we will use a simple 6x8 grid of nodes. The cell below creates this instance. Note that we will initially use the Manhattan distance.
Show code cell content
# Imports
import urllib.request
urllib.request.urlretrieve('https://engri-1101.github.io/textbook/modules/tsp.py', "tsp.py")
from tsp import *
urllib.request.urlretrieve('https://engri-1101.github.io/textbook/data/tsp/optimal_tours.pickle', "optimal_tours.pickle")
import vinal as vl
import pandas as pd
from IPython.display import Image
from bokeh.io import output_notebook, show
output_notebook()
G = vl.grid_instance(6, 8, manhattan=True)
1. Random Neighbor: Start at some node. Randomly select one of the nodes which has not been visited to visit next. Continue doing so until all nodes have been visited. Return to the start.
Let’s use random neighbor (a terrible heuristic) to get a baseline for the cost of a tour. We will use a function called plot_tsp_heuristic
to see the mechanics of the algorithm. Run the cell, and use the Previous
and Next
buttons to move through the iterations of the algorithm. The tour cost will update in the bottom left. done.
will appear in the bottom right when the heuristic has finished.
show(vl.tsp_heuristic_plot(G, 'random_neighbor', i=0))
To view the complete tour right away, we can use the function random_neighbor
to run the random neighbor heuristic and the function plot_tour
to plot the tour and its cost in the lower left.
tour = vl.random_neighbor(G)
# 'tour' is an ordered list of the nodes starting and ending at the same node
print(tour)
show(vl.tour_plot(G, tour))
Q: Does this look like a good tour to you? Run it a few times, and see what the average tour cost is.
2. Nearest Neighbor: Start at some node. Visit the closest unvisited node next (if there are multiple closest nodes, choose one randomly). Continue doing so until all nodes have been visited. Return to the start.
Now, let’s look at the nearest neighbor heuristic.
show(vl.tsp_heuristic_plot(G, 'nearest_neighbor', i=0))
Q: As you iterate through, examine the “choices” made by the algorithm at each step. What does it do well? What does it do poorly?
Q: Run this a few times. Do you get the same tour every time? Why or why not?
3. Nearest Insertion: Start with a “tour” on two of the nodes (e.g., the closest pair of nodes). Find the closest unvisited node to any node currently in the tour. Insert the node into the tour at the best place (if there are multiple closest nodes, choose one to add randomly).
Now, let’s look at the nearest insertion heuristic.
show(vl.tsp_heuristic_plot(G, 'nearest_insertion', initial_tour=[0,1,0]))
Q: Run this a few times. How does it compare to the previous heuristics?
4. Furthest Insertion: Start with a “tour” on two of the nodes (e.g., the closest pair of nodes). Find the node whose smallest distance to a node already in the tour is maximized. Insert the node into the tour at the best place (if there are multiple furthest nodes, choose one to add randomly).
Now, let’s look at the furthest insertion heuristic.
show(vl.tsp_heuristic_plot(G, 'furthest_insertion', initial_tour=[0,47,0]))
Q: Run this a few times. How does it compare to the previous heuristics?
To compare the heuristics further, let’s run each on the 6x8 grid say, 250 times.
Q: Now that you have seen each heuristic, which do you think will do the best, and which will do the worst?
# Run each heuristic n times
n = 250
random_neighbor_total = 0
nearest_neighbor_total = 0
nearest_insertion_total = 0
furthest_insertion_total = 0
for i in range(n):
random_neighbor_total += vl.tour_cost(G, vl.random_neighbor(G))
nearest_neighbor_total += vl.tour_cost(G, vl.nearest_neighbor(G))
nearest_insertion_total += vl.tour_cost(G, vl.nearest_insertion(G))
furthest_insertion_total += vl.tour_cost(G, vl.furthest_insertion(G))
print("Heuristic Averages:")
print("Random Neighbor: %s" % (random_neighbor_total / n))
print("Nearest Neighbor: %s" % (nearest_neighbor_total / n))
print("Nearest Insertion: %s" % (nearest_insertion_total / n))
print("Furthest Insertion: %s" % (furthest_insertion_total / n))
Q: What were the results? Was this what you expected?
Let’s look at a 9x9 grid using the Euclidean distance now. Run each of the cells below to see each heuristic executed on the new instance.
G = vl.grid_instance(9, 9, manhattan=False)
# Random neighbor
tour = vl.random_neighbor(G)
show(vl.tour_plot(G, tour))
# Nearest neighbor
tour = vl.nearest_neighbor(G, i=0)
show(vl.tour_plot(G, tour))
# Nearest insertion
tour = vl.nearest_insertion(G, initial_tour=[0,1,0])
show(vl.tour_plot(G, tour))
# Furthest insertion
tour = vl.furthest_insertion(G, initial_tour = [0,80,0])
show(vl.tour_plot(G, tour))
Q: How did the results compare to the 6x8 grid using Manhattan distances?
Again, let’s run each heuristic numerous times and see how they compare.
n = 100
random_neighbor_total = 0
nearest_neighbor_total = 0
nearest_insertion_total = 0
furthest_insertion_total = 0
for i in range(n):
random_neighbor_total += vl.tour_cost(G, vl.random_neighbor(G))
nearest_neighbor_total += vl.tour_cost(G, vl.nearest_neighbor(G))
nearest_insertion_total += vl.tour_cost(G, vl.nearest_insertion(G))
furthest_insertion_total += vl.tour_cost(G, vl.furthest_insertion(G))
print("Heuristic Averages:")
print("Random Neighbor: %s" % (random_neighbor_total / n))
print("Nearest Neighbor: %s" % (nearest_neighbor_total / n))
print("Nearest Insertion: %s" % (nearest_insertion_total / n))
print("Furthest Insertion: %s" % (furthest_insertion_total / n))
Q: How did the results compare to the 6x8 grid using Manhattan distances?
Improving Tours: 2-OPT#
Above we used heuristics to create TSP tours. However, we can also use heuristics to try and improve tours we have already found. First, let’s think about how we may try improving a tour.
# Load a new tour
G = vl.grid_instance(4,4)
tour = [6,9,8,12,13,14,15,11,10,5,4,0,1,2,3,7,6]
show(vl.tour_plot(G, tour, width=300, height=300))
Q: Look at the tour above. How might you improve this tour? How could you generalize this strategy?
We will examine a tour improvement heuristic called 2-OPT in this part. 2-OPT looks for pairs of edges which can be reconnected to strictly improve the tour cost. (Note: There is only one way to reconnect a pair of edges.) It continues in this fashion until no more improvements can be made. Let’s run 2-OPT on the 4x4 example. We will use plot_two_opt
to generate a visualization of 2-OPT. In each iteration, 4 edges will be highlighted red: 2 solid and 2 dotted. The solid edges indicate the current position, and the dotted indicate the positon they will be reconnected in.
show(vl.tsp_heuristic_plot(G, '2-OPT', tour=tour, width=300, height=300))
Now, we will run 2-OPT after the nearest neighbor heuristic on a 5x5 (Euclidean distance) example.
# First, we run the nearest neighbor heuristic to get an initial tour
G = vl.grid_instance(5,5, manhattan=False)
tour = vl.nearest_neighbor(G)
show(vl.tour_plot(G, tour))
# Run 2-OPT
show(vl.tsp_heuristic_plot(G, '2-OPT', tour=list(tour)))
Q: Run 2-OPT a few times. Do you get the same result every time? Why or why not?
Let’s run 2-OPT a few times on a slightly larger grid.
G = vl.grid_instance(9,9, manhattan=False)
tour = vl.nearest_neighbor(G)
show(vl.tsp_heuristic_plot(G, '2-OPT', tour=list(tour)))
Q: After running 2-OPT, do you ever get a tour that crosses itself? When using Euclidean distances, is this even possible? Explain why or why not.
Let’s compare the heuristics (besides random neighbor) with and without executing 2-OPT on a 6x6 example. While we are at it, let’s compare them to the optimal solution which has already been computed.
G = vl.grid_instance(6,6, manhattan=False)
tour = optimal_tour('6x6_grid')
optimal_cost = vl.tour_cost(G, tour)
show(vl.tour_plot(G, tour))
n = 50
nearest_neighbor_total = 0
nearest_insertion_total = 0
furthest_insertion_total = 0
nearest_neighbor_2_total = 0
nearest_insertion_2_total = 0
furthest_insertion_2_total = 0
optimal_total = 0
for i in range(n):
nearest_neighbor_total += vl.tour_cost(G, vl.nearest_neighbor(G))
nearest_insertion_total += vl.tour_cost(G, vl.nearest_insertion(G))
furthest_insertion_total += vl.tour_cost(G, vl.furthest_insertion(G))
nearest_neighbor_2_total += vl.tour_cost(G, vl.two_opt(G, vl.nearest_neighbor(G)))
nearest_insertion_2_total += vl.tour_cost(G, vl.two_opt(G, vl.nearest_insertion(G)))
furthest_insertion_2_total += vl.tour_cost(G, vl.two_opt(G, vl.furthest_insertion(G)))
print("Nearest Neighbor: %s" % (nearest_neighbor_total / n))
print("Nearest Neighbor + 2-OPT: %s" % (nearest_neighbor_2_total / n))
print("Nearest Insertion: %s" % (nearest_insertion_total / n))
print("Nearest Insertion + 2-OPT: %s" % (nearest_insertion_2_total / n))
print("Furthest Insertion: %s" % (furthest_insertion_total / n))
print("Furthest Insertion + 2-OPT: %s" % (furthest_insertion_2_total / n))
print("Optimal: %s" % (optimal_cost))
Q: Compare the heuristics by their before and after 2-OPT performance. Compare them to the optimal.
For fun, let’s go back to the 23 US city example. Let’s run 2-OPT on the tour you created in the previous section (or a new one if you would like). To do this, you will need to define the tour as follows:
nodes = pd.read_csv('https://engri-1101.github.io/textbook/data/tsp/us_cities_23.csv', index_col=0)
G = vl.create_network(nodes, manhattan=False)
urllib.request.urlretrieve('https://engri-1101.github.io/textbook/other/us.png', "us.png")
show(vl.create_tour_plot(G, width=600, height=375, image='us.png'))
Q: Set the tour
variable to be the tour you manually created.
# We can define a tour like this:
tour = [22,21,19,20,18,15,17,16,14,13,10,12,11,8,1,2,3,4,9,7,0,6,5,22]
# After manually creating a tour, you can copy the list associated with that tour from the bottom-right
# TODO: Define your tour.
Run 2-OPT!
show(vl.tsp_heuristic_plot(G, '2-OPT', tour=list(tour), width=600, height=375, image='us.png'))
Q: Did 2-OPT improve your tour? By how much?
Now, let’s look at an optimal solution.
tour = optimal_tour('us_cities_23')
show(vl.tour_plot(G, tour, width=600, height=375, image='us.png'))
Q: Was your tour optimal before or after 2-OPT?