Constructive Heuristics in Discrete Optimization
Discrete or Combinatorial Optimization is a major study area in Operations Research (OR) and Computer Science, dedicated to identifying the best (or a suitable) solution from a finite set of possibilities. Its applications include vehicle routing, operations scheduling, and network design, and the corresponding problems often cannot be tackled by exact approaches in tractable runtimes. Heuristics are therefore an interesting alternative, providing good-quality solutions to guide operations in reasonable computing time.
Beyond their use as stand-alone techniques, constructive heuristics can be coupled with other algorithms to improve their runtimes, cost functions, or other performance aspects. For instance, providing an initial feasible solution to a Mixed-Integer Programming (MIP) solver establishes a primal bound that helps to prune the search space. Additionally, this initial solution can enable the solver to incorporate local search heuristics more effectively, potentially leading to faster convergence and better overall solution quality.
Discrete Optimization
In a broad sense, a numerical optimization problem aims to find the best value of an objective function of decision variables, possibly subject to equality and inequality constraints that are themselves functions of those variables. The objective can be defined in either a minimization or a maximization sense.
Discrete optimization refers to the category of optimization problems in which the decision variables can assume only discrete values. One therefore faces a finite (although possibly very large) set of candidate solutions, from which a feasible one leading to the best objective value must be selected.
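In generic notation (the symbols below are placeholders rather than tied to any specific application), such a problem can be stated as:

```latex
\begin{aligned}
\min_{x} \quad & f(x) \\
\text{s.t.} \quad & g_i(x) \leq 0, \quad i = 1, \dots, m \\
& h_j(x) = 0, \quad j = 1, \dots, p \\
& x \in D
\end{aligned}
```

where f is the objective function, g_i and h_j are the inequality and equality constraints, and, in the discrete case, D is a finite set such as {0, 1}^n or a bounded subset of the integers.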
Solution Construction
Many algorithms for combinatorial optimization problems build a solution incrementally from scratch: at each step, a single ground set element is added to the partial solution under construction. An element can only be added if its combination with the previously added elements does not lead to infeasibility.
The choice of the next element to include in the solution can vary according to the problem and the strategy adopted. In some situations, choosing the element with the best immediate effect on the partial solution is an interesting alternative; in others, some randomness might be desirable. We will compare both approaches on two different problems in the remainder of this article, following the generic template sketched below.
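In abstract terms, the construction process can be sketched as follows. The helper arguments select_next and is_feasible are hypothetical placeholders for the selection rule and feasibility check of a concrete problem:

```python
def constructive_heuristic(ground_set, select_next, is_feasible):
    """Grow a partial solution one ground set element at a time."""
    solution = []
    candidates = [e for e in ground_set if is_feasible(solution + [e])]
    while candidates:
        element = select_next(candidates, solution)  # greedy or randomized rule
        solution.append(element)
        # keep only elements that remain feasible with the updated partial solution
        candidates = [e for e in candidates
                      if e != element and is_feasible(solution + [e])]
    return solution
```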
The Knapsack Problem
In the knapsack problem, we are given a set of items with individual weights and values, and we must select the most valuable ones to include in a knapsack of pre-defined capacity, in such a way that the total weight of the selected items does not exceed that capacity. In this problem, we can think of the available items as our ground set.
A naive solution procedure can iterate over the set of items and include the next item in the solution whenever its weight is less than or equal to the remaining capacity. However, this method can lead to poor-quality solutions. Suppose the beginning of our list contained a heavy item with a small value: it would be included in the solution, occupying available space that more valuable alternatives could have used.
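A minimal sketch of this naive procedure, assuming items are given as (weight, value) tuples (the representation and function name are my choices):

```python
def naive_knapsack(items, capacity):
    """Scan items in input order and take any item that still fits."""
    selected, remaining = [], capacity
    for weight, value in items:
        if weight <= remaining:
            selected.append((weight, value))
            remaining -= weight
    return selected
```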
A better choice would have been to first sort the items by their density (value-to-weight ratio) and only then run the previous procedure of including the next item whenever it fits in the remaining space. That leads us to greedy selection.
Greedy Selection
A greedy approximation algorithm is an iterative algorithm that produces a partial solution incrementally. Each iteration makes a locally optimal (or near-optimal) augmentation to the current partial solution, so that a complete, though not necessarily globally optimal, solution is reached at the end of the algorithm.
In the context of the knapsack problem, we can choose the next element by prioritizing density, as previously suggested. Such a greedy approach does not guarantee the optimality of the solution, but it can be an interesting alternative for fast, good-quality results, as sketched below.
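Below is a minimal sketch of this greedy rule under the same (weight, value) representation used before; the example instance is illustrative only:

```python
def greedy_knapsack(items, capacity):
    """Sort items by density (value / weight), then fill the knapsack greedily."""
    by_density = sorted(items, key=lambda item: item[1] / item[0], reverse=True)
    selected, remaining = [], capacity
    for weight, value in by_density:
        if weight <= remaining:
            selected.append((weight, value))
            remaining -= weight
    return selected


# The heavy, low-value item (10, 4) traps the naive procedure but not the greedy one
items = [(10, 4), (3, 5), (4, 6), (5, 7)]
print(naive_knapsack(items, capacity=10))   # [(10, 4)] -> total value 4
print(greedy_knapsack(items, capacity=10))  # [(3, 5), (4, 6)] -> total value 11
```

Note that even here greedy is not optimal: the pair (4, 6) and (5, 7) would be worth 13, which illustrates the trade-off mentioned above.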
A simple illustration of the Knapsack Problem
The Maximum Independent Set Problem
The next example is a classical subset selection problem in which our goal is to find, in an undirected graph G(V, E), a subset of vertices with the maximum number of elements such that no edge connects any pair of vertices in the subset.
Let us start by creating classes to work with the graph elements of this problem. The class Node will represent a vertex (or node) of our undirected graph. It will have as attributes: neighbors, a list of neighboring vertices; index, its identifier; and selected, a boolean indicating whether it was included in the solution.
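One possible implementation is sketched below. Beyond the attributes just listed, the degree property and the add_neighbor and delete helpers are assumptions of mine to support the next steps:

```python
class Node:
    """A vertex of our undirected graph."""

    def __init__(self, index):
        self.index = index      # identifier
        self.neighbors = []     # list of neighboring Node instances
        self.selected = False   # True once included in the solution

    @property
    def degree(self):
        return len(self.neighbors)

    def add_neighbor(self, node):
        if node not in self.neighbors:
            self.neighbors.append(node)

    def delete(self):
        # detach this node from the adjacency lists of its neighbors
        for neighbor in self.neighbors:
            neighbor.neighbors.remove(self)
        self.neighbors = []
```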
Now, let us create our Graph class. It should be instantiated from a list of edges and an optional list of nodes, and it should have an attribute N, a dictionary mapping indices to the existing nodes (or vertices).
The property queue should return a list of nodes not yet selected for us to consider including in the solution at each step of our constructive heuristic.
Whenever a Node instance is selected, the method select should be called: it changes the node's selected attribute and triggers the delete method of its neighbors, which can no longer be part of an independent set. A combined sketch of both classes is given below.
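One possible implementation consistent with this description follows. Here I place select on Graph, a design choice of mine, so that removed neighbors also leave the N dictionary and disappear from the queue:

```python
class Graph:
    """An undirected graph built from a list of edges (pairs of node indices)."""

    def __init__(self, edges, nodes=None):
        self.N = {node.index: node for node in (nodes or [])}
        for i, j in edges:
            self.N.setdefault(i, Node(i))
            self.N.setdefault(j, Node(j))
            self.N[i].add_neighbor(self.N[j])
            self.N[j].add_neighbor(self.N[i])

    @property
    def queue(self):
        # candidate nodes: still in the graph and not yet selected
        return [node for node in self.N.values() if not node.selected]

    def select(self, node):
        node.selected = True
        # neighbors can no longer join an independent set: drop them
        for neighbor in list(node.neighbors):
            del self.N[neighbor.index]
            neighbor.delete()
```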
Let us first create an algorithm that randomly chooses the next node to include in our solution.
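A minimal sketch of this random construction, reusing the Graph class above (the function name and seed handling are my choices):

```python
import random


def random_construction(edges, seed=None):
    """Build an independent set by selecting remaining nodes uniformly at random."""
    rng = random.Random(seed)
    graph = Graph(edges)
    solution = []
    while graph.queue:
        node = rng.choice(graph.queue)
        graph.select(node)
        solution.append(node.index)
    return solution
```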
Alternatively, at each step, we could choose the next node as the one with the smallest impact on the "pool" of feasible elements from the ground set. That means choosing the element with the smallest number of neighbors in the remaining subgraph. This is the same approach adopted by Feo et al. (1994).
Notice that the degree of our nodes might vary as the partial solution changes and elements are removed from the subgraph. The method can therefore be defined as an adaptive greedy procedure, as sketched below.
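A sketch of this adaptive greedy rule, again reusing the classes above:

```python
def greedy_construction(edges):
    """Build an independent set by always selecting a node of minimum degree."""
    graph = Graph(edges)
    solution = []
    while graph.queue:
        # adaptive greedy: degrees change as neighbors are removed from the subgraph
        node = min(graph.queue, key=lambda n: n.degree)
        graph.select(node)
        solution.append(node.index)
    return solution
```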
Multistarts
In this approach, multiple independent runs are performed while a registry of the best solution found so far is kept; in the end, the best solution is returned, as sketched below.
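A minimal sketch, reusing the random construction above and assuming solution quality is measured by the number of selected nodes:

```python
def multistart(edges, n_starts=100, seed=None):
    """Run the random construction several times and keep the best solution."""
    rng = random.Random(seed)
    best = []
    for _ in range(n_starts):
        solution = random_construction(edges, seed=rng.randrange(2**32))
        if len(solution) > len(best):
            best = solution
    return best
```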
A simple illustration of Multistarts
Further Reading
At the beginning of this text, I suggested that constructive heuristics can be coupled with local search techniques. One fantastic meta-heuristic that explores this combination is called the Greedy Randomized Adaptive Search Procedure (GRASP).
The idea of GRASP is to use a multi-start framework in which a random element leads the constructive phase to produce different initial solutions, to which local search is then applied. This way, the solution procedure can escape local optima. A rough sketch of the construction phase follows.
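As an illustration only (not a canonical GRASP implementation), the construction phase could rank candidates by degree and draw the next node from a restricted candidate list; the local search phase is omitted here, and the alpha parameter controlling randomness is my assumption:

```python
def grasp_construction(edges, n_starts=50, alpha=0.3, seed=None):
    """Multi-start randomized greedy construction in the spirit of GRASP."""
    rng = random.Random(seed)
    best = []
    for _ in range(n_starts):
        graph = Graph(edges)
        solution = []
        while graph.queue:
            # restricted candidate list: the alpha-fraction of lowest-degree nodes
            candidates = sorted(graph.queue, key=lambda n: n.degree)
            rcl = candidates[: max(1, round(alpha * len(candidates)))]
            node = rng.choice(rcl)
            graph.select(node)
            solution.append(node.index)
        # in a full GRASP, a local search phase would refine `solution` here
        if len(solution) > len(best):
            best = solution
    return best
```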
Those interested in exploring more optimization problems and solution techniques can find several other stories of mine available on Medium, aggregated in a comprehensive list.
Conclusions
Throughout this article, constructive heuristics were introduced in the context of discrete optimization and applied to the knapsack and maximum independent set problems. Intuitions on how to choose ground set elements to build a solution were presented, exemplifying greedy choices, random factors, and multi-starts.
A simple illustration of Optimization