1 of 231

Optimization Engineering

Dr. LIPIKA PANIGRAHI

Associate professor in mathematics

Gandhi Institute for Education and Technology

2 of 231

Outline

  • Linear programming
  • Transportation and assignment problems
  • Integer programming
  • Mixed-integer programming
  • Queuing theory

3 of 231

LINEAR PROGRAMMING
Segment 1

Definitions

Feasibility and optimality

Linear programs in standard form

“Economic” interpretation/application

LP in two dimensions – geometric interpretation

Linear programs in slack form

4 of 231

Linear Programming

Optimize (maximize or minimize) a linear objective function subject to a set of linear constraints (equalities and inequalities).

 

 

 

A problem of this form is called a linear program.

LP:
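Written out, with decision variables x_1, …, x_n and data c_j, a_ij, b_i (a generic statement of the form; the slide's own notation may differ slightly):

```latex
\begin{aligned}
\text{maximize (or minimize)}\quad & c_1 x_1 + \cdots + c_n x_n \\
\text{subject to}\quad & a_{i1} x_1 + \cdots + a_{in} x_n \;\le\; b_i
  \ \ (\text{or } \ge b_i \text{ or } = b_i), \qquad i = 1, \dots, m .
\end{aligned}
```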

5 of 231

Feasibility and optimality

 

 

 

(Assuming the LP is a maximization program.)

An LP is feasible if it has a feasible solution.

 

An LP has an optimal solution iff it is feasible and bounded.

6 of 231

Importance of Linear Programming

Linear Programming has a lot of applications.

LP can be solved efficiently both in theory and in practice.

Many problems, including some of the problems considered in this course, are LP problems.

Many software packages can solve LP problems.

We shall learn the simplex algorithm, and use it to prove the duality theorem for linear programming.

The simplex algorithm behaves very well in practice. But there are pathological examples on which it is not polynomial.

There are other, more complicated, LP algorithms that run in polynomial time. (Ellipsoid, interior-point, etc.)

7 of 231

LPs in standard form

 

 

 

All variables are constrained to be non-negative. (Very special linear inequalities.)
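In matrix notation, a standard-form LP reads (one common convention: maximization with ≤-constraints and non-negative variables):

```latex
\begin{aligned}
\text{maximize}\quad & c^{\top} x \\
\text{subject to}\quad & A x \le b, \\
& x \ge 0 .
\end{aligned}
```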

 

8 of 231

 

 

 

 

 

 

 

 

 

 

 

LPs in standard form

 

 

 

 

 

 

transpose

9 of 231

“Economic” application

 

 

 

 

 

 

 

How many units of each product should we produce to maximize our profit?

 

10 of 231

Converting an LP to standard form

Every LP can be easily converted to standard form.
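The textbook conversion rules can be sketched in code. The function name and the `(coeffs, sense, rhs)` row layout here are illustrative, not from the slides; a free variable x_j would additionally be split as x_j = x_j⁺ − x_j⁻.

```python
def to_standard_form(minimize, c, rows):
    """Convert a max/min LP with <=, >=, = rows into standard form:
    maximize c^T x subject to A x <= b (x >= 0 is implicit).
    rows is a list of (coeffs, sense, rhs), sense in {"<=", ">=", "="}."""
    # Rule 1: a minimization becomes a maximization by negating the objective.
    c = [-cj for cj in c] if minimize else list(c)
    A, b = [], []
    for coeffs, sense, rhs in rows:
        if sense == "<=":
            A.append(list(coeffs)); b.append(rhs)
        elif sense == ">=":
            # Rule 2: flip a >= constraint by negating both sides.
            A.append([-a for a in coeffs]); b.append(-rhs)
        else:
            # Rule 3: an equality becomes a pair of opposite inequalities.
            A.append(list(coeffs)); b.append(rhs)
            A.append([-a for a in coeffs]); b.append(-rhs)
    return c, A, b
```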

 

 

 

 

 

 

 

 

 

 

 

 

 

11 of 231

LP in two dimensions

Figure 29.2 from CLRS

Each constraint defines a half-plane.

The intersection of all these half-planes is a polygon, called the feasibility polygon.

The objective function specifies a direction.

The goal is to find the furthest feasible point in this direction, i.e., the point with the largest inner product with the direction.

12 of 231

LP in two dimensions

Figure 29.2 from CLRS

13 of 231

 

Each constraint defines a half-space.

The intersection of the constraints is the feasibility polyhedron.

The extreme points of the polyhedron are called vertices.

 

 

The finite intersection of half-spaces is a polyhedron.

 

There is always an optimal solution which is a vertex.

14 of 231

Equational Form

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

15 of 231

Standard form 🡪 Slack form

 

 

 

 

 

 

 

 

The resulting LP is said to be in slack form.

All constraints, except the non-negativity constraints, are equalities.
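Concretely, each inequality gains a slack variable that measures its unused capacity:

```latex
\sum_{j=1}^{n} a_{ij} x_j \le b_i
\quad\Longrightarrow\quad
x_{n+i} = b_i - \sum_{j=1}^{n} a_{ij} x_j, \qquad x_{n+i} \ge 0 .
```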

 

16 of 231

Slack form

 

 

 

 

 

 

Standard form

17 of 231

Standard form 🡪 Slack form

 

 

 

 

 

 

 

 

 

For those interested, this is the corresponding equational form. (We will not use it explicitly.)

Standard form

Slack form

18 of 231

END OF SEGMENT 1

19 of 231

LINEAR PROGRAMMING
Segment 2

The simplex algorithm by example

20 of 231

The simplex algorithm by example

max

s.t.

max

s.t.

Start with an LP in standard form:

Convert it into slack form:

21 of 231

Basic and non-basic variables

max

s.t.

 

 

 

In a basic solution, all non-basic variables have value 0.

22 of 231

Pivoting step (1)

max

s.t.

 

 

 

 

23 of 231

Pivoting step (2)

max

s.t.

For brevity, from now on we do not explicitly show the non-negativity constraints.

It is important to remember that they are still there!

24 of 231

Pivoting step (3)

max

s.t.

 

 

 

25 of 231

Pivoting step (4)

max

s.t.

 

 

 

26 of 231

Pivoting step (5)

max

s.t.

 

 

 

27 of 231

An equivalent slack form

max

s.t.

The new form is completely equivalent to the previous form.

 

The objective function is the same.

We have a new basic feasible solution of value 27.

28 of 231

Another pivoting step

max

s.t.

 

 

 

 

We do row operations to get a new equivalent form.

29 of 231

Another equivalent slack form

max

s.t.

 

 

 

 

30 of 231

Final slack form

max

s.t.

 

It achieves a value of 28.

 

31 of 231

END OF SEGMENT 2

32 of 231

LINEAR PROGRAMMING
Segment 3

The simplex algorithm

33 of 231

The simplex algorithm [Dantzig (1947)]

Start with a problem in standard form.

Try to find an equivalent slack form that yields a basic feasible solution (bfs).

If no such slack form is found, the problem is infeasible.

 

 

If all coefficients in the objective function are non-positive, then the current bfs is optimal.

 

 

34 of 231

The simplex algorithm – more details

 

If no initial bfs can be found, the problem is infeasible.

If there are several non-basic variables with positive coefficients (also called modified costs), we need to choose one of them. This is done using a pivoting rule.

 

 

 

If there are no degenerate steps, then the objective value strictly increases, and the algorithm must eventually terminate.

35 of 231

Basic and non-basic variables

 

 

 

 

 

The slack form, also known as the simplex tableau:

The basic solution corresponding to the slack form is:

 

If it is feasible, it is called a basic feasible solution (bfs).

 

36 of 231

 

 

 

 

If the set is empty, the problem is unbounded.

 

 

 

 

37 of 231

A pivoting step – updating the table (1)

 

 

 

 

 

 

 

 

 

 

 

 

38 of 231

 

 

 

A pivoting step – updating the table (2)

 

 

 

 

 

 

 

 

 

 

39 of 231

 

 

A pivoting step – updating the table (3)

 

 

 

 

 

 

Finally, we update the objective function:

 

All simple algebraic operations.

 

 

40 of 231

END OF SEGMENT 3

41 of 231

THE SIMPLEX ALGORITHM
Segment 4

Pseudocode

Termination

Degeneracy

Cycling

Pivoting rules

Complexity

42 of 231

The simplex algorithm – pseudocode
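As a sketch (not the deck's exact pseudocode), a dense-tableau simplex for "maximize c^T x subject to A x ≤ b, x ≥ 0" with b ≥ 0 — so the all-slack basis is an initial bfs and no auxiliary LP is needed — might look like:

```python
def simplex(c, A, b, eps=1e-9):
    """Maximize c^T x s.t. A x <= b, x >= 0, assuming b >= 0.
    Returns (optimal value, optimal x)."""
    m, n = len(A), len(c)
    # Tableau rows: [A | I | b]; the last row holds the negated objective.
    T = [A[i][:] + [1.0 if k == i else 0.0 for k in range(m)] + [b[i]]
         for i in range(m)]
    T.append([-cj for cj in c] + [0.0] * (m + 1))
    basis = list(range(n, n + m))          # the slack variables start basic
    while True:
        # Entering variable: smallest index with a negative reduced cost
        # (a Bland-style smallest-index choice).
        e = next((j for j in range(n + m) if T[-1][j] < -eps), None)
        if e is None:
            break                          # all reduced costs >= 0: optimal
        # Leaving variable: ratio test over rows with a positive pivot entry.
        ratios = [(T[i][-1] / T[i][e], i) for i in range(m) if T[i][e] > eps]
        if not ratios:
            raise ValueError("LP is unbounded")
        _, r = min(ratios)
        piv = T[r][e]
        T[r] = [v / piv for v in T[r]]     # pivot: normalize the pivot row,
        for i in range(m + 1):             # then eliminate column e elsewhere
            if i != r and T[i][e] != 0:
                f = T[i][e]
                T[i] = [vi - f * vr for vi, vr in zip(T[i], T[r])]
        basis[r] = e
    x = [0.0] * n
    for i, bi in enumerate(basis):
        if bi < n:
            x[bi] = T[i][-1]
    return T[-1][-1], x
```

On the deck's running example (maximize 3x1 + x2 + 2x3 with right-hand sides 30, 24, 36) this passes through the intermediate value 27 and stops at 28, matching the slides of Segment 2.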

 

43 of 231

A pivoting step

 

Convert the slack form:

 

 

into the slack form:

 

 

 

 

 

 

 

44 of 231

Termination

 

There may, however, be degenerate iterations in �which the objective function stays the same.

If there are no degenerate iterations, then the objective function strictly increases, so we can never visit a bfs again.

 

Thus, if there are no degenerate iterations, the simplex algorithm terminates after a finite number of iterations.

Note, however, that the upper bound on the number of iterations is not polynomial.

45 of 231

Degeneracy

 

 

+

Example of degeneracy:

degenerate step

Degeneracy occurs a lot when solving combinatorial problems.

46 of 231

Cycling

In the example given, the degenerate step was followed by a non-degenerate step.

Can we have several consecutive degenerate steps?

Can the degenerate steps form a cycle, i.e., return to a previously visited bfs?

Unfortunately, cycling can happen if we do not carefully choose the entering and leaving variables.

(The smallest cycling example known is an LP with 7 variables that gives a cycle of length 6.)

47 of 231

Avoiding cycles

A crude way of avoiding cycles is to slightly perturb the linear program. Cycles are then unlikely to happen.

But this slightly changes the optimal solution. (Can be fixed by doing the perturbations ‘symbolically’.)

A more satisfying solution is to use Bland’s rule.

Bland’s rule: (Use the smallest index)

 

 

Theorem: The simplex algorithm with Bland’s rule cannot cycle.

(We do not prove the theorem in this course.)

48 of 231

Pivoting rules

 

However, the pivoting rule can have a significant effect on the efficiency of the simplex algorithm.

Note: We can revert to Bland’s rule if we see that we are about to perform a degenerate step.

If cycles are avoided, the simplex algorithm returns an optimal solution no matter which pivoting rule is used.

49 of 231

Common pivoting rules

 

 

 

 

Unfortunately, the simplex algorithm with these rules, and many other rules, is not polynomial in the worst case.

It is a major open problem whether there is a pivoting rule that makes the simplex algorithm polynomial.

50 of 231

END OF SEGMENT 4

51 of 231

THE SIMPLEX ALGORITHM
Segment 5

Finding an initial bfs

The auxiliary LP

Existence of optimal basic solutions

52 of 231

Checking feasibility and finding an initial basic feasible solution

 

 

 

 

 

 

Standard form

Slack form

 

 

We need to check whether the LP is feasible, and if so, find a bfs to start the simplex algorithm from.

53 of 231

Auxiliary LP

 

 

 

Original LP

Auxiliary LP

 

 

 

 

 

 

To check feasibility, we introduce an auxiliary LP.
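One standard choice of auxiliary LP (the one used in CLRS) adds a single new variable x₀; the original LP is feasible if and only if the auxiliary optimum is 0:

```latex
\begin{aligned}
\text{maximize}\quad & -x_0 \\
\text{subject to}\quad & \sum_{j=1}^{n} a_{ij} x_j - x_0 \le b_i, \qquad i = 1,\dots,m, \\
& x_0, x_1, \dots, x_n \ge 0 .
\end{aligned}
```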

 

54 of 231

Running simplex on the Auxiliary LP

 

 

 

 

 

 

To solve the auxiliary LP we use the simplex algorithm.

We first transform it into slack form.

But, don’t we have exactly the same problem?

 

But, we can easily manipulate the slack form to get a bfs.

55 of 231

Finding a bfs of the auxiliary LP

 

 

 

 

 

 

 

 

 

 

 

The new basic solution is feasible!

 

 

56 of 231

Solving the auxiliary LP

We found a bfs of the auxiliary LP, i.e., a slack form that defines a bfs.

We can thus use the simplex algorithm to solve the auxiliary LP.

 

 

 

 

We do not present pseudo-code.

The auxiliary LP is bounded, as the value of the objective function is at most 0.

57 of 231

Optimal basic solutions

Corollaries of the correctness of the simplex algorithm:

 

 

Other implications of the simplex algorithm will be discussed in the next segment when we consider LP duality.

(The theorems say that there is always an optimal solution which is a vertex (extreme point) of the feasibility polyhedron.)

58 of 231

END OF SEGMENT 5

59 of 231

LINEAR PROGRAMMING
Segment 6

LP duality

“Deriving” the dual program.

Weak duality

Defining the dual program.

LP duality theorem

60 of 231

LP duality

 

 

 

primal

To every LP in standard form we define a dual:

 

 

 

 

 

 

 

dual

 

 

 

 

 

 

Maximization becomes minimization.
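In matrix form, the standard primal/dual pair reads:

```latex
\begin{aligned}
\text{(primal)}\quad & \max\; c^{\top} x \quad \text{s.t.}\ \; A x \le b,\; x \ge 0, \\
\text{(dual)}\quad & \min\; b^{\top} y \quad \text{s.t.}\ \; A^{\top} y \ge c,\; y \ge 0 .
\end{aligned}
```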

61 of 231

LP duality theorem

Theorem: If the primal and dual linear programs are both feasible, then their optimal values are equal.

 

 

 

 

 

 

 

primal

dual

 

 

 

 

 

 

 

 

62 of 231

Deriving the dual

We want to get an upper bound on the optimal value of the primal.

 

 

 

 

 

 

The theorem, which needs a proof, says that the best bound is tight!

 

 

 

primal

 

 

 

63 of 231

Weak duality

 

In the previous slide we proved the following weaker version of the duality theorem, known as weak duality:

Here is the same proof written in matrix notation:
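For any primal-feasible x and dual-feasible y, the chain is:

```latex
c^{\top} x \;\le\; (A^{\top} y)^{\top} x \;=\; y^{\top} (A x) \;\le\; y^{\top} b ,
```

where the first inequality uses A⊤y ≥ c together with x ≥ 0, and the second uses Ax ≤ b together with y ≥ 0.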

 

 

 

 

64 of 231

END OF SEGMENT 6

65 of 231

LINEAR PROGRAMMING
Segment 7

LP duality

Economic interpretation of the dual

66 of 231

Interpretation of the dual (1)

 

 

 

 

Recall:

primal

 

 

 

dual

 

 

 

 

 

 

 

 

 

67 of 231

Interpretation of the dual (2)

 

Primal: How do we maximize our profit?

 

primal

 

 

 

dual

 

 

 

 

 

 

 

 

 

68 of 231

Interpretation of the dual (3)

Dual: What is the minimal total worth of an offer we accept?

 

Instead of producing products, we try to sell the raw materials.

 

 

primal

 

 

 

dual

 

 

 

 

 

 

 

 

 

“Shadow prices”

69 of 231

END OF SEGMENT 7

70 of 231

LINEAR PROGRAMMING
Segment 8

LP duality

Proof of the duality theorem

71 of 231

Lemma

 

 

 

Before proving the lemma we use it to prove strong duality.

Consider an initial slack form:

 

and the objective function of an equivalent slack form:

Then:

 

Reduced costs

72 of 231

Proof of duality theorem

 

 

 

We show that the simplex algorithm produces not only a primal solution, but also a dual solution, and they both have the same value. By weak duality, they are both optimal.

 

 

 

73 of 231

Proof of lemma

Simple calculations similar to those we saw before.

 

 

 

 

 

 

 

 

 

Claim of lemma

74 of 231

END OF SEGMENT 8

75 of 231

LINEAR PROGRAMMING
Segment 9

LP duality

A “full” version of the duality theorem

Duals of LP programs that are not in standard form

Farkas’ Lemma

76 of 231

Full LP duality theorem

 

 

 

 

 

 

(P)

(D)

(1) If (P) is feasible and bounded, then so is (D), and vice versa. (P) and (D) then have optimal values, and their optimal values are equal.

(2) If (P) is feasible but unbounded, then (D) is infeasible.

(It is possible that both (P) and (D) are infeasible.)

 

 

 

 

 

 

(3) If (D) is feasible but unbounded, then (P) is infeasible.

77 of 231

Primal/Dual pairs

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Standard form

Inequalities form

Equational form

 

 

The LP duality theorem holds for all these forms.

The dual of the dual is the primal.

78 of 231

Farkas’ Lemma (1902)

 

Proof: Consider the pair of primal dual programs:

 

 

 

 

 

 

(P)

 

 

(D)

(1) says that (P) is feasible and bounded.

(P) is always bounded. (D) is always feasible. By the duality theorem:

If (P) is feasible (and bounded), then (D) is bounded. Only (1) holds.

 

(2) says that (D) is feasible and unbounded.

79 of 231

Dualization Recipe

Max problem (P)

Min problem (D)

variables

matrix

constants

objective

-constraint in (P)

-constraint in (D)

(You do not need to memorize it.)

 

 

80 of 231

END OF SEGMENT 9

81 of 231

LINEAR PROGRAMMING
Segment 10

Linear Programming vs. Integer Programming

82 of 231

“Economic” application revisited

 

 

 

 

 

 

 

 

(LP)

Can we force the variables to assume only integral values?

83 of 231

Integer (Linear) Programs

 

 

 

 

(IP)

The simplex algorithm does not solve IP problems.

Solving IP problems is NP-hard.

Sometimes, we can ignore the integrality constraints, solve the resulting LP relaxation, and then round the solution, hoping that the solution obtained is close to optimal.

This gives us an approximation algorithm.

In some “lucky” cases, the LP relaxation always has optimal integral solutions, so we get an exact result.

84 of 231

IP for Maximum Independent Set

 

 

 

 

(IP)

 

Finding an independent set of maximum size is NP-complete.

We can easily get an IP for the MIS problem.
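With a 0/1 variable x_v per vertex, one standard IP for maximum independent set reads:

```latex
\begin{aligned}
\max\quad & \sum_{v \in V} x_v \\
\text{s.t.}\quad & x_u + x_v \le 1 \qquad \text{for every edge } (u,v) \in E, \\
& x_v \in \{0, 1\} \qquad \text{for every vertex } v \in V .
\end{aligned}
```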

 

 

 

Thus, IP is also an NP-complete problem.

 

85 of 231

END OF SEGMENT 10

86 of 231

LINEAR PROGRAMMING
Segment 11

Formulating problems as Linear Programs

Single Source Shortest Paths

87 of 231

Formulating problems as linear programs

A wide variety of problems can be formulated as linear programs.

We focus in this course on graph/network related problems.

We consider LP formulations of the following problems:

Shortest paths

Maximum flow in a network

Minimum cost flow

Multi-commodity flow

For Shortest Paths we saw more efficient direct algorithms.

For Maximum flow we shall see more efficient direct algorithms.

The power of LP is its versatility and robustness.

For specific problems, more efficient algorithms can be found.

88 of 231

Shortest Paths – reminder

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Proof: Review exercise on shortest paths.

 

89 of 231

Shortest Paths as an LP problem

 

 

 

 

(LP)
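One standard formulation (the one in CLRS, here for a source s; for a single target t one maximizes d_t alone) uses a variable d_v per vertex:

```latex
\begin{aligned}
\max\quad & \sum_{v \in V} d_v \\
\text{s.t.}\quad & d_v - d_u \le w(u,v) \qquad \text{for every edge } (u,v) \in E, \\
& d_s = 0 .
\end{aligned}
```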

 

 

 

 

 

 

90 of 231

Shortest Paths as an LP problem

Shortest Paths is a minimization problem.

 

 

 

(LP)

Why did we formulate it as a maximization LP problem?

We can get an alternative formulation of Shortest Paths as a minimization LP problem by taking the dual. (Stay tuned!)

 

(LP) is infeasible.

 

(LP) is unbounded.

91 of 231

Shortest Paths as a maximization problem

 

 

 

 

(LP)

 

Each edge is represented by a piece of rope.

The ropes of the edges incident on a vertex are tied together.

 

 

92 of 231

Shortest Paths as an LP problem

What did we gain from this?

 

 

 

(LP)

We know much more efficient algorithms for finding shortest paths directly.

First, it is an instructive example.

Second, if we change the problem definition, our shortest-paths algorithms are likely to stop working, while the LP approach may remain valid.

Examples, beyond the scope of this course, are Stochastic Shortest Paths (SSPs) and Markov Decision Processes (MDPs).

93 of 231

END OF SEGMENT 11

94 of 231

LINEAR PROGRAMMING
Segment 12

Formulating problems as Linear Programs
Network flow problems

Maximum flow

Minimum cost flow

Multicommodity flow

95 of 231

Network Flow

 

 

 

 

 

capacities

source

sink

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

96 of 231

Network Flow

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

capacities

source

sink

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Maximum flow

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

capacity

flow

97 of 231

Maximum Flow is an LP problem

 

 

 

Maximize the total net amount of flow entering the sink. (We show later that all this flow must leave the source.)

Subject to the conservation constraints,

capacity constraints,

and non-negativity constraints.
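Putting the pieces together, with a variable f_uv per edge, a common statement is:

```latex
\begin{aligned}
\max\quad & \sum_{v} f_{vt} - \sum_{v} f_{tv} \\
\text{s.t.}\quad & \sum_{u} f_{uv} = \sum_{u} f_{vu} \qquad v \in V \setminus \{s, t\} \quad \text{(conservation)}, \\
& 0 \le f_{uv} \le c_{uv} \qquad\quad (u,v) \in E \quad \text{(capacity, non-negativity)}.
\end{aligned}
```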

The problem is always feasible and bounded, so it has an optimal solution.

 

 

98 of 231

Maximum Flow and relatives

As Maximum Flow is an LP problem, we immediately get fairly efficient (polynomial-time) algorithms for solving it.

This demonstrates the versatility of the LP problem.

As with shortest paths, we can get even more efficient maximum-flow algorithms by studying the problem directly.

We next consider several variants of the maximum flow problem:

Using LP, we immediately get algorithms for them too.

More efficient algorithms can also be obtained for some of these problems, but we will not do it in this course.

Minimum cost flow

Multicommodity flow

Minimum cost multicommodity flow

Maximum flow with losses

99 of 231

Minimum cost Flow

 

 

 

 

 

 

 

 

 

 

 

 

 

 

capacity

source

sink

cost

 

100 of 231

Minimum cost Flow

 

 

 

 

 

 

 

 

 

capacity

source

sink

cost

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

101 of 231

Minimum cost Flow is an LP problem

 

 

 

 

 

 

If the problem is feasible, then it is bounded and has an optimal solution.

102 of 231

Minimum cost flow with general supplies and demands

 

 

 

 

 

 

 

 

103 of 231

Multicommodity Flow

 

 

 

 

Can this be done?

 

 

104 of 231

Multicommodity Flow is an LP problem

 

 

 

The problem is always feasible and bounded, and hence has an optimal solution.

 

 

Note how quickly we were able to adapt the LP formulations of the various flow problems we considered.

105 of 231

END OF SEGMENT 12

106 of 231

LINEAR PROGRAMMING
Segment 13

LP duals of:

Maximum flow

Minimum cost flow

Shortest Paths

107 of 231

The LP dual of Maximum Flow

 

 

 

 

Before taking the dual, we do a “trick”. (Not essential, but slightly simplifies things.)

 

 

108 of 231

 

 

 

 

Maximum Flow – Primal LP

We assume that no edges enter the source, and that no edges leave the sink.

For “convenience”, we measure the flow going out of the source, rather than the flow going into the sink.

 

109 of 231

Maximum Flow – Primal and dual LPs

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

110 of 231

 

 

 

 

Maximum Flow – Primal and dual LPs

(P)

 

 

 

 

 

(D)

 

 

 

 

(D)

 

111 of 231

Maximum Flow – dual LP

 

 

 

 

(D)

What does it mean?

 

 

 

 

The capacity of a cut is the sum of the capacities of the edges that cross the cut.

 

 

 

 

112 of 231

Maximum Flow – Minimum Cut

 

 

 

 

(D)

 

 

 

 

 

 

We shall soon prove this theorem, which is a special case of the LP duality theorem.

113 of 231

The primal LP for Shortest Paths

 

 

 

The primal LP for the shortest paths problem is a minimum cost flow problem with no capacities!

 

 

 

 

 

 

 

 

An optimal solution of (P) is easily obtained from a shortest-paths tree.

Corollary: (P) always has an integral optimal solution.

(P)

114 of 231

END OF SEGMENT 13

END OF LINEAR PROGRAMMING

115 of 231

Transportation Problem (TP) and Assignment Problem (AP)

(special cases of Linear Programming)

116 of 231

1. Transportation Problem (TP)

Distributing any commodity from any group of supply centers, called sources, to any group of receiving centers, called destinations, in such a way as to minimize the total distribution cost (shipping cost).

117 of 231

1. Transportation Problem (TP)

Total supply must equal total demand.

If total supply exceeds total demand, a dummy destination is created, whose demand equals the difference between total supply and total demand. Similarly, if total supply is less than total demand, a dummy source is created, whose supply equals the difference.

All unit shipping costs into a dummy destination or out of a dummy source are 0.

118 of 231

Example 1:

119 of 231

Example 2:

                 Destination
          D1    D2    D3    D4    Supply
   S1     50    75    35    75      12
   S2     65    80    60    65      17
   S3     40    70    45    55      11
   (D)     0     0     0     0      10
Demand    15    10    15    10

((D) is the dummy source; its unit costs are 0 and its supply of 10 balances total supply with total demand.)

120 of 231

Transportation Tableau:

121 of 231

1. Northwest Corner Starting Procedure

1. Select the remaining variable in the upper left (northwest) corner and note the supply remaining in the row, s, and the demand remaining in the column, d.

2. Allocate the minimum of s or d to this variable. If this minimum is s, eliminate all variables in its row from future consideration and reduce the demand in its column by s; if the minimum is d, eliminate all variables in the column from future consideration and reduce the supply in its row by d.

REPEAT THESE STEPS UNTIL ALL SUPPLIES HAVE BEEN ALLOCATED.
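Steps 1–2 can be transcribed directly; this is a sketch, shown on the Example 2 data (with the dummy source row of zero costs).

```python
def northwest_corner(supply, demand, cost):
    """Northwest-corner starting solution; returns (allocation, total cost)."""
    supply, demand = list(supply), list(demand)
    m, n = len(supply), len(demand)
    alloc = [[0] * n for _ in range(m)]
    i = j = 0
    total = 0
    while i < m and j < n:
        q = min(supply[i], demand[j])   # allocate min of row supply, col demand
        alloc[i][j] = q
        total += q * cost[i][j]
        supply[i] -= q
        demand[j] -= q
        if supply[i] == 0:
            i += 1                      # row exhausted: move down
        else:
            j += 1                      # column exhausted: move right
    return alloc, total

cost = [[50, 75, 35, 75], [65, 80, 60, 65], [40, 70, 45, 55], [0, 0, 0, 0]]
alloc, total = northwest_corner([12, 17, 11, 10], [15, 10, 15, 10], cost)
# On the Example 2 data this gives a starting cost of 2330.
```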

Initial Solution Procedure:

122 of 231

Total shipping cost = 2250

123 of 231

2. Least Cost Starting Procedure

1. For the remaining variable with the lowest unit cost, determine the remaining supply left in its row, s, and the remaining demand left in its column, d (break ties arbitrarily).

2. Allocate the minimum of s or d to this variable. If this minimum is s, eliminate all variables in its row from future consideration and reduce the demand in its column by s; if the minimum is d, eliminate all variables in the column from future consideration and reduce the supply in its row by d.

REPEAT THESE STEPS UNTIL ALL SUPPLIES HAVE BEEN ALLOCATED.
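A sketch of the same loop, driven by the lowest remaining unit cost (ties broken in row-major order); the 2×2 data below is a made-up illustration, not from the slides.

```python
def least_cost(supply, demand, cost):
    """Least-cost starting solution; returns (allocation, total cost)."""
    supply, demand = list(supply), list(demand)
    m, n = len(supply), len(demand)
    alloc = [[0] * n for _ in range(m)]
    active = {(i, j) for i in range(m) for j in range(n)}
    total = 0
    while active:
        # Remaining cell with the lowest unit cost (row-major tie-break).
        i, j = min(active, key=lambda ij: (cost[ij[0]][ij[1]], ij))
        q = min(supply[i], demand[j])
        alloc[i][j] = q
        total += q * cost[i][j]
        supply[i] -= q
        demand[j] -= q
        if supply[i] == 0:              # row exhausted: drop its cells
            active -= {(i, jj) for jj in range(n)}
        else:                           # column exhausted: drop its cells
            active -= {(ii, j) for ii in range(m)}
    return alloc, total

alloc, total = least_cost([10, 20], [15, 15], [[1, 4], [3, 2]])
```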

124 of 231

125 of 231

Total shipping cost = 2065

126 of 231

3. Vogel’s Approximation Method Starting Procedure

1. For each remaining row and column, determine the difference between the lowest two remaining costs; these are called the row and column penalties.

2. Select the row or column with the largest penalty found in step 1 and note the supply remaining for its row, s, and the demand remaining in its column, d.

3. Allocate the minimum of s or d to the variable in the selected row or column with the lowest remaining unit cost. If this minimum is s, eliminate all variables in its row from future consideration and reduce the demand in its column by s; if the minimum is d, eliminate all variables in the column from future consideration and reduce the supply in its row by d.

REPEAT THESE STEPS UNTIL ALL SUPPLIES HAVE BEEN ALLOCATED.
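Steps 1–3 can be sketched as follows (when only one cost remains in a row or column, that cost itself is used as the penalty, which is one common convention); the small data set is the same made-up 2×2 illustration as above.

```python
def vogel(supply, demand, cost):
    """Vogel's approximation starting solution (sketch)."""
    supply, demand = list(supply), list(demand)
    m, n = len(supply), len(demand)
    rows, cols = set(range(m)), set(range(n))
    alloc = [[0] * n for _ in range(m)]
    total = 0

    def penalty(costs):
        s = sorted(costs)               # gap between the two lowest costs
        return s[1] - s[0] if len(s) > 1 else s[0]

    while rows and cols:
        row_pen = {i: penalty([cost[i][j] for j in cols]) for i in rows}
        col_pen = {j: penalty([cost[i][j] for i in rows]) for j in cols}
        i_best = max(rows, key=lambda i: row_pen[i])
        j_best = max(cols, key=lambda j: col_pen[j])
        if row_pen[i_best] >= col_pen[j_best]:
            i = i_best                  # largest penalty is a row:
            j = min(cols, key=lambda j: cost[i][j])   # cheapest cell in it
        else:
            j = j_best                  # largest penalty is a column
            i = min(rows, key=lambda i: cost[i][j])
        q = min(supply[i], demand[j])
        alloc[i][j] = q
        total += q * cost[i][j]
        supply[i] -= q
        demand[j] -= q
        if supply[i] == 0:
            rows.discard(i)
        else:
            cols.discard(j)
    return alloc, total
```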

127 of 231

Total shipping cost = 2030

128 of 231

Solving TP – Transportation Simplex Method

1. Find the current Cij–Zij values for each nonbasic variable and select the one with the most negative Cij–Zij value as the entering variable; if all Cij–Zij values are nonnegative, the current solution is optimal.

2. Determine which basic variable reaches 0 first when the entering variable is increased.

3. Determine a new basic solution and repeat the steps.

129 of 231

Step 1: Determine the Cij–Zij Values for the Nonbasic Variables

1. If Ui is the dual variable associated with the i-th supply constraint, and Vj is the dual variable associated with the j-th demand constraint, then for shipments from node i to node j, one can find the corresponding Zij value by Zij = Ui - Vj. Thus the Cij–Zij value for variable Xij is found by

Cij - Zij = Cij - (Ui - Vj) = Cij - Ui + Vj

130 of 231

2. Given that there is a redundant equation among the m + n constraints (and any of the m + n constraints can be considered the redundant one), one can show that the Ui or Vj associated with the redundant equation is 0. Thus one Ui or Vj can arbitrarily be selected and set to 0. Arbitrarily choose U1 = 0.

3. Since the Cij–Zij values for basic variables are 0 (i.e., Cij - Ui + Vj = 0 for basic variables), we can easily solve for the remaining values of the Ui’s and Vj’s from the m + n - 1 equations for the basic variables.

131 of 231

4. Once the Ui’s and Vj’s have been determined, the Cij–Zij values for the nonbasic variables can be calculated by

Cij - Zij = Cij - Ui + Vj

132 of 231

133 of 231

Non-basic cells:

Note: X13 is the entering variable.

134 of 231

Step 2: Determine Which Current Basic Variable Reaches 0 First

Note: 1. Cycle property

2. X12 is the leaving variable

135 of 231

Step 3: Determine the Next Transportation Tableau

Total shipping cost = 2020

(Improvement = 2 × 5 = 10, since the entering variable’s Cij–Zij value is –5 and the change amount is 2.)

136 of 231

Transportation Simplex Method

Find an initial basic feasible solution by some starting procedure. Then,

1. Set U1 = 0. Solve for the other Ui’s and Vj’s by:

Cij – Ui + Vj = 0 for basic variables.

Then calculate the Cij–Zij values for nonbasic variables by:

Cij – Zij = Cij – Ui + Vj

Choose the nonbasic variable with the most negative Cij–Zij value as the entering variable. If all Cij–Zij values are nonnegative, STOP; the current solution is optimal.

2. Find the cycle that includes the entering variable and some of the BASIC variables. Alternating positive and negative changes on the cycle, determine the “change amount” as the smallest allocation on the cycle at which a subtraction will be made.

3. Modify the allocations to the variables of the cycle found in step 2 by the “change amount” and return to step 1.

137 of 231

Note: there must be m + n - 1 basic variables for the transportation simplex method to work!

=> Add a dummy source or dummy destination, if necessary

(m=# of sources and n=# of destinations)

138 of 231

2. The Assignment Problem (AP)

a special case of TP with m = n and si = dj = 1 for all i and j.

The Hungarian Algorithm

=> solves the assignment problem of finding a least-cost assignment of m workers to m jobs

139 of 231

Assumptions:

1. There is a cost assignment matrix for the m “people” to be assigned to m “tasks.” (If necessary dummy rows or columns consisting of all 0’s are added so that the numbers of people and tasks are the same.)

2. All costs are nonnegative.

3. The problem is a minimization problem.

140 of 231

The Hungarian Algorithm

Initialization

1. For each row, subtract the minimum number from all numbers in that row.

2. In the resulting matrix, subtract the minimum number in each column from all numbers in the column.
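The two initialization steps can be sketched as one small function (the matrix below is a made-up 3×3 illustration):

```python
def reduce_matrix(c):
    """Hungarian-algorithm initialization: row reduction, then column
    reduction, on a square cost matrix given as a list of lists."""
    # Step 1: subtract each row's minimum from that row.
    c = [[v - min(row) for v in row] for row in c]
    # Step 2: subtract each column's minimum from that column.
    col_min = [min(col) for col in zip(*c)]
    return [[v - col_min[j] for j, v in enumerate(row)] for row in c]

reduced = reduce_matrix([[4, 2, 8], [4, 3, 7], [3, 1, 6]])
```

Every row and every column of the result contains at least one 0, which is what the iterative steps that follow rely on.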

141 of 231

Iterative Steps

1. Make as many 0 cost assignments as possible. If all workers are assigned, STOP; this is the minimum cost assignment. Otherwise draw the minimum number of horizontal and vertical lines necessary to cover all 0’s in the matrix. (A method for making the maximum number of 0 cost assignments and drawing the minimum number of lines to cover all 0’s follows.)

2. Find the smallest value not covered by the lines; this number is the reduction value.

3. Subtract the reduction value from all numbers not covered by any lines. Add the reduction value to any number covered by both a horizontal and vertical line.

GO TO STEP 1.

142 of 231

For small problems, one can usually determine the maximum number of zero cost assignments by observation. For larger problems, the following procedure can be used:

Determining the Maximum Number of Zero-Cost Assignments

1. For each row, if only one 0 remains in the row, make that assignment and eliminate the row and column from consideration in the steps below.

2. For each column, if only one 0 remains, make that assignment and eliminate that row and column from consideration.

3. Repeat steps 1 and 2 until no more assignments can be made. (If 0’s remain, this means that there are at least two 0’s in each remaining row and column. Make an arbitrary assignment to one of these 0’s and repeat steps 1 and 2.)

143 of 231

Again, for small problems, the minimum number of lines required to cover all the 0’s can usually be determined by observation. The following procedure, based on network flow arguments, can be used for larger problems:

Drawing the Minimum Number of Lines to Cover All 0’s

1. Mark all rows with no assignments (with a “‧”).

2. For each row just marked, mark each column that has a 0 in that row (with a “‧”).

3. For each column just marked, mark each row that has an assignment in that column (with a “‧”).

4. Repeat steps 2 and 3 until no more marks can be made.

5. Draw lines through unmarked rows and marked columns.

144 of 231

Example:

145 of 231

146 of 231

Minimum uncovered

number

147 of 231

Minimum uncovered

number

148 of 231

149 of 231

CONVERSION OF A MAXIMIZATION PROBLEM TO A MINIMIZATION PROBLEM

The Hungarian algorithm works only if the matrix is a cost matrix. A maximization assignment problem can be converted to a minimization problem by creating a lost opportunity matrix. The problem then is to minimize the total lost opportunity.

150 of 231

Profit Matrix:

       J1   J2   J3   J4
W1     67   58   90   55
W2     58   88   89   56
W3     74   99   80   22
(D)     0    0    0    0

151 of 231

The lost opportunity matrix given below is derived by subtracting each number in the J1 column from 74, each number in the J2 column from 99, each number in the J3 column from 90, and each number in the J4 column from 56.

       J1   J2   J3   J4
W1      7   41    0    1
W2     16   11    1    0
W3      0    0   10   34
(D)    74   99   90   56

The Hungarian algorithm can now be applied to this lost opportunity matrix to determine the maximum profit set of assignments.
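The conversion is just a column-wise subtraction from each column's maximum; checked against the numbers above:

```python
def lost_opportunity(profit):
    """Convert a profit (maximization) matrix into a lost-opportunity (cost)
    matrix by subtracting each entry from its column maximum."""
    col_max = [max(col) for col in zip(*profit)]
    return [[col_max[j] - v for j, v in enumerate(row)] for row in profit]

profit = [[67, 58, 90, 55], [58, 88, 89, 56], [74, 99, 80, 22], [0, 0, 0, 0]]
lost = lost_opportunity(profit)
# Reproduces the lost opportunity matrix shown above.
```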

152 of 231

Integer linear programming

  • Optimization problems where design variables have to be integers are more difficult than ones with continuous variables.
  • The degree of difficulty is particularly damaging for a large number of variables:
    • With continuous variables, the cost of finding a local optimum increases linearly or quadratically with the number of variables.
    • With integer variables it can increase exponentially (i.e., non-polynomially): these problems are NP-hard.
  • Problems with a linear objective and linear constraints (linear programming) are easier to solve.
  • Furthermore, they do not have optima that are local but not global.

153 of 231

Formulation

  • The standard form of an integer programming problem is

  • Algorithm papers often limit themselves to standard form, but software usually allows the more general form
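The standard form referred to above is the LP standard form plus integrality (a common statement; conventions vary between texts):

```latex
\begin{aligned}
\max\quad & c^{\top} x \\
\text{s.t.}\quad & A x \le b, \qquad x \ge 0, \qquad x \in \mathbb{Z}^{n} .
\end{aligned}
```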

154 of 231

Example

job   Pay ($)   Time (min)   Fun index
 1        1          3            3
 2        2          5            2
 3        1          2            2
 4        3          2            1

Maximize fun time so that you make at least $75 and do not spend more than two hours.

155 of 231

Example Formulation

job   Pay   Time   Fun
 1     1     3      3
 2     2     5      2
 3     1     2      2
 4     3     2      1

Only two variables are likely to be non-zero. Which do you expect them to be?

156 of 231

Continuous solution from Solver

  • Without the integer constraints, the solution does not have to be integer

Objective Cell (Max)
Cell   Name   Original Value   Final Value
$E$1   fun    80               112.4999958

Variable Cells
Cell   Name   Original Value   Final Value    Integer
$B$1   x1     10               0              Contin
$B$2   x2     10               0              Contin
$B$3   x3     10               52.49999759    Contin
$B$4   x4     10               7.500000626    Contin

Constraints
Cell   Name   Cell Value     Formula     Status        Slack
$E$2   pay    74.99999946    $E$2>=75    Binding       0
$E$3   time   119.9999964    $E$3<=120   Not Binding   3.57628E-06

157 of 231

Integer solution

  • Will rounding work?

Objective Cell (Max)
Cell   Name   Original Value   Final Value
$E$1   fun    112.4999958      112.0000005

Variable Cells
Cell   Name   Original Value   Final Value   Integer
$B$1   x1     0                2.000000156   Contin
$B$2   x2     0                0             Contin
$B$3   x3     52.49999759      49            Integer
$B$4   x4     7.500000626      8             Integer

Constraints
Cell   Name   Cell Value     Formula     Status    Slack
$E$2   pay    75.00000016    $E$2>=75    Binding   0
$E$3   time   120.0000005    $E$3<=120   Binding   0
$B$3=Integer
$B$4=Integer

 

 

 

 

 

158 of 231

Integer problems in laminate design

  • When ply angles are unrestricted, we still have the restriction of an integer number of plies.
  • Most points on the Miki diagram are accessible, but only with specific ply angles.
  • When angles are limited to a small set, together with an integer number of plies, we are limited to a finite number of points on the diagram.
  • We will investigate which laminate design problems can be cast as linear integer programs.
  • Sometimes this requires some ingenuity.

159 of 231

Relaxation: A general optimization technique

  • Want:
    • x* = argminx f(x) subject to x ∈ S
    • S is the feasible set
  • Start by getting:
    • x1 = argminx f(x) subject to x ∈ T
    • where S ⊆ T
      • T is a larger feasible set, obtained by dropping some constraints
      • Makes problem easier if we have a large # of constraints or difficult ones
    • If we’re lucky, it happens that x1 ∈ S
      • Then x* = x1 , since
        • x1 is a feasible solution to the original problem
        • no feasible solution better than x1 (no better x ∈ S since none anywhere ∈ T)
    • Else, add some constraints back (to shrink T) and try again, getting x2
      • x1, x2, x3, … 🡪 x* as T closes in on S

600.325/425 Declarative Methods - J. Eisner

159

160 of 231

Relaxation: A general optimization technique

  • Want:
    • x* = argmin f(x) subject to x ∈ S
    • S is the feasible set
  • Start by getting:
    • x1 = argmin f(x) subject to x ∈ T
    • where S ⊆ T
      • T is a larger feasible set, obtained by dropping some constraints
      • Makes problem easier if we have a large # of constraints or difficult ones

    • Else, add some constraints back (to shrink T) and try again


Integrality constraints: if we drop all of these, we can just use simplex. "LP relaxation of the ILP problem."

But how can we add integrality constraints back?

(simplex relies on having dropped them all)

161 of 231

Rounding doesn’t work


image adapted from �Jop Sibeyn

round to nearest int (3,3)? No, infeasible.

round to nearest feasible int (2,3) or (3,2)? No, suboptimal.

round to nearest integer vertex (0,4)? � No, suboptimal.

Really do have to add the integrality constraints back somehow, and solve a new optimization problem.

162 of 231

Cutting planes: add new linear constraints

  • New linear constraints can be handled by simplex algorithm
  • But will collectively rule out non-integer solutions


figure adapted from Papadimitriou & Steiglitz


163 of 231

Add new linear constraints: Cutting planes

  • Can ultimately trim back to a new polytope with only integer vertices
    • This is the “convex hull” of the feasible set of the ILP
  • Since it’s a polytope, it can be defined by linear constraints!
    • These can replace the integrality constraints
  • Unfortunately, there may be exponentially many of them …
    • But hopefully we’ll only have to add a few (thanks to relaxation)


figure adapted from Papadimitriou & Steiglitz


164 of 231

Example

  • How can we find these new constraints??


No integrality constraints!

But optimal solution is the same.

example from H. P. Williams

165 of 231

Chvátal cuts

  • Add integer multiples of constraints, �divide through, and round using integrality
    • This generates a new (or old) constraint
  • Repeat till no new constraints can be generated
    • Generates the convex hull of the ILP!
    • But it’s impractical


example from H. P. Williams

even coeffs:�divide by 2

Must be integer �because of integrality �constraints on x1,x2,x3

round


7x1 + 21x2 + 7x3 ≥ 57

divide by 7 �and round
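Written out, the final step shown here is (the starting combination of constraints is not fully legible, so only the division-and-rounding step is reconstructed):

```latex
7x_1 + 21x_2 + 7x_3 \ge 57
\;\Rightarrow\;
x_1 + 3x_2 + x_3 \ge \tfrac{57}{7} \approx 8.14
\;\Rightarrow\;
x_1 + 3x_2 + x_3 \ge \left\lceil \tfrac{57}{7} \right\rceil = 9,
```

since the left-hand side is an integer whenever x1, x2, x3 are.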

166 of 231

Gomory cuts

  • Chvátal cuts:
    • Can generate the convex hull of the ILP!
    • But that’s impractical
    • And unnecessary (since we just need to find optimum, not whole convex hull)



  • Gomory cuts:
    • Only try to cut off current relaxed optimum that was found by simplex
    • “Gomory cut” derives such a cut from the current solution of simplex

figure adapted from Papadimitriou & Steiglitz

167 of 231

Branch and bound: Disjunctive cutting planes!


figure from �H. P. Williams

why?

why?

For each leaf, why is it okay to stop there?

When does solving the relaxed problem ensure integral X2?

168 of 231

Remember branch-and-bound from constraint programming?


figure thanks to Tobias Achterberg

169 of 231

Branch and bound: Pseudocode


pseudocode thanks to Tobias Achterberg

In notation, ^ is the upper bound (feasible but poor objective) – decreases globally

v is the lower bound (good objective but infeasible) – increases down the tree

Can round and do stochastic local search to try to find a feasible solution x^ near xv, to improve upper bound c^ further

Branch&cut: add new constraints here (cutting planes, conflict clauses, or pick from huge set ("row generation"))
Branch&price: add new non-0 vars picked from huge set ("column generation")

Simplify if desired by propagation; then relax by dropping integrality constraints

Or set c^, x^ to a known feasible solution.

May simplify ("presolve") it first.
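The loop described above can be sketched in a few lines of Python, using SciPy's linprog for the LP relaxation (a minimal depth-first sketch with no presolve, cuts, or heuristics; finite variable bounds are assumed so the tree is finite):

```python
import math
from scipy.optimize import linprog

def branch_and_bound(c, A_ub, b_ub, bounds):
    """Minimize c.x subject to A_ub x <= b_ub, bounds, x integer."""
    best = [math.inf, None]          # incumbent: [objective, solution]

    def solve(bounds):
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
        if not res.success or res.fun >= best[0] - 1e-9:
            return                   # infeasible, or pruned by the bound
        gap, i = max((abs(v - round(v)), j) for j, v in enumerate(res.x))
        if gap < 1e-6:               # integral: new incumbent
            best[0], best[1] = res.fun, [round(v) for v in res.x]
            return
        lo, hi = bounds[i]
        v = res.x[i]                 # branch on the most fractional variable
        solve(bounds[:i] + [(lo, math.floor(v))] + bounds[i + 1:])
        solve(bounds[:i] + [(math.ceil(v), hi)] + bounds[i + 1:])

    solve(list(bounds))
    return best

# The "fun" example: maximize fun (so minimize -fun) subject to
# pay >= 75 (written as -pay <= -75) and time <= 120.
obj, x = branch_and_bound(c=[-3, -2, -2, -1],
                          A_ub=[[-1, -2, -1, -3], [3, 5, 2, 2]],
                          b_ub=[-75, 120],
                          bounds=[(0, 60)] * 4)
print(-obj, x)   # integer optimum: fun = 112
```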

170 of 231

How do we split into subproblems?


figure thanks to Tobias Achterberg

Where’s the variable ordering?

Where’s the value ordering?

171 of 231

How do we add new constraints?


figure thanks to Tobias Achterberg

172 of 231

Variable & value ordering heuristics (at a given node)

  • Priorities: User-specified var ordering
  • Most fractional branching: Branch on variable farthest from int
  • Branch on a variable that should tighten (hurt) the LP relaxation a lot
    • Strong branching: For several candidate variables, try rounding them and solving the LP relaxation (perhaps incompletely).
    • Penalties: If we rounded x up or down, how much would it tighten objective just on next iteration of dual simplex algorithm? (Dual simplex maintains an overly optimistic cost estimate that relaxes integrality and may be infeasible in other ways, too.)
    • Pseudo-costs: When rounding this variable in the past, how much has it actually tightened the LP relaxation objective (on average), per unit increase or decrease?
  • Branching on SOS1 and SOS2


173 of 231

Warning

  • If variables are unbounded, the search tree might have infinitely many nodes!


figures from H. P. Williams

174 of 231

Warning

  • If variables are unbounded, the search tree might have infinitely many nodes!
  • Fortunately, it’s possible to compute bounds …
    • Given an LP or ILP problem (min c⋅x subj. to Ax ≤ b, x ≥ 0)
    • Where all numbers in A,b,c are integers; n vars, m constraints
    • If there’s a finite optimum c⋅x, each xi is ≤ a bound whose log is
      • O(m² log m log(biggest integer in A or b)) [for LP]


Intuition for LP: the only way to get LP optima far from the origin is to have slopes that are close but not quite equal … which requires large ints.

figures from Papadimitriou & Steiglitz

175 of 231

Warning

  • If variables are unbounded, the search tree might have infinitely many nodes!
  • Fortunately, it’s possible to compute bounds …
    • Given an LP or ILP problem (min c⋅x subj. to Ax ≤ b, x ≥ 0)
    • Where all numbers in A,b,c are integers; n vars, m constraints
    • If there’s a finite optimum x, each xi is ≤ a bound whose log is
      • O(m² log m log(biggest integer in A or b)) [for LP]
      • O(log n + m(log n + log(biggest int in A, b, or c))) [for ILP]


For ILP: A little trickier.�(Could ILP have huge finite optimum if LP is unbounded?

Answer: no, then ILP unbounded too.)

figures from Papadimitriou & Steiglitz

176 of 231

Reducing ILP to 0-1 ILP

  • If log bound = 100, then e.g. 0 ≤ x5 ≤ 2^100
  • Remark: This bound enables a polytime reduction from ILP to 0-1 ILP
    • Remember: Size of problem = length of encoding, not size of #s
  • Can you see how?
  • Hint: Binary numbers are encoded with 0 and 1
  • What happens to linear function like …+ 3 x5 + … ?


    • Given an LP or ILP problem (min c.x subj. to Ax=b, x≥0)
    • Where all numbers in A,b,c are integers; n vars, m constraints
    • If there’s a finite optimum x, each xi is ≤ a bound whose log is
      • O(log n + m(log n + log(biggest int in A, b, or c))) [for ILP]
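The hinted reduction: write each bounded integer variable in binary. A tiny numeric sketch (the bit values here are just an example):

```python
# Replace an integer variable 0 <= x < 2**k by k 0-1 variables b_0..b_{k-1}
# with x = sum(2**j * b_j); a linear term like 3*x stays linear in the bits.
k = 5
bits = [1, 0, 1, 1, 0]                      # example: x = 1 + 4 + 8 = 13
x = sum(2**j * b for j, b in enumerate(bits))
coeffs = [3 * 2**j for j in range(k)]       # coefficients of 3*x over the bits
print(x, sum(c * b for c, b in zip(coeffs, bits)))
```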

177 of 231

Totally Unimodular Problems

  • There are some ILP problems where nothing �is lost by relaxing to LP!
    • "some mysterious, friendly power is at work" -- Papadimitriou & Steiglitz
    • All vertices of the LP polytope are integral anyway.
    • So regardless of the cost function, the LP has an optimal solution in integer variables (& maybe others)
    • No need for cutting planes or branch-and-bound.
    • This is the case when A is a totally unimodular integer matrix, and b is integral. (c can be non-int.)


???

178 of 231

Totally Unimodular Cost Matrix A

  • A square integer matrix is called unimodular if its inverse is also integral.
  • Equivalently: it has determinant +1 or -1.
    • (if det(A) = ±1, then A⁻¹ = adjoint(A) / det(A) is integral)
    • (if A and A⁻¹ are both integral, then det A and det A⁻¹ are integers with product 1)
    • Matrices are like numbers, but more general. Unimodular matrices are the matrix generalizations of +1 and -1: you can divide by them without introducing fractions.

  • A totally unimodular matrix is one whose square submatrices (obtained by crossing out rows or columns) are all either unimodular (det = ±1) or singular (det = 0).
    • Matters because simplex inverts non-singular square submatrices.
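The definition can be tested directly, if expensively, on small matrices (a brute-force sketch, exponential in the matrix size; usable only on toy examples):

```python
import itertools
import numpy as np

def is_totally_unimodular(A):
    """Brute-force check: every square submatrix has det in {-1, 0, 1}.
    Exponential in the matrix size - for tiny examples only."""
    A = np.asarray(A)
    m, n = A.shape
    for k in range(1, min(m, n) + 1):
        for rows in itertools.combinations(range(m), k):
            for cols in itertools.combinations(range(n), k):
                if round(np.linalg.det(A[np.ix_(rows, cols)])) not in (-1, 0, 1):
                    return False
    return True

# Incidence matrix of an odd cycle: NOT totally unimodular (a 3x3 det of 2).
print(is_totally_unimodular([[1, 1, 0], [0, 1, 1], [1, 0, 1]]))  # False
```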


179 of 231

Some Totally Unimodular Problems

  • The following common graph problems pick a subset of edges from some graph, or assign a weight to each edge in a graph.

    • Weighted bipartite matching
    • Shortest path
    • Maximum flow
    • Minimum-cost flow

  • Their cost matrices are totally unimodular.
    • They satisfy the conditions of a superficial test that is sufficient to guarantee total unimodularity.
    • So, they can all be solved right away by the simplex algorithm or another LP algorithm like primal-dual.
    • All have well-known direct algorithms, but those can be seen as essentially just special cases of more general LP algorithms.


180 of 231

Some Totally Unimodular Problems

  • The following common graph problems pick a subset of edges from some graph ...
    • Weighted matching in a bipartite graph


edge scores from drawing


drawing from Edwards M T et al. Nucl. Acids Res. 2005;33:3253-3262 (Oxford Univ. Press)

If we formulate as Ax ≤ b, x ≥ 0,�the A matrix is totally unimodular:

Sufficient condition: Each column (for edge xij) has at most 2 nonzero entries (for i and j).

These are both +1 (or both -1) and are in different “halves” of the matrix.

(Also okay if they are +1 and -1 and are in same “half” of the matrix.)

each top/bottom node has at most one edge

         xI,A  xI,B  xII,B  xIII,C  xIV,B  xIV,C
(i=I)     1     1     0       0       0      0
(i=II)    0     0     1       0       0      0
(i=III)   0     0     0       1       0      0
(i=IV)    0     0     0       0       1      1
(j=A)     1     0     0       0       0      0
(j=B)     0     1     1       0       1      0
(j=C)     0     0     0       1       0      1

Not �needed!

(just need�xij ≥ 0)

181 of 231

Some Totally Unimodular Problems

  • The following common graph problems pick a subset of edges from some graph ...
    • Shortest path from s to t in a directed graph

edge scores from drawing


Can formulate as Ax = b, x ≥ 0 so�that A matrix is totally unimodular:

       xsA  xsC  xAB  xBC  xCA  xBt  xCt
(s)     1    1    0    0    0    0    0
(j=A)  -1    0    1    0   -1    0    0
(j=B)   0    0   -1    1    0    1    0
(j=C)   0   -1    0   -1    1    0    1
(t)     0    0    0    0    0   -1   -1

[Figure: directed graph on nodes s, A, B, C, t, with edge costs 2, 2, 3, -1, 8, 9, 5 from the drawing.]

Q: Can you prove that every feasible solution is a path?

A: No: it could be a path plus some cycles.

But then can reduce cost by throwing away the cycles. So optimal solution has no cycles.
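The claim can be checked numerically: solving this LP (incidence matrix as shown on the slide; the edge costs below are hypothetical, since the figure's cost-to-edge mapping is not fully legible here) returns an integral 0/1 path indicator without any integrality constraints:

```python
from scipy.optimize import linprog

# Node-arc incidence matrix from the slide
# (rows s, A, B, C, t; columns sA, sC, AB, BC, CA, Bt, Ct).
A_eq = [[ 1,  1,  0,  0,  0,  0,  0],
        [-1,  0,  1,  0, -1,  0,  0],
        [ 0,  0, -1,  1,  0,  1,  0],
        [ 0, -1,  0, -1,  1,  0,  1],
        [ 0,  0,  0,  0,  0, -1, -1]]
b_eq = [1, 0, 0, 0, -1]          # one unit of flow from s to t

# Hypothetical edge costs for sA, sC, AB, BC, CA, Bt, Ct.
c = [2, 2, 3, -1, 8, 9, 5]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 7)
print(res.x, res.fun)            # an integral 0/1 path indicator
```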

182 of 231

Some Totally Unimodular Problems

  • The following common graph problems pick a subset of edges from some graph ...
    • Shortest path from s to t in a directed graph

edge scores from drawing


Can formulate as Ax = b, x ≥ 0 so�that A matrix is totally unimodular:

Sufficient condition: Each column (for edge xij) has at most 2 nonzero entries (for i and j).

These are +1 and -1 and are in the same “half” of the matrix.

(Also okay to be both +1 or both -1 and be in different “halves.”)

       xsA  xsC  xAB  xBC  xCA  xBt  xCt
(s)     1    1    0    0    0    0    0
(j=A)  -1    0    1    0   -1    0    0
(j=B)   0    0   -1    1    0    1    0
(j=C)   0   -1    0   -1    1    0    1
(t)     0    0    0    0    0   -1   -1

[Figure: the same directed graph on nodes s, A, B, C, t, with edge costs 2, 2, 3, -1, 8, 9, 5 from the drawing.]

(this “half” is empty; can divide rows into “halves” �in any way that satisfies sufficient condition)

Not needed! (just need xij ≥ 0)

183 of 231

Some Totally Unimodular Problems

  • The following common graph problems pick a subset of edges from some graph ...
    • Maximum flow (previous problems can be reduced to this)
    • Minimum-cost flow

  • Cost matrix is rather similar to those on the previous slides, �but with additional “capacity constraints” like xij ≤ kij
  • Fortunately, if A is totally unimodular, so is A with I (the identity matrix) glued underneath it to represent the additional constraints

184 of 231

Solving Linear Programs


185 of 231

Canonical form of an LP

  • min c ⋅ x subject to Ax ≤ b, x ≥ 0
  • m constraints (rows), n variables (columns) (usually m < n)


[Diagram: the m×n matrix A times the n-vector x, compared with the m-vector b.]

So x specifies a linear combination of the columns of A.

186 of 231

Fourier-Motzkin elimination

  • An example of our old friend variable elimination.
  • Geometrically:
    • Given a bunch of inequalities in x, y, z.
    • These define a 3-dimensional polyhedron P3.
    • Eliminating z gives the shadow of P3 on the xy plane.
      • A polygon P2 formed by all the (x,y) values for which ∃z (x,y,z) ∈ P3.
    • Eliminating y gives the shadow of P2 on the x line.
      • A line segment P1 formed by all the x values for which ∃y (x,y) ∈ P2.
      • Now we know the min and max possible values of x.
    • Backsolving: Choose best x ∈ P1. For any such choice,
      • can choose y with (x,y) ∈ P2. And for any such choice,
      • can choose z with (x,y,z) ∈ P3. A feasible solution with optimal x!


Thanks to the ∃ �properties above.

Warning: P2 may have more edges than P3 has faces. That is, we’ve reduced # of variables but perhaps increased # of constraints. �As usual, might choose variable z carefully (cf. induced width).

187 of 231

Remember variable elimination for SAT?�Davis-Putnam

    • This procedure (resolution) eliminates all copies of X and ~X.
      • We’re done in n steps. So what goes wrong?
      • Size of formula can square at each step.


Resolution fuses each pair (V v W v ~X) ^ (X v Y v Z) into (V v W v Y v Z)

Justification #1: Valid way to eliminate X (reverses CNF 🡪 3-CNF idea).

Justification #2: Want to recurse on a CNF version of ((ϕ ^ X) v (ϕ ^ ~X))

Suppose ϕ = α ^ β ^ γ

where α is clauses with ~X, β with X, γ with neither

Then ((ϕ ^ X) v (ϕ ^ ~X)) = (α’ ^ γ) v (β’ ^ γ) by unit propagation

where α’ is α with the ~X’s removed, β’ similarly.

= (α’ v β’) ^ γ = (α’1 v β’1) ^ (α’1 v β’2) ^ … ^ (α’99 v β’99) ^ γ

some two clauses in ϕ

188 of 231

Fourier-Motzkin elimination

  • Variable elimination on a set of inequalities

  • minimize x + y + z
    subject to  x - z ≥ 0
                x - y ≥ 0
                  - z ≥ 2
            2z + x - y ≥ 0


example adapted from Ofer Strichman

Introduce the objective as a variable: minimize a, where a ≤ x+y+z and a ≥ x+y+z.

Solve each constraint for z:
  z ≤ x
  z ≤ -2
  z ≥ y/2 - x/2
  z ≥ a - x - y  and  z ≤ a - x - y
  x - y ≥ 0   (doesn't mention z; leave alone)

A value of z satisfying these constraints exists iff each of the lower bounds is ≤ each of the upper bounds. Eliminate z:
  y/2 - x/2 ≤ -2
  y/2 - x/2 ≤ x
  y/2 - x/2 ≤ a - x - y
  a - x - y ≤ -2
  a - x - y ≤ x
  a - x - y ≤ a - x - y   (trivially true)
  x - y ≥ 0   (unchanged)

189 of 231

Fourier-Motzkin elimination

  • Variable elimination on a set of inequalities.
  • To eliminate variable z, take each inequality involving z and solve it for z. This gives z ≥ α1, z ≥ α2, … , z ≤ β1, z ≤ β2, …
    • Each αi or βj is a linear function of the other vars a,b,…,y.
  • Replace these inequalities by αi ≤ βj for each (i,j) pair.
    • Equivalently, max_i αi ≤ min_j βj.
    • These equations are true of an assignment a,b,…,y iff it can be extended with a consistent value for z.
  • Similar to resolution of CNF-SAT clauses in Davis-Putnam algorithm! But similarly, may square the # of constraints. ☹
  • Repeat to eliminate variable y, etc.
  • If one of our equations is “a = [linear cost function],” then at the end, we’re left with just lower and upper bounds on a.
    • Now easy to min or max a! Back-solve to get b, c, … z in turn.
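A minimal implementation of one elimination step (inequalities stored as coefficient lists for a·x ≤ b; only pairs of a lower and an upper bound on the eliminated variable are combined, as described above):

```python
def eliminate(ineqs, j):
    """One Fourier-Motzkin step on a system of inequalities a.x <= b,
    each given as (coeffs, b). Returns a system over the other variables."""
    lower, upper, out = [], [], []
    for a, b in ineqs:
        if a[j] > 0:
            upper.append((a, b))        # an upper bound on x_j
        elif a[j] < 0:
            lower.append((a, b))        # a lower bound on x_j
        else:
            out.append((a, b))          # doesn't mention x_j; leave alone
    for al, bl in lower:
        for au, bu in upper:
            wl, wu = au[j], -al[j]      # positive weights cancelling x_j
            out.append(([wl * p + wu * q for p, q in zip(al, au)],
                        wl * bl + wu * bu))
    return out

# z <= 5, z >= 2 (as -z <= -2), x <= z (as x - z <= 0); variables (x, z).
shadow = eliminate([([0, 1], 5), ([0, -1], -2), ([1, -1], 0)], j=1)
print(shadow)   # includes x <= 5, the shadow on the x-axis
```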


190 of 231

Simplex Algorithm: Basic Insight

  • n variables x1, … xn
  • m constraints, plus n more for x1, … xn ≥ 0
  • Each constraint is a hyperplane
  • Every vertex of the polytope is defined by an intersection of n hyperplanes
  • Conversely, given n hyperplanes, we can find their intersection (if any) by solving a system of n linear equations in n variables
  • So, we just have to pick which n constraints to intersect
  • Sometimes we’ll get an infeasible solution (not a vertex)
  • Sometimes we’ll get a suboptimal vertex: then move to an adjacent, better vertex by replacing just 1 of the n constraints

191 of 231

From Canonical to Standard Form

  • min c ⋅ x subject to Ax ≤ b, x ≥ 0
  • m constraints (rows), n variables (columns)


[Diagram: A (m×n) times x (length n) ≤ b (length m).]

[Diagram: adding m slack variables (one per constraint) turns Ax ≤ b into Ax = b, with A now m×(n+m) and x of length n+m.]

(Sometimes # vars is still called n, even in standard form. It's usually > # constraints. I'll use n+m to denote the # of vars in a standard-form problem – you'll see why.)

192 of 231

From Canonical to Standard Form

  • min c ⋅ x subject to Ax = b, x ≥ 0
  • m constraints (rows), n+m variables (columns)

=


[Diagram: A (m×(n+m)) times x (length n+m) = b (length m).]

We can solve linear equations!

If A were square, we could try to invert it to solve for x.

But m < n+m, so there are many solutions x.

(To choose one, we min c⋅x.)

We are looking to express b as a linear combination of A’s columns.

x gives the coefficients of this linear combination.

193 of 231

Standard Form

  • min c ⋅ x subject to Ax = b, x ≥ 0
  • m constraints (rows), n variables (columns) (usually m < n)


[Diagram: A partitioned as [A′ | rest], with A′ the first m columns; the last n entries of x are set to 0, leaving A′x′ = b.]

We can solve linear equations!

If A were square, we could try to invert it to solve for x.

But m < n+m, so there are many solutions x.

(To choose one, we min c⋅x.)

=

n

m

If we set these variables to 0, we can get one solution by setting x’ = (A’)-1 b .

(A’ is invertible provided that the m columns of A’ are linearly independent.)

194 of 231

Standard Form

  • min c ⋅ x subject to Ax = b, x ≥ 0
  • m constraints (rows), n variables (columns) (usually m < n)


[Diagram: now the basis A′ is the last m columns of A; the first n entries of x are set to 0, and x′ = (A′)⁻¹ b.]

Here’s another solution via �x’ = (A’)-1 b .

n

In fact, we can get a “basic solution” like this for any basis A’ formed from m linearly independent columns of A.

This x is a “basic feasible solution” (bfs) if x ≥ 0 (recall that constraint?).

Notice that the bfs in the picture �is optimal when the cost vector is �c = (1,1,1,1,0,0,0,0,…)�Similarly, any bfs is optimal �for some cost vector.

Hmm, sounds like polytope vertices…

Remark: If A is totally unimodular, then the bfs (A’)-1 b will be integral�(assuming b is).

We can solve linear equations!

If A were square, we could try to invert it to solve for x.

But m < n+m, so there are many solutions x.

(To choose one, we min c⋅x.)
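A small numeric sketch of a basic solution (toy data, not from the slides): choose m columns as the basis, zero the other variables, and solve the resulting m×m system:

```python
import numpy as np

# Toy standard-form data: m = 2 equations, n + m = 4 variables.
A = np.array([[1., 1., 1., 0.],
              [2., 1., 0., 1.]])
b = np.array([4., 5.])

def basic_solution(cols):
    """Zero all variables outside `cols`, solve A'x' = b for the basis.
    The result is a basic feasible solution (bfs) iff all entries >= 0."""
    x = np.zeros(A.shape[1])
    x[cols] = np.linalg.solve(A[:, cols], b)
    return x

x = basic_solution([0, 1])
print(x, bool((x >= 0).all()))   # nonnegative, so this one is a bfs
```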

195 of 231

Canonical vs. Standard Form

Ax ≤ b

x ≥ 0

m inequalities

+ n inequalities

(n variables)

Ax = b

x ≥ 0

m equalities

+ n+m inequalities

(n+m variables)

add m slack variables�(one per constraint)

Eliminate last m vars

(how?)

[Diagram: Ax = b with the basis A′ in the last m columns.]

Eliminating last m vars�turns the last m “≥ 0” constraints�& the m constraints (“Ax=b”) into m inequalities (“Ax ≤ b”).

E.g., have 2 constraints on xn+m:

xn+m ≥ 0 and the last row, namely

(h1x1+…+hnxn) + xn+m = b.

To eliminate xn+m, replace them with

(h1x1+…+hnxn) ≤ b.

Multiply Ax=b through by A’-1.

This gives us the kind of Ax=b that we’d have gotten by starting with Ax ≤ b and adding 1 slack var per constraint.

Now can eliminate slack vars.

And change xn+m �in cost function to�b - (h1x1+…+hnxn). �

[Diagram: multiplying through by A′⁻¹ gives [A′⁻¹·rest | I] x = A′⁻¹·b; the basis columns become the identity matrix.]

196 of 231

Canonical vs. Standard Form

=

Pick n of the�n+m constraints �to be tight

vertex

(defined by intersecting n of the constraints, each of which reduces dimensionality by 1)

bfs

(defined by selecting n of the�variables to be 0)

Ax ≤ b

x ≥ 0

m inequalities

+ n inequalities

(n variables)

Ax = b

x ≥ 0

m equalities

+ n+m inequalities

(n+m variables)

add m slack variables�(one per constraint)

Eliminate last m vars

197 of 231

Simplex algorithm

  • Geometric interpretation
  • Move to an adjacent vertex (current vertex defined by n facets C4, C5, C6; choose to remove the C5 facet by allowing slack; now C4, C6 define an edge)
  • Computational implementation
  • Move to an adjacent bfs (add 1 basis column, remove 1)

At right, expressed an unused column C5 as linear combination of basis: C5 = C1 + 2C2 - C3.

Gradually phase in unused column C5 while phasing out C1 + 2C2 - C3, to keep Ax=b.

Easy to solve for max ε (=2) that keeps x ≥ 0.

Picked C5 because increasing ε improves cost.

=

Suppose n+m=6 and n=3

Denote A’s columns by C1…C6

x=(5,4,7,0,0,0) is the current bfs (n zeroes)

So C1,C2,C3 form a basis of Rm and Ax=b for

x=(5, 4, 7, 0, 0, 0) 5C1 + 4C2 + 7C3 = b

x=(4.9, 3.8, 7.1, 0, 0.1, 0) …

x=(4.8, 3.6, 7.2, 0, 0.2, 0) …

x=(5-ε, 4-2ε,7+ε,0, ε, 0) …

x=(3, 0, 9, 0, 2, 0) 3C1 + 9C3 + 2C5 = b� is the new bfs

Pick n of the�n+m constraints �to be tight

vertex

(defined by intersecting n of the constraints)

bfs

(defined by selecting n of the�variables to be 0 🡺 n tight constraints;�solve for slack in other constraints)

198 of 231

Canonical vs. Standard Form

Cost of origin is easy to compute (it’s a const in cost function). Eliminating a different set of m variables (picking a different basis) would rotate/reflect/squish the polytope & cost hyperplane to put a different vertex at origin, aligning that vertex’s n constraints with the orthogonal x ≥ 0 hyperplanes. This is how simplex algorithm tries different vertices!

Eliminate last m vars

Ax ≤ b

x ≥ 0

m inequalities

+ n inequalities

(n variables)

Ax = b

x ≥ 0

m equalities

+ n+m inequalities

(n+m variables)

=

Pick n of the�n+m constraints �to be tight

vertex

(defined by intersecting n of the constraints)

bfs

(defined by selecting n of the�variables to be 0 🡺 n tight constraints;�solve for slack in other constraints)

199 of 231

Simplex algorithm: More discussion

  • How do we pick which column to phase out (determines which edge to move along)?
    • How to avoid cycling back to an old bfs (in case of ties)?
  • Alternative and degenerate solutions?
  • What happens with unbounded LPs?
  • How do we find a first bfs to start at?
    • Simplex phase I: Add “artificial” slack/surplus variables to make it easy to find a bfs, then phase them out via simplex. (Will happen automatically if we give the artificial variables a high cost.)
    • Or, just find any basic solution; then to make it feasible, phase out negative variables via simplex.
    • Now continue with phase II. If phase I failed, no bfs exists for original problem, because:
      • The problem was infeasible (incompatible constraints, so quit and return UNSAT).
      • Or the m rows of A aren’t linearly independent (redundant constraints, so throw away the extras & try again).

=

Pick n of the n+m constraints to be tight

vertex

(defined by intersecting n of the constraints)

bfs

(defined by selecting n of the�variables to be 0 🡺 n tight constraints;�solve for slack in other constraints)

200 of 231

Recall: Duality for Constraint Programs


original (“primal”) problem:�one variable per letter,�constraints over up to 5 vars

[Figure: a crossword grid with cells numbered 1-13.]

slide thanks to Rina Dechter (modified)

[Figure: the dual graph; nodes are the old constraints (words), labeled by their variable scopes, e.g. (1,2,3,4,5), (5,7,11), (3,6,9,12), (8,9,10,11), (12,13), (10,13), with shared variables labeling the edges.]

transformed (“dual”) problem:�one var per word, 2-var constraints.

Old constraints 🡺 new vars

Old vars 🡺 new constraints

Warning: Unrelated to AND-OR duality from SAT

201 of 231

Duality for Linear Programs (canonical form)


Primal problem:   max c⋅x  subject to  Ax ≤ b  (m constraints),  x ≥ 0  (n variables)

    dualize ⇕

Dual problem:     min b⋅y  subject to  Aᵀy ≥ c  (n constraints),  y ≥ 0  (m variables)

Old constraints 🡺 new vars

Old vars 🡺 new constraints

202 of 231

Where Does Duality Come From?

  • We gave an asymptotic upper bound on max c⋅x �(to show that integer linear programming was in NP).
  • But it was very large. Can we get a tighter bound?

  • As with Chvátal cuts and Fourier-Motzkin elimination, let’s take linear combinations of the ≤ constraints,�this time to get an upper bound on the objective.
    • As before, there are lots of linear combinations.
    • Different linear combinations 🡺 different upper bounds.
    • Smaller (tighter) upper bounds are more useful.
    • Our smallest upper bound might be tight and equal max c⋅x .

203 of 231

Where Does Duality Come From?

  • Back to linear programming. Let’s take linear combinations of the ≤ constraints, to get various upper bounds on the objective.
  • max 4x1 + x2 subject to x1, x2 ≥ 0 and
    C1: 2x1 + 5x2 ≤ 10
    C2: x1 - 2x2 ≤ 8
  • Can you find an upper bound on the objective?
    • Hint: Derive a new inequality from C1 + 2*C2
  • What if the objective were 3x1 + x2 instead?
    • Does it help that we already got a bound on 4x1 + x2?

example from Rico Zenklusen

204 of 231

Where Does Duality Come From?

  • Back to linear programming. Let’s take linear combinations of the ≤ constraints, to get various upper bounds on the objective.
  • max 2x1 + 3x2 subject to x1, x2 ≥ 0 and
    C1: x1 + x2 ≤ 12
    C2: 2x1 + x2 ≤ 9
    C3: x1 ≤ 4
    C4: x1 + 2x2 ≤ 10
  • objective = 2x1 + 3x2 ≤ 2x1 + 4x2 ≤ 20 (2*C4)
  • objective = 2x1 + 3x2 ≤ 2x1 + 3x2 ≤ 22 (1*C1+1*C4)
  • objective = 2x1 + 3x2 ≤ 3x1 + 3x2 ≤ 19 (1*C2+1*C4)

example from Rico Zenklusen

positive coefficients�so ≤ doesn’t flip

205 of 231

Where Does Duality Come From?

  • Back to linear programming. Let’s take linear combinations of �the ≤ constraints, to get various upper bounds on the objective.
  • max 2x1 + 3x2 subject to x1, x2 ≥ 0 and
    C1: x1 + x2 ≤ 12
    C2: 2x1 + x2 ≤ 9
    C3: x1 ≤ 4
    C4: x1 + 2x2 ≤ 10
  • (y1+2y2+y3+y4)x1+ (y1+y2+2y4)x2 ≤ 12y1+ 9y2+4y3+10y4
  • Gives an upper bound on the objective 2x1 + 3x2 if y1+2y2+y3+y4 ≥ 2, y1+y2+2y4 ≥ 3
  • We want to find the smallest such bound:� min 12y1+ 9y2+4y3+10y4

example from Rico Zenklusen

General case: y1C1 + y2C2 + y3C3 + y4C4, with y1, …, y4 ≥ 0 so that the inequalities don't flip
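Finding the smallest such bound is itself an LP, the dual, and for this example the smallest upper bound equals the primal optimum. This can be checked numerically (a SciPy sketch; linprog minimizes, so the primal objective and the dual's ≥ constraints are negated):

```python
from scipy.optimize import linprog

# Primal: max 2x1 + 3x2 s.t. the four constraints above, x >= 0.
primal = linprog(c=[-2, -3],
                 A_ub=[[1, 1], [2, 1], [1, 0], [1, 2]],
                 b_ub=[12, 9, 4, 10],
                 bounds=[(0, None)] * 2)

# Dual: min 12y1 + 9y2 + 4y3 + 10y4 s.t. A^T y >= c, y >= 0
# (written as -A^T y <= -c for linprog).
dual = linprog(c=[12, 9, 4, 10],
               A_ub=[[-1, -2, -1, -1], [-1, -1, 0, -2]],
               b_ub=[-2, -3],
               bounds=[(0, None)] * 4)

print(-primal.fun, dual.fun)     # the two optima coincide (strong duality)
```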

206 of 231

Duality for Linear Programs (canonical form)


Primal problem:   max c⋅x  subject to  Ax ≤ b  (m constraints),  x ≥ 0  (n variables)

    dualize ⇕

Dual problem:     min b⋅y  subject to  Aᵀy ≥ c  (n constraints),  y ≥ 0  (m variables)

  • The form above assumes (max,≤) ⬄ (min,≥).
  • Extensions for LPs in general form:
    • Any reverse constraints ((max,≥) or (min,≤)) ⬄ negative vars
    • So, any equality constraints ⬄ unbounded vars� (can simulate with pair of constraints ⬄ pair of vars)
  • Also, degenerate solution (# tight constraints > # vars) ⬄ alternative optimal solutions (choice of nonzero vars)

207 of 231

Dual of dual = Primal�Linear programming duals are “reflective duals” (not true for some other notions of duality)


Primal problem:        max c⋅x  s.t.  Ax ≤ b  (m),  x ≥ 0  (n)

    dualize ⇕

Dual problem:          min b⋅y  s.t.  Aᵀy ≥ c  (n),  y ≥ 0  (m)

Equivalent to primal:  min (-c)⋅x  s.t.  (-A)x ≥ -b  (m),  x ≥ 0  (n)

    dualize ⇕

Equivalent to dual:    max (-b)⋅y  s.t.  (-Aᵀ)y ≤ (-c)  (n),  y ≥ 0  (m)

Just negate A, b, and c

208 of 231

Primal & dual “meet in the middle”

  • We’ve seen that for any feasible solutions x and y, c⋅x ≤ b⋅y.
    • b⋅y provides a Lagrangian upper bound on c⋅x for any feasible y.
  • So if c⋅x = b⋅y, both must be optimal!
    • (Remark: For nonlinear programming, the constants in the dual constraints are partial derivatives of the primal constraint and cost function. The equality condition is then called the Kuhn-Tucker condition. Our linear programming version is a special case of this.)
  • For LP, the converse is true: optimal solutions always have c⋅x = b⋅y!
    • Not true for nonlinear programming or ILP.


Primal problem:  max c⋅x  s.t.  Ax ≤ b  (m),  x ≥ 0  (n)
Dual problem:    min b⋅y  s.t.  Aᵀy ≥ c  (n),  y ≥ 0  (m)

209 of 231

Primal & dual “meet in the middle”


Primal:  max c⋅x  s.t.  Ax ≤ b,  x ≥ 0        Dual:  min b⋅y  s.t.  Aᵀy ≥ c,  y ≥ 0

[Diagram: a number line of objective values. Feasible primal values c⋅x lie at or below the max achievable under the primal constraints; feasible dual values b⋅y lie at or above the min achievable under the dual constraints. Values above the primal max are not feasible under the primal constraints; values below the dual min are not feasible under the dual constraints.]

c⋅x ≤ b⋅y for all feasible (x,y).

(So if one problem is unbounded, the other must be infeasible.)

210 of 231

Duality for Linear Programs (standard form)�Primal and dual are related constrained optimization problems, each in n+m dimensions


Primal problem:  max c⋅x  s.t.  Ax + s = b,  x ≥ 0  (n struct vars),  s ≥ 0  (m slack vars)
Dual problem:    min b⋅y  s.t.  Aᵀy - t = c,  y ≥ 0  (m struct vars),  t ≥ 0  (n surplus vars)

  • Now we have n+m variables and they are in 1-to-1 correspondence.
  • At primal optimality:
    • Some m “basic” vars of primal can be ≥ 0. The n non-basic vars are 0.
  • At dual optimality:
    • Some n “basic” vars of dual can be ≥ 0. The m non-basic vars are 0.
  • Complementary slackness: The basic vars in an optimal solution to one problem correspond to the non-basic vars in an optimal solution to the other problem.
  • If a structural variable in one problem > 0, then the corresponding constraint in the other problem must be tight (its slack/surplus variable must be 0).
  • And if a constraint in one problem is loose (slack/surplus var > 0), then the corresponding variable in the other problem must be 0. (logically equiv. to above)

x⋅t + s⋅y = 0
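A quick numeric check of complementary slackness, on a hand-solved toy LP of my own (not from the slides): compute the slacks s and surpluses t at the two optima and confirm x⋅t + s⋅y = 0 term by term.

```python
# Hypothetical example LP, hand-solved (not from the slides):
#   max 3x1 + 2x2   s.t.   x1 + x2 <= 4,   x1 + 3x2 <= 6,   x >= 0
# Primal optimum x* = (4, 0); dual optimum y* = (3, 0) for min 4y1 + 6y2.
A = [[1.0, 1.0], [1.0, 3.0]]
b = [4.0, 6.0]
c = [3.0, 2.0]
x = [4.0, 0.0]
y = [3.0, 0.0]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

s = [bi - dot(row, x) for row, bi in zip(A, b)]         # primal slacks s = b - Ax
t = [dot(col, y) - ci for col, ci in zip(zip(*A), c)]   # dual surpluses t = ATy - c

assert dot(c, x) == dot(b, y) == 12.0   # primal and dual meet in the middle
assert dot(x, t) + dot(s, y) == 0.0     # complementary slackness: x.t + s.y = 0
```

Here x1 > 0 forces the first dual constraint tight (t1 = 0), and the loose second primal constraint (s2 = 2) forces y2 = 0, exactly as the bullets above state.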

211 of 231

Why duality is useful for ILP


ILP problem at some node of the branch-and-bound tree (includes some branching constraints):
max c⋅x subject to Ax ≤ b, x ≥ 0, x integer

Dual of the LP relaxation: min b⋅y subject to ATy ≥ c, y ≥ 0

(Figure: objective-value number line. The max achievable for the ILP at this node lies below the max achievable under the LP relaxation, which can also be found from the dual of the LP relaxation. Instead, let's find the bound by dual simplex: as dual simplex runs, the min b⋅y achieved so far decreases toward the relaxation optimum, and every intermediate value is already a valid optimistic bound. As soon as that bound is poor enough relative to the best feasible global solution so far, we can prune this node: prune early!)
212 of 231

Multiple perspectives on duality

  1. As shown on an earlier slide: The yi ≥ 0 are coefficients on a nonnegative linear combination of the primal constraints. Shows c⋅x ≤ b⋅y, with equality iff complementary slackness holds.

  • Geometric interpretation of the above: At a primal vertex x, the cost hyperplane (shifted to go through the vertex) is a linear combination of the hyperplanes that intersect at that vertex. This is a nonnegative linear combination (y ≥ 0, which is feasible in the dual) iff the cost hyperplane is tangent to the polytope at x (doesn’t go through the middle of the polytope; technically, it’s a subgradient at x), meaning that x is optimal.

  • “Shadow price” interpretation: Optimal yi says how rapidly the primal optimum (max c⋅x) would improve as we relax primal constraint i. (A derivative.) Justify this by Lagrange multipliers (next slide). It’s 0 if primal constraint i has slack at primal optimum.

  • “Reduced cost” interpretation: Each yi ≥ 0 is the rate at which c⋅x would get worse if we phased xi into the basis while preserving Ax=b. This shows that (for an optimal vertex x), if xi > 0 then yi = 0, and if yi > 0 then xi = 0. At non-optimal x, y is infeasible in dual.

Drop the names s and t now; use standard form, but call all the variables x and y.
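The shadow-price reading of y can be checked numerically. The sketch below is my own toy construction (not from the slides): it brute-forces a 2-variable LP by trying every intersection of two constraint boundaries, then perturbs each b_i and confirms the optimum improves at rate y_i.

```python
from itertools import combinations

def solve_lp_2d(c, A, b):
    """Brute-force max c.x s.t. Ax <= b, x >= 0 for 2 variables:
    try every intersection of two constraint boundaries and keep
    the best feasible one. (Fine for toy instances only.)"""
    rows = [(A[i][0], A[i][1], b[i]) for i in range(len(A))]
    rows += [(-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]   # x1 >= 0, x2 >= 0
    best = None
    for (p, q, r), (u, v, w) in combinations(rows, 2):
        det = p * v - q * u
        if abs(det) < 1e-12:
            continue                                # parallel: no vertex
        x1 = (r * v - w * q) / det                  # Cramer's rule
        x2 = (p * w - u * r) / det
        if all(a1 * x1 + a2 * x2 <= rhs + 1e-9 for a1, a2, rhs in rows):
            val = c[0] * x1 + c[1] * x2
            best = val if best is None else max(best, val)
    return best

# Toy LP: max 3x1 + 2x2 s.t. x1 + x2 <= 4, x1 + 3x2 <= 6 (optimum 12 at (4,0)).
# Hand-computed dual optimum: y* = (3, 0).
c, A, b = [3.0, 2.0], [[1.0, 1.0], [1.0, 3.0]], [4.0, 6.0]
base, eps = solve_lp_2d(c, A, b), 1e-4
slope1 = (solve_lp_2d(c, A, [b[0] + eps, b[1]]) - base) / eps
slope2 = (solve_lp_2d(c, A, [b[0], b[1] + eps]) - base) / eps

assert abs(base - 12.0) < 1e-6
assert abs(slope1 - 3.0) < 1e-6   # relaxing the tight constraint helps at rate y1 = 3
assert abs(slope2 - 0.0) < 1e-6   # constraint 2 has slack at the optimum, so y2 = 0
```

The slack constraint's shadow price is 0, matching the bullet above: yi = 0 whenever primal constraint i has slack at the primal optimum.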

213 of 231

Where Does Duality Come From?

  • More generally, let’s look at Lagrangian relaxation: max c(x) subject to a(x) ≤ b (let x* denote the solution)

Technically, this is not the method of Lagrange multipliers.

Lagrange (18th century) only handled equality constraints. Karush (1939) and Kuhn & Tucker (1951) generalized to inequalities.

214 of 231

Where Does Duality Come From?

  • More generally, let’s look at Lagrangian relaxation: max c(x) subject to a(x) ≤ b (let x* denote the solution)
  • Try ordinary constraint relaxation: max c(x) (let x0 denote the solution)

If it happens that a(x0) ≤ b, we’re done! But what if not?

  • Then try adding a surplus penalty if a(x) > b: max c(x) - λ(a(x) – b) (let xλ denote the solution)

    • Still an unconstrained optimization problem, yay! Solve by calculus, dynamic programming, etc. – whatever’s appropriate for the form of this function. (c and a might be non-linear, x might be discrete, etc.)

Lagrangian term (penalty rate λ is a “Lagrange multiplier”)

215 of 231

Where Does Duality Come From?

  • More generally, let’s look at Lagrangian relaxation: max c(x) subject to a(x) ≤ b (let x* denote the solution)
  • Try ordinary constraint relaxation: max c(x) (let x0 denote the solution)

If it happens that a(x0) ≤ b, we’re done! But what if not?

  • Then try adding a surplus penalty if a(x) > b: max c(x) - λ(a(x) – b) (let xλ denote the solution)
    • If a(xλ) > b, then increase penalty rate λ ≥ 0 till constraint is satisfied.

Increasing λ gets solutions xλ with a(xλ) = 100, then 90, then 80 …

These are solutions to max c(x) subject to a(x) ≤ 100, 90, 80 …

So λ is essentially an indirect way of controlling b.

Adjust it till we hit the b that we want.

Each yi from LP dual acts like a λ (in fact y is λ upside down!)

216 of 231

Where Does Duality Come From?

  • More generally, let’s look at Lagrangian relaxation: max c(x) subject to a(x) ≤ b (let x* denote the solution)
  • Try ordinary constraint relaxation: max c(x) (let x0 denote the solution)

If it happens that a(x0) ≤ b, we’re done! But what if not?

  • Then try adding a surplus penalty if a(x) > b: max c(x) - λ(a(x) – b) (let xλ denote the solution)
    • If a(xλ) > b, then increase penalty rate λ ≥ 0 till constraint is satisfied.
    • Important: If λ ≥ 0 gives a(xλ) = b, then xλ is an optimal soln x*.
      • Why? Suppose there were a better soln x’ with c(x’) > c(xλ) and a(x’) ≤ b.
      • Then it would have beaten xλ: c(x’) - λ(a(x’) – b) > c(xλ) - λ(a(xλ) – b)
      • But no x’ achieved this, so no such x’ exists.

The Lagrangian term for x’ is ≤ 0, since by assumption a(x’) ≤ b.

The Lagrangian term for xλ is 0, since by assumption a(xλ) = b.

(In fact, Lagrangian actually�rewards x’ with a(x’) < b. These�x’ didn’t win despite this unfair advantage,�because they did worse on c.)

217 of 231

Where Does Duality Come From?

  • More generally, let’s look at Lagrangian relaxation: max c(x) subject to a(x) ≤ b (let x* denote the solution)
  • Try ordinary constraint relaxation: max c(x) (let x0 denote the solution)

If it happens that a(x0) ≤ b, we’re done! But what if not?

  • Then try adding a surplus penalty if a(x) > b: max c(x) - λ(a(x) – b) (let xλ denote the solution)
    • If a(xλ) > b, then increase penalty rate λ ≥ 0 till constraint is satisfied.
    • Important: If λ ≥ 0 gives a(xλ) = b, then xλ is an optimal soln x*.
      • Why? Suppose there were a better soln x’ with c(x’) > c(xλ) and a(x’) ≤ b. Then it would have beaten xλ: c(x’) - λ(a(x’) – b) > c(xλ) - λ(a(xλ) – b)
    • If λ is too small (constraint is “too relaxed”): infeasible solution. a(xλ) > b still, and c(xλ) ≥ c(x*). Upper bound on true answer (prove it!).
    • If λ is too large (constraint is “overenforced”): suboptimal solution. a(xλ) < b now, and c(xλ) ≤ c(x*). Lower bound on true answer.

Tightest upper bound: min c(xλ) over λ with a(xλ) ≥ b. See where this is going?
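A concrete instance of the too-small / just-right / too-large behavior of λ (my own example, not from the slides): take c(x) = 10x − x² subject to x ≤ 3. By calculus, the penalized problem max c(x) − λ(x − 3) is solved at xλ = (10 − λ)/2, and the true constrained optimum is x* = 3 with c(x*) = 21.

```python
def c(x):
    return 10 * x - x * x         # concave objective

def x_lambda(lam):
    # argmax of c(x) - lam*(x - 3): set the derivative 10 - 2x - lam = 0
    return (10 - lam) / 2

b, c_star = 3.0, 21.0             # constraint x <= 3; true optimum c(3) = 21

# lambda too small: solution infeasible, objective is an upper bound.
x = x_lambda(2.0)
assert x > b and c(x) >= c_star       # x = 4, c(x) = 24

# lambda just right: constraint exactly tight, true optimum recovered.
x = x_lambda(4.0)
assert x == b and c(x) == c_star      # x = 3, c(x) = 21

# lambda too large: constraint over-enforced, objective is a lower bound.
x = x_lambda(6.0)
assert x < b and c(x) <= c_star       # x = 2, c(x) = 16
```

Sweeping λ upward walks a(xλ) down through b, exactly the "λ is an indirect way of controlling b" picture from the previous slide.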

218 of 231

Where Does Duality Come From?

  • More generally, let’s look at Lagrangian relaxation: max c(x) subject to a(x) ≤ b (let x* denote the solution)
  • Try ordinary constraint relaxation: max c(x) (let x0 denote the solution). If it happens that a(x0) ≤ b, we’re done! But what if not?
  • Then try adding a surplus penalty if a(x) > b: max c(x) - λ(a(x) – b) (let xλ denote the solution)

  • Complementary slackness: “We found xλ with Lagrangian=0.”
    • That is, either λ=0 or a(xλ)=b.
    • Remember, λ=0 may already find x0 with a(x0) ≤ b. Then x0 optimal.
    • Otherwise we increase λ > 0 until a(xλ)=b, we hope. Then xλ optimal.
    • Is complementary slackness necessary for xλ to be an optimum?
      • Yes if c(x) and a(x) are linear, or satisfy other “regularity conditions.”
      • No for integer programming. a(x)=b may be unachievable, so the soft problem only gives us upper and lower bounds.


219 of 231

Where Does Duality Come From?

  • More generally, let’s look at Lagrangian relaxation: max c(x) subject to a(x) ≤ b (let x* denote the solution)
  • Try ordinary constraint relaxation: max c(x) (let x0 denote the solution). If it happens that a(x0) ≤ b, we’re done! But what if not?
  • Then try adding a surplus penalty if a(x) > b: max c(x) - λ(a(x) – b) (let xλ denote the solution)

  • Can we always find a solution just by unconstrained optimization?
    • No, not even in the linear programming case. We’ll still need the simplex method.
    • Consider this example: max x subject to x ≤ 3. Answer is x*=3.
      • But max x - λ(x-3) gives xλ = -∞ for λ > 1 and xλ = ∞ for λ < 1.
      • λ=1 gives a huge tie, where some solutions xλ satisfy constraint and others don’t.
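The failure mode above is easy to see on a grid. This sketch of the slide's example (max x subject to x ≤ 3) uses a finite integer grid as a stand-in for the real line: for λ on either side of 1 the penalized argmax runs off to an edge, and at λ = 1 every x ties.

```python
def lagrangian(x, lam):
    return x - lam * (x - 3)      # penalized objective for: max x s.t. x <= 3

grid = list(range(-100, 101))     # a finite stand-in for the real line

# lam < 1: the argmax runs off to the right edge (toward +infinity).
assert max(grid, key=lambda x: lagrangian(x, 0.9)) == 100
# lam > 1: the argmax runs off to the left edge (toward -infinity).
assert max(grid, key=lambda x: lagrangian(x, 1.1)) == -100
# lam = 1: a huge tie -- the penalized objective is the constant 3 for every x.
assert all(lagrangian(x, 1.0) == 3.0 for x in grid)
```

At λ = 1 the unconstrained problem gives no information about which tied x to pick, which is why simplex (or some other constrained method) is still needed.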


220 of 231

Where Does Duality Come From?

  • More generally, let’s look at Lagrangian relaxation: max c(x) subject to a(x) ≤ b (let x* denote the solution)
  • Try ordinary constraint relaxation: max c(x) (let x0 denote the solution). If it happens that a(x0) ≤ b, we’re done! But what if not?
  • Then try adding a surplus penalty if a(x) > b: max c(x) - λ(a(x) – b) (let xλ denote the solution)

  • How about multiple constraints? max c(x) subject to a1(x) ≤ b1, a2(x) ≤ b2
    • Use several Lagrangians: max c(x) - λ1(a1(x) – b1) - λ2(a2(x) – b2)
    • Or in vector notation: max c(x) - λ ⋅ (a(x) – b), where λ, a(x), b are vectors


221 of 231

Problem

  • Consider the following problem.
  • Message packets are sent from computers on a LAN to systems on other networks through a router. Given the network specification and request pattern, determine the queuing time at the router, the queue length that holds, say, 95% of the time, and similar metrics.
  • What information do you need?
  • How will you model the network?
  • What formulas or expressions will you use in your computation of metrics? What metrics?

B.ramamurthy


3/23/2023

222 of 231

Some More Questions

  • What happens to file retrieval time when disk utilization goes up?
  • How does response time change if both processor speed and number of users on the system are doubled?
  • How many lines should a dial-in facility of a time-sharing system have?
  • How many lines are needed for an on-line inquiry center (call center)?


223 of 231

Queuing Analysis

  • Queuing analysis is one of the most important tools for answering such questions.
  • The number of questions that can be addressed by queuing analysis is endless, and it touches every area we discussed in the operating systems course, as well as networking.
  • We will look at some practical applications of queuing analysis and some simple, rudimentary formulas.


224 of 231

Alternatives

  • Case Study: Scaling up a LAN to two buildings.
  • Actual implementation and evaluation of metrics.
  • Make a simple projection using prior experience.
  • Computer simulate a model.
  • Develop an analytic model based on queuing theory.


225 of 231

Example

  • A disk capable of transferring 1000 blocks per second is considered 50% utilized (half load) when it is transferring 500 blocks per second.
  • Response time is the time it takes to complete any incoming block request.
  • See graph.
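In numbers, this is my own sketch of the example: utilization is throughput times service time, and if we model the disk as an M/M/1-style server (an assumption not stated on the slide), the classic 1/(1 − ρ) blow-up reproduces the knee shape of the response-time graph.

```python
capacity = 1000.0                 # blocks/second the disk can transfer
service_time = 1.0 / capacity     # seconds per block

def utilization(throughput):
    return throughput * service_time

def response_time(throughput):
    # Assumed M/M/1-style model (not from the slide): T = Ts / (1 - rho).
    rho = utilization(throughput)
    return service_time / (1.0 - rho)

assert utilization(500.0) == 0.5                            # half load, as in the example
assert response_time(500.0) == 2 * service_time             # response doubles at 50% load
assert abs(response_time(900.0) - 10 * service_time) < 1e-9 # blow-up as rho -> 1
```

The blow-up near full utilization is why the graph bends sharply upward well before the disk reaches its nominal capacity.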


226 of 231

Queuing Models

  • Single server queue
  • Entities: server, clients, queue (line)
  • Parameters: arrival distribution, arrival times, service time, queue length, waiting time, server utilization (busy time / total time)


227 of 231

Queuing Parameters

λ – average arrival rate (of requests)

w – average queue length

Tw – average wait time

Ts – average service time

Tq – average time spent in the system (Tw+ Ts)

λmax = 1 / Ts – the theoretical maximum arrival rate that can be handled by the system.

See Table A.1


228 of 231

Queuing structure


(Diagram: arrivals enter a queue; a dispatch discipline selects the next item for the server; completed items depart.)

w – items waiting

Tw – waiting time

q – items in queuing system

Tq – queuing time

229 of 231

Multiple servers

  • Multi-server, single queue
  • Multi-server, multiple queues
  • Theoretical maximum input rate:

λmax = N / Ts for N servers

Let’s look at the basic queuing relationships, as well as Little’s formulas.


230 of 231

Little’s Formula


General:       q = λTq      w = λTw      Tq = Tw + Ts

Single server: ρ = λTs      q = w + ρ

Multi-server:  ρ = λTs/N    u = ρN       q = w + Nρ

q – mean number of items in the system

ρ – utilization: fraction of busy time

u – traffic intensity

w – mean number of items waiting to be served

λ – arrival rate
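The single-server relations chain together; here is a quick numeric check (the workload values are my own, chosen so the arithmetic comes out exactly):

```python
# Hypothetical single-server workload:
lam = 2.0        # arrival rate: 2 requests per second
Ts = 0.125       # average service time: 125 ms
Tw = 0.375       # average wait time in queue: 375 ms

Tq = Tw + Ts     # average time in system
q = lam * Tq     # Little's law: mean items in the system
w = lam * Tw     # Little's law: mean items waiting
rho = lam * Ts   # single-server utilization

assert Tq == 0.5
assert q == 1.0          # one request in the system on average
assert w == 0.75
assert rho == 0.25
assert q == w + rho      # the single-server identity q = w + rho
```

The identity q = w + ρ falls out algebraically: q = λ(Tw + Ts) = λTw + λTs = w + ρ.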

231 of 231

Queuing Models

  • We would like to estimate the mean and standard deviation of each of (w, Tw, q, Tq): waiting items, waiting time, queued items, and queuing time.
  • The assumptions made about a queuing model are summarized using a notation:
  • G – general distribution, M – exponential distribution, D – deterministic (fixed) rate
  • Models are named accordingly: M/M/1 refers to Poisson arrivals and exponential service with a single server.
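For M/M/1 specifically, the mean metrics have standard closed forms in terms of ρ = λTs (these are the textbook M/M/1 results, not given on this slide), and they are consistent with the general relations above:

```python
def mm1_metrics(lam, Ts):
    """Standard mean-value formulas for an M/M/1 queue (requires rho < 1)."""
    rho = lam * Ts
    assert rho < 1, "queue is unstable at or above rho = 1"
    Tq = Ts / (1 - rho)           # mean time in system
    Tw = Tq - Ts                  # mean wait in queue
    q = rho / (1 - rho)           # mean number in system
    w = q - rho                   # mean number waiting
    return rho, q, w, Tq, Tw

lam, Ts = 8.0, 0.0625             # 8 arrivals/s, 62.5 ms service -> rho = 0.5
rho, q, w, Tq, Tw = mm1_metrics(lam, Ts)

assert rho == 0.5
assert q == 1.0                   # rho/(1-rho): one customer in the system
assert Tq == 0.125                # time in system doubles at 50% load
# Little's law ties the formulas together: q = lam*Tq and w = lam*Tw.
assert q == lam * Tq
assert w == lam * Tw
```

Note how q = λTq and w = λTw hold automatically: the closed-form M/M/1 expressions are one particular solution of the general relations on the previous slide.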
