1 of 91

2 of 91

Module 3 : Geographic Modelling & Visualisation

Session 3D: Facility Location Problems

Sammi Rosser

“You’ll be back”

3 of 91

Facility Location Problems

One job of budding Data Scientists is to help their organisations in making decisions when it comes to deciding about the locations for services.

Real life healthcare examples may include

  • reviewing (and rationalising) existing service locations�
  • providing a comparison of the pros and cons of several proposed sites for a service�
  • suggesting optimal locations for new services to best tackle existing inequalities or areas of poor access�
  • suggesting which existing services can be closed down while having the least impact on access and equality

4 of 91

Facility Location Problems

“In health care, the implications of poor location decisions extend well beyond cost and customer service considerations.

If too few facilities are utilized and/or if they are not located well, increases in morbidity (infections) can have a major impact on public health.

Thus, facility location modelling takes on an even greater importance when applied to the siting of health care facilities.”��- Meskarian R, Penn ML, Williams S, Monks T (2017)

Meskarian R, Penn ML, Williams S, Monks T (2017) A facility location model for analysis of current and future demand for sexual health services. PLoS ONE 12(8): e0183942. https://doi.org/10.1371/journal.pone.0183942

5 of 91

An Example

Meskarian R, Penn ML, Williams S, Monks T (2017) A facility location model for analysis of current and future demand for sexual health services. PLoS ONE 12(8): e0183942. https://doi.org/10.1371/journal.pone.0183942

The project aimed to assist in planning the sexual health service provision in Hampshire.

1. They forecast future demand for different aspects of the service over the next 3–5 years, based on population age and gender projections published by the council.

2. They performed a facility location analysis based on the current and forecasted demand to identify an optimal number of clinics and their geographic location.

The project aimed to �reduce the number of clinics �while maintaining/improving service access

6 of 91

An Example

Meskarian R, Penn ML, Williams S, Monks T (2017) A facility location model for analysis of current and future demand for sexual health services. PLoS ONE 12(8): e0183942. https://doi.org/10.1371/journal.pone.0183942

Scenario

1

Of 28 existing clinic locations, which should be kept open to maximise access?

Aim is for people to be able to access within

  1. 30 minutes on public transport�
  2. 15 minutes drive

2

Of 28 existing clinic locations or any other potential postcode sector, where should clinics be located to maximise access?

(giving 305 potential candidate locations!)

7 of 91

An Example

Meskarian R, Penn ML, Williams S, Monks T (2017) A facility location model for analysis of current and future demand for sexual health services. PLoS ONE 12(8): e0183942. https://doi.org/10.1371/journal.pone.0183942

There were 278 ‘demand points’ where people could be coming from - these were all of the postcode sectors.

8 of 91

An Example

Meskarian R, Penn ML, Williams S, Monks T (2017) A facility location model for analysis of current and future demand for sexual health services. PLoS ONE 12(8): e0183942. https://doi.org/10.1371/journal.pone.0183942

Proposed 20 facilities to allow public transport travel targets to be met for 90% of the population

Meeting travel time target for 100% of the population would require an additional 5 facilities - is it worth it?

Meanwhile, only 14 clinics are required if going by car travel times to achieve 99% coverage of being able to drive to a clinic in 15 minutes.

If you increase this to ‘within 20 minutes’, this solution covers 100% of the population

9 of 91

An Example

Meskarian R, Penn ML, Williams S, Monks T (2017) A facility location model for analysis of current and future demand for sexual health services. PLoS ONE 12(8): e0183942. https://doi.org/10.1371/journal.pone.0183942

Findings

  • The clinic locations could be reduced from 28 to 20 and still keep 90% of all patient journeys by public transport (e.g. by bus or train) to a clinic within 30 minutes.

  • The number of clinics could be further reduced to 8 if the travel time is based on car travel times within 15 minutes.

Limitations

  • The public transport analysis assumed everyone would travel by public transport - but this is unrealistic (and census data showed over 85% of the population across the county having access to 1+ cars)

10 of 91

An Example

Meskarian R, Penn ML, Williams S, Monks T (2017) A facility location model for analysis of current and future demand for sexual health services. PLoS ONE 12(8): e0183942. https://doi.org/10.1371/journal.pone.0183942

You get diminishing returns with more and more clinics

11 of 91

A powerful decision aid

You can see that these sorts of analyses can be really powerful for �optimising the use of healthcare resources…

So how do we go about formulating them?

12 of 91

Discrete vs Continuous Location Allocation Models

Assume that demands can be aggregated to a �finite number of discrete points.

To achieve this, a region could be split up into a series of points (e.g. the central point of postcodes or LSOAs)

They also assume that there is a �finite set of candidate locations where facilities will be placed.

Meskarian R, Penn ML, Williams S, Monks T (2017) A facility location model for analysis of current and future demand for sexual health services. PLoS ONE 12(8): e0183942. https://doi.org/10.1371/journal.pone.0183942

Discrete Models

* Alternative is continuous models, where locations could be anywhere within a range - we don’t cover this!

13 of 91

Formulating a Location-Allocation Problem

p-median problem

Location set covering problem

Maximal covering location problem

I want to place 6 or fewer facilities that will minimise the overall travel time of everyone who might access the service

If we have more people coming from somewhere in particular, �I want to take that into account

I want to place as many services as required �to make sure that everyone can access a service �in 30 minutes or less

I want to place 6 or fewer facilities so that �as many people as possible �can get to the service in 30 minutes or less

14 of 91

Formulating a Location-Allocation Problem: Definition Reference

p-median problem

A.k.a. Weighted average

Given discrete demands (e.g. demand per LSOA)

Locate a number (p or less) of health facilities, minimising the total weighted travel distance/time

We are aiming to minimise the the total weighted travel distance or time between facilities and ‘demand centres’ (i.e. where the demand for services - i.e. the patients - are going to be travelling from)

Location set covering problem

Find the minimum number of facilities (and their locations)

such that each and every demand centre is covered by at least one facility within a given maximal service distance/time

Maximal covering location problem

Locate the facilities in such a way that

as few people as possible lie outside the desired service distance/time

15 of 91

p-Median

A p-median problem is basically just one where we care about

locating p services

(so p could be 1, 2, 3, 5, 25 - whatever!)

in such a way that the weighted average travel time (or distance) of users is minimized

16 of 91

p-Median

17 of 91

p-Median

Car icons created by fjstudio - Flaticon (https://www.flaticon.com/free-icons/car)

Journey A takes 10 minutes

18 of 91

p-Median

Car icons created by fjstudio - Flaticon (https://www.flaticon.com/free-icons/car)

Journey A takes 10 minutes

Journey B takes 30 minutes

Average

= (10 + 30) / 2

= 20 minutes

19 of 91

p-Median

Car icons created by fjstudio - Flaticon (https://www.flaticon.com/free-icons/car)

But what if a lot of people live in St Just and do journey A, but only one lives in Sithney and does journey B?

20 of 91

p-Median

Car icons created by fjstudio - Flaticon (https://www.flaticon.com/free-icons/car)

Weighted Average

A: 10 mins x 5 = 50

B: 30 mins x 1 = 30

50 + 30 = 80

Weighted Average = 80 / 6 trips = 13.3 mins

But what if the demand was distributed differently?

21 of 91

p-Median

Car icons created by fjstudio - Flaticon (https://www.flaticon.com/free-icons/car)

Weighted Average

A: 10 mins x 1 = 10

B: 30 mins x 5 = 150

10 + 150 = 160

Weighted Average = 160 / 6 trips = 26.6 mins

If this was the demand distribution instead, then this might not be such a good location for the hospital

22 of 91

p-Median

Weighted Average

160 / 6 trips

26.6 mins

Weighted Average

80 / 6 trips

13.3 mins

23 of 91

Limitations of p-Median

However - one problem with this approach is that it doesn’t look at the maximum travel time.

For certain services, like accident and emergency, stroke units, or anything else where being able to get to a service within a certain length of time is crucial, then you may be less concerned about minimizing the average length.

Instead, you may care more about ensuring that everyone in your catchment area isn’t more than a certain number of miles or minutes away.

10 minutes

15 minutes

45 minutes

Just focussing on the weighted average time can lead to an inequitable solution.

24 of 91

Location Set Covering Problem and

Maximal Covering Location Problem

Location Set

Covering Problem

Maximal Covering Location Problem

How can we locate the facilities in such a way that as few people as possible

don’t have a service within the target distance (or time)

What is the minimum number of facilities (and their locations) so that all people have at least one facility within a given maximum distance (or time)

* If we cannot cover all demands because it’s too expensive to do so, the model leans towards covering areas that generate lots of demand rather than those that generate less.

Meskarian R, Penn ML, Williams S, Monks T (2017) A facility location model for analysis of current and future demand for sexual health services. PLoS ONE 12(8): e0183942. https://doi.org/10.1371/journal.pone.0183942

25 of 91

In Reality…

You might care about all of these to some extent!

You might find that your best solution to the p-median problem has a really long maximum travel time…

But your second best solution actually has a much more reasonable maximum travel time.

Average Weighted Time

Maximum travel time

Solution 1

16.7 minutes

49 minutes

Solution 2

17.1 minutes

31 minutes

Solution 3

19.3 minutes

45 minutes

Make sure you’re not omitting this information when presenting the output to stakeholders!

It can be crucial to help them make an informed, fair and safe decision.

26 of 91

In Reality…

You could even consider ranking possible combinations by multiple methods and then seeing which has the lowest overall rank (indicating the best performance against multiple objectives)

6

23

14

26

19

36

Table Ordered by Avg Weighted Travel Time - You can see the total ranks on the right don’t match this ordering!

27 of 91

So… how do we actually start calculating the best locations?

We focus on the p-median problem in the course.

28 of 91

So… how do we actually start calculating the best locations?

Much of the time, you will start with a series of known locations for services.

These will either be:�

  • Proposed locations for services
    • Managers will often consider a range of possibilities based on buildings owned by the organisation, areas with buildings available for rent, and so on, before you receive a list of possibilities
    • Alternatively, you can have more flexibility (e.g. allowing the centroid of each LSOA to be a possible location)�
  • Locations of existing services�
  • A mix of the two

29 of 91

So… how do we actually start calculating the best locations?

You’ll also need a travel matrix, with

  • the sources of demand (where people will be travelling from) as rows
  • the destinations (all the possible clinic locations) as the columns
  • travel times or distances as the values.

Sound familiar…?

Yep - that sounds like the travel matrices from the last session!

Possible site location 1

Possible site location 2

Possible site location 3

Possible site location 4

LSOA 1

23.7

26.2

19.1

42.1

LSOA 2

26.5

28.9

14.2

41.5

LSOA 3

31.3

37.1

21.3

18.7

LSOA 4

18.1

15.4

35.6

28.2

LSOA 5

12.1

25.6

23.2

47.1

30 of 91

So… how do we actually start calculating the best locations?

When we did our travel maps before, we just worked out the shortest possible distance

Possible site location 1

Possible site location 2

Possible site location 3

Possible site location 4

Shortest

LSOA 1

23.7

26.2

19.1

42.1

19.1

LSOA 2

26.5

28.9

14.2

41.5

14.2

LSOA 3

31.3

37.1

21.3

18.7

18.7

LSOA 4

18.1

15.4

35.6

28.2

15.4

LSOA 5

12.1

25.6

23.2

47.1

12.1

31 of 91

So… how do we actually start calculating the best locations?

But now, we want to work out the shortest distance depending on which clinics are available

Possible site location �1

Possible site location �3

Shortest

LSOA 1

23.7

19.1

19.1

LSOA 2

26.5

14.2

14.2

LSOA 3

31.3

21.3

21.3

LSOA 4

18.1

35.6

18.1

LSOA 5

12.1

23.2

12.1

Possible site location �1

Possible site location 4

Shortest

LSOA 1

23.7

42.1

23.7

LSOA 2

26.5

41.5

26.5

LSOA 3

31.3

18.7

18.7

LSOA 4

18.1

28.2

18.1

LSOA 5

12.1

47.1

12.1

32 of 91

So… how do we actually start calculating the best locations?

We then need to be able to represent these as an integer array

Let’s say we currently have four locations we deliver routine health screenings from.

Due to constraints, we’re going to have to cut it down to three.

Icons created by Pixel perfect - Flaticon: https://www.flaticon.com/packs/nature-46

Leafy Lane Practice

Cactus Heights Surgery

Flowerpot Road Practice

Sunny Mountain Surgery

33 of 91

So… how do we actually start calculating the best locations?

What are all the possible combinations of three practices?

Icons created by Pixel perfect - Flaticon: https://www.flaticon.com/packs/nature-46

Note that the order doesn’t matter

=

And each practice can only appear once

34 of 91

So… how do we actually start calculating the best locations?

Now - how do we turn these combinations into something the computer can understand?

Let’s give each practice a numeric ID - remembering Python starts from 0.

Icons created by Pixel perfect - Flaticon: https://www.flaticon.com/packs/nature-46

Leafy Lane Practice

Cactus Heights Surgery

Flowerpot Road Practice

Sunny Mountain Surgery

0

1

2

3

0

1

3

2

0

2

3

1

1

2

3

0

0

1

2

3

35 of 91

So… how do we actually start calculating the best locations?

Potential_combinations = [

[0, 1, 2]

[0, 1, 3],

[0, 2, 3],

[1, 2, 3]

]

In our code, we will write this as a list of lists.

0

1

2

3

0

1

3

2

0

2

3

1

1

2

3

0

36 of 91

So… how do we actually start calculating the best locations?

And in fact, we won’t actually write it ourselves at all…

We’ll get the combinations function from the itertools package to do that for us - via a custom function.

37 of 91

What is this function doing?

facility = np.arange(n_facilities, dtype=np.uint8)

facility = np.arange(4, dtype=np.uint8)

> array([0, 1, 2, 3], dtype=uint8)

combinations(facility, p)

combinations(array([0, 1, 2, 3], dtype=uint8), 3)

> <itertools.combinations at 0x7f14b4304950>

First, it creates an array to represent all the possible facilities, based on the number of facilities we pass in.

Next, it works out all possible combinations and returns this as a special type of object.

38 of 91

So… how do we actually start calculating the best locations?

[a for a in combinations(facility, p)]

[

(0, 1, 2),

(0, 1, 3),

(0, 2, 3),

(1, 2, 3)

]

<itertools.combinations at 0x7f14b4304950>

Let’s turn it into a nicer output.

We would get this as our output.

39 of 91

So… how do we actually start calculating the best locations?

However, rather than using standard python lists or tuples, we’ve asked the function to return numpy arrays for each combination.

return [np.array(a) for a in combinations(facility, p)]

[[0, 1, 2],

[0, 1, 3],

[0, 2, 3],

[1, 2, 3]]

This just ensures things remain as speedy as possible when we get to larger number of combinations.

But fundamentally these two are the same.

40 of 91

So… how do we actually start calculating the best locations?

It turns out someone forgot to tell us about one of the practices!

But they are still just looking to drop it down to 3 practices.

So how many combinations can we get now?

Icons created by Pixel perfect - Flaticon: https://www.flaticon.com/packs/nature-46

Leafy Lane Practice

Cactus Heights Surgery

Flowerpot Road Practice

Sunny Mountain Surgery

Pine Tree Partnership

41 of 91

Work out every possible combination of these practices.

How many are there in total?

You might find it helpful to use numeric IDs for the practices.

Exercise 1

We’ll work on this for 3 minutes

Then there will be an anonymous poll

Leafy Lane Practice

Cactus Heights Surgery

Flowerpot Road Practice

Sunny Mountain Surgery

Pine Tree Partnership

0

1

2

3

4

42 of 91

So… how do we actually start calculating the best locations?

What are all the possible combinations of three practices?

Icons created by Pixel perfect - Flaticon: https://www.flaticon.com/packs/nature-46

43 of 91

So… how do we actually start calculating the best locations?

What are all the possible combinations of three practices?

Icons created by Pixel perfect - Flaticon: https://www.flaticon.com/packs/nature-46

0 1 2

0 1 3

0 1 4

0 2 3

0 2 4

0 3 4

1 2 3

1 2 4

1 3 4

2 3 4

10 possible combinations

44 of 91

So… how do we actually start calculating the best locations?

That was a bit of a pain to work out… so you can see why it’s really useful to have a function that will work this out for us when we get to bigger numbers of combinations!

But just how big can the number of combinations get?

45 of 91

So… how do we actually start calculating the best locations?

We can use the formula below to work out the total number of possible combinations in a scenario where the order of options doesn’t matter and you can’t repeat options in an answer.

number of options!

(number of optionsnumber of items to select)! x number of items to select!

The ! isn’t just emphasising things here.

It is the mathematical symbol for factorial.

So if we have 5 options, the top of our fraction is 5!, or 5 factorial.

This is 5 x 4 x 3 x 2 x 1

46 of 91

So… how do we actually start calculating the best locations?

2! = 2 x 1

3! = 3 x 2 x 1

10! = 10 x 9 x 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1

47 of 91

So… how do we actually start calculating the best locations?

So where we have 4 options and are selecting 3, this is

number of options!

(number of optionsnumber of items to select)! x number of items to select!

Where we have 5 options and are selecting 3, this is

4!

(43)! x 3!

4 x 3 x 2 x 1

(1) x (3 x 2 x 1)

24

6

= 4

5!

(53)! x 3!

5 x 4 x 3 x 2 x 1

(2 x 1) x (3 x 2 x 1)

120

12

= 10

48 of 91

So… how do we actually start calculating the best locations?

The key thing to note is that the number of combinations can get out of hand very quickly…

If you’re working at the level of a single trust, or with some fairly centralised services for an ICB, you’ll probably be ok.

But if I’m a national commissioner looking to close 1 of 30 services, then there are…

30 possible combinations

49 of 91

So… how do we actually start calculating the best locations?

If they instead decide to close 2 of 30 services, then there are…

435 possible combinations

☕☕☕

If they instead decide to close 3 of 30 services, then there are…

4060 possible combinations

😴🗓️

50 of 91

So… how do we actually start calculating the best locations?

If they instead decide to close 4 of 30 services, then there are…

27,405 possible combinations

If they instead decide to close 5 of 30 services, then there are…

142,506 possible combinations

😱😱😱😱😱😱😱

😱😱😱

51 of 91

So… how do we actually start calculating the best locations?

If there are 100 services and they want to close 5, there are�

84,000,000

75,287,520 possible combinations

52 of 91

So… how do we actually start calculating the best locations?

We’ll briefly cover this concept a bit later today, but we won’t cover how to do it in code or go into much detail about the process.

However,

  • we will provide a range of materials if you want to dive into this in your own project
  • there are some members of the HSMA community with experience in this area if this is a route you need to go down for your project.

There are ways to deal with situations where there are too many possibilities to ‘brute force’ - but it’s quite a tricky area to wrap your head around!

53 of 91

So… how do we actually start calculating the best locations?

Right - now we know the combinations we want to test and how long it takes to get to each clinic location, we’ll need some sort of demand figures.

Chances are your demand isn’t spread evenly across your potential service user catchment area.

  • Maybe certain LSOAs have more people of childbearing age, so maternity and paediatric services will get more demand from those areas�
  • Maybe certain LSOAs have an older population, so stroke services are more centred around there�
  • Maybe other LSOAs have a higher incidence of obesity, so you might expect more demand for type 2 diabetes services and retinopathy clinics

54 of 91

So… how do we actually start calculating the best locations?

Your demand data can be formatted quite simply!

You just want to make sure that you’re using the same level of location data as in your travel matrix (e.g. they need to both use LSOA, or both use postcode sectors - not a mixture)

Demand

LSOA 1

25

LSOA 2

15

LSOA 3

3

LSOA 4

12

LSOA 5

25

Possible site location 1

Possible site location 2

Possible site location 3

Possible site location 4

Shortest

LSOA 1

23.7

26.2

19.1

42.1

19.1

LSOA 2

26.5

28.9

14.2

41.5

14.2

LSOA 3

31.3

37.1

21.3

18.7

18.7

LSOA 4

18.1

15.4

35.6

28.2

15.4

LSOA 5

12.1

25.6

23.2

47.1

12.1

Our demand dataframe

Our travel dataframe

55 of 91

So… how do we actually start calculating the best locations?

You might use historic demand figures as your starting point.

But always consider…

  • If you seem to have areas with no historic demand, is this actually a pattern, or is it just that the historic numbers are small and there’s an element of chance?�
  • Is there some preexisting bias or issue with access that you might be perpetuating by using the historic figures?

56 of 91

So… how do we actually start calculating the best locations?

How can you deal with these issues?

If you’re using historic data, can you bring in more years?

Alternatively, you could explore what happens when you just weight demand by the number of people in each LSOA.

If your service only covers a certain age range, you can find LSOA-level figures estimating the proportion of the population in that age group.

Travel times weighted by historic TIA clinic demand

Weighted by theoretical demand by LSOA

57 of 91

So… how do we actually start calculating the best locations?

You might also just make predictions of demand based off something else.

  • Number of people in a target age demographic or of a particular gender in your LSOAs, multiplied by some factor
    • (e.g. 20% of women aged 18-34)

58 of 91

So… how do we actually start calculating the best locations?

Just always consider whether what you’re doing is

- appropriate

- sensible

- transparent

- justifiable

Make it clear what data you have used and why!

Make the limitations of your conclusions explicit.

59 of 91

So… how do we actually start calculating the best locations?

Right! We’ve got everything we need - except a way to evaluate the weighted average travel time…

Cast your mind back many many slides - when we present our options, we want to give more weight to the travel times from areas that have higher demand.

60 of 91

So… how do we actually start calculating the best locations?

Let’s write some code to do this.

It’s something we’re going to want to reuse quite a lot - so we can turn it into a class with…

Methods

    • Evaluate a solution
      • Limit the travel dataframe to a subset of candidate locations
      • Work out the smallest value per location (the ‘minimum cost’)
      • Return a dataframe with demand and min cost per region�
    • Return the weighted average for the solution�
    • Return the unweighted average for the solution�
    • Return the maximum travel time for any locations in the solution

Attributes

  • our travel dataframe �
  • our demand dataframe�

61 of 91

So… how do we actually start calculating the best locations?

demand is our dataframe containing demand sources (e.g. LSOAs) as rows and the demand (e.g. yearly patients coming from that LSOA) as the single column

merge_col is the name of the column that is consistent between our demand and travel dataframes

travel_matrix is the dataframe with demand sources as rows and potential service locations as columns, with the travel cost (time) as the values in the cells

demand_col is a string referring to the column that contains the demand value in the demand dataframe

62 of 91

So… how do we actually start calculating the best locations?

This method, given a list of sites in the form of their column indices (e.g. [1, 4, 5] for sites 2, 5 and 6), will return a dataframe with the demand and minimum travel time per row for this potential solution.

63 of 91

So… how do we actually start calculating the best locations?

This method, given a list of sites in the form of their column indices (e.g. [1, 4, 5] for sites 2, 5 and 6)

  • Runs the evaluate_solution method
  • Returns a range of metrics relating to the problem

64 of 91

So… how do we actually start calculating the best locations?

Let’s take a look at the demand and travel matrices we’re passing in.

They become attributes of our class.

65 of 91

So… how do we actually start calculating the best locations?

This is the output of the evaluate_solution method.

It’s limited the sites and added the min_cost column based on the available sites.

66 of 91

So… how do we actually start calculating the best locations?

And now we can easily pass in a range of solutions - including with different numbers of sites.

Note the column names changing to reflect the site indices we are passing in.

67 of 91

So… how do we actually start calculating the best locations?

We can then return the metrics for the solutions in each case…

… and easily access a single metric like the weighted average travel time.

68 of 91

So… how do we actually start calculating the best locations?

Now we bring in code to calculate every possible combination of a certain number of facilities.

We could even loop through this to find every combination with every possible number of facilities…

69 of 91

So… how do we actually start calculating the best locations?

Now we can loop through every possible combination and save the outputs.

70 of 91

So… how do we actually start calculating the best locations?

And easily turn our list of dictionaries into a dataframe!

71 of 91

So… how do we actually start calculating the best locations?

Then it’s easy to find the best combinations.

72 of 91

So… how do we actually start calculating the best locations?

And, since we stored the combinations in our dataframe too, we can easily pull this out by chaining together a few steps.

73 of 91

So… how do we actually start calculating the best locations?

We can now put this all together with the map plotting skills we’ve learned over the last few sessions.

74 of 91

So… how do we actually start calculating the best locations?

When plotting our sites as well, we can ensure we are only plotting the sites included in our solution.

75 of 91

So… how do we actually start calculating the best locations?

This first line ensures we just pull out the dataframe for the best solution

This line then ensures we’re just pulling out the site indices for the best sites, and filtering our site dataframe to just those.

76 of 91

So… how do we actually start calculating the best locations?

And then it doesn’t take much effort to create a map showing weighted average travel times for every possible solution!

77 of 91

So… how do we actually start calculating the best locations?

The key thing is that we just iterate through our dataframe of all solutions - pulling one row out at a time.

If we order them by weighted average first, the map in the top left is the best solution.

Then it’s just our usual plot code - making sure to specify the axis we are plotting on to.

78 of 91

So… how do we actually start calculating the best locations?

Another benefit of saving all these different things is that we can then easily add the weighted average travel time to the title of each plot!

79 of 91

So… how do we actually start calculating the best locations?

Here is the full code for this plot of solutions.

80 of 91

So… how do we actually start calculating the best locations?

Consider other ways to display the outputs too.

81 of 91

BONUS: What do you do when it gets too big?

Before we go onto really complex algorithms, we can try a simple heuristic.

Heuristic =

“any approach to problem solving that employs a practical method that is not fully optimized, perfected, or rationalized, but is nevertheless sufficient for reaching an immediate, short-term goal or approximation”

There’s no guarantee that the solution from a heuristic will be the optimal one - or necessarily even that close to being the optimal one!

Wikipedia contributors. (2024, February 28). Heuristic. In Wikipedia, The Free Encyclopedia. Retrieved 17:22, March 20, 2024, from https://en.wikipedia.org/w/index.php?title=Heuristic&oldid=1210861290

82 of 91

What do you do when it gets too big?

What we’re going to do is try a series of random solutions.

While we might not manage to find the perfect solution, we might well find a ‘good enough’ one.

We’ll set a maximum number of solutions that we’ll try.

We’ll keep track of the best solution we’ve found so far as we go along.

83 of 91

What do you do when it gets too big?

Let’s bring in a larger dataset.

Here’s we’ve got a sexual health clinic dataset with 306 possible clinic locations (some existing clinics + centroids of LSOAs without a clinic currently).

They want to place 20 clinics.

This gives us

11,290,838,603,408,335,008,742,942,703,616 options

(that’s eleven nonillion options)

84 of 91

What do you do when it gets too big?

In this code, we store the best ‘cost’ (lowest weighted travel time) we’ve found so far.

We then just keep asking for random solutions, evaluate them, keep going a specified number of times, and track the best solution.

85 of 91

What do you do when it gets too big?

We could extend this code to save the output of every run - but be aware this isn’t a good idea for large very large numbers of restarts or very large areas (that therefore have big geodataframes)

86 of 91

What do you do when it gets too big?

306 possible clinic locations

20 clinics to place

11,290,838,603,408,335,008,742,942,703,616 options

Iterations

Time

Overall Weighted Travel Time of Best Solution Found

1000

~ 10 seconds

6.01 minutes

10,000

~ 45 seconds

5.69 minutes

100,000

~ 6 minutes

5.47 minutes

1,000,000

~ 36 minutes

~ 5.3 minutes

87 of 91

What do you do when it gets too big?

We can look at how the solution finding progresses over time

1000 iterations

10,000 iterations

100,000 iterations

1,000,000 iterations

88 of 91

Is there any way we can get closer to the optimum solution?

Evolutionary Algorithms�

Linear Programming

89 of 91

Before we go…

We won’t have time today, but I will be providing a video on genetic and evolutionary algorithms for larger-scale optimization problems, as well as alternatives like linear programming.

If you are interested, the slides for this section are available here: https://docs.google.com/presentation/d/1YLGLE3DnzCVv9OnAimtM7iueAsXy_edXrL67D6DbbyA/edit?usp=sharing

90 of 91

Geographic Visualisation and Optimization Masterclasses

In phase 2, we hope to provide optional additional sessions on�

  • Geostatistics (hotspot/�coldspot/�outlier detection)�
  • Multiobjective optimization problems

91 of 91

Open up the notebook

HSMA 3D Exercise 2 - Facility Location Problems.ipynb

and work through this individually.

Don’t forget to have someone sharing their screen!

Exercise 2

Take a 10 minute break now, and then �we’ll work on this for 75 minutes