3 of 16

Monte Carlo Simulation

Current method for selecting alternatives from discrete choice models
A probability is calculated for each alternative as its exponentiated utility divided by the sum of the exponentiated utilities of all alternatives
The cumulative probability distribution is created
A pseudorandom number* is generated from a uniform distribution and used to select an alternative from the cumulative probability distribution

* Pseudorandom because the random number is generated by a deterministic algorithm (“Mersenne Twister”).
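
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the production implementation: the function names and the seed are assumptions made for the example. Python's standard `random` module is used because it is also based on the Mersenne Twister.

```python
import math
import random


def choice_probabilities(utilities):
    """Multinomial logit probabilities from a dict of {alternative: utility}."""
    exp_utils = {alt: math.exp(u) for alt, u in utilities.items()}
    total = sum(exp_utils.values())
    return {alt: eu / total for alt, eu in exp_utils.items()}


def monte_carlo_choice(utilities, draw):
    """Return the first alternative whose cumulative probability exceeds the draw."""
    cumulative = 0.0
    for alt, p in choice_probabilities(utilities).items():
        cumulative += p
        if draw < cumulative:
            return alt
    return alt  # guards against floating-point round-off when the draw is near 1.0


# Hypothetical utilities and seed, chosen only for illustration.
utilities = {"auto": -0.69315, "walk": -1.38629, "transit": -1.38629}
rng = random.Random(42)
chosen = monte_carlo_choice(utilities, rng.random())
```

One uniform draw is needed per choice, regardless of the number of alternatives.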

4 of 16

Monte Carlo Simulation

Mode	utility	exp(util)	probability
auto	-0.69315	0.50	0.50
walk	-1.38629	0.25	0.25
transit	-1.38629	0.25	0.25
total		1.00	1.00

Random number draw = 0.49

Choice = auto

Simple mode choice model

5 of 16

Monte Carlo Simulation: base versus build

Random number draw = 0.49

Build choice = walk

	Base			Build
Mode	utility	exp(util)	probability	utility	exp(util)	probability
auto	-0.6931	0.5000	0.5000	-0.6931	0.5000	0.3909
walk	-1.3863	0.2500	0.2500	-1.3863	0.2500	0.1954
transit	-1.3863	0.2500	0.2500	-0.6363	0.5293	0.4137
total		1.0000	1.0000		1.2793	1.0000

Problem: The choice for this decision-maker changed from auto to walk, even though the walk utility did not change and the walk probability decreased

Base versus build utilities and probabilities
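
To see why the choice flips, it helps to look at where the draw falls on the cumulative probability scale in each scenario. The sketch below uses the utilities from the table above; the function names are assumptions made for the example.

```python
import math


def cumulative_intervals(utilities):
    """Return {alternative: (lower, upper)} bounds on the cumulative probability scale."""
    exp_utils = {alt: math.exp(u) for alt, u in utilities.items()}
    total = sum(exp_utils.values())
    intervals, lower = {}, 0.0
    for alt, eu in exp_utils.items():
        upper = lower + eu / total
        intervals[alt] = (lower, upper)
        lower = upper
    return intervals


def select(intervals, draw):
    """Return the alternative whose interval contains the draw."""
    for alt, (lower, upper) in intervals.items():
        if lower <= draw < upper:
            return alt
    return alt  # reached only through floating-point round-off


base = {"auto": -0.6931, "walk": -1.3863, "transit": -1.3863}
build = {"auto": -0.6931, "walk": -1.3863, "transit": -0.6363}
draw = 0.49  # the same draw is reused in both scenarios

base_intervals = cumulative_intervals(base)
build_intervals = cumulative_intervals(build)
base_choice = select(base_intervals, draw)
build_choice = select(build_intervals, draw)
print(base_choice, build_choice)  # prints "auto walk"
```

The walk interval gets narrower in the build scenario, but it also shifts downward because the auto interval ahead of it shrinks, so the unchanged draw of 0.49 now lands inside it.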

6 of 16

Monte Carlo Simulation: base versus build

	build choice
base choice	auto	walk	transit	total	percent
auto	392	100	0	492	49%
walk	0	82	167	249	25%
transit	0	0	259	259	26%
total	392	182	426	1000	100%
percent	39%	18%	43%

Marginals are fine
The problem is the 100 decision-makers who switched from auto to walk
Conclusion:

Current approach can be used to analyze aggregate changes from baseline to build
Current approach cannot be used to cross-tabulate changes from baseline to build for individual decision-makers

Cross tabulation of 1k tours by base versus build choice
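
A cross tabulation of this kind can be reproduced with a short simulation. The sketch below is an assumption-laden illustration: the seed is arbitrary, so the counts will not match the table above exactly, but the pattern of switches from auto to walk and from walk to transit is the same.

```python
import math
import random
from collections import Counter


def monte_carlo_choice(utilities, draw):
    """Select an alternative from the cumulative probability distribution."""
    exp_utils = {alt: math.exp(u) for alt, u in utilities.items()}
    total = sum(exp_utils.values())
    cumulative = 0.0
    for alt, eu in exp_utils.items():
        cumulative += eu / total
        if draw < cumulative:
            return alt
    return alt  # guards against floating-point round-off


base = {"auto": -0.6931, "walk": -1.3863, "transit": -1.3863}
build = {"auto": -0.6931, "walk": -1.3863, "transit": -0.6363}

rng = random.Random(1)  # arbitrary seed, an assumption for this example
crosstab = Counter()
for _ in range(1000):
    draw = rng.random()  # one draw per tour, reused in base and build
    crosstab[(monte_carlo_choice(base, draw), monte_carlo_choice(build, draw))] += 1

modes = list(base)
for base_mode in modes:
    print(base_mode, [crosstab[(base_mode, build_mode)] for build_mode in modes])
```

Rows are the base choice and columns are the build choice, in the order auto, walk, transit.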

7 of 16

Simulation with explicit error terms

The utility for each alternative in a random-utility discrete choice model is composed of a measurable component and an error term

U_i = U_m,i + e_i

The error term is an unobserved, random component, which is why the model is probabilistic
The logit model is a discrete choice model that is derived from the assumption that the error terms for each alternative are independently and identically distributed according to a Gumbel distribution

Cumulative distribution function: F(x) = exp(-exp(-(x - μ) / σ))
Inverse cumulative distribution function: F^-1(p) = μ - σ * log(-log(p))

Where μ is the location parameter, σ is the scale parameter, and p is a probability between 0 and 1
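
The Gumbel cumulative distribution function and its inverse can be written directly in Python. This is a minimal sketch; the function names and the default parameters (location 0, scale 1) are assumptions made for the example.

```python
import math


def gumbel_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of the Gumbel distribution."""
    return math.exp(-math.exp(-(x - mu) / sigma))


def gumbel_inverse_cdf(p, mu=0.0, sigma=1.0):
    """Inverse cumulative distribution function; requires 0 < p < 1."""
    return mu - sigma * math.log(-math.log(p))


# Round trip: a uniform draw mapped to an error term and back again.
draw = 0.8544
error_term = gumbel_inverse_cdf(draw)
recovered = gumbel_cdf(error_term)
```

Feeding a uniform draw through the inverse function is what turns a pseudorandom number into a Gumbel-distributed error term.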

8 of 16

Simulation with explicit error terms

Simulation with explicit error terms refers to the process where the error term is drawn from the inverse of the Gumbel cumulative distribution function and added to the measurable utility
Then the alternative with the highest utility is selected

Mode	utility	random number	error term	total utility
auto	-0.6931	0.8544	1.8496	1.1565
walk	-1.3863	0.6841	0.9684	-0.4179
transit	-1.3863	0.9212	2.4998	1.1136

Random number draw for auto = 0.8544

Error term_auto = 0.0 - 1.0 * log(-log(0.8544))

Error term_auto = 1.8496

Auto has the highest utility (-0.6931 + 1.8496 = 1.1565)
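
The worked example above can be sketched in Python, using the utilities and random numbers from the table. The function names are assumptions made for the example, and because the tabulated random numbers are rounded to four decimals, the computed values agree with the table only to about three decimals.

```python
import math


def gumbel_error(draw, mu=0.0, sigma=1.0):
    """Map a uniform draw in (0, 1) to a Gumbel error term."""
    return mu - sigma * math.log(-math.log(draw))


def choose_with_error_terms(utilities, draws):
    """Add an error term to each measurable utility and pick the maximum."""
    total_utility = {alt: utilities[alt] + gumbel_error(draws[alt]) for alt in utilities}
    return max(total_utility, key=total_utility.get), total_utility


utilities = {"auto": -0.6931, "walk": -1.3863, "transit": -1.3863}
draws = {"auto": 0.8544, "walk": 0.6841, "transit": 0.9212}

choice, total_utility = choose_with_error_terms(utilities, draws)
print(choice)  # prints "auto"
```

Unlike the cumulative-probability method, this approach needs one draw per alternative rather than one per choice.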

9 of 16

Simulation with explicit error terms: base versus build

	Base				Build
Mode	utility	random number	error term	total utility	utility	total utility
auto	-0.6931	0.8544	1.8496	1.1565	-0.6931	1.1565
walk	-1.3863	0.6841	0.9684	-0.4179	-1.3863	-0.4179
transit	-1.3863	0.9212	2.4998	1.1136	-0.6363	1.8636

	build choice
base choice	auto	walk	transit	total	percent
auto	388	0	108	496	50%
walk	0	195	49	244	24%
transit	0	0	260	260	26%
total	388	195	417	1000	100%
percent	39%	20%	42%

New approach can be used to cross-tabulate changes from baseline to build for individual decision-makers

This decision-maker chooses transit in the build scenario because it now has the highest utility with the same error terms as the base scenario

No decision-makers will switch to an alternative whose utility did not increase in the build scenario
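
This property can be checked with a small simulation in which each decision-maker keeps the same error terms in both scenarios. The sketch below is illustrative only: the seed and helper names are assumptions, and the counts will differ from the table above.

```python
import math
import random
from collections import Counter


def uniform_open(rng):
    """Uniform draw strictly inside (0, 1), so the double log is always defined."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def gumbel_error(draw):
    """Standard Gumbel error term (location 0, scale 1)."""
    return -math.log(-math.log(draw))


def best_alternative(utilities, errors):
    """Alternative with the highest sum of measurable utility and error term."""
    return max(utilities, key=lambda alt: utilities[alt] + errors[alt])


base = {"auto": -0.6931, "walk": -1.3863, "transit": -1.3863}
build = {"auto": -0.6931, "walk": -1.3863, "transit": -0.6363}

rng = random.Random(1)  # arbitrary seed, an assumption for this example
crosstab = Counter()
for _ in range(1000):
    # One error term per alternative, shared by the base and build scenarios.
    errors = {alt: gumbel_error(uniform_open(rng)) for alt in base}
    crosstab[(best_alternative(base, errors), best_alternative(build, errors))] += 1

switchers = {pair: n for pair, n in crosstab.items() if pair[0] != pair[1]}
print(switchers)  # every switch ends in transit, the only improved alternative
```

Because only the transit utility increases, a decision-maker either keeps the base choice or moves to transit.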

10 of 16

The previous examples are illustrative and simplistic

Examples show multinomial logit, whereas mode choice is typically nested logit
The total number of decision-makers does not change between the base and build scenarios

In a real-world application, this typically only happens when the models are household or person level, or where upstream model components have been held constant from baseline to build

All decision makers have the same measurable utility for each alternative

In a real-world application each decision-maker has unique utilities, which are a function of household and person characteristics, the outcome of upstream models, and the attributes of each alternative

11 of 16

What are some of the expected ‘use cases’ of base versus build comparisons by decision-maker?

User benefit calculations

Sum up total travel expenditures by base versus build mode choice. Use to calculate user benefits (‘rule of half’ used to calculate benefits for travelers who change modes)
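
As a rough illustration of the rule of half, the sketch below assumes a simplified case in which only one mode becomes cheaper. Travelers who already used that mode receive the full saving and travelers who switch to it receive half. The function name, the traveler records, and the saving of 2.0 cost units are all hypothetical.

```python
def user_benefit(base_mode, build_mode, improved_mode, cost_saving):
    """Benefit for one traveler when only `improved_mode` gets cheaper by `cost_saving`."""
    if build_mode != improved_mode:
        return 0.0  # traveler does not use the improved mode
    if base_mode == improved_mode:
        return cost_saving  # existing user receives the full saving
    return 0.5 * cost_saving  # switcher receives half the saving (rule of half)


# Hypothetical (base choice, build choice) pairs for three travelers.
travelers = [("transit", "transit"), ("auto", "transit"), ("auto", "auto")]
total_benefit = sum(
    user_benefit(base_mode, build_mode, "transit", 2.0)
    for base_mode, build_mode in travelers
)
print(total_benefit)  # prints 3.0
```

A calculation like this depends on knowing each traveler's base and build choice, which is why consistent choices across scenarios matter.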

Understanding who is changing their choice in a build alternative to provide more information for decision-making

What is the income distribution of households who switch from 1+ auto households to 0 auto households?
What is the income distribution of travelers who switch from auto to transit?
What is the income distribution of travelers who switch from a free auto path to a tolled path?

12 of 16

Scope of work

Design phase

Initial meeting to discuss methodology and random number seeds. Today
Proposed user features and gather feedback. Week of March 31?
Recommended software solution(s) to the consortium, along with a more refined level of effort for developing prototype code. Week of April 14?
Technical memorandum summarizing design activities and software prototyping plan

Software development phase

Briefing 1. Week of April 28
Briefing 2. Week of May 19
Final meeting. Week of June 23
Initial prototype code pull request
Technical memorandum summarizing software development activities

13 of 16

Key issues to address in design: Computational

Monte Carlo simulation requires one exponentiation for each alternative (more for nested logit) and one random number draw for each model
Simulation with explicit error terms requires one random number draw and two log calculations for each alternative (more for nested logit)

Exponentiation of each alternative’s utility is also required if calculating logsum

There are more calculations required for explicit error terms
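
The operation counts stated above can be tallied per choice for a multinomial logit model. This sketch only restates those counts as a function; the function name is an assumption, and nested logit and logsum calculations are not covered.

```python
def operations_per_choice(n_alternatives):
    """Per-choice operation counts for a multinomial logit with n alternatives."""
    monte_carlo = {"exp": n_alternatives, "log": 0, "random_draws": 1}
    explicit_error_terms = {
        "exp": 0,
        "log": 2 * n_alternatives,
        "random_draws": n_alternatives,
    }
    return monte_carlo, explicit_error_terms


monte_carlo, explicit_error_terms = operations_per_choice(3)
print(monte_carlo)
print(explicit_error_terms)
```

For the three-mode example, Monte Carlo simulation needs three exponentiations and one draw, while explicit error terms need three draws and six log calculations.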

14 of 16

Key issues to address in design: Practical

Under what circumstances is it appropriate (or not) to compare outcomes across base versus build scenarios?

Base versus build comparisons are straightforward when comparing choices that are always made when the model is run

Auto ownership. Coordinated daily activity pattern.

Comparisons can be challenging when choices change between scenarios

Base: 1 shop tour, Build: 2 shop tours.
Base: 2 trips on first shop tour, Build: 4 trips on first shop tour.

Are there software features that will make the comparisons more meaningful for those cases?

For example, provide flexibility in how random numbers are drawn

Recommendations on when comparisons are meaningful and when they are not
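
One possible form of that flexibility is to tie each random number to a stable key rather than to the order in which choices are simulated. The sketch below is a hypothetical design, not an existing feature: the function name, the key fields, and the component name `shop_tour_mode_choice` are all assumptions.

```python
import random


def draw_for(base_seed, person_id, component, tour_num, alternative):
    """Uniform draw tied to one person, model component, tour, and alternative."""
    key = f"{base_seed}|{person_id}|{component}|{tour_num}|{alternative}"
    # Seeding with a string is deterministic across runs in Python 3.
    return random.Random(key).random()


modes = ["auto", "walk", "transit"]

# Base scenario: the person makes one shop tour.
base_draws = {
    (tour, mode): draw_for(1, 7, "shop_tour_mode_choice", tour, mode)
    for tour in [1]
    for mode in modes
}

# Build scenario: the same person makes two shop tours.
build_draws = {
    (tour, mode): draw_for(1, 7, "shop_tour_mode_choice", tour, mode)
    for tour in [1, 2]
    for mode in modes
}
```

With this keying, the draws for the first shop tour are identical in both scenarios even though a second tour was added, so the first tour remains comparable.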

15 of 16

Key issues to address in design: Other

Logsums

Expected value or inclusive of error terms (taste heterogeneity)?
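
The two options can be compared numerically. With standard Gumbel error terms (location 0, scale 1), the expected maximum utility equals the logsum plus Euler's constant, while a logsum inclusive of error terms would use each decision-maker's realized maximum. The sketch below assumes the three-mode utilities from the earlier example; the seed and function names are arbitrary.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # mean of the standard Gumbel distribution


def logsum(utilities):
    """Log of the sum of exponentiated utilities (the expected-value form)."""
    return math.log(sum(math.exp(u) for u in utilities.values()))


def realized_max_utility(utilities, rng):
    """Maximum over alternatives of utility plus a drawn Gumbel error term."""
    best = -math.inf
    for u in utilities.values():
        draw = rng.random()
        while draw == 0.0:  # keep the draw strictly inside (0, 1)
            draw = rng.random()
        best = max(best, u - math.log(-math.log(draw)))
    return best


utilities = {"auto": -0.69315, "walk": -1.38629, "transit": -1.38629}
rng = random.Random(1)  # arbitrary seed, an assumption for this example
n = 50000
mean_realized = sum(realized_max_utility(utilities, rng) for _ in range(n)) / n
expected = logsum(utilities) + EULER_GAMMA
```

The simulated mean converges to the expected value, but individual realized values vary around it, which is the heterogeneity the second option would carry downstream.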

User features

Control over what method is applied by model component?
Other?
