1 of 15

ActivitySim Random Number Generation

Investigating ways to make Explicit Error Terms run faster

May 5, 2026

2 of 15

Agenda

  • RNG background
  • Performance testing
  • Proposed new approaches
  • Discussion on next steps


3 of 15

RNGs are used to make decisions

  • Legacy (non-EET) ActivitySim calculates utilities, computes probabilities, and draws a random number to make each choice
  • Need only 1 random number per choice


Mode      Utility   Exp(Utility)   Probability   Cumulative Probability
SOV        -1.6       0.2070         0.5501        0.5501
HOV        -2.2       0.1136         0.3019        0.8520
Transit    -2.9       0.0557         0.1480        1.0000
Sum                   0.3763         1.0000

Random Number Draw = 0.6721

Cumulative Probability > 0.6721? First true for HOV (0.8520)

Chosen Mode = HOV
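The table above can be reproduced in a few lines. This is a minimal sketch, not ActivitySim's code: it starts from the Exp(Utility) column because the utilities on the slide appear to be rounded to one decimal, and the variable names are illustrative.

```python
import numpy as np

modes = ["SOV", "HOV", "Transit"]
exp_utility = np.array([0.2070, 0.1136, 0.0557])  # Exp(Utility) column from the slide

probability = exp_utility / exp_utility.sum()
cumulative = np.cumsum(probability)

draw = 0.6721  # the single random number drawn for this choice

# Pick the first mode whose cumulative probability exceeds the draw.
chosen_index = int(np.searchsorted(cumulative, draw, side="right"))
chosen_mode = modes[chosen_index]

print(np.round(probability, 4))  # [0.5501 0.3019 0.148 ]
print(chosen_mode)               # HOV
```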

4 of 15

Explicit Error Terms Require Many More RNG Draws

  • ActivitySim in EET mode instead samples an error term from a Gumbel distribution and adds it to each utility
  • Needs 1 random number per alternative, not 1 per choice


Mode      Utility   Exp(Utility)   Probability   Cumulative Probability   Explicit Error Term   Total Utility + EET
SOV        -1.6       0.2070         0.5501        0.5501                   -0.2                  -1.8
HOV        -2.2       0.1136         0.3019        0.8520                    0.5                  -1.7
Transit    -2.9       0.0557         0.1480        1.0000                    0.1                  -2.8
Sum                   0.3763         1.0000

Now drawing a random number for every alternative

For 2-zone location choice models, the number of draws grows very fast!
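A minimal sketch of the EET choice rule, using the utilities and error terms from the table: the alternative with the highest utility plus error term is chosen. The second half is an illustrative check (not from the slides) that, over many choosers with standard Gumbel error terms, the chosen shares approach the usual MNL probabilities. The seed and sample size are arbitrary.

```python
import numpy as np

modes = ["SOV", "HOV", "Transit"]
utility = np.array([-1.6, -2.2, -2.9])
error_term = np.array([-0.2, 0.5, 0.1])  # Explicit Error Term column from the slide

# EET choice: one error term per alternative, then take the maximum.
total = utility + error_term
chosen_mode = modes[int(np.argmax(total))]
print(chosen_mode)  # HOV

# Illustrative check: with standard Gumbel error terms, simulated shares
# approach the MNL probabilities exp(U) / sum(exp(U)).
rng = np.random.default_rng(0)
n_choosers = 200_000
gumbel = rng.gumbel(size=(n_choosers, len(modes)))  # one draw per alternative
choices = np.argmax(utility + gumbel, axis=1)
shares = np.bincount(choices, minlength=len(modes)) / n_choosers
mnl_probability = np.exp(utility) / np.exp(utility).sum()
```

Note that the simulation needs `n_choosers x 3` draws where the legacy approach would need `n_choosers`.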

5 of 15

EET Reminder – EET leads to more stable results

  • Example shows how, when a single draw is compared against cumulative probabilities, an increase in Walk probability can cause people to switch from Car to PT.
  • EET allows for:
    • Walk utility to increase
    • Car and PT utility to remain the same
    • Model sensitivities that make more sense!

  • This has all been implemented and tested, but drawing random numbers for every alternative slows the runtime down considerably!
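The switching behavior can be illustrated with a small simulation. This is a sketch under assumed inputs: the utilities, the alternative order (Car, PT, Walk), the seed, and the number of people are all hypothetical and not taken from the slide. Each person keeps the same random numbers before and after the Walk improvement.

```python
import numpy as np

CAR, PT, WALK = 0, 1, 2
base_utility = np.array([0.0, -0.5, -1.5])      # hypothetical utilities
improved_utility = np.array([0.0, -0.5, -0.5])  # only the Walk utility increases

rng = np.random.default_rng(0)
n_people = 10_000
uniform = rng.random(n_people)           # legacy: one draw per person
gumbel = rng.gumbel(size=(n_people, 3))  # EET: one draw per person per alternative


def legacy_choice(utility):
    """Compare each person's single draw against the cumulative probabilities."""
    exp_utility = np.exp(utility)
    cumulative = np.cumsum(exp_utility / exp_utility.sum())
    index = np.searchsorted(cumulative, uniform, side="right")
    return np.minimum(index, 2)  # guard against rounding in the last cumulative value


def eet_choice(utility):
    """Choose the alternative with the highest utility plus error term."""
    return np.argmax(utility + gumbel, axis=1)


def car_to_pt(before, after):
    """Count people who chose Car before and PT after."""
    return int(np.sum((before == CAR) & (after == PT)))


legacy_switches = car_to_pt(legacy_choice(base_utility), legacy_choice(improved_utility))
eet_switches = car_to_pt(eet_choice(base_utility), eet_choice(improved_utility))
print(legacy_switches, eet_switches)
```

With the legacy rule, shrinking the Car band of the cumulative distribution pushes some draws into the PT band. With EET, the Car and PT totals are unchanged for every person, so anyone who leaves Car can only move to Walk.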


6 of 15

Properties of ActivitySim’s RNG

  • ActivitySim needs to produce reproducible choices based on household / person IDs
  • We “seed” the random number generator based on:
    • Global seed + Model step + Chooser ID + Alternative ID (if needed)
  • We “draw” values from the random number generator
  • We need an “offset” if we later want to draw more random numbers from the same seed
    • Stores how far into the stream we already are
    • Offset of 10 means: skip the first 10 draws and use what comes next
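The seed and offset mechanics can be sketched as follows. This is a simplified illustration, not ActivitySim's implementation: `row_seed` and `draws_for_chooser` are hypothetical helpers, and the way the seed components are combined here (a CRC32 of the step name added to the other parts) is an assumption made for the example.

```python
import zlib

import numpy as np

prng = np.random.RandomState()  # one generator, reseeded for every chooser


def row_seed(global_seed: int, step_name: str, chooser_id: int) -> int:
    """Hypothetical seed: combine the components into a 32-bit integer."""
    step_hash = zlib.crc32(step_name.encode("utf-8"))
    return (global_seed + step_hash + chooser_id) % (2**32)


def draws_for_chooser(global_seed, step_name, chooser_id, n, offset=0):
    """Reseed for this chooser, skip `offset` draws, then return the next `n`."""
    prng.seed(row_seed(global_seed, step_name, chooser_id))
    if offset:
        prng.rand(offset)  # burn the draws that were already used
    return prng.rand(n)


first_ten = draws_for_chooser(0, "mode_choice", chooser_id=42, n=10)
next_five = draws_for_chooser(0, "mode_choice", chooser_id=42, n=5, offset=10)
all_fifteen = draws_for_chooser(0, "mode_choice", chooser_id=42, n=15)
```

Here `next_five` matches the last five values of `all_fifteen`, which is what lets a later call continue the chooser's stream. It also shows the cost: skipping an offset means generating and discarding that many numbers after each reseed.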


7 of 15

Current Random Number Generator

  • ActivitySim uses numpy’s legacy “RandomState” generator, which is built on MT19937, an implementation of the “Mersenne Twister” algorithm
  • Numpy has other random number generators that can draw numbers faster

  • Problem solved? – No ☹️
    • Re-seeding is much more expensive than the actual number draws
    • Re-seed times are rarely reported alongside draw speeds
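One way to see the two costs separately is a small timing harness. This is a rough sketch, not the benchmark behind these slides: it builds a fresh generator per seed rather than reseeding in place, the iteration counts are arbitrary, and the timings depend on the machine and NumPy version.

```python
import time

import numpy as np

BIT_GENERATORS = {
    "MT19937": np.random.MT19937,
    "PCG64": np.random.PCG64,
    "Philox": np.random.Philox,
    "SFC64": np.random.SFC64,
}


def seconds_per_reseed(bit_generator, n_reseeds=2_000):
    """Average time to seed a new generator and take a single draw."""
    start = time.perf_counter()
    for seed in range(n_reseeds):
        np.random.Generator(bit_generator(seed)).random()
    return (time.perf_counter() - start) / n_reseeds


def seconds_per_draw(bit_generator, n_draws=1_000_000):
    """Average time per draw when many numbers come from one seed."""
    rng = np.random.Generator(bit_generator(0))
    start = time.perf_counter()
    rng.random(n_draws)
    return (time.perf_counter() - start) / n_draws


results = {
    name: (seconds_per_reseed(bit_generator), seconds_per_draw(bit_generator))
    for name, bit_generator in BIT_GENERATORS.items()
}
for name, (reseed, draw) in results.items():
    print(f"{name:8s} reseed {reseed * 1e6:8.2f} us   draw {draw * 1e9:8.2f} ns")
```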


8 of 15

Performance Testing – Drawing Random Numbers


9 of 15

Performance Testing – Reseed and Draw Same Number


10 of 15

Performance Testing – Reseed + offset + many draws


  • Runtime differences between algorithms change depending on how the tests combine reseeding, offsets, and the number of random draws.

  • There is no simple way to say one RNG is better than another independently of ActivitySim. We need to look at how ActivitySim actually generates random numbers.

11 of 15

Random Numbers in Location Choice

Location Choice Flow for 2-zone using EET

  • Sample 30 TAZs per chooser:
    • Reseed for each chooser and draw #TAZs x 30 numbers
    • Offset is now 30T
  • MAZ-within-TAZ selection:
    • Reseed for each chooser and draw offset (30T) + 30 more
    • Offset is now 30T+30
  • Calculate Logsum
    • No draws here!
  • Final MAZ choice:
    • Reseed for each chooser and draw offset (30T + 30) + #MAZs more
    • (drawing for every MAZ is faster than reseeding based on MAZ IDs)


Total: C(TS + S + M)

For something like SANDAG, where:

  • C = 500k workers
  • T = 6k TAZs
  • S = 30 samples
  • M = 30k MAZs

Total = 1.05 x 10^11, or 105 billion draws!
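The arithmetic behind the total, written out step by step with the values listed above (variable names are illustrative):

```python
choosers = 500_000  # C: workers
tazs = 6_000        # T
samples = 30        # S
mazs = 30_000       # M

taz_sampling_draws = tazs * samples  # T x S = 180,000 per chooser
maz_within_taz_draws = samples       # S = 30 per chooser
final_choice_draws = mazs            # M = 30,000 per chooser

draws_per_chooser = taz_sampling_draws + maz_within_taz_draws + final_choice_draws
total_draws = choosers * draws_per_chooser

print(f"{draws_per_chooser:,}")  # 210,030
print(f"{total_draws:,}")        # 105,015,000,000
```

The TAZ sampling term (T x S) accounts for most of the draws per chooser.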

12 of 15

Results from Calling ActivitySim’s Actual RNGs


Calculated with 1,000 choosers

Calculated with 6k TAZs and 4 MAZs per TAZ

13 of 15

Reseeding is slow – Can we get around it?

Use a hash-based, “stateless” random number generator

  • Avoid the costly re-seeding implemented within these numpy random number generators and use a hash-based approach

Start with the key components: (global seed & chooser ID & model) + alt_id + offset

Combine these into a single 64-bit integer as our new “state”

Scramble this integer with a fast hash mixer that shifts bits, XORs bits, and multiplies by a large constant

Convert the hashed 64-bit integer into a uniform number between 0 and 1:

draw = top 53 random-looking bits / 2^53

Convert to Gumbel by:

g = -log(-log(draw))
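The steps above can be sketched with NumPy. This is an illustration under stated assumptions, not the implementation behind these slides: the slides do not name the mixer, so this uses the SplitMix64 finalizer as a stand-in; `fold`, `keyed_uniform`, and `keyed_gumbel` are hypothetical names; re-scrambling after each key component (rather than simply adding them) and clamping a zero draw before the logarithm are choices made for this example.

```python
import numpy as np

GOLDEN = np.uint64(0x9E3779B97F4A7C15)  # odd constant used to spread keys apart
MULT_1 = np.uint64(0xBF58476D1CE4E5B9)  # SplitMix64 finalizer constants
MULT_2 = np.uint64(0x94D049BB133111EB)


def _u64(values):
    """Coerce non-negative integer scalars or arrays to a uint64 array."""
    return np.atleast_1d(np.asarray(values)).astype(np.uint64)


def mix64(x):
    """Scramble 64-bit integers: shift, xor, multiply (wraps modulo 2^64)."""
    x = np.array(x, dtype=np.uint64)  # copy so the caller's array is untouched
    x ^= x >> np.uint64(30)
    x *= MULT_1
    x ^= x >> np.uint64(27)
    x *= MULT_2
    x ^= x >> np.uint64(31)
    return x


def fold(state, value):
    """Fold one more key component into the state and re-scramble."""
    return mix64(state * GOLDEN + _u64(value))


def keyed_uniform(global_seed, step_id, chooser_ids, alt_ids, offset=0):
    """Uniform draws in [0, 1) that depend only on the key, not on call order."""
    state = mix64(_u64(global_seed))
    for component in (step_id, chooser_ids, alt_ids, offset):
        state = fold(state, component)  # inputs broadcast like NumPy arrays
    top53 = state >> np.uint64(11)      # keep the top 53 bits
    return top53.astype(np.float64) / 2.0**53


def keyed_gumbel(global_seed, step_id, chooser_ids, alt_ids, offset=0):
    """Standard Gumbel draws: g = -log(-log(draw))."""
    draw = keyed_uniform(global_seed, step_id, chooser_ids, alt_ids, offset)
    draw = np.maximum(draw, np.finfo(np.float64).tiny)  # assumption: avoid log(0)
    return -np.log(-np.log(draw))


chooser_ids = np.array([101, 102, 103])[:, None]  # column: one row per chooser
all_alts = np.arange(1, 11)                       # alternative ids 1..10
sampled_alts = np.array([2, 5, 9])                # a sampled choice set

full = keyed_gumbel(42, 3, chooser_ids, all_alts)
sampled = keyed_gumbel(42, 3, chooser_ids, sampled_alts)
```

Because each value is a pure function of its key, `sampled` equals the matching columns of `full`: an alternative gets the same error term whether or not the other alternatives are in the choice set, and no reseeding or stream skipping is involved.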


14 of 15

Ok, hashed approach is fast for ActivitySim, but does it work?

  • Do the raw uniform draws still look like uniform draws? Yes
  • Do sparse alternative ids introduce visible bias or correlation? No
  • Do MNL-style choice shares come out right with EV1 shocks? Yes
  • Does the keyed approach preserve the invariance ActivitySim relies on when offsets and sampled choice sets change? Yes


15 of 15

Next Steps

  • This isn’t the only approach; it is just the one I happened to come up with
  • Discuss with team and come to consensus on approach
  • Test with actual SANDAG model and confirm results
