Title: Design Your Own Compute Market (SF Compute)

Description: A distinctive feature of the current AI wave is how resource-intensive it is. By one estimate, total annual investment in compute will be $8T in 2030.

Other markets of that size are typically quite financialized. Taking oil as an example (about $2T delivered each year), there is a spot price, monthly futures, options, etc.

How can we design financial markets for compute?

In this interactive workshop, we will try to map out the space of possible compute markets, and understand their trade-offs.

Where to find this doc: https://bit.ly/compute-markets-manifest 

Add your name and email to this list :)

Authors:


Agenda

  1. Intro (10-15 mins, in progress)
  2. Breakout groups of 2-4 (30 mins)
  3. Discussion and closing (15 mins)

Intro

Compute is eating the world!

Leopold thinks that there will be $8T/year spent on compute in 2030. Microsoft is already planning to spend $50B/year starting next year.

Other industries of that size typically have financial tools for hedging risk. Taking oil as an example (about $2T delivered each year), there is a spot price, monthly futures, options, etc.

Each of these things is useful in different ways:

  • futures help suppliers and purchasers lock in prices for future supply
  • options let you pay a premium to guarantee a floor or ceiling price
  • spot markets increase liquidity of supply available right now
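
To make the three instruments above concrete, here is a minimal per-unit payoff sketch. The function names and all prices are illustrative assumptions, not quotes from any real market, and it ignores margin, fees, and discounting.

```python
# Per-unit payoffs (e.g. per barrel, or per GPU-hour).
# Every number passed to these functions in the examples is made up.

def futures_pnl_long(locked_price: float, spot_at_delivery: float) -> float:
    """Buyer who locked in `locked_price`: gains if spot ends up higher."""
    return spot_at_delivery - locked_price


def call_payoff(strike: float, spot_at_expiry: float, premium: float) -> float:
    """Call option: gives the buyer a ceiling price of `strike`, for a premium."""
    return max(spot_at_expiry - strike, 0.0) - premium


def put_payoff(strike: float, spot_at_expiry: float, premium: float) -> float:
    """Put option: gives the seller a floor price of `strike`, for a premium."""
    return max(strike - spot_at_expiry, 0.0) - premium
```

For example, `call_payoff(2.0, 3.0, 0.25)` is positive because spot finished above the strike, while `call_payoff(2.0, 1.0, 0.25)` loses only the premium.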

There are various things you might want to do with compute that you can’t today:

  • buy 25k H100s for a month, to train your GPT-5 scale model
  • sell one week of that month back into the market, because your training code isn’t ready yet
  • get one H100 node for a couple of hours around lunch time, to try an idea you just had
  • buy an option on compute prices over the next 3 years, giving you the right (but not the obligation) to buy H100s at $2.30/hr
  • place a standing bid with a limit price of $1.50/hr with a Kubernetes job attached to it, so that you can run your training job whenever the price of compute dips below that level
  • sell 3 years of compute supply into the market, so that you can take advantage of higher prices for shorter compute blocks without taking long-term price risk
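
As one illustration, the standing-bid item in the list above could be sketched like this. `StandingBid`, the `job` field, and the price series are hypothetical. The sketch assumes the bid fills in any hour where the spot price is at or below the limit, that the buyer pays the prevailing spot price, and that the job can checkpoint and resume between hours.

```python
from dataclasses import dataclass


@dataclass
class StandingBid:
    limit_price: float  # max $/GPU-hour the buyer will pay
    gpus: int           # GPUs requested while the job is running
    job: str            # hypothetical reference to the attached job spec


def hours_running(bid: StandingBid, hourly_prices: list[float]) -> list[int]:
    """Indices of the hours in which the bid is filled (price <= limit)."""
    return [h for h, p in enumerate(hourly_prices) if p <= bid.limit_price]


def total_cost(bid: StandingBid, hourly_prices: list[float]) -> float:
    """Cost if the buyer pays the spot price in each filled hour."""
    return sum(hourly_prices[h] * bid.gpus for h in hours_running(bid, hourly_prices))


bid = StandingBid(limit_price=1.50, gpus=8, job="train.yaml")
prices = [2.10, 1.60, 1.45, 1.40, 1.75, 1.50]  # made-up $/GPU-hour series
filled = hours_running(bid, prices)            # hours 2, 3 and 5
```

Whether a fill at exactly the limit price counts, and whether the buyer pays spot or their limit, are design choices a real market would have to pin down.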

How can we design a market for compute that allows us to do these things?

The Two-Dimensional Model

We represent each cluster with just two dimensions: time, and the number of GPUs it has:

There are N clusters on the exchange, so really we have this grid of rectangles:
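
In code, the same picture can be sketched as a list of rectangles. This is a minimal sketch under the model's own assumptions (every GPU is interchangeable, time is measured in whole hours); the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str
    gpus: int       # height of the rectangle
    start_hr: int   # availability window [start_hr, end_hr), in hours
    end_hr: int

    def gpu_hours(self) -> int:
        """Area of the rectangle."""
        return self.gpus * (self.end_hr - self.start_hr)


@dataclass(frozen=True)
class Block:
    """A requested sub-rectangle: some GPUs for some span of time."""
    gpus: int
    start_hr: int
    end_hr: int


def fits(block: Block, cluster: Cluster) -> bool:
    """True if the block lies inside the cluster's rectangle (ignoring other bookings)."""
    return (
        block.gpus <= cluster.gpus
        and cluster.start_hr <= block.start_hr
        and block.end_hr <= cluster.end_hr
    )


# The exchange is just the collection of N rectangles (made-up sizes).
exchange = [
    Cluster("a", gpus=1024, start_hr=0, end_hr=720),
    Cluster("b", gpus=256, start_hr=0, end_hr=8760),
]
```

A 512-GPU, one-month block fits in cluster "a" but not in "b", even though "b" has more total GPU-hours on offer.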

Failures of the Two-Dimensional Model

  • Not every cluster is the same
      • Motherboards
      • CPU
      • Memory
      • Interconnect (InfiniBand, RoCE, Ethernet)
      • Storage
          • how many PB?
          • local per node, or remote?
          • WEKA or VAST?
          • how fast? bandwidth and IOPS
      • Network
          • internal:
              • separate storage and data networks?
              • separate management network?
          • public:
              • how many Gbps?
              • what part of the world? (the ISP may not have that much bandwidth across continents)
  • What happens when nodes fail?
      • Roughly 10-20% die during burn-in
      • About 10%/year after that
      • Early revisions of hardware could have much higher failure rates
  • What happens if a cluster is delayed?
  • What if there is data loss?
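
To get a feel for the failure figures above, here is a back-of-the-envelope sketch. It assumes failed nodes are never repaired or replaced and that the annual rate compounds year over year; both are simplifying assumptions, not claims from the notes.

```python
def expected_survivors(nodes: int, burn_in_loss: float,
                       annual_loss: float, years: float) -> float:
    """Expected nodes still alive after burn-in plus `years` of operation.

    burn_in_loss: fraction lost during burn-in (the notes say roughly 0.10-0.20)
    annual_loss:  fraction lost per year afterwards (the notes say about 0.10)
    """
    return nodes * (1.0 - burn_in_loss) * (1.0 - annual_loss) ** years


# A hypothetical 1000-node cluster over a 3-year contract.
best_case = expected_survivors(1000, 0.10, 0.10, 3)
worst_case = expected_survivors(1000, 0.20, 0.10, 3)
```

Under these assumptions, a contract that promises a fixed number of nodes for three years needs either spare capacity or a replacement policy priced in.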

Market Desiderata

  • From the perspective of AI companies:
      • Get compute whenever you want, for any training runs you need, at arbitrary scale
          • This is how ordinary cloud works!
      • Get large amounts of compute (e.g. 100k H100s) for short periods of time (e.g. 1 month)
      • Have it be as cheap as possible (companies often spend more than half of their money on compute)
      • Ideally you don’t have to pre-specify how long your training runs will take, though this is a relatively weak preference
      • Don’t want to have your training run interrupted / preempted (stronger preference)
  • From the perspective of compute providers / datacenters:
      • Want to be able to finance large clusters to sell into the market, without having a long-term buyer
      • Prefer to receive money upfront when possible
      • Ideally can get financing for clusters that haven’t been built yet using the market
  • From the perspective of speculators:
      • Prefer to lock up money as late as possible, so that trading strategies are capital efficient
      • Prefer to trade on ordinary CLOBs, so that they can use their normal trading strategies
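
For reference, a CLOB (central limit order book) matches orders by price-time priority: best price first, and earliest arrival first among equal prices. The toy sketch below treats every GPU-hour as fungible, which is exactly the assumption the failures listed earlier call into question. The class and method names are hypothetical, and real exchanges add cancellation, order types, and risk checks.

```python
import heapq
from itertools import count


class OrderBook:
    """Toy central limit order book with price-time priority."""

    def __init__(self):
        self._bids = []      # heap of (-price, seq, qty): highest bid on top
        self._asks = []      # heap of (price, seq, qty): lowest ask on top
        self._seq = count()  # arrival counter, breaks ties at equal price

    def submit(self, side: str, price: float, qty: int):
        """Match an incoming limit order, then rest any unfilled remainder.

        Returns a list of (price, qty) fills at the resting order's price.
        """
        buying = side == "buy"
        book, opp = (self._bids, self._asks) if buying else (self._asks, self._bids)
        fills = []
        while qty > 0 and opp:
            key, seq, rest_qty = opp[0]
            rest_price = key if buying else -key
            crosses = price >= rest_price if buying else price <= rest_price
            if not crosses:
                break
            traded = min(qty, rest_qty)
            fills.append((rest_price, traded))
            qty -= traded
            if traded == rest_qty:
                heapq.heappop(opp)
            else:
                # Partially filled resting order keeps its place in the queue.
                heapq.heapreplace(opp, (key, seq, rest_qty - traded))
        if qty > 0:
            heapq.heappush(book, (-price if buying else price, next(self._seq), qty))
        return fills
```

Submitting a buy at 1.95 against resting asks at 1.90 and 2.00 fills only against the 1.90 ask; the remainder rests in the book as a bid.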

                


Classes of Compute Markets (Interactive)

What are different mechanisms that one could imagine for allowing compute to be tradeable? This part we will work out together, in small groups.

Make a section on this page for your group, and write down what you all are thinking about in bullets in that section. We’ll do this for about 30 minutes, and then regroup afterwards.

Various things to think about:

  • Binpacking: make sure that AI companies can buy nodes that are *interconnected* in the same cluster
  • How to break up the time dimension: can you buy arbitrary lengths of time? or only months? hours? is there one unified mechanism?
  • Can I make money by trading in this market?
  • Does the market tell me prices, or do I have to look in my heart and decide what my willingness to pay is?
  • Can I short the market?
  • If there are futures, are they physically settled or cash settled?
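
As a starting point for the binpacking item, here is a minimal best-fit-decreasing sketch. It assumes a single shared time window, that any GPUs within one cluster count as interconnected, and that an order may never be split across clusters. The function name and inputs are hypothetical, and this heuristic is not guaranteed to find an optimal packing.

```python
def allocate(orders: dict[str, int], free_gpus: dict[str, int]):
    """Place each order wholly inside one cluster, largest orders first.

    orders:    order_id -> GPUs requested
    free_gpus: cluster_id -> GPUs free in the (single, shared) time window
    Returns (assignment: order_id -> cluster_id, unplaced: list of order_ids).
    """
    free = dict(free_gpus)  # work on a copy, leave the caller's dict alone
    assignment, unplaced = {}, []
    for oid, need in sorted(orders.items(), key=lambda kv: -kv[1]):
        candidates = [(left - need, cid) for cid, left in free.items() if left >= need]
        if not candidates:
            unplaced.append(oid)
            continue
        _, cid = min(candidates)  # tightest fit leaves big gaps for big orders
        assignment[oid] = cid
        free[cid] -= need
    return assignment, unplaced


clusters = {"a": 512, "b": 600}  # made-up capacities
placed, rejected = allocate({"x": 512, "y": 256, "z": 300}, clusters)
```

Note that `allocate({"big": 700}, clusters)` rejects the order even though 1112 GPUs are free in total, because no single cluster can hold it.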

Questions:

  • (DJ) what does market theory tell us about the tradeoffs between a market with N dimensions vs N+1 dimensions?
  • (DJ) quality issues: what is offered versus what is actually available (not necessarily deceptive action from a supplier)
  • (DJ) imagine AWS’s incentives for an internal market; a cross-supplier market has different incentives

Teams:

  • How might one short “compute”?
      • [isabel] can short pure electronic concepts (stock ownership)
      • Oil/soybean futures: mostly don’t need to settle
      • Not sure what settlement looks like with compute
          • [smitty] Cash settle from a benchmark?
          • What’s a spot price for H100? [smitty] Get them off of AWS?
          • Can do a contract-for-difference
  • Why set up shorting?
      • Allows speculators to make money
      • If you buy -1 compute, what are you doing with it?
          • You’re essentially selling compute
      • [austin] Smooth out demand across time (you want to sell compute now and buy back later to use)
  • If there are futures, are they physically settled or cash settled?
      • [smitty] Cash settlement only seems simpler
      • [isabel] If the price goes way up and you don’t have the cash, the market fails
      • [smitty] Settlement failures happen in most markets, generally resolved with an intermediary/clearinghouse that takes on counterparty risk
      • [isabel] Compute: it’s not like one global cluster, need to figure out how to apportion. How to allocate over time?
          • The value of having 1000 H100s in one cluster is different from having 1 H100 in each of 1000 clusters
      • [alex] Cash settlement could usually be indexed with averaging across different
  • Does the market tell me prices, or do I have to look in my heart and decide what my willingness to pay is?
  • [isabel] Why is there no compute market yet?
      • [smitty] AWS reserve marketplace (CPU compute). Pay now for some time in the future, and you can buy and sell in the future. AWS takes large fees compared to commodities
          • Amazon has more economists than any institution in the world (might exceed the Fed)
      • [austin] Maybe compute is simpler to deliver than soybeans, leading to consolidation in big tech companies
      • [isabel] Compute seems more complicated to produce, also to deliver at sufficiently large scale?
          • A cluster in Europe is not the same as a cluster in the US, maybe insufficient
          • Can’t actually move a cluster across the ocean, unlike a big package of oil
      • Obvious: just less time for these markets to build up
      • [smitty] Much compute is bought through large clusters (AWS, etc). Maybe they haven’t implemented this yet
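
One way to read the "cash settle from a benchmark" and contract-for-difference ideas from the notes above is sketched below. The provider names, quotes, and the equal-weighted index are assumptions made for illustration; the notes do not specify how a benchmark would be built.

```python
from statistics import mean


def benchmark_index(quotes: dict[str, float]) -> float:
    """Equal-weighted average of provider quotes in $/GPU-hour (an assumed index rule)."""
    return mean(quotes.values())


def cfd_settlement(entry_price: float, index_price: float,
                   gpu_hours: float, side: str = "long") -> float:
    """Cash paid to the holder at settlement; negative means the holder pays.

    No compute changes hands: only the price difference is exchanged.
    """
    sign = 1 if side == "long" else -1
    return sign * (index_price - entry_price) * gpu_hours


quotes = {"provider_1": 2.0, "provider_2": 2.5, "provider_3": 3.0}  # made up
index = benchmark_index(quotes)
short_cash = cfd_settlement(entry_price=2.0, index_price=index,
                            gpu_hours=1000, side="short")
```

Here the short side pays out because the index settled above the entry price, which is how someone could short "compute" without owning a cluster. It also shows the risk raised above: if the index spikes, the short has to find the cash.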

Discussion:

  • Group 1:
      • If your company is trying to match independent data centers with startups, you’re a market maker
      • But if you’re vertically integrated and selling your own startup, it’s less of a market problem and more of a coordination/resource allocation problem
      • [alex] Can you design tradeable blocks of markets if you only have one supplier?
      • Complexity of being a market maker: compute vs commodities. Optimization function over market design spaces. Could try to get efficiency, bringing in participants. More complicated = more efficient, but harder to bring people in
      • Constraint elimination: e.g. assuming everything is H100s. Might be necessary for building a market
  • Group 2:
      • Two classes of problems:
          • AWS is not a good design
          • Trying to commoditize a thing like land (things being next to each other is valuable)
      • How do you do this while caring about security without one giant provider?
      • [alex] Block allocation mechanism ideas? [liz] not really
      • How does a commodity futures market handle failure to provide?
          • [alex] Clearinghouse guarantees it and provides insurance for this
          • Assumption is that you never run out of corn; you’d buy it from the spot market. If a bunch are short, the spot market cost goes up
          • Market failure happens more in compute: there might not be enough compute, only 1 cluster and 2 different people want it