1 of 25

GenAI, quantum computing, and the power of convergence

DECEMBER 2023


2 of 25

Topics for today's session

  1. The value of GenAI for business
  2. The convergence potential of quantum computing + GenAI
  3. The roadblocks preventing it from scaling

  • Matt Langione
  • Managing Director & Partner
  • Boston Consulting Group
  • langione.matt@bcg.com

3 of 25

The power of generative AI (I/II)

GPT-4 Test Results

Simulated exam               Score      Percentile
Bar Exam (MBE+MEE+MPT)1      298/400    90th
LSAT                         163/180    88th
SAT Reading & Writing        710/800    93rd
SAT Math                     700/800    89th
GRE Quant                    163/170    80th
GRE Verbal                   169/170    99th
GRE Writing                  4/6        54th
USABO Semifinal Exam 2020    87/150     99th-100th
AP Art History               5/5        86th-100th
AP Biology                   5/5        85th-100th
AP Calculus BC               4/5        43rd-59th

Generative AI is the most viral technology of all time…
… & the first to generalize human-level intelligence

[Chart: # of users vs. # of days since launch — ChatGPT reached ~1M users in ~5 days, vs. ~75 days for Instagram and ~150 days for Spotify]

The value of GenAI

4 of 25

The power of generative AI (II/II)

What mood does the attached work of art convey? Why? What aesthetic choices does the artist make that contribute to the mood?

  • "The image portrays a sense of anguish and despair, which is most prominently conveyed through the central figure's open-mouthed scream. The swirling, tumultuous background further intensifies the emotional turmoil depicted in the scene. Here are some of the aesthetic choices the artist makes that contribute to the mood.

  • 1. Color Palette: The artist uses a bold color palette with stark contrasts. The fiery red and orange tones in the sky evoke a sense of danger and urgency which is juxtaposed against the cool, eerie blue of the water below.

  • 2. Brushwork: The swirling, almost chaotic brush strokes contribute to the unsettled, turbulent atmosphere of the scene. The lack of clear definition and the fluidity of the forms make the surroundings seem unstable and dreamlike, heightening the sense of disorientation and distress.

  • 3. Expression: The central figure's facial expression is one of terror and desperation. The open mouth, wide eyes, and hands on the sides of the face are universal signs of fear and horror. This expressive portrayal captures the viewer's attention and communicates the emotional essence of the scene.

  • 4. Composition: The curved, flowing lines of the landscape and the figure's elongated form create a sense of movement and fluidity, further contributing to the unsettling mood. The bridge and the two figures in the background provide a sense of depth and isolation emphasizing the central figure's alienation and despair.

  • 5. Perspective: The exaggerated perspective, with the bridge receding sharply into the background, amplifies the sense of depth and distance, making the central figure appear trapped or enclosed within the foreground.

  • These elements work together to create a dramatic, emotionally charged atmosphere that encapsulates the existential angst and unease that is characteristic of Edvard Munch's iconic work, The Scream."

ChatGPT response


5 of 25

GenAI today primarily used for enterprise operations

Value creation with GenAI today — % of value creation by use case type

  • Operational efficiency use cases (e.g. extracting clauses from legal documents)
  • Product & CX enhancement (e.g. financial robo-advisor)
  • New products and business models (e.g. GenAI drug discovery)

"FMs can be used to automate call centers, but we are going one step further, creating predictive call and text functionality that allows us to know ahead of time why customers are calling."

"We are using FMs in drug discovery to assist with gene sequencing and small molecule simulation to speed up pre-clinical studies and selection of leads for clinical studies."

"We are enhancing our chatbots with GPT trained on 30 years of documentation… it's like having our CSO sitting next to you when you're on the phone with a client."

$100B+ market size for tech providers by 2027

Source: June 2023 AI Adoption survey, N=125; BCG analysis

6 of 25

5 GenAI capabilities that drive value for enterprises

UNDERSTAND language / image
  • Description: Interactive & dynamic engagement of information, ideas, or questions between humans & AI systems, responding to questions and generating appropriate responses
  • Illustrative use cases: Chatbot for HCP interactions; Internal chatbot for employees

EXTRACT knowledge
  • Description: Extract structured knowledge from unstructured or semi-structured data sources
  • Illustrative use cases: Clinical data extraction from protocols or patient records

SUMMARIZE & transcribe
  • Description: Summarize large amounts of information or text into shorter, more concise versions that capture the key points of the content
  • Illustrative use cases: Document key take-aways; Content aggregation & synthesis

GENERATE content
  • Description: Create specific types of content (e.g., text, images, videos, audio, code), and generate new and innovative ideas, concepts or designs
  • Illustrative use cases: Meeting minutes writing & transcription; Marketing content localization; Innovative product concept generation

ANALYZE data & text
  • Description: Logical & reasoning process to make inferences and draw conclusions, make informed judgments, and derive new insights based on available information, data, or knowledge
  • Illustrative use cases: Sentiment analysis

Note: FM = foundation model
1. GenAI transformations can leverage multiple tech. capabilities (e.g., ChatGPT leverages content generation & creativity)

7 of 25

Model performance scales linearly with model size…

[Chart: model parameters over time — from expert systems (~10k parameters), to machine learning and deep learning (~10M), to large pre-trained (foundation) models built on generative, multi-modal neural networks and transfer-learning techniques: 175B (GPT-3) and ~1.7T (GPT-41)]

  • GPT-3: ability to complete multiple tasks in text format
  • GPT-4: ability to complete multiple tasks across multiple formats (text, image, audio, video); ability to reason and control other systems

…but what made it possible?

8 of 25

The rise of GenAI is fueled by three primary factors

  1. Algorithm advances ("attention is all you need")
  2. Exponential increase in available training data
  3. Exponential increase in compute power

9 of 25

Algorithm advances | Transformer models

Transformer Model Overview

First described in 2017 by Google researchers in a paper titled "Attention is All You Need"

Transformers are a neural network that can apply mathematical techniques called "attention" to detect how data may be related to each other

Transformers are now the dominant models used to "pre-train" foundation models

Source: Attention is all you need, Proceedings of NIPS (2017), pp. 5998-6008; https://blogs.nvidia.com/blog/2022/03/25/what-is-a-transformer-model/

  • "Attention" removes need for labelled data

Before transformers, neural networks were trained on data labelled by humans (e.g., this image is a cat), which is expensive and a huge limitation to training data availability.

  • With transformers and "attention", data labelling no longer needed as model automatically understands context. This allows model to be fed huge amounts of training data with no human labelling, which is key to scaling
  • "Attention" enables learning around context

With "attention" mechanisms, transformers can tease out subtle ways different parts of the data are related through self-learning (e.g., "cats" are related to "pets")

  • This enables the model to learn and understand context
  • "Attention" gives rise to generative content

Using both vast amounts of training data, and learning around context, transformers now had generative ability.

  • The model effectively connected distinct concepts to create new content (e.g., images, text, video etc.)

The value of GenAI

9
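The core of the "attention" mechanism described above fits in a few lines. Below is a minimal scaled dot-product attention sketch (single head, no learned projections — the full Transformer adds multi-head attention and feed-forward layers on top of this):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Similarity of each query to every key, scaled by sqrt(d_k)
    # to keep the softmax well-behaved as dimensions grow.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax: each token's attention weights sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Output mixes value vectors by relatedness: tokens "attend" to
    # related tokens with no human-labelled supervision required.
    return weights @ V, weights

# Toy example: 3 tokens with 4-dimensional embeddings attending to each other.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))
out, w = scaled_dot_product_attention(X, X, X)
```

Because the weights are computed from the data itself, relatedness (e.g., "cats" to "pets") is learned from co-occurrence rather than from labels — which is why attention removes the labelling bottleneck.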

10 of 25

Digital data growth | 10 to 120 exabytes in 10 yrs; 80% is unstructured, requiring "attention"

[Chart: stored digital data (exabytes), 1970-2020, rising from ~10 to ~120 exabytes over the last decade]

80%+ of digitized data is unstructured (no data model), requiring the "attention mechanism" to avoid the human-labeling bottleneck; the remainder is structured (well-defined, easily-organized database information).

The value of GenAI

11 of 25

Compute | Exponential increase via Moore's Law

Larger models perform better than smaller ones, pushing practitioners to build larger models…

[Chart: human ability to detect whether a news article is AI-generated — detection error rises toward the 50% line of random chance as model parameters grow from 100M to 1B, 10B, and 100B; at 50%, humans cannot distinguish AI-generated news articles from human-generated ones]

Moore's law provides exponential advances in compute power

[Chart: transistor density (transistors mm-2), 1960-2020, rising from ~1 to ~100,000,000 on a log scale]

Source: Brown, Tom B. et al., 2020. Language models are few-shot learners; "In-Datacenter Performance Analysis of a Tensor Processing Unit", Google
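As a back-of-the-envelope check on the Moore's-law chart above, here is a sketch assuming the classic two-year doubling period (the real cadence has varied, and the 1970 baseline density is an illustrative assumption, not a figure from the slide):

```python
def transistor_density(year, base_year=1970, base_density=200.0, doubling_years=2.0):
    """Transistors per mm^2 under an idealized Moore's-law doubling.

    base_density=200/mm^2 in 1970 is an illustrative assumption, roughly
    the era of the first commercial microprocessors.
    """
    return base_density * 2 ** ((year - base_year) / doubling_years)

# Fifty years of doubling every two years => 2^25, a ~33-million-fold increase,
# consistent with the chart's span of roughly eight orders of magnitude.
growth = transistor_density(2020) / transistor_density(1970)
```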

12 of 25

Each enabler also a roadblock for scaling GenAI

Roadblock 1 — Transformer models are "greedy" and inefficient (as of now). Evidence:
  • A 175B-parameter model scored in the bottom 10% on the bar exam; the 1.7T model in the top 10%
  • $1B+ training costs for the largest models

Roadblock 2 — We are running out of training data to feed the models. Evidence:
  • Error rates, currently at 20%, are cut in half with every order of magnitude increase in training data
  • Available internet data in principle, incl. private data, permits only two more orders of magnitude

Roadblock 3 — We are running out of compute resources to power the models. Evidence:
  • So far, each generation of model requires 10-100X more compute
  • Current models are trained on superclusters for months
  • The next iteration that can show visible improvement (like GPT-3 to GPT-4) may require a decade to train, or 100 superclusters

Roadblocks for GenAI

13 of 25

Key roadblocks | Will we run out of data?

Available internet training data can grow another 2 orders of magnitude… cutting error rates in half

(Tokens from the internet can increase from ~10^12 to ~10^14, reducing model error from ~20% to ~10%)

Source                  Tokens in current models    Tokens accessible in principle
Internet                ~10^12                      5 x 10^14
Books                   5 x 10^11                   10^13
Wikipedia (English)     6.5 x 10^9                  6.5 x 10^9
Wikipedia (All)         2.5 x 10^10                 3.9 x 10^10
Scientific papers       2.7 x 10^10                 1.5 x 10^12
Twitter                 Few                         2.5 x 10^13
Text messages           0                           10^12/year
Youtube                 0                           4 x 10^12

80%+ of model improvement is due to the amount of data vs. model design

Source: dynomight.net, BCG analysis
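The "100x more tokens halves the error" relationship above implies a power law, error ∝ tokens^(-α). A small sketch fitting α from the slide's own endpoints (~10^12 tokens at ~20% error, ~10^14 at ~10%) — illustrative only, since published scaling laws (e.g. Chinchilla) fit different exponents:

```python
import math

# Endpoints taken from the slide: scaling the corpus 100x halves the error.
tokens_now, err_now = 1e12, 0.20
tokens_max, err_max = 1e14, 0.10

# error = C * tokens^(-alpha)
# => alpha = log(err_now / err_max) / log(tokens_max / tokens_now)
alpha = math.log(err_now / err_max) / math.log(tokens_max / tokens_now)
C = err_now * tokens_now ** alpha

def predicted_error(tokens):
    # Extrapolation under the fitted power law; treat as a sketch of the
    # slide's arithmetic, not a real scaling-law prediction.
    return C * tokens ** (-alpha)
```

With only two more orders of magnitude of accessible data, this curve bottoms out around 10% error — which is the slide's argument for why data becomes a roadblock.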

14 of 25

Key roadblocks | Will we run out of compute?

Meta RSC AI supercluster: 6,080 of Nvidia's latest A100 GPUs, ~1.9 exaflop/s (10^18 floating-point operations per second)

If you were training on the largest AI supercluster…
  • GPT-3: ~15 hours
  • GPT-4: ~60 days
  • Another 100X increase*: ~16 YEARS (or 600,000 GPUs)

* GPT-4 uses ~100X computing power in training vs. GPT-3

Roadblocks for GenAI
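The training-time estimates above follow from simple scaling arithmetic, sketched here using the slide's own baseline (~15 hours for GPT-3 on the Meta RSC cluster) and its ~100x compute jump per generation:

```python
GPT3_HOURS = 15            # slide's estimate for GPT-3 on the Meta RSC supercluster
COMPUTE_MULTIPLIER = 100   # each generation ~100x more training compute (slide footnote)

def training_time_hours(generations_after_gpt3):
    # Same hardware, 100x more FLOPs per generation => 100x longer wall-clock.
    return GPT3_HOURS * COMPUTE_MULTIPLIER ** generations_after_gpt3

gpt4_days = training_time_hours(1) / 24             # ~62 days (slide rounds to ~60)
next_gen_years = training_time_hours(2) / 24 / 365  # ~17 years (slide rounds to ~16)
```

The alternative framing on the slide — ~600,000 GPUs — is the same arithmetic run the other way: holding wall-clock time fixed and scaling the cluster by ~100x instead.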

15 of 25

Convergence possibilities for Quantum & GenAI

A. Quantum computing to solve challenges in scaling GenAI
  • Enhance performance with "quantum-inspired" classical algorithms
  • Run models on quantum computers / quantum data
  • Generate synthetic training data (with true randomness)

B. GenAI accelerating progress in quantum computing
  • Enhance error correction
  • Improve quantum hardware design
  • Augment quantum software development
  • Speed up calibration process

C. Quantum computing x GenAI working side by side
  • Maximize the power of GenAI and quantum computing in a hybrid workflow by using them to handle different tasks

Convergence potential

16 of 25

Convergence | QC to resolve GenAI roadblocks (A)

Roadblock: Transformer models are "greedy" and inefficient
  • Use case: quantum-inspired techniques to improve efficiency (e.g., QNNs, simulated annealing, tensor networks for LLMs)
  • Example research: Quantinuum (CQC) is researching quantum natural language processing to reduce the size of neural networks (e.g. training datasets for LLMs)

Roadblock: We are running out of training data to feed the models
  • Use case: synthetic data with quantum random number generators; using quantum data to reduce sample complexity
  • Example research: IonQ has partnered with FCAT to develop a quantum machine learning (QML) proof of concept for correlation in the finance industry using synthetic data from a quantum AI model

Roadblock: We are running out of compute resources to power the models
  • Use case: run models on quantum machines, leveraging the vast state space
  • Example research: Zapata researchers explored substituting parts of MolGAN with variational quantum circuits (VQCs), resulting in quantum GANs surpassing classical GANs in performance
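Of the "quantum-inspired" techniques named above, simulated annealing is the simplest to illustrate. A minimal sketch minimizing a toy cost function — a real application would anneal over something like network weights, and quantum annealers tackle the same class of problem in hardware:

```python
import math
import random

def simulated_annealing(cost, x0, step=0.5, t0=5.0, cooling=0.995, iters=2000, seed=0):
    """Minimize `cost` by accepting uphill moves with probability exp(-dE/T)."""
    rng = random.Random(seed)
    x, e, t = x0, cost(x0), t0
    best_x, best_e = x, e
    for _ in range(iters):
        cand = x + rng.uniform(-step, step)
        de = cost(cand) - e
        # Always accept improvements; accept worse moves with Boltzmann
        # probability, which lets the search escape local minima early on.
        if de < 0 or rng.random() < math.exp(-de / t):
            x, e = cand, e + de
            if e < best_e:
                best_x, best_e = x, e
        t *= cooling  # gradually "freeze" the search
    return best_x, best_e

# Toy multi-modal cost: a parabola rippled by a sinusoid, with its
# global minimum near x = 2. The search typically settles near it.
f = lambda x: (x - 2) ** 2 + 1.5 * math.sin(5 * x)
x_best, e_best = simulated_annealing(f, x0=-5.0)
```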

17 of 25

Convergence | GenAI to accelerate QC timeline (B)

GenAI massively impacting classical programming today

Code assist (e.g. Copilot, Codey)
  • Code-assist tools help expedite processes like code autocompletion for quantum languages like Cirq, Qiskit, etc.
  • Developers using code-assist tools coded up to 55% faster2

Debugging (e.g. Tabnine, PyCharm, Visual Studio Code)
  • AI-assisted software may highlight errors in syntax and provide additional context for coding bugs
  • 75% of users3 ranked a debugging tool (Visual Studio Code) as their most important tool

Anomaly detection (e.g. TensorFlow)
  • AI-supported anomaly-detection techniques help quantum by providing system monitoring (of qubits, etc.) and identifying code inefficiencies

Potential to impact fundamental QC algo research?

qecGPT Project1: a framework called qecGPT has been proposed for decoding quantum error-correcting codes using generative modeling. This model employs autoregressive neural networks, specifically Transformers, to learn the joint probability of logical operators and syndromes.

Source: 1. arxiv.org, H. Cao, F. Pan, Y. Wang, P. Zhang; 2. GitHub; 3. Electronic Engineering Journal; BCG analysis

18 of 25

Illustrative drug discovery workflow (C) — completely inaccessible today… not even theoretically

  • Entire chemical space: ~10^60 compounds (a 1 followed by 60 zeros)
  • Ultralarge chemistry databases: ~10^26 compounds
  • Registered chemicals: 275,000,000
  • Approved medicines today: ~10,000

Convergence potential

19 of 25

Illustrative drug discovery workflow (C) — tomorrow could be millions of times more efficient

  • GenAI generates candidate chemical formulas that are likely synthesizable with desired properties
  • A quantum computer verifies exact properties with quantum simulation at the atomic scale and finds drug candidates among the GenAI-generated candidates
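The two-stage workflow above can be sketched as a screening funnel: a cheap generative proposer narrows the space, and an expensive high-accuracy verifier runs only on the shortlist. Everything here is a toy stand-in — `cheap_score` plays the role of a GenAI proposer and `verify` the role of quantum simulation; neither does any real chemistry:

```python
import random

def hybrid_screening(n_generated=10_000, shortlist_size=100, seed=0):
    """Schematic funnel: propose many candidates cheaply, verify few expensively.

    The efficiency gain comes from running the costly verifier on
    shortlist_size candidates instead of all n_generated.
    """
    rng = random.Random(seed)
    # Stage 1 (stand-in for GenAI): propose candidates with a rough score.
    candidates = [(f"mol-{i}", rng.random()) for i in range(n_generated)]
    shortlist = sorted(candidates, key=lambda c: c[1], reverse=True)[:shortlist_size]
    # Stage 2 (stand-in for quantum simulation): an expensive, more exact
    # check, modeled here as the rough score plus small noise.
    verified = [(name, rng.gauss(score, 0.05)) for name, score in shortlist]
    return max(verified, key=lambda v: v[1])

best_name, best_score = hybrid_screening()
```

In this sketch the verifier runs 100x fewer times than brute force; the slide's "millions of times more efficient" claim corresponds to far steeper funnels over the ~10^60-compound space.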

20 of 25

Our goal is to compress the next 250 years of chemistry and materials science progress into the next 25


21 of 25

Traditional computers had many applications prior to the development of error correction

1937 | Atanasoff-Berry Computer solves systems of linear equations for astronomy research

1941 | British Bombe deciphers German Enigma codes

1944 | IBM Harvard Mark I simulates atomic reactions for Manhattan Project

1945 | ENIAC calculates artillery firing tables for the US Army

1950 | Hamming "error correction" codes are introduced

Why collaborate now


22 of 25

Government funding and public-private collaboration around core tech were critical ingredients

[Timeline: Research → Technology → Investment → Users milestones for the IBM Harvard Mark I (1939-1944) and UNIVAC I (1950-1956), spanning 1941-1956]

  • The Census Bureau was not a "passive" customer: it defined specifications (e.g. size, tabulation time), worked alongside researchers for six months on the design of UNIVAC, and approved its readiness in 19511

Source: 1. US Census Bureau

23 of 25

We are in early stages today, but the components — Research, Technology, Investment, Users — are in place

We must unify these efforts to scale effectively

  • May 2023: fundamental research into GenAI and quantum computing convergence1
  • July 2023: researchers proposed a framework called qecGPT for decoding quantum error-correcting codes using generative modeling2
  • Governments have pledged $55B+ to quantum computing to date and over $200B by 2030.3
  • In April 2023, IBM and Moderna launched a partnership to explore the use of quantum computing and GenAI to advance and accelerate mRNA research & science. Moderna kicks off the build of a quantum- and GenAI-ready workforce, while IBM explores how quantum technology can apply to mRNA research.4

Source: 1. J. Chem. Inf. Model. 2023, 63, 11, 3307-3318; 2. arxiv.org, H. Cao, F. Pan, Y. Wang, P. Zhang; 3. Global Quantum Intelligence; 4. IBM newsroom

24 of 25

Key takeaways

GenAI and Quantum Computing independently have transformative value creation potential

Convergence opportunities include both (a) roadmap acceleration with QC advancing GenAI and vice versa, and (b) workflow transformation where they create exponential benefits working side-by-side

But the expertise and resources required for technology convergence on this scale overwhelm single-company efforts and even small partnerships

It requires concerted, integrated, multi-disciplinary, consortium-led efforts to define goals, guide research, and deliver results


25 of 25

Thank you
