1 of 39

Connecting Questions to Questions: How to Translate Real-world Questions into Deducible Insights

Project by: Andrew Ackerman, Panos Andreou, Elyse Borgert, Minji Kim, Dawn Sanderson, Kendall Thomas, and Yifei Zhang

Presented by: Panos Andreou, Elyse Borgert, Dawn Sanderson, and Kendall Thomas

2 of 39

ABOUT US

Department of Statistics & Operations Research at UNC Chapel Hill

The Department of Statistics and Operations Research specializes in inference, decision-making, and data analysis involving complex models and systems exhibiting both deterministic and random behavior. We focus on developing and analyzing the necessary quantitative and computational tools to enable practitioners to solve problems in statistical and probabilistic analysis, modeling, optimization, and the evaluation of system performance.

Our faculty engage in fundamental research in probability, statistics, stochastic processes, and optimization, and are also heavily involved with interdisciplinary areas of application such as genomics; biological modeling; environmental statistics; insurance and financial mathematics; revenue, workforce, and supply-chain management; traffic flow and congestion; and telecommunications.

We are PhD students (2nd to 5th year) leading efforts to engage STOR graduate students and faculty in data science education outreach.

3 of 39

Classroom to Reality: Project Origin

Applying tools from statistics helps answer scientific questions

Question: How do we formulate those scientific questions into a statistical problem we can solve?

Learn by example: how we address this challenge in our own research

4 of 39

Presentation Outline

Two research examples

        • Mars sample return mission
          • Overview of scientific question
          • Formulation of statistical questions
          • Interactive tutorial: simulation study

        • Opinion dynamics
          • Overview of socio-scientific question
          • Formulation of mathematical model
          • Interactive tutorial: simulation study

5 of 39

Presentation Goals / Deliverables

      • Motivate applications of curriculum standards with real research questions

      • Demonstrate how real research questions are connected to statistical problems (answerable with familiar methods)

      • Provide open-access interactive tutorials for classroom use!

6 of 39

Research Example 1:

Mars Sample Return Mission

Dawn Sanderson

7 of 39

Questions to Questions: An Overview

  • Broad Question → Further Questions

  • Example: Mars Sample Return Mission

  • Statistical Technique: Simulation

  • Question Outline:

How do we achieve a mission safely?

Quantify "safely"?

Mission Process?

Ensure/Check Standard?*

Quantifiable Answer

8 of 39

A Safe Return Home

  • Mars Sample Return Mission

  • Back Planetary Protection

How can we ensure the safe return of the Martian samples to Earth?

How do we achieve a mission safely?

9 of 39

What Does "Safe" Even Mean?

  • Decision makers: How much risk are we willing to accept?

  • Factors in decision:

Goal: Maintain containment of the unsterilized Mars materials with a probability of failure less than one in a million

Quantify "safely"?

Scientific Advancements

Risk to Earth

10 of 39

How Do We Get the Sample to Earth?

  • Multi-stage, multi-part process

  • Many components

Mission Process?

Goal: probability of loss of containment < 1 in a million

  • Focus on re-entry vehicle entering Earth's atmosphere

11 of 39

  • What affects sample containment upon re-entry?
      • types of materials & physical properties (strength, rigidity, etc.)
      • re-entry acceleration: if it exceeds a certain threshold value, the result is loss of containment of the sample!

Ensure/Check Standard?*

Earth Entry Vehicle (EEV) Re-entry

High Velocity

Loss of Containment = Exceeding Acceleration Threshold

AHA!

12 of 39

  • How to determine if EEV will exceed the re-entry acceleration threshold?

  • Alternatives to repetitive physical experiments:
      • Modeling & Simulation

Threshold Exceedance?

Ensure/Check Standard?*

Probability of Loss of Containment = Probability of Exceeding Acceleration Threshold

13 of 39

Modeling

Model

Input Variables

Material properties

Parameters

Re-entry: angle, location, weather, etc.

Output value

Re-entry Acceleration

Ensure/Check Standard?*

14 of 39

Ensure/Check Standard?*

Simulation

Physical tests of material properties

Fit distributions based on samples

Simulate random samples based on distributions

Output of model: re-entry Acceleration

Determine the probability of Threshold Exceedance

RUN

Use random samples of input variables in re-entry modeling

  • Input Variables: small sample of physical properties
    • But we need more values to run model

  • Fit a distribution based on sample – simulate more values for modeling purposes
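
The steps on this slide can be sketched in Python. This is a minimal illustration under stated assumptions, not the mission's actual analysis: `reentry_acceleration` is a made-up stand-in for the re-entry model, the measurements are invented, and the normal distribution and the entry-angle range are assumed choices.

```python
import random
import statistics

def reentry_acceleration(strength, angle_deg):
    """Hypothetical stand-in for the re-entry model (not real physics)."""
    return 40.0 + 1.5 * angle_deg - 0.05 * strength

def estimate_exceedance(measurements, threshold, n_runs=100_000, seed=0):
    """Estimate P(re-entry acceleration > threshold) by simulation."""
    rng = random.Random(seed)
    # Fit a distribution to the small physical sample (normal family is assumed).
    mu = statistics.mean(measurements)
    sigma = statistics.stdev(measurements)
    exceed = 0
    for _ in range(n_runs):
        # Simulate an input variable from the fitted distribution.
        strength = rng.gauss(mu, sigma)
        # Parameters such as entry angle vary per run (assumed uniform range).
        angle = rng.uniform(10.0, 14.0)
        # Run the model and compare its output to the threshold.
        if reentry_acceleration(strength, angle) > threshold:
            exceed += 1
    # The fraction of runs over the threshold estimates the probability.
    return exceed / n_runs

sample = [98.2, 101.5, 99.7, 102.3, 100.4, 97.9, 100.9, 99.1]  # made-up test data
p_hat = estimate_exceedance(sample, threshold=56.0)
print(f"Estimated probability of exceeding the threshold: {p_hat:.4f}")
```

A goal as strict as one in a million needs far more runs than this sketch uses: seeing zero exceedances in n runs only supports an upper bound of roughly 3/n at 95% confidence, so n would have to be in the millions.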

15 of 39

Be the Statistician!

  • Now you get to apply the concepts yourself

  • Using your computer (recommended) or phone, type in the address or scan the QR code to access an interactive activity that allows you to be the statistician who makes the decisions

  • We will also work through the activity as a group on the big screen, and I will ask for your decisions as a group via a Kahoot! poll

OR

16 of 39

Review: Questions to Questions

Determine threshold exceedance?

How do we achieve a mission safely?

Quantify "safely"?

Mission Process?

Ensure/Check Standard?*

What affects re-entry?

What distributions fit?

What is the probability of exceedance?

Did we meet safety requirements?

17 of 39

  • Engineering decisions are made regarding the types of materials to be used and the physical properties of those materials that are most relevant to sample containment upon re-entry

  • Next, the engineers determine that the most crucial aspect of EEV re-entry regarding material sustainment (and thus sample containment) is re-entry acceleration
      • That is, if re-entry acceleration exceeds a certain threshold value, the result is loss of containment of the sample 

Ensure/Check Standard?*

Back to Earth

18 of 39

What Could Go Wrong?

  • Intricate mission = multiple error-prone elements

  • Any time there's talk of probability, there's usually a statistician involved.

There's nothing wrong with having a plan. Plans are great. But missions are better. Missions survive when plans fail, and plans almost always fail. - Seth Godin

Meet Standard?

Safety Goal 

Mission Details

Probability requirement possible?

19 of 39

Zooming into Details

Earth Entry Vehicle (EEV) release & re-entry: this unmanned craft must endure high velocity, extreme heat, and a potentially hard impact.

Solid model of the MSR EEV

Break down process?

MAV Earth Launch

Sample Delivery

MAV Mars Launch

MAV & ERO Rendezvous

CCRS insertion to EEV

ERO departs Mars Orbit

EEV release & re-entry

Probability Requirement

20 of 39

Research Example 2:

Opinion Dynamics

Panos Andreou & Kendall Thomas

21 of 39

Opinion Dynamics

  • Today, people seem to be polarized on many important topics. We would like to know: what causes polarization?

  • This seems to relate mostly to the social sciences. Could we use math and/or statistics to approach this question?

  • What would be the right quantities to study?

  • What type of real data could we use? Could we approach the problem without having available data? What about simulation?

22 of 39

Opinions & Polarization

  • Why do people have such divided opinions?
      • e.g., pineapple on pizza, cats vs. dogs, socks with sandals

  • Can we predict opinions using math? Do we need data?

  • Imagine we could simulate a world to study opinions; no real data needed!

23 of 39

First Considerations

  • What are the quantities that describe the problem?
      • E.g., time, people, media, etc.

  • In what "structures" could these quantities be organized in the real world?

  • We have a collection of entities (people) and connections between them (opinion exchange). This reminds us of a network!

24 of 39

First Considerations

  • What factors shape our opinions?
      • friends, media, stubbornness

  • How could we represent this in the real world?

  • Group of people and connections between them. That's a network!

25 of 39

How can we model this?

  • Group of people and connections between them; that's a network!

  • Examples: World Wide Web, social media, NYC subway 

  • We often don't know whether an edge is there or not, so it makes sense to think of it as random!

  • Three candidates: Erdős-Rényi, Stochastic Block Model, Preferential Attachment model

26 of 39

Erdős-Rényi

  • The simplest random graph model

  • Connect any two nodes with the same probability
      • Like flipping a coin: heads, they connect; tails, they don't

  • Too simple for our purpose: not all connections behave the same!
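
As a minimal sketch of the idea above, the following Python function builds an Erdős-Rényi graph by "flipping a coin" for every pair of nodes. The function name and parameters are illustrative choices, and the coin is allowed to be biased (probability `p` of an edge).

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Return the edge list of a random graph on nodes 0..n-1.

    Every pair of nodes is connected independently with the same probability p.
    """
    rng = random.Random(seed)
    return [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]

edges = erdos_renyi(n=20, p=0.5, seed=1)
print(f"{len(edges)} edges out of {20 * 19 // 2} possible pairs")
```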

27 of 39

Stochastic Block Model 

  • Nodes are clustered in communities

  • Higher connection probability within the same community

  • Reflects real-life behavior: people tend to align more with those in the same group
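
A minimal sketch of a Stochastic Block Model in Python follows. The names `p_within` and `p_between` are illustrative; the only thing taken from the slide is that nodes in the same community connect with higher probability.

```python
import random
from itertools import combinations

def stochastic_block_model(sizes, p_within, p_between, seed=None):
    """Return (community labels, edge list) for a random graph with communities.

    sizes[c] is the number of nodes in community c. Two nodes connect with
    probability p_within if they share a community, otherwise p_between.
    """
    rng = random.Random(seed)
    community = [c for c, size in enumerate(sizes) for _ in range(size)]
    edges = []
    for i, j in combinations(range(len(community)), 2):
        p = p_within if community[i] == community[j] else p_between
        if rng.random() < p:
            edges.append((i, j))
    return community, edges

community, edges = stochastic_block_model([10, 10], p_within=0.6, p_between=0.05, seed=1)
inside = sum(community[i] == community[j] for i, j in edges)
print(f"{inside} edges within communities, {len(edges) - inside} between")
```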

28 of 39

Preferential Attachment Model

  • New nodes favor connecting with more popular ones

  • Captures the "rich get richer" phenomenon

  • Examples: Instagram influencers, TikTok viral videos
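
The "rich get richer" rule can be sketched as follows. This is one simple variant, assumed for illustration: the network starts from a single edge, and each new node links to exactly one existing node chosen with probability proportional to its current number of connections.

```python
import random

def preferential_attachment(n, seed=None):
    """Grow a network on nodes 0..n-1 (n >= 2) and return its edge list."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    # A node appears in this list once per edge it has, so drawing uniformly
    # from the list picks a node with probability proportional to its degree.
    endpoints = [0, 1]
    for new in range(2, n):
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges

edges = preferential_attachment(200, seed=1)
degree = {}
for a, b in edges:
    degree[a] = degree.get(a, 0) + 1
    degree[b] = degree.get(b, 0) + 1
print("Most popular node has", max(degree.values()), "connections")
```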

29 of 39

Opinion Dynamics

Group of people exchanging opinions on a given topic (social network)

Several factors affecting opinions:

    • our own stubbornness
    • our friends' opinions
    • influence by the media

30 of 39

From Networks to Opinions

  • How do opinions flow on a given network?

  • Opinions change based on:
      • Own stubbornness
      • Neighbors' opinions 
      • External media influence

31 of 39

An Opinion Model

  • The social network is an SBM with 2 communities, traditionally carrying opposite opinions on the given topic of discussion.

  • We represent an opinion by a number between –1 and 1.

  • At each time step, individuals update their opinions based on their own opinion, their friends' opinions and the media signals they receive. 

  • People put more weight on opinions of people from the same community.

  • We want to know: under which conditions do people's opinions converge? If they don't, then what causes polarization?

32 of 39

An Opinion Model

  • Network: Stochastic Block Model with 2 communities

  • Opinions: numbers between –1 and 1

  • At each time step, update opinions based on own opinions, neighbors' opinions and media influence

  • Confirmation bias: put more weight on same-community connections
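
The slides do not give the exact update formula, so the following is only a sketch under assumptions: each person's new opinion is a weighted average of their own opinion, the weighted mean of their neighbors' opinions, and the media signal. The weights `stubbornness`, `media_weight`, and `same_weight` are hypothetical names and values, and communities are labeled 0 and 1 in the code.

```python
def clip(x):
    """Keep an opinion inside the interval [-1, 1]."""
    return max(-1.0, min(1.0, x))

def update_opinions(opinions, community, neighbors, media,
                    stubbornness=0.5, media_weight=0.2, same_weight=2.0):
    """One time step for everyone at once (all weights are assumptions).

    neighbors[i] lists the people connected to person i; media[i] is the
    media signal person i receives at this step.
    """
    new = []
    for i, x in enumerate(opinions):
        total, weight_sum = 0.0, 0.0
        for j in neighbors[i]:
            # Confirmation bias: same-community neighbors count more.
            w = same_weight if community[j] == community[i] else 1.0
            total += w * opinions[j]
            weight_sum += w
        # A person with no neighbors just keeps their own view here.
        social = total / weight_sum if weight_sum else x
        blended = (stubbornness * x
                   + (1.0 - stubbornness - media_weight) * social
                   + media_weight * media[i])
        new.append(clip(blended))
    return new

opinions = [1.0, -1.0]
step = update_opinions(opinions, community=[0, 1], neighbors=[[1], [0]], media=[0.0, 0.0])
print(step)
```

Calling `update_opinions` repeatedly, feeding each result back in, simulates people exchanging opinions over time.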

33 of 39

How to recognize Polarization? 

  • Draw histogram of final opinions

  • Color people in community 1 with blue and people in community 2 with red 

  • If blue and red form separate clusters, there is polarization

34 of 39

How to recognize Polarization? 

Idea: let people exchange opinions for a long time, then at the last step count the number of people holding an opinion in each interval.

      • This creates a histogram

  • Color people in community 1 as blue and people in community 2 as red

  • If blue and red form separate clusters, we say there is polarization. If they overlap, we say there is consensus.
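
A minimal sketch of this check in Python follows. The slide's criterion is visual; the rule used here (no histogram bin contains people from both communities) is one crude way to automate it and is an assumption, as are the function names. Communities are labeled 0 (blue) and 1 (red) in the code.

```python
def histogram(values, bins=10, lo=-1.0, hi=1.0):
    """Count how many values fall in each of `bins` equal-width intervals."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        k = int((v - lo) / width)
        k = max(0, min(k, bins - 1))  # values at the edges go in the end bins
        counts[k] += 1
    return counts

def is_polarized(opinions, community, bins=10):
    """True if the blue and red histograms never share a bin (assumed rule)."""
    blue = histogram([x for x, c in zip(opinions, community) if c == 0], bins)
    red = histogram([x for x, c in zip(opinions, community) if c == 1], bins)
    return all(b == 0 or r == 0 for b, r in zip(blue, red))

final_opinions = [-0.9, -0.8, -0.85, 0.8, 0.9, 0.75]
labels = [0, 0, 0, 1, 1, 1]
print("Polarized?", is_polarized(final_opinions, labels))
```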

35 of 39

Our opinions are affected by the opinions of the people we talk to (neighbors)

      • We put more weight on them

We are also affected by external media signals we receive

      • Different types of media can strongly affect polarization

Neighbors' and Media Effect

36 of 39

  • Uniform: same signal distribution for everyone (unbiased media) 

  • Limited info: same as uniform, but less reach

  • Partial info: left-leaning bias

  • Biased info: left for community 1, right for community 2 (targeted media)

  • Biased extreme: extreme version of biased info

Media Types
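
To make the five media types concrete, here is a hedged sketch of how each could generate a signal between -1 and 1. None of the numbers come from the project: "left" is assumed to mean negative values, "no signal" is represented as 0, the 30% reach for limited info is arbitrary, and communities 1 and 2 are labeled 0 and 1 in the code.

```python
import random

def media_signal(kind, community, rng):
    """Return one media signal in [-1, 1] for a person in community 0 or 1.

    All ranges and probabilities below are placeholders for illustration.
    """
    if kind == "uniform":            # same distribution for everyone
        return rng.uniform(-1.0, 1.0)
    if kind == "limited":            # like uniform, but reaches fewer people
        return rng.uniform(-1.0, 1.0) if rng.random() < 0.3 else 0.0
    if kind == "partial":            # left-leaning for everyone
        return rng.uniform(-1.0, 0.0)
    # Targeted media: left for the first community, right for the second.
    side = -1.0 if community == 0 else 1.0
    if kind == "biased":
        return side * rng.uniform(0.0, 1.0)
    if kind == "biased_extreme":     # same targeting, pushed to the extremes
        return side * rng.uniform(0.8, 1.0)
    raise ValueError(f"unknown media type: {kind}")

rng = random.Random(1)
for kind in ["uniform", "limited", "partial", "biased", "biased_extreme"]:
    signals = [round(media_signal(kind, c, rng), 2) for c in (0, 0, 1, 1)]
    print(f"{kind:15s} {signals}")
```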

37 of 39

Media Types

  • Random walk: accumulates past influences

  • Periodic: incorporates cyclic or seasonal trends

  • Gradual polarization: increasingly targeted over time

  • Sudden change: shift from positive to negative bias

  • Oscillating: shift between two extremes at each time step

In the app, we'll see that polarization is caused by targeted media!

38 of 39

Networks & Opinions 

Activity

39 of 39

Q&A + How to connect with us

  • STOR website

  • Student email addresses