1 of 42

Physics Is All You Need?

IMPROVING NEURAL NETWORKS TRAINED WITH LIMITED DATA BY DERIVING IMPROVEMENTS BASED ON KNOWLEDGE OF THE UNDERLYING PHYSICS INTERACTIONS

Ph.D. Dissertation Proposal

Author: Jose G. Perez

Department of Computer Science

The University of Texas at El Paso

2 of 42

Neural Networks

Neural networks are Machine Learning (ML) models which:

  • Are based on a mathematical abstraction of the human brain
  • Are “trained” to solve tasks such as object detection and classification
  • Use units named “neurons” arranged in sets called “layers”
  • Have many variations depending on the type of data you are working with
    • Convolutional Neural Nets (CNNs), which are well suited to image data
    • Long Short-Term Memory Nets (LSTMs), which are good for sequential data
    • Physics-Informed Neural Nets (PINNs), which are good when the data follow known physical laws
    • U-Nets, which are good for segmenting an image into different components

3 of 42

The Need For Data In Neural Networks

Image: Object tracking and detection with models such as YOLOv3

Text: Text generation with Large Language Models (LLMs) such as GPT-3

Video: Synthetic video generation with Sora from OpenAI

  • Deep neural network architectures are the state-of-the-art models for different types of data across multiple disciplines and fields of study
  • What all these architectures and methodologies have in common is the need for large amounts of data to train the models


4 of 42

45 TB of crawled text data: used to train GPT-3 by OpenAI

300 million images: in the private dataset (JFT-300M) created by Google to train MLP-Mixer and other models

120,000 to 3,000,000 images: in each of the 10 categories of the LSUN dataset used by NVIDIA’s StyleGAN2


5 of 42

The Difficulties of Getting Data

  • Gathering data is very important, as its quantity and quality strongly affect the performance of your models

  • Although public datasets are available for many tasks, it is usually difficult for people in fields outside Computer Science to find useful public datasets

  • If no dataset exists for your problem, you need to gather your own data, which comes with its own set of challenges


6 of 42

Difficulties Gathering Your Own Data


Data Sensitivity: Gathering and storing sensitive information, from personal medical records to classified national security data, can be challenging

Time: Collecting and labeling data can be very time consuming, leaving less time for research and experimentation

Money: Hiring people to help, buying software and specialized equipment or sensors, and similar expenses can be costly

Availability: Sometimes data is simply unavailable no matter what resources you have

7 of 42

Working Around Data Limitations with Few-Shot Learning

  • Idea: Train a neural network with a “few” labeled data samples while still achieving “good performance” for a given problem
  • Approaches typically focus on taking one component of the training and adapting it:
    • Transfer Learning: “Transfer” knowledge from a pre-trained model for a similar problem
    • Data Augmentation: Generate more data samples from your existing ones
    • Meta Learning: “Learning to learn”; improve the learning algorithm by observing how networks learn
    • Physics-Informed Neural Net: Embed laws of physics in the network to make learning easier

8 of 42

Integrating Physics Into Neural Networks


  • Humans don’t start learning from scratch; we have a lot of prior knowledge
  • Physics is a widely studied science with a large body of established knowledge
  • Real-world data must follow the “laws of physics”
  • Idea: If we have limited data, could we leverage our knowledge of physics in a neural network model trained with real-world data?

  1. Network Loss Gradients: Defined through partial derivatives
  2. Laws of Physics: Defined as partial differential equations
  3. Physics-Informed Neural Networks (PINNs): Network outputs are constrained so that they do not deviate from the laws of physics

9 of 42

Physics-Informed Neural Networks

  • Networks whose output is constrained by known physics: the partial differential equations (PDEs) that describe the physics are combined with the network loss function
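
One common way to write this combination is sketched below; it is a general form, not necessarily the exact loss used in this work. The notation is assumed for illustration: u_θ is the network, 𝒩 is the PDE residual operator (zero when the physics is satisfied), N_d and N_r are the numbers of labeled points and residual points, and λ is a weighting factor.

```latex
\mathcal{L}(\theta)
  = \underbrace{\frac{1}{N_d}\sum_{i=1}^{N_d}
      \bigl\| u_\theta(x_i, t_i) - u_i \bigr\|^2}_{\text{data loss}}
  + \lambda \,
    \underbrace{\frac{1}{N_r}\sum_{j=1}^{N_r}
      \bigl\| \mathcal{N}[u_\theta](x_j, t_j) \bigr\|^2}_{\text{physics (PDE residual) loss}}
```

The physics term needs no labels, which is why it can help when labeled data is limited.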


10 of 42

Thesis Statement

Incorporating physics in neural network models through modification of data and loss functions will allow for better performance and faster convergence for the problems of fluid flow velocity prediction and glacial ice segmentation.


11 of 42

Specific Problems With Limited Datasets

Fluid Flow Velocity Prediction: Predict the velocity of a fluid at certain points and certain times, given some initial conditions and a defined geometry

12 of 42

Specific Problems With Limited Datasets

Glacier Ice Segmentation: Determine which areas of a satellite image of a glacier are clean ice, debris-covered ice (ice mixed with rocks), or mountain

13 of 42

Expected Contributions

  • Develop a model that uses PINNs for the task of fluid flow velocity prediction
  • Investigate ways to combine PINNs with networks built for sequential data (LSTMs) as fluid flows happen in a sequence across time
  • Improve the existing baseline U-Net model for the task of glacier ice segmentation by investigating physics-based ways to augment the existing labeled data

Contribution 1: Physics-Informed Neural Network for Fluid Flow Velocity Prediction

Contribution 2: Physics-Informed Data Augmentation for Glacier Ice Segmentation

14 of 42

Expected Contributions (cont.)

  • Investigate ways to improve the model developed for fluid flow prediction from contribution #1 and ways to adapt it for the specialized task of glacial ice velocity prediction
  • Develop a new architecture based on a combination of PINNs and a segmentation model (U-Net) for the task of glacier segmentation
  • Investigate ways to combine datasets of multispectral satellite images that have no velocity information with datasets of glacier velocities

Contribution 3: Physics-Informed Neural Network for Glacial Ice Velocity Prediction

Contribution 4: Physics-Informed Neural Network for Glacier Ice Segmentation

15 of 42

#1 - Physics-Informed LSTM For Fluid Flow Prediction


16 of 42

Fluid Flow Velocity Prediction

  • Understanding how fluids flow is very important for the study and development of airplanes, cars, boats, rockets, and much more
  • A common approach is running Computational Fluid Dynamics (CFD) simulations by:
    • Setting up a simulation scene with the relevant geometry and fluids
    • Defining the scale of the simulation by creating a discretized mesh of finite elements of a specific size (for example, 1 m)
    • Setting the boundary conditions
    • Selecting a solver and running it for a specific period of time
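
To make the mesh step concrete, the sketch below counts the elements of a uniform mesh for a given element size. It assumes a box-shaped domain and equally sized square cells, which real CFD meshes often are not; the function name `cell_count` and the 100 m by 50 m domain are hypothetical.

```python
import math


def cell_count(domain_lengths, cell_size):
    """Number of cells in a uniform mesh covering a box-shaped domain.

    domain_lengths: side lengths of the domain in meters, one per dimension.
    cell_size: edge length of each (square or cubic) cell in meters.
    """
    count = 1
    for length in domain_lengths:
        # Round up so the mesh always covers the whole side.
        count *= math.ceil(length / cell_size)
    return count


# Hypothetical 2D domain of 100 m x 50 m.
coarse = cell_count((100.0, 50.0), 1.0)  # 1 m cells
fine = cell_count((100.0, 50.0), 0.5)    # 0.5 m cells
print(coarse, fine, fine / coarse)
```

Halving the element size in 2D quadruples the number of elements, and the solver has to update every one of them at each timestep.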


17 of 42


18 of 42

Fluid Flow Velocity Prediction (cont.)

  • One drawback of CFD simulations is that high-fidelity simulations can be very computationally expensive
    • High fidelity requires small finite elements, so the mesh contains many more of them
    • Bigger mesh, bigger geometries, bigger environment = MORE COST
  • We developed an approach to help alleviate this computational cost by training a neural network to predict fluid flow that leverages two facts about fluid flows:
    1. They are governed by physics
      • Therefore, we want to apply concepts from Physics-Informed Neural Networks
    2. They are a type of sequential data
      • Therefore, we want to use Long Short-Term Memory Networks as part of our architecture

19 of 42

Long Short-Term Memory Networks (LSTMs)

  • Neural network models designed to learn long-term dependencies in sequential data, such as time series
  • Selectively pick and store short-term information that might be useful later
  • The stored information is controlled by input, forget, and output gates, which determine:
    • What past information to forget
    • What new information to keep track of
    • What stored information to expose as output
  • By keeping short-term memories for a long period of time, you get long short-term memory networks
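
The gate mechanics can be sketched with a single scalar LSTM cell. This is the standard LSTM update written out for illustration, not the architecture used in this work; the function names and the weight layout (a dict mapping each gate to `(w_x, w_h, b)`) are assumptions made for the example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, weights):
    """One timestep of a scalar LSTM cell.

    weights maps "f", "i", "o", "g" to a tuple (w_x, w_h, b).
    Returns the new hidden state h and the new cell state c.
    """
    def pre_activation(name):
        w_x, w_h, b = weights[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("f"))    # forget gate: how much old memory to keep
    i = sigmoid(pre_activation("i"))    # input gate: how much new information to store
    o = sigmoid(pre_activation("o"))    # output gate: how much memory to expose
    g = math.tanh(pre_activation("g"))  # candidate value to store

    c = f * c_prev + i * g              # updated long-term memory
    h = o * math.tanh(c)                # output for this timestep
    return h, c


# Run the cell over a short sequence with arbitrary example weights.
weights = {"f": (0.5, 0.1, 1.0), "i": (0.8, 0.2, 0.0),
           "o": (0.3, 0.4, 0.0), "g": (1.0, 0.5, 0.0)}
h, c = 0.0, 0.0
for x in [0.1, 0.4, -0.2]:
    h, c = lstm_step(x, h, c, weights)
print(h, c)
```

A forget gate close to 1 carries the cell state forward almost unchanged, which is how short-term information can persist across many timesteps.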


20 of 42

Physics-Informed LSTM for Fluid Flow

  • We generated a dataset by running CFD simulations; for every point we stored:
    • x, y, timestep (t), pressure (p), velocity components (u, v), water volume fraction (%)
  • We developed an architecture combining PINNs and LSTMs for the task
    • Input: 10 previous timesteps
    • Output: The velocity components (u, v) predicted for the next timestep
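
The input/output pairing above can be sketched as a sliding window over the simulation timesteps. This is a minimal illustration, assuming the records for one spatial point are already ordered by time; the function name `make_windows` and the dict-based record layout are hypothetical.

```python
def make_windows(frames, window=10):
    """Build (input, target) training pairs from a time-ordered sequence.

    frames: list of per-timestep records (dicts with at least "u" and "v").
    Each input is `window` consecutive records; the target is the (u, v)
    velocity of the record that immediately follows them.
    """
    inputs, targets = [], []
    for end in range(window, len(frames)):
        inputs.append(frames[end - window:end])
        targets.append((frames[end]["u"], frames[end]["v"]))
    return inputs, targets


# Toy sequence of 12 timesteps with made-up values.
frames = [{"t": t, "p": 1.0, "u": float(t), "v": -float(t)} for t in range(12)]
inputs, targets = make_windows(frames)
print(len(inputs), targets[0])
```

With 12 timesteps and a window of 10, only two training pairs are produced, so longer simulations yield proportionally more samples.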

LSTM branch: Focuses on the sequential relationships of the data

PINN branch: Focuses on enforcing the governing physics

21 of 42

Architecture


22 of 42

Metric: Mean Squared Error

  • We need a metric to evaluate the quality of the network's predictions
  • We will use Mean Squared Error (MSE) for this problem
    • Closer to 0 = Better
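
As a small worked example, MSE is the average of the squared differences between predicted and true values. The sketch below uses plain Python lists and made-up numbers; the function name is chosen for illustration.

```python
def mean_squared_error(predicted, actual):
    """Average squared difference between two equal-length sequences."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sum(squared_errors) / len(squared_errors)


# Made-up velocities: only the last prediction is off, by 2.
print(mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Because errors are squared, a single large miss is penalized more than several small ones.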


23 of 42

Results

  • We trained the model using the 250 m/s simulation
  • We evaluated its performance by predicting the velocities of the 300 m/s simulation
  • To measure how well the model predicts the velocities, we use Mean Squared Error (MSE)

After running each model for a maximum of 200 epochs, we can observe from the results:

  • PINN + LSTM outperforms all other models in MSE for both velocity predictions (u and v)
  • Each epoch without physics takes 27 seconds on average
  • Each epoch with physics takes 30 seconds on average

24 of 42

#2 - Physics-Informed Data Augmentation For Glacier Ice Segmentation


25 of 42

Glacier Ice Segmentation

  • Glaciers are a very important source of water for many regions of the world, so monitoring their changes over time is critical
  • Many satellites, such as NASA’s Landsat-7 and Landsat-8 and ESA’s Sentinel-2, have captured multispectral images of these glaciers over a long period of time
  • Glaciologists take these images and use their expertise to determine which areas are clean ice, debris-covered ice (ice mixed with rocks), or regular rock
    • This is called segmentation
  • However, manually labeling these images is very time consuming
    • Satellite images have more channels than the usual 3 of digital cameras (multispectral) and can also be high in resolution
    • Labeling a single Sentinel-2 image can take an expert 1 to 4 weeks depending on its complexity


26 of 42

Dataset

  • A labeled dataset exists for the glaciers in the Hindu-Kush Himalayas (HKH)
    • Created by the International Centre for Integrated Mountain Development (ICIMOD)
    • Annotates images from NASA’s Landsat-7 satellite
    • Contains RGB, Near-Infrared, and Digital Elevation Map data


27 of 42

U-Net Baseline

  • A U-Net model for this glacier dataset, published in 2023, was used as the starting baseline
    • “Boundary Aware U-Net for Glacier Segmentation” by Bibek Aryal et al.


28 of 42

U-Net

  • Variation of a Convolutional Neural Network
  • Split into two components
    • Left = Encoder
    • Right = Decoder
  • Encoder takes the input and extracts features that capture the important context of the image
  • Decoder takes the encoded features and upsamples them until the output is the same size as the original input
  • Location information lost while encoding is recovered through skip connections going across the U-shape
  • Achieved state-of-the-art results in segmentation of biomedical images when introduced in 2015


29 of 42

Physics-Informed Data Augmentation

  • Idea: Snow falls on the glaciers, then the glaciers flow as fluids and change
    • How can we encode this “precipitation” in the network so it can leverage it?
  • We developed an algorithm that augments the existing glacier dataset by adding an additional channel to each satellite image based on this idea


30 of 42

Algorithm (Input)

  • We will use the elevation data of each image
    • Values are between 0 and 1
    • 0 being the lowest elevation (black pixels) and 1 being the highest (white pixels)
  • Assume that every pixel in the image can “accumulate drops of precipitation”
    • This is the new channel’s data; it will be normalized from 0 (no accumulation) to 1 (highest accumulated precipitation)


31 of 42

Algorithm (cont.)

  • We want to encode how fluids flow from higher elevations to lower elevations
  • Put a “drop” of precipitation in each pixel, then for each pixel in the image:
    • Find paths from its current elevation to lower elevations
  • To find these paths, we use the path-finding algorithm called Breadth-First Search (BFS)
    • Each time we visit a pixel in a path, that pixel’s accumulated precipitation is increased by 1


32 of 42

Algorithm (Overview)

  • Take the elevation map as the input image
  • Create an empty image of the same size as the input (for the output)
  • For each pixel in the image, perform Breadth-First Search (BFS) with a special limitation where you can only visit pixels you haven't visited before AND that are lower elevation than the current pixel popped from the queue
    • Water/ice/snow can only flow down
  • Each time a pixel is visited, the output at that pixel's position increases by 1 as 1 drop of water/ice/snow has flowed down to it
  • At the end, normalize the image to the range 0 to 1 (min-max scaling), so that 0 means no accumulation and 1 means the highest accumulated precipitation
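
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: pixels are 4-connected, the starting pixel counts as visited and receives its own drop, and the result is min-max scaled to the range 0 to 1. The function name `precipitation_channel` is hypothetical.

```python
from collections import deque


def precipitation_channel(elevation):
    """Accumulate "drops" flowing downhill from every pixel using BFS.

    elevation: 2D list of numbers (rows of equal length).
    Returns a 2D list of the same shape with values scaled to [0, 1].
    """
    rows, cols = len(elevation), len(elevation[0])
    accumulation = [[0] * cols for _ in range(rows)]

    for start_r in range(rows):
        for start_c in range(cols):
            # Assumption: the drop placed on the start pixel counts as a visit.
            visited = {(start_r, start_c)}
            accumulation[start_r][start_c] += 1
            queue = deque([(start_r, start_c)])
            while queue:
                r, c = queue.popleft()
                # Assumption: 4-connected neighbours (up, down, left, right).
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and (nr, nc) not in visited
                            and elevation[nr][nc] < elevation[r][c]):
                        # Only unvisited, strictly lower pixels: flow goes down.
                        visited.add((nr, nc))
                        accumulation[nr][nc] += 1
                        queue.append((nr, nc))

    # Min-max scaling to [0, 1]; a flat image maps to all zeros.
    flat = [value for row in accumulation for value in row]
    low, high = min(flat), max(flat)
    if high == low:
        return [[0.0] * cols for _ in range(rows)]
    return [[(value - low) / (high - low) for value in row]
            for row in accumulation]


# A one-row slope: the lowest pixel collects drops from every pixel above it.
print(precipitation_channel([[3, 2, 1]]))
```

Running one BFS per pixel can cost on the order of the pixel count squared in the worst case, so for large satellite tiles a tiled or vectorized variant would likely be needed.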


33 of 42

Data Augmentation Example

Left: Digital Elevation Map

Right: Precipitation Accumulation Channel (augmented data)


34 of 42

Metric: Intersection over Union (IoU)

  • To measure the performance of the segmentation model, we use a metric called the Intersection over Union (IoU)
  • Ranges from 0 to 1, with 1 being a perfect match between the predicted and labeled regions
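
For one class, IoU is the number of pixels that both the prediction and the label assign to that class, divided by the number of pixels that either assigns to it. The sketch below computes it on small label masks; the function name, the class ids, and the choice to return NaN when the class is absent from both masks are assumptions for illustration.

```python
def intersection_over_union(predicted, target, label):
    """IoU of one class between two 2D label masks of the same shape."""
    intersection = 0
    union = 0
    for predicted_row, target_row in zip(predicted, target):
        for p, t in zip(predicted_row, target_row):
            in_prediction = (p == label)
            in_target = (t == label)
            if in_prediction and in_target:
                intersection += 1
            if in_prediction or in_target:
                union += 1
    # Assumption: the score is undefined (NaN) if the class appears in neither mask.
    return intersection / union if union else float("nan")


# Hypothetical 2x2 masks: 0 = background, 1 = clean ice.
predicted = [[1, 1],
             [0, 0]]
target = [[1, 0],
          [0, 0]]
print(intersection_over_union(predicted, target, label=1))
```

In a multi-class setting the score is computed separately for each class, which is how clean ice and debris-covered ice can be reported individually.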


35 of 42

Results

  • The proposed data augmentation improves segmentation performance on debris-covered ice
    • Debris-covered ice classification is challenging for both the baseline model and glaciologists
  • With some hyperparameter tuning, I hypothesize we can reach 40% IoU on debris-covered ice
  • The proposed approach does not outperform the baseline in clean ice segmentation
    • I hypothesize this is because my model is a single multi-class model, while the baseline is two separate binary models
    • I will train separate binary models for the final dissertation


36 of 42

#3 - Physics-Informed Network For Glacial Velocity Prediction


37 of 42

Dataset

  • There exists a dataset of glacier ice velocities created by the National Snow and Ice Data Center
    • “MEaSUREs ITS_LIVE Regional Glacier and Ice Sheet Surface Velocities”
    • https://nsidc.org/data/nsidc-0776/versions/1
  • This dataset contains the surface velocities for major glacier-covered regions in the world, including the Hindu-Kush Himalayas (HKH)
  • The dataset spans from 1985 to 2018 and was compiled from satellite images taken by NASA’s Landsat-4 through Landsat-8
  • Glacial ice velocity prediction is a specialized form of fluid velocity prediction where
    • Fluid = Ice from glaciers


38 of 42

Architecture

  • We have already developed Physics-Informed LSTMs for fluid velocity predictions
    • We can use that as the starting architecture for this new dataset
  • We will experiment with changes to the architecture by:
    • Trying other sequential models that have been reported to outperform LSTMs on some tasks, such as Transformers and GRUs
  • Beyond that, the main change to investigate is the loss function
    • Currently, the loss function is based on the 2D incompressible Navier-Stokes equations
    • Need to learn more about ice flow and determine how to adapt the previous model to this new type of data
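
For reference, the 2D incompressible Navier-Stokes equations mentioned above are written out below in their standard form. The symbols ρ (density) and ν (kinematic viscosity) and the omission of body forces are assumptions for this sketch; u, v, p, x, y, and t follow the dataset variables described earlier. In a PINN, each equation is rearranged so that one side is zero and the remainder is penalized in the loss.

```latex
\begin{aligned}
\frac{\partial u}{\partial t}
  + u \frac{\partial u}{\partial x}
  + v \frac{\partial u}{\partial y}
  &= -\frac{1}{\rho} \frac{\partial p}{\partial x}
     + \nu \left( \frac{\partial^2 u}{\partial x^2}
                + \frac{\partial^2 u}{\partial y^2} \right) \\
\frac{\partial v}{\partial t}
  + u \frac{\partial v}{\partial x}
  + v \frac{\partial v}{\partial y}
  &= -\frac{1}{\rho} \frac{\partial p}{\partial y}
     + \nu \left( \frac{\partial^2 v}{\partial x^2}
                + \frac{\partial^2 v}{\partial y^2} \right) \\
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} &= 0
\end{aligned}
```

The first two lines are the momentum equations and the last is the incompressibility (continuity) condition; adapting the model to ice flow would mean replacing these residuals with ones suited to glacier dynamics.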


39 of 42

#4 - Physics-Informed Network For Glacial Ice Segmentation


40 of 42

Physics-Informed Segmentation

  • We will investigate different approaches to combine
    • Physics-Informed Neural Networks used for the glacial velocity predictions
    • U-Net used for the glacial ice segmentation
  • This is feasible and achievable because:
    • The datasets for these two problems can be combined as they cover the same time periods and were created from the same input satellite images (Landsat-7)
  • We will investigate two methodologies to combine them:
    • A two-branch network, built the same way as the PINN + LSTM network
    • A self-learning loss that combines the losses of both networks into one


41 of 42

Summary of Contributions

  1. Physics-Informed Network for Fluid Flow Velocity Prediction
    • Completed and published with Physics-Informed LSTMs
    • Achieved better performance than LSTMs or PINNs alone
  2. Physics-Informed Data Augmentation for Glacial Ice Segmentation
    • Completed with some extra experiments pending
    • Achieved better performance than baseline in Debris-ice segmentation
  3. Physics-Informed Network for Glacial Ice Velocity Prediction
    • Dataset already exists
    • Contribution #1’s model can be adapted to it, as glacial ice flow is a subclass of the general fluid flow problem
  4. Physics-Informed Network for Glacial Ice Segmentation
    • Feasible as data from contribution #2 and #3 are both from the same satellite and same time periods
    • Requires investigation into determining how to combine the separate models into one big multi-modal network


42 of 42

Thank You!
