1 of 80

Speaker

Timing (60 mins total)

Outline

Intro

Steve / Nick

5 min

  • What do we mean by Classical Machine Learning?
  • EE offers built-in APIs for Classical ML.
  • The basic architecture of an ML model that classifies the land.
  • Advantages of Classical ML over Deep Learning
  • The ML journey - Gather Data, Prepare Data, Train and Test.

Extracting more from each pixel with derived information

Eliana

10 min

  • Urban Forests in Porto Alegre
  • Show Code in slides. Code will be provided for folks to run afterwards.

A basic land classification task. Dealing with unbalanced classes

Steve / Nick

10 min

  • A basic example with USDA NASS Cropland Data Layers as our labels using Stratified Random Sampling
  • Hands on Coding Exercise

Hyperparameter Tuning

Steve / Nick

8 min

  • Switch the USDA NASS model to use hyperparameter tuning of tree size
  • Hands on Coding Exercise OR Demo?

K-Fold Cross-Validation

Steve / Nick

2 min

  • Why K-Fold?

Post-Processing

Andréa

5 min

  • Removing Isolated Pixels - Mapping Mangroves in Guyana
  • Show Code in slides. Code will be provided for folks to run afterwards.

Ensembling Multiple Models

Emma

15 min

  • Mapping Smallholder Farms
  • Show Code in slides. Code will be provided for folks to run afterwards.

Q&A

All

5 min

2 of 80

Link → Target

Cropland Exercise:
https://goo.gle/g4g24-croplands → https://colab.research.google.com/drive/1O1DyeamBnl2-jnFR6kljmnH7mGTn7Uxy?usp=sharing

Cropland Solutions:
https://goo.gle/g4g24-croplands-solution → https://colab.research.google.com/drive/1YnroAmIQH_Zl6q_I4T6fN817oF1iT2qa?usp=sharing

https://code.earthengine.google.com/fb6dd088744a43b868ada67a38dadffd

https://code.earthengine.google.com/91072ebc1a373fe5a7775a9d561fc91e

3 of 80

Geo for Good Summit Sept 25th - What’s Happening Now?

Classifying the Land - Scaling and Improving the Accuracy of Classical ML APIs in EE

Room: Experiment

Data Catalog - new datasets and new capabilities

Room: Carina Nebula

Configuring your Earth Engine access

Room: Birr Castle

9:15-10:15am

10:15-10:45am

Coffee break outside Experiment, Carina Nebula and Birr Castle

10:45-12:15pm

Automating Earth Engine workflows

Room: Experiment

Sustainable Sourcing and Convergence of Evidence

Room: Carina Nebula

Importing into Earth Engine

Room: Birr Castle

12:15-1:45pm

Lunch outside Experiment

“Ask Me Anything” session with Earth Engine Product Managers in Experiment from 12:45-1:15pm

You are here

4 of 80

Classifying the Land: Scaling and Improving the Accuracy of Classical Machine Learning APIs in Earth Engine

Steve Greenberg, Earth Engine Developer Relations Lead

Andréa P Nicolau, Spatial Informatics Group

Eliana Lima da Fonseca, Universidade Federal do Rio Grande do Sul

[Pre-recorded] Emma Izquierdo Verdiguier, University of Natural Resources and Life Sciences (BOKU)

September 2024 | #GeoForGood24

5 of 80

Classifying the Land: Scaling and Improving the Accuracy of Classical Machine Learning APIs in Earth Engine

Nicholas Clinton, Earth Engine

[Pre-recorded] Andréa P Nicolau, Spatial Informatics Group

[Pre-recorded] Eliana Lima da Fonseca, Universidade Federal do Rio Grande do Sul

Emma Izquierdo Verdiguier, University of Natural Resources and Life Sciences (BOKU)

September 2024 | #GeoForGood24

6 of 80

Agenda

01

02

03

Introduction

Why classify?

Why not use deep learning?

Extracting more from your pixels

[Demo] Urban Forests in Porto Alegre

Hyperparameter Tuning

[Hands on Coding] Modifying Random Forest size

Post Processing

[Demo] Mangroves in Guyana

Ensembling Multiple Models

[Demo] Mapping Smallholder Farms

Q&A

04

05

06

#GeoForGood24

7 of 80

Introduction


11 of 80

Different from Deep Neural Networks

Classical ML

Each Pixel is (typically) considered independently

Less computationally intense

Earth Engine supports with built-in APIs

Deep Learning

Classification is done on arrays of pixels

More computationally intense

Requires use of Tensorflow / PyTorch and Vertex AI

12 of 80

Easy to get started

Using classical machine learning in Earth Engine is very easy if the algorithm you want to use is built-in.

Various examples to adapt from and minimal setup let you focus on the other important problems in machine learning instead.

Good results

Although some classical machine learning methods appear primitive compared to their neural-network counterparts, depending on the problem they can give good results.

Many analyses using algorithms such as random forest lead to real-world impact and good results.

Understandable

Classical machine learning models are well understood with lots of literature detailing their strengths and weaknesses.

Easier to cite for research and easier to introspect for internal state and learned features than a deep neural network, which appears as a black box.

Why use a classical machine learning model?

13 of 80

The Machine Learning Journey

Assessment

Post Processing

Train Model

Gather and Prepare Data

14 of 80

The Machine Learning Journey

Assessment

Post Processing

Train Model

Gather and Prepare Data

Extracting more from each pixel with derived information

15 of 80

Identifying Urban Forests

Extracting more from each pixel with derived information

Eliana Lima da Fonseca

Professor at Universidade Federal do Rio Grande do Sul

Earth Engine Google Developer Expert

Demo

16 of 80

Urban forests in Landsat images

Landsat images

Long time-series

Good temporal resolution for urban areas

Limitations

Spatial resolution (30 meters)

Many different targets inside the same pixel

Some feasible solutions to improve the classification:

“Choose the correct season to identify urban greens”

“Look for more information with derived information”

17 of 80

Building urban vegetation time-series images

Limitations

Spatial resolution (30 meters)

Many different targets inside the same pixel

No samples collected for the past images

Validation metrics cannot be calculated

Solutions

Sub-pixel information

Visual inspection

Scientifically validated methods

using Landsat 8 imagery and machine learning algorithms

18 of 80

Building urban vegetation time-series images

Input data

Landsat 8 imagery (2014 - present)

    • Original optical bands

    • Endmembers images

    • Summer and Winter images

19 of 80

Information in a sub-pixel level

20 of 80

Information in a sub-pixel level

Linear spectral mixture model in urban areas

The solution of the linear system gives the proportion (x) of each endmember inside each pixel. These proportions are given in new bands.

r = vegetation·x1 + built·x2 + shadow·x3

Where:

r = pixel spectral reflectance

x1 = proportion of vegetation in the pixel

x2 = proportion of built-up surfaces in the pixel

x3 = proportion of shadow in the pixel
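The mixture model is just a small linear system. As a sketch outside Earth Engine, the proportions can be recovered with a least-squares solve in numpy (the endmember spectra below are made-up numbers for illustration, not values from the talk):

```python
import numpy as np

# Hypothetical endmember spectra (rows: bands; columns: vegetation, built, shadow).
E = np.array([
    [0.05, 0.20, 0.02],   # blue
    [0.08, 0.22, 0.02],   # green
    [0.04, 0.25, 0.01],   # red
    [0.45, 0.30, 0.02],   # NIR
])

# Observed pixel reflectance: a 50/30/20 mixture of the three endmembers.
x_true = np.array([0.5, 0.3, 0.2])
r = E @ x_true

# Solve r = E x for the endmember proportions (least squares).
x, *_ = np.linalg.lstsq(E, r, rcond=None)
print(np.round(x, 3))  # -> [0.5 0.3 0.2]
```

In Earth Engine itself this per-pixel solve is what `ee.Image.unmix()` does, given the endmember spectra.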

21 of 80

Endmember spectral patterns

22 of 80

Endmember selection

Builts

Area with no trees and no shadows

Vegetation

Urban forest

Shadow/Asphalt

A shadow projected over the asphalt

Visual inspection over basemap (high resolution image)

23 of 80

Endmember selection

24 of 80

Input data: Sub-pixel information

25 of 80

Classification / clustering: CART classifier

Input data tests: using Summer and Winter images with/without endmembers images

26 of 80

Results

Basemap

Original bands

Derived information

Visual inspection over basemap (high resolution image) - Point (-51.2189, -30.05509)

27 of 80

Results

Basemap

Derived information

Winter (2021_06_01)

Derived information

Summer (2021_12_10)

Visual inspection over basemap (high resolution image) - Point (-51.2189, -30.05509)

28 of 80

Code for endmember selection

Code for classification

Building urban vegetation time-series images

https://goo.gle/g4g24-endmembers

https://goo.gle/g4g24-unmixing

using Landsat 8 imagery and machine learning algorithms

29 of 80

The Machine Learning Journey

Assessment

Post Processing

Train Model

Gather and Prepare Data

30 of 80

A simple classification example

Predicting crops with USDA Croplands Dataset

https://goo.gle/g4g24-croplands

31 of 80

The Machine Learning Journey

Assessment

Post Processing

Train Model

Gather and Prepare Data

Hyperparameter Tuning

32 of 80

The Machine Learning Journey

Assessment

Post Processing

Train Model

Gather and Prepare Data

Hyperparameter Tuning

Removing isolated pixels

33 of 80

Removing Isolated Pixels

Mangrove Classification

Andréa Puzzi Nicolau

Geospatial Data Scientist, Spatial Informatics Group

Earth Engine Google Developer Expert

Demo

34 of 80

Removing Isolated Pixels

Often when classifying an image to obtain a land use land cover map, we encounter isolated pixels in the final classified image. What are some ways of removing these isolated pixels?

Mangrove Classification

36 of 80

Focal Mode Operator

var classifiedMode = classified.focalMode(3)

A shortcut for .reduceNeighborhood() with a mode reducer (ee.Reducer.mode()). The mode operator takes the value that occurs most frequently within the neighborhood (kernel); the argument (3) is the kernel radius.
41 of 80

Weighted Focal Mode

// Define 3x3 window with (euclidean) distance weights from corners.
var weights = [[1, 2, 1],
               [2, 3, 2],
               [1, 2, 1]];

// Create kernel.
var kernel = ee.Kernel.fixed(3, 3, weights);

// Apply mode on neighborhood using weights.
var classifiedWmode = classified.focalMode({kernel: kernel});

Gives higher weight to closer pixels, which controls for “over-postprocessing”.

43 of 80

Weighted Focal Mode

Original Classification

Focal Mode

Weighted Focal Mode

44 of 80

Clustering

var seeds = ee.Algorithms.Image.Segmentation.seedGrid(5);
var snic = ee.Algorithms.Image.Segmentation.SNIC({
  image: composite,
  compactness: 0,
  connectivity: 4,
  neighborhoodSize: 10,
  size: 2,
  seeds: seeds
});
var clusters = snic.select('clusters');
var smoothed = classified.addBands(clusters);
var classifiedCluster = smoothed.reduceConnectedComponents({
  reducer: ee.Reducer.mode(),
  labelBand: 'clusters'
});

seedGrid() selects seed pixels for clustering; the seeds are used to form square “superpixels” (they can be hexagonal too). SNIC (Simple Non-Iterative Clustering) performs the superpixel clustering, and reduceConnectedComponents() applies a reducer (in this case, a mode reducer) to all of the pixels inside each cluster.
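The reduceConnectedComponents() step amounts to replacing each pixel's class with the most common class inside its cluster. A plain-numpy sketch of that idea (toy arrays for illustration, not the EE implementation):

```python
import numpy as np

def mode_within_clusters(classified, clusters):
    """Replace each pixel's class with the mode of its cluster's classes."""
    out = np.empty_like(classified)
    for cid in np.unique(clusters):
        mask = clusters == cid
        values, counts = np.unique(classified[mask], return_counts=True)
        out[mask] = values[np.argmax(counts)]
    return out

classified = np.array([[0, 0, 1],
                       [0, 1, 1],
                       [0, 1, 1]])
clusters = np.array([[0, 0, 1],
                     [0, 0, 1],
                     [0, 0, 1]])

# The stray class-1 pixels inside cluster 0 snap to that cluster's mode (0).
cleaned = mode_within_clusters(classified, clusters)
print(cleaned)
```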

49 of 80

Original Classification

Focal Mode

Weighted Focal Mode

Clustering

50 of 80

When adding to the map…

…classified.reproject('EPSG:XXXX', null, 10)

Map.addLayer(...)

Neighborhood methods are scale-dependent, so the results change as you zoom in and out. This is why you need to force a reprojection to see the changes on the map.

Alternatively, you can export the image as an asset and re-import it to see the effect.

51 of 80

Code

https://goo.gle/g4g24-mangroves

52 of 80

The Machine Learning Journey

Assessment

Post Processing

Train Model

Gather and Prepare Data

Ensembling Multiple Models

53 of 80

Crop Classification

Ensembling multiple models

Emma Izquierdo-Verdiguier

University of Natural Resources and Life Sciences, Vienna (BOKU)

Earth Engine Google Developer Expert

Demo

54 of 80

Crop classification

Very complicated task!


55 of 80

Crop classification

What Could Possibly Go Wrong?

  • Lack of labels.
  • Crops with similar phenology calendars.
  • Different conditions for the same type of crop.

Very complicated task!



58 of 80

How can it be improved?


61 of 80

Cloud-based ensemble classifier

Weights can be:

  • Overall accuracy
  • Kappa index
  • A value derived from them

Base classifiers

Aguilar, R., et al. "A cloud-based multi-temporal ensemble classifier to map smallholder farming systems." Remote Sensing 10.5 (2018): 729.


62 of 80

How does it work?


63 of 80

Simple example

Three base classifiers (κ = 0.58, 0.65, and 0.62 for classifiers 1, 2, and 3) each produce a two-class map (Class 1 / Class 2).



65 of 80

Simple example

Class 1

Class 2

Ensemble
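The vote in this example can be sketched in plain Python. The weight formula log10(κ / (1 − κ)) is the one used in the code slides; the κ values are the ones from the simple example:

```python
import math

# Kappa scores for the three base classifiers in the simple example.
kappas = [0.58, 0.65, 0.62]

# Weight formula from the slides: log10(kappa / (1 - kappa)).
weights = [math.log10(k / (1 - k)) for k in kappas]

def ensemble_vote(predictions, weights):
    """Weighted majority vote over per-classifier class predictions."""
    totals = {}
    for cls, w in zip(predictions, weights):
        totals[cls] = totals.get(cls, 0.0) + w
    return max(totals, key=totals.get)

# Classifiers 2 and 3 predict class 2, classifier 1 predicts class 1:
# class 2 wins on total weight.
print(ensemble_vote([1, 2, 2], weights))  # -> 2
```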


66 of 80

Ensemble in the Earth Engine code editor


67 of 80

Ensemble in the Earth Engine code editor

var names_clas = ['LIN', 'POLY', 'RBF', 'RF', 'GB'];

// Base classifiers: weights obtained from the kappa values are needed:
var weight = ee.Number(kappa.divide(ee.Number(1).subtract(kappa))).log10();

...

// Classification maps:
base_img = base_img.set({
  'weight': ee.Number(weight),
  'Img_id': clas
}).rename('based_classifier');


68 of 80

Ensemble in the Earth Engine code editor

var num_classifiers = ee.List.sequence(0, num_classifiers - 1);

// Create an ImageCollection with one image per class. Each image contains
// two bands (i.e., the weight per class and the class number):
function ic_weight(cl) {
  var img_weight = ee.ImageCollection(num_classifiers.map(
    function(classi) {
      ...
    }));
  img_weight = img_weight.sum();
  return img_weight.addBands(ee.Image.constant(cl).toByte());
}


69 of 80

Ensemble in the Earth Engine code editor

var ic_weight_cls = ee.List.sequence(0, num_classes - 1).map(ic_weight);

ic_weight_cls = ee.ImageCollection(ic_weight_cls);

var ensemble_class = ic_weight_cls.reduce(ee.Reducer.max(2));

ensemble_class = ensemble_class.rename(['weight', 'ensemble_cls']);


70 of 80

Ensemble in the Earth Engine code editor


71 of 80

Demo


72 of 80

Ensemble in the Earth Engine code editor

Based classifiers code:

https://goo.gle/g4g24-based-classifiers

Ensemble generation code:

https://goo.gle/g4g24-ensemble-generation


73 of 80

Q&A

Links for more information …


74 of 80

Geo for Good Summit Sept 25th

Classifying the Land - Scaling and Improving the Accuracy of Classical ML APIs in EE

Room: Experiment

Data Catalog - new datasets and new capabilities

Room: Carina Nebula

Configuring your Earth Engine access

Room: Birr Castle

9:15-10:15am

10:15-10:45am

Coffee break outside Experiment, Carina Nebula and Birr Castle

10:45-12:15pm

Automating Earth Engine workflows

Room: Experiment

Sustainable Sourcing and Convergence of Evidence

Room: Carina Nebula

Importing into Earth Engine

Room: Birr Castle

12:15-1:45pm

Lunch outside Experiment

“Ask Me Anything” session with Earth Engine Product Managers in Experiment from 12:45-1:15pm

Sign up for Office Hours: goo.gle/g4g24-office-hours

75 of 80

The Machine Learning Journey

Assessment

Post Processing

Train Model

Gather and Prepare Data

K-Fold Cross Validation

76 of 80

K-Fold Cross Validation

All samples are split into k random subsets. Each fold holds one subset out for validation and trains on the remaining k − 1 subsets:

fold 1: Validation | Training | Training | … | Training → fold 1 accuracy
fold 2: Training | Validation | Training | … | Training → fold 2 accuracy
fold 3: Training | Training | Validation | … | Training → fold 3 accuracy
…
fold k: Training | Training | Training | … | Validation → fold k accuracy

The k fold accuracies are then averaged to give the overall accuracy.
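The procedure can be sketched in plain Python (a toy majority-class "model" stands in for the real classifier; samples are (feature, label) pairs):

```python
import random

def k_fold_accuracies(samples, k, train_and_score):
    """Split samples into k random subsets; hold each out once for validation."""
    samples = samples[:]
    random.shuffle(samples)
    folds = [samples[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        validation = folds[i]
        training = [s for j, f in enumerate(folds) if j != i for s in f]
        scores.append(train_and_score(training, validation))
    return sum(scores) / k

# Toy model: "training" just finds the majority label and always predicts it.
def majority_baseline(training, validation):
    labels = [y for _, y in training]
    majority = max(set(labels), key=labels.count)
    return sum(1 for _, y in validation if y == majority) / len(validation)

# 70 samples of class 0 and 30 of class 1.
data = [(x, 0) for x in range(70)] + [(x, 1) for x in range(30)]
print(round(k_fold_accuracies(data, k=5, train_and_score=majority_baseline), 2))  # -> 0.7
```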

77 of 80

A simple classification example - https://goo.gle/g4g24-croplands

# We will use USDA NASS Cropland Data Layers as our labels to train on.
# Take one CDL image and add the S2 bands to it.
cdl = ee.ImageCollection('USDA/NASS/CDL').first().select('cropland')

cdl = cdl.addBands(composite)

# Take a stratified sample: 1000 points per class.
sample = cdl.stratifiedSample(numPoints=1000, classBand='cropland')

# Add a random column and partition the sample into 3481 training
# and 1519 validation samples (~70% / ~30%).
sample = sample.randomColumn()

training = sample.filter(ee.Filter.lt('random', 0.7))

validation = sample.filter(ee.Filter.gte('random', 0.7))

# Train a random forest with 10 trees.
classifier = ee.Classifier.smileRandomForest(10).train(
    features=training, classProperty='cropland',
    inputProperties=composite.bandNames())

# Accuracy on the validation set.
testAccuracy = validation.classify(classifier).errorMatrix(
    'cropland', 'classification').accuracy()

78 of 80

Hyperparameter Tuning

79 of 80

The Grid Search Method

You define the range of valid values for each parameter.

  1. Train a model for each combination of values.
  2. Evaluate each model: calculate the accuracy on validation data.
  3. Choose the model with the highest accuracy.
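The three steps above can be sketched in plain Python (the parameter names mirror random-forest settings, but the grid and the scoring function are invented for illustration):

```python
from itertools import product

# Hypothetical hyperparameter grid.
grid = {
    'numberOfTrees': [10, 50, 100],
    'minLeafPopulation': [1, 5],
}

def grid_search(grid, evaluate):
    """Train/evaluate one model per combination; return the best settings."""
    best_score, best_params = float('-inf'), None
    keys = sorted(grid)
    for values in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = evaluate(params)  # stands in for train + validate
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

# Toy validation score that peaks at 100 trees and minLeafPopulation 1.
def toy_score(p):
    return p['numberOfTrees'] - 10 * (p['minLeafPopulation'] - 1)

best, score = grid_search(grid, toy_score)
print(best)  # -> {'minLeafPopulation': 1, 'numberOfTrees': 100}
```

In practice each `evaluate` call is a full train-and-validate run, so the cost grows multiplicatively with the number of values per parameter.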

80 of 80

Let's code

https://goo.gle/g4g24-croplands

Solution: https://goo.gle/g4g24-croplands-solution