1 of 18

Land Cover

Multi-class Semantic Segmentation

Gianluca Mangiapane & Johnathan Clementi

Remote Sensing - MUSA 650

Spring 2022

2 of 18

About the problem

Land Use/Land Cover (LULC) data are an important input for ecological, hydrological, and agricultural models [1]. The National Land Cover Database (NLCD) is developed by the USGS from Landsat imagery [2].

However, these data have traditionally had large temporal gaps (~5 years) as they are computationally intensive to create.

More temporally granular land cover data are needed for studying a rapidly changing environment.


3 of 18

How can we access more temporally granular land cover data?


[Figure: example raw satellite image and its corresponding classified image]

4 of 18

Automate land cover classification

using semantic segmentation

  • Semantic segmentation is the process of assigning each pixel in an image a label based on a predefined set of classes

  • Instead of manually classifying the pixels of an image, we will train a type of deep neural network, known as a U-Net, to assign a predicted label to each pixel in an image

  • The U-Net architecture was proposed in 2015 by O. Ronneberger, P. Fischer, and T. Brox [3]

  • U-Net consists of two paths: encoding and decoding
    • Encoding (contracting path) - convolve and downsample repeatedly to capture context and increasingly abstract patterns
    • Decoding (expanding path) - upsample and merge skip connections from the encoder to localize those patterns, assigning a class to each pixel

5 of 18

The U-Net Architecture

[Figure: the U-Net architecture, showing the encoding and decoding paths. Image from citation 3.]
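To make the encoding/decoding structure concrete, below is a minimal Keras/TensorFlow sketch of a U-Net in the spirit of citation 3. The input size (512 x 512 x 3) and the 7 output classes match our DeepGlobe setup, but the filter counts and layer choices here are illustrative, not necessarily our exact implementation.

```python
# Minimal U-Net sketch (tf.keras). Assumes 512x512 RGB inputs and 7 land cover classes.
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # Two 3x3 convolutions, as in the original U-Net building block
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet(input_shape=(512, 512, 3), n_classes=7):
    inputs = layers.Input(input_shape)

    # Encoding path: convolutions + max pooling capture context at lower resolution
    c1 = conv_block(inputs, 64);  p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, 128);     p2 = layers.MaxPooling2D()(c2)
    c3 = conv_block(p2, 256);     p3 = layers.MaxPooling2D()(c3)
    c4 = conv_block(p3, 512);     p4 = layers.MaxPooling2D()(c4)

    # Bottleneck
    b = conv_block(p4, 1024)

    # Decoding path: upsample and concatenate skip connections to recover spatial detail
    u4 = layers.Conv2DTranspose(512, 2, strides=2, padding="same")(b)
    c5 = conv_block(layers.concatenate([u4, c4]), 512)
    u3 = layers.Conv2DTranspose(256, 2, strides=2, padding="same")(c5)
    c6 = conv_block(layers.concatenate([u3, c3]), 256)
    u2 = layers.Conv2DTranspose(128, 2, strides=2, padding="same")(c6)
    c7 = conv_block(layers.concatenate([u2, c2]), 128)
    u1 = layers.Conv2DTranspose(64, 2, strides=2, padding="same")(c7)
    c8 = conv_block(layers.concatenate([u1, c1]), 64)

    # Per-pixel class probabilities via a 1x1 convolution + softmax
    outputs = layers.Conv2D(n_classes, 1, activation="softmax")(c8)
    return Model(inputs, outputs)
```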

6 of 18

Benefits of segmentation and

Other methods considered

While a traditional CNN returns a single output label for each image, a U-Net returns a classification for every pixel. It can therefore act as a multi-class classifier on a single image, with multiple classes labeled.

We did explore testing traditional encoder-type CNNs on the LULC problem, but we would have needed to adjust the labeling scheme of the data to obtain a single label per image.


7 of 18

The Data


  • 2018 DeepGlobe Land Cover Classification Challenge, accessed via Kaggle [4]
  • 803 training images and corresponding labeled images (known as masks)
  • Each image is 2448 x 2448 pixels with 3 channels (RGB)
  • Each mask is also 2448 x 2448 pixels with 3 channels (RGB)

8 of 18

Data Preparation


  • Trimming
    • 2448 is not divisible by 32, a requirement for our U-Net inputs
    • Remove a 200-pixel border from each side (2448 − 400 = 2048)
  • Cropping
    • 2048 x 2048 x 3 images are too large for less powerful machines
    • Crop each image to 512 x 512 (a sketch of both steps follows)
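A minimal NumPy sketch of these two steps. The helper name trim_and_crop is hypothetical, and tiling each 2048 x 2048 image into sixteen non-overlapping 512 x 512 crops is one possible cropping scheme.

```python
import numpy as np

def trim_and_crop(image, border=200, crop=512):
    """Trim a fixed border, then tile the result into non-overlapping crops.

    2448 - 2 * 200 = 2048, and 2048 / 512 = 4, so each DeepGlobe image
    yields 16 crops of 512 x 512.
    """
    trimmed = image[border:-border, border:-border, :]   # 2048 x 2048 x 3
    rows, cols, _ = trimmed.shape
    crops = []
    for i in range(0, rows, crop):
        for j in range(0, cols, crop):
            crops.append(trimmed[i:i + crop, j:j + crop, :])
    return np.stack(crops)                               # (16, 512, 512, 3)
```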

9 of 18

One-Hot Encoding


  • Machine learning algorithms need numerical values for the dependent variable
  • The masks are categorical in nature (urban, agriculture, rangeland, etc.) and must be converted
  • To do this we one-hot encode the masks, creating a unique channel containing binary (0/1) outcomes for each class (a sketch follows)
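A sketch of this conversion, assuming the masks use the standard DeepGlobe RGB color palette; the exact colors should be verified against the dataset's class dictionary.

```python
import numpy as np

# Assumed DeepGlobe land cover palette (RGB -> class index); verify against the
# dataset's class dictionary before relying on these exact colors.
PALETTE = {
    (0, 255, 255): 0,    # urban
    (255, 255, 0): 1,    # agriculture
    (255, 0, 255): 2,    # rangeland
    (0, 255, 0): 3,      # forest
    (0, 0, 255): 4,      # water
    (255, 255, 255): 5,  # barren
    (0, 0, 0): 6,        # unknown
}

def one_hot_mask(mask_rgb):
    """Convert an (H, W, 3) RGB mask into an (H, W, n_classes) one-hot array."""
    h, w, _ = mask_rgb.shape
    one_hot = np.zeros((h, w, len(PALETTE)), dtype=np.uint8)
    for color, idx in PALETTE.items():
        # Pixels whose RGB value matches this class color get a 1 in that channel
        matches = np.all(mask_rgb == np.array(color), axis=-1)
        one_hot[..., idx] = matches.astype(np.uint8)
    return one_hot
```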

10 of 18

Our 1st U-Net!

  • Constructed a U-Net based on the original U-Net architecture [3]
  • Batch size = 5, epochs = 50, optimizer = Adam, loss = categorical cross entropy
  • Use callback ReduceLROnPlateau to reduce learning rate when the training accuracy plateaus
  • Testing accuracy: 76.64%
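A hedged sketch of this training setup in tf.keras. X_train and Y_train are placeholder arrays of cropped images and one-hot masks, and the ReduceLROnPlateau factor and patience shown here are illustrative values rather than our exact settings.

```python
from tensorflow.keras.callbacks import ReduceLROnPlateau

model = build_unet()  # the architecture sketch from slide 5

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Reduce the learning rate when accuracy plateaus (factor/patience are illustrative)
reduce_lr = ReduceLROnPlateau(monitor="accuracy", factor=0.5, patience=3, verbose=1)

# X_train: (N, 512, 512, 3) images; Y_train: (N, 512, 512, 7) one-hot masks (placeholders)
history = model.fit(X_train, Y_train,
                    batch_size=5,
                    epochs=50,
                    callbacks=[reduce_lr])
```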

11 of 18

Our 1st U-Net: results

12 of 18

Our 2nd U-Net!

  • Based on the prediction errors from the 1st U-Net, we hypothesized that increasing the batch size could help the model differentiate between classes
  • Batch size = 25, epochs = 75, optimizer = Adam, loss = categorical cross entropy
  • Use callback ReduceLROnPlateau to reduce learning rate when the training accuracy plateaus
  • Testing accuracy: 80.03%

13 of 18

Our 2nd U-Net: results

14 of 18

Our 3rd U-Net: testing other loss functions

  • We also tested an alternative loss function: a total loss defined as dice loss + (1 * focal loss)
  • Batch size = 25, epochs = 75, optimizer = Adam
  • Use callback ReduceLROnPlateau to reduce learning rate when the training accuracy plateaus
  • Testing accuracy: 78.48%
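A sketch of how such a combined loss can be written by hand in tf.keras. The smooth, gamma, and alpha constants are common defaults, not necessarily the values used in our runs.

```python
from tensorflow.keras import backend as K

def dice_loss(y_true, y_pred, smooth=1.0):
    # Soft Dice loss computed per class over batch/height/width, then averaged
    axes = [0, 1, 2]
    intersection = K.sum(y_true * y_pred, axis=axes)
    denom = K.sum(y_true, axis=axes) + K.sum(y_pred, axis=axes)
    dice = (2.0 * intersection + smooth) / (denom + smooth)
    return 1.0 - K.mean(dice)

def categorical_focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25):
    # Focal loss down-weights easy pixels so training focuses on hard ones
    y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())
    cross_entropy = -y_true * K.log(y_pred)
    weight = alpha * K.pow(1.0 - y_pred, gamma)
    return K.mean(K.sum(weight * cross_entropy, axis=-1))

def total_loss(y_true, y_pred):
    # Total loss as described on this slide: dice loss + (1 * focal loss)
    return dice_loss(y_true, y_pred) + 1.0 * categorical_focal_loss(y_true, y_pred)

# model.compile(optimizer="adam", loss=total_loss, metrics=["accuracy"])
```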

15 of 18

Our 3rd U-Net: results

16 of 18

Conclusion

  • We were able to build a land cover classification model using the U-Net architecture and a trimmed and cropped dataset containing 500 images

  • The model with the highest accuracy correctly identifies 80.03% of the pixels within the testing set

17 of 18

Next steps

Cropping

We built in functionality to adjust the cropping size for our data augmentation step. It would be interesting to see how increasing or decreasing the size of the cropped images and masks would affect a model’s ability to predict our land cover classes.

Generalizability

We originally intended to test the model’s generalizability on this dataset, but unfortunately we ran out of time.

Loss functions

In future iterations of this project we would like to further test segmentation loss functions such as IoU and Dice.

Transfer Learning

We first tested transfer learning using this package, but ran into many problems. There appear to be many U-Net-like architectures out there to test, such as using ResNet50 as the U-Net encoder (a sketch follows).
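As one possible starting point, the sketch below uses the segmentation_models package (an assumption on our part, not necessarily the package referenced above) to build a U-Net with an ImageNet-pretrained ResNet50 encoder.

```python
# Hypothetical transfer-learning setup using qubvel/segmentation_models
import segmentation_models as sm

sm.set_framework("tf.keras")

model = sm.Unet(
    backbone_name="resnet50",     # ImageNet-pretrained encoder
    input_shape=(512, 512, 3),
    classes=7,
    activation="softmax",
    encoder_weights="imagenet",
    encoder_freeze=True,          # train the decoder first, then optionally fine-tune
)

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```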


18 of 18

Citations

  1. M. K. Arora, “Land cover classification from Remote Sensing data,” Geospatial World, Dec. 09, 2010. https://www.geospatialworld.net/article/land-cover-classification-from-remote-sensing-data/ (accessed May 02, 2022).

  2. “National Land Cover Database | U.S. Geological Survey.” https://www.usgs.gov/centers/eros/science/national-land-cover-database (accessed May 02, 2022).

  3. O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” arXiv:1505.04597 [cs], May 2015, Accessed: Apr. 05, 2022. [Online]. Available: http://arxiv.org/abs/1505.04597

  4. Demir et al., “DeepGlobe 2018: A Challenge to Parse the Earth Through Satellite Images.” https://competitions.codalab.org/competitions/18468#learn_the_details (accessed May 02, 2022).
