4 of 58

KWCOCO

Manifest of videos, images, categories, and annotations.
Efficient raster and vector sampling at specified resolution, space-time location, and sensor/band combination.
Multispectral images, raster features, and predicted heatmaps are simply another “asset” for each image.
Python API and CLI interfaces.
gitlab.kitware.com/computer-vision/kwcoco

[Diagram: a Video containing Img 1, Img 2, and Img 3; each image has stacked assets (Assets 1-3, e.g. R|G|B and pan) and Annots.]

Videos:

Corresponds to a region.
“Video-space” corresponds to a constant GSD.

Assets:

File-paths
Channels
Transform to “image-space”

Images:

Sensor
Datetime
Transform to “video-space”


Annotations:

Bounding box
Segmentation
Category
Stored in “image-space”

5 of 58

KWCOCO For SMART

Asset Space: native on disk
Image Space: all bands resampled to largest
Video Space: all regions in a common resolution

Register any raster data (raw or features)
All resampling is done on the fly


6 of 58

KWCOCO (tracking)

MS-COCO modified to support simple tracking:

Added top-level video table
Images contain:

The “video_id” they belong to
A “frame_index” or “timestamp” to define ordering of the frames

Annotations contain:

A “track_id” to be shared between annotations in the same track.

Limitations: annotations are stored for each frame. In the future we will introduce an alternative specification to reduce redundancy.
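A minimal sketch of the tracking fields above, using a plain dictionary rather than the kwcoco API. The file names, ids, and the helper names `frames_in_order` and `annotations_by_track` are made up for illustration; only `video_id`, `frame_index`, and `track_id` come from the slide.

```python
from collections import defaultdict

# Toy manifest: one video, two frames (listed out of order), two tracks.
dataset = {
    "videos": [{"id": 1, "name": "toy_video"}],
    "images": [
        {"id": 10, "file_name": "frame_b.png", "video_id": 1, "frame_index": 1},
        {"id": 11, "file_name": "frame_a.png", "video_id": 1, "frame_index": 0},
    ],
    "annotations": [
        {"id": 100, "image_id": 11, "category_id": 1, "bbox": [5, 5, 10, 10], "track_id": 7},
        {"id": 101, "image_id": 10, "category_id": 1, "bbox": [8, 6, 10, 10], "track_id": 7},
        {"id": 102, "image_id": 10, "category_id": 1, "bbox": [40, 40, 5, 5], "track_id": 8},
    ],
    "categories": [{"id": 1, "name": "toy_object"}],
}


def frames_in_order(dataset, video_id):
    """Return the images of one video sorted by frame_index."""
    imgs = [img for img in dataset["images"] if img["video_id"] == video_id]
    return sorted(imgs, key=lambda img: img["frame_index"])


def annotations_by_track(dataset):
    """Group annotations by track_id, ordered by the frame they appear in."""
    frame_of = {img["id"]: img["frame_index"] for img in dataset["images"]}
    tracks = defaultdict(list)
    for ann in dataset["annotations"]:
        tracks[ann["track_id"]].append(ann)
    for anns in tracks.values():
        anns.sort(key=lambda ann: frame_of[ann["image_id"]])
    return dict(tracks)


ordered = frames_in_order(dataset, video_id=1)
tracks = annotations_by_track(dataset)
```

Note that track 7 needs one annotation per frame, which is the redundancy the limitation above refers to.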
gitlab.kitware.com/computer-vision/kwcoco

[Diagram: a Video containing Img 1, Img 2, and Img 3; assets include R|G|B and depth; Annots carry a <track_id>.]

7 of 58

Motivation

MS-COCO is an easy to use format, with good / bad properties

All metadata in one file
No holes in polygons
No keypoint category encoding

8 of 58

KWCOCO

Pycocotools is the official API

completely static
Needs compiled modules

KWCOCO is an alternative

Can add / remove objects
Command line interface
Can run in pure-python
*new* experimental SQL backend
Support for an extended schema

Holes in polygons
Keypoints with categories
Group images into videos
Auxiliary channels

Pycocotools

9 of 58

KWCOCO CLI

Quick stats on a dataset
Combine two datasets together
Split a dataset into train / validation / test
Change the referenced image paths to a new absolute or relative location
Create DEMODATA!
Infer unset attributes to conform to the spec
Evaluate object detection using one COCO file as truth and another as predictions
Change category names
Validate the schema and assets

10 of 58

Demo Data

The toydata command can generate a toy dataset consisting of objects to detect on a noisy background.
Useful for testing of ML algorithms without having to rely on downloading a large dataset.

11 of 58

Default Dictionary Backend

To open an existing dataset:
import kwcoco
dset = kwcoco.CocoDataset(path)

dset.dataset is exactly what is loaded from the json file.
dset.index maintains fast lookup tables by primary keys (e.g. id)

dset.index.anns
dset.index.imgs
dset.index.cats
dset.index.gid_to_aids
dset.index.cid_to_aids
dset.index.name_to_video
dset.index.file_name_to_img

Note on notation: a = Annotation, c = Category, g = imaGe, vid = video (e.g. aid = annotation id, gid = image id)
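To make the lookup tables concrete, here is a sketch of how such an index could be derived from the raw dictionary. It is not kwcoco's implementation; `build_index` is a hypothetical helper and the toy data is invented. The table names follow the slide.

```python
from collections import defaultdict


def build_index(dataset):
    """Build id-keyed lookup tables from a COCO-style dictionary."""
    index = {
        "anns": {ann["id"]: ann for ann in dataset.get("annotations", [])},
        "imgs": {img["id"]: img for img in dataset.get("images", [])},
        "cats": {cat["id"]: cat for cat in dataset.get("categories", [])},
        "gid_to_aids": defaultdict(list),
        "cid_to_aids": defaultdict(list),
    }
    # Reverse mappings: which annotations belong to an image / a category.
    for aid, ann in index["anns"].items():
        index["gid_to_aids"][ann["image_id"]].append(aid)
        index["cid_to_aids"][ann["category_id"]].append(aid)
    return index


toy = {
    "images": [{"id": 1, "file_name": "a.png"}, {"id": 2, "file_name": "b.png"}],
    "categories": [{"id": 1, "name": "star"}, {"id": 2, "name": "box"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [0, 0, 4, 4]},
        {"id": 2, "image_id": 1, "category_id": 2, "bbox": [5, 5, 2, 2]},
        {"id": 3, "image_id": 2, "category_id": 1, "bbox": [1, 1, 3, 3]},
    ],
}
index = build_index(toy)
```

With these tables, "all annotations in image 1" is `index["gid_to_aids"][1]` instead of a scan over every annotation.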

Example JSON data structure

12 of 58

Vectorized interface

Convenience accessors are provided to allow for accessing multiple attributes with minimal code.

dset.images(<gids>)
dset.annots(<aids>)
dset.videos(<vidids>)
dset.categories(<cids>)
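The idea behind these accessors can be sketched with a small stand-in class: hold a list of ids and return one attribute for all of them in a single call. The class `ObjectList` and its `lookup` method below are illustrative assumptions written for this example, not the kwcoco implementation.

```python
class ObjectList:
    """Hypothetical stand-in for a vectorized accessor over one table."""

    def __init__(self, ids, table):
        self.ids = list(ids)   # the selected primary keys
        self._table = table    # id -> dictionary, e.g. an index table

    def __len__(self):
        return len(self.ids)

    def lookup(self, key, default=None):
        """Return the value of `key` for every selected id, in order."""
        return [self._table[i].get(key, default) for i in self.ids]


imgs = {
    1: {"id": 1, "width": 600, "height": 800},
    2: {"id": 2, "width": 300, "height": 400},
    3: {"id": 3, "width": 300},  # height deliberately unset
}
images = ObjectList([1, 3], imgs)
widths = images.lookup("width")
heights = images.lookup("height", default=None)
```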

13 of 58

Experimental SQLAlchemy Backend

A coco dataset can be converted into a read-only SQLite file.
Allows for working around Torch issue #13246 with DataLoaders and multiprocessing. (Only a python string is copied).
API is exactly the same as the dictionary-based json data structure.
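A rough sketch of why this helps with multiprocessing, using only the standard-library `sqlite3` module. The table layout, `to_sqlite`, and `SqlImageView` are assumptions made for this example and do not reflect kwcoco's actual SQL schema. The point is that the object handed to a worker carries a path string, not the whole manifest.

```python
import os
import pickle
import sqlite3
import tempfile
from pathlib import Path


def to_sqlite(dataset, db_path):
    """Write the image table of a COCO-style dictionary to a SQLite file."""
    con = sqlite3.connect(db_path)
    with con:  # commits on success
        con.execute(
            "CREATE TABLE images "
            "(id INTEGER PRIMARY KEY, file_name TEXT, width INTEGER, height INTEGER)"
        )
        con.executemany(
            "INSERT INTO images VALUES (?, ?, ?, ?)",
            [(g["id"], g["file_name"], g["width"], g["height"]) for g in dataset["images"]],
        )
    con.close()


class SqlImageView:
    """Dictionary-like read-only view; only the path is pickled."""

    def __init__(self, db_path):
        self.db_path = db_path
        self._con = None

    def __getstate__(self):
        # Drop the live connection so that a worker copy is just a string.
        return {"db_path": self.db_path, "_con": None}

    def _connect(self):
        if self._con is None:
            uri = Path(self.db_path).resolve().as_uri() + "?mode=ro"
            self._con = sqlite3.connect(uri, uri=True)
        return self._con

    def __getitem__(self, gid):
        row = self._connect().execute(
            "SELECT id, file_name, width, height FROM images WHERE id = ?", (gid,)
        ).fetchone()
        if row is None:
            raise KeyError(gid)
        return dict(zip(("id", "file_name", "width", "height"), row))


toy = {"images": [
    {"id": 1, "file_name": "a.png", "width": 600, "height": 800},
    {"id": 2, "file_name": "b.png", "width": 300, "height": 400},
]}
db_path = os.path.join(tempfile.mkdtemp(), "toy.sqlite")
to_sqlite(toy, db_path)
view = SqlImageView(db_path)
first = view[1]
clone = pickle.loads(pickle.dumps(view))  # what a worker process would receive
```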

14 of 58

SQL Scaling

16 of 58

NDSampler

Easy integration with kwcoco
Optional automatic conversion of images to COG format (configurable, with sensible defaults). Tricky because of GDAL, but ultimately worthwhile.
Spatial Indexes - quickly find all the annotations in a specified region of the image.
Support for kwimage (soon to be kwannot?) data structures.
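The slide does not say how the spatial index is implemented, so the following is only a generic sketch of the idea: a uniform-grid index over xywh boxes that answers "which annotations touch this region" without scanning every box. `GridIndex` and the cell size are hypothetical choices for this example.

```python
from collections import defaultdict


class GridIndex:
    """Uniform-grid spatial index over axis-aligned boxes (illustrative)."""

    def __init__(self, cell=64):
        self.cell = cell
        self.cells = defaultdict(set)  # (cx, cy) -> annotation ids
        self.boxes = {}                # annotation id -> (x1, y1, x2, y2)

    def _cells_for(self, x1, y1, x2, y2):
        c = self.cell
        for cy in range(int(y1 // c), int(y2 // c) + 1):
            for cx in range(int(x1 // c), int(x2 // c) + 1):
                yield (cx, cy)

    def insert(self, aid, xywh):
        x, y, w, h = xywh
        ltrb = (x, y, x + w, y + h)
        self.boxes[aid] = ltrb
        for key in self._cells_for(*ltrb):
            self.cells[key].add(aid)

    def query(self, x1, y1, x2, y2):
        """Return sorted ids of boxes that overlap the query region."""
        candidates = set()
        for key in self._cells_for(x1, y1, x2, y2):
            candidates |= self.cells.get(key, set())
        hits = []
        for aid in sorted(candidates):
            bx1, by1, bx2, by2 = self.boxes[aid]
            # Exact overlap test on the few candidates the grid returned.
            if bx1 < x2 and x1 < bx2 and by1 < y2 and y1 < by2:
                hits.append(aid)
        return hits


grid = GridIndex(cell=64)
grid.insert(1, [10, 10, 20, 20])
grid.insert(2, [200, 200, 50, 50])
grid.insert(3, [60, 60, 10, 10])
top_left = grid.query(0, 0, 64, 64)
middle = grid.query(100, 100, 150, 150)
```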

17 of 58

Training a 224x224 Resnet50 Classifier on Annotations in 1080x1920 images

With COG: ~80-95% constant utilization
Without COG: jumps between 0 and 100% utilization

With COG: 654Hz @ 3x224x224
Without COG: 67Hz @ 3x224x224

Other panel labels: best worker count = 4 (19.4GB RAM per worker); best worker count > 12 (42GB RAM per worker!)

18 of 58

KWImage Data Structures
(may change to kwannot)

These all interface nicely with shapely

19 of 58

Overview

Boxes - bounding boxes
Coords - arbitrary D-dim coordinates
Mask - binary mask
Polygon - exterior ring, and a list of interior rings
Points - wraps Coords, used for xy keypoints
MultiPolygon - Multiple polygons
Detections - Container for multiple boxes, and associated score, class-ids, etc.
Heatmap - soft multi-category masks

All structures have methods:

tensor / numpy

Convert data to a torch / numpy backend

warp

Transform underlying raster or vector data with some transform specification, e.g. a matrix, imgaug, a GDAL transform, or an arbitrary function.

draw_on

Visualizes the structure on an image

random

classmethod to make a random instance for demo, testing, and algorithm purposes

Core principle: Classes should be extremely thin wrappers around underlying numpy arrays

20 of 58

kwimage.structs.Boxes

Maintains box format

xywh - tl_x, tl_y, w, h
ltrb - tl_x, tl_y, br_x, br_y
cxywh - cx, cy, w, h

Fast backend operations:

Boxes.ious has a C backend

Selected Boxes methods:

Boxes.translate
Boxes.scale
Boxes.warp
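The three formats above differ only in how the same rectangle is parameterized. The plain-Python helpers below sketch the conversions and a reference IoU. They are illustrative stand-ins, not the kwimage implementation, and the function names are made up for this example.

```python
def xywh_to_ltrb(box):
    """(tl_x, tl_y, w, h) -> (tl_x, tl_y, br_x, br_y)"""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def ltrb_to_cxywh(box):
    """(tl_x, tl_y, br_x, br_y) -> (cx, cy, w, h)"""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return (x1 + w / 2, y1 + h / 2, w, h)


def cxywh_to_xywh(box):
    """(cx, cy, w, h) -> (tl_x, tl_y, w, h)"""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, w, h)


def iou_ltrb(a, b):
    """Intersection over union of two ltrb boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


ltrb = xywh_to_ltrb((10, 20, 30, 40))
cxywh = ltrb_to_cxywh(ltrb)
round_trip = cxywh_to_xywh(cxywh)
overlap = iou_ltrb((0, 0, 10, 10), (5, 0, 15, 10))
```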

21 of 58

kwimage.structs.Coords

Simple backend for other classes that need to maintain lists of coordinates.
Selected Coords methods:

Coords.to_imgaug
Coords.from_imgaug
Coords.soft_fill

22 of 58

kwimage.structs.Polygon

Maintain an exterior and multiple interior rings of coordinates.
Selected Polygon methods:

Polygon.to_shapely
Polygon.to_mask
Polygon.to_geojson
Polygon.to_coco
<from variants of above>

23 of 58

kwimage.Detections

Container for columns of associated data such as boxes, classes, scores, keypoints, segmentations, etc.
Selected Detections methods:

Detections.non_max_supress
Detections.argsort
Detections.warp
Detections.to_coco

24 of 58

KWArray

Misc Items
ArrayAPI - torch-numpy API interoperability
DataFrameArray - faster-than-pandas named-column-based arrays.

DataFrameLight - Python List version of the above structure useful for fast appends.
Future API: One class, specify between fast append mode or vectorized mode.

Algorithms:

Hungarian (i.e. Maximum Value Matching)
SetCover (Greedy Approximation and Exact ILP)

Utilities

util_random (ensure_rng)
stats_dict(arr) -> {‘min’: 0, ‘max’: 3, ‘mean’: 1.2, …}
group_items | group_indices, apply_grouping, group_consecutive
Faster 32-bit RNGs: standard_normal32, uniform32 (numpy may have this now)

25 of 58

KWImage

The “Structs”: Boxes, Mask, Coords, Points, Polygon, MultiPolygon, Detections, Heatmap
IO:

imread - allows forcing of colorspace (but defaults to rgbx), allows choice of backend (gdal, opencv, scikit-image; no PIL because PIL is slow) but defaults to the fastest.
imwrite - same sensible default colorspace and backend choices.
load_image_shape - without reading the entire image. This is where PIL is useful, also gdal is reasonable

Functional:

overlay_alpha_images(img1, img2) -> blended image # can also specify alpha values
stack_images ([*img1, img2, …], axis=0) -> concatenate images (options to handle heterogeneous sizes)
warp_tensor - warp a torch tensor or numpy array with a homography or affine matrix
grab_test_image, grab_test_image_path - Useful for unit/doc tests, you are testing right?
imresize - scale factor, constant size, letterboxing, returns transform info

26 of 58

KWImage - The Structures (kwannot?)

The “Structs”: Boxes, Mask, Coords, Points, Polygon, MultiPolygon, Detections, Heatmap
Drawing functions

ax = item.draw() # matplotlib
canvas = item.draw_on(img.copy(), color='blue') # opencv, inplace whenever possible

Casting Functions

coerce - try your best to make this type object from some other type of object
to_coco / from_coco - return a coco-compatible representation (may not always be perfect)
toformat - change the underlying data format
random - make a random instance for testing or randomized algorithms

27 of 58

KWPlot

It’s OK, I know how to use it. Seaborn is really cool too, it’s probably better and you should learn that instead. Kwimage does depend on this for its structure “.draw” methods.
The most likely to be useful bits:

kwplot.autompl is nice if you use IPython
kwplot.BackendContext can force mpl backends
multi_plot - very seaborn like interface, but no pandas required

28 of 58

Somewhat related to kwcoco

Some capabilities of smartwatch will be ported to kwcoco itself

29 of 58

Issues With Existing Machine Learning Systems

Lack of a good Data Manifest in existing DL CV systems (e.g. detectron2, mmdet)

Hard-coded number of categories
Hard-coded mean/std
MSI Images need to be resampled and aligned on-disk (if MSI is even supported)
Weight checkpoints are the only output of training (topology and metadata not included)
Annotations are decoupled from images (often need to specify a path to annotations and a path to images)

We want:

A manifest that registers paths to images and their annotations (1 file = 1 dataset)
Infer number of categories based on the manifest (extend to new categories)
Infer mean/std based on the manifest (extend to different MSI sensors)
Test data that contains corner cases so we run on the CI (auto-generated)
Produce deployed packages with all metadata needed to predict on unseen data.

30 of 58

KWCOCO

Extension of MSCOCO

Categories
Annotations (with Tracks)
Images

Auxiliary

Videos

Combined with ndsampler to randomly sample space-time windows
Images stored in native resolution in one or multiple files

New!

Visualizations of an MSI kwcoco file.

Left: loaded red|green|blue features
Right: loaded inv_sort1|inv_augment1|inv_shared1 features (These are UKY TA2 features)

Can load any set of channels as a “DelayedImage”; the finalize() operation loads and aligns the data at a specified resolution on the fly.

Note: A video corresponds to a time sequence of images.

31 of 58

ToyData: Food for the CI

Toy Data is useful for developing, debugging, and running tests on CI! Separates data from algorithms. Because Drop1 (and all datasets we work with) are in the same format we can swap it in.

32 of 58

KW-COCO

JSON Manifest of image sequences (i.e. videos aka regions)

Heterogeneous sensors / resolutions / channels
Pixel based box, polygon, and mask annotations

Kitware’s TA-2 Interchange Format

Stores data in native resolution
Intermediate features stored as new “auxiliary” channels
Final results stored as annotations

Combines with ndsampler

Resampling at specified resolution (on the fly)
Random sampling of subregions for training

SQLAlchemy backend for scaling

Sample Data From Region at a Virtual Resolution

I want channels [B2, nir, Material1, Material2, B11] at 10 meter GSD from [0:100, 100:200] at frames [3, 7] in video “23KPQ_BR_Rio_R01”.

kwcoco + ndsampler

Here’s that (5,2,100,100) tensor and annotations in relative coordinates.

33 of 58

Adding Your Features to the KWCOCO file

{
  "videos": [{"name": "TheRegionName", "width": 300, "height": 400}, ...],
  "images": [
    {
      "name": "TheImageName",
      "width": 600,
      "height": 800,
      "video_id": 1,
      "date_captured": "2018-10-16T16:02:29",
      "warp_img_to_vid": {"scale": 0.5},
      "auxiliary": [
        {
          "file_name": "B1.tif",
          "warp_aux_to_img": {"scale": 2.0},
          "width": 300, "height": 400,
          "channels": "coastal", "num_bands": 1
        },
        {
          "file_name": "B2.tif",
          "warp_aux_to_img": {"scale": 1.0},
          "channels": "blue", "num_bands": 1
        },
        ...
      ]
    },
    ...
  ]
}

Input KWCOCO

...
"auxiliary": [
  {"file_name": "B1.tif", ...},
  {"file_name": "B2.tif", ...},
  {
    "file_name": "YOUR_FEATURE_PATH.tif",
    "warp_aux_to_img": {"scale": 4.0},
    "width": 75, "height": 100,
    "channels": "your_channel_code",
    "num_bands": 32,
    ...
  }
]

Output KWCOCO

Append a new “auxiliary” item
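A minimal sketch of that append step on the raw dictionary, assuming the feature raster covers the same extent as the image so that a single uniform scale maps it into image space. The helper `add_auxiliary`, the file name, the channel code, and the sizes are hypothetical; only the field names come from the slide.

```python
def add_auxiliary(image, file_name, channels, num_bands, width, height):
    """Append a feature raster to an image's "auxiliary" list."""
    scale_x = image["width"] / width
    scale_y = image["height"] / height
    if scale_x != scale_y:
        # This sketch only handles a uniform scale between the two spaces.
        raise ValueError("feature and image aspect ratios differ")
    item = {
        "file_name": file_name,
        "warp_aux_to_img": {"scale": scale_x},
        "width": width,
        "height": height,
        "channels": channels,
        "num_bands": num_bands,
    }
    image.setdefault("auxiliary", []).append(item)
    return item


# Toy image entry; in practice this would come from the loaded JSON manifest.
image = {
    "name": "TheImageName",
    "width": 600,
    "height": 800,
    "auxiliary": [{"file_name": "B1.tif", "channels": "coastal", "num_bands": 1}],
}
new_item = add_auxiliary(
    image,
    file_name="my_features.tif",   # hypothetical path
    channels="my_channel_code",    # hypothetical channel code
    num_bands=32,
    width=150,
    height=200,
)
```

Deriving the scale from the two sizes, rather than typing it by hand, keeps `warp_aux_to_img` consistent with the declared widths and heights.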

34 of 58

Part 1: The kwcoco + ndsampler libraries

35 of 58

The kwcoco library

kwcoco: https://kwcoco.readthedocs.io/en/latest/

A data format
An indexable manifest of categories, videos, images, and annotations.
Human readable (mostly, you wanna see what a segmentation looks like?)
A command line interface (CLI) tool. (with scoring code)

See `kwcoco --help`

An API with add / remove, statistic, and other helper methods.
Coercible - several ways to represent annotation data

i.e. There is a formal schema, but multiple backwards compatible formats are supported and new styles (like WKT or on-disk masks for segmentations) can be added.

IS NOT

For loading data (it has lightweight - i.e. inefficient because no disk cache - ways of doing it)
For data streaming

36 of 58

I truncated the segmentations

37 of 58

The data structure

Accessing information in a kwcoco dataset is usually done by interfacing with an “index” object.
There is also an alternative ORM-like API

38 of 58

The SQL Backend (don’t linger on this slide)

39 of 58

The ndsampler Library

ndsampler: https://ndsampler.readthedocs.io/en/latest/

DOES

A tool for loading (3d-video and 2d-image) data
Cache things like spatial indexes and COGs to a “workdir”
Do all of the alignment magic between channels, images, and videos with different resolutions
Provide an indexable regular grid of “positive” and “negative” samples. (helps write dataloaders)

DOES NOT

Implement a torch dataset / dataloader by itself
Work with 1d or 4d+ data 😞
Work with anything but kwcoco (currently)
Use the GPU

40 of 58

Ndsampler API

Can be modified to suit developer needs
Images:

gid
space-region

slices (y1:y2, x1:x2)
cx, cy, width, height

Videos:

vidid
space-time-region

slices (t1:t2, y1:y2, x1:x2)

Alternates?

Specify space and time separately
Specify list of gids for time

41 of 58

Supporting Libraries

kwarray - https://kwarray.readthedocs.io/en/latest/

Low-level python library containing numpy-like array operations.
kwarray.SlidingWindow(<shape>, <window>) - will be used later when gridding up “videos”
kwarray.ArrayAPI - interoperability between torch and numpy
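For readers unfamiliar with the gridding idea, here is a pure-Python sketch of what a sliding window over an N-dimensional shape produces. It is not the kwarray implementation. The function name, the `stride` parameter, and the choice to shift the final window back inside the bounds are assumptions of this example.

```python
import itertools


def sliding_window_slices(shape, window, stride=None):
    """Return a list of slice tuples that tile `shape` with `window`."""
    stride = window if stride is None else stride
    per_dim = []
    for size, win, step in zip(shape, window, stride):
        if win > size:
            raise ValueError("window is larger than the data")
        starts = list(range(0, size - win + 1, step))
        if starts[-1] != size - win:
            # Shift one extra window back so the trailing edge is covered.
            starts.append(size - win)
        per_dim.append([slice(s, s + win) for s in starts])
    return list(itertools.product(*per_dim))


# A 5x4 grid tiled with 2x2 windows.
windows = sliding_window_slices(shape=(5, 4), window=(2, 2))
```

Each returned tuple can be used directly to index an array of that shape, which is what makes the grid "indexable" for a data loader.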

kwimage - https://kwimage.readthedocs.io/en/latest/

Low level python library specifically for image operations
Currently is the home of the “kwimage” data structures:

kwimage.Boxes
kwimage.Coords
kwimage.MultiPolygon
kwimage.Mask
kwimage.Detections

42 of 58

Part 2: The WATCH Datasets

43 of 58

Givens and Goals

Given a static training dataset containing:

“Videos” of orthorectified spatial regions chosen a-priori
Frames in each video may contain a mixture of:

Channels spread across different files and at different resolutions
Single files containing multiple channels

Each frame might be at a different resolution.
Note: Channels might be features from earlier steps in TA1 or TA2!

The TA2 developer should be able to:

Point at a single file to load a dataset
Load any space-time-region at some specified resolution for any video with any subset of channels.

???

Gray (uint8)

TrueColor (uint8)

B1, B2, B3, B4, B5, B6, B7, B8, B8A, B9, B10, B11

8 bands, unknown

Orthorectified Dataset

44 of 58

Method Overview

Raw GeoTiff Source

(e.g. RGD)

Orthorectified Native Resolution KWCoco

For each ROI-Query we:

find all overlapping geotiffs
orthorectify and crop them to the ROI at native resolution
Register them as a “video” in a kwcoco.

Updated KWCoco at a Specified “Virtual” Resolution

watch/scripts/geojson_to_kwcoco.py

watch/scripts/coco_align_geotiffs.py

watch/scripts/coco_add_watch_fields.py

Each file is associated with a “channel” code (a “|”-separated string).
Transforms are computed between images in a video. (all resize ops are delayed until sample time)

warp_img_to_vid
warp_aux_to_img

ndsampler can now sample given:

Temporal bounds in frame indexes (or specific image-ids)
Spatial bounds in pixels
Specified channels
Specified scale (TODO)

45 of 58

Important kwcoco Fields for WATCH

channels - a ‘|’ separated string that gives a codename to each channel.
warp_aux_to_img : A transformation from auxiliary space to a chosen “image” space. Not shown here, but will be used when an image has multiple bands in different files.
warp_img_to_vid : A transformation from the chosen “image” space into a chosen “video” space.
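The two warps chain: a pixel in an auxiliary file reaches video space by applying warp_aux_to_img first and warp_img_to_vid second. The sketch below models each warp as a 3x3 scale matrix in plain Python, which is an assumption of this example since the real transforms may be more general. The scale values 2.0 and 0.5 are the ones from the B1.tif JSON example earlier in the deck, and the helper names are made up.

```python
def scale_matrix(sx, sy=None):
    """3x3 homogeneous matrix for a pure scale."""
    sy = sx if sy is None else sy
    return [[sx, 0.0, 0.0],
            [0.0, sy, 0.0],
            [0.0, 0.0, 1.0]]


def matmul3(a, b):
    """Matrix product a @ b for 3x3 nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, x, y):
    """Map the point (x, y) through the homogeneous matrix m."""
    xh = m[0][0] * x + m[0][1] * y + m[0][2]
    yh = m[1][0] * x + m[1][1] * y + m[1][2]
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return (xh / w, yh / w)


warp_aux_to_img = scale_matrix(2.0)   # 300x400 asset -> 600x800 image
warp_img_to_vid = scale_matrix(0.5)   # 600x800 image -> 300x400 video

# Compose right-to-left: the matrix applied first goes on the right.
warp_aux_to_vid = matmul3(warp_img_to_vid, warp_aux_to_img)

corner_in_img = apply(warp_aux_to_img, 300, 400)
corner_in_vid = apply(warp_aux_to_vid, 300, 400)
```

Because the composition is a single matrix, the data only needs to be resampled once at sample time.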

46 of 58

The updated “drop0-aligned” kwcoco dataset

Notice Videos now have fields:

“wld_to_vid”
“height”
“width”

Notice Images now have fields:

“channels”
“timestamp”
“num_bands”
“approx_meter_gsd”
“approx_elevation”
“warp_img_to_vid”
“warp_to_wld”

Note that the TransformSpec is flexible, and we could do anything you want as long as `kwimage.Transform.coerce` can support it.

Note: this would be slightly different for auxiliary images.

47 of 58

Method summary

We got an orthorectified kwcoco dataset.

Images are in COG format
Images have transforms populated
A video “gsd” is chosen

We load the kwcoco dataset
We create an ndsampler.CocoSampler
We create an indexable grid of sample regions
We use that grid and the sampler itself to write a data loader.

48 of 58

Part 3: In Domain Examples

51 of 58

The kwcoco API - compute stats on drop0

55 of 58

[Diagram: a Video with Img 1 and Img 2; assets R|G|B and depth; Annot 1, Annot 2, and Annot 3; Track1 and Track2.]

56 of 58

[Diagram: schema entities Image, Asset, Annotation, Track, Video, Category, and Info, plus 🎵 Audio (MPEG).]

? audio / mpeg structure to be determined.

57 of 58

[Diagram: Image, Asset, Annotation, Track, Video, Category, Info, and 🎵 Audio (MPEG), with an Annotation attached to Audio.]

The basic audio annotation is a 1D box, indicating start / end time. Label could be a category / caption / etc...

58 of 58

[Diagram: Image, Asset, Annotation, Track, Video, Category, Info, and 🎵 Audio; Movies; Stills.]
