DVS Benchmark Datasets for Object Tracking, Action Recognition, and Object Recognition

Y. Hu, H. Liu, M. Pfeiffer, T. Delbruck

 Inst. of Neuroinformatics, Univ. of Zurich and ETH Zurich.

www.ini.uzh.ch 


Background

How to Get Datasets

BitTorrent Sync folder

DVS raw recordings in AEDAT-2.0 format at PolyDrive

DVS raw recordings in HDF5 format at OneDrive

jAER Setup

Screenshot of the Datasets

VOT Challenge 2015 Dataset DVS Recordings

TrackingDataset DVS Recordings

UCF-50 DVS Recordings

Caltech-256 Dataset DVS Recordings

Further information

Bounding Box Generation

Bounding Box Usage in jAER

Bounding Box Usage in SpikeFuel

Camera Calibration

Background

Technical and implementation details can be found at:

  1. Hu, Yuhuang. 2016. “Generation of Benchmarks for Visual Recognition with Spiking Neural Networks.” NSC Short Project Report, Zurich, Switzerland: University of Zurich.
  2. Y. Hu, H. Liu, M. Pfeiffer, T. Delbruck, “DVS Benchmark Datasets for Object Tracking, Action Recognition, and Object Recognition”, submitted to Frontiers in Neuroscience: Neuromorphic Engineering, 2016

How to Get Datasets

In this project, we targeted and converted 4 datasets:

  1. VOT Challenge 2015 Dataset (Single Target Object Tracking)
  2. TrackingDataset (Single Target Object Tracking)
  3. UCF50 Action Recognition Dataset (Action Recognition)
  4. Caltech-256 Object Recognition (Object Recognition)

You can access these DVS datasets in several ways:

  1. (RECOMMENDED) All datasets can be downloaded through the file sharing service BitTorrent Sync. Use this link to access the datasets. You can selectively sync the parts of the dataset you want. Note for Linux Firefox users: please see this post for help accessing the data, since Firefox does not recognize the BitTorrent Sync protocol.
  2. All DVS raw recordings (in AEDAT format) are also stored at PolyDrive (provided by ETH Zürich), and can be downloaded from here.
  3. All DVS recordings are also available in HDF5 format. They are stored at OneDrive (provided by Microsoft OneDrive), which can be accessed here.

This dataset is hosted as part of the INI sensors group databases.

BitTorrent Sync folder

If you do not use selective sync, you will end up with the files below. (A .bts file extension means the file has not yet finished downloading.) Your share is read-only, so changes you make will not affect the source files.


The md5_info.txt and checksums.md5 text files list the MD5 checksums for the archives. You can use, for example, HashCheck on Windows to verify archive integrity.
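As a cross-platform alternative, the checksums can also be verified with a short Python script. This is only a sketch: it assumes the common md5sum layout of one "&lt;md5&gt;  &lt;filename&gt;" entry per line, which may differ slightly from the actual checksum files.

```python
import hashlib

def md5_of(path, chunk=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to handle large archives."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            h.update(block)
    return h.hexdigest()

def verify(checksum_file):
    """Yield (filename, ok) for each entry in an md5sum-style listing."""
    with open(checksum_file) as f:
        for line in f:
            expected, name = line.split(maxsplit=1)
            name = name.strip().lstrip("*")  # md5sum marks binary-mode entries with '*'
            yield name, md5_of(name) == expected
```

On Linux, `md5sum -c checksums.md5` performs the same check directly.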

The download (and sharing) status can be monitored in the BitTorrent Sync control panel:

DVS raw recordings in AEDAT-2.0 format at PolyDrive

The AEDAT recordings are hosted in the above-mentioned shared folder, which is powered by PolyDrive from ETH Zürich.

Once you open the link, you should see a page similar to this:

In the shared folder, you will find 3 tar.gz files, 1 folder, 2 zip files and 1 text file:

  1. INI_vot_recordings_30fps_20160424.tar.gz: contains 60 DVS AEDAT files for the VOT Challenge 2015 Dataset
  2. INI_tracking_recordings_30fps_20160424.tar.gz: contains 67 DVS AEDAT files for the TrackingDataset
  3. INI_caltech256_recordings_10fps_20160424.tar.gz: contains 30,607 DVS AEDAT files in 257 classes for the Caltech-256 Dataset
  4. INI_ucf50_recordings_30fps_20160424: folder for the UCF-50 AEDAT recordings. Because of its size, this archive is split into parts inside the folder; on Linux or macOS, concatenate them back into a single tar.gz with:

cat INI_ucf50_recordings_30fps_20160424* > INI_ucf50_recordings_30fps_20160424.tar.gz

                On Windows, you can use copy instead:

copy /b INI_ucf50_recordings_30fps_20160424aa+ INI_ucf50_recordings_30fps_20160424ab INI_ucf50_recordings_30fps_20160424.tar.gz

                You should get 6,676 AEDAT files after decompressing the combined file.
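If neither cat nor copy is convenient, the parts can also be joined from Python. This is only a sketch; it assumes the two-letter aa, ab, … suffix scheme shown above, which sorts lexicographically so a plain sorted glob restores the original order.

```python
import glob
import shutil

def join_parts(prefix, out_path):
    """Concatenate split archive parts matching prefix + two-letter suffix into one file."""
    parts = sorted(glob.glob(prefix + "??"))  # aa, ab, ... sort in the right order
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as f:
                shutil.copyfileobj(f, out)
    return parts
```

After joining, extract the combined archive with tar xzf as usual.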

  5. groundtruth-for-vot-and-tracking-dataset-20160610.zip: bounding box information in text files for both the VOT dataset and the TrackingDataset (details in the section Bounding Box Generation at the end)
  6. calibration.zip: calibration data for the SingleCameraCalibration filter.
  7. md5_info.txt: MD5 values for all datasets

DVS raw recordings in HDF5 format at OneDrive

The DVS recordings in HDF5 format are hosted in a shared folder on OneDrive. Once you open the link, you should see a page similar to this:


In the shared folder, you will find 3 tar.gz files, 1 folder and 1 text file:

  1. INI_VOT_30fps_20160610.hdf5.tar.gz: VOT dataset DVS recordings in HDF5 format.
  2. INI_TrackingDataset_30fps_20160610.hdf5.tar.gz: TrackingDataset DVS recordings in HDF5 format.
  3. INI_Caltech256_10fps_20160424.hdf5.tar.gz: Caltech-256 Dataset DVS recordings in HDF5 format.
  4. INI_UCF50_30fps_20160424.hdf5: folder for the UCF-50 DVS recordings in HDF5 format. This archive is split into parts inside the folder; on Linux or macOS, concatenate them back into a single tar.gz with:

cat INI_ucf50_recordings_30fps_20160424.hdf5* > INI_ucf50_recordings_30fps_20160424.hdf5.tar.gz

                On Windows, you can use copy instead:

copy /b INI_ucf50_recordings_30fps_20160424.hdf5aa+ INI_ucf50_recordings_30fps_20160424.hdf5ab INI_ucf50_recordings_30fps_20160424.hdf5.tar.gz

                You should get one .hdf5 file after decompressing the combined file.

  5. md5_info.txt: MD5 values for all datasets

After decompressing each dataset, you should find one .hdf5 file; this file contains all recording data from the corresponding dataset.

jAER Setup

Detailed instructions for jAER setup are also presented in Appendix A of the report. We strongly recommend updating to the latest version of jAER using SVN and rebuilding from source.

 

jAER is available at http://jaerproject.org.

If this is your first time installing jAER, you can check out jAER’s source code and binaries using svn (Subversion):

svn co https://svn.code.sf.net/p/jaer/code/jAER/trunk/

If jAER has previously been installed, please update to the latest version by:

svn up

Make sure you build the jAER Viewer again from NetBeans or Eclipse. Please follow the instructions in the jAER wiki for developer setup to build jAER from source (for NetBeans, for Eclipse). Once you have successfully installed jAER, you should see an AEViewer window similar to the one below once you have connected a DVS or loaded an AEDAT recording.

Screenshot of the Datasets

These figures are generated with a Python package that is part of the SpikeFuel project. Similar results should be obtained if the data is loaded and displayed in jAER. A detailed description of the datasets can be found in the report.

VOT Challenge 2015 Dataset DVS Recordings

vot-examples.png

(See https://youtu.be/TDhBZf_-yAg for a YouTube video.) The bounding boxes of the VOT dataset are available both separately and within the HDF5 datasets. You can try this script to produce the VOT bounding boxes and the figures above.

You can check out this script for producing the amplitude spectrum.

TrackingDataset DVS Recordings

tracking-examples.png

(See https://youtu.be/DFMvQ7r0UA8 for a YouTube video.) The bounding boxes of the TrackingDataset are available in the HDF5 dataset. You can try this script to produce the TrackingDataset bounding boxes and the figures above.

You can check out this script for producing the amplitude spectrum.

UCF-50 DVS Recordings

See https://youtu.be/WCVIFKLOuhI for a YouTube video.

You can check out this script for producing the amplitude spectrum and this script for producing the figures above.

Caltech-256 Dataset DVS Recordings

See https://youtu.be/Ir3cSqOgkLE for a YouTube video.

You can check out this script for producing the amplitude spectrum and this script for producing the figures above.

Further information

Bounding Box Generation

The bounding boxes are generated by a simple yet effective method. First, we calculate the relative position from the original bounding box. Unlike an absolute position expressed in pixel coordinates, the relative position is the ratio between the distance from the left edge to the point and the width of the frame, together with the ratio between the distance from the top edge to the point and the height of the frame. For a concrete example, see the following figure:

For the top-left point, the horizontal ratio is h1:(h1+h2) and the vertical ratio is v1:(v1+v2). In this way, no matter how the frame is transformed geometrically, we can easily recover the absolute position from these 2 values. Note that sequences not available in 4:3 may have border padding; this border is handled separately, but the essence of the method is the same.
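As an illustration of the scheme above, the conversion both ways fits in two small functions. The frame sizes used in the usage comment are made up for the example, not taken from the recordings.

```python
def to_relative(point, size):
    """(x, y) in pixels -> (horizontal ratio, vertical ratio), i.e. h1:(h1+h2) and v1:(v1+v2)."""
    (x, y), (w, h) = point, size
    return x / w, y / h

def to_absolute(rel, size):
    """(horizontal ratio, vertical ratio) -> (x, y) in pixels for the target frame size."""
    (rx, ry), (w, h) = rel, size
    return rx * w, ry * h
```

For example, a corner at (160, 120) in a 640x480 frame has relative position (0.25, 0.25), which maps back to (32, 32) in a 128x128 DVS frame.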

For the AEDAT recordings, the bounding boxes are provided as text files; each file starts with a header followed by a sequence of bounding boxes. Each line in the file is a bounding box with 9 values. The first value is the timestamp in microseconds. The remaining 8 values constitute four [X, Y] pairs that describe the polygon of the bounding box. These coordinates are flipped relative to jAER XY coordinates because of a different origin and convention for images in Python. These bounding boxes are released with the dataset in groundtruth-for-vot-and-trackingdataset.zip.
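A minimal parser for one such line might look as follows. The exact flip jAER applies is not specified here, so the y -> H - 1 - y form and the sensor height H are assumptions for illustration only.

```python
def parse_bbox_line(line, height=None):
    """Parse one ground-truth line: timestamp in us followed by four [X, Y] corner pairs.

    If height is given, flip the vertical axis (assumed form: y -> height - 1 - y).
    """
    vals = [float(v) for v in line.split()]
    ts_us = int(vals[0])
    corners = [(vals[i], vals[i + 1]) for i in range(1, 9, 2)]
    if height is not None:
        corners = [(x, height - 1 - y) for (x, y) in corners]
    return ts_us, corners
```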

Bounding Box Usage in jAER

You can easily annotate bounding boxes in jAER using the event filter class YuhuangBoundingboxGenerator.

(Before you proceed, make sure you have the latest jAER update; compiling from the IDE is recommended.)

  1. Open jAER from the IDE and click Filters.

  2. You should see a window like the following pop up; click Select Filters…

  3. Under Available classes, type in YuhuangBoundingboxDisplay and then add it to Selected classes.

  4. Once you click OK, you should find the filter at the end of your filter list:

  5. Click Controls, tick the checkbox beside Reset to enable the filter, and then click the button LoadGroundTruthFromTXT. From the file chooser that pops up, choose the bounding box file you want to load (in this example, bag-groundtruth.txt):
  6. The file is read, and a summary in the status line shows how many bounding boxes have been loaded. Coordinates are flipped to jAER DVS image coordinates for each bounding box. If a camera calibration has already been loaded (see below), then the bounding boxes are also undistorted by the calibration. Now you can File/Open... the corresponding logged recording and play it; the bounding boxes should appear as white polygons. If there is more than one bounding box in the displayed time slice, all of them are displayed simultaneously:

  7. For application or testing, the bounding boxes are available from the filter using the method getBoundingBoxes() as a TreeMap Java Collection, where each Entry is a key-value pair: the key is the timestamp of the box in microseconds, and the value is a BoundingBox instance that contains the timestamp and float[4] arrays of x and y corner points. These coordinates have already been transformed to DVS coordinates in jAER. Depending on whether camera calibration is enabled and loaded, either the undistorted bounding boxes or the ones loaded from the ground-truth file are returned.
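A typical use of that TreeMap is to find the bounding box at or just before the current event timestamp (a floorEntry-style lookup). The sketch below is a Python analogy of that lookup over sorted timestamps, not the jAER API itself.

```python
import bisect

def floor_box(timestamps, boxes, t):
    """Return the box whose timestamp is the largest one <= t, or None if t precedes all boxes.

    timestamps must be sorted ascending, with boxes[i] belonging to timestamps[i].
    """
    i = bisect.bisect_right(timestamps, t) - 1
    return boxes[i] if i >= 0 else None
```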

Bounding Box Usage in SpikeFuel

Drawing bounding boxes in SpikeFuel is quite easy: there is a public API gui.draw_poly_box_sequence where you can draw bounding boxes given a list of sequence frames and corresponding bounding boxes. A concrete example may be found here.

Camera Calibration

Camera calibration can be performed using the SingleCameraCalibration filter. The calibration file is available in calibration.zip. You can add and use this filter as follows:

  1. Select the SingleCameraCalibration filter.

  2. In the controls, click LoadCalibration and select the folder that contains the calibration files. (Note that there are 2 calibration files, so instead of selecting files, you need to select the enclosing folder; the OK button will only be enabled if the files cameraCalibration.xml and distortionCoefs.xml are in the folder.)

  3. Back in the panel, make sure you tick the enableFilter checkbox beside Reset and the last checkbox, undistortDVSevents. You should see the recording as follows.

Note that you should always load the bounding boxes after loading the calibration, because the bounding boxes are transformed using the calibration when they are loaded.


Questions about these datasets should be directed to yuhuang.hu@ini.uzh.ch and tobi@ini.uzh.ch 

This page is maintained as the Google Doc https://docs.google.com/document/d/1m4gAHPkPIzVvhHirtSjzUJrKiD5uU2I76hTqXhk8CxI/edit?usp=sharing