Hacking Challenges
Challenge 1:
«Optimizing reporting-app»
by Tobias Brunner, GIS-Zentrum Stadt Zürich
Züri wie neu
What we need
Resources
Report-data: https://data.stadt-zuerich.ch/dataset/zueriwieneu-meldungen
Fields: description, position (e, n), time (requested_datetime), photo (media_url), interface_used
Various geodatasets: https://data.stadt-zuerich.ch/, https://opendata.swiss/de/organization/geoinformation-kanton-zuerich
service_code: ground truth (verified category; see the starter sketch below)
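As one possible starting point, a minimal sketch that trains a text classifier to predict service_code from the free-text description. The file name meldungen.csv and the exact CSV columns are assumptions about the export format, not part of the challenge material:

    # Minimal sketch: predict the verified category (service_code) from the
    # report description. "meldungen.csv" and its column names are
    # assumptions about the export format -- adjust to the actual dataset.
    import pandas as pd
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("meldungen.csv")
    X_train, X_test, y_train, y_test = train_test_split(
        df["description"].fillna(""), df["service_code"], random_state=0)

    vec = TfidfVectorizer(min_df=2)              # bag-of-words features
    clf = LogisticRegression(max_iter=1000)      # simple linear baseline
    clf.fit(vec.fit_transform(X_train), y_train)
    print(classification_report(y_test, clf.predict(vec.transform(X_test))))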
Challenge 2:
«Extracting individual trees from LIDAR»
by Katharina Kälin, Statistisches Amt Kanton Zürich
Problem: How “green” is Zürich?
[Image: trees planted by the municipal gardeners (available data) vs. trees planted by private people (missing data)]
What we need: LIDAR Data
Extract individual trees from LIDAR and assign attributes to them (e.g. coniferous vs. deciduous).
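One common approach is to detect tree tops as local maxima on a canopy height model (CHM). A minimal sketch, assuming the point cloud has already been rasterized into a height grid; the array below is synthetic:

    # Minimal sketch: tree tops as local maxima on a canopy height model.
    # A real CHM would come from rasterized LIDAR returns; this one is synthetic.
    import numpy as np
    from scipy import ndimage

    rng = np.random.default_rng(0)
    chm = ndimage.gaussian_filter(rng.random((200, 200)) * 30, sigma=3)

    smoothed = ndimage.gaussian_filter(chm, sigma=1)              # suppress noise
    peaks = smoothed == ndimage.maximum_filter(smoothed, size=9)  # local maxima
    tree_tops = np.argwhere(peaks & (smoothed > 3.0))             # at least 3 m tall
    print(f"{len(tree_tops)} candidate trees")

Attributes such as coniferous vs. deciduous could then be derived per detected tree, e.g. from crown shape or return intensity.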
Resources
Challenge 3:
«OpenStreetMap POI Completeness»
by Raphael Das Gupta, HSR Geometa Lab
Problem
Completeness of OSM is mostly unknown, but would be useful to know.
This challenge: estimate the completeness of POI (Points of Interest, e.g. shops, bars, restaurants, …)
What we need
Develop intrinsic (within OSM data) approach(es) for estimating OSM POI completeness.
Verify/tune extrinsically (by comparing to non-OSM data).
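One intrinsic approach from the completeness literature is to fit a saturation curve to a region's POI count over time and read the asymptote as an estimate of the true total. A minimal sketch with made-up yearly counts:

    # Minimal sketch: fit a logistic saturation curve to the cumulative POI
    # count of one region and estimate completeness as count / asymptote.
    # The yearly counts below are invented for illustration.
    import numpy as np
    from scipy.optimize import curve_fit

    years = np.arange(2008, 2019)
    counts = np.array([50, 120, 260, 480, 700, 860, 950, 1010, 1040, 1060, 1070])

    def logistic(t, K, r, t0):
        return K / (1 + np.exp(-r * (t - t0)))

    (K, r, t0), _ = curve_fit(logistic, years, counts, p0=[1200.0, 0.5, 2012.0])
    print(f"estimated true total: {K:.0f} POIs, completeness: {counts[-1] / K:.0%}")

The extrinsic step would then compare such estimates against the non-OSM datasets listed below.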
Resources
OpenStreetMap data (current & history)
Earth (big!): planet.osm.org
Switzerland: planet.osm.ch
Other countries and regions: osm-internal.download.geofabrik.de
Login may be required to access OSM history. Please respect the privacy of mappers!
Official & 3rd-Party comparison data: md.coredump.ch/SDD18ZHHack-OSM-POI#data-sources
OGD (City of Zurich)
Proprietary data (Cities of Zurich and Geneva)
Tools, APIs, documentation, literature ...
Challenge 4:
«Online Search Behavior and Government Information»
by Andrea Schnell, Statistisches Amt Kanton Zürich
zh.ch
The website of the Canton of Zurich (zh.ch) is the digital interface between citizens and the cantonal public administration.
Hierarchical organizational structures & large quantities of content → not always easy to find the most relevant information!
Do content and structure mirror the needs of our users?
What we need
Analyze website traffic / web search data to find out whether content and structure match what users are looking for (a starter sketch follows the resources below).
→ Help us make zh.ch better. Your opinion and the insights you can provide count!
Resources
Google Search Terms related to zh.ch (caveat: varying timespans for different zh.ch domains!)
Web Analytics & Google Search Data for kapo.zh.ch / statistik.zh.ch
List of Topics (A-Z) → “official language”
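As one possible angle, a minimal sketch that flags search terms with many impressions but few clicks as candidate content gaps. The file name and the columns query/impressions/clicks are assumptions about the export format:

    # Minimal sketch: high impressions + low click-through rate suggests
    # users search for a topic but do not find (or click) a matching page.
    # "search_terms.csv" and its columns are assumed, not the real export.
    import pandas as pd

    df = pd.read_csv("search_terms.csv")
    df["ctr"] = df["clicks"] / df["impressions"]
    gaps = df[(df["impressions"] > 100) & (df["ctr"] < 0.02)]
    print(gaps.sort_values("impressions", ascending=False).head(20))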
Challenge 5:
«Automatic detection of color for strip tests for water quality»
by Lukas Müller, Barbara Strobl and Simon Etter, University of Zurich
Problem
What we need
You have 25 sample images available. The code should…
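While the full requirements are elided above, a minimal sketch of the color-reading step, assuming the location of the test pad in the image is known; the pad coordinates and reference chart values are placeholders:

    # Minimal sketch: mean colour of an (assumed) pad region, matched to the
    # nearest reference colour. Coordinates and chart values are placeholders.
    import numpy as np
    from PIL import Image

    img = np.asarray(Image.open("sample_01.jpg").convert("RGB"), dtype=float)
    pad = img[400:450, 300:350]                  # assumed pad location (y, x)
    mean_rgb = pad.reshape(-1, 3).mean(axis=0)

    reference = {"0 mg/l": (250, 245, 230), "10 mg/l": (240, 200, 190),
                 "25 mg/l": (225, 150, 160), "50 mg/l": (200, 90, 130)}
    best = min(reference, key=lambda k: np.linalg.norm(mean_rgb - reference[k]))
    print(f"mean RGB {mean_rgb.round(0)} -> nearest reference: {best}")

A robust solution would also locate the strip automatically and correct for lighting, e.g. via a white-balance patch.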
Resources
Challenge 6:
«Adding and Correcting Entities in executive minutes»
by Tobias Hodel, Staatsarchiv Kanton Zürich
Problem
150’000 pages of handwritten executive minutes (Regierungsratsprotokolle, 1803-1883) have been transcribed by students.
The documents inform us about high politics and daily life in Zürich of the 19th century.
To enhance usability, entities (persons, places, organizations, etc.) need to be identified and thus made searchable.
What we need
As a first step, entities have been identified automatically (using a fixed list of places and persons); now these entities need to be checked and missed entities added.
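A minimal sketch of the checking step: re-match the fixed lists against a page and surface capitalized tokens that no list entry covers as candidates for missed entities. The page text and entity lists are illustrative, not from the corpus:

    # Minimal sketch: list-based matching plus a crude candidate heuristic.
    import re

    known_places = {"Zürich", "Winterthur", "Uster"}
    known_persons = {"Escher", "Keller"}
    page = "Herr Escher berichtet über die Strasse von Zürich nach Bülach."

    tokens = re.findall(r"\b[A-ZÄÖÜ][a-zäöü]+\b", page)
    known = known_places | known_persons
    print("matched:", [t for t in tokens if t in known])
    # Everything else is only a *candidate* (it includes false positives
    # such as "Herr") and would go to a human or a smarter model for review.
    print("candidates:", [t for t in tokens if t not in known])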
Resources
Document Dump
ZENODO: https://doi.org/10.5281/zenodo.803239
About the project
http://www.staatsarchiv.zh.ch/internet/justiz_inneres/sta/de/ueber_uns/organisation/editionsprojekte/tkr.html
Sample Document
http://www.archives-quickaccess.ch/stazh/rrb/ref/MM+1.101+RRB+1827/0874
Challenge 7:
«The RefBank Challenge: How to clean and de-duplicate one million bibliographic references?»
by Guido Sautter, Lead Software Developer at Plazi
gsautter@gmail.com @gsautter
Problem
The RefBank Corpus
1,146,552 distinct reference strings (character by character)
1,026,753 distinct reference string clusters (abstracting case, accents, spaces, and punctuation marks)
(as of Oct 19th, 2018)
What we need
These three reference the same work … recognize them as a cluster (a clustering sketch follows the examples):
Baroni Urbani, C. (1980) The first fossil species of the Australian ant genus Leptomyrmex in amber from the Dominican Republic. Stuttgarter Beiträge zur Naturkunde, Serie B, 62, 1-10.
Baroni Urbani, C. (1980): The first fossil species of the Australian ant genus Leptomyrmex in amber from the Dominican Republic. (Amber collection Stuttgart: Hymenoptera, Formicidae. III: Leptomyrmicini.).: 1-10
Baroni Urbani, C. (1980): The first fossil species of the Australian ant genus Leptomyrmex in amber from the Dominican Republic. (Amber Collection Stuttgart: Hymenoptera, Formicidae. III: Leptomyrmicini). Stuttgarter Beiträge zur Naturkunde. Serie B (Geologie und Paläontologie) 62: 1-10
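A minimal sketch of the clustering idea: normalize each string the way the corpus's cluster definition describes (case, accents, spacing, punctuation), then group strings whose normalized forms are nearly identical. The reference strings here are invented, and the threshold would need tuning on the real corpus:

    # Minimal sketch: normalise, then greedy single-pass clustering by
    # string similarity. Example strings are invented, not from RefBank.
    import re, unicodedata
    from difflib import SequenceMatcher

    def normalise(ref):
        ref = unicodedata.normalize("NFKD", ref)                # split off accents
        ref = "".join(c for c in ref if not unicodedata.combining(c))
        return re.sub(r"[^a-z0-9]+", " ", ref.lower()).strip()  # drop punctuation

    refs = ["Smith, J. (1999) On fossil ants. Journal of Myrmecology, 12, 1-10.",
            "Smith, J. (1999): On fossil ants. Journal of Myrmecology 12: 1-10",
            "Jones, A. (2005) Beetles of Peru. Coleopterists Bulletin 59: 4-9."]

    clusters = []
    for ref in refs:
        norm = normalise(ref)
        for c in clusters:
            if SequenceMatcher(None, norm, c["norm"]).ratio() >= 0.9:
                c["refs"].append(ref)
                break
        else:
            clusters.append({"norm": norm, "refs": [ref]})

    for c in clusters:
        print(len(c["refs"]), "->", c["refs"][0])

The hard cases are ones like the three above, where one variant omits the journal entirely; those call for field-aware matching (authors, year, pages) rather than plain string similarity.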
Resources
Data
SQL Dump: http://plazi.cs.umb.edu/RefBank/RefBankDB.sql.gz
WebApp
Dump&Run Pack: http://plazi.cs.umb.edu/RefBank/RefBank.zip
Install Guide: http://plazi.cs.umb.edu/RefBank/static/downloadRefBank.html
Challenge 8:
«Looking for the WOW Wikidata query»
(by Cristina Sarasua, Universität Zürich)
Problem
Wikidata, the free knowledge base that anyone can use and edit, is growing fast and is heavily used.
How to showcase what one can do with Wikidata, and teach others how to use it?
Let's look at what the crowd asks for and spot the WOW queries.
* That way we also help Wikidata Facts and the Query Examples page!
51+ Million Data Items
749+ Million Edits
SPARQL Query Service: https://query.wikidata.org/
What we need: Mining SPARQL queries by the crowd
Tip: Get a subset of organic queries
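A minimal sketch for pulling the organic (human-written) queries out of one of the released log files. The TSV layout assumed here (percent-encoded query in column 0, source category in column 2) and the file name are guesses, so check the dataset's documentation:

    # Minimal sketch: decode queries and keep only "organic" ones.
    # Column positions/values are assumptions -- see the dataset's docs.
    import csv, gzip
    from urllib.parse import unquote_plus

    organic = []
    with gzip.open("wikidata_queries.tsv.gz", "rt", newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            if len(row) > 2 and row[2] == "organic":
                organic.append(unquote_plus(row[0]))  # percent-encoded SPARQL

    print(len(organic), "organic queries")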
Resources
Learning to Query Wikidata with SPARQL
Notebook showcasing some SPARQL & Wikidata features: https://tinyurl.com/y9hrpmad
Millions of SPARQL Queries Executed
Anonymised data set released by WMDE & TU Dresden: https://iccl.inf.tu-dresden.de/web/Wikidata_SPARQL_Logs/en
Related publication by Malyshev et al. at ISWC 2018: https://tinyurl.com/yc2u6a9m
General Wikidata Access: https://www.wikidata.org/wiki/Wikidata:Data_access
PAWS (your Jupyter instance by Wikimedia): https://paws.wmflabs.org/
Gastrodon Library, SPARQL Jupyter Kernel
Happy to give a SPARQL 101 introduction!
Challenge 9:
«OpenStreetMap Location Classification»
(by Sustainable FinTech & Carbon Delta)
Problem
OpenStreetMap location data has been instrumental in building our database of companies' physical assets, which we use to quantify each company's exposure to physical risks from changing weather patterns.
Currently, our database lacks any kind of classification data, so we must treat all locations as equal, even though a location may be anything from a tiny retail outlet to a high-throughput factory.
What we need
We need a program that, given a high-precision latitude and longitude as input, can find the corresponding installation on OpenStreetMap and classify it as a factory, farm, logistics, retail, office, etc., and ideally also estimate its size (in m²). Assume the mapping to the company has already been done.
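A minimal sketch using the Overpass API (a public read-only OSM query service) to fetch tagged elements around a coordinate; the endpoint, the 50 m radius, and the example coordinate are illustrative choices:

    # Minimal sketch: nearby OSM elements with classification-relevant tags.
    import requests

    lat, lon = 47.3769, 8.5417        # example coordinate (Zurich)
    query = f"""
    [out:json][timeout:25];
    ( way(around:50,{lat},{lon})["building"];
      way(around:50,{lat},{lon})["landuse"];
      node(around:50,{lat},{lon})["shop"]; );
    out center;
    """
    r = requests.post("https://overpass-api.de/api/interpreter",
                      data={"data": query})
    for el in r.json().get("elements", []):
        tags = el.get("tags", {})
        cls = tags.get("shop") or tags.get("landuse") or tags.get("building")
        print(el["type"], el["id"], "->", cls)

Footprint area (for the m² estimate) could then be computed from each way's geometry, e.g. with shapely.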
Resources
Sites to Crawl
OpenStreetMap: https://www.openstreetmap.org
OSM API: https://wiki.openstreetmap.org/wiki/API
OpenStreetMap tools
OSM Wiki: https://wiki.openstreetmap.org/wiki/Map_Features#Shop
OSM Wiki: https://wiki.openstreetmap.org/wiki/Key:designation
Thanks!