1 of 16

Open Data Scotland

Crowdsourcing and aggregating open datasets across the country

2 of 16

Who are we?

  • Open Data Scotland is an open community
  • The OD_BODS team/project started as a hackathon challenge in May 2021
  • https://opendata.scot is the result of the OD_BODS project

Open Data Scotland and opendata.scot are entirely volunteer-run and come from a desire to help people locate and understand Open Data in Scotland better.

A collaborative community promoting open data through open source

Open Data Scotland

opendata.scot

@opendata_sco

3 of 16

The problem we’re trying to solve

Users of open data don’t know where to find data

  • Sources are disparate and decentralised
  • Very reliant on knowing who to ask and where to look
  • High barrier to access for the occasional user

There have been attempts to collate sources before

  • Well intentioned, but one-off events
  • Huge effort to maintain
  • Sources become aged and irrelevant

Some people don’t recognise the term “open data”

4 of 16

Project objectives

Be the most complete and up-to-date source for locating Open Data in Scotland. (Starting with Local Authorities)

  • Find: Help public users find a data source they can use.
    • Or submit a request/suggestion if it is missing

  • Learn: Understand the state of Open Data (OD) in Scotland
    • Establish a comparable scorecard/barometer for assessing OD sources
    • Help identify where provision can be improved (FAIR principles)

5 of 16

The plan

12 months of active development split over 4 milestones

2021 Q4 - Establish front end

2022 Q1 - Fix data source issues

2022 Q2 - Automation

2022 Q3 - Expand on sources

7 of 16

Key events

8 of 16

Service Areas

1. Find Datasets

2. Learn about OD in Scotland

3. Learn about Open Data

9 of 16

Tools used

Backend

  • API / scrapers: Python, C#
  • Data cleaning / processing: Python
  • Data storage: .csv and .json files on GitHub
  • Pipeline automation: GitHub Actions
  • Hosting: GitHub Pages

Frontend

  • Interface: JKAN.io, HTML + JavaScript + CSS (Bootstrap)
  • Web analytics: Plausible Analytics
  • Visualisations: Python, JavaScript (Chart.js)
  • Forms: Google Forms

Project Tools

  • GitHub repos
  • GitHub wiki
  • GitHub projects (Issues / Milestones)
  • Slack
  • Twitter

Python Packages

  • python 3.9
  • pandas
  • beautifulsoup4
  • datefinder
  • datetime
  • os
  • requests
  • pyyaml
  • markdown
  • nbconvert
  • matplotlib
  • seaborn
  • textwrap
  • importlib

10 of 16

Data pipeline

11 of 16

Light walkthrough…

  • API calls + Scrapers → 1 CSV per org
    • ArcGIS
    • USMART
    • CKAN
    • DCAT
    • Manual scrapes
  • Merge_data.py → 1 CSV to rule them all!
    • Combine all the CSVs into a single table
    • Save unclean copy of data (for comparison/analysis)
    • Data cleaning - org names, categories, licensing, formatting
  • Export2jkan.py → 1 markdown file per dataset
    • Split CSV into files readable by JKAN
  • Visualisations(?)
    • Jupyter Notebook analysis
    • Live charts
  • GitHub Actions
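
The merge and export steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual Merge_data.py or Export2jkan.py: the column names in FIELDS, the helper names, and the front matter keys are all assumptions, and the sketch uses only the standard library rather than pandas or pyyaml so that it runs without extra installs.

```python
import csv
import json
import re
from pathlib import Path

# Hypothetical column names; the real per-organisation CSVs may differ.
FIELDS = ["title", "owner", "page_url", "licence", "description"]


def merge_csvs(csv_paths):
    """Read one CSV per organisation and return every row in a single list."""
    rows = []
    for path in sorted(csv_paths):
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Missing columns become empty strings so all rows share the same keys.
                rows.append({f: (row.get(f) or "").strip() for f in FIELDS})
    return rows


def slugify(text):
    """Turn free text into a file-name-safe slug."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def to_jkan_markdown(row):
    """Render one dataset as markdown with YAML front matter.

    The keys below are an assumed JKAN-style layout. json.dumps is used for
    quoting because a JSON string is also a valid double-quoted YAML scalar.
    """
    def q(value):
        return json.dumps(value, ensure_ascii=False)

    lines = [
        "---",
        "schema: default",
        f"title: {q(row['title'])}",
        f"organization: {q(row['owner'])}",
        f"notes: {q(row['description'])}",
        f"license: {q(row['licence'])}",
        "resources:",
        '  - name: "Dataset page"',
        f"    url: {q(row['page_url'])}",
        "---",
        "",
    ]
    return "\n".join(lines)


def export_datasets(rows, out_dir):
    """Write one markdown file per dataset and return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for row in rows:
        path = out_dir / (slugify(f"{row['owner']} {row['title']}") + ".md")
        path.write_text(to_jkan_markdown(row), encoding="utf-8")
        written.append(path)
    return written
```

A call such as export_datasets(merge_csvs(Path("data").glob("*.csv")), "_datasets") would chain the two steps; both directory names here are placeholders.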

12 of 16

The challenges

  • Inconsistent standards
    • Across different publishers
    • Within individual publishers
  • Some metadata doesn’t exist
    • Do we ignore, interpolate or other?
  • Working over a long time period
    • How do you keep momentum and engagement over 12 months?
    • Keeping collaborators informed in non-real time
    • How do you remember what you did 2 months ago?
  • Breaking tasks into bite-sized chunks
  • Getting people to know this service exists
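
One way to approach the first two challenges above (inconsistent standards and missing metadata) is sketched below. It is an illustration only: the alias table, the canonical labels, the "Not specified" placeholder and the list of required fields are all hypothetical, and labelling gaps explicitly is just one possible answer to the "ignore, interpolate or other?" question, not a decision taken from the project.

```python
import re

# Hypothetical aliases: publisher spellings (normalised) -> one canonical label.
LICENCE_ALIASES = {
    "ogl": "Open Government Licence",
    "uk ogl": "Open Government Licence",
    "open government licence": "Open Government Licence",
    "cc by": "Creative Commons Attribution",
    "creative commons attribution": "Creative Commons Attribution",
}

UNKNOWN = "Not specified"  # assumed placeholder for absent metadata


def normalise_licence(raw):
    """Map a publisher's licence string to a canonical label where one is known."""
    # Lower-case and collapse punctuation/whitespace so "CC-BY" and "cc by" match.
    key = re.sub(r"[^a-z0-9]+", " ", (raw or "").lower()).strip()
    if not key:
        return UNKNOWN
    # Unrecognised licences are kept as published rather than guessed at.
    return LICENCE_ALIASES.get(key, raw.strip())


def missing_fields(row, required=("title", "owner", "licence", "date_updated")):
    """List the required metadata fields that are absent or blank in a row."""
    return [field for field in required if not (row.get(field) or "").strip()]
```

Reporting the gaps per dataset, instead of silently filling them, keeps the cleaned table comparable with the unclean copy.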

13 of 16

The wins

  • Seeing active users on the service
  • Broad range of skills and expertise available
  • (Almost) everyone has been supportive and helpful
  • Opportunity to work with people you wouldn’t normally work with

14 of 16

What’s in the future?

  • Active plan of work till Sep 2022
  • Domain registered till 2023
  • Scottish Open Data Unconference 2022

Future Features:

  • Twitter bots?
    • Alerts for when sources go offline
    • Weekly key trends: pageviews + most popular dataset of the week
  • Smarter categorisation of datasets?
  • Dataset 5-star rating and review system?
  • Recommender engine: “see other datasets like this”?
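
As a rough idea of how the "see other datasets like this" feature could start, the sketch below ranks datasets by word overlap (Jaccard similarity) between titles. This is an assumed approach chosen for its simplicity, not a planned design; the stopword list and the function names are hypothetical.

```python
import re

# Small illustrative stopword list; a real one would likely be longer.
STOPWORDS = {"a", "and", "for", "in", "of", "the", "to"}


def tokens(title):
    """Split a title into a set of lower-case words, dropping stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", title.lower()) if w not in STOPWORDS}


def jaccard(a, b):
    """Share of words two sets have in common, from 0.0 to 1.0."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_datasets(target, titles, top_n=3):
    """Return up to top_n other titles, most similar to the target first."""
    target_tokens = tokens(target)
    scored = [(jaccard(target_tokens, tokens(t)), t) for t in titles if t != target]
    # Drop titles with no shared words, then sort by score (ties alphabetical).
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_n]]
```

Titles alone are a thin signal; categories or descriptions could be folded into the token sets in the same way.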

15 of 16

How can I contribute?

  • Any feedback is good feedback

  • Use opendata.scot: interact with the site and tell us what is missing

  • Engage with us on Twitter

  • Code with us: Slack / CTC hack events

16 of 16

Summary

Service Areas:

  • Dataset Listing
  • Dataset Insights
  • Learning Resources

Core Tasks:

  • API Calls
  • Web Scraping
  • Data Processing
  • Frontend
  • Data Visualisation

Project Lessons:

  • Be clear on the end goal
  • But be flexible with the details
  • Timebox projects
  • Non-real time communication
  • Documentation

Contact us:

opendata.scot

@opendata_sco

@kardotjewell

@YesImJack
