Open-innovation program
A series of websites acting as the agency-wide collecting place for NASA’s public data
including:
A Tour of NASA’s Data Universe
for a Space-Apps Audience
Justin Gosses
S.A.I.C. senior data scientist supporting
Office of the Chief Information Officer
Transformation & Data Division
The premise of this talk is that by telling you a little about why different open-data sites exist and how they relate to one another, you’ll be better prepared to find datasets.
Contents of this talk
Open-Innovation Program
Run by Office of the Chief Information Officer (OCIO),
Open-Government Mandate Driven,
& Agency-wide
API + Data + Code
Open-innovation program
API.nasa.gov
Data.nasa.gov
Code.nasa.gov
API.nasa.gov (a passthrough service with tracking by api.data.gov)
A.P.I. = Application Programming Interface (write code to get back data)
Many other NASA APIs are not listed on this site! This page serves as a central, easy-to-find location for NASA’s easier-to-use APIs.
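As a minimal sketch of what "write code to get back data" can look like, the Python below builds a request URL for the Astronomy Picture of the Day (APOD) endpoint served through api.nasa.gov and defines a small helper to fetch the JSON response. The helper names `build_apod_url` and `fetch_json` are made up for this illustration, and `DEMO_KEY` is the rate-limited demonstration key; check the site itself for current endpoints and parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

APOD_ENDPOINT = "https://api.nasa.gov/planetary/apod"


def build_apod_url(api_key="DEMO_KEY", date=None):
    """Return the APOD request URL; `date` is an optional YYYY-MM-DD string."""
    params = {"api_key": api_key}
    if date is not None:
        params["date"] = date
    return f"{APOD_ENDPOINT}?{urlencode(params)}"


def fetch_json(url, timeout=30):
    """Send a GET request and decode the JSON body (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example use (not run here because it requires a network connection):
#   record = fetch_json(build_apod_url(date="2019-10-18"))
#   print(record.get("title"), record.get("url"))
```

Keeping URL construction separate from the network call makes it easy to inspect the request before sending it and to swap in your own API key.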
STATISTICS:
CODE.nasa.gov
555+ open-source projects
Fed by software that has gone through the Software Release System run by the Office of the Chief Engineer
Most but not all code is also on github.com/nasa
Table shows the open-source projects with the most interaction on GitHub using GSA’s pre-built scripts
DATA.nasa.gov
The largest number of datasets. Harvests data from other sites. Gets harvested into data.gov.
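The talk does not describe how harvesting is implemented, so the following is only an illustration of the general idea: a harvesting site collects catalog records from several source sites and merges them into one list. The function name `harvest`, the `identifier` and `harvested_from` fields, and the sample records in the usage note are all assumptions made for this sketch.

```python
def harvest(source_catalogs):
    """Merge records from several source catalogs into one list.

    `source_catalogs` maps a source name to a list of record dicts.
    Records are keyed by their "identifier" field; when two sources share
    an identifier, the record from the later source replaces the earlier one.
    """
    merged = {}
    for source_name, records in source_catalogs.items():
        for record in records:
            # Tag each record with the (hypothetical) source it came from.
            merged[record["identifier"]] = {**record, "harvested_from": source_name}
    return list(merged.values())
```

For example, `harvest({"site-a": [{"identifier": "1", "title": "x"}], "site-b": [{"identifier": "2", "title": "y"}]})` returns both records, each tagged with its source name.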
Who Puts All This Together?
1000s of NASA & contractor staff who contribute code projects, APIs, and datasets
2.5 developers who maintain the open-innovation sites
You! A lot of our code for these sites is in public github.com repositories, and we accept pull requests.
The NASA Data Universe
Why so many places to find data?
Mandates & Requirements Drive Dataset Storage Diversity
[Diagram: levels of data sites, connected by harvesting relationships, with an example for each level]
US Government-wide: data.gov
NASA-wide: data.nasa.gov, api.nasa.gov
Project Specific: APOD website
Domain Specific Archives: earthdata.nasa.gov
Small unique datasets: csv file on plants
Domain-specific NASA Data Sites:
FIND LONGER LIST HERE: https://www.nasa.gov/open/data.html
Suggestions for Finding NASA datasets
Most used datasets are easiest to work with!
Most reused Code Clones
API Downloads in May
Datasets Total Downloads
Find Starter Code! saves time on dataset finding & prepping
Datanauts is a program where members of the public work with NASA open data. The Datanauts GitHub org is a great place to find starter code.
Searching for terms on GitHub will often turn up open-source licensed code you can reuse! ‘NASA’ returns 10,000 results!
Observable Notebooks are like Jupyter notebooks but JavaScript, live, editable, & forkable on the web! Search for terms or check out this NASA collection.
Search through 5000 past SpaceApps projects using this app by datanaut Alexandre Belloni Alves
Bl.ocks is a site that collects live d3 visualizations. You can put ‘nasa’ into the search and get back things that use NASA data!
Live JavaScript Code Collections
Github
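The GitHub search suggested above can also be done from code. The sketch below only builds a URL for GitHub's repository-search REST endpoint, sorted so the most-starred results come first; the function name `build_repo_search_url` and the example search terms are illustrative assumptions, and unauthenticated requests to this endpoint are rate limited.

```python
from urllib.parse import urlencode

GITHUB_SEARCH = "https://api.github.com/search/repositories"


def build_repo_search_url(terms, org=None, language=None, per_page=10):
    """Build a GitHub repository-search URL, most-starred results first."""
    query_parts = list(terms)
    if org:
        # Restrict results to one organization, e.g. org:nasa
        query_parts.append(f"org:{org}")
    if language:
        # Restrict results to one programming language
        query_parts.append(f"language:{language}")
    params = {
        "q": " ".join(query_parts),
        "sort": "stars",
        "order": "desc",
        "per_page": per_page,
    }
    return f"{GITHUB_SEARCH}?{urlencode(params)}"
```

Calling `build_repo_search_url(["lunar"], org="nasa")` gives a URL you can open in a browser or request from a script; the JSON response lists matching repositories under an `items` key.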
Consider whether you’re finding or discovering datasets
Dataset Finding = “You know the dataset exists and what it is called”
Dataset Discovery = “You don’t know the name, whether it exists, what it looks like until you see it, or how you’ll use it until you see it.”
Use data.nasa.gov or other sites where you can search via titles, names, and other things that work well with string matching.
Look at previous code projects or websites that only hold specific types of data. These are more likely to have visual representations of data that help you determine what exists and how you might use it.
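For the dataset-finding case, string matching over titles and descriptions can be sketched in a few lines. The example assumes catalog records shaped like entries in a Project Open Data style `data.json` file, with `title`, `description`, and `keyword` fields; the function name `search_catalog` and the sample records are invented for illustration and are not real catalog entries.

```python
def search_catalog(datasets, term):
    """Return datasets whose title, description, or keywords contain `term`."""
    needle = term.casefold()
    matches = []
    for dataset in datasets:
        # Join the searchable text fields into one lowercase string.
        haystack = " ".join(
            [dataset.get("title", ""), dataset.get("description", "")]
            + list(dataset.get("keyword", []))
        ).casefold()
        if needle in haystack:
            matches.append(dataset)
    return matches


# Made-up sample records, for illustration only.
SAMPLE_CATALOG = [
    {"title": "Example lunar rock table", "description": "Rock masses", "keyword": ["moon"]},
    {"title": "Example plant growth csv", "description": "Plant heights", "keyword": ["biology"]},
    {"title": "Example crater list", "description": "Craters on the Moon", "keyword": []},
]

moon_titles = [d["title"] for d in search_catalog(SAMPLE_CATALOG, "moon")]
```

This kind of search works well when you already know a name or a likely word in the title; it does little for discovery, where you do not yet know what to type.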
Consider Discoverability vs. Data Site Type
Site type: Harvest-generated site
Example: data.nasa.gov
Metadata: generic, with links to more metadata
Interfaces built for: general public & search engines
Discoverability type: string matching in descriptions or titles

Site type: Domain-specific site
Example: pds.nasa.gov
Metadata: science-field-specific metadata
Interfaces built for: scientists, engineers & developers who need authoritative files
Discoverability type: filtering by content type, location & format

Site type: Dataset-specific site
Example: Insight Mars Weather API
Metadata: dataset-specific metadata
Interfaces built for: dataset users
Discoverability type: see example use-cases
Open-Innovation Site Specific:
API.NASA.GOV
DATA.NASA.GOV
CODE.NASA.GOV & Github.com/nasa
Example of Finding Data
THE CHALLENGE
You are the astronaut/robotic mission lead tasked with bringing valuable specimens from the Moon back to Earth for further study. How will you evaluate lunar samples quickly and effectively before or while still on the mission? How will you differentiate samples of potential scientific value from less interesting material?
Suggestion 1: Look for org pages of NASA groups that do related work:
Suggestion 2: Search for papers/descriptions of past NASA work in data.nasa.gov & sti.nasa.gov:
Any Questions?
Best of Luck with SpaceApps!