Open Government Data: Understanding Open Access vs. Public Domain
Documents Association of New Jersey
October 25, 2019: Trenton, NJ, Thomas Edison State University
Presenter: Jennifer C. Boettcher, Georgetown University
Jennifer C. Boettcher
Jennifer C. Boettcher and Leonard M. Gaines. Industry Research Using the Economic Census. Greenwood Press: Phoenix, AZ. 2004
M.B.A., Georgetown University, Washington, D.C., 2005
M.L.S., State University of New York, Albany, N.Y., 1992
B.A., University of New Hampshire, Durham, N.H., 1987
Georgetown University 1997-present
ALA RUSA BRASS Member since 1991
Founder of Business Information Finders (BIF) and Capital Area Business Academic Librarians (CABAL) in DC
2013 Emerald Research Grant: Zombie List (reanimated business sources)
2010 Gale Cengage Learning Award for Excellence in Business Librarianship
Librarian & Information Scientist
These are my views and do not reflect those of Georgetown.
Boettcher, J. C., & Dames, K. M. (2018). Government data as intellectual property:
Is public domain the same as open access? Online Searcher, 42(4), 42-48.
Adaptations of DIKW pyramid by US Army Knowledge Managers,
from https://en.wikipedia.org/wiki/DIKW_pyramid
Data are not:
Information
Technology
Digital
Analytics
Evidence
Research
Visualizations
Ideas
Data are
collected facts
“raw material”
Public Domain: No Copyright Restrictions
Public domain works are not protected by intellectual property laws. Anyone can use a public domain work without obtaining permission, but no one can ever own it.
Example: a creative work that is no longer protected because of its age.
Works produced for the U.S. Government by its officers and employees should not be subject to copyright. The provision applies the principle equally to unpublished and published works. 17 U.S.C. § 105
REMEMBER: Public domain data must be attributed.
Administrative Data and the Freedom of Information Act (FOIA) 5 U.S.C. § 552, 1966
File here: FOIAonline
Oversight: Office of Government Information Services
Why Open Data Exists
Public Domain Vs. Open Access
Major Sources of Social Science Data in the US Government
Major Sources of Natural Science Data from the US Government
https://www.flickr.com/photos/notbrucelee/6897137283/in/photostream
Problems that come with government data
P.E.S.T. Analysis for Industry
Let’s discuss
202 687-7495
Twitter: @jenny.wombat
These slides are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
© Bill Waterhouse, with permission
AMSTAT images from
http://magazine.amstat.org/blog/2018/05/01/fy18fedbudget
Vocabulary: Tools, Process, and Products
Datasets or compilations: Raw or statistical numbers; can be a flat file such as Comma-Separated Values (CSV) or a proprietary format like Excel
Metadata: Includes field descriptions for the dataset, found in codebooks
Schema: How data is organized or structured using standards, like classification
Application Programming Interface (API): Read-only machine-to-machine querying, generally returning JSON or XML
Big data: Raw, unstructured data; normally transactional (example: each checkout)
Natural Language Processing (NLP): Used for text analysis, not numeric data
Artificial Intelligence (AI): Includes predictive analytics and machine learning
Reports: Usually aggregated statistics based on big data (example: how many checkouts)
Data Visualization: Using software to visually communicate relationships and context of data
Open Data: Freely accessible data, created for a specific purpose; by-product of decision making or research
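The vocabulary above can be tied together with a small sketch. It assumes a made-up CSV of library checkouts (one row per transaction), aggregates it into a report (how many checkouts per branch), and serializes that report as JSON, the kind of payload an API might return. The field names, branch names, and function names are hypothetical and for illustration only.

```python
import csv
import io
import json
from collections import Counter

# Hypothetical transactional records: one row per checkout (made-up data).
RAW_CSV = """checkout_id,branch,item_type
1,Main,book
2,Main,dvd
3,East,book
4,Main,book
"""

def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))

def checkouts_per_branch(rows):
    """Aggregate transactions into a report: number of checkouts per branch."""
    return dict(Counter(row["branch"] for row in rows))

rows = load_rows(RAW_CSV)
report = checkouts_per_branch(rows)

# The same report serialized as JSON, as an API response might look.
report_json = json.dumps(report, sort_keys=True)
print(report_json)
```

The header row of the CSV plays the role of minimal metadata; a real dataset would document each field in a separate codebook.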
Funding for Federal Data Collection
NIH- National Institutes of Health (HHS)
NSF- National Science Foundation
AHRQ- Agency for Healthcare Research & Quality (HHS)
FDA- Food & Drug Administration (HHS)
BEA- Bureau of Economic Analysis (DoC)
BJS- Bureau of Justice Statistics (DoJ)
BTS- Bureau of Transportation Stat. (DoT)
Census- DoC
EIA- Energy Information Admin. (DoE)
ERS- Economic Research Service (DoA)
NASS- Nat. Agricultural Stat. Service (DoA)
NCES- Nat. Center for Education Stat. (ED)
NCHS- Nat. Center for Health Stat. (HHS)
NCSES- Nat. Center for Science and Engineering Stat. (NSF)
ORES- Off. of Research, Evaluation, and Statistics (SSA)
SOI- Statistics of Income (IRS)
Image from AmStat (permission pending)
One Statistical Office in US: Why Not?
1. Privacy: The Privacy Act of 1974, the Confidential Information Protection and Statistical Efficiency Act of 2002 (CIPSEA), and Statistical Policy Directive No. 1 (2014) require agencies to ensure that the collection and maintenance of citizens' data is accurate, confidential, and within legal restrictions. With records spread across different offices, a single leak is less likely to expose everything.
2. Security: Related to privacy, each office has access to fewer data records, and spreading data across more servers makes it safer. When an exchange of information is necessary, laws and regulations among departments protect access to the data.
3. Integrity: The income you report to IRS might be different from what you report to the Census Bureau.
4. Methodology: Some data need a larger number of respondents to be accurate enough; different methods of collection or sampling may be required.
5. Popularity: Anything the government does has a political dimension, especially funding for employees and for modernizing and updating technology, the attractiveness of the research, and the repetition of statistical programs by agencies.
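The methodology point (item 4) can be illustrated with a rough sketch of how sample size affects accuracy. It assumes a simple random sample and the usual normal approximation for a survey proportion; the function name and default values are hypothetical, and real federal surveys use more complex designs.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion (simple random sample)."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Larger samples give smaller margins of error.
for n in (100, 400, 1600):
    print(n, margin_of_error(n))
```

Under these assumptions, quadrupling the number of people questioned halves the margin of error, which is one reason different programs need different sample sizes.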
Future of the Bureau of Labor Statistics
In danger: Nat. Longitudinal Sur., JOLTS, Am. Time Use Sur., Employee Benefits Sur., Cen. of Fatal Occupational Injuries, Evaluation $27M>$2M
Protected
Principal Federal Economic Indicators (PFEI) and programs written into or referenced by law for allocation or other purpose. 85% of budget
Open Government
US Federal
International
States and Cities
https://data.sonomacounty.ca.gov/dataset/SoCo-Data-PNG/3m9t-bc35
Major International Data Sources
By topic
Financial & Economic- International Monetary Fund
Labor- International Labour Organization
Telecommunications- International Telecommunication Union
Governance- Transparency International
Developed Countries- Organisation for Economic Co-operation and Development (OECD)
By Country
More data available in national language
Some charge for access
Citizens of that country might have free access
National Repositories/Archives
Historical
Datasets
Associations: Blogs and Conferences
For Librarians
For Federal Data Policy
Learning more
Government Sources
Accidental Government Librarian
DigitalGov from Digital Government Division of GSA
Standards for Born Digital images
Numerical Data
Public Knowledge: Access and Benefits (Information Today, 2016)
Innovations in Federal Statistics (National Academies, 2017)
Legal issues
Data and IP