LAT data strategy memo – revision 0.3
June 2, 2008
As a newspaper, we’re used to controlling how readers get information from us. We publish most of our data as stories, carefully selecting the information that best tells the story and leaving out most of the data we collect – often valuable information that readers want. The online medium not only gives us the means to publish virtually unlimited volumes of data, but it demands that we do so.
The web is all about putting more control in the hands of the users. When we publish information as structured data, we give users the power to take control of it. When we don’t, we invite them to search for that information elsewhere.
Understanding that our current data collection, analysis, storage and presentation functions are spread across multiple departments with different needs and priorities, we sought to set some universal goals and identify opportunities for further collaboration. Seeing a need for a comprehensive data strategy, we tasked ourselves with developing a set of recommendations for aligning these functions.
Key themes from our discussions:
We currently have people serving several different masters. Our priorities need to be in alignment.
The story-centric view of the universe constrains us. Often the data we collect will result in stories; sometimes it’ll work the other way around. But much of what we’ll be doing, at least at the start, is independent of any story. It’s gathering and publishing data for the sake of the data itself and to the broadest benefit of the readers, knowing that stories will reveal themselves.
We should train most reporters to identify opportunities for data collection and analysis within their beats (and we should train some reporters further on basic database tools).
We need to maintain an arsenal of tools and templates for visualizing and publishing data. This will enable us to work faster and allow more people to pitch in.
We offer three potential paths forward. There are certainly others we could take, and the answer may be a combination of elements of these:
Option 1: Cross-functional data team
Appoint a data editor. The data editor’s role will be to harness ideas and coordinate data priorities across four departments: metro, web, graphics and the library.
Assign two reporters from metro to acquire data. These folks need tech smarts, persistence and a good understanding of public records law. They’ll work with beat reporters and with the data editor to assemble key data sets.
Continue working to standardize storage and organization of data. Part of a successful data strategy is a plan to manage the information we collect and make sure it’s optimally distributed.
Evangelize the data effort throughout the newsroom. We will need buy-in from reporters and editors, particularly in metro. As the war dead project has shown, collecting and vetting high-quality data can be a time-consuming process.
Effect: Requires the least investment and can be accomplished quickly but doesn’t integrate the team. Probably gives us a realistic shot at releasing a data app to the website at least every 3 weeks.
Option 2: Metro-web data desk
Establish a data desk that merges metro and web data teams under a single editor. The data desk would presumably report jointly to the metro editor and the editor of the website. Initially, the data desk can be a virtual construct, but eventually it should be a physical place in the newsroom. We’d staff the data desk with two reporters devoted full-time to wrangling data from public entities. Suggested roles:
Data editor – manage and prioritize projects, allocate resources for print and web.
Data administrator – organize our expanding universe of data and oversee its use and maintenance.
Data reporters (2) – negotiate with public agencies to acquire and ingest data.
Analysis specialists (2) – scrub and mash up data sets, work with beat reporters to develop data into stories.
Presentation specialists (2) – create visualizations for print and web, help develop web applications to present dynamic data. Includes suggested developer/UI specialist to be hired.
Project manager (shared) – coordinates with tech team on projects that require tech resources.
Ensure regular communication on data priorities. This includes frequent communication on priorities with editorial leadership, as well as collaboration with graphics and library staffs on certain projects.
Promote training in the newsroom. The data desk should be a resource for the wider newsroom, a place where story ideas are born and reporters and editors come to learn how to collect and analyze data. It should be a hub for collaboration.
Create a standing committee focused on the management and organization of the data we collect. Include representatives from the data desk, graphics, library and web tech team.
Effect: Requires some shuffling of bodies but doesn’t necessitate splitting off folks from graphics and the library. Probably gives us a realistic shot at releasing a data app to the website roughly every 2 weeks.
Option 3: Fully integrated data desk
Establish a data desk that merges metro and web data teams, plus parts of library and graphics, under a single editor. Suggested roles:
Data editor
Data administrator
Data reporters (2)
Analysis specialists (3)
Presentation specialists (3) – includes suggested developer/UI specialist to be hired.
Project manager (shared)
Ensure regular communication on data priorities. The data desk will make an effort to set its own priorities, but it will need direction from editorial and tech leadership, who will ensure that the data desk’s goals are consistent with the organization’s overall editorial and business needs.
Promote training in the newsroom.
Create a standing committee focused on the management and organization of the data we collect.
Effect: This is the most complicated solution. It requires the library and graphics department to give up resources to this team, though some duties currently assumed by those departments will also shift to the data desk. It probably gives us a realistic shot at releasing a data app to the website almost every week, once the team is assembled and has a few weeks to optimize processes and ramp up production.
The “Top 9”
The following projects exemplify the long-term work the data desk should tackle, in addition to serving breaking news and enterprise data analysis needs:
LAPD crime data (maps, alerts, etc.)
Restaurant health inspection scores
Mapping L.A. data platform
Public employee salaries
Apartment/condo inspections
School profiles (scores, stats, user reviews)
Visualizations on the performance of pro athletes
Dog registrations by ZIP
Movie shoot locations