1 of 66

NICAR Beginner Track:

Tools to Save You Time

Link to slides: bit.ly/beginner-tools

2 of 66

Speakers

Cynthia Tu (she/her), Sahan Journal

Tyler Dukes (he/him), McClatchy Media

Pooja Dantewadia (she/her), Realtor.com

3 of 66

Summary

Problems we will solve:

  • Website & data scraping
  • Document management
  • Summarization
  • Visualization

Tools to solve them:

  • RECAP
  • Wayback Machine extension
  • Junkipedia
  • Google Pinpoint
  • DocumentCloud
  • Summarize.tech
  • NotebookLM
  • Tabula
  • RAW
  • And more!!!

4 of 66

Website & Data Scraping

5 of 66

RECAP

Features:

  • Free access to documents from the PACER archive.
  • Available as a browser extension (Chrome, Safari, etc.).
  • Contribute documents you pay for to a massive, distributed archive.
  • Get alerts about changes to cases you’re watching.

Links:

11 of 66

Wayback Machine

12 of 66

Wayback Machine Chrome extension

13 of 66

Wayback Machine Chrome extension

Features:

  • The Wayback Machine is a service by the Internet Archive that stores and displays historical snapshots of web pages
  • The extension saves a snapshot of the website you are viewing to the archive
  • Shows how a website looked in the past
  • Captures the page's outbound links, not just the archived version of the URL itself
  • Instantly see snapshots of any web page and discover how it looked at different points in time
  • Compare current content with past versions
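
For reporters who want to check snapshots from a script rather than the browser, the Internet Archive also offers a public availability endpoint at archive.org/wayback/available. The sketch below is a minimal illustration, not part of the extension: the helper names and the sample response are hypothetical, and the sample is only shaped like the JSON the endpoint returns. It builds the lookup URL and reads the closest snapshot out of a response body without making a network call.

```python
import json
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the lookup URL for a page, optionally near a given date."""
    params = {"url": page_url}
    if timestamp:
        # Format is YYYYMMDD, optionally extended to YYYYMMDDhhmmss
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(response_text):
    """Return (snapshot_url, timestamp) from a response body, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


# Hypothetical response body, used here in place of a live request
sample = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20200101000000/http://example.com/",
            "timestamp": "20200101000000",
            "status": "200",
        }
    }
})

print(availability_url("example.com", "20200101"))
print(closest_snapshot(sample))
```

To use it against the live service, fetch the URL from availability_url with any HTTP client and pass the response text to closest_snapshot.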

14 of 66

Wayback Machine Chrome extension

Features:

  • Quickly archive any web page, or other material such as a PDF file, spreadsheet, or zip file, using the "Save Now" button

19 of 66

Junkipedia

20 of 66

Junkipedia

What: Junkipedia is a tool designed to help journalists and researchers analyze and report on social media activity, similar to what CrowdTangle used to offer.

21 of 66

Junkipedia

Features:

  • Annotated account database: quickly find information on social media accounts, with notes about their credibility, background, or purpose
  • Metric comparison: compare engagement, mentions, views, and other interactions across accounts, languages, or categories
  • User-friendly API: access data through a simple, well-documented API that integrates with your other tools and projects

22 of 66

Junkipedia

Multiple flexible filters narrow down content by language, media type (video, image, text), or social media platform (such as Facebook, X, Instagram, or TikTok).

23 of 66

Junkipedia

Search all collected social media data in one place

24 of 66

Junkipedia

How it helps:

  • Spot trends or misinformation quickly
  • Understand how topics or stories are spreading online
  • Create reports backed by accurate social media data
  • Save time with easy-to-use analytics

30 of 66

Junkipedia

Additional Features:

  • User-controlled data collection: build customized lists of social media accounts (including podcasts) to track what's important
  • Automatically generate transcripts from collected content
  • Quickly find new social media channels by topic or engagement level
  • Export data as CSV files and create customizable dashboards

31 of 66

Junkipedia

Ethical considerations

Scraping is not illegal, but it raises ethical considerations around data privacy, community norms, and Terms of Service compliance.

32 of 66

Document Management

33 of 66

Google Pinpoint

Features:

  • Bulk document search, summarization, and extraction
  • Transcribe and search audio and video files
  • Extract similarly structured data into sortable spreadsheets
  • Collaborative workspace for projects

Examples:

34 of 66

Google Pinpoint

Organize and filter your documents by labels

35 of 66

Google Pinpoint

Filter documents by dates and entities

36 of 66

Google Pinpoint

  • Use quotation marks (“”) to match specific words:
    • A search for “moon” returns only exact matches of the word “moon”, while a search for moon without quotation marks may return results for “moon”, “lunar”, etc.
  • Use the minus sign (-) in front of any word you’d like to exclude from your search:
    • Results for the search moon -sun will not include documents where the word “sun” appears
  • Use the keyword “OR” or a pipe symbol “|” between two phrases to return documents that include either of them.
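
To make the three operators concrete, here is a toy matcher that mimics the rules described above on plain strings. It is only an illustration of the logic and not how Pinpoint works internally: the function names are hypothetical, and it treats every term as an exact word match, so it does not reproduce the related-word matching (moon vs. lunar) that unquoted searches may perform.

```python
import re


def parse_query(query):
    """Split a query into OR-groups of required terms and a list of excluded terms."""
    tokens = re.findall(r'"[^"]+"|\S+', query)
    groups, excluded = [], []
    join_next = False  # True right after an OR or | token
    for tok in tokens:
        if tok in ("OR", "|"):
            join_next = True
            continue
        if tok.startswith("-") and len(tok) > 1:
            excluded.append(tok[1:].strip('"').lower())
            join_next = False
            continue
        term = tok.strip('"').lower()
        if join_next and groups:
            groups[-1].append(term)  # alternative to the previous term
        else:
            groups.append([term])    # a new required term
        join_next = False
    return groups, excluded


def matches(query, text):
    """Return True if the text satisfies the query under the toy rules."""
    words = " ".join(re.findall(r"[a-z0-9']+", text.lower()))

    def present(term):
        # Whole-word (or whole-phrase) match only
        return f" {term} " in f" {words} "

    groups, excluded = parse_query(query)
    if any(present(term) for term in excluded):
        return False
    return all(any(present(term) for term in group) for group in groups)


print(matches("moon -sun", "The moon rose early"))       # True
print(matches("moon -sun", "The moon and the sun"))      # False
print(matches("moon OR sun", "Only the sun was out"))    # True
```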

38 of 66

DocumentCloud

Features:

  • Create document libraries searchable by keyword
  • Collaborate with colleagues by sharing across organizations
  • Annotate and link to important parts of each document
  • Share and embed documents with your audience
  • Add-ons expand capabilities

Links:

  • Sign up for free: documentcloud.org/
  • Link your account to your news organization for full access to tools

42 of 66

Summarization

43 of 66

Summarize.tech

Features:

  • Get summaries of local video files or YouTube videos
  • 5 free uploads per month
  • See transcriptions uploaded by other users
  • Premium subscription: $10 per month

46 of 66

NotebookLM

Features:

  • Chatbot for research and summarization, powered by Google Gemini
  • Upload and organize multiple sources
  • Generate instant insights
  • Clear citations that point to exact quotes from your sources
  • Notes organization
  • Transform insights into podcast/audio

47 of 66

NotebookLM

Upload docs to your collection

48 of 66

NotebookLM

NotebookLM will summarize your documents after uploading

49 of 66

NotebookLM

  • Ask questions about the documents in the chat box
  • …Or use the default prompts to generate insights

50 of 66

NotebookLM

What it’s good for:

  • Organize and analyze documents in bulk
  • Generate story ideas, outlines, Q&A
  • Answer questions about the documents
  • Clear citations!!
  • Generate “podcasts” using your documents

51 of 66

Visualization

52 of 66

Tabula

Features:

  • Convert PDFs to data using a web-based interface
  • Control your output by specifying a CSV, TXT or Excel file
  • Tweak the formatting to pull out the data you want
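
Tables pulled out of PDFs often need a cleanup pass before analysis. The sketch below assumes a CSV exported from Tabula in which the header row repeats on each PDF page and numbers carry commas or dollar signs. The sample rows, column names, and function names are all hypothetical. It uses only Python's standard library.

```python
import csv
import io


def to_number(cell):
    """Convert cells like '1,200' or '$3,450.50' to numbers; leave other text alone."""
    cleaned = cell.replace(",", "").replace("$", "")
    try:
        return int(cleaned)
    except ValueError:
        try:
            return float(cleaned)
        except ValueError:
            return cell


def load_tabula_csv(text):
    """Parse CSV text, dropping blank rows and header rows repeated from later pages."""
    rows = [r for r in csv.reader(io.StringIO(text)) if any(c.strip() for c in r)]
    header = [c.strip() for c in rows[0]]
    records = []
    for row in rows[1:]:
        cells = [c.strip() for c in row]
        if cells == header:  # header repeated at the top of a new PDF page
            continue
        records.append({key: to_number(value) for key, value in zip(header, cells)})
    return records


# Hypothetical export: two pages, each starting with the same header row
sample = 'Agency,Budget\nParks,"1,200"\n\nAgency,Budget\nLibraries,"$3,450.50"\n'

print(load_tabula_csv(sample))
# [{'Agency': 'Parks', 'Budget': 1200}, {'Agency': 'Libraries', 'Budget': 3450.5}]
```

For a real file, read its contents with open(path, newline="", encoding="utf-8").read() and pass the text to load_tabula_csv.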

Links:

  • Get Tabula for free: tabula.technology/

56 of 66

RAW

57 of 66

RAW

  • Easy-to-use online tool to turn raw data into visual charts
  • Supports many chart types like bar, line, scatter, and sankey diagrams
  • Adjust colors, sizes, and positions to highlight data
  • Exports high-quality images
  • Free and open-source
  • Helps journalists quickly create clear, engaging visuals
  • NO SIGN UP TO BEGIN!

63 of 66

More resources!

(that we may not have time to talk about in depth)

66 of 66

Questions?

Cynthia Tu

Email: ctu@sahanjournal.com

Twitter/X: @CynthiaTu2

Tyler Dukes

Email: mtdukes@mcclatchy.com

Bluesky: @mtdukes.bsky.social

Pooja Dantewadia

Email: pd2713@columbia.edu

Twitter: @PoojaDante