1 of 21

Archives Catalogues as Data

2 of 21

TASK: Turn 1990s archival descriptions from printed inventories into machine-actionable data that is accessible online

APPROACH: Collections-as-data

GROUP MEMBERS

Yael Netzer

Aaron Christianson

Elena Hamidy

Amalia Levi

Imme Klages

3 of 21

9 printed volumes

Original database ‘lost’

5 volumes are digitized

4 of 21

Background

  • Until 1989: hardly any access to Jewish archival materials in the GDR (DDR)
  • After German reunification: concerted effort to locate materials and increase access
  • Identifying Jewish resources in Archives of East Germany (5 states)
  • Central, state, regional, city, university, business archives
  • Jewish community archives, business and personal archives
  • One comprehensive inventory, followed by detailed inventories

5 of 21

“Collections as Data”

https://collectionsasdata.github.io/

Human readable - Machine Actionable

Yerusha project https://www.yerusha-search.eu/viewer/index/

6 of 21

Description of institution and relevant Jewish collections

7 of 21

Notepad++ with regexp
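
The same kind of regular-expression cleanup can also be scripted, which makes it repeatable across the remaining volumes. The following is a minimal sketch only: the index line layout (a heading followed by comma-separated entry numbers), the sample line, and the name `parse_index_line` are assumptions for illustration, not the patterns actually used in Notepad++.

```python
import re

# Assumed layout of one index line: a heading, then comma-separated entry numbers,
# e.g. "Mustermann, Erika 12, 48, 301" (made-up sample, not taken from the volumes).
INDEX_LINE = re.compile(r"^(?P<heading>.+?)\s+(?P<refs>\d+(?:\s*,\s*\d+)*)\s*$")


def parse_index_line(line):
    """Split one index line into (heading, [entry numbers]); None if it does not fit."""
    match = INDEX_LINE.match(line.strip())
    if match is None:
        return None  # keep unmatched lines for manual review instead of guessing
    refs = [int(number) for number in re.split(r"\s*,\s*", match.group("refs"))]
    return match.group("heading"), refs


if __name__ == "__main__":
    print(parse_index_line("Mustermann, Erika 12, 48, 301"))
```

Lines that do not fit the assumed pattern return `None`, so they can be collected and checked by hand rather than silently dropped.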

8 of 21

GOALS

Continue working with all volumes (this is only one volume out of 9)

Networks: locations, persons, timeline…

Mapping: reconcile and link locations

Persons: reconcile and link [to other collections too]

Linking archives/records to digital catalogues.

Find interesting things through distant reading of archival descriptions!

9 of 21

Notepad++ with regexp

10 of 21

Omeka-S

11 of 21

Exiles in “Quellen zur Geschichte der Juden in den Archiven der Neuen Bundesländer”

12 of 21

Types of organisation of a book (Institutionenregister, the index of institutions) / possibilities for a digital edition

  • Translation into English
  • Machine-readable vocabulary
  • More categories
  • Historical / still-active institutions

13 of 21

Schocken

(reply to Sinai’s question)

14 of 21

Combining indexes to link persons to locations (Gephi-Graph)
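
One way to combine two indexes is to join them on the entry numbers they share and export the result as an edge list for Gephi. The sketch below assumes each index has already been parsed into a mapping from heading to entry numbers; the function names and the sample data are hypothetical, and the `Source`, `Target`, `Weight` columns follow Gephi's spreadsheet import convention for edge tables.

```python
import csv
import io
from collections import Counter


def link_persons_to_places(person_index, place_index):
    """Count, for every (person, place) pair, the entries in which both are indexed."""
    # Invert the place index: entry number -> set of places indexed under it.
    places_by_entry = {}
    for place, entries in place_index.items():
        for entry in entries:
            places_by_entry.setdefault(entry, set()).add(place)

    edges = Counter()
    for person, entries in person_index.items():
        for entry in set(entries):  # ignore duplicate entry numbers
            for place in places_by_entry.get(entry, ()):
                edges[(person, place)] += 1
    return edges


def edges_to_csv(edges):
    """Serialise the weighted edges as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (person, place), weight in sorted(edges.items()):
        writer.writerow([person, place, weight])
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample data; real input would come from the parsed indexes.
    persons = {"Person A": [1, 2, 3], "Person B": [3]}
    places = {"Place X": [1, 2], "Place Y": [3]}
    print(edges_to_csv(link_persons_to_places(persons, places)))
```

The weight is simply the number of shared entries, so a heavier edge means a person and a place are indexed together more often. It does not by itself show that the person lived or worked there.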

15 of 21

16 of 21

17 of 21

18 of 21

19 of 21

LIMITATIONS OF AVAILABLE DATA

  • Not an in-person survey; information was contributed by the archives themselves
  • Index of archives “with no Jewish materials”
    • Opportunity for further research
  • Included: materials with clear “Jewish” reference
    • Possibly lots of material missing
  • Problematic legacy titles (e.g., “Mischling” or “Aryanization”) carried over
    • Need for research to better contextualize past archival practices
  • Important to read extracted data alongside the introduction + directions to users in each volume.
    • So online users can understand the issues, challenges, and limitations of the data they see online

20 of 21

21 of 21

Lessons Learned

Always be suspicious of resources

  • Add ‘the unknown’ to your collection

Catalogues as data, books as data: think about future process of creation

Paradata/Provenance should not be neglected or ignored

Modeling of knowledge is an iterative, community-based process

Hackathons are effective and fun