1 of 47

Features of the Data Landscape

(Big) Data & Data Assemblages

Ceilyn Boyd

ceilyn_boyd@harvard.edu

ceilyn.boyd@simmons.edu

NEASIST

11 January 2019

2 of 47

Navigating the Data Landscape - What features will you encounter?


3 of 47

Connecting data practice and research


Practice

Harvard Library

Research

Simmons University

(Big) Data &

Data Assemblages

?

Critical Data Studies & Philosophy of data

Research Data Management

4 of 47

Concepts


5 of 47

5 Key Concepts

(Big) Data

Measurements, pre-factual, pre-analytical pieces of information exhibiting 3Vs (volume, variety & velocity)

Assemblages

Non-hierarchical arrangements of components characterized by extrinsic properties and relations

Data Assemblages

Sociotechnical infrastructures concerned with data

Research Data Management

Active management of data throughout the research and data lifecycle

Critical Data Studies

Interdisciplinary research area concerned with the critical, systematic investigation of data & data assemblages


6 of 47

Term: Critical Data Studies (CDS)

Interdisciplinary research area concerned with the critical, systematic investigation of data & data assemblages


7 of 47

Critical Data Studies (CDS)


Data

big data

Critical Theory

Discipline

small(er) data

Critical Data Studies

  • Bias
  • Ethics
  • Justice
  • Power
  • Privacy
  • Technology

data assemblages

Social Theories

assemblage theory

Geography

(or LIS, RDM, etc.)

8 of 47

Foundation of Critical Data Studies

Origins

  • Geography, sociology
  • Circa 2012
  • Against Dataism
  • boyd, danah, & Crawford, K. (2012). Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon.

7 Guiding Principles

  1. Data are situated in time & space
  2. Technology is not neutral
  3. Data are shaped by context
  4. Data are not raw
  5. Data complement other ways of knowing
  6. Data should be used in progressive ways
  7. Scholars (geography, LIS) should engage in praxis, not solely theory

(Dalton & Thatcher, 2014)

(Oliphant, 2017)


9 of 47

Term: (Big) Data


10 of 47

Data are not pure or natural objects with an essence of their own. They exist in a context, taking on meaning from that context and the perspective of the beholder.

(Borgman, 2015, p. 18)


11 of 47

Big Data possess a suite of key traits: volume, velocity and variety (the 3Vs), but also exhaustivity, resolution, indexicality, relationality, extensionality and scalability...Our analysis reveals that the key definitional boundary markers are the traits of velocity and exhaustivity.

(Kitchin & McArdle, 2016, p. 1)


12 of 47

(Big) Data Characteristics

Data

  • (Usually) Operationally defined
    • Context-dependent: data when, not what
  • Many formats (e.g. structured to unstructured)
  • (Often) Machine-readable
  • Not raw or neutral

Big Data

Data and datasets that exhibit some or all of the following properties:

  1. Volume
  2. Velocity
  3. Variety
  4. Exhaustivity
  5. Resolution & Indexicality
  6. Relationality
  7. Extensibility & Scalability

(Kitchin & McArdle, 2016, p. 2)


13 of 47

Data: Small & Big


Characteristic              | Small Data                           | Big Data
----------------------------|--------------------------------------|-------------------
Volume                      | Limited to large                     | Very large
Exhaustivity                | Samples                              | Entire populations
Resolution & Indexicality   | Coarse and weak, to tight and strong | Tight and strong
Relationality               | Weak to strong                       | Strong
Velocity                    | Slow, freeze-framed, or bundled      | Fast, continuous
Variety                     | Limited to wide                      | Wide
Extensibility & Scalability | Low to medium                        | High

(Kitchin, 2014, p. 28)

(Kitchin & McArdle, 2016, p. 2)
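
As a rough illustration of the comparison above, the sketch below records the seven traits as a yes/no profile and applies the observation quoted earlier from Kitchin & McArdle (2016) that velocity and exhaustivity are the key boundary markers. The class name, field names, boolean encoding, and the two example profiles are assumptions made for illustration; they are not taken from the cited work.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TraitProfile:
    """Hypothetical yes/no profile of a dataset against the seven traits."""

    volume: bool
    velocity: bool
    variety: bool
    exhaustivity: bool
    resolution_indexicality: bool
    relationality: bool
    extensibility_scalability: bool

    def traits_present(self) -> list[str]:
        """Return the names of the traits marked True, in declaration order."""
        return [f.name for f in fields(self) if getattr(self, f.name)]

    def meets_boundary_markers(self) -> bool:
        """Assumption: treat velocity plus exhaustivity as the deciding pair."""
        return self.velocity and self.exhaustivity


# Illustrative profiles only; real datasets rarely reduce to clean booleans.
survey_sample = TraitProfile(
    volume=False, velocity=False, variety=True, exhaustivity=False,
    resolution_indexicality=False, relationality=True,
    extensibility_scalability=False,
)
sensor_stream = TraitProfile(
    volume=True, velocity=True, variety=False, exhaustivity=True,
    resolution_indexicality=True, relationality=True,
    extensibility_scalability=True,
)
```

A boolean per trait is the simplest possible encoding. The table describes ranges ("limited to large", "weak to strong"), so an ordinal scale may fit real datasets better.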

14 of 47

(Bigger) Data in a (Harvard) Library Context


Characteristic              | Examples
----------------------------|--------------------------------------------------------------------------------------------------
Volume                      | Digitized & digital collections; Born digital collections; Electronic records; Scientific research data
Exhaustivity                | Scientific research data
Resolution & Indexicality   | Scientific research data
Relationality               | Collections as Data; Research data
Velocity                    | Research data
Variety                     | Born digital collections; Digitized collections; Metadata; Research data
Extensibility & Scalability | Born digital collections; Collections as Data; Research data

15 of 47

Term: Assemblage

Non-hierarchical arrangement of components characterized by extrinsic properties and relations


16 of 47

An assemblage is an arrangement of heterogeneous, autonomous components whose extrinsic properties and interactions give rise to a unique, enduring individual.

(Boyd, 2018, p. 5)


17 of 47

[T]he assemblage permits the researcher to speak of emergence, heterogeneity, the decentred and the ephemeral in nonetheless ordered social life.

(Marcus & Saka, 2006, p. 101)


18 of 47

Assemblage

Definition

  • A real, non-hierarchical, nested arrangement or system of heterogeneous, autonomous components defined by exterior relations

Origins

  • Social theory, political philosophy, realist tradition
  • Assemblage Theory - Deleuze & Guattari (1980)
  • Neo-Assemblage Theory - DeLanda (2006, 2016)

Purpose & Use

  • Describe micro- and macro-aspects of social reality
  • Widespread use in the social sciences, CDS, and literature; very limited use in LIS

Related Concepts & Theories

  • Rhizome, Network, Actor-network theory (ANT)


19 of 47

Assemblage Characteristics

Assemblages are...

  • Real
  • Heterogeneous
  • Non-hierarchical
  • Tri-scale (3): spatial, temporal, nesting
  • Dynamic
  • Exhibit emergent behavior
  • Composable
  • Machine-like
  • Members of populations
  • Responsive to internal & external events

Assemblages have...

  • Components
  • Component interactions & relations
  • Capacities - respond to events
  • Processes - cohesion, dissolution
  • Properties
  • Possibility space
  • Roles (3) - material, expressive, hybrid
  • Scope


20 of 47

Examples of Real-World Assemblages


21 of 47


22 of 47

Sesay, A., Oh, O.-O., & Ramirez, R. (2016). Understanding Sociomateriality through the Lens of Assemblage Theory: Examples from Police Body-Worn Cameras.


23 of 47

Diagram of an assemblage that emphasizes its machine-like, interlocking qualities

(Boyd, 2018)


24 of 47

Term: Data assemblage

Sociotechnical infrastructure concerned with data


25 of 47

Data Assemblage

Definition

  • A sociotechnical infrastructure concerned with data

Origins

  • Critical Data Studies (CDS), circa 2014

Purpose & Use

  • Examine social dimensions (e.g. power relations, cultures) associated with sociotechnical infrastructures
  • Critical data studies, Critical algorithm studies

Related Concepts & Theories

  • Assemblage, Network, Platform (platform studies), Actor-network theory (ANT), Infrastructure theory (e.g. Star & Ruhleder, 1996)


26 of 47

Data Assemblage Characteristics

Data assemblages (are)…

  • Embody and encode standards
  • Encode knowledge
  • Invisibly support use
  • Reflect systems of thought (e.g. dataism, capitalism)
  • Sites of data practice & culture
  • Sources and enablers of (big) data
  • Embedded in other infrastructures (e.g. technical, policy, financial)
  • Are learned as a part of membership

Data assemblages have…

  • Components that become visible upon systems breakdown
  • Intersections with social, economic, and governmental entities


27 of 47


28 of 47


29 of 47

Here’s How #Ferguson Exploded on Twitter Last Night (Oh, 2014) https://www.motherjones.com/politics/2014/11/ferguson-twitter-map


30 of 47

Amazon Doesn’t Consider the Race of Its Customers. Should It?

(Ingold & Soper, 2016)

https://www.bloomberg.com/graphics/2016-amazon-same-day/


31 of 47

Research

Build infrastructure to systematically describe, analyze & compare data assemblages


32 of 47


Me

CDS

Analyze & Compare

Data assemblages

33 of 47

Overview of Research Direction

Research Questions

  1. How can we systematically describe, analyze, and compare assemblage phenomena?
  2. How can assemblage theory be made more accessible and useful to LIS researchers?
  3. How can we support empirical studies involving data assemblages?
  4. How can we design better data assemblages?

Approach

  1. Develop domain ontology & glossary for assemblage theory
  2. Develop propositions & conceptual framework for assemblage theory
  3. Extend infrastructure to include data assemblages & conduct empirical studies
  4. Use infrastructure to inform system design and data modeling (MVC → OVC)


34 of 47

Knowledge Organization Systems - Ontology is most complex

Pieterse, V., & Kourie, D. G. (2014)


Q: How can we systematically describe, analyze, and compare assemblage phenomena?

A: Build an ontology & glossary

35 of 47

A. Workflow to Develop Domain Ontology & Glossary


Statistics                             | #
---------------------------------------|----
Total concepts                         | 71
Top-level concepts                     | 8
Descendants for each top-level concept | 36
  assemblage                           | 6
  capacity                             | 3
  event                                | 0
  mechanism                            | 1
  population                           | 1
  process                              | 5
  property                             | 20
  virtual diagram                      | 0
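
The per-concept descendant counts listed above can be tallied with a short sketch. The counts are copied from the slide; the variable names are illustrative, and reading the figure of 36 as the sum of the per-concept counts is an interpretation rather than something the slide states.

```python
# Descendant counts per top-level concept, copied from the statistics above.
descendants = {
    "assemblage": 6,
    "capacity": 3,
    "event": 0,
    "mechanism": 1,
    "population": 1,
    "process": 5,
    "property": 20,
    "virtual diagram": 0,
}

# Number of top-level concepts and the combined descendant count.
top_level_count = len(descendants)
total_descendants = sum(descendants.values())

# Top-level concepts that have no descendants in the ontology.
childless = sorted(name for name, count in descendants.items() if count == 0)
```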

36 of 47

Visualization of Assemblage Theory Domain Ontology


37 of 47

Domain Ontology & Glossary

  • Domain Ontology
    • OWL File (XML) & Class documentation (PDF)
    • https://bit.ly/2Fksu3U

  • Glossary
    • Textual Glossary (PDF)
      • https://bit.ly/2H1TPtV


38 of 47

B. Workflow to Develop Assemblage Theory Conceptual Framework


Q: How can assemblage theory be made more accessible and useful to LIS researchers? How can we support empirical studies involving data assemblages?

A: Build a conceptual framework

39 of 47

B. Assemblage Theory Conceptual Framework Summary

8 Theory Propositions

  1. Real
  2. Heterogeneous
  3. Non-hierarchical
  4. Tri-scale
  5. Dynamic
  6. Has possibility space
  7. Belongs to population
  8. Composable

12 Concepts

  1. Assemblage
  2. Capacity
  3. Component
  4. Event
  5. Mechanism
  6. Population
  7. Process
  8. Property
  9. Role
  10. Scale
  11. Scope
  12. Virtual diagram (possibility space)

3 Relationships

  1. Is-a
  2. Has
  3. Member-of

3 Facets

  1. Causality
  2. Probability
  3. Structure
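
One way to picture how the concepts and relationships listed above fit together is as a small set of subject-relation-object statements. The sketch below is only an illustration: the `Statement` type, the helper functions, and the rule that every object must be one of the 12 concepts are assumptions, and the example statements paraphrase the "Assemblages are..." and "Assemblages have..." slide rather than the published ontology.

```python
from typing import NamedTuple

# The 12 concepts and 3 relationships named in the framework summary.
CONCEPTS = {
    "assemblage", "capacity", "component", "event", "mechanism", "population",
    "process", "property", "role", "scale", "scope", "virtual diagram",
}
RELATIONSHIPS = {"is-a", "has", "member-of"}


class Statement(NamedTuple):
    """Hypothetical subject-relation-object triple."""

    subject: str
    relation: str
    obj: str


def make_statement(subject: str, relation: str, obj: str) -> Statement:
    """Build a statement, rejecting unknown relationships or target concepts."""
    if relation not in RELATIONSHIPS:
        raise ValueError(f"unknown relationship: {relation!r}")
    if obj not in CONCEPTS:
        raise ValueError(f"unknown concept: {obj!r}")
    return Statement(subject, relation, obj)


# Example statements paraphrased from the characteristics slide.
statements = [
    make_statement("assemblage", "has", "component"),
    make_statement("assemblage", "has", "capacity"),
    make_statement("assemblage", "has", "property"),
    make_statement("assemblage", "member-of", "population"),
    make_statement("material role", "is-a", "role"),
]


def objects_of(subject: str, relation: str) -> list:
    """Return the sorted objects linked to a subject by a given relationship."""
    return sorted(
        s.obj for s in statements
        if s.subject == subject and s.relation == relation
    )
```

For example, `objects_of("assemblage", "has")` lists the concepts an assemblage is stated to have, while a statement using a relationship outside the three listed raises a `ValueError`.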


40 of 47

Assemblage Theory Conceptual Framework Diagram


[Diagram labels: Assemblage; facets Probability, Causality, Structure; three Virtual Diagram nodes; concept groups:]

  • Event, Mechanism, Process, Virtual diagram*
  • Capacity, Population, Virtual diagram*
  • Assemblage, Component, Property, Role, Scale, Scope, Virtual diagram*

41 of 47


42 of 47

Overview of Research Direction - Redux

Research Questions

  • How can we systematically describe, analyze, and compare assemblage phenomena?
  • How can assemblage theory be made more accessible and useful to LIS researchers?
  • How can we support empirical studies involving data assemblages?
  • How can we design better data assemblages?

Approach

  • Develop domain ontology & glossary for assemblage theory
  • Develop propositions & conceptual framework for assemblage theory
  • Extend infrastructure to include data assemblages & conduct empirical studies
  • Use infrastructure to inform system design and data modeling (MVC → OVC)


43 of 47

Summary


44 of 47

Why Assemblage Theory & (Data) Assemblages?

  • The assemblage concept is dense and complex, but powerful
  • Assemblage-like phenomena are ubiquitous, relevant to many disciplines & practices
  • Assemblage Theory can be paired with other theories (e.g. LIS: information behavior) to enrich research and practice
  • Critical Data Studies
    • Opportunities for critical engagement with data within LIS and RDM
      • Ex. Library as site of data practice; Data literacy; Data justice & advocacy
      • Ex. Navigation, use, and design of RDM assemblages


45 of 47

Questions?

Thank you.

ceilyn_boyd@harvard.edu

ceilyn.boyd@simmons.edu


46 of 47

References

Borgman, C. (2015). Big data, little data, no data: scholarship in the networked world. Cambridge, Massachusetts: MIT Press.

boyd, danah, & Crawford, K. (2012). Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society, 15(5), 662–679.

Boyd, C. M. (2018). A Prototype Domain Ontology for Neo-Assemblage Theory. Manuscript in preparation.

Dalton, C., & Thatcher, J. (2014). What does a critical data studies look like, and why do we care?

DeLanda, M. (2006a). A new philosophy of society: assemblage theory and social complexity. London: Continuum.

DeLanda, M. (2006b). Deleuzian Social Ontology and Assemblage Theory. In M. Fuglsang & B. Meier Sorensen (Eds.), Deleuze and the Social. Edinburgh University Press.

DeLanda, M. (2016). Assemblage theory. Edinburgh: Edinburgh University Press.

Ford, H. (2014). Big Data and Small: Collaborations between ethnographers and data scientists. Big Data & Society, 1(2).

Kitchin, R. (2014). The data revolution: big data, open data, data infrastructures & their consequences. Los Angeles, California: SAGE Publications.

Kitchin, R., & McArdle, G. (2016). What makes big data, big data? Exploring the ontological characteristics of 26 datasets. Big Data & Society, 1–6.

Marcus, G. E., & Saka, E. (2006). Assemblage. Theory, Culture & Society, 23(2–3), 101–106.

Oliphant, T. (2017). A case for critical data studies in library and information studies. Journal of Critical Library and Information Science Studies, 1(1).

Welles, B. F. (2014). On minorities and outliers: The case for making Big Data small. Big Data & Society, 1(1).


47 of 47

Credits

  • Presentation template by SlidesCarnival
  • Photographs by Unsplash, Pexels
  • Icons by:
