2017 Digital Humanities Infrastructure Symposium



9:00-9:30: Humanistic Infrastructure

Presentations (Thursday, Feb. 23)

9:30-10:15: DIY DH: Learning about DH by designing and launching our own research projects

9:30-10:15: Designing DResSUP4Lib

9:30-10:15: ePortfolios for Promotion and Tenure

10:30-11:00: Expanding the horizons of participation: Participatory infrastructure as training grounds for 21st century citizens

10:30-11:00: DH Class as Infrastructure: The Digital Watts Project

11:15-12:00: Works in Progress (see WiP list below)

12:00-1:00: Lunch

1:00-2:00: Every Day Should Be a “Thank Your Sys Admin” Day

1:00-2:00: Building a Coalition for Digital Scholarship on Campus

1:00-2:00: Supercomputing meets cultural heritage: supporting photogrammetry at scale

1:00-2:00: OCR: Precision and Scale

1:00-2:00: SD|DH: A Regional Digital Infrastructure

2:15-3:00: HaCCS Lab as model for unfunded research lab

2:15-3:00: The Making of the Digital Art History 101 Website

2:15-3:00: Text Mining and Topic Modeling Historical Newspapers

3:00-4:00: Open Discussion on Digital Sustainability Best Practices

4:00-4:30: Wrap up

5:00-6:00: Reception

Works in Progress (Thursday, Feb. 23)

11:15-12:00: Public Media Displays and Rigorous Digital Scholarly Content: Connecting Scalar, Tensor, and Global Crossroads

11:15-12:00: New Ways to Engage Museum and Library Patrons with Cultural Heritage Collections

11:15-12:00: Building Processual Infrastructures for an Oral Life Histories-based Digital Humanities Research Collaboration

11:15-12:00: Social Paper

11:15-12:00: Tools for Text Data Mining   

Workshops (Friday, Feb. 24)

9:00-11:00: Getting Started: Digital Project Development Environments, RC Classroom, YRL Research Commons

9:00-11:00: An Introduction to DH Project Planning, CDH Learning Lab @ Rolfe

11:30-1:30: Introduction to Git and GitHub, RC Classroom, YRL Research Commons

11:30-1:30: How to Visualize Data with Tableau, West Electronic Classroom, YRL

2:00-5:00: Testing DH Code, RC Classroom, YRL Research Commons

2:00-5:00: An Introduction to Text Analysis, CDH Learning Lab @ Rolfe


9:00-9:30: Humanistic Infrastructure

Description: We have seen a growth in scholarship on infrastructure, research infrastructure and knowledge infrastructures. This type of work, however, rarely engages strongly with the imagining and building of new infrastructures for the humanities. There is also a strong interest in infrastructure institutionally in the academy and other organizations, and ‘infrastructure speak’, as articulated in policy documents and plans, tends to give strong agency to technology (rather than research questions and educational goals), impose haste and an ‘approximate future’, include little explicit critical inflection and few critical categories, and see infrastructure as a generic tool rather than as an epistemic-critical machinery.


In this talk I will suggest that if infrastructure embeds ideas, values and institutional goals, we need to consider those perspectives when planning and building new infrastructures. This has several consequences. First, we must attend to infrastructure critically and in terms of making at the same time. Second, humanities infrastructure cannot be imagined without considering the state and imagined future of the humanities at large and the institutions we represent. Third, humanities infrastructure must relate to scholarship, intellectual agendas and societal action. Finally, I argue that infrastructural thinking and making provides an opportunity for the humanities, but also a frame and template we need to be critical about. I will draw on work inside and outside digital humanities that challenges traditional notions of infrastructure.


The talk will be supported throughout by visual materials and concrete examples, and will incorporate suggestions for building infrastructure. A list of references and sources will be posted on http://patriksv.net before the event.

Patrik Svensson (http://patriksv.com/) is Visiting Professor of Digital Humanities at UCLA, Professor of Humanities and Information Technology at Umeå University, and former Director of HUMlab at Umeå University (2000-2014). Recent publications include Big Digital Humanities: Imagining a Meeting Place for the Humanities and the Digital (University of Michigan Press, 2016), “The Why and How of Middleware” (with Johanna Drucker, Digital Humanities Quarterly, 2016), “‘One Damn Slide After Another’: PowerPoint at Every Occasion for Speech” (with Erica Robles-Anderson, Computational Culture, 2016) and “Contemporary and Future Spaces for Media Studies and Digital Humanities” (in Sayers, Jentery (ed.), Routledge Companion to Media Studies and Digital Humanities, forthcoming).

Presentations (Thursday, Feb. 23)

9:30-10:15: DIY DH: Learning about DH by designing and launching our own research projects

Description: At the Claremont Colleges Library, librarians and staff from across divisions are learning more about Digital Humanities by directly engaging in their own DH research projects through a series of planned programs and designated project time. Inspired by the changing nature of scholarship and a desire to best support the DH work of students and faculty, this project-based initiative has not only helped librarians and staff understand DH concepts more broadly, but has also contributed to deeper engagement with the highs and lows of DH research through first-hand experience.

The programs offered included a five-week “Introduction to DH” short course in which cross-divisional teams of librarians and staff created DH project proposals that were then built out during the first Library DH Maker Week in July 2016. Once the projects began, the teams spent a significant amount of time reworking the project scope and managing expectations while still staying within the parameters of the originally defined project. Each of the teams identified different ways of handling the need to keep within personnel, tool, and data limits, such as narrowing the scope of deliverables or increasing personnel involved. One team specifically began with a significant goal and needed to rework the final objective as difficulties arose in the collection and analysis of the data. Other teams are now in a position to bring in more resources to their project in order to fulfill the initial design goals.

This presentation will focus on the Introduction to DH short course, the Maker Week, and how each DH team managed their project. The next iteration of project development focuses on bringing librarians, students, staff, and faculty together to either bring the original project idea to fruition or explore further avenues of research, analysis, and presentation.

Lydia Bello, STEM Team Librarian, Claremont Colleges Library

Madelynn Dickerson, Information Resources Coordinator, Claremont Colleges Library

Ashley Sanders, Ph.D., Director of the Claremont Colleges Digital Research Studio

9:30-10:15: Designing DResSUP4Lib

Description: Our long presentation will describe DResSUP (Digital Research StartUp Partnerships), an incubator for graduate student projects within the UCLA Library over the past two years, and outline our plans for the next phase, “DResSUP for Librarians.” This second phase will seek to develop digital research skills among UCLA librarians as well as graduate students through collaborative project-based partnerships.

The main goal of the program has been to provide graduate students with skills and methods to pursue digital projects on their own. Rather than doing projects for them, we focus instead on teaching students to start with a small sample of their own dataset, working it through the research data lifecycle: collecting, cleaning, and analyzing/visualizing data, supplemented by a workshop on project management and the specific tools that are applicable to the students’ research projects. In the second half of a typical program, we focus on the execution of their individual projects, building prototypes and adjusting workflows that allow them to complete their projects independently. The program has been very successful. What surprised us most, however, was the curiosity and enthusiasm of our librarian colleagues.

Building on this enthusiasm, we have plans to expand our capacity by adding a second track for training librarians. We recently applied for and received external funding to build out the infrastructure for this training program, which we will launch in 2017. This innovative program is designed to integrate skills and methods training for the librarians with project-based learning. In Summer 2017 the two threads of DResSUP will converge with a librarian team working on their digital project alongside graduate student projects.

Zoe Borovsky, Librarian for Digital Research and Scholarship, UCLA Library

Peter Broadwell, Academic Project Developer, UCLA Library

Dawn Childress, Librarian for Digital Collections and Scholarship, UCLA Library

Claudia Hornig, Metadata Team, UCLA Library

Andy Rutkowski, Geospatial Librarian, UCLA Library

9:30-10:15: ePortfolios for Promotion and Tenure

Description: Digital projects and online publishing are now encouraged in many academic disciplines, especially in the digital humanities, and can count towards earning tenure just like print publications. Yet a significant barrier to digital materials being judged with the same weight as those that appear in print is that most institutions continue to evaluate tenure through paper dossiers, where media-rich digital works cannot be adequately rendered or interactively navigated. Consequently, there is a need for an online evaluation platform where the depth and innovation of digital work can be viewed electronically. Chapman ePortfolios, a web-based platform built on WordPress, addresses the issue of digital scholarly evaluation by providing candidates with a space to embed and/or link to their multimodal work, in addition to allowing candidates to attach or link to digital copies of their print publications.

Jana Remy, Associate Director of Digital Scholarship, Chapman University

10:30-11:00: DH Class as Infrastructure: The Digital Watts Project  

Description: Digital Watts is a project that was initiated by The Digital Watts Project (DWP), a Loyola Marymount University (LMU) digital humanities course taught in summer 2016 by Dr. Dermot Ryan, professor of English, and Melanie Hubbard, Digital Scholarship Librarian. In the course, students used digital tools to examine literature and rhetoric related to the 1965 “Watts Uprising.” For their final project, students worked with rare Watts materials from the Southern California Library (SCL), a community archive in South Los Angeles. Through digitization and the creation of metadata, students helped the SCL launch its digital library presence as well as Digital Watts, an LMU Omeka-based project. The DWP demonstrates ways in which a digital humanities class can involve civic engagement as well as how the classroom environment can provide the infrastructure necessary to launch a digital project, a particularly important point for those at smaller institutions like LMU that struggle to get DH projects off the ground.

The most challenging aspect of Digital Watts was the labor-intensive metadata process: creating the metadata was time-consuming and raised some difficult questions. How does one describe the materials so that they are more discoverable by people with underrepresented perspectives? Was the Watts event a “riot” or an “uprising,” or something else entirely? Through critical use of digital tools, literary/rhetoric studies, and readings and discussions about issues of representation, students gained the base knowledge needed to approach the metadata creation. Consequently, the classroom not only provided the contributors (students) necessary to get the project off the ground; it provided an environment in which the students could gain the skills and knowledge necessary to participate.

Melanie Hubbard, Digital Scholarship Librarian, Loyola Marymount University

Dermot Ryan, Professor of English, Loyola Marymount University

10:30-11:00: Expanding the horizons of participation: Participatory infrastructure as training grounds for 21st century citizens

Description: In the past decade, scholars and educators working within the fields of digital humanities and digital learning have praised the adoption of participatory culture in scholarship and pedagogy as a means of updating these activities to 21st century practices. Roughly speaking, participatory methods can be defined as engagement in the peer production of knowledge, culture, and information using networked information technologies in public or private spaces with known and/or unknown collaborators. These methods have been praised by advocates such as Cathy Davidson, David Theo Goldberg, Henry Jenkins, and Howard Rheingold for the ways that they democratize knowledge production, increase public visibility of humanistic research activity and knowledge, and enable students and researchers to practice new forms of collaboration and public engagement. Notable examples of these practices include classroom contributions to Wikipedia, Kathleen Fitzpatrick’s use of CommentPress to crowd-source peer review, Hybrid Pedagogy’s monthly #digPed Twitter chats, academic-oriented groups on Facebook, and everyday informal sharing of academic resources and ideas on social media platforms.


However, despite the very important ways the adoption of participatory culture has advanced practices of research and pedagogy, growing concern regarding the political stakes of our networked information environments has pointed to the need for a more critical engagement with the infrastructure that makes participatory culture possible. This critical engagement, however, is extremely difficult to foster given that a majority of popular networked information technologies prohibit users from collectively or individually investigating the mechanisms that mediate their communicative activity as well as from modifying those mechanisms to better suit their needs and values as scholars, students, and citizens. In this talk, I will argue that the university has the opportunity to intervene in this situation by providing participatory infrastructure to its students as a site both for hosting participatory learning and scholarship and for enabling student investigation into the ways in which infrastructure mediates these activities. I will then suggest one way in which institutions might work together to provide the infrastructure, organization, and outreach necessary for supporting this type of endeavor, and describe the early steps we are taking in this direction with UC San Diego, the San Diego Community College District, and the San Diego Digital Humanities network using the free and open source software package Commons in a Box.

Erin Rose Glass, Associate Director, Center for the Humanities, UCSD

1:00-2:00: Every Day Should Be a “Thank Your Sys Admin” Day

Description: Harold Shin will give an overview of CDH’s experience meeting the demand for the hosting of DH projects and resources at UCLA from the perspective of the systems administrator. He’ll discuss lessons learned and things to be aware of.

Harold Shin, Chief Technology Officer, UCLA Center for Digital Humanities

1:00-2:00: Building a Coalition for Digital Scholarship on Campus

Description: The University Library at UC Santa Cruz opened the Digital Scholarship Commons (DSC) in January 2016. This new space includes a flexible work area, lounge, meeting space, and a lab with scanners, high-end computers, and software open to the entire campus community. However, the DSC is not only a space: it is also the home of a growing DH program and set of digital scholarship services that provide students, faculty and staff with opportunities to engage with digital media, project incubation, and innovative scholarship. Moreover, the DSC project reflects the successful growth of a dynamic and engaged community of users, sustained programming of speakers and workshops, and a substantial investment by the library and the Humanities division in public and digital scholarship.

This presentation will reflect on the DSC’s first year of operation by detailing the coalition of campus divisions and leaders that brought this project to fruition and the programmatic aspects that define its mission and purpose for the campus. Rachel Deblinger, Director of the Digital Scholarship Commons, will present the history of the project, including the choices made in furnishing and outfitting the space, and introduce the work of the DSC. In particular, Deblinger will discuss DSC efforts around undergraduate learning, focusing on a new Undergraduate Digital Research Fellowship and support for in-class digital assignments. These two initiatives provide insight into the infrastructure of the DSC as both are enabled by student labor: a Graduate Student Researcher and two undergraduate student assistants.

Rachel Deblinger, Director, Digital Scholarship Commons, UC Santa Cruz

1:00-2:00: Supercomputing meets cultural heritage: supporting photogrammetry at scale

Description: The use of photogrammetry software that can convert a series of partially-overlapping 2D images into a 3D model is increasing among archaeologists and scholars in related fields. This software is moderately expensive (in the range of $500 for a license), and very resource-intensive, requiring, at the minimum, a laptop designed for gaming. Because of the resource requirements for reasonable performance, many scholars rely on high-powered computers in shared computer labs. However, for extremely large or detailed models, the process can monopolize one of these computers for multiple days.

Leveraging a high-performance compute cluster can significantly speed up 3D model development, reducing the processing time from days to hours. A cluster with a mix of GPU (graphics processing unit) nodes for the “build dense mesh” stage, and standard nodes for the other stages, is particularly effective. However, most clusters only provide a command-line interface, which is unfamiliar and off-putting to many scholars in the humanities. UC Berkeley’s Research IT group is making use of new Python integration capabilities in the latest version of Photoscan in order to develop a Jupyter notebook that researchers can use to interact with Photoscan on an HPC cluster, and move their data to and from the cluster via the Box cloud storage service. The notebook juxtaposes human-readable documentation and executable code, and saves intermediate outcomes from the processing steps to allow the user to understand how the processing is proceeding, and determine whether manual intervention is needed before the next processing step. The Jupyter notebook is also paired with a virtual research desktop where researchers can access the Photoscan user interface and manually edit their model, even using a laptop that lacks the memory or hard drive space to run Photoscan locally. This talk will demo the workflow developed for Berkeley researchers, and describe how aspects can be replicated using infrastructure available at other institutions or through national centers.

Maurice Manning, Cyberinfrastructure Engineer, Research IT, UC Berkeley

1:00-2:00: OCR: Precision and Scale

Description: Converting scanned images of text into text that can be edited, analyzed, transformed, and displayed is a prerequisite step for a great deal of scholarship in the humanities and social sciences. Optical character recognition (OCR) software designed for business applications performs very well across numerous languages, and includes user-friendly interfaces for training the OCR on the nuances of a particular text, and making corrections to ensure very high accuracy. At around $100, desktop commercial OCR software is not prohibitively expensive for researchers undertaking a long-term project, but hard to justify for occasional use. Free, open source OCR software exists, but it still has difficulties with complex layout, and the command line is required for accessing the full functionality of one of the major packages. Compared to commercial software, the open source tool is slow and resource-intensive, but given the expensive per-page cost structure for running commercial software on a compute cluster, it is the only viable option for processing corpora of thousands of pages.

Research IT at UC Berkeley has developed two service offerings to support the range of OCR needs and requirements for humanities and social science researchers. One supports precision OCR for small data sets, through a virtual research desktop (built upon the Citrix infrastructure already supported for administrative use cases) with the commercial OCR package ABBYY FineReader. Researchers who have large corpora and are less concerned about extreme precision can use the open source Tesseract tool, which is provided in a container on Berkeley’s HPC cluster, where all faculty have a free annual compute allowance. This talk will discuss pros and cons of each option, and how consultants match researchers with services and support their use of the systems.

Quinn Dombrowski, Digital Humanities Coordinator, Research IT, UC Berkeley

1:00-2:00: SD|DH: A Regional Digital Infrastructure

Description: We have been building DH infrastructure in San Diego for the past three years by working across institutional lines and by focusing on people in a specific place whose passion brings them together. Last year, we (Pressman, Hijar, Giles-Watson) addressed this conference to share our aspirations and plans for our initiative. This year we come to share our accomplishments and testify to how our aspirational plans are working.

We are proud to lay claim to the idea that we are building DH infrastructure by defining “infrastructure” differently. Rather than building from and around funding sources, tools, and siloed experts, we work from a strategic view that infrastructure is about people, place, and passion: the resources of faculty and students, their research questions, and their energy. If we can stimulate and harness the natural energy of people, we can build, and indeed have built, a networked infrastructure from the ground up. If we understand DH to be not about tools or a tool-centric focus, but about connecting people via tools and their exploration, then DH is about the Humanities: it (re-)centers the human and emphasizes the humane. That is what we’re after.

Jessica Pressman, Asst. Professor, English and Comparative Literature, SDSU

Pamela Lach, Digital Humanities Librarian, SDSU

Katherine Hijar, Asst. Professor, History, CSU San Marcos

Maura Giles-Watson, Asst. Professor, English, University of San Diego

2:15-3:00: HaCCS Lab as model for unfunded research lab

Description: The Humanities and Critical Code Studies (HaCCS) Lab (http://haccslab.com) is an unfunded research group based at the University of Southern California that sponsors talks, organizes working groups, and networks researchers. Rather than a physical space, the HaCCS Lab is a conceptual space which leverages affiliate titles and publication opportunities to extend its network and encourage participation. It maintains an online presence (via social media and a website) and an in-person presence via co-sponsored events and panel presentations. It also interoperates with other more established (and funded) networks such as HASTAC, SLSA, and ELO. In other words, it makes use of existing and collectively shared resources.

To date, the lab’s most significant achievement is the sponsorship and coordination of a biennial online conference, the Critical Code Studies Working Group, which brings together over 100 participants (from students to established scholars) to explore digital culture through computer source code. The Working Group generates a bibliography, extensive discussion threads, and opportunities for future collaboration. Those discussion threads have led to books, articles, talks, and conferences.

In this presentation, alone or together with several research affiliates, I will present some of the HaCCS Lab’s strategies for fostering research in this nascent field and offer the lab as a case study for similar initiatives. As director of the HaCCS Lab, I will address the question: what are the real products of a virtual lab?

Mark Marino, Associate Professor (Teaching), University of Southern California

2:15-3:00: The Making of the Digital Art History 101 Website

Description: After hosting two summer institutes on Digital Art History at UCLA, we felt that the art historical community at large might benefit from a website that organized, edited, and arranged our set of resources for wider use. We thus built Digital Art History 101 (http://dah101.humanities.ucla.edu/), an online textbook and resource-aggregator to house these materials and allow for outside contributions.

In implementing the DAH101 website, we prioritized stability, longevity, and community contribution. We built the site using Jekyll, a static website generator, and hosted it on GitHub, a web-based platform for version control and decentralized contribution. Jekyll-based sites are “flat,” meaning content is composed of static HTML, not dynamically retrieved from a database, as is the case with WordPress or Drupal sites. Thus, Jekyll sites are much less likely to suffer from hacking or vandalism, require no updates, and are more stable for preservation purposes.

Jekyll works closely with GitHub hosting. Used by software builders and web developers worldwide, GitHub allows for collaborative coding and is the largest open-source community in the world. This means that the content of Digital Art History 101 can be updated by the community at their interest and convenience, via a simple pull request. In this way, the DAH101 site can be updated with ease to grow as the digital art history community does.

Francesca Albrezzi, Teaching Associate (World Arts and Cultures/Dance), UCLA

2:15-3:00: Text Mining and Topic Modeling Historical Newspapers

Description: My research uses text mining and topic modeling to assess the ways Victorians discussed stained glass and murals in newspapers. In the process of gaining access to OCR files to perform this analysis, I worked closely with USC librarians, who in turn worked closely with Gale Cengage. In this talk I will discuss the infrastructural benefits resulting from a dynamic exchange between researchers, librarians, and outside vendors.


Christopher McGeorge, Andrew W. Mellon Doctoral Fellow in the Digital Humanities, USC

Works in Progress (Thursday, Feb. 23)

11:15-12:00: Public Media Displays and Rigorous Digital Scholarly Content: Connecting Scalar, Tensor, and Global Crossroads


Christopher Gilman, Associate Director, Center for Digital Liberal Arts, Occidental College

11:15-12:00: New Ways to Engage Museum and Library Patrons with Cultural Heritage Collections


Virginia Kuhn, Associate Professor, School of Cinematic Arts, USC

11:15-12:00: Building Processual Infrastructures for an Oral Life Histories-based Digital Humanities Research Collaboration


Sharon Traweek, Associate Professor of Gender Studies, UCLA

11:15-12:00: Social Paper

Description: Social Paper is a writing tool for networking student writing across courses, terms, institutions, and publics. It was developed as a step towards the broader goal of creating metaparticipatory writing tools, that is, tools that allow students not only to practice participatory scholarship, but to participate in the technical development and ethical governance of those tools. This short talk will introduce Social Paper and the tool on which it was built, Commons in a Box.

Erin Glass, Associate Director, Center for the Humanities, UCSD

11:15-12:00: Tools for Text Data Mining


Bret Costain, Director of New Product Strategy and Development, Gale|Cengage Learning

Workshops (Friday, Feb. 24)

Seating for all workshops is limited. Please reserve your space in the workshops of your choice when you register for the symposium. Any remaining space will be allocated on a first-come, first-served basis after the symposium has begun.


Getting Started: Digital Project Development Environments

Where: RC Classroom, YRL Research Commons

Description: In this introductory workshop, we will introduce some of the basic collaborative tools that are commonly used for digital projects. Examples include free and low-cost cloud-based data stores, development and coding environments, and popular software frameworks to stand up a digital web project quickly. No previous programming or development work is required.

Kristian Allen, Digital Library Architect, UCLA Library

~ OR ~

An Introduction to DH Project Planning

Where: CDH Learning Lab @ Rolfe

Description: In this workshop, I will introduce the “best practices” that we use for planning project collaborations at UCLA’s Center for Digital Humanities, with a special focus on working in resource-limited environments. Participants will discuss how expectation management and project management are critical to successfully completing collaborative projects, what critical questions to ask project stakeholders before beginning, and the basics of project management, among other topics. No previous project planning experience is necessary, and any more experienced participants will be encouraged to share their opinions on the challenges that we discuss. If time permits, I will invite participants to workshop their own prospective project plans with the rest of the group.

John A. Lynch, Academic Technology Manager, UCLA Center for Digital Humanities


Introduction to Git and GitHub

Where: RC Classroom, YRL Research Commons

Description: This 2-hour workshop will introduce participants to the Git and GitHub environments. Git is a version control system for tracking changes in files, especially where multiple contributors are working on these files. GitHub is a web-based environment for hosting your Git-versioned files so that you can share, collaborate on, and even publish your code, texts, and other files openly. Many in the DH community use Git/GitHub to write and share code, data sets, and even websites. During the workshop, participants will learn how to use Git to track and version files on their own computers and in GitHub, how to create a GitHub repository and push their files to it, and how to work collaboratively using GitHub features. Attendees should bring a laptop with Git installed (instructions will be sent a week before the workshop). Laptops with Git are available if needed.

Dawn Childress, Librarian for Digital Collections and Scholarship, UCLA Library

~ OR ~

How to Visualize Data with Tableau

Where: West Electronic Classroom, YRL


Nick de Carlo, Research and Instructional Technology Consultant, UCLA Center for Digital Humanities


Testing DH Code

Where: RC Classroom, YRL Research Commons

Description: Testing is one of those things that we know we need, but we don't do because it takes too long and it's only useful for big projects. This is wrong on all counts. This workshop shows how testing helps small projects and saves time on projects of all sizes in the long run. It covers how to design good tests, how to choose the right things to test, and what testing does for DH projects specifically. More importantly, it shows how test-driven development is fun: testing gets us to the fun part of programming faster than development without tests. This workshop will be useful to any programmer, whether you’re writing scripts just to parse data, or you’re working on large applications.

Dave Shepard, Academic Programmer, UCLA Center for Digital Humanities
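
As a rough illustration of the point in the Testing DH Code description that tests pay off even for small data-parsing scripts, here is a minimal sketch in Python. It is not taken from the workshop materials: the `parse_record` function, the `title | author | year` line format, and the sample data are all hypothetical, chosen only to show what a small, focused test can look like.

```python
def parse_record(line):
    """Split a 'title | author | year' line into a dict (hypothetical format)."""
    parts = [part.strip() for part in line.split("|")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
    title, author, year = parts
    return {"title": title, "author": author, "year": int(year)}


def test_well_formed_line():
    # A normal line should come back as a clean dict with an integer year.
    record = parse_record("Middlemarch | George Eliot | 1871")
    assert record == {"title": "Middlemarch", "author": "George Eliot", "year": 1871}


def test_missing_field_is_rejected():
    # A line with too few fields should fail loudly, not produce a bad record.
    try:
        parse_record("Middlemarch | George Eliot")
    except ValueError:
        return
    raise AssertionError("expected ValueError for a line with a missing field")


if __name__ == "__main__":
    test_well_formed_line()
    test_missing_field_is_rejected()
    print("all tests passed")
```

The two tests cover one expected case and one malformed case. Functions named `test_...` like these can also be collected by a test runner such as pytest instead of being called by hand.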

~ OR ~

An Introduction to Text Analysis

Where: CDH Learning Lab @ Rolfe

Description: Computational text analysis can be defined broadly as subjecting texts to algorithmic procedures designed to detect and/or visualize statistical patterns from which we can construct narratives about the style, authorship, or history of the texts examined. This workshop is aimed at beginners hoping to become familiar with some basic text analysis tools and at more experienced text analysis scholars who wish to examine some of the theoretical implications of using text analysis methods. The focus will be on literary text analysis with small to medium-sized text collections, but we will pay some attention to how you would approach other disciplines or work with "big data". The "hands-on" portion of the workshop will focus on the use of the Lexos text analysis tool, which allows the user to experiment with the different workflow steps of pre-processing, statistical analysis, and visualization. The tool can be accessed online through a web browser, or users can install it on their own machines prior to the workshop.

Scott Kleinman, Professor, English, CSU Northridge
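
To make the pre-processing and statistical analysis steps named in the text analysis description a little more concrete, here is a minimal sketch in plain Python. It does not use Lexos and makes no claim about how Lexos works internally. The stopword list, the sample sentence, and the function names `preprocess` and `relative_frequencies` are assumptions made for illustration, and the visualization step is left out.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; real projects use much longer ones.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "it", "was"}


def preprocess(text):
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in STOPWORDS]


def relative_frequencies(tokens):
    """Return each word's share of the total token count."""
    if not tokens:
        return {}
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}


sample = "It was the best of times, it was the worst of times."
tokens = preprocess(sample)
freqs = relative_frequencies(tokens)

# Print words from most to least frequent, ties broken alphabetically.
for word, freq in sorted(freqs.items(), key=lambda item: (-item[1], item[0])):
    print(f"{word:10s} {freq:.2f}")
```

Relative rather than raw counts are used here so that texts of different lengths can be compared, which is one common design choice among several.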