1 of 239

Computational Analysis of Political Texts

Bridging Research Efforts Across Communities

Goran Glavaš

Federico Nanni

Simone Paolo Ponzetto

Data and Web Science Group

University of Mannheim

2 of 239

Hi there!

Goran

Federico

Simone

https://poltexttutorial.wordpress.com/

3 of 239

Political Text

4 of 239

Political Text

  • Language as a medium for politics and political conflicts

From: https://twitter.com/realdonaldtrump/status/821772494864580614

5 of 239

Political Text

  • Language as a medium for politics and political conflicts
  • Much of politics is expressed in words

From: https://www.theguardian.com/politics/2018/jul/09/boris-johnson-his-path-to-resigning-as-foreign-secretary

6 of 239

Political Text

  • Language as a medium for politics and political conflicts
  • Much of politics is expressed in words
  • However, it is still hard to use texts for making inferences about politics.

From: https://www.flickr.com/photos/statephotos/48079686137/

7 of 239

Political Text

  • Too many political texts

From: https://www.bournemouthecho.co.uk/news/national/17504503.students-strike-over-politicians-inaction-on-climate-change/

8 of 239

Political Text

  • Too many political texts
  • Hiring and training annotators is very expensive
    • Complex phenomena
    • Domain-specific language
    • Subjectivity involved

From: https://www.bournemouthecho.co.uk/news/national/17504503.students-strike-over-politicians-inaction-on-climate-change/

9 of 239

Political Text

  • Too many political texts
  • Hiring and training annotators is very expensive
  • Automated analysis seems to be the only way to go

From: https://www.bournemouthecho.co.uk/news/national/17504503.students-strike-over-politicians-inaction-on-climate-change/

10 of 239

Political Text

Modelling complex phenomena:

  • Polarization
  • Tribalism
  • Party Loyalty / Partisanship
  • Euroscepticism
  • Populism
  • Hybrid Warfare

11 of 239

A Tale of Two Communities

Political Science

2003 Wordscores [Laver et al.]

2008 Wordfish [Slapin & Proksch]

2013 Text as Data [Grimmer & Stewart]

2016 The Manifesto Corpus [Merz et al.]

Programming language: R

Libraries: Quanteda, STM, austin, tm, koRpus, kerasR, coreNLP

Natural Language Processing

2005 EuroParl Corpus [Koehn]

2006 Get out the Vote [Thomas et al.]

2010 From Tweets to Polls [O'Connor et al.]

2014 Text scaling in ACL Anthology [Zirn]

Programming language: Python, Java

Libraries: CoreNLP, spaCy, NLTK, TensorFlow, Keras, scikit-learn

12 of 239

A Tale of Two Communities

Current issues in interdisciplinary collaborations [Wallach, 2016]

  1. Lack of understanding of each other’s norms, incentive structures, and goals
  2. The need to publish in high-quality venues in a timely fashion
  3. Publishing interdisciplinary research can be slower than single discipline research
  4. These challenges are not always recognized by tenure and promotion committees

13 of 239

A Tale of Two Communities

Current issues in interdisciplinary collaborations [Wallach, 2016]

  • Lack of understanding of each other’s norms, incentive structures, and goals
  • The need to publish in high-quality venues in a timely fashion
  • Publishing interdisciplinary research can be slower than single discipline research
  • These challenges are not always recognized by tenure and promotion committees

Overall goal: training new generations of computational social scientists

14 of 239

Towards a Collaborative Future

From: https://textasdata.github.io//

15 of 239

Towards a Collaborative Future

  • New Directions in Analyzing Text as Data
    • (Mostly) US-based interdisciplinary conference
    • Since 2012; next edition at Stanford (October 2019)
    • https://www.textasdata2019.net/

From: https://textasdata.github.io//

16 of 239

Towards a Collaborative Future

  • New Directions in Analyzing Text as Data
  • NLP+CSS
    • Workshop co-located with ACL events
    • Since 2015, last edition @NAACL 2019
    • https://sites.google.com/site/nlpandcss/

From: https://textasdata.github.io//

17 of 239

Towards a Collaborative Future

  • New Directions in Analyzing Text as Data
  • NLP+CSS
  • PolText
    • Interdisciplinary symposium
    • First edition 2016, next edition Tokyo (Sept ‘19)
    • https://www.poltextconference.org/

From: https://textasdata.github.io//

18 of 239

Towards a Collaborative Future

  • New Directions in Analyzing Text as Data
  • NLP+CSS
  • PolText
  • ParlaCLARIN
    • Workshop on Curating Parliamentary Proc.
    • @LREC 2018
    • https://www.clarin.eu/blog/clarin-parlaformat-workshop

From: https://textasdata.github.io//

19 of 239

Towards a Collaborative Future

The goals of this tutorial:

  1. Systematize and analyze the body of research work from both communities
  2. Provide a gentle, all-round introduction to research questions, methods and tasks
  3. Find a common language across research fields
  4. Expand the interdisciplinary community outside its natural environment
  5. Prepare PIs and PhD students for the exciting challenges ahead

20 of 239

Agenda

Texts

RQs

Tasks

Topic Detection

Positioning

Scaling

Multilinguality / CL Transfer

21 of 239

Agenda

Texts

RQs

Tasks

Topic Detection

Positioning

Scaling

Multilinguality / CL Transfer

22 of 239

Text as Data

23 of 239

Text as Data - Parliament Debates

General view of the European Parliament during a plenary session in Strasbourg, eastern France, Wednesday, March 13, 2019. (AP Photo)

24 of 239

Text as Data - EuroParl Corpus

EuroParl Corpus (first release 2001, most recent: 2012)

21 European languages (1996–2011 or 2007–2011, depending on the country)

Corpus of tokenized sentences aligned across languages.

Used for:

  • Machine translation
  • Word sense disambiguation
  • Cross-lingual learning

Available at: https://www.statmt.org/europarl/

25 of 239

Text as Data - EuroParl Corpus

Learning phrase representations using RNN encoder-decoder for statistical machine translation [Cho et al., 2014]

KenLM: Faster and Smaller Language Model Queries [Heafield, 2011]

Normalized (pointwise) mutual information in collocation extraction [Bouma, 2009]

PPDB: The Paraphrase Database [Ganitkevitch, 2013]

Learning bilingual lexicons from monolingual corpora [Haghighi et al., 2008]

26 of 239

Text as Data - Linked EP

Plenary debates of the EP as Linked Open Data [Van Aggelen et al. 2016]

All plenary debates between 1999 and 2017 with links to GeoNames and DBpedia.

Available at: https://linkedpolitics.project.cwi.nl/web/html/home.html

Access to the data:

  1. Through HTTP-resolvable URIs
  2. Through full-text search
  3. Through a SPARQL endpoint
  4. Using the browse and search options of ClioPatria
  5. By downloading the data in Turtle format (2.5 GB, gzipped tar file)
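
A minimal sketch of option 3, using the SPARQLWrapper Python library. The endpoint URL below is an assumption (check the project page for the actual address), and the generic triple pattern deliberately avoids committing to a specific schema:

```python
# Hedged sketch: query the LinkedPolitics SPARQL endpoint for a few triples.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://linkedpolitics.project.cwi.nl/sparql"  # assumed endpoint URL

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery("""
    SELECT ?s ?p ?o
    WHERE { ?s ?p ?o }
    LIMIT 10
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for row in results["results"]["bindings"]:
    print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
```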

27 of 239

Text as Data - Linked EP

Plenary debates of the EP as Linked Open Data [Van Aggelen et al. 2016]

All plenary debates between 1999 and 2017 with links to GeoNames and DBpedia.

Available at: https://linkedpolitics.project.cwi.nl/web/html/home.html

Issues:

  • Hard to access for a non-expert
  • Pre-cleaning/filtering of the speeches is not fully transparent

28 of 239

Text as Data - Scraping the EP website

http://www.europarl.europa.eu

Pros:

  • Control over the selection process

  • Control over the metadata

Cons:

  • Not straightforward
  • You need to know "How to Crawl the Web Politely"
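
A minimal sketch of what polite crawling amounts to in practice: check robots.txt, identify yourself, and rate-limit requests. The page path below is hypothetical; adapt it to the actual structure of the EP website:

```python
# Polite crawling sketch: robots.txt check, self-identification, rate limiting.
import time
import urllib.robotparser

import requests

BASE = "http://www.europarl.europa.eu"
USER_AGENT = "poltext-tutorial-bot/0.1 (research crawler; contact@example.org)"

robots = urllib.robotparser.RobotFileParser()
robots.set_url(BASE + "/robots.txt")
robots.read()

def fetch(url, delay=2.0):
    """Fetch a page only if robots.txt allows it, then pause before returning."""
    if not robots.can_fetch(USER_AGENT, url):
        return None  # disallowed for our user agent
    response = requests.get(url, headers={"User-Agent": USER_AGENT}, timeout=30)
    time.sleep(delay)  # at most one request every `delay` seconds
    return response.text

html = fetch(BASE + "/plenary/en/debates-video.html")  # hypothetical page path
```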

29 of 239

Text as Data - Irish Dáil

Wordscores [Laver et al., 2003] initially tested on a confidence debate in October 1991.

Wordshoal [Lauderdale & Herzog, 2016] tested as an example of a multiparty system.

Database of Parliamentary Speeches in Ireland, 1919–2013 [Herzog & Mikhaylov, 2017]

30 of 239

Text as Data - Irish Dáil

Wordscores [Laver et al., 2003] initially tested on a confidence debate in October 1991.

Wordshoal [Lauderdale & Herzog, 2016] tested as an example of a multiparty system.

Database of Parliamentary Speeches in Ireland, 1919–2013 [Herzog & Mikhaylov, 2017]

Popular because:

  • European party system
  • “Real” multi-party system
  • Content in English
  • 1991 debate directly available in Quanteda

31 of 239

Text as Data - UK Hansard

Hansard Corpus (1803 - 2005)

Hansard Online (1803 - 2019)

DiLiPad (1803 - 2014)

TheyWorkForYou (1918 - 2019)

From: https://www.youtube.com/watch?v=H4v7wddN-Wg

32 of 239

Text as Data - UK Hansard

Hansard Corpus (1803 - 2005)

Hansard Online (1803 - 2019)

DiLiPad (1803 - 2014)

TheyWorkForYou (1918 - 2019)

Popular because:

  • Over two centuries of data
  • Curated by different interdisciplinary projects (NLP, CL, CSS, DH)
  • Content in English

From: https://www.youtube.com/watch?v=H4v7wddN-Wg

33 of 239

Text as Data - UK Hansard

Hansard Corpus (1803 - 2005)

Hansard Online (1803 - 2019)

DiLiPad (1803 - 2014)

TheyWorkForYou (1918 - 2019)

Popular because:

  • Over two centuries of data
  • Curated by different interdisciplinary projects (NLP, CL, CSS, DH)
  • Content in English

From: https://www.youtube.com/watch?v=H4v7wddN-Wg

34 of 239

Text as Data - US Congress

ConVote Dataset (all House debates, 2005) [Thomas et al., 2006]

Popular for:

  • Stance detection
  • Vote prediction
  • Opinion mining

35 of 239

Text as Data - US Congress

ConVote Dataset (all House debates, 2005) [Thomas et al., 2006]

Popular for:

  • Stance detection
  • Vote prediction
  • Opinion mining

Congressional Record for the 43rd–114th Congresses (1873–2017) [Gentzkow et al., 2018], derived from HeinOnline scans.

36 of 239

Text as Data - United Nations

The UN General Debate corpus [Baturo et al., 2016]

Over 7300 country statements from 1970–2014

All in English (official translations by the UN)

37 of 239

Text as Data - Other Parliaments

CLARIN Parliamentary Corpora

  • ParlAT (Austria, 1996 - 2017)
  • Danish Parliamentary Corpus (2009-2017)
  • Italian Camera as RDF

38 of 239

Text as Data - Other Parliaments

CLARIN Parliamentary Corpora

  • ParlAT (Austria, 1996 - 2017)
  • Danish Parliamentary Corpus (2009-2017)
  • Italian Camera as RDF

Issues:

  • Many datasets (often more than one per country)
  • Resources are not aligned in time with each other
  • Often not maintained anymore

39 of 239

Text as Data - Other Parliaments

CLARIN Parliamentary Corpora

  • ParlAT (Austria, 1996 - 2017)
  • Danish Parliamentary Corpus (2009-2017)
  • Italian Camera as RDF

ParlSpeech [Rauh et al., 2017] (3.9 million plenary speeches from the Czech Republic, Finland, Germany, the Netherlands, Spain, Sweden, and the United Kingdom)

Sentiment and position-taking analysis of parliamentary debates: A systematic literature review [Abercrombie & Batista-Navarro, 2019]

40 of 239

Text as Data - Manifestos

41 of 239

Text as Data - Manifestos

Since 1979, the Manifesto Project has collected and coded the electoral programs of all relevant political parties at democratic elections, from 1945 (or a country's first democratic election) onwards, in over 50 countries.

Country experts (usually political scientists who are native speakers of the language) are hired to code the electoral programs. Coders first split the electoral programs into so-called "quasi-sentences", each of which "contains exactly one statement or message".

Coders then allocate to every quasi-sentence a code corresponding to one of 56 categories, capturing the most relevant policy issues and goals.

In order to do this, coders are taken through a training process.

42 of 239

Text as Data - Manifestos

In the past, the coding of these documents was performed on printed copies of the electoral programs, with annotations made in the page margins.

The first serious effort towards digitization was made by Paul Pennings and Hans Keman of the Comparative Electronic Manifestos Project (2006), who digitized 1,144 electoral programs included in the Manifesto Corpus (2015, v.1).

The corpus [Merz et al., 2016] currently covers electoral programmes from more than 50 countries in more than 35 languages. It contains more than 2,300 machine-readable programmes. For more than 1,150 of these, unitising and codings are available as well, amounting to more than 1,000,000 coded quasi-sentences.

https://manifesto-project.wzb.eu/

43 of 239

Text as Data - Manifestos

A gold standard or a “no-alternative” scenario? [Budge & Pennings, 2007; Benoit et al., 2012; Mikhaylov et al., 2012; Gemenis, 2013]

44 of 239

Text as Data - Manifestos

A gold standard or a “no-alternative” scenario? [Budge & Pennings, 2007; Benoit et al., 2012; Mikhaylov et al., 2012; Gemenis, 2013]

1) Theoretical framework of the coding scheme

Salience theory of party competition: policy differences between parties are assumed to consist of contrasting emphasis placed on different policy areas

  • Relevant for the US and the UK
  • Does not hold in many multi-party competitions -> niche parties
  • The core is not emphasis but position
  • The coding scheme actually captures positions as well (pro/con)

45 of 239

Text as Data - Manifestos

A gold standard or a “no-alternative” scenario? [Budge & Pennings, 2007; Benoit et al., 2012; Mikhaylov et al., 2012; Gemenis, 2013]

1) Theoretical framework of the coding scheme

2) Document selection

  • Not only manifestos (drafts, speeches, reports, news, flyers, interviews)
  • Many of these are not equivalent to manifestos
  • The assumption that length equals authority no longer holds!

46 of 239

Text as Data - Manifestos

A gold standard or a “no-alternative” scenario? [Budge & Pennings, 2007; Benoit et al., 2012; Mikhaylov et al., 2012; Gemenis, 2013]

1) Theoretical framework of the coding scheme

2) Document selection

3) Coding reliability

  • Central standard for coders (internal training and testing)
  • One annotator per manifesto
  • What about crowdsourcing? [Benoit et al., 2016]
  • The issue lies not with the coders but with the coding scheme itself

47 of 239

Text as Data - Manifestos

A gold standard or a “no-alternative” scenario? [Budge & Pennings, 2007; Benoit et al., 2012; Mikhaylov et al., 2012; Gemenis, 2013]

1) Theoretical framework of the coding scheme

2) Document selection

3) Coding reliability

4) Scaling

  • The meaning of Left–Right actually differs across countries
  • Emphasis as a proxy for determining position is not really consistent
  • In general, L–R scores should not be treated as gold information

48 of 239

Text as Data - Agendas

49 of 239

Text as Data - Agendas

The Comparative Agendas Project (CAP) assembles and codes information on the policy processes of governments from around the world, focusing on the policies adopted, proposed, or discussed.

Initially developed in the US in the early 1990s, it aggregates many different projects analysing different types of documents (news, laws, etc.) with a single, universal, and consistent coding scheme. CAP monitors policy processes by tracking the actions that governments take in response to the challenges they face.

https://www.comparativeagendas.net/

50 of 239

Text as Data - Agendas

The CAP Codebook:

  • 21 Major Topics
  • 220 Subtopics

51 of 239

Text as Data - Campaign Speeches / Debates

52 of 239

Text as Data - Campaign Speeches / Debates

The American Presidency Project has transcripts of:

  • Convention speeches
  • Debates
  • Party Platforms
  • Campaign documents

https://www.presidency.ucsb.edu

53 of 239

Text as Data - Campaign Speeches / Debates

Other resources:

1) From papers:

  • 2012 Republican primary debates [Prabhakaran et al., 2013]
  • Dutch and Danish party congress speeches [Schumacher et al., 2019]
  • 2015 UK Election debates (audio and transcripts) [Lippi & Torroni, 2016]

2) Transcripts in news media (newspapers, fact-checking websites):

  • Full Transcript: Democratic Presidential Debates, Night 1 (NYT)
  • Fact-checking the Democratic debate in Miami, night 1 (PolitiFact)

54 of 239

Text as Data - Press Releases

55 of 239

Text as Data - Press Releases

AUTNES Content Analysis of Party Press Releases (OTS) 2013 [Müller et al., 2017]

56 of 239

Text as Data - Leaders

57 of 239

Text as Data - Leaders

The American Presidency Project has transcripts of:

  • Presidential orders
  • Memoranda
  • Proclamations
  • Interviews
  • Letters

https://www.presidency.ucsb.edu

58 of 239

Text as Data - Leaders

The American Presidency Project has transcripts of:

  • Presidential orders
  • Memoranda
  • Proclamations
  • Interviews
  • Letters

EUSpeech: a New Dataset of EU Elite Speeches [Schumacher et al., 2016]

  • Over 18k speeches from EU leaders
  • Time range: 2007 to 2015

59 of 239

Text as Data - Leaders

The Global Populism Database is the most up-to-date, comprehensive and reliable repository of populist discourse in the world. It was commissioned by the Guardian and built by Team Populism, a global network of scholars dedicated to the scientific study of the causes and consequences of populism.

Issues:

  • The data is not directly available
  • The selection is not clear

60 of 239

Text as Data - Legislative Corpora

61 of 239

Text as Data - Legislative Corpora

Sunlight Foundation US Congress API

  • Look up members of Congress by location or by zip code
  • Official Twitter, YouTube, and Facebook accounts
  • The daily work of Congress: bills, amendments, nominations
  • The live activity of Congress: past and future votes, floor activity, hearings

EurLex

  • Freely accessible repository of European Union law texts (multilingual)
  • Treaties, international agreements, legislation in force, legislation in preparation, case-law and parliamentary questions
  • HTMLs and PDFs

62 of 239

Text as Data - Social Media

63 of 239

Text as Data - Social Media

Politician’s opinions:

  • List of all MEPs: https://twitter.com/europarl_en/lists/all-meps-on-twitter
  • All members of US Congress: https://twitter.com/cspan/lists/members-of-congress
  • UK MPs: https://twitter.com/twittergov/lists/uk-mps

Voters’ opinions:

  • Harvard Dataverse (e.g., 2018 U.S. Congressional Election Tweet IDs)
  • Internet Archive Twitter Stream
  • Reddit Corpus

64 of 239

Agenda

Text

RQs

Tasks

Topic Detection

Positioning

Scaling

Multilinguality / CL Transfer

65 of 239

Text as Data

66 of 239

Text as Data - Examples

Exploring the political agenda of the European Parliament using a dynamic topic modeling approach [Greene & Cross, 2017]

  • Studies how the political agenda of the EP evolved over time and reacted to stimuli in the period 1999–2014
  • Shows that a dynamic topic modeling approach based on Non-negative Matrix Factorization is better suited than LDA
  • Captures shifts of EP attention to external events (e.g., the Euro Crisis)
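
A minimal sketch of the NMF building block behind such approaches (Greene & Cross additionally chain window-level NMF models over time); the toy corpus stands in for one time window of plenary speeches:

```python
# Sketch: factorize a TF-IDF matrix into document-topic (W) and topic-term (H)
# factors for one time window; toy documents stand in for real speeches.
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

speeches = [
    "euro crisis bailout banks greece austerity",
    "banks bailout financial crisis euro",
    "climate emissions energy renewable targets",
    "energy climate carbon emissions policy",
    "migration borders asylum refugees policy",
    "asylum refugees migration border control",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(speeches)

nmf = NMF(n_components=3, random_state=0)
W = nmf.fit_transform(X)   # document-topic weights
H = nmf.components_        # topic-term weights

terms = vectorizer.get_feature_names_out()
for k, topic in enumerate(H):
    top = topic.argsort()[::-1][:5]
    print(f"topic {k}:", ", ".join(terms[i] for i in top))
```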

67 of 239

Text as Data - Examples

Exploring the political agenda of the European Parliament using a dynamic topic modeling approach [Greene & Cross, 2017]

A Bayesian hierarchical topic model for political texts: Measuring expressed agendas in Senate press releases [Grimmer, 2010]

  • Measures how US senators explain their work in Washington to constituents, using a collection of over 24,000 press releases issued by senators in 2007
  • The Expressed Agenda Model measures priorities of each author
  • Ideal for comparing priorities

68 of 239

Text as Data - Examples

Exploring the political agenda of the European Parliament using a dynamic topic modeling approach [Greene & Cross, 2017]

A Bayesian hierarchical topic model for political texts: Measuring expressed agendas in Senate press releases [Grimmer, 2010]

How to analyze political attention with minimal assumptions and costs [Quinn et al., 2010]

  • Topic model to examine the agenda in the U.S. Senate from 1997 to 2004
  • New database of over 118,000 speeches from the Congressional Record
  • The model reveals speech topic categories that are both distinctive and meaningfully interrelated, offering a richer view of democratic agenda dynamics

69 of 239

Text as Data - Examples

Measuring group differences in high-dimensional choices: Method and application to Congressional speech [Gentzkow et al., 2016]

  • Measure trends in the partisanship of congressional speech from 1873 to 2016
  • Partisanship as the ease with which an observer could infer a party from a single utterance
  • Partisanship increased sharply in the early 1990s

70 of 239

Text as Data - Examples

Measuring group differences in high-dimensional choices: Method and application to Congressional speech [Gentzkow et al., 2016]

Position taking in European Parliament speeches [Proksch & Slapin, 2010]

  • Examines how national parties position themselves in EP debates
  • Positioning reflects partisan divisions over EU integration and national divisions, rather than left–right politics
  • Results are robust across languages used to scale the speeches

71 of 239

Text as Data - Examples

Measuring group differences in high-dimensional choices: Method and application to Congressional speech [Gentzkow et al., 2016]

Position taking in European Parliament speeches [Proksch & Slapin, 2010]

Testing the Etch-a-Sketch Hypothesis: Measuring Ideological Signaling via Candidates’ Use of Key Phrases [Gross et al., 2013]

  • Presidential candidates should shift toward the general electorate’s median voter after securing their parties’ nominations.
  • Test the theory using candidates’ campaign speeches as data
  • Develop a model to identify ideological cues in political text

72 of 239

Text as Data

[Grimmer & Stewart, 2013]

73 of 239

Text as Data

[Grimmer & Stewart, 2013]

74 of 239

Text as Data

[Grimmer & Stewart, 2013]

RQ1:

What is it about?

(the topic, the issue)

RQ2:

How does it compare with others?

(the position)

75 of 239

Agenda

Text

RQs

Tasks

Topic Detection

Positioning

Scaling

Multilinguality / CL Transfer

76 of 239

Topic Detection

[Quinn et al., 2010]

77 of 239

Topic Detection

[Quinn et al., 2010]

78 of 239

Different Objectives

Finding the needle in the haystack or characterizing the haystack? [Hopkins & King, 2010]

  • When social scientists use formal content analysis, it is typically to make generalizations using document category proportions
  • They conduct content analyses to learn about the distribution of classifications in a population, not to assert the classification of any particular document (which would be easy to do through a close reading)
  • Individual document classifications do not usually constitute the ultimate quantities of interest

79 of 239

Selection of Speakers? [Proksch & Slapin, 2012; Schwarz et al., 2017]

In political systems that foster an individual relationship between MPs and their voters, party leaders are more likely to accept speeches that deviate from the party line.

In contexts where these relations are mediated by the party, and party unity matters, the party leadership is likely to prohibit expression of dissent on the parliamentary floor.

Takeaway: speakers do not always represent the position of the entire party.

80 of 239

Topic Detection

Supervised

Unsupervised

81 of 239

Topic Detection

Supervised

Unsupervised

82 of 239

Supervised Approaches

Dictionaries

  • Intuitive, easy to apply, generate, monitor and extend
  • Often paired with relevance scores (harder to obtain)
  • Difficult to apply out of domain (especially for sentiment analysis)
  • Example: budget rhetoric in presidential campaigns from 1952 to 2000 [Burden & Sanberg, 2003]
  • Research direction: expanding dictionaries with word embeddings [Tsai & Wang, 2014; Theil et al., 2018; Sternberg, 2018]
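
A minimal sketch of that expansion idea, assuming gensim and a pre-trained model available via gensim's downloader; the seed terms are illustrative:

```python
# Sketch: extend a seed dictionary with nearest neighbours in embedding space.
import gensim.downloader

vectors = gensim.downloader.load("glove-wiki-gigaword-50")  # pre-trained GloVe

seed_dictionary = ["budget", "deficit", "spending"]  # e.g., a fiscal-policy lexicon

expanded = set(seed_dictionary)
for term in seed_dictionary:
    if term in vectors:
        for neighbour, _similarity in vectors.most_similar(term, topn=5):
            expanded.add(neighbour)

print(sorted(expanded))
```

In practice the candidate neighbours would still be vetted by a human, since embedding neighbourhoods mix synonyms with antonyms and merely related terms.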

83 of 239

Supervised Approaches

Dictionaries

Support Vector Machines in Political Science

  • Used for collection filtering [D’Orazio et al., 2014]
  • Classifying Congressional Bills (226 possible topics) [Hillard et al., 2008; Karan et al., 2016]
  • Hard to apply out-of-domain [Burscher et al., 2015; Nanni et al., 2016]
    • Political news from a different newspaper or a different point in time
    • Training on manifestos for coding political campaign speeches
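
A minimal scikit-learn sketch of this standard setup: a linear SVM over TF-IDF bag-of-words features. The four inline bill snippets and two topic labels are toy stand-ins for an annotated corpus:

```python
# Sketch: linear SVM for topic classification of (toy) bill texts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = [
    "a bill to amend the internal revenue code",
    "a bill to reduce income tax rates",
    "a bill to protect endangered species habitats",
    "a bill to regulate emissions from power plants",
]
labels = ["taxation", "taxation", "environment", "environment"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(texts, labels)

print(model.predict(["a bill to cut the income tax"]))  # expected: ['taxation']
```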

84 of 239

Supervised Approaches

Dictionaries

Support Vector Machines in Political Science

Disadvantages

  • Lack of annotated resources
  • Hard to produce “gold” standard (low coder reliability)
  • Hard to generalize across contexts

85 of 239

Classification of Manifesto Quasi-Sentences

Classifying topics and detecting topic shifts in political manifestos [Zirn et al., 2016]

86 of 239

Classification of Manifesto Quasi-Sentences

Proportional Classification Revisited: Automatic Content Analysis of Political Manifestos Using Active Learning [Wiedemann, 2019]

  • Focus on proportional classification
  • Comparison between a method based on regression analysis with feature profiles from entire collections and a method aggregating classifier decisions for individual documents
  • Improvement on both using active learning

87 of 239

Classification of Manifesto Quasi-Sentences

Hierarchical Structured Model for Fine-to-coarse Manifesto Text Analysis [Subramanian et al., 2018]

  • Captures the dependency between the sentence- and document-level tasks, and also utilizes additional label structure
  • Incorporates contextual information (e.g., political coalitions) and encodes temporal dependencies for the coarse-level manifesto position using probabilistic soft logic

88 of 239

Topic Detection

[Quinn et al., 2010]

89 of 239

Topic Detection

[Quinn et al., 2010]

90 of 239

Topic Detection

Supervised

Unsupervised

91 of 239

Topic Detection

Supervised

Unsupervised

92 of 239

Unsupervised Approaches

93 of 239

Unsupervised Approaches

Available implementations [Benoit et al., 2018]

  • LDA in Quanteda
  • expAgenda Model
  • Structural topic model

94 of 239

Unsupervised Approaches

Available implementations [Benoit et al., 2018]

Advantages of LDA Topic Models [Quinn et al., 2010; Grimmer & Stewart, 2013]

  • Topics may be difficult to know beforehand
  • Very little investment in pre-analysis stage
  • No human coding of training data
  • Useful for initial exploration

95 of 239

Unsupervised Approaches

Available implementations [Benoit et al., 2018]

Advantages of LDA Topic Models [Quinn et al., 2010; Grimmer & Stewart, 2013]

Issues with Interpretation [Lauscher et al., 2016]

  • Topics are lists of co-occurring words
  • No label describing them
  • Results are very different, depending on the number of topics

96 of 239

Unsupervised Approaches

Available implementations [Benoit et al., 2018]

Advantages of LDA Topic Models [Quinn et al., 2010; Grimmer & Stewart, 2013]

Issues with Interpretation [Lauscher et al., 2016]

Issues with Evaluation [Chang et al., 2009; Wallach et al., 2009; Newman et al., 2010]

  • Post-hoc evaluation is necessary => results differ every time LDA runs
  • Intrinsic evaluations (e.g., perplexity) have low correlation with human judgments
  • Word-intrusion tasks and topic coherence are very time-consuming to assess
  • Simply looking at topic outputs is not an evaluation!
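
A minimal sketch of computing an automated coherence score with gensim's CoherenceModel (scores on such a tiny toy corpus are meaningless; this only shows the mechanics):

```python
# Sketch: train a toy LDA model and compute c_v topic coherence with gensim.
from gensim.corpora import Dictionary
from gensim.models import CoherenceModel, LdaModel

texts = [
    ["tax", "budget", "deficit", "spending"],
    ["budget", "tax", "revenue", "spending"],
    ["climate", "energy", "emissions", "carbon"],
    ["energy", "climate", "renewable", "carbon"],
]
dictionary = Dictionary(texts)
corpus = [dictionary.doc2bow(doc) for doc in texts]

lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2, random_state=0)

cm = CoherenceModel(model=lda, texts=texts, dictionary=dictionary, coherence="c_v")
print("c_v coherence:", cm.get_coherence())
```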

97 of 239

Papers on Unsupervised Topic Detection

Structural Topic Models for Open-Ended Survey Responses [Roberts et al., 2014]

  • Allows the inclusion of covariates of interest into the prior distributions for document-topic proportions and topic-word distributions.
  • This makes analyzing open-ended responses easier, more revealing, and capable of being used to estimate treatment effects.

98 of 239

Papers on Unsupervised Topic Detection

Validating Cross-Perspective Topic Modeling for Extracting Political Parties’ Positions from Parliamentary Proceedings [van der Zwaan et al., 2016]

  • Do the topics learned from the parliamentary proceedings cover all relevant political subjects? (content validity)
  • Can the topics learned from the parliamentary proceedings be used to predict the political subject of texts? (criterion validity)

99 of 239

Papers on Unsupervised Topic Detection

Tea Party in the House: A Hierarchical Ideal Point Topic Model and Its Application to Republican Legislators in the 112th Congress [Nguyen et al., 2015]

  • Making multi-dimensional ideal point models interpretable
  • Combining votes, bill text, legislators' speeches and topics from the Policy Agendas Project
  • Structuring topics in a hierarchy, allowing to analyze both agenda issues and issue-specific frames.

100 of 239

Our Experience on Political Topic Analysis

101 of 239

Our Experience on Political Topic Analysis

Unsupervised Text Segmentation of Manifestos [Glavaš et al. *SEM 2016]

Supervised Classification of Manifestos QS [Zirn et al., PolText 2016]

Domain Transfer (manifestos -> campaign speeches) [Nanni et al., PolText 2016]

Cross-lingual Classification of Manifestos QS [Glavaš et al., NLP+CSS 2017]

Key-Concept Clustering of Manifestos QS [Menini et al., EMNLP 2017]

Semantifying the UK Hansard [Nanni et al., ParlaCLARIN 2018 & JCDL 2019]

102 of 239

Interested in Computational Political Science?

We are hiring, get in touch!

{federico,goran,simone}@informatik.uni-mannheim.de

103 of 239

Unsupervised Text Segmentation [Glavaš et al. *SEM 2016]

We employ word embeddings and a measure of semantic relatedness of short texts to construct a relatedness graph of the text.

104 of 239

Unsupervised Text Segmentation [Glavaš et al. *SEM 2016]

We employ word embeddings and a measure of semantic relatedness of short texts to construct a relatedness graph of the text.

https://bitbucket.org/gg42554/graphseg/

105 of 239

Unsupervised Text Segmentation [Glavaš et al. *SEM 2016]

We employ word embeddings and a measure of semantic relatedness of short texts to construct a relatedness graph of the text.

https://bitbucket.org/gg42554/graphseg/
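
A minimal sketch of the underlying idea (not the full GraphSeg algorithm): sentences become nodes, semantically related sentence pairs become edges, and cliques of the resulting relatedness graph seed the segments. Random vectors stand in for the actual averaged word embeddings:

```python
# Sketch: build a sentence relatedness graph and list its maximal cliques.
import itertools

import networkx as nx
import numpy as np

rng = np.random.default_rng(0)
n_sentences = 8
embeddings = rng.normal(size=(n_sentences, 50))  # stand-in for sentence embeddings

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

G = nx.Graph()
G.add_nodes_from(range(n_sentences))
for i, j in itertools.combinations(range(n_sentences), 2):
    if cosine(embeddings[i], embeddings[j]) > 0.2:  # relatedness threshold
        G.add_edge(i, j)

# GraphSeg merges adjacent sentences that share maximal cliques; here we
# simply list the cliques as candidate segment seeds.
print(list(nx.find_cliques(G)))
```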

106 of 239

Supervised Classification [Zirn et al., PolText 2016]

We developed a coarse-grained 7-class classifier using macro-areas from the Manifesto Project.

107 of 239

Supervised Classification [Zirn et al., PolText 2016]

Features:

  1. Bag of words of each sentence
  2. Topic of the previous sentence
  3. Semantic similarity between the previous and the current sentence
  4. Relevance of each word in the sentence for each class

Linear Support Vector Machine

108 of 239

Supervised Classification [Zirn et al., PolText 2016]

109 of 239

Domain Transfer [Nanni et al., PolText 2016]

We collected all US campaign speeches for the 2008, 2012 and 2016 presidential elections.

Gold Standard: around 1k annotated sentences.

110 of 239

Domain Transfer [Nanni et al., PolText 2016]

We collected all US campaign speeches for the 2008, 2012 and 2016 presidential elections.

Gold Standard: around 1k annotated sentences.

111 of 239

Key-Concept Clustering [Menini et al., EMNLP 2017]

112 of 239

Key-Concept Clustering [Menini et al., EMNLP 2017]

  1. Extract key-concepts with Keyphrase Digger
  2. Represent them as vector using word embeddings
  3. Graph-based clustering approach

https://dh.fbk.eu/technologies/kd

https://github.com/dhfbk/keyphrase_clustering
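
A minimal sketch of steps 2 and 3 under stated assumptions: random vectors stand in for keyphrase embeddings, and agglomerative clustering over cosine distances (scikit-learn >= 1.2 for the `metric` argument) replaces the soft graph-based clustering used in the paper:

```python
# Sketch: cluster (mock) keyphrase embeddings by cosine distance.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

keyphrases = ["tax cuts", "income tax", "border security",
              "immigration reform", "clean energy", "carbon emissions"]
rng = np.random.default_rng(0)
X = rng.normal(size=(len(keyphrases), 100))  # stand-in for phrase embeddings

clusterer = AgglomerativeClustering(
    n_clusters=None, distance_threshold=1.0,
    metric="cosine", linkage="average")
labels = clusterer.fit_predict(X)

for phrase, label in zip(keyphrases, labels):
    print(label, phrase)
```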

113 of 239

Key-Concept Clustering [Menini et al., EMNLP 2017]

114 of 239

Key-Concept Clustering [Menini et al., EMNLP 2017]

We extracted 87 topics from six U.S. manifestos (2004, 2008, 2012), which produced 350 pairs of statements, showing:

  1. Disagreement on the responsibilities of previous administrations
  2. Disagreement on foreign policy (Middle East)
  3. Agreement on relation with Europe

115 of 239

Agenda

Text

RQs

Tasks

Topic Detection

Positioning

Scaling

Multilinguality / CL Transfer

116 of 239

Positioning

Political scientists need to infer policy positions from text.

117 of 239

Positioning

Political scientists need to infer policy positions from text.

The position could be determined:

  • Towards a topic, an issue, a target
  • Given an a priori definition of the political space (e.g., left-right)
  • Inductively, in comparison with other texts under study, in a data-driven setting

Many tasks (loosely) correspond to a type of positioning

  • Most prominent: Position detection/classification, political text scaling

118 of 239

Positioning

Political scientists need to infer policy positions from text.

The position could be determined:

  • Towards a topic, an issue, a target
  • Given an a priori definition of the political space (e.g., left-right)
  • Inductively, in comparison with other texts under study, in a data-driven setting

A notable difference between NLP and PolSci research: the former considers the specific target of expressed positions, while the latter generally analyses aggregated speeches/corpora.

119 of 239

Topic-based Positioning

120 of 239

Positioning: NLP vs. PolSci

A notable difference in positioning studies/tasks of two communities:

NLP research focused on the target of the expressed position/stance/sentiment

  • Ideology [Sim et al., 2013; Iyyer et al., 2014; Volkova et al., 2014, Kulkarni et al., 2018]
  • Legislation [Thomas et al., 2006; Lauderdale & Herzog, 2016; Eidelman et al., 2017]
  • Topic [van der Zwaan et al., 2016; Menini & Tonelli, 2016; Menini et al., 2017]
  • (Propositional) Statements [Bamman & Smith, 2015]

PolSci research focused on aggregate positional profiling of political actors

  • Positioning actors (people or parties) based on aggregate textual content and ignoring the targets of individual contributions

[Laver et al., 2003; Proksch & Slapin, 2010; Schwarz et al., 2017; Kim et al., 2018]

121 of 239

Positioning: Ideological classification

The line of work that aims to assign one or more ideological labels (classes) to actors

Challenges:

  • Ideology-annotated corpora? (for supervised learning)
  • Assign ideologies to politicians/parties and propagate to all their texts
    • Assumption: politicians/parties are ideologically consistent
  • Is text enough or is there complementary signal?

The famous “Etch-a-Sketch” example:

"Well, I think you hit a reset button for the fall campaign. Everything changes. It's almost like an Etch-A-Sketch. You can kind of shake it up and restart all over again."

-Eric Fehrnstrom, Spokesman for Presidential Candidate Mitt Romney

122 of 239

Ideological shifts: Etch-A-Sketch

123 of 239

Positioning: Ideological classification

The line of work that aims to assign one or more ideological labels (classes) to actors

[Gross et al., 2013; Sim et al., 2013] “Ideological Proportions in Political Speeches”

  • Inferring proportions of known ideological labels from ideology-rich corpus
  • Bayesian approach, HMM-based model

[Iyyer et al., 2014] “Political Ideology Detection Using Recursive Neural Networks”

  • Crowdsource the ideology annotations on a sentential level
  • Train a neural model (recursive network) to detect ideological phrases

[Volkova et al., 2014; Kulkarni et al., 2018]

  • Additionally exploiting non-textual signal for ideological predictions
  • Network structures: neighbours in social media graphs or links between news

124 of 239

Generative Ideology Detection [Sim et al., EMNLP 13]

Data: Ideology Book Corpus [Gross et al., 2013]

  • Collection of 112 books and 10 magazines
  • Ideological labels (tree below) assigned to authors
  • Book chapters additionally labeled with topics
    • E.g., Chapter “Faith” from Obama’s “The Audacity of Hope” gets topical label RELIGION

Image taken from [Gross et al., 2013]

125 of 239

Generative Ideology Detection [Sim et al., EMNLP 13]

2-step approach:

  1. Ideological cue identification using a probabilistic language model
    • Concretely, Sparse Additive GEnerative models (SAGE) [Eisenstein et al., 2011]
    • Probability of a word's appearance in a document is determined by its effects on the document's attributes (parameters η)
    • Attributes: coarse ideology label (RIGHT, LEFT, CENTER), fine-grained ideology label, topic label
    • Parameter estimation: objective with sparsity-inducing L1 regularization, OWL-QN solver [Andrew & Gao, 2007]

126 of 239

Generative Ideology Detection [Sim et al., EMNLP 13]

2-step approach:

2. Cue-lag ideological proportions (central contribution)

  • Cues (and their effect scores) obtained in the first step are employed for ideological profiling of text
  • Corpus: speeches from the 2008 and 2012 US presidential elections
    • Speeches from primary elections
    • Speeches from general elections
  • (Ideological) cue-lag text representations
    • Sequences of non-cue words replaced with the sequence length

Example from [Sim et al., 2013]

127 of 239

Generative Ideology Detection [Sim et al., EMNLP 13]

2-step approach:

2. Cue-lag ideological proportions (central contribution)

CLIP Model: a type of HMM

  • States are ideological classes (e.g., PROGRESSIVE LEFT, RELIGIOUS RIGHT)
  • State emissions: (cue, length-of-the-lag) pairs
  • Transition probabilities:
    • Influenced by the distances between ideological classes in the ideology tree (i.e., the label hierarchy)
    • Direct transitions between more distant ideologies are less likely
    • Additional tree-walk ideological parameters

128 of 239

Generative Ideology Detection [Sim et al., EMNLP 13]

2-step approach:

2. Cue-lag ideological proportions (central contribution)

CLIP Model: a type of HMM

  • States are ideological classes (e.g., PROGRESSIVE LEFT, RELIGIOUS RIGHT)
  • State emissions: (cue, length-of-the-lag) pairs
  • Emission probabilities:
    • One multinomial distribution over the entire cue lexicon for each ideological state
    • Ψs,w: probability of state s (e.g., CENTER) emitting the cue w (e.g., "communist")
    • One global Poisson distribution (one global parameter) generates the lags
    • Cue-ideology priors (from phase 1) captured by a Dirichlet distribution
  • Learning: collapsed Gibbs sampling
  • Proportion inference: states generate cues and lags; the total lag length associated with each ideological state provides the amount of "time" the speaker spent in that ideology

129 of 239

Generative Ideology Detection [Sim et al., EMNLP 13]

Evaluation:

Gold-standard ideological proportions are difficult (impossible?) to obtain

  • Hypothesis-based evaluation: strong and moderate hypotheses
    • STRONG: Republican primary candidates draw more from the RIGHT than from the LEFT
    • STRONG: Democratic primary candidates draw more from the LEFT than from the RIGHT
    • STRONG: In the general election, Democrats should draw more from the LEFT than from the RIGHT (and vice versa)
  • Evidence for the "etch-a-sketch" hypothesis?

130 of 239

Discriminative ideology detection [Iyyer et al., ACL 14]

Common NLP story:

  • (Small amount of) high-level (e.g., document-level) annotations: generative models with latent variables
  • Large amount of fine-grained (e.g., sentence- or token-level) annotations: discriminative models

[Iyyer et al., 2014] “Political Ideology Detection Using Recursive Neural Networks”

  • Crowdsource the ideology annotations on a sentence level
  • Assume ideological compositionality over the syntactic structure of the sentence
  • Train a neural model (recursive network) to detect ideological phrases
  • Approach inspired by [Socher et al., 2013]
    • Semantic compositionality over syntactic structure for sentiment analysis

131 of 239

Discriminative ideology detection [Iyyer et al., ACL 14]

Assumption of semantic compositionality of ideological positions/labels:

Image from [Iyyer et al., 2014]

132 of 239

Discriminative ideology detection [Iyyer et al., ACL 14]

Recursive neural network model:

  • Word embeddings (randomly initialized or pre-trained): xa, xb, xd
  • Intermediate node representation: a non-linearity applied to projections of the children's vectors, with global parameters WL, WR (e.g., xd = f(WL·xa + WR·xb + b))
  • Labels available for the root node (i.e., the whole phrase or sentence)
    • Simple softmax classification (learning: minimization of the cross-entropy loss)

Image from [Iyyer et al., 2014]
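
A minimal numpy sketch of the composition step described above; the dimensions, the tiny three-word "tree", and the random (untrained) parameters are all illustrative:

```python
# Sketch: recursive composition over a toy parse tree + softmax at the root.
import numpy as np

d, n_labels = 50, 2
rng = np.random.default_rng(0)
W_L, W_R = rng.normal(size=(d, d)), rng.normal(size=(d, d))
b = np.zeros(d)
W_cat = rng.normal(size=(n_labels, d))  # softmax classifier at the root

def compose(x_left, x_right):
    # x_d = f(W_L x_a + W_R x_b + b), with f = tanh
    return np.tanh(W_L @ x_left + W_R @ x_right + b)

x_a, x_b, x_c = (rng.normal(size=d) for _ in range(3))  # word embeddings
x_ab = compose(x_a, x_b)      # phrase node
x_root = compose(x_ab, x_c)   # sentence (root) node

logits = W_cat @ x_root
probs = np.exp(logits) / np.exp(logits).sum()
print(probs)  # P(LIBERAL), P(CONSERVATIVE) -- untrained, so roughly arbitrary
```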

133 of 239

Discriminative ideology detection [Iyyer et al., ACL 14]

Evaluation:

  • Accuracy on the ConVote and IBC datasets
  • IBC: Ideology Book Corpus [Gross et al., 2013]
    • Crowdsourced annotations for phrases and sentences
    • Labels: CONSERVATIVE, LIBERAL, NEITHER (NEUTRAL)

Image from [Iyyer et al., 2014]

134 of 239

Discriminative ideology detection [Iyyer et al., ACL 14]

Figures from [Iyyer et al., 2014]: label probabilities over node depths; sentence-level bias detection accuracy

135 of 239

Ideological classification: non-textual signal

While text is very informative, it is often not the only type of signal that can be exploited to (more accurately) predict ideological positions

Common setup: there are links (associations) between the actors/items for which we make ideological predictions

[Volkova et al., 2014; Barberá, 2015]

  • Predicting ideological preferences of users in social media
  • Combines user’s content with that of friends/followers (social network)

[Kulkarni et al., 2018]

  • Predicting ideological orientation of news
  • Combines the (multi-source) text of news with the links connecting the news

136 of 239

Ideological classification: non-textual signal

Common setup: there are links (associations) between the actors/items for which we make ideological predictions

[Kulkarni et al., 2018] Predicting ideological orientation of news

137 of 239

Predicting ideological orientation of news [Kulkarni et al., EMNLP 18]

Source: [Kulkarni et al., 2018]

138 of 239

Predicting ideological orientation of news [Kulkarni et al., EMNLP 18]

Top-10 rankings of different ideological news sources

139 of 239

Ideology or (actually) party classification?

Classifiers are often sensitive to expressions of attack and defence, or of opposition and government, rather than to ideology. This is especially true in contexts of strong party discipline. [Hirst et al., 2014; Søyland & Lapponi, 2017]

140 of 239

Positioning: Legislation

The overarching task is predicting the position of an actor with respect to a particular piece of legislation in (parliamentary) voting

  • The so-called roll call data

Traditional approach (not using text as data): ideal point model [Clinton et al., 2004]

  • A generative (Bayesian) model for roll-call analysis
  • Each legislator is represented by an ideal point, a position in the (potentially multidimensional) policy space
  • Each legislative proposal (i.e., Yea or Nay vote on the proposal) is also a point in the same policy space

141 of 239

Positioning: Legislation

The overarching task is predicting the position of an actor with respect to a particular piece of legislation in (parliamentary) voting

  • The so-called roll call data

Traditional approach (not using text as data): ideal point model [Clinton et al., 2004]

  • The utility of a concrete vote (Yea or Nay) of a legislator on a piece of legislation: noise-augmented Euclidean distance between the legislator's IP and the vote's point in the policy space:

    U_i(ζ_j) = −‖x_i − ζ_j‖² + η_ij    (ζ_j and η_ij: position and noise of the Yea vote)

    U_i(ψ_j) = −‖x_i − ψ_j‖² + ν_ij    (ψ_j and ν_ij: position and noise of the Nay vote)

  • IPs of legislators and (Yea/Nay vote points of) legislation are parameters to estimate

142 of 239

Positioning: Legislation

The overarching task is predicting the position of an actor with respect to a particular piece of legislation in (parliamentary) voting

  • The so-called roll call data

Traditional approach (not using text as data): ideal point model [Clinton et al., 2004]

  • Corresponds to a probit model with an unobserved regressor xi (the i-th legislator's ideal point) and vote-point-specific parameters for the j-th bill

143 of 239

Positioning: Legislation

The overarching task is predicting the position of an actor with respect to a particular piece of legislation in (parliamentary) voting

  • The so-called roll call data

Traditional approach (not using text as data): ideal point model [Clinton et al., 2004]

  • Parameters (IPs) and Yea/Nay vote points of legislation are estimated by maximizing the likelihood of the observed votes (the y variables):
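
For reference, the standard probit form of this model (Φ is the standard normal CDF; notation follows [Clinton et al., 2004]):

$$\Pr(y_{ij}=1)=\Phi\!\left(\beta_j^{\top}x_i-\alpha_j\right),\qquad \mathcal{L}=\prod_{i,j}\Phi\!\left(\beta_j^{\top}x_i-\alpha_j\right)^{y_{ij}}\left[1-\Phi\!\left(\beta_j^{\top}x_i-\alpha_j\right)\right]^{1-y_{ij}}$$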

144 of 239

Positioning: Legislation

The overarching task is predicting the position of an actor with respect to a particular piece of legislation in (parliamentary) voting

  • The so-called roll call data

Traditional approach (not using text as data): ideal point model [Clinton et al., 2004]

  • Parameter estimation: Markov Chain Monte Carlo simulation

  • Base model: the legislators are mutually independent and so are the roll calls
  • It is possible to augment the model for additional effects
    • Party effects [Clinton and Meirowitz, 2001]
    • Vote trading and cue taking: making utilities of legislators inter-dependent
    • Additional information on legislators and legislation, coming from text

145 of 239

Predicting Legislative Roll Calls from Text

Text-based extension of the ideal point model (IPM) [Clinton et al., 2004]

Fundamental limitation of IPM as a predictive model:

  • It is a model of the vote itself
  • Can be used to fill in missing votes on past legislation
  • Cannot be used to predict votes on future legislation

Connect the votes of the legislator with the text of the bill [Gerrish & Blei, 2011]

  • Ideal point topic model (IPTM)
  • Based on the text of the future bill, one can predict legislator’s vote

146 of 239

Ideal Point Topic Model (Gerrish & Blei, ICML 11)

Ideal Point Topic Model: Effectively combines the IPM and topic modeling

Simplified view of the IPM (logistic regression with random effects):

  • xu: ideal point of the legislator u (latent)
  • vud: vote of the legislator u on bill d (observed)
  • bd: bill position (polarity)
  • ad: bill difficulty (popularity)

  • Votes (vud) depend on the bill variables (ad, bd) and the legislator's ideal point (xu)
  • The bill variables ad, bd depend on the bill content (i.e., the topics zdn)

147 of 239

Ideal Point Topic Model (Gerrish & Blei, ICML 11)

Ideal Point Topic Model: More complex variant of the supervised LDA [Blei & McAuliffe, 2008]

  • In sLDA, the response variables (labels) are observable
  • Here, they are latent bill variables: ad and bd

  • Estimation: variational inference (direct computation of the posterior is intractable)
  • Inference (for new bills): the per-word topic distribution informs ad and bd, which together with xu determine vud

148 of 239

Votes, Bill Text, and...

Ideal Point Model [Clinton et al., 2004]

  • Data: just the votes

Ideal Point Topic Model [Gerrish & Blei, 2011]

  • Data: votes + bill text

What additional data could improve roll call predictions?

  • Interactions between legislators (speeches) [Thomas et al., 06]
  • Additional information on the legislators (e.g., party) [Kornilova et al., 18]

149 of 239

Positioning and Legislation: (Dis)Agreement

[Thomas et al., 2006] “Get out the vote: Determining support or opposition from Congressional floor-debate transcripts”

  • Goal: predict whether a speech supports or opposes a legislative proposal
  • Idea: besides comparing the speech and the proposal, exploit the discourse structure of parliamentary debates
    • Speeches relate to (i.e., reply to) other speeches
    • By estimating (dis)agreements between speeches, one can better estimate alignment with the legislation
    • Essentially a sentiment analysis task: positive or negative sentiment towards a proposal

150 of 239

Legislative positioning via (dis)agreement [Thomas et al., 2006]

A debate is given as a sequence of speeches: s1, s2, …, sn

Approach:

  1. Isolation-based speech classification
    • Linear SVM with unigram features, trained for binary (sentiment) classification
  2. Constraints (links) between speeches
    • A weighted graph is induced, with speeches as nodes and edge weights encoding the constraints
    • Same-speaker constraints: the weight of the edge connecting speeches of the same speaker is infinity
    • Different-speaker agreement: (1) identify references, (2) decide: agreement or disagreement
      • Reference identification: simply, by name
      • Agreement classification: SVM classifier, with the reference context as input
  3. Global optimization, with the following objective (solved efficiently by min-cut):
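
A sketch of that objective, in the form popularized by [Pang & Lee, 2004]: every speech s pays the individual classifier cost of the class it is not assigned, plus a penalty str(s, s′) for every linked pair split across classes; minimizing this total cost is equivalent to finding a minimum cut in the constructed graph:

$$\min_{c}\;\sum_{s}\mathrm{ind}_{\bar{c}(s)}(s)\;+\sum_{s,s':\,c(s)\neq c(s')}\mathrm{str}(s,s')$$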

151 of 239

Predicting roll calls with embeddings

Discriminative models for predicting roll call votes

[Kraft et al., ACL 16]

  • Bilinear model: multi-dimensional position vectors for legislators (vc)
  • Bill representation: linear transformation (W) of the word embedding average

  • More parameters to represent the legislator
    • IPM and IPTM used only a single score
    • vc -- 10-dimensional vectors
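
Schematically (a sketch rather than the authors' exact equation; σ is the logistic function, ē_d the average word embedding of bill d, and bias terms are omitted):

$$P(v_{cd}=\text{yea})=\sigma\!\left(\mathbf{v}_c^{\top}\,W\,\bar{\mathbf{e}}_d\right)$$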

152 of 239

Ideal points vs. ideal vectors

IPM/IPTM vs. ideal vectors:

1-dimensional positions (image from [Gerrish & Blei, 2011]) vs. a 2D PCA projection of 10-dimensional vectors (image from [Kraft et al., 2016])

153 of 239

Ideal vectors

Source: [Kraft et al., ACL 16]

154 of 239

Predicting roll calls with embeddings

Discriminative models for predicting roll call votes

[Kornilova et al., ACL 18]

  • Extension of the bilinear model of [Kraft et al., ACL 16] with party information
  • Base model, without party information

  • Party information: percentage of R/D sponsors of the bill, pr and pd
  • Bill embedding, vB, is an encoding obtained with CNN (not word emb. avg)

155 of 239

Party Matters

Accuracy results show that

  • Better text representations (CNN as opposed to averaging pre-trained word embeddings)
  • Meta-data, i.e., sponsor information

provide a better bill representation

Source: [Kornilova et al., ACL 18]

156 of 239

Are Votes Reliable Labels? [Abercrombie & Batista-Navarro, 2018; 2019]

“Rebel speeches” are only a small minority, so in general you can pair text and votes, but be aware that these are not gold labels.

157 of 239

Positioning: Topics

Topic-based positioning refers to a broad body of work in which a stance or position of the political actor is to be determined for one or more topics

Concrete task definitions depend on several factors:

  • Topic definition: explicit, implicit?
  • Corpora: with or without topic annotations?
  • Multiple topics: detecting positions independently or jointly for multiple topics?

Pipeline approach: first topic detection, then per-topic positioning

[Menini et al., 2016; Nanni et al., 2016; Menini et al., 2017]

Joint approach: joint detection of (multiple) topics and positions

[Lin et al., 2008; Fang et al., 2012; Trabelsi et al., 2014; Thonet et al., 2016]

158 of 239

Topical positioning: pipelined approaches

Unlike the joint models, which induce the topics and positions/viewpoints jointly, “pipeline approaches” determine positions or (dis)agreements for known topics

Topics are either pre-defined [Nanni et al., 2016] or induced in a pre-processing step [Menini et al, 2016, 2017]

[Nanni et al., 2016] Topic-Based Analysis of Political Position in US Electoral Campaigns

  • Predefined topics: top-level topics from the CMP
  • For each topic, create a topic-filtered version of the corpus (i.e., for each manifesto, keep only the sentences labeled with that topic)
  • Used Wordfish [Slapin & Proksch, 2008] to induce positions for each topic

159 of 239

Topical positioning: pipelined approaches

[Menini et al., EMNLP 17] Topic-Based (Dis)Agreement in US Electoral Manifestos

Pipeline:

  1. Supervised coarse-grained domain (macro-topic) classification [Zirn et al., 2017]

For each domain, unsupervised topic induction (but not with topic models!):

  2. Key concept extraction, rule-based [Moretti et al., 2015]
  3. Key concept clustering (a cluster is a topic)
    • Soft graph-based clustering, similarities based on word embeddings

  4. For each topic (cluster of keywords) t:
  5. Couple sentences about t from Democratic manifestos with sentences about t from Republican manifestos
  6. Annotate topical agreement/disagreement and train a supervised classifier

160 of 239

Topical positioning: pipeline [Menini et al., EMNLP 17]

161 of 239

Jointly detecting topics and positions

The so-called Topic Models for Viewpoint Extraction:

  • The generated text is a result of (1) the topics the author chooses to talk about and (2) positions (typically pro and con) the author holds
  • Evaluation (as always with topic models) is an issue: perplexity-based measures

Joint Topic and Perspective Model (JTPM) [Lin et al., 2008]

  • Detection of one global (ideological?) position and topics

Joint Topic Viewpoint Model (JTVM) [Trabelsi et al., 2014]

  • Detection of topics and viewpoints towards each of the topics

Viewpoint and Opinion Discovery Unification Model (VODUM) [Thonet et al., 2016]

  • Joint topic and viewpoint detection, with two types of observables (words)

162 of 239

Joint Topic and Perspective Model [Lin et al., 2008]

  • V viewpoints, each with its own parameter vector (sampled from a multivariate normal distribution)
  • Topic 𝜏: sampled from a multivariate normal distribution
  • Words sampled from a multinomial distribution 𝛽 over the vocabulary V
    • 𝛽 is not a latent variable: it is deterministically computed from the topic and viewpoint vectors

163 of 239

Joint Topic and Perspective Model [Lin et al., 2008]

  • A document's ideological position is given by the Bernoulli variable Pd

Limitations of JTPM

  • Models only a single topic 𝜏: clean “single-topic corpora” hard to obtain
  • Models only two opposite (ideological?) positions (similar to 1-D text scaling)
    • Pd is a Bernoulli variable
    • Documents need to be divided into the two “ideological” classes (i.e., annotated)

164 of 239

Joint Topic and Perspective Model [Lin et al., 2008]

Red: Israeli authors

Blue: Palestinian authors

Red: Democratic authors

Blue: Republican authors

Images from [Lin et al., 2008]

165 of 239

Joint Topic Viewpoint Model [Trabelsi et al., 2014]

Extension of LDA to incorporate viewpoints

  • K topics, each of which is a multinomial distribution over L viewpoints
  • Each viewpoint (of each topic) is a multinomial distribution over terms

One can compare the positions by:

  • Selecting some topic k
  • Comparing the document-specific viewpoint multinomials 𝜓dk for topic k

166 of 239

Joint Topic Viewpoint Model [Trabelsi et al., 2014]

Generative story

  • For each topic k and each viewpoint l draw (Dir(𝛽)) a multinomial over the vocabulary V
  • For each document d
    • Draw a multinomial topic mixture (sample from Dir(𝛼))
    • For each topic k, draw a (document-specific) viewpoint mixture multinomial (Dir(𝛾))
    • For each word (i.e., position)
      • Sample a topic zdn
      • From zdn sample a viewpoint vdn
      • Sample a word from the multinomial distribution over terms for the topic zdn and viewpoint vdn

167 of 239

Agenda

Text

RQs

Tasks

Topic Detection

Positioning

Scaling

Multilinguality / CL Transfer

168 of 239

Political Text Scaling

Arguably the most prominent task in the text-as-data PolSci community

Task definition: Given a set of political actors and for each of them an aggregated collection of (political) text they produce, predict (numerically) their (typically relative) positions on a 1-dimensional scale

  • In most cases, the aim is the left-to-right ideological scaling, even though...

169 of 239

Political Text Scaling

Task definition: Given a set of political actors and for each of them an aggregated collection of (political) text, predict (numerically) their (typically relative) positions on a 1-dimensional scale (i.e., a regression task)

  • In most cases, the aim is the left-to-right ideological scaling, even though...

“substantive content of a “left-right” dimension varies significantly across different contexts, to such an extent that “it may be impossible for any single scale to measure this dimension in a manner that can be used for reliable or meaningful cross-national comparison” [Benoit & Laver, 2006]

  • (Weakly) supervised scaling: texts annotated with ideological scores
  • Unsupervised scaling: no annotations, just text collections

170 of 239

Same Politician or Not?

"It is unbearable, when refugee homes are attacked, when people try to make radical speeches. All those who come to us have the right to be treated correctly, to have a proper asylum procedure. That's our rule of law and we are proud of it."

"Sometimes I feel ashamed, when I see how the question of refugees is being discussed in our countries, when just 30,000 or 40,000 refugees are arriving in a country of 82 million while here you have 120,000 refugees in a town of 100,000 inhabitants, and when I see how they are being taken care of, I have to take my hat off to them."

171 of 239

Same Politician or Not?

"It is unbearable, when refugee homes are attacked, when people try to make radical speeches. All those who come to us have the right to be treated correctly, to have a proper asylum procedure. That's our rule of law and we are proud of it."

"Sometimes I feel ashamed, when I see how the question of refugees is being discussed in our countries, when just 30,000 or 40,000 refugees are arriving in a country of 82 million while here you have 120,000 refugees in a town of 100,000 inhabitants, and when I see how they are being taken care of, I have to take my hat off to them."

172 of 239

Same Politician or Not?

"It is unbearable, when refugee homes are attacked, when people try to make radical speeches. All those who come to us have the right to be treated correctly, to have a proper asylum procedure. That's our rule of law and we are proud of it."

"Sometimes I feel ashamed, when I see how the question of refugees is being discussed in our countries, when just 30,000 or 40,000 refugees are arriving in a country of 82 million while here you have 120,000 refugees in a town of 100,000 inhabitants, and when I see how they are being taken care of, I have to take my hat off to them."

173 of 239

Scaling Methods

Term-based scaling

  1. Wordscores [Laver et al., 2003]
    • Supervised: requires positions for some number of texts
    • Word positions computed based on occurrences in labeled documents
    • Word positions combined to infer positions of unlabeled documents

2. Wordfish [Slapin & Proksch, 2008; Proksch & Slapin, 2010]

  • Unsupervised: only text collection as input
  • Word counts modeled with a Poisson distribution, whose rate combines (prior) word and party effects with the product of the word weight and the text position
  • EM-like algorithm for parameter estimation

174 of 239

Scaling Methods

Semantic scaling

3. SemScale [Glavaš et al., 2017; Nanni et al., 2019]

  • Unsupervised: only text collection as input
  • Relies on semantic representations of words, i.e., word embeddings
  • Positions induced via semantic similarity and graph-based label propagation

4. Party2Vec [Rheault & Cochrane, 2019]

  • Unsupervised: only text as input
  • Introduce special party tokens (e.g., “Dem_92”) and insert them into party texts
  • Train word embeddings with the SkipGram model [Mikolov et al., 2013]
  • For scaling: project party embeddings into a single score (1-dim with PCA)

175 of 239

Wordscores [Laver et al., 2003]

Image from [Laver et al., 2003]

176 of 239

Wordscores [Laver et al., 2003]

Generate wordscores from reference texts

  • Reference texts r have a priori scores Sr
  • P(w|r) = count(w, r) / length(r): unigram LM probability of w in r
  • Compute posteriors P(r|w): probability of reading text r if seeing word w
    • P(r|w) = P(w|r) / Σr′ P(w|r′), assuming a uniform prior over reference texts

  • Wordscore Sw is the posterior-weighted sum of positions of reference texts: Sw = Σr P(r|w) · Sr
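A minimal numpy sketch of this training step, following the definitions above; the three-word vocabulary, counts, and reference scores are toy values:

    import numpy as np

    # Rows: reference texts; columns: vocabulary terms; entries: raw counts (toy data)
    counts = np.array([[10., 0., 5.],
                       [ 2., 8., 5.]])
    S_ref = np.array([-1.0, 1.0])      # a priori scores Sr of the reference texts

    P_w_given_r = counts / counts.sum(axis=1, keepdims=True)   # unigram LM per text
    P_r_given_w = P_w_given_r / P_w_given_r.sum(axis=0)        # posterior P(r|w), uniform prior
    S_word = S_ref @ P_r_given_w                               # wordscores Sw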

177 of 239

Wordscores [Laver et al., 2003]

Computing the scores of unlabeled texts

  • The score of a new (virgin) text is then simply the frequency-weighted sum of the wordscores Sw of its words

  • Transform the scores of the virgin texts so that they have the same dispersion as the reference scores
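Continuing the sketch above: scoring and rescaling virgin texts. The rescaling shown is one common variant of the dispersion transformation, so treat it as an assumption:

    virgin_counts = np.array([[4., 4., 2.],
                              [1., 9., 0.]])                   # two unlabeled texts
    F = virgin_counts / virgin_counts.sum(axis=1, keepdims=True)
    S_raw = F @ S_word                                         # frequency-weighted sums of Sw
    # Rescale so the virgin scores match the dispersion of the reference scores
    S_virgin = (S_raw - S_raw.mean()) * (S_ref.std() / S_raw.std()) + S_raw.mean()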


179 of 239

Wordscores [Laver et al., 2003]

Source: Laver et al. 2003

180 of 239

Wordscores [Laver et al., 2003]

Shortcomings of Wordscores:

  • Needs reference texts, labeled with position scores

  • Vocabulary determined by the reference texts
    • Any word not in reference texts has no effect on the position of the new document

  • Global frequencies of words (prior “word effects”) are ignored
    • Stopwords pull document scores towards the “mean” score

181 of 239

Wordfish [Slapin & Proksch, 2008]

Aims to remedy the shortcomings of Wordscores

  • Unsupervised: no need for position annotated reference texts
  • Account for word effects (i.e., global frequency of terms)

Model: A “Poisson naive Bayes”

  • Word frequencies in documents (observables) sampled from Poisson distributions
  • Distributions (word appearances and frequencies) independent of each other (NB assumption)

  • Distribution parameter λ: prior word/text effects and word/text positions

182 of 239

Wordfish [Slapin & Proksch, 2008]

Model: A “Poisson naive Bayes”

  • Distribution parameter λ: prior word/text effects and word/text positions

  • αi : Prior effect of the text (party/politician)
  • ψj : Prior effect of the word (e.g., “the” less relevant than “racist”)
  • βj : Word-specific weight
    • Specifies word importance for discriminating between text positions
  • ωi : Position of the text (party/politician)
    • This is what we are ultimately interested in
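Putting the four parameters together, the Poisson rate for word j in document i (the standard Wordfish functional form) is:

    yij ~ Poisson(λij),   λij = exp(αi + ψj + βj · ωi)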

183 of 239

Wordfish [Slapin & Proksch, 2008]

Model: A “Poisson naive Bayes”

  • Distribution parameter λ: prior word/text effects and word/text positions

Parameter estimation

  1. Initialization (starting values)
    • αi : logarithm of the collection-normalized document length
    • ψj : logarithm of the mean frequency of the word across all documents
    • βj and ωi : based on the word-document co-occurrence matrix C (log frequencies)
      • Each element Cij corrected for αi and ψj (αi and ψj subtracted from Cij)
      • SVD of the corrected C: βj and ωi set to values from the left and right singular vectors, respectively

184 of 239

Wordfish [Slapin & Proksch, 2008]

Model: A “Poisson naive Bayes”

  • Distribution parameter λ: prior word/text effects and word/text positions

Parameter estimation

2. Iterative estimation (EM-like algorithm)

  a. Fix word parameters (ψj and βj) and estimate document (party) scores (αi and ωi), by maximizing the log-likelihood over all vocabulary words, for each document i

185 of 239

Wordfish [Slapin & Proksch, 2008]

Model: A “Poisson naive Bayes”

  • Distribution parameter λ: prior word/text effects and word/text positions

Parameter estimation

2. Iterative estimation (EM-like algorithm)

  b. Fix party parameters (αi and ωi) and estimate word scores (ψj and βj), by maximizing the log-likelihood over all documents, for each word j

186 of 239

Wordfish: alternative optimization

Model: A “Poisson naive Bayes”

  • Distribution parameter λ: prior word/text effects and word/text positions

It is unclear why Proksch & Slapin propose such a two-step EM-like optimization

  • Alternative [Lowe, 2016; Glavaš et al., 2017]:
    • Minimize global negative log-likelihood for the whole collection
    • Optimize parameters via gradient descent
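A minimal sketch of this alternative, using scipy's L-BFGS in place of hand-rolled gradient descent; identification constraints (e.g., fixing the mean and variance of the ω's) are omitted, so treat this as illustrative only:

    import numpy as np
    from scipy.optimize import minimize

    def wordfish_nll(params, C):
        """Global negative Poisson log-likelihood (up to the constant log C! term).
        C: documents x words count matrix."""
        n_docs, n_words = C.shape
        alpha = params[:n_docs]                          # document effects
        omega = params[n_docs:2 * n_docs]                # document positions
        psi = params[2 * n_docs:2 * n_docs + n_words]    # word effects
        beta = params[2 * n_docs + n_words:]             # word weights
        log_lam = alpha[:, None] + psi[None, :] + np.outer(omega, beta)
        return np.sum(np.exp(log_lam) - C * log_lam)

    rng = np.random.default_rng(0)
    C = rng.poisson(3.0, size=(5, 50)).astype(float)     # toy count matrix
    x0 = rng.normal(scale=0.1, size=2 * 5 + 2 * 50)      # random init (all zeros would stall)
    res = minimize(wordfish_nll, x0, args=(C,), method="L-BFGS-B")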

187 of 239

Wordfish [Slapin & Proksch, 2008]

Scale ends (parties closest to the ends of the scale), 5th legislature of the EP

Key question: what does the scale capture?

  • Left-right ideology? Pro vs. anti EU (EU integration position)? Both?

188 of 239

Wordfish [Slapin & Proksch, 2008]

Source: Slapin & Proksch, 2008

189 of 239

Wordfish [Slapin & Proksch, 2008]

Wordfish results (positions) seem stable across languages

  • Note: not cross-lingual scaling, merely independent monolingual scalings

190 of 239

Wordfish [Slapin & Proksch, 2008]

Wordfish implementations:

R implementation (Slapin & Proksch)

http://www.wordfish.org/uploads/1/2/9/8/12985397/wordfish_1.3.r

R implementation (Will Lowe)

https://conjugateprior.github.io/austin/reference/wordfish.html

Python implementation (Goran Glavaš)

https://github.com/codogogo/topfish/blob/master/wordfish.py

191 of 239

Semantic scaling

Term-based scaling methods like Wordscores and Wordfish:

  • No semantics, effectively sparse text representations
  • Are inherently monolingual

The need to go beyond surface forms:

  • “bad hombres…” and “terrible guys…” should map to the same position

192 of 239

SemScale [Glavaš et al., 2017]

First scaling method to use semantic text representation

Approach:

  1. Measure semantic similarity between all pairs of documents
    • Similarity measures based on word embeddings
    • This induces a fully-connected weighted graph
  2. Label the pair of most dissimilar texts as pivots: extreme positions of the spectrum
  3. Propagate labels over the graph
    • Using a graph-based label propagation algorithm
  4. Rescale the pivot texts

193 of 239

SemScale [Glavaš et al., 2017]

Two different unsupervised measures of semantic textual similarity

Alignment similarity

  • Greedily pairs words between two texts based on their embedding similarity

Aggregation similarity

  • Similarity between aggregated documents vectors (averaged word embeddings)
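A minimal sketch of both measures, assuming emb is a dict of pre-trained word embeddings; the actual SemScale implementation (github.com/umanlp/SemScale) differs in details:

    import numpy as np

    def aggregation_similarity(doc1, doc2, emb):
        """Cosine similarity between averaged word embeddings of two documents."""
        v1 = np.mean([emb[w] for w in doc1 if w in emb], axis=0)
        v2 = np.mean([emb[w] for w in doc2 if w in emb], axis=0)
        return v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))

    def alignment_similarity(doc1, doc2, emb):
        """Greedily pair the most similar remaining words; average pair similarities."""
        m1 = np.array([emb[w] / np.linalg.norm(emb[w]) for w in doc1 if w in emb])
        m2 = np.array([emb[w] / np.linalg.norm(emb[w]) for w in doc2 if w in emb])
        sims = m1 @ m2.T
        total, n = 0.0, min(len(m1), len(m2))
        for _ in range(n):
            i, j = np.unravel_index(np.argmax(sims), sims.shape)
            total += sims[i, j]
            sims[i, :], sims[:, j] = -np.inf, -np.inf    # remove matched words
        return total / n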

194 of 239

SemScale [Glavaš et al., 2017]

Pairwise similarities induce a weighted fully-connected graph

Assumption: two most dissimilar texts (pivots) represent position extremes: −1 and 1

Inducing scores for other nodes

  • Graph-based label propagation [Zhu & Goldberg, 2009]:

195 of 239

SemScale [Glavaš et al., 2017]

Graph-based label propagation

  • Harmonic Function Label Propagation (HFLP) [Zhu & Goldberg, 2009]:

Let L = D − W be the unnormalized Laplacian of the graph (W: weighted similarity matrix, D: diagonal degree matrix)

  • If we order labeled nodes before the unlabeled ones, L is partitioned into blocks Lll, Llu, Lul, Luu

  • The scores of the unlabeled nodes are then obtained analytically as: fu = −Luu⁻¹ Lul yl

  • yl is the vector of scores of labeled nodes, in our case yl = [−1, 1]ᵀ
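A minimal numpy sketch of HFLP on a toy similarity graph (the four-node matrix is made up):

    import numpy as np

    def hflp(W, labeled_idx, y_l):
        """Harmonic function label propagation: solve for scores of unlabeled nodes."""
        n = W.shape[0]
        L = np.diag(W.sum(axis=1)) - W                   # unnormalized Laplacian D - W
        u = np.setdiff1d(np.arange(n), labeled_idx)
        f = np.zeros(n)
        f[labeled_idx] = y_l
        L_uu = L[np.ix_(u, u)]
        L_ul = L[np.ix_(u, labeled_idx)]
        f[u] = -np.linalg.solve(L_uu, L_ul @ np.asarray(y_l, float))
        return f

    # toy graph: 4 texts; pivots (most dissimilar pair) are nodes 0 and 3
    W = np.array([[0., .9, .4, .1],
                  [.9, 0., .5, .2],
                  [.4, .5, 0., .8],
                  [.1, .2, .8, 0.]])
    scores = hflp(W, [0, 3], [-1.0, 1.0])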

196 of 239

Political Text Scaling: Evaluation

Scaling algorithms produce scores in a single dimension

  • What is the meaning of that dimension?
  • A posteriori substantive analyses
    • Political scientists try to identify a meaningful dimension aligned with scaling results

Example: Wordfish for scaling EU parties from EP speeches [Proksch & Slapin, 2010]

    • Scores produced by Wordfish correlate better with positions on EU integration than with left-right ideological positions

197 of 239

Political Text Scaling: Evaluation

Scaling algorithms produce scores in a single dimension

  • What is the meaning of that dimension?

Gold position scores for the dimension of interest?

  • Chapel Hill Expert Surveys [Bakker et al., 2015]
    • Panels of PolSci experts judge ideological and EU positions of parties

  • But...are these really gold standard positions for the texts that algorithms scale?
    • Experts do not rate parties after reading all of their speeches
    • Rather, they rate them based on their prior knowledge
    • Experts' political biases are encoded in the scores

198 of 239

Political Text Scaling: Evaluation

Task definition: Scaling texts for two different political dimensions: (i) left-to-right ideological position; (ii) position on European integration

Evaluation metrics:

  • Pairwise Accuracy (PA), i.e., the percentage of party pairs ordered the same way as in the gold standard (see the sketch below)
  • Spearman and Pearson correlation between the two sets of positions
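A minimal sketch of PA (the function name and the tie handling are illustrative):

    from itertools import combinations

    def pairwise_accuracy(pred, gold):
        """Share of party pairs ranked in the same order by predicted and gold scores."""
        pairs = list(combinations(range(len(gold)), 2))
        agree = sum((pred[i] - pred[j]) * (gold[i] - gold[j]) > 0 for i, j in pairs)
        return agree / len(pairs)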

Baselines:

  • Term-based Wordfish model (monolingual setting only)
  • Random positioning (sanity check)

199 of 239

Political Text Scaling: Evaluation

“Errors”:

  • Due to the limitations of the scaling model, or
  • Due to the texts reflecting positions on multiple (not just one!) dimensions?

[Plots: correlation of produced positions with Chapel Hill left-right ideology scores and with Chapel Hill EU integration positions]

200 of 239

SemScale: Demo, Tool and Appendix

http://tools.dws.informatik.uni-mannheim.de/semScale

201 of 239

SemScale: Demo, Tool and Appendix

https://github.com/umanlp/SemScale

https://federiconanni.com/semantic-scaling/

202 of 239

Topical Scaling

If we want to interpret the positions as relating to a certain dimension or topic, we need to filter out the content irrelevant to that dimension

  • Topical classification as a preprocessing step?

TopFish [Nanni et al., 2016]

  • Topical classification of US electoral speeches
  • Per-topic scaling with Wordfish

[Per-topic scaling plots: all topics, External Relations, Welfare & Quality of Life]

203 of 239

Party2Vec [Rheault & Cochrane, 2019]

Simple extension of the Skip-Gram model [Mikolov et al., 2013]

  • An artificial “document identifier” token added to every context used for training the Skip-Gram model
  • Embeddings obtained for those tokens are party/document representations

204 of 239

Party2Vec [Rheault & Cochrane, 2019]

Simple extension of the Skip-Gram model [Mikolov et al., 2013]

  • An artificial “document identifier” token added to every training context
  • Embeddings obtained for those tokens are party/document representations
    • “Party embeddings” are then multidimensional (like word embeddings, e.g., 300-dim)
    • Projected to a lower-dimensional space with PCA: for scaling, only the first principal component

Image from [Rheault & Cochrane, 19]:

2D PCA projection of party vectors
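A rough sketch: tagging every training context with a party identifier is essentially what gensim's Doc2Vec does with document tags, so one can approximate the method as below (party IDs, toy speeches, and hyperparameters are hypothetical):

    import numpy as np
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument
    from sklearn.decomposition import PCA

    speeches = [("Dem_92", ["we", "support", "healthcare", "reform"]),
                ("Rep_92", ["we", "support", "tax", "cuts"])]

    # The party tag plays the role of the artificial "document identifier" token
    docs = [TaggedDocument(words=toks, tags=[party]) for party, toks in speeches]
    model = Doc2Vec(docs, vector_size=100, window=5, min_count=1, epochs=50)

    party_vecs = np.array([model.dv[p] for p in ("Dem_92", "Rep_92")])
    scores = PCA(n_components=1).fit_transform(party_vecs)   # 1-dim scaling scores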

205 of 239

Agenda

Text

RQs

Tasks

Topic Detection

Positioning

Scaling

Multilinguality / CL Transfer

206 of 239

Multilinguality

Political analysts compare actors from different countries

  • Most content in native languages

Crossing the language chasm

Old paradigm:

  • Language-specific NLP models
  • Language-specific feature computation (i.e., preprocessing)

New paradigm:

  • Representation learning: inputs are semantic vectors (embeddings)
  • Multilingual / cross-lingual rep. learning

207 of 239

Crossing the Language Chasm

  1. Full-Blown MT (SMT or NMT)
    • Parallel data needed, critical for under-resourced languages
    • Translate everything from the target language to the source language
    • Unsupervised NMT?

208 of 239

Crossing the Language Chasm

  • Full-Blown MT (SMT or NMT)
  • Parallel data needed, critical for under-resourced languages
  • Translate everything from the target language to the source language
  • Unsupervised NMT?

2. Multilingual KBs

  • Texts represented using entities from a multilingual KB
  • Same entity ID for same concepts across languages
  • Issues: coverage, entity linking

209 of 239

Crossing the Language Chasm

3. Multilingual / cross-lingual representations of meaning

  • Word-level
    • Cross-lingual word embeddings
    • Words with similar meaning across languages have similar vectors

  • Sentence- / paragraph-level
    • Most recent developments
    • Multilingual unsupervised pre-training [Lample & Conneau, 2019; Devlin et al., 2019]

Image from [Luong et al., 2015]

210 of 239

CLWE: post-hoc alignment

Monolingual embeddings independently constructed

Post-hoc aligning monolingual spaces

X is the distributional (embedding) space of L1, Y that of L2

  • We are looking for mapping functions f and g such that f(X) and g(Y) form a meaningful shared bilingual embedding space

Image from [Conneau et al., 2018]
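A sketch of the common Procrustes-style alignment (used, e.g., in the supervised variant of [Conneau et al., 2018]), assuming a seed dictionary of translation pairs:

    import numpy as np

    def procrustes_align(X, Y):
        """Learn an orthogonal map W minimizing ||X W - Y||_F.
        Rows of X and Y are embeddings of seed translation pairs (L1 word, L2 word)."""
        U, _, Vt = np.linalg.svd(X.T @ Y)
        return U @ Vt        # apply to all L1 vectors: X_all @ W lies in the L2 space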

211 of 239

CLWE: PolSci applications

Cross-lingual word embeddings (CLWEs) allow for

  • Semantic comparison of texts in different languages
  • Cross-lingual transfer of NLP models
    • Resource-rich training language, resource-poor target language

Cross-lingual text scaling with SemScale [Glavaš et al., 2017a]

  • Nothing changes for the SemScale algorithm
  • Semantic similarities between texts based on vectors from the CLWE space

Cross-lingual topic classification of manifestos [Glavaš et al., 2017b]

  • A lot of topic-annotated data in EN and DE, much less in other languages

212 of 239

Cross-lingual manifesto topic classification [Glavaš et al., 2017b]

Simple classification model:

  • Embeddings (from a CLWE space) as input
  • Convolutional network as the encoder
  • Softmax classifier

Manifesto top-level topics: 7 labels

  • Train/test sets in 4 langs: EN, DE, FR, IT
  • EN & DE datasets much larger than FR & IT

Two models

  • Mono-L: training data of one language
  • Cross-L: concatenation of all training data

https://github.com/codogogo/topfish
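A minimal PyTorch sketch of such a model (all hyperparameters hypothetical; the authors' actual implementation is in the repo above):

    import torch
    import torch.nn as nn

    class CNNTopicClassifier(nn.Module):
        """CNN encoder over (cross-lingual) word embeddings + softmax classifier."""
        def __init__(self, emb_matrix, n_labels=7, n_filters=100, width=3):
            super().__init__()
            self.emb = nn.Embedding.from_pretrained(emb_matrix, freeze=True)
            self.conv = nn.Conv1d(emb_matrix.size(1), n_filters, kernel_size=width)
            self.out = nn.Linear(n_filters, n_labels)

        def forward(self, token_ids):                        # (batch, seq_len)
            x = self.emb(token_ids).transpose(1, 2)          # (batch, emb_dim, seq_len)
            h = torch.relu(self.conv(x)).max(dim=2).values   # max-pooling over time
            return self.out(h)                               # logits; softmax lives in the loss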

213 of 239

Conclusion

Text

RQs

Tasks

Topic Detection

Positioning

Scaling

Multilinguality / CL Transfer

214 of 239

Conclusion

Computational analysis of political text: a vibrant interdisciplinary research area

  • Natural language processing
  • Political science

Despite the interdisciplinary nature of the work

  • Most research efforts from one OR the other community
  • Unexpectedly disjoint communities

Different lines of work on similar and closely related tasks

  • Positioning vs. scaling

Bridging efforts between communities paramount for more interdisciplinary work

  • This tutorial is our small contribution towards that goal

215 of 239

Thanks!

Goran

Federico

Simone

216 of 239

Interested in Computational Political Science?

We are hiring, get in touch!

{federico,goran,simone}@informatik.uni-mannheim.de

217 of 239

218 of 239

3 Days Text Scaling Hackathon (Dec. 2017)

219 of 239

3 Days Text Scaling Hackathon (Dec. 2017)

Supported by Villa Vigoni, the German-Italian Centre for European Excellence and by DFG.

23 young researchers from political science, computational social science and NLP.

220 of 239

3 Days Text Scaling Hackathon (Dec. 2017)

Participants from:

Mannheim, Bruno Kessler Found., Unitelma, Sheffield, Duisburg-Essen, GESIS, Scuola Normale Superiore, EUI, Scuola Superiore Sant'Anna, Bocconi, Leipzig, Zagreb, LSE, Alan Turing Institute, Edinburgh, CEU, Toronto.

221 of 239

Joint Work With:

Goran Glavas

Simone Paolo Ponzetto

Sara Tonelli

Nicolò Conti

222 of 239

What is a Hackathon?

A coding-intensive collaborative workshop.

223 of 239

What is a Hackathon?

A coding-intensive collaborative workshop.

Why? To boost collaboration across disciplines.

224 of 239

What is a Hackathon?

A coding-intensive collaborative workshop.

Why? To boost collaboration across disciplines.

Shared-Task? Participants were divided into 5 groups and had to work together towards a specific goal.

225 of 239

European-Integration Scaling

The task: develop a method for scaling text on the EU integration dimension. We provide participants with:

226 of 239

European-Integration Scaling

The task: develop a method for scaling text on the EU integration dimension. We provide participants with:

  1. Manually translated speeches made by 25 parties from France, Germany, Italy, Spain and UK at the EuroParl (1999 - 2017).

227 of 239

European-Integration Scaling

The task: develop a method for scaling text on the EU integration dimension. We provide participants with:

  • Manually translated speeches made by 25 parties from France, Germany, Italy, Spain and UK at the EuroParl (1999 - 2017).
  • A gold standard (Chapel Hill) of EU integration party-positions (leg: 5, 7, 8)

228 of 239

European-Integration Scaling

The task: develop a method for scaling text on the EU integration dimension. We provide participants with:

  • Manually translated speeches made by 25 parties from France, Germany, Italy, Spain and UK at the EuroParl (1999 - 2017).
  • A gold standard (Chapel Hill) of EU integration party-positions (leg: 5, 7, 8)
  • On the last evening, a test set (leg: 6)

229 of 239

European-Integration Scaling

The output: party positions for the 6th legislature regarding European integration (between 0: strongly against and 1: strongly in favour).

230 of 239

European-Integration Scaling

The output: party positions for the 6th legislature regarding European integration (between 0: strongly against and 1: strongly in favour).

All data available at: https://federiconanni.com/hack-vigoni/

231 of 239

What Participants Could Not Do

  1. Find the gold standard online and predict based on it
  2. Use external knowledge about the party to scale it (the task is text scaling)

232 of 239

How Did It Go?

236 of 239

Core Components of All Approaches

  1. An initial filtering strategy (using a dictionary or a manually created list)
  2. A text similarity approach, based on TF-IDF or word embeddings
  3. A supervised scaling function (SVM regression model, canonical correlation analysis, etc.)
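A toy sketch of how these three components might combine (dictionary filter → TF-IDF representation → SVR regression); all texts, terms, and scores below are made up:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import SVR

    EU_TERMS = ["european", "integration", "euro", "brussels"]   # hypothetical dictionary

    def filter_text(text):
        """Step 1: keep only sentences mentioning a dictionary term."""
        return " ".join(s for s in text.split(".")
                        if any(t in s.lower() for t in EU_TERMS))

    train_texts = ["we must deepen european integration. taxes are too high.",
                   "leave the euro behind. brussels dictates our laws."]
    train_scores = [0.9, 0.1]          # Chapel Hill EU-integration positions (legs 5/7/8)
    test_texts = ["european cooperation benefits all member states."]

    vec = TfidfVectorizer()            # step 2: TF-IDF text representations
    X_train = vec.fit_transform(filter_text(t) for t in train_texts)
    X_test = vec.transform(filter_text(t) for t in test_texts)

    pred = SVR().fit(X_train, train_scores).predict(X_test)     # step 3: supervised scaling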

237 of 239

Results
