Advanced NLP Seminar (11-713) 2009

Weekly topics and readings

Topic 1:  CCG parsing (discussion on January 22 led by Onur)

  1. Vijay-Shanker and Weir, Polynomial Time Parsing of Combinatory Categorial Grammars, ACL 1990
  2. Eisner, Efficient Normal-Form Parsing for Combinatory Categorial Grammar, ACL 1996
  3. Hockenmaier and Steedman, Generative Models for Statistical Parsing with Combinatory Categorial Grammar, ACL 2002
  4. Clark, Hockenmaier, and Steedman, Building Deep Dependency Structures with a Wide-Coverage CCG Parser, ACL 2002
  5. Clark and Curran, Parsing the WSJ using CCG and Log-Linear Models, ACL 2004
  6. one additional related paper, chosen individually

Topic 2:  Data-driven dependency parsing (discussion on January 29 led by Shay)

  1. Lafferty, Sleator, and Temperley, Grammatical Trigrams:  A Probabilistic Model of Link Grammar, AAAI 1993 Fall Symposium ...
  2. Eisner, An Empirical Comparison of Probability Models for Dependency Grammar, TR 1996
  3. Eisner and Satta, Efficient Parsing for Bilexical Context-Free Grammars and Head Automaton Grammars
  4. Nivre, An Efficient Algorithm for Projective Dependency Parsing, IWPT 2003
  5. McDonald, Pereira, Ribarov, and Hajic, Non-Projective Dependency Parsing using Spanning Tree Algorithms, HLT-EMNLP 2005
  6. one additional related paper, chosen individually

Topic 3:  Spoken language parsing (discussion on February 5 led by Noah)

  1. Charniak and Johnson, Edit Detection and Parsing for Transcribed Speech, NAACL 2001
  2. Liu, Shriberg, and Stolcke, Automatic Disfluency Identification in Conversational Speech Using Multiple Knowledge Sources, Eurospeech 2003, or
  3. Kahn, Ostendorf, and Chelba, Parsing Conversational Speech Using Enhanced Segmentation, HLT-NAACL 2004
  4. Hale, Shafran, Yung, Dorr, Harper, Krasnayanskaya, Lease, Liu, Roark, Snover, and Stewart, PCFGs with Syntactic and Prosodic Indicators of Speech Repairs, COLING-ACL 2006
  5. one additional related paper, chosen individually

Topic 4:  Grammar induction (discussion on February 12 led by Vamshi)

  1. Carroll and Charniak, Working Notes of the Workshop Statistically-Based NLP Techniques 1992, or
  2. Pereira and Schabes, ACL 1992
  3. de Marcken, WVLC 1995, or
  4. Chen, ACL 1995
  5. one additional related paper, chosen individually

Topic 5:  Grammar induction, continued (discussion on February 19)

  1. Klein and Manning, ACL 2002
  2. Klein and Manning, ACL 2004
  3. Smith and Eisner, COLING-ACL 2006
  4. Cohen, Gimpel, and Smith, NIPS 2008
  5. one additional related paper, chosen individually

Topic 6:  Generic approaches to structural search (discussion on February 26 led by Ramnath)

  1. Klein and Manning, Parsing and Hypergraphs, IWPT 2001
  2. Klein and Manning, Factored A* Search for Models over Sequences and Trees, 2003
  3. Eisner, Goldlust, and Smith, Compiling Comp Ling: Weighted Dynamic Programming and the Dyna Language, HLT-EMNLP 2005
  4. Punyakanok, Roth, Yih, and Zimak, Semantic Role Labeling via Integer Linear Programming Inference, 2004
  5. one additional paper, chosen from among the following:
     1. Huang and Chiang, Better k-Best Parsing, IWPT 2005
     2. Charniak et al., Multilevel Coarse-to-Fine PCFG Parsing, NAACL 2006
     3. Klein and Manning, A* Parsing:  Fast Exact Viterbi Parse Selection, NAACL 2003

Topic 7:  Computational morphology (discussion on March 5 led by Nathan)


  1. Yarowsky and Wicentowski, ACL 2001, Minimally supervised morphological analysis by multimodal alignment.
  2. N. A. Smith, D. A. Smith, and Tromble, HLT-EMNLP 2005, Context-based morphological disambiguation with random fields.
  3. Minkov, Toutanova, and Suzuki, ACL 2007, Generating complex morphology for machine translation.
  4. Dreyer, J. R. Smith, and Eisner, EMNLP 2008, Latent-variable modeling of string transductions with finite-state methods. (this has the correct version of Figure 1)
  5. One of:
     1. Kudo, Yamamoto, and Matsumoto, EMNLP 2004, Applying conditional random fields to Japanese morphological analysis. (similar to #2)
     2. Habash and Rambow, ACL 2005, Arabic tokenization, part-of-speech tagging and morphological disambiguation in one fell swoop.
     3. Snyder and Barzilay, ACL 2008, Unsupervised multilingual learning for morphological segmentation. (somewhat similar to #3)


No meeting on March 12 (spring break)

Topic 8:  Semantic role labeling (discussion on March 19 led by Qin)

  1. Gildea and Jurafsky, CL 2002, Automatic labeling of semantic roles.
  2. Pradhan, Ward, and Martin, HLT-NAACL 2004, Shallow semantic parsing using support vector machines.
  3. Pradhan, Ward, and Martin, CL 2008, Towards robust semantic role labeling.
  4. One of:
     1. Thompson, Levy, and Manning, ECML 2003.  A generative model for semantic role labeling.
     2. Punyakanok, Roth, and Yih, IJCAI 2005.  The necessity of syntactic parsing for semantic role labeling.
     3. Xue and Palmer, EMNLP 2004.  Calibrating features for semantic role labeling.
     4. Punyakanok, Koomen, Roth, and Yih, CoNLL 2005.  Generalized inference with multiple semantic role labeling systems.
     5. Jiang and Ng, EMNLP 2006.  Semantic role labeling of NomBank:  A maximum entropy approach.
     6. Liu and Ng, ACL 2007.  Learning predictive structures for semantic role labeling of NomBank.

Topic 9:  Semantic parsing into logical forms (discussion on March 26 led by Mike)

  1. Zettlemoyer and Collins. 2005. Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars.  UAI.
  2. Wei Lu, Hwee Tou Ng, Wee Sun Lee, Luke S. Zettlemoyer.  2008.  A Generative Model for Parsing Natural Language to Meaning Representations.  EMNLP.
  3. Johan Bos, Stephen Clark, Mark Steedman, James R. Curran, Julia Hockenmaier. 2004. Wide-Coverage Semantic Representations from a CCG Parser.  COLING.
  4. Yuk Wah Wong and Raymond J. Mooney. 2007.  Learning Synchronous Grammars for Semantic Parsing with Lambda Calculus.  ACL.
  5. One of:
     1. Ruifang Ge and Raymond J. Mooney. 2006. Discriminative Reranking for Semantic Parsing.  COLING/ACL.
     2. Yulan He, Steve Young.  2005. Semantic processing using the Hidden Vector State model.  Computer Speech and Language 19, pp. 85–106.
     3. Johan Bos.  2005.  Towards wide-coverage semantic interpretation.

Topic 10:  Textual entailment (discussion on April 2 led by Dipanjan)

  1. Bos and Markert. 2005. Recognising textual entailment with logical inference. HLT-EMNLP.
  2. Snow et al. 2006. Effectively Using Syntax for Recognizing False Entailment. HLT-NAACL.
  3. MacCartney et al. 2006. Learning to recognize features of valid textual entailments. HLT-NAACL.
  4. MacCartney and Manning. 2007. Natural Logic for Textual Inference. ACL Workshop on Textual Entailment and Paraphrasing.
  5. One of:
     1. Hickl. 2008. Using Discourse Commitments to Recognize Textual Entailment. COLING.
     2. Tatu and Moldovan. COGEX at RTE 3. ACL Workshop on Textual Entailment and Paraphrasing.

Topic 11:  Generation (discussion on April 9 led by Liu)

  1. Reiter and Dale.  1997.  Building applied natural language generation systems. JNLE.
  2. Barzilay and Lapata.  2005.  Modeling Local Coherence: An Entity-based Approach. ACL.
  3. Bangalore and Rambow. 2000. Exploiting a Probabilistic Hierarchical Model for Generation. COLING.
  4. Lapata. 2003. Probabilistic Text Structuring: Experiments with Sentence Ordering. ACL.
  5. One of:
     1. Dale and Reiter. 1995. Computational Interpretations of the Gricean Maxims in the Generation of Referring Expressions. Cognitive Science.
     2. Stone and Doran. 1997. Sentence Planning as Description Using Tree-Adjoining Grammar. ACL.

Topic 12:  Synchronous TAG (discussion on April 16 led by Jon)

  1. Only if you need it, a TAG tutorial:
  2. Chiang, Schuler, and Dras.  2000. Some remarks on an extension of synchronous TAG.  TAG+5.
  3. Dras.  1999.   A Meta-Level Grammar: Redefining Synchronous TAG for Translation and Paraphrase.  ACL.
  4. Chiang and Rambow.  2006.  The Hidden TAG Model: Synchronous Grammars for Parsing Resource-Poor Languages.  TAG+8.
  5. Nesson, Shieber, and Rush.  2006.  Induction of Synchronous Tree-Insertion Grammars for Machine Translation.  AMTA.

Topic 13:  Markov logic (discussion on April 23 led by Ni)

  1. Jaimovich, A.; Meshi, O.; and Friedman, N. 2007. Template based inference in symmetric relational Markov random fields. UAI.
  2. Domingos and Richardson.  2004.  Markov Logic: A Unifying Framework for Statistical Relational Learning. ICML Workshop on Statistical Relational Learning and its Connections to Other Fields.
  3. Poon, H. & Domingos, P. 2006. Sound and efficient inference with probabilistic and deterministic dependencies.  AAAI.
  4. Kok and Domingos.  2009.  Learning Markov Logic Network Structure via Hypergraph Lifting.  ICML.


  1. Matthew Richardson, Pedro Domingos. 2006.  Markov Logic Networks. Machine Learning, 62, 107-136.
  2. Hoifung Poon, Pedro Domingos. 2008.  Joint Unsupervised Coreference Resolution with Markov Logic.  EMNLP.

Topic 14:  Bayesian NLP (discussion on April 30 led by Tae)

  1. Goldwater, Griffiths.  2007.  A Fully Bayesian Approach to Unsupervised Part-of-Speech Tagging.
  2. Johnson.  2007.  Why Doesn't EM Find Good HMM POS-Taggers?
  3. Through page 1004 of:  Blei, Ng, Jordan.  2003.  Latent Dirichlet Allocation.
  4. Griffiths, Steyvers, Blei, Tenenbaum.  Integrating topics and syntax
  5. Optional:
     1. If you like the last paper (Griffiths, Steyvers, Blei, Tenenbaum) but find it short on details, or want a more NLP-oriented perspective, read:  Wallach.  Topic Modeling: Beyond bag-of-words.
     2. If you wonder whether those HB models are ever competitive in realistic NLP tasks, read:  Toutanova, Johnson.  A Bayesian LDA-based model for semi-supervised part-of-speech tagging.