Natural Language Processing

By

S.V.V.D.Jagadeesh

Sr. Assistant Professor

Dept of Artificial Intelligence & Data Science

LAKIREDDY BALI REDDY COLLEGE OF ENGINEERING

Previously Discussed Topics

  • Previously Discussed Topics
  • Session Outcomes
  • TreeBank
  • Penn TreeBank
  • Penn TreeBank Examples
  • None Nodes
  • Using a TreeBank as a Grammar
  • TreeBank Grammar for VP
  • TreeBank Grammar for NP

S.V.V.D.Jagadeesh

Monday, February 23, 2026

LBRCE

NLP

Session Outcomes

At the end of this session, students will be able to:

  • Understand the role of treebanks in searching (Understand-L2)

Searching Using TreeBank

  • It is often important to search through a treebank to find examples of particular grammatical phenomena, either for linguistic research or for answering analytic questions about a computational application.
  • But neither the regular expressions used for text search nor the Boolean expressions over words used for web search are sufficient search tools.
  • What is needed is a language that can specify constraints on nodes and links in a parse tree, so as to search for specific patterns.

Searching Using Tgrep and Tgrep2

  • tgrep (Pito, 1993) and TGrep2 (Rohde, 2005) are publicly available tools for searching treebanks; both use a similar language for expressing tree constraints.
  • A pattern in tgrep or TGrep2 consists of a specification of a node, possibly followed by links to other nodes.
  • A node specification can then be used to return the subtree rooted at that node.
  • For example, the pattern NP returns all subtrees in a corpus whose root is NP.
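The effect of a bare node pattern such as NP can be sketched in a few lines of Python. This is only an illustration of the idea, not how tgrep or TGrep2 is implemented; the helper names `parse_tree`, `subtrees`, `find` and `words` are invented for this sketch, and the input is assumed to be a well-formed Penn Treebank bracketed string.

```python
import re

def parse_tree(text):
    """Parse a bracketed tree string into nested lists: [label, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1                      # skip "("
            node = [tokens[pos]]          # node label
            pos += 1
            while tokens[pos] != ")":
                node.append(read())       # child subtree or word
            pos += 1                      # skip ")"
            return node
        word = tokens[pos]                # a bare token is a word (leaf)
        pos += 1
        return word

    return read()

def subtrees(tree):
    """Yield every labelled node, top-down and left to right."""
    if isinstance(tree, list):
        yield tree
        for child in tree[1:]:
            yield from subtrees(child)

def find(tree, label):
    """Return all subtrees whose root label is `label` (like the pattern NP)."""
    return [t for t in subtrees(tree) if t[0] == label]

def words(tree):
    """Return the words under a subtree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]

tree = parse_tree("(NP (NP (NN reinvestment)) (PP (IN of) (NP (NNS dividends))))")
matches = find(tree, "NP")
for m in matches:
    print(" ".join(words(m)))             # one line per matching NP subtree
```

Every NP in the tree is returned, including NPs nested inside other NPs, which is why a single sentence can yield several matches.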

  • Nodes can be specified by a name, a regular expression inside slashes, or a disjunction of these.
  • For example, we can specify a singular or plural noun (NN or NNS) using Penn Treebank notation as either of the following:
  • /NNS?/
  • NN|NNS
  • A node which either is the word bush or else ends in the string tree can be expressed as:
  • /tree$/|bush
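The three kinds of node specification can be mimicked with Python's `re` module. The function `label_matches` below is a hypothetical helper, not part of tgrep or TGrep2. It assumes that a regular expression in slashes may match anywhere in the label unless it is anchored (which is what the `$` in /tree$/ suggests), and it splits naively on `|`, so a `|` inside a regular expression is not supported.

```python
import re

def label_matches(spec, label):
    """Check one node label against a tgrep-style node specification.

    `spec` may be a plain name (NN), a regular expression in slashes
    (/NNS?/), or several of these joined by | (NN|NNS).
    """
    for alt in spec.split("|"):               # naive split: assumes no | inside /.../
        if len(alt) >= 2 and alt.startswith("/") and alt.endswith("/"):
            if re.search(alt[1:-1], label):   # assumed unanchored, like grep
                return True
        elif alt == label:                    # a plain name must match exactly
            return True
    return False

for label in ["NN", "NNS", "NNP"]:
    print(label, label_matches("NN|NNS", label), label_matches("/NNS?/", label))

for word in ["bush", "peartree", "trees"]:
    print(word, label_matches("/tree$/|bush", word))
```

Under the unanchored assumption, /NNS?/ also accepts a label such as NNP, whereas NN|NNS does not; writing the expression as /^NNS?$/ makes the two specifications agree.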

  • The relation < is used to specify immediate dominance; this pattern matches an NP immediately dominating a PP:
  • NP < PP
  • The relation << is used to specify dominance; this pattern matches an NP dominating a PP:
  • NP << PP
  • This previous pattern would thus match either of the following trees:
  • (NP (NP (NN reinvestment)) (PP (IN of) (NP (NNS dividends))))
  • (NP (NP (DT the) (JJ austere) (NN company) (NN dormitory)) (VP (VBN run) (PP (IN by) (NP (DT a) (JJ prying) (NN caretaker)))))
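The difference between the two relations can be checked on the two example trees with a small sketch. The trees are written as nested Python tuples, and the function names `immediately_dominates` and `dominates` are illustrative choices for this sketch rather than anything provided by tgrep or TGrep2.

```python
def children(node):
    return node[1:] if isinstance(node, tuple) else ()

def label(node):
    return node[0] if isinstance(node, tuple) else node

def descendants(node):
    """Yield every node below `node`, at any depth."""
    for child in children(node):
        yield child
        yield from descendants(child)

def subtrees(node):
    yield node
    yield from descendants(node)

def immediately_dominates(tree, a, b):
    """Nodes labelled `a` that have a child labelled `b` (pattern: a < b)."""
    return [n for n in subtrees(tree)
            if label(n) == a and any(label(c) == b for c in children(n))]

def dominates(tree, a, b):
    """Nodes labelled `a` with a descendant labelled `b` (pattern: a << b)."""
    return [n for n in subtrees(tree)
            if label(n) == a and any(label(d) == b for d in descendants(n))]

tree1 = ("NP", ("NP", ("NN", "reinvestment")),
               ("PP", ("IN", "of"), ("NP", ("NNS", "dividends"))))
tree2 = ("NP", ("NP", ("DT", "the"), ("JJ", "austere"),
                      ("NN", "company"), ("NN", "dormitory")),
               ("VP", ("VBN", "run"),
                      ("PP", ("IN", "by"),
                             ("NP", ("DT", "a"), ("JJ", "prying"), ("NN", "caretaker")))))

for name, tree in [("tree1", tree1), ("tree2", tree2)]:
    print(name,
          "NP < PP:", len(immediately_dominates(tree, "NP", "PP")),
          "NP << PP:", len(dominates(tree, "NP", "PP")))
```

In the second tree the PP sits inside a VP, so the root NP dominates it but does not immediately dominate it; only NP << PP matches there.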

  • The relation . is used to mark linear precedence.
  • The following pattern matches an NP that immediately dominates a JJ and is immediately followed by a VP, for example matching the NP dominating the austere company dormitory in the second tree above:
  • NP < JJ . VP
  • Each of the relations in a tgrep/TGrep2 expression is interpreted as referring to the first or root node.
  • Thus, for example, the following expression means an NP which both immediately precedes a PP and immediately dominates an S:
  • NP . PP < S
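A rough sketch of how the pattern NP < JJ . VP could be evaluated is given below. It assumes that "A immediately precedes B" means the last word under A is directly followed by the first word under B, and it hard-codes this one pattern instead of parsing the pattern language; `index_spans` and `match_np_jj_vp` are names made up for this illustration.

```python
def index_spans(tree):
    """List (node, start, end) for every labelled node; [start, end) counts words."""
    out = []

    def walk(node, start):
        if isinstance(node, str):        # a word occupies one position
            return start + 1
        end = start
        for child in node[1:]:
            end = walk(child, end)
        out.append((node, start, end))
        return end

    walk(tree, 0)
    return out

def match_np_jj_vp(tree):
    """Find NP nodes for the pattern NP < JJ . VP.

    Both relations refer to the first node: the NP must have a JJ child,
    and the NP's last word must be directly followed by the first word of a VP.
    """
    nodes = index_spans(tree)
    vp_starts = {start for node, start, _ in nodes if node[0] == "VP"}
    return [node for node, _, end in nodes
            if node[0] == "NP"
            and any(isinstance(c, tuple) and c[0] == "JJ" for c in node[1:])
            and end in vp_starts]

tree2 = ("NP", ("NP", ("DT", "the"), ("JJ", "austere"),
                      ("NN", "company"), ("NN", "dormitory")),
               ("VP", ("VBN", "run"),
                      ("PP", ("IN", "by"),
                             ("NP", ("DT", "a"), ("JJ", "prying"), ("NN", "caretaker")))))

matches = match_np_jj_vp(tree2)
print(matches)
```

The NP over a prying caretaker also has a JJ child, but nothing follows it, so only the NP over the austere company dormitory satisfies both relations.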

Links in tgrep2

Heads and Head Findings

  • Syntactic constituents can be associated with a lexical head; N is the head of an NP, V is the head of a VP.
  • This idea of a head for each constituent dates back to Bloomfield (1914).
  • It is central to linguistic formalisms such as Head-Driven Phrase Structure Grammar (Pollard and Sag, 1994), and has become extremely popular in computational linguistics with the rise of lexicalized grammars.

  • An alternative approach to head-finding is used in most practical computational systems.
  • Instead of specifying head rules in the grammar itself, heads are identified dynamically in the context of trees for specific sentences.
  • In other words, once a sentence is parsed, the resulting tree is walked to decorate each node with the appropriate head.
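The tree walk described above can be sketched as a bottom-up pass that copies a head word from one chosen child to its parent. The callback `pick_head_child` and the rule `toy_rule` are placeholders invented for this sketch (leftmost child for VP and PP, rightmost child otherwise); practical systems use a separate, more detailed rule per category.

```python
def decorate_heads(node, pick_head_child):
    """Return a copy of the tree where each node is (label, head_word, children).

    `pick_head_child(label, child_labels)` is a hypothetical callback that
    returns the index of the head child.
    """
    label, *kids = node
    if len(kids) == 1 and isinstance(kids[0], str):   # POS tag over a word
        return (label, kids[0], [])
    decorated = [decorate_heads(k, pick_head_child) for k in kids]
    i = pick_head_child(label, [d[0] for d in decorated])
    return (label, decorated[i][1], decorated)        # head word percolates up

def toy_rule(label, child_labels):
    # Placeholder rule, for illustration only.
    return 0 if label in ("VP", "PP") else len(child_labels) - 1

tree = ("VP", ("VBN", "run"),
              ("PP", ("IN", "by"),
                     ("NP", ("DT", "a"), ("JJ", "prying"), ("NN", "caretaker"))))

decorated = decorate_heads(tree, toy_rule)
print(decorated[0], "->", decorated[1])
```

Because heads are assigned after parsing, the grammar itself needs no head annotations; only the rule passed to the walk changes from one system to another.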

  • For example, the head of an NP can be found by applying the following rules in order:
  • If the last word is tagged POS, return last-word.
  • Else search from right to left for the first child which is an NN, NNP, NNPS, NX, POS, or JJR.
  • Else search from left to right for the first child which is an NP.
  • Else search from right to left for the first child which is a $, ADJP, or PRN.
  • Else search from right to left for the first child which is a CD.
  • Else search from right to left for the first child which is a JJ, JJS, RB or QP.
  • Else return the last word.
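The ordered rules above translate fairly directly into a function over the labels of a node's children. This is a minimal sketch: `np_head_index` is a name chosen here, it returns the position of the head child instead of the word itself, and it treats "the last word is tagged POS" as "the last child is labelled POS".

```python
def np_head_index(child_labels):
    """Index of the head child, following the ordered rules listed above."""
    last = len(child_labels) - 1
    if child_labels[last] == "POS":       # simplification: last child is POS
        return last
    searches = [
        ("right-to-left", {"NN", "NNP", "NNPS", "NX", "POS", "JJR"}),
        ("left-to-right", {"NP"}),
        ("right-to-left", {"$", "ADJP", "PRN"}),
        ("right-to-left", {"CD"}),
        ("right-to-left", {"JJ", "JJS", "RB", "QP"}),
    ]
    for direction, targets in searches:
        if direction == "right-to-left":
            order = range(last, -1, -1)
        else:
            order = range(last + 1)
        for i in order:
            if child_labels[i] in targets:
                return i
    return last                           # fallback: the last child

examples = [
    ["DT", "JJ", "NN", "NN"],   # the austere company dormitory
    ["NP", "PP"],               # reinvestment of dividends
    ["NNP", "POS"],
    ["DT", "JJ"],
]
for labels in examples:
    print(labels, "->", np_head_index(labels))
```

Each search only runs if all earlier ones failed, so the order of the list carries the priority between categories.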

Summary

  • Previously Discussed Topics
  • Session Outcomes
  • Searching Using TreeBank
  • Searching Using Tgrep and Tgrep2
  • Links in tgrep2
  • Heads and Head Findings
