Bangda Sun

Practice makes perfect

Stanford NLP (Coursera) Notes (12) - Parsing Introduction

1. Two Views of Syntactic Structure

Statistical parsers work with one of two views of syntactic structure:

  • Constituency

Phrase structure organizes words into nested constituents. Intuitively, it segments a sentence with nested brackets.

  • Dependency

Dependency structure shows which words depend on (modify or are arguments of) which other words, drawn as dependency arcs.
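As a rough illustration of the two views, the sketch below encodes one sentence both ways using plain Python data structures. The example sentence, the tag and relation labels, and the helper names (`to_brackets`, `leaves`, `heads`) are assumptions chosen for illustration, not something taken from the course.

```python
# Constituency view: a tree as nested tuples (label, child, child, ...).
# A leaf is (POS tag, word).
constituency = (
    "S",
    ("NP", ("DT", "the"), ("NN", "cat")),
    ("VP", ("VBD", "chased"), ("NP", ("DT", "a"), ("NN", "mouse"))),
)

def is_leaf(tree):
    return len(tree) == 2 and isinstance(tree[1], str)

def to_brackets(tree):
    """Render the tree in bracketed notation."""
    label, *children = tree
    if is_leaf(tree):
        return f"({label} {children[0]})"
    return f"({label} " + " ".join(to_brackets(c) for c in children) + ")"

def leaves(tree):
    """Return the words of the sentence in order."""
    if is_leaf(tree):
        return [tree[1]]
    return [w for child in tree[1:] for w in leaves(child)]

# Dependency view: one (head, dependent, relation) arc per word.
dependencies = [
    ("ROOT", "chased", "root"),
    ("chased", "cat", "nsubj"),
    ("cat", "the", "det"),
    ("chased", "mouse", "obj"),
    ("mouse", "a", "det"),
]

def heads(arcs):
    """Map each word to the word it depends on."""
    return {dep: head for head, dep, _ in arcs}

bracketed = to_brackets(constituency)
words = leaves(constituency)
head_of = heads(dependencies)

print(bracketed)
print(head_of)
```

The constituency tree groups words into labeled spans, while the dependency arcs give every word exactly one head and no phrase labels.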

Before statistical parsing models were introduced, classical parsers were built on a hand-written symbolic grammar (a context-free grammar, CFG) and a lexicon. A big issue is that this approach scaled very badly and did not give good coverage of real text.

Two responses to this problem:

  • add categorical constraints to the grammar to rule out unlikely / weird parses
  • use statistical parsing to find the most likely parses for a sentence
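To make "most likely parse" concrete, here is a minimal sketch assuming a toy probabilistic CFG. The rule probabilities, the rule names, and the two candidate parses (for a phrase like "saw the man with the telescope") are made up for illustration and do not come from the course. Each parse is scored by the product of the probabilities of the rules it uses, computed as a sum of logs; lexical rules are left out because both parses share them.

```python
import math

# Hypothetical rule probabilities; rules with the same left-hand side sum to 1.
rule_prob = {
    "VP -> V NP PP": 0.3,
    "VP -> V NP": 0.7,
    "NP -> NP PP": 0.2,
    "NP -> DT NN": 0.8,
    "PP -> P NP": 1.0,
}

# Two candidate parses, each listed as the rules it uses.
candidates = {
    # the PP attaches to the verb
    "verb_attach": ["VP -> V NP PP", "NP -> DT NN", "PP -> P NP", "NP -> DT NN"],
    # the PP attaches to the object noun phrase
    "noun_attach": ["VP -> V NP", "NP -> NP PP", "NP -> DT NN",
                    "PP -> P NP", "NP -> DT NN"],
}

def log_score(rules):
    """Log-probability of a parse: sum of the log-probabilities of its rules."""
    return sum(math.log(rule_prob[r]) for r in rules)

scores = {name: log_score(rules) for name, rules in candidates.items()}
best = max(scores, key=scores.get)

for name, s in scores.items():
    print(f"{name}: p = {math.exp(s):.4f}")
print("most likely parse:", best)
```

With these invented numbers the verb attachment wins; different probabilities would flip the decision, which is exactly what a statistical parser estimates from data.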

Annotated data sets such as treebanks have since been built, with several benefits:

  • reusability of the annotation labor (many parsers and POS taggers can be built on the same data)
  • broad coverage
  • frequencies and distributional information
  • a way to evaluate systems

2. Exponential Problem in Parsing

A key parsing decision is how to “attach” various constituents.
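One way to see why attachment choices blow up: the number of distinct binary bracketings of a sequence of n + 1 items is the Catalan number C_n = C(2n, n) / (n + 1). This is a general combinatorial fact rather than something stated in these notes, and the sketch below simply tabulates it; the function name `catalan` is chosen here for illustration.

```python
from math import comb

def catalan(n):
    """n-th Catalan number: the count of binary bracketings of n + 1 items."""
    return comb(2 * n, n) // (n + 1)

counts = [catalan(n) for n in range(1, 9)]

for n, c in enumerate(counts, start=1):
    print(f"{n + 1} items: {c} bracketings")
```

The counts grow roughly like 4^n, so enumerating every possible attachment quickly becomes infeasible as sentences get longer.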