Stanford NLP (Coursera) Notes (12) - Parsing Introduction | Bangda Sun

Parsing Introduction.

1. Two Views of Syntactic Structure

Statistical parsing works with two views of syntactic structure:

Constituency

Phrase structure organizes words into nested constituents. More intuitively, it segments a sentence with brackets.
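As a minimal sketch of the bracket view, a constituency tree can be held as nested tuples and printed in bracketed form. The sentence, the category labels (S, NP, VP, ...) and the helper names `to_brackets` and `words` are illustrative assumptions, not something taken from the course.

```python
# A constituent is (label, child, child, ...); a leaf is a plain word string.
tree = ("S",
        ("NP", ("DT", "the"), ("NN", "cat")),
        ("VP", ("VBD", "sat"),
               ("PP", ("IN", "on"),
                      ("NP", ("DT", "the"), ("NN", "mat")))))

def to_brackets(node):
    """Render a nested-tuple tree as a bracketed string."""
    if isinstance(node, str):          # leaf: a word
        return node
    label, *children = node
    return "(" + label + " " + " ".join(to_brackets(c) for c in children) + ")"

def words(node):
    """Collect the words covered by a constituent, left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in words(child)]

print(to_brackets(tree))
print(words(tree[2]))   # the words covered by the VP constituent
```

Every bracket pair corresponds to one constituent, so nesting in the data structure is the same thing as nesting in the printed brackets.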

Dependency

Dependency structure shows which words depend on (modify or are arguments of) which other words, using dependency arcs.
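A dependency analysis can be sketched as a list of head-to-dependent arcs. The example sentence, the relation names (loosely modeled on Universal Dependencies labels) and the helper `dependents` are assumptions made for illustration only.

```python
sentence = ["the", "cat", "sat", "on", "the", "mat"]

# Each arc is (head_index, relation, dependent_index).
# Indices are 1-based positions in the sentence; 0 is an artificial ROOT node.
arcs = [
    (0, "root", 3),    # ROOT -> sat
    (3, "nsubj", 2),   # sat  -> cat
    (2, "det", 1),     # cat  -> the
    (3, "obl", 6),     # sat  -> mat
    (6, "case", 4),    # mat  -> on
    (6, "det", 5),     # mat  -> the
]

# In a dependency tree every word has exactly one head.
heads = {dep: head for head, _, dep in arcs}

def dependents(head_index):
    """Words that attach directly to the word at head_index."""
    return [sentence[dep - 1] for head, _, dep in arcs if head == head_index]

print(dependents(3))   # direct dependents of "sat"
print(dependents(6))   # direct dependents of "mat"
```

Unlike the bracketed view, there are no phrase labels here: the structure is carried entirely by which word each arc starts from and points to.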

Before statistical parsing models were introduced, classical parsers were built on a symbolic grammar (a context-free grammar, CFG) and a lexicon. A big issue is that this approach scaled very badly and did not give broad coverage of real text.

The solutions include:

adding categorical constraints to the grammar to rule out unlikely / weird parses
using statistical parsing to find the most likely parse for each sentence

Annotated data such as treebanks have been built, with several benefits:

reusability of the labor (many parsers and POS taggers can be built from the same treebank)
broad coverage
frequencies and distributional information
a way to evaluate systems
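To make the "frequencies and distributional information" point concrete, here is a small sketch that counts grammar rules in a toy treebank and turns the counts into relative frequencies. The two trees are hand-written for illustration (not real treebank data), and the helper name `rules` is hypothetical.

```python
from collections import Counter

# Toy "treebank": hand-written trees as nested tuples (label, children...).
treebank = [
    ("S", ("NP", ("NNS", "dogs")), ("VP", ("VBP", "bark"))),
    ("S", ("NP", ("DT", "the"), ("NN", "cat")), ("VP", ("VBD", "slept"))),
]

def rules(node):
    """Yield (parent, (children...)) for every non-leaf node of a tree."""
    if isinstance(node, str):
        return
    label, *children = node
    # A child contributes its label if it is a subtree, or itself if it is a word.
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for child in children:
        yield from rules(child)

counts = Counter(rule for tree in treebank for rule in rules(tree))

# Total count of rules sharing each left-hand side.
lhs_totals = Counter()
for (lhs, _), n in counts.items():
    lhs_totals[lhs] += n

# Relative frequency of each rule among rules with the same left-hand side.
probs = {rule: n / lhs_totals[rule[0]] for rule, n in counts.items()}

for (lhs, rhs), p in sorted(probs.items()):
    print(lhs, "->", " ".join(rhs), p)
```

Relative frequencies like these are the kind of information a statistical parser can use to prefer one parse over another. A real system would estimate them from a much larger treebank.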

2. Exponential Problem in Parsing

A key parsing decision is how to “attach” various constituents.
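One way to get a feel for how quickly attachment choices multiply is to count the distinct binary bracketings of a fixed sequence of items. That count is a Catalan number, which grows exponentially. The sketch below assumes, as a simplification, that each possible attachment pattern corresponds to one bracketing. The function names are illustrative.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def bracketings(n_items):
    """Number of distinct binary trees over n_items leaves in a fixed order."""
    if n_items <= 1:
        return 1
    # Split the sequence into a left part of k items and a right part of the rest.
    return sum(bracketings(k) * bracketings(n_items - k)
               for k in range(1, n_items))

def catalan(n):
    """Closed form for the n-th Catalan number: C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

for n_items in range(1, 11):
    # bracketings(n_items) equals the Catalan number with index n_items - 1.
    print(n_items, bracketings(n_items), catalan(n_items - 1))
```

The recursive count and the closed form agree, and both columns grow quickly as items are added. This is why enumerating every possible attachment is impractical for longer sentences.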