Stanford NLP (Coursera) Notes (1) - Introduction | Bangda Sun

NLP introduction.

Recently I have been going through the videos and slides of the Stanford Natural Language Processing course on Coursera, taught by Dan Jurafsky and Christopher Manning. Although CS224d is popular, I want to start with more basic material first, since neural nets are not everything. I will therefore spend several posts working through this course. The course materials can be found here, and they will be my main references.

Generally speaking, NLP means using computers to process and understand natural language as used by humans, in both spoken and written form. With the rapid development of AI, it is becoming more and more popular. NLP applications are everywhere: question answering (Siri), information extraction (email analysis), sentiment analysis (product reviews on Amazon), machine translation (Google Translate), and so on.

One of the main difficulties in NLP is ambiguity, which is everywhere in language. Taking English as an example, factors such as non-standard English, segmentation issues, idioms, neologisms, and tricky entity names all make natural language harder to understand.
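To make the segmentation and non-standard English issues a bit more concrete, here is a minimal Python sketch. It is not from the course; the function names, the example sentence, and the regular expression are my own assumptions for illustration. It compares plain whitespace splitting with a slightly smarter regex tokenizer.

```python
import re


def whitespace_tokenize(text):
    """Split on whitespace only, so punctuation stays glued to words."""
    return text.split()


def regex_tokenize(text):
    """Keep word-like chunks (with internal apostrophes or hyphens) together
    and emit every other non-space character as its own token."""
    pattern = r"[A-Za-z0-9]+(?:['\-][A-Za-z0-9]+)*|[^\sA-Za-z0-9]"
    return re.findall(pattern, text)


# Hypothetical example sentence with a contraction, a multi-word entity,
# and a non-standard spelling ("gr8").
sentence = "I can't wait to visit New York, it's gr8!"

print(whitespace_tokenize(sentence))
print(regex_tokenize(sentence))
```

Whitespace splitting leaves the comma attached to "York," while the regex version separates it. Neither version knows that "New York" is a single entity or that "gr8" means "great", which is the kind of difficulty mentioned above.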

Current language tasks can be grouped into three categories according to how well they are solved:

mostly solved, like spam detection, POS tagging, named entity recognition;
making good progress, like sentiment analysis, word sense disambiguation, parsing, machine translation and information extraction;
still really hard, like question answering, paraphrase, text summarization.

This course mainly focuses on the first two categories. Its contents include:

Basic Text Processing

Minimum Edit Distance

Language Modeling

Spelling Correction

Text Classification

Sentiment Analysis

Maximum Entropy Model

Information Extraction and Named Entity Recognition

Relation Extraction

POS Tagging

Parsing (Probabilistic Parsing, Lexicalized Parsing, Dependency Parsing)

Information Retrieval

Semantics

Question Answering

Summarization

Next time we will discuss Basic Text Processing and Minimum Edit Distance!