Relation Extraction.
1. Relation Extraction
In the last post we briefly introduced Information Extraction and one of its tasks: Named Entity Recognition (NER). This time we will continue: not only extracting entities, but also extracting the relationships among entities: the IS-A relation, the instance-of relation, etc. (more can be found in the WordNet thesaurus). For example, after we extract entities from a company report, we get Company/Location/Date; to build a more advanced knowledge structure, we focus on relation triples like Company-Founding, e.g. Founding-year(IBM, 1911) and Founding-location(IBM, New York).
Why Relation Extraction?
- create new structured knowledge bases, useful for downstream applications;
- augment current knowledge bases;
- support Question-Answering systems.
Two resources:
- Automated Content Extraction (ACE) gives 17 relations from the 2008 “Relation Extraction Task”; these come with specific guidelines for what to extract;
- Unified Medical Language System (UMLS) specifies 54 relations among 134 entity types.
To extract these relations, we can use hand-written patterns, supervised learning, semi-supervised learning, or unsupervised learning.
2. Hand-Written Patterns
First let’s see the simplest one - the IS-A relation. The early intuition comes from Hearst (1992):
- Y such as X ((, X)* (, and|or) X);
- such Y as X;
- X or other Y;
- X and other Y;
- Y including X;
- Y, especially X.
There are more relations like Located-in, Founded, Cures, etc. Named Entities are also helpful when extracting relations.
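The Hearst patterns above can be sketched with regular expressions. This is a minimal illustration, not a full implementation: noun phrases are simplified to capitalized word sequences (a real system would use an NP chunker), and only two of the patterns are shown.

```python
import re

# Crude NP placeholder: one or more capitalized words.
# A real system would use a noun-phrase chunker instead.
NP = r"[A-Z][a-zA-Z]+(?: [A-Z][a-zA-Z]+)*"

# Two Hearst patterns, each paired with a function that orders the
# match groups as (hyponym X, hypernym Y).
PATTERNS = [
    (re.compile(rf"({NP}) such as ({NP})"),
     lambda m: (m.group(2), m.group(1))),   # "Y such as X"
    (re.compile(rf"({NP}) and other ({NP})"),
     lambda m: (m.group(1), m.group(2))),   # "X and other Y"
]

def extract_isa(sentence):
    """Return (X, Y) pairs where the patterns suggest X IS-A Y."""
    pairs = []
    for pattern, order in PATTERNS:
        for m in pattern.finditer(sentence):
            pairs.append(order(m))
    return pairs
```

For example, `extract_isa("Composers such as Mozart are studied.")` yields the pair `("Mozart", "Composers")`, i.e. Mozart IS-A composer.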
Advantages:
- hand-written rules tend to be high-precision;
- can be tailored to specific domains.
Disadvantages:
- hand-written rules often have low recall;
- writing them is time-consuming work.
3. Supervised Relation Extraction
The basic task for the classifier is to decide whether any two entities are related. The specific steps are as follows:
- choose a set of relations we’d like to extract;
- choose a set of relevant named entities;
- find and label data: choose a corpus, label the named entities in the corpus, hand-label the relations between these entities, then split into training and test sets;
- train a classifier on training set.
For features, we could extract word-based features (words before/after the target entities, words between target entities), entity-based features (POS tags of entities) and syntactic features (constituent path, base syntactic chunk path, typed-dependency path), etc.
Advantages:
- can get high accuracy, given enough training data and test data similar to the training data.
Disadvantages:
- labeling large training data is expensive;
- classifier may not generalize well to different genres.
4. Semi-Supervised and Unsupervised Relation Extraction
When we have no labeled data, or even no training data at all, we may still have a few seed tuples or a few high-precision patterns.
In that case we can use bootstrapping: use the seeds to directly learn to populate a relation. First gather a set of seed pairs that have relation \(R\), then iterate:
- find sentences with these pairs;
- look at the context between or around the pair and generalize the context to create patterns;
- use the patterns to grep for more pairs.
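The loop above can be sketched in a few lines. This is a deliberately naive version: "generalizing the context" here just means taking the literal string between the two entities, and new pairs are read off as the single words adjacent to a matched pattern; a real system scores patterns and filters noisy ones.

```python
def bootstrap(sentences, seeds, rounds=2):
    """Naive bootstrapping sketch: grow a set of related pairs
    from seed pairs by harvesting and reapplying contexts."""
    pairs = set(seeds)
    patterns = set()
    for _ in range(rounds):
        # 1. find sentences containing a known pair and harvest
        #    the context between the two mentions as a "pattern"
        for s in sentences:
            for x, y in list(pairs):
                if x in s and y in s and s.index(x) < s.index(y):
                    between = s[s.index(x) + len(x):s.index(y)].strip()
                    patterns.add(between)
        # 2. use the patterns to grep for new pairs: take the word
        #    right before and right after each pattern occurrence
        for s in sentences:
            for p in patterns:
                if p and p in s:
                    left, _, right = s.partition(p)
                    x = left.strip().split()[-1] if left.strip() else None
                    y = right.strip().split()[0] if right.strip() else None
                    if x and y:
                        pairs.add((x, y.strip(".,")))
    return pairs, patterns
```

For example, starting from the seed pair ("IBM", "Armonk") and the sentences "IBM is headquartered in Armonk." and "Microsoft is headquartered in Redmond.", the harvested pattern "is headquartered in" lets the second round discover the new pair ("Microsoft", "Redmond").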
There are also more advanced methods, such as Distant Supervision.