Using Conditional Random Fields for Named Entity Recognition | ansaurus

+1 Q:

Using Conditional Random Fields for Named Entity Recognition

What is a Conditional Random Field (CRF)? How exactly does a CRF identify proper names as a person, organization, or place in structured or unstructured text?

For example: This product is ordered by StackOverFlow Inc.

What does a Conditional Random Field do to identify StackOverFlow Inc. as an organization?

A:

A CRF is a discriminative, batch, tagging model, in the same general family as a Maximum Entropy Markov model.
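To make "discriminative tagging model" a little more concrete, here is the usual textbook form of a linear-chain CRF. The notation is an assumption for illustration and does not come from the answer itself: x is the sentence, y is the tag sequence, the f_k are feature functions, and the lambda_k are their learned weights.

```latex
% Linear-chain CRF: probability of a whole tag sequence y given the sentence x.
% f_k and \lambda_k are illustrative symbols, not names used in the answer.
p(y \mid x) = \frac{1}{Z(x)}
  \exp\!\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \Big),
\qquad
Z(x) = \sum_{y'} \exp\!\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \Big)
```

The model scores entire tag sequences and normalizes over all of them through Z(x), which is the main difference from an MEMM, where normalization happens locally at each position.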

A full explanation is book-length.

A short explanation is as follows:

1. Humans annotate 200-500K words of text, marking the entities.
2. Humans select a set of features that they hope indicate entities: things like capitalization, or whether the word was seen in the training set with a tag.
3. A training procedure counts all the occurrences of the features.
4. The meat of the CRF algorithm searches the space of all possible models that fit the counts to find a pretty good one.
5. At runtime, a decoder (probably a Viterbi decoder) looks at a sentence and decides what tag to assign to each word.
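The feature and decoding steps above can be sketched in a few lines of Python. This is a toy illustration only, not a real CRF implementation: the feature names, the two-tag set, and every weight below are made up by hand, standing in for what training would actually learn from annotated text.

```python
from typing import Dict, List, Tuple

TAGS = ["O", "ORG"]  # toy tag set: "not an entity" vs. "organization"
SUFFIXES = {"Inc", "Corp", "Ltd"}  # hypothetical company-suffix list

def word_features(words: List[str], i: int) -> List[str]:
    """Hypothetical hand-picked features for the word at position i."""
    w = words[i]
    feats = ["is_capitalized" if w[0].isupper() else "is_lowercase"]
    if i == 0:
        feats.append("sentence_start")
    if w.rstrip(".") in SUFFIXES:
        feats.append("company_suffix")
    if i + 1 < len(words) and words[i + 1].rstrip(".") in SUFFIXES:
        feats.append("next_is_company_suffix")
    return feats

# Hand-set weights (assumptions); a trained model would learn these.
EMIT: Dict[Tuple[str, str], float] = {
    ("is_capitalized", "ORG"): 1.0,
    ("is_lowercase", "ORG"): -2.0,
    ("sentence_start", "ORG"): -2.0,
    ("company_suffix", "ORG"): 2.0,
    ("next_is_company_suffix", "ORG"): 2.0,
}
TRANS: Dict[Tuple[str, str], float] = {("O", "O"): 0.2, ("ORG", "ORG"): 0.5}

def viterbi(words: List[str]) -> List[str]:
    """Return the highest-scoring tag sequence under the toy weights."""
    def emit(i: int, tag: str) -> float:
        return sum(EMIT.get((f, tag), 0.0) for f in word_features(words, i))

    # best[i][t] = score of the best tag sequence for words[0..i] ending in t
    best = [{t: emit(0, t) for t in TAGS}]
    back = []  # back[i-1][t] = best previous tag when position i has tag t
    for i in range(1, len(words)):
        scores, pointers = {}, {}
        for t in TAGS:
            prev = max(TAGS, key=lambda p: best[-1][p] + TRANS.get((p, t), 0.0))
            scores[t] = best[-1][prev] + TRANS.get((prev, t), 0.0) + emit(i, t)
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: best[-1][t])
    path = [tag]
    for pointers in reversed(back):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]

sentence = "This product is ordered by StackOverFlow Inc.".split()
tags = viterbi(sentence)
print(tags)  # → ['O', 'O', 'O', 'O', 'O', 'ORG', 'ORG']
```

With these weights, "StackOverFlow" is tagged ORG because it is capitalized, it is not at the start of the sentence, and it is followed by a company suffix. The ORG-to-ORG transition weight then helps keep "Inc." inside the same entity.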

The hard parts of this are feature selection and the search algorithm in step 4.
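One common way to make "fit the counts" precise is maximum-likelihood training. This is an assumption about the training method, since the answer does not name one, and the symbols are illustrative: (x^(i), y^(i)) are the annotated sentences, f_k are the feature functions, and lambda_k are their weights.

```latex
% Log-likelihood of the annotated corpus (regularization omitted for brevity).
\mathcal{L}(\lambda) = \sum_{i} \log p\big(y^{(i)} \mid x^{(i)}\big)

% Gradient for weight \lambda_k: observed feature counts minus the counts
% the current model expects.
\frac{\partial \mathcal{L}}{\partial \lambda_k}
  = \sum_{i} \sum_{t} f_k\big(y^{(i)}_{t-1}, y^{(i)}_{t}, x^{(i)}, t\big)
  - \sum_{i} \sum_{t} \mathbb{E}_{y \sim p(\cdot \mid x^{(i)})}
      \big[ f_k\big(y_{t-1}, y_{t}, x^{(i)}, t\big) \big]
```

The gradient vanishes when the model's expected feature counts match the counts observed in the annotated data. The search is hard because computing the expectation requires a forward-backward pass over every training sentence at every iteration.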

bmargulies 2009-12-27 12:49:28
