Named Entity Recognition (NER) is the task of identifying occurrences of named entities in text according to a pre-defined set of categories. Named entities may be proper nouns (subcategorized into names of persons, cities, named artifacts, etc.) as well as time expressions, quantities, etc.

A related task is Entity Linking (EL), i.e., the task of recognizing linguistic expressions in text that denote relevant concepts. In other words, Entity Linking classifies the named entities identified in text, e.g., by associating them with individuals belonging to relevant ontological classes (see also Ontology Learning).

In legal informatics, cross-references from one legal document to another are among the most important named entities to identify. The actual content of a provision is often spread across multiple cross-referenced legal documents, so proper compliance checking requires retrieving the additional information contained in the cited documents.

The University of Luxembourg has developed a rule-based system that recognizes several kinds of named entities in the legal domain, such as cross-references, organizations or public bodies, and dates, and that sub-classifies basic named entities (e.g., determining whether a person's proper noun refers to a judge, a lawyer, a party, etc.). The rule-based system can be easily integrated with statistical methods to form a hybrid approach offering a good compromise between recall and precision. The system is an evolution of the prototype used in the FP7 project EUCases (see Boella et al., 2015).
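
To illustrate what rule-based recognition of legal named entities can look like, the following is a minimal Python sketch. It is not the University of Luxembourg system: the `Entity` type, the `RULES` table, the `find_entities` function, and both regular expressions are hypothetical and cover only one simplified style of EU cross-reference and one date format.

```python
import re
from typing import List, NamedTuple


class Entity(NamedTuple):
    label: str   # entity category, e.g. "CROSS_REFERENCE"
    text: str    # matched surface string
    start: int   # character offset where the match begins
    end: int     # character offset just past the match


MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Hypothetical, deliberately narrow rules (assumptions for illustration only).
RULES = [
    (
        "CROSS_REFERENCE",
        # e.g. "Article 6(1) of Regulation (EU) 2016/679"
        re.compile(
            r"\bArticle\s+\d+(?:\(\d+\))?\s+of\s+"
            r"(?:Regulation|Directive)\s+\((?:EU|EC)\)\s+(?:No\s+)?\d+/\d+"
        ),
    ),
    (
        "DATE",
        # e.g. "27 April 2016"
        re.compile(r"\b\d{1,2}\s+(?:" + MONTHS + r")\s+\d{4}\b"),
    ),
]


def find_entities(text: str) -> List[Entity]:
    """Apply every rule to the text and return matches in reading order."""
    entities = []
    for label, pattern in RULES:
        for match in pattern.finditer(text):
            entities.append(
                Entity(label, match.group(0), match.start(), match.end())
            )
    return sorted(entities, key=lambda e: e.start)


if __name__ == "__main__":
    sample = (
        "Pursuant to Article 6(1) of Regulation (EU) 2016/679, "
        "adopted on 27 April 2016, the controller shall comply."
    )
    for entity in find_entities(sample):
        print(entity.label, "->", entity.text)
```

Hand-written rules of this kind tend to favor precision over recall, since they match only the formulations they anticipate. In a hybrid setup, their output could be combined with the predictions of a statistical tagger to recover entities the rules miss.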

Main contributor(s): Livio Robaldo, Luigi Di Caro, Guido Boella