
Thursday, October 20, 2016

Onomastic resource from the European Commission’s Joint Research Centre

The European Commission’s Joint Research Centre (JRC) has recently released a multilingual onomastic resource for the names of persons and organizations. The JRC-Names database provides lists of these name types and their many spelling variants (up to several hundred for a single personal name) and covers multiple scripts (e.g. Latin, Greek, Cyrillic, Japanese and Chinese). The resource is a by-product of the Europe Media Monitor (EMM) family of applications, which has analyzed up to 300,000 news reports per day since 2004.

The JRC-Names resource and accompanying software are available for download as text.
The new Linked Data resource, accessible through the European Union’s Open Data Portal, also offers supplementary information (e.g. frequency counts and historical onomastic background information).

The new Linked Data edition is available through a SPARQL endpoint and as an RDF dump. It is registered on the portal as JRC-Names.
Additional information is available on this page of the EU Open Data Portal.
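
For readers who want to try the SPARQL endpoint, here is a minimal Python sketch of how a name lookup might be sent using the standard SPARQL protocol (an HTTP GET with a `query` parameter, asking for JSON results). The endpoint URL is a placeholder, since this post does not give the real one, and the helper names (`build_query`, `build_url`, `parse_bindings`, `run_query`) are invented for illustration. The query deliberately makes no assumption about the JRC-Names vocabulary: it matches any literal equal to the name, which is likely to be slow on a large dataset and should be narrowed once the actual schema is known.

```python
import json
import urllib.parse
import urllib.request

# Placeholder only: replace with the endpoint URL listed on the portal page.
ENDPOINT = "https://example.org/sparql"


def build_query(name, limit=20):
    """Build a schema-agnostic query: any triple whose literal value equals `name`."""
    # Escape backslashes and double quotes so the name is a valid SPARQL string.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT DISTINCT ?entity ?property WHERE { "
        "?entity ?property ?value . "
        f'FILTER(isLiteral(?value) && STR(?value) = "{escaped}") '
        f"}} LIMIT {int(limit)}"
    )


def build_url(endpoint, query):
    """Encode the query as the `query` parameter of a GET request."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def parse_bindings(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    data = json.loads(payload)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in data["results"]["bindings"]
    ]


def run_query(endpoint, query):
    """Send the query and return the parsed rows (requires network access)."""
    request = urllib.request.Request(
        build_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

With the real endpoint substituted for the placeholder, `run_query(ENDPOINT, build_query("Angela Merkel"))` would return one dictionary per matching row, each with `entity` and `property` keys.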
