The Europeana Geoparser is a service that uses information extraction techniques to automatically identify the names of places and time periods mentioned in unstructured text. It relies on a gazetteer to assign coordinates and dates to the place and period names it finds. The target users are Europeana, aggregators and data providers who need to enrich object descriptions by analyzing geographic or temporal references in existing metadata records.
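The general idea of resolving place names against a gazetteer can be illustrated with a minimal sketch. This is not the Geoparser's implementation: it swaps the service's information extraction techniques for plain dictionary matching, and the names TOY_GAZETTEER and find_places, as well as the rounded coordinates, are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy gazetteer (an assumption for this sketch, not real
# Geoparser data): lowercase place name -> approximate (latitude, longitude).
TOY_GAZETTEER = {
    "lisbon": (38.72, -9.14),
    "london": (51.51, -0.13),
    "paris": (48.86, 2.35),
}


def find_places(text, gazetteer=TOY_GAZETTEER):
    """Return (offset, matched text, coordinates) for each gazetteer name
    found in the text, ordered by position.

    Matching is a case-insensitive whole-word search; it performs no
    disambiguation and does not handle time periods.
    """
    matches = []
    for name, coords in gazetteer.items():
        pattern = r"\b" + re.escape(name) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            matches.append((m.start(), m.group(0), coords))
    return sorted(matches)


if __name__ == "__main__":
    record = "Map of Lisbon, printed in London"
    for offset, name, (lat, lon) in find_places(record):
        print(f"{name} at offset {offset}: lat={lat}, lon={lon}")
```

A real geoparser also has to decide whether a matched word is a place at all and which of several places sharing a name is meant, which simple dictionary matching cannot do.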
The EuropeanaConnect Gazetteer is a rich data resource including over 9 million geographic names, coordinates, and boundaries. By enriching Europeana's metadata with these geographic references, it is possible to identify features such as continents, countries, cities, monuments and rivers contained in the objects on Europeana. The service also has a multilingual aspect: users can search for a single term such as ‘London’ and retrieve results for objects marked with ‘Londres’ or ‘Londyn’. Information in the EuropeanaConnect Gazetteer has been collected from free data sources, which means there are no legal constraints on its use and re-use. Information from additional data sources is also continuously integrated, ensuring the service is kept up to date.
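The multilingual behaviour can be sketched as a lookup from a search term to all name variants recorded for the same feature. The structure below is an assumption made for illustration (TOY_FEATURES and expand_query are hypothetical names) and does not reflect how the EuropeanaConnect Gazetteer stores its data.

```python
# Hypothetical toy data (an assumption for this sketch): one gazetteer
# feature mapped to the name variants recorded for it.
TOY_FEATURES = {
    "feature-london": {"London", "Londres", "Londyn"},
}


def expand_query(term, features=TOY_FEATURES):
    """Return every name variant of the feature that the term refers to.

    If the term matches no known feature, it is returned unchanged as a
    single-element set.
    """
    wanted = term.casefold()
    for names in features.values():
        if any(name.casefold() == wanted for name in names):
            return set(names)
    return {term}


if __name__ == "__main__":
    # A search for one variant is expanded to all variants of the feature.
    print(sorted(expand_query("London")))
```

Objects tagged with any of the returned variants could then be treated as hits for the original search term.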
The Geoparser was developed over several years in a series of projects at INESC-ID and was adapted to specialize in cultural heritage metadata records during the EuropeanaConnect project, a best practices network that ran from May 2009 until October 2011. The Geoparser web service was maintained by Europeana and used by many of its aggregators in the following years. In 2014, changes in the technological environment (namely in the Geonames API and in Europeana's switch of metadata format to the Europeana Data Model, or EDM) made the service inoperable, and no financial support was available for its adaptation.
For more information, or to give feedback, please contact: Nuno Freire