Tuesday, November 7, 2017

Numishare supports OpenRefine reconciliation APIs for OCRE, PELLA, and CRRO

After building a reconciliation service for Nomisma concepts, I began applying the same approach to an OpenRefine reconciliation API for the coin type corpora published in the Numishare platform. The API has also been extended to support property suggestions. These properties correspond to fields that have been indexed into Apache Solr for mints, rulers, denominations, etc., and they are either facet/string fields (exact match) or text fields (keyword anywhere). It may be possible to extend this property list to dates, legends, or other indexed fields.

Property suggestion API is derived from available Solr facet fields
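
To illustrate how properties narrow a match, here is a minimal Python sketch that builds a batch of queries in the shape the OpenRefine reconciliation protocol expects: a JSON object keyed q0, q1, and so on, where each query may carry a list of property constraints given as pid/v pairs. The property ids ("mint", "denomination"), the label format, and the sample values are assumptions made up for illustration, not the actual ids or data exposed by Numishare. The sketch only prepares the request body and does not call any service.

```python
import json
from urllib.parse import urlencode


def build_query(label, properties=None, limit=3):
    """Build one reconciliation query object for a single spreadsheet row."""
    query = {"query": label, "limit": limit}
    # Keep only properties that actually have a value in the row.
    props = [
        {"pid": pid, "v": value}
        for pid, value in (properties or {}).items()
        if value
    ]
    if props:
        query["properties"] = props
    return query


def build_batch(rows):
    """Key each query as q0, q1, ... so results can be matched back to rows."""
    return {
        f"q{i}": build_query(row["label"], row.get("properties"))
        for i, row in enumerate(rows)
    }


# Invented sample rows; the property ids are hypothetical.
rows = [
    {"label": "RIC 207", "properties": {"mint": "Rome", "denomination": "Denarius"}},
    {"label": "RIC 15", "properties": {"mint": "", "denomination": "As"}},
]

batch = build_batch(rows)
# The protocol sends the batch as a form parameter named "queries".
form_body = urlencode({"queries": json.dumps(batch)})
print(json.dumps(batch, indent=2))
```

Empty cells are dropped rather than sent as blank constraints, so a row with no mint is still matched on its remaining properties.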

Test Case: University of Graz Roman imperial coins

I received a spreadsheet of about 2,000 Roman imperial coins with RIC numbers and emperors from Elisabeth Steiner at the University of Graz. I cleaned up the RIC numbers and normalized the emperor list to English preferred labels via the Nomisma OpenRefine reconciliation service (more details below). About half of the coins matched OCRE IDs on the first pass, which took 45 seconds. The majority of non-matches fell into two categories: RIC numbers that OCRE had split into separate URIs because of differences in denomination, and coins from RIC volumes 6-8, where the numbering restarts by mint rather than by ruler. To address these issues, I obtained an updated spreadsheet that contained columns for mint and denomination.

Filter for uncertain attributions, 'od.'

My workflow was as follows: