Wednesday, July 24, 2013

Coin Hoards of the Roman Republic – a new tool for Roman numismatics

Today the American Numismatic Society and the Institute of Archaeology of University College London, UK, launch an important new tool for the analysis of Roman Republican coin hoards.

Coin Hoards of the Roman Republic Online (CHRR Online) is a collaboration between Rick Witschonke and Ethan Gruber at the ANS and Dr. Kris Lockyear of UCL.

The new web-based tool makes available in searchable form the contents of Dr. Lockyear's database of 694 Roman Republican coin hoards and the 115,000 coins that they contain. Based on the Numishare technology, it makes it possible to browse, search, map and analyze the evidence of Roman coin finds in new and exciting ways.

“The database was initially created on a PC for my PhD, but it has continued to be expanded since then and forms the basis of my book Patterns and Process in the Late Roman Republican Coin Hoards, where it is used, amongst other things, to investigate the size of late Republican coin issues, the date of the import of Republican denarii to Dacia and the patterning created by the events of the civil wars. I am very grateful to Professor Michael Crawford for allowing me access to his unpublished archive held in the British Museum”, noted Dr. Lockyear. “It was obvious that the database, the result of over twenty years’ work, was a valuable resource that could help others in their research if I could make it widely available. As the database continues to grow, updates will be posted to the online version, which I hope will encourage others to share information.”

The potential for the ANS to help in the process of online publication was spotted by curatorial associate Rick Witschonke. “It was clear that Kris’ database dovetailed very neatly with work being carried out at the ANS to create stable identities for numismatic concepts on the web”, explains Witschonke. “We were very fortunate also to be in touch with the curators at the British Museum, Ian Leins and Eleanor Ghey, who generously made available to us the work they had recently undertaken to catalogue the BM collection. By bringing together their data and Kris’ hoard database with the work that ANS has been undertaking at Nomisma.org, we were able to create a new tool based on Linked Open Data principles.”

The creation of the new web tool was the work of ANS database developer Ethan Gruber. The integration of Roman Republican Coinage coin types defined by Nomisma.org into CHRR Online enables maps and timelines showing the geographic and temporal extent of hoards. Furthermore, users of the quantitative analysis interface may compare the distribution of selected typological attributes across numerous hoards, visualizing results in the form of graphs or downloading data in CSV for more sophisticated analyses. For example, one may compare the distribution of mints or issuers across dozens of hoards: a common numismatic query, delivered nearly instantaneously.

“The CHRR project is a wonderful example of the way that ANS is working with multiple partners to create new resources for our members and the whole community of collectors and scholars”, notes ANS Executive Director Ute Wartenberg Kagan. “By sharing our data in standard, open formats, we increase its power hugely. The ANS is currently at the forefront of the development of digital tools for numismatics at an international level. It is tremendously exciting to see another tool launched today.”

Thursday, July 18, 2013

Nomisma: Using XForms to Manage and Publish Linked Open Data

One of the main improvements in the newly-redesigned Nomisma web architecture is in the administrative backend, not visible to the public.  The previous iteration of Nomisma was built on top of open source wiki software.  Each id was an XHTML+RDFa fragment in the filesystem, created and edited through the wiki.  There was no validation, and hand-coding the XHTML fragments was prone to human error: invalid XML documents occasionally broke page loads or RDF distillation.  We needed to move to a more stable and scalable infrastructure.

The XHTML+RDFa fragments remain a part of the new architecture of Nomisma, now maintained in a GitHub repository.  The fragments are now edited in an XForms interface with the Orbeon processor, which enables not only editing of XML, but a variety of REST interactions to get and post data into the Apache Fuseki RDF triplestore and SPARQL endpoint, and post data into the Solr search index, which powers the Atom feed.

While the XForms web forms handle the simplest of XHTML templates, such as those for authorities, mints, regions, etc., they do not yet handle editing of the more complex data models, such as those for IGCH hoards (like http://nomisma.org/id/igch0200) or coin types (for example, http://nomisma.org/id/rrc-174.1).  However, hoards and coin types are least likely to be manually edited, so the editing interface is most useful for those numismatic concepts which are most likely to be enhanced with additional labels and references to other linked open data identifiers (like VIAF or Pleiades ids).

Validation


One of the main features of XForms is advanced validation.  The @typeof attribute in the XHTML root div is tied to a drop down menu.  The values in this drop down menu are generated dynamically before the form has finished loading (xforms-model-construct-done) directly from a SPARQL query to acquire all of the nm:numismatic_term ids in Nomisma:

PREFIX rdf:      <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms:  <http://purl.org/dc/terms/>
PREFIX nm:       <http://nomisma.org/id/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?uri ?label WHERE {
?uri  rdf:type <http://nomisma.org/id/numismatic_term>.
?uri skos:prefLabel ?label .
FILTER (lang(?label) = "en")}
ORDER BY ASC(?label)

A similar query is passed from Orbeon to the endpoint to generate an XForms instance for nm:field_of_numismatics (e.g., Greek Numismatics, Roman Numismatics, etc.).  Languages (xml:lang in the div) are also tied to an instance which contains every ISO language code and label.


XForms bindings and XPath also ensure that other requirements of the XHTML document are met: there must be an English preferred label, the labels cannot be blank, there can be no repetitive languages for preferred labels, latitude and longitude must be decimal values between -180 and 180, and related links must be valid URIs.
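For readers who do not work with XForms, the constraints listed above can be approximated in a few lines of Python. This is only an illustrative sketch and not the actual bindings: the function name `validate_id` and the shape of its arguments are hypothetical, the coordinate range simply mirrors the rule as stated above, and the URI check (scheme plus host) is a simplification.

```python
from urllib.parse import urlparse


def validate_id(labels, lat=None, lon=None, links=()):
    """Return a list of error messages; an empty list means the record passes.

    `labels` is a list of (language_code, text) pairs for preferred labels.
    This layout is an assumption made for illustration only.
    """
    errors = []
    langs = [lang for lang, _ in labels]

    if "en" not in langs:
        errors.append("missing English preferred label")
    if any(not text.strip() for _, text in labels):
        errors.append("blank label")
    if len(langs) != len(set(langs)):
        errors.append("duplicate language in preferred labels")

    for name, value in (("latitude", lat), ("longitude", lon)):
        if value is None:
            continue
        try:
            number = float(value)
        except (TypeError, ValueError):
            errors.append(f"{name} is not a decimal value")
            continue
        # Range as described in the post: between -180 and 180.
        if not -180 <= number <= 180:
            errors.append(f"{name} out of range")

    for link in links:
        parsed = urlparse(link)
        # Simplified check: require both a scheme and a host.
        if not (parsed.scheme and parsed.netloc):
            errors.append(f"invalid URI: {link}")

    return errors
```

In the form itself, the equivalent checks keep the save button disabled until the list of problems is empty.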

One of the new features of this interface is the interaction between XForms and dbpedia.  It is possible to import labels in languages not already in the Nomisma id from dbpedia RDF.  The XForms submission is fairly straightforward:

<xforms:submission id="get-dbpedia-rdf" action="http://dbpedia.org/data/{instance('control-instance')/dbpedia}.rdf" ref="instance('dbpedia')"
                replace="instance" method="get">
                <xforms:message ev:event="xforms-submit-error" level="modal">Failed to get Dbpedia RDF.</xforms:message>
                <xforms:action ev:event="xforms-submit-done" xxforms:iterate="instance('dbpedia')//rdfs:label">
                    <xxforms:variable name="lang" select="@xml:lang"/>
                    <xforms:action if="not(instance('doc')/xhtml:div[@property='skos:prefLabel'][@xml:lang=$lang])">
                        <xforms:insert context="instance('doc')" nodeset="./xhtml:div[@property='skos:prefLabel'][last()]" origin="instance('prefLabel-template')"/>
                        <xforms:setvalue ref="instance('doc')/xhtml:div[@property='skos:prefLabel'][last()]" value="context()"/>
                        <xforms:setvalue ref="instance('doc')/xhtml:div[@property='skos:prefLabel'][last()]/@xml:lang" value="$lang"/>
                    </xforms:action>
                </xforms:action>
</xforms:submission>
Thus it is possible to rapidly and easily incorporate new labels into Nomisma to facilitate multilingual interfaces in other projects which depend on it for data (like OCRE and the UVA collection).
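The logic of the submission above (iterate over every rdfs:label, and add it only when no preferred label in that language exists yet) can be restated in Python. This is a hedged sketch using only the standard library; `merge_labels` and `SAMPLE_RDF` are hypothetical names, and the sample document is a hand-written stand-in rather than real dbpedia output.

```python
import xml.etree.ElementTree as ET

RDFS_LABEL = "{http://www.w3.org/2000/01/rdf-schema#}label"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hand-written stand-in for a dbpedia RDF/XML response (illustration only).
SAMPLE_RDF = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Rome">
    <rdfs:label xml:lang="en">Rome</rdfs:label>
    <rdfs:label xml:lang="de">Rom</rdfs:label>
    <rdfs:label xml:lang="it">Roma</rdfs:label>
  </rdf:Description>
</rdf:RDF>"""


def merge_labels(existing, rdf_xml):
    """Add rdfs:label values from RDF/XML for languages not yet present.

    `existing` maps language code -> preferred label and is not modified.
    """
    merged = dict(existing)
    root = ET.fromstring(rdf_xml)
    for label in root.iter(RDFS_LABEL):
        lang = label.get(XML_LANG)
        # Skip untagged or empty labels, and languages already covered.
        if lang and label.text and lang not in merged:
            merged[lang] = label.text
    return merged


labels = merge_labels({"en": "Rome (mint)"}, SAMPLE_RDF)
```

As in the XForms version, an existing label is never overwritten: the English label already in the id is kept, and only the German and Italian labels are added.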

Workflow


Since the ids need to be maintained in GitHub in the long-term, the editing workflow requires the loading and saving of XHTML+RDFa fragments in the filesystem rather than through a REST interface like eXist.

The workflow is as follows:
  • Load existing id from filesystem or create new one
  • Edit the id
  • Save id. When the document is valid, the save button becomes enabled, and clicking the save button initiates several processes:
      1. Serialize the ids to XML and save back to the filesystem
      2. Serialize the XHTML+RDFa into RDF
      3. Using SPARQL/Update, POST the RDF back into the endpoint.  Since using POST adds new triples into the subject (e.g., http://nomisma.org/id/rome) (creating duplicate triples), the subject must first be flushed from the endpoint before the RDF is sent to Fuseki.  Therefore the following SPARQL query must be sent to the endpoint before the newly-edited RDF is inserted (wonky and unintuitive, but necessary with SPARQL/Update):
         DELETE {?s ?p ?o} WHERE { <http://nomisma.org/id/rome> ?p ?o . ?s ?p ?o . FILTER (?s = <http://nomisma.org/id/rome>) }
  • After the RDF is updated in the endpoint, the XHTML+RDFa is serialized into a Solr XML document and posted into the search index (for the Atom feed, although we may implement faceted search/browse eventually). After the doc is sent, a commit is sent to Solr.
  • Finally, a nightly cron job adds new files into the GitHub repo, and then changes are committed and pushed into GitHub.  Another job then runs to generate RDF dumps of the Nomisma data, which are available on the nomisma.org home page.
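The flush-then-insert step in the workflow above can be sketched as a pair of SPARQL Update strings built in Python. This is an illustration under stated assumptions, not the Orbeon implementation: `flush_subject_update` and `save_steps` are hypothetical helpers, the `DELETE WHERE` shorthand from SPARQL 1.1 Update is used in place of the longer query quoted above (it should have the same effect), and sending the new triples as `INSERT DATA` is an assumption about how the RDF might be posted.

```python
import re

# Conservative check: reject characters not allowed inside a SPARQL IRI reference.
_UNSAFE_IRI_CHARS = re.compile(r'[<>"{}|^`\\\s]')


def flush_subject_update(subject_uri):
    """Build an update that removes every triple whose subject is `subject_uri`."""
    if _UNSAFE_IRI_CHARS.search(subject_uri):
        raise ValueError(f"unsafe characters in URI: {subject_uri!r}")
    return f"DELETE WHERE {{ <{subject_uri}> ?p ?o }}"


def save_steps(subject_uri, ntriples):
    """Return the update requests in the order they would be sent.

    `ntriples` is the freshly serialized RDF for the subject, as N-Triples text.
    """
    return [
        flush_subject_update(subject_uri),    # 1. clear the stale triples
        f"INSERT DATA {{ {ntriples} }}",      # 2. add the newly edited ones
    ]
```

The ordering is the point: if the delete is skipped, triples from the previous version of the id remain in the store alongside the new ones.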
This is the gist of the editing workflow in the new version of Nomisma.  I plan to improve the XHTML+RDFa editing templates to support a greater degree of complexity in the data model.  Additionally, I aim to create an administrative interface to better manage datasets provided by other institutions.  The endpoint includes not only Nomisma ids, but RDF provided by OCRE, UVA, CHRR, the ANS, and a portion of the Berlin coinage for Augustus.  I want to be able to get VoID RDF files from new data contributors and do consistency checks on RDF dumps before ingesting them into Fuseki.  I also want to be able to delete or update all triples from a single institution.  This functionality will come eventually.  It will become a higher priority once there are more contributors of numismatic data to Nomisma.

All the code discussed above is, of course, open source: https://github.com/ewg118/nomisma/tree/master/xforms

Tuesday, July 16, 2013

Nomisma: A More Detailed Look at Public Features

On Friday, the new Nomisma.org was launched, and the previous blog post included a general overview of the new APIs and SPARQL endpoint.  In this post, I'll discuss in more detail some specific new features that have been implemented and how these features function under the hood.

Maps


While hoard and mint pages included Google Maps that displayed KML in the previous version of Nomisma, the process by which this KML is generated has changed in the new version (in addition to the move to OpenLayers).  The id for Rome (http://nomisma.org/id/rome), for example, includes a map with KML generated from a SPARQL query.  One point is created for the mint and multiple points generated for the findspots of coins minted in Rome.  This query gathers all of the findspots, defined as an http://nomisma.org/id/findspot [nm:findspot], regardless of whether the mint of Rome is defined within the object RDF itself explicitly or implicitly through a reference to a coin type URI.  See the following query:

PREFIX rdf:      <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms:  <http://purl.org/dc/terms/>
PREFIX nm:       <http://nomisma.org/id/>
PREFIX skos:      <http://www.w3.org/2004/02/skos/core#>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT DISTINCT ?object ?findspot ?lat ?long ?title ?prefLabel WHERE {
{?type nm:mint <http://nomisma.org/id/rome> .
?object nm:type_series_item ?type.
?object nm:findspot ?findspot .
?findspot geo:lat ?lat .
?findspot geo:long ?long
}
UNION {
?object nm:mint <http://nomisma.org/id/rome> .
?object nm:findspot ?findspot .
?findspot geo:lat ?lat .
?findspot geo:long ?long
}
OPTIONAL {?object skos:prefLabel ?prefLabel}
OPTIONAL {?object dcterms:title ?title}
}
The SPARQL results are piped through an XSLT stylesheet that converts them into KML placemarks (see the templates with mode="kml").
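To make the conversion concrete, here is a rough Python equivalent of that transformation. It is a sketch only: Nomisma does this in XSLT over the SPARQL XML results, whereas this example assumes the standard SPARQL JSON results format, and `bindings_to_kml` is a hypothetical name.

```python
from xml.sax.saxutils import escape


def bindings_to_kml(results):
    """Convert SPARQL JSON results with ?lat, ?long and optional labels to KML."""
    placemarks = []
    for row in results["results"]["bindings"]:
        # Prefer the title, then the preferred label, then the findspot URI.
        name = next(
            (row[key]["value"] for key in ("title", "prefLabel", "findspot") if key in row),
            "",
        )
        lat = float(row["lat"]["value"])
        lon = float(row["long"]["value"])
        # KML expects longitude first, then latitude.
        placemarks.append(
            "<Placemark><name>{}</name><Point><coordinates>{},{}</coordinates></Point></Placemark>".format(
                escape(name), lon, lat
            )
        )
    return (
        '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
        + "".join(placemarks)
        + "</Document></kml>"
    )
```

The coordinate order is the detail most worth noting: the query returns latitude and longitude as separate variables, while a KML `coordinates` element lists longitude before latitude.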

Maps are generated in similar fashion for coin types (see http://nomisma.org/id/rrc-299.1b).  A point is created for the mint from which the coin was issued, and multiple points are created for findspots associated with that type.  KML is also created for hoards defined by Nomisma, such as IGCH 664, though in this case the KML is serialized not from SPARQL results, but by aggregating the related mint XHTML records (which is more efficient than initiating multiple SPARQL queries).  While the maps have only been implemented on hoard, mint, and coin type record pages, they can be implemented for other types of ids in the future: for example, to show the mints active under the authorities Alexander the Great or Diocletian and the circulation of their coinage, or to show the mints and circulation of denarii or tetradrachms.

Examples of Coin Types


Coin type ids will show examples of the type, similar to OCRE.  The record for RRC 299/1b shows, beneath the map, one example: 2012.92, located in the Virginia Museum of Fine Arts (although the data continues to be published by the University of Virginia Library).  The query for this page looks like:

PREFIX rdf:      <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms:  <http://purl.org/dc/terms/>
PREFIX nm:       <http://nomisma.org/id/>
           
SELECT ?object ?title ?publisher ?identifier ?collection ?weight ?axis ?diameter ?obvThumb ?revThumb ?obvRef ?revRef  WHERE {
?object nm:type_series_item <http://nomisma.org/id/rrc-299.1b>.
?object rdf:type <http://nomisma.org/id/coin>.
?object dcterms:title ?title .
?object dcterms:publisher ?publisher .
OPTIONAL { ?object dcterms:identifier ?identifier } .
OPTIONAL { ?object nm:collection ?collection } .
OPTIONAL { ?object nm:weight ?weight }
OPTIONAL { ?object nm:axis ?axis }
OPTIONAL { ?object nm:diameter ?diameter }
OPTIONAL { ?object nm:obverseThumbnail ?obvThumb }
OPTIONAL { ?object nm:reverseThumbnail ?revThumb }
OPTIONAL { ?object nm:obverseReference ?obvRef }
OPTIONAL { ?object nm:reverseReference ?revRef }}
ORDER BY ASC(?publisher)

Flickr Images


For the time being (unless and until abuse of the system occurs), Nomisma will display Flickr images which have been tagged with Nomisma-compliant machine tags.  See for example the record for Augustus, which displays one picture uploaded by the Portable Antiquities Scheme tagged with nomisma:authority=augustus.  Nomisma leverages the Flickr APIs to query for associated machine tags and display up to twelve images.

<xsl:variable name="predicate" select="if ($typeof='roman_emperor') then 'authority' else $typeof"/>
<xsl:variable name="photos" as="element()*">
  <xsl:copy-of select="document(concat($service, '&amp;method=flickr.photos.search&amp;per_page=12&amp;machine_tags=nomisma:', $predicate, '=', $id))/*"/>
</xsl:variable>
<xsl:if test="count($photos//photo) &gt; 0">
  <div class="center">
    <h3>Flickr Images of this Typology (<a href="http://www.flickr.com/photos/tags/nomisma:{$predicate}={$id}">See all photos.</a>)</h3>
    <xsl:for-each select="$photos//photo">
      <div class="flickr_thumbnail">
        <a href="http://www.flickr.com/photos/{@owner}/{@id}" title="{@title}"><img src="{document(concat($service, '&amp;method=flickr.photos.getSizes&amp;photo_id=', @id))//size[@label='Thumbnail']/@source}" alt="{@title}"/></a>
      </div>
    </xsl:for-each>
  </div>
</xsl:if>
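The way the stylesheet derives the machine tag and the search request can be restated in Python. This is a sketch, not Nomisma's code: `machine_tag` and `search_url` are hypothetical names, and `service` is assumed to be a base URL that already carries the API key, as `$service` appears to in the XSLT.

```python
from urllib.parse import urlencode


def machine_tag(typeof, nomisma_id):
    """Mirror the XSLT rule: Roman emperors are tagged as authorities."""
    predicate = "authority" if typeof == "roman_emperor" else typeof
    return f"nomisma:{predicate}={nomisma_id}"


def search_url(service, typeof, nomisma_id, per_page=12):
    """Append the photo search parameters to a base service URL."""
    params = urlencode({
        "method": "flickr.photos.search",
        "per_page": per_page,
        "machine_tags": machine_tag(typeof, nomisma_id),
    })
    return f"{service}&{params}"
```

Unlike the string concatenation in the stylesheet, `urlencode` percent-encodes the colon and equals sign inside the tag value, which keeps the query string unambiguous.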

Friday, July 12, 2013

New and Improved Nomisma.org Released

Today we have released the new and improved Nomisma.org.  There have been some updates to fix consistency problems in the data.  One of the major improvements is the importation of multilingual labels pulled from dbpedia through the XForms-based editing interface (I'll write up a blog post to discuss the Nomisma back-end eventually, and probably write up something more detailed for publication in code4lib or CAA).  In addition to data improvements, we have introduced some major new functionalities:

SPARQL endpoint (http://nomisma.org/sparql)


The new Nomisma server employs an RDF triplestore and SPARQL endpoint based on Apache Fuseki.  The endpoint is detailed in an older post, "How to Participate in OCRE."  Instead of launching a triplestore specifically for OCRE, we have launched it more generally for Nomisma.  Therefore, we can insert into it RDF describing non-Roman imperial coins.  OCRE, then, queries the Nomisma triplestore to power its mapping, quantitative analysis, and thumbnail-displaying capabilities.  When we release the new linked data-aware OCRE (which will contain all RIC types through Commodus) within the next few weeks, I'll provide a more detailed blog post about how OCRE functions.

The Nomisma SPARQL page, linked above, contains a number of query examples to get you started.  For example, you can get the average weight of RIC Augustus 1a with the following query:

PREFIX rdf:      <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms:  <http://purl.org/dc/terms/>
PREFIX nm:    <http://nomisma.org/id/>
PREFIX xs:    <http://www.w3.org/2001/XMLSchema#>
SELECT (AVG(xs:decimal(?weight)) AS ?average)
WHERE {
?g nm:type_series_item <http://numismatics.org/ocre/id/ric.1(2).aug.1a>.
?g nm:weight ?weight
}

Or get all findspots for the same coin type:

PREFIX rdf:      <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms:  <http://purl.org/dc/terms/>
PREFIX nm:       <http://nomisma.org/id/>
PREFIX skos:      <http://www.w3.org/2004/02/skos/core#>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?object ?findspot ?lat ?long ?title ?prefLabel WHERE {
?object nm:type_series_item <http://numismatics.org/ocre/id/ric.1(2).aug.1a> .
?object nm:findspot ?findspot .
?findspot geo:lat ?lat .
?findspot geo:long ?long .
OPTIONAL {?object skos:prefLabel ?prefLabel}
OPTIONAL {?object dcterms:title ?title}
}
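Queries like these can also be sent programmatically. The following Python sketch builds (without sending) a request for the average-weight query using only the standard library. It assumes the endpoint accepts the standard SPARQL protocol `query` parameter over GET and can return JSON results; `build_request` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://nomisma.org/sparql"  # endpoint address given in this post

AVERAGE_WEIGHT = """PREFIX nm: <http://nomisma.org/id/>
PREFIX xs: <http://www.w3.org/2001/XMLSchema#>
SELECT (AVG(xs:decimal(?weight)) AS ?average)
WHERE {
  ?g nm:type_series_item <%s> .
  ?g nm:weight ?weight
}"""


def build_request(type_uri, endpoint=ENDPOINT):
    """Build a SPARQL protocol GET request for the average weight of a coin type."""
    query = AVERAGE_WEIGHT % type_uri
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the JSON results format (assumed to be supported by the endpoint).
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request("http://numismatics.org/ocre/id/ric.1(2).aug.1a")
```

Passing `request` to `urllib.request.urlopen` would execute the query; the parentheses in the RIC type URI are percent-encoded by `urlencode`, so the URI survives the trip intact.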


APIs (http://nomisma.org/apis)


We have introduced, so far, a small handful of APIs to expedite certain common SPARQL queries, for example, getting the average weight, diameter, or axis of a particular typology, or getting the closing date of a hoard given a selection of coin types.  One may also get a list of RDF representations of Nomisma IDs or NUDS documents representing coin types defined by Nomisma in one aggregated serialization.  These APIs expedite the data-loading processes in CHRR and other Numishare-based projects.

Flickr Machine Tags (http://nomisma.org/flickr)


Although not exactly a function of Nomisma, we encourage the use of machine tags in Flickr to associate photographs with numismatic concepts defined by Nomisma.  We have documented this methodology at the link above.

Updates to Nomisma Atom Feed

With the launch of the new Nomisma.org, the Atom feed pagination format will change slightly and be more consistent with the format introduced by Numishare applications.

In the older Nomisma, pagination was treated in the URL as follows:

http://nomisma.org/feed/2/?q=*:*

In the new version, the query more closely resembles Lucene queries:

http://nomisma.org/feed/?q=*:*&start=100
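A client that walks the feed could build new-style URLs as follows. This is a minimal sketch: `feed_url` is a hypothetical helper, and treating `start` as a zero-based offset of the first entry is an assumption based on the Lucene-style convention mentioned above.

```python
from urllib.parse import urlencode


def feed_url(query="*:*", start=0, base="http://nomisma.org/feed/"):
    """Build a new-style feed URL for the given query and offset."""
    params = {"q": query}
    if start:
        params["start"] = start
    # Keep '*' and ':' readable instead of percent-encoding them.
    return base + "?" + urlencode(params, safe="*:")
```

Calling `feed_url(start=100)` reproduces the example URL shown above.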

Additionally, burial_start and burial_end have been removed, along with the geographic search.  I recommend using the SPARQL interface for these sorts of sophisticated queries.

The new feed incorporates opensearch and includes alternate links to the RDF serialization for each entry.  See the original blog post for further documentation.