Rene Van den Heuvel, Global Product Expert, Axiell:
What are Persistent Identifiers (PIDs) and how do they relate to Linked Open Data (LOD)?
More and more organizations these days share their data with other organizations and external systems. In fact, sharing data has become so important that referring to data managed elsewhere can make it unnecessary, even undesirable, to create redundant copies of that data.
To refer to data that is open and managed elsewhere, an unambiguous and permanent identifier is needed. This is called a Persistent Identifier (PID) or permanent identifier. A PID should never change once assigned, as other organizations will use it to refer to objects, metadata, concepts, person details, etc. that are managed in an organization’s or third party’s system.
Making data ‘Linked’ and ‘Open’ is relevant not only for the objects in a collection, but also for the names, descriptors and standardized terminology that are used to describe these objects.
In fact, any data may be shared as open data. For instance, when open vocabulary resources such as the Getty AAT are used to describe an object, the PIDs of the terms/concepts used can be stored in the Axiell Collections database. When object records are published as Linked Open Data, these records will contain references to these Getty AAT concepts.
This not only makes the meaning and context of each term/concept clear, but also allows links to be created with data from other datasets that use the same identifiable concepts/terms as descriptors. For example, if the keyword used to indicate the creation location is “London”, that is an ambiguous descriptor, as there are multiple places in the world called London. But if, besides (or even instead of) the term “London”, its source ID from Getty’s TGN is listed (http://vocab.getty.edu/tgn/7011781), then it is clear which London is meant.
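As a minimal sketch of this idea, the Python snippet below finds records from two datasets that refer to the same place by comparing concept URIs rather than labels. The record layout, field names and sample titles are hypothetical and not taken from Axiell Collections; only the TGN URI comes from the text above.

```python
# TGN URI for London, as quoted in the text above.
TGN_LONDON = "http://vocab.getty.edu/tgn/7011781"

# Hypothetical records from two organizations. Labels differ or are ambiguous;
# only the concept URI identifies the place unambiguously.
museum_a = [
    {"title": "Object A1", "place_label": "London", "place_uri": TGN_LONDON},
    {"title": "Object A2", "place_label": "London", "place_uri": None},
]
museum_b = [
    {"title": "Object B1", "place_label": "Londen", "place_uri": TGN_LONDON},
]


def linked_by_place(uri, *datasets):
    """Return the titles of all records whose place URI equals `uri`."""
    return [
        record["title"]
        for dataset in datasets
        for record in dataset
        if record.get("place_uri") == uri
    ]


linked = linked_by_place(TGN_LONDON, museum_a, museum_b)
print(linked)
```

Object A2 is left out because its label alone does not say which London is meant, while Object B1 is matched even though its label is spelled differently.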
Therefore, it is important to be able to manage PIDs for all entities in the system. In some cases, the PIDs are the organization’s own IDs; in other cases, PIDs from external parties are stored in the system.
PIDs are the very basis of Linked Open Data.
Cooperation between NDE, Packed and several museums
To discuss how PIDs and Linked Open Data technology could be made available and accessible to Axiell’s large number of clients, a unique cooperation was started with NDE (Dutch Digital Heritage Network in The Netherlands), Packed (Digital Heritage Flanders, Belgium, now called Meemoo) and a number of museums and heritage organizations that use Axiell’s collections management systems.
In various workshops, the workgroup discussed which implementation of PIDs and LOD would be most beneficial to the heritage sector and how it could be made available to that sector.
This cooperation resulted in an implementation of PID management in Axiell Collections that is flexible and easy to use and that is now available to any organization that wishes to use it.
Furthermore, the workgroup created templates for the publication of data in LOD formats (e.g. Europeana Data Model (EDM) and Dublin Core (DC)). Axiell have adopted these templates and made them part of the standardized solutions in Axiell Collections, to ensure they are maintained and kept up to date in future releases of the software.
PID Implementation in Axiell Collections
The latest version of the Axiell Collections application (V5.0) includes PID management functionality. This implementation contains PID fields for all entities in the Axiell Collections database.
Default (initial) URIs and URLs can be configured, which the system then assigns automatically when records are created, optionally using a GUID (Globally Unique Identifier), also called a UUID (Universally Unique Identifier).
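To illustrate the principle of assigning a UUID-based URI on record creation, here is a minimal Python sketch. It is not Axiell Collections configuration or code; the base URI and the function name `mint_pid` are assumptions made up for the example.

```python
import uuid

# Hypothetical base URI; a real setup would use the organization's own domain.
BASE_URI = "https://data.example.org/object/"


def mint_pid(base_uri: str = BASE_URI) -> str:
    """Return a new URI made of the base URI plus a random (version 4) UUID."""
    return base_uri + str(uuid.uuid4())


pid = mint_pid()
print(pid)
```

Because the UUID is generated randomly, each call yields a different URI, so the printed value will differ on every run.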
URIs may refer to your own internet domain, but may also be registered with external service providers using the Handle System. This ensures that if your organization’s name changes in the future, and your internet domain name changes as a result, the PIDs/URIs that link to your data remain unchanged.
URIs from the Handle System use the handle.net domain, combined with a unique ID for the museum/archive, called the prefix. Such a URI could, for example, look like this: https://hdl.handle.net/11259/collection.20356, where 11259 is the prefix of the Amsterdam Museum and the section after the prefix is the internal ID of an object in the museum’s collection.
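The structure of such a URI can be made explicit with a short Python sketch that splits it into prefix and internal ID. The function name `split_handle` is hypothetical, and the check on the host name is an assumption based on the example URI above.

```python
from urllib.parse import urlparse


def split_handle(uri: str) -> tuple[str, str]:
    """Split a hdl.handle.net URI into (prefix, internal ID)."""
    parsed = urlparse(uri)
    if parsed.netloc != "hdl.handle.net":
        raise ValueError(f"not a hdl.handle.net URI: {uri}")
    # The path looks like "/<prefix>/<internal ID>"; split on the first slash.
    prefix, separator, suffix = parsed.path.lstrip("/").partition("/")
    if not separator or not prefix or not suffix:
        raise ValueError(f"expected '<prefix>/<internal ID>' in: {uri}")
    return prefix, suffix


prefix, suffix = split_handle("https://hdl.handle.net/11259/collection.20356")
print(prefix)  # 11259
print(suffix)  # collection.20356
```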
When using external service providers for PID registration, these external URI’s need to be forwarded to the actual URL that links to the object or other data. Therefore, the external URI needs to be associated with the actual URL. This association is done by registering the URI/URL pair with an external service provider such as SurfSARA.
When the URI (with the handle domain) is referenced, the service provider makes sure it is forwarded to the actual URL. So, when the URL changes in the future (e.g. because a name change results in a new internet domain name), it is simply a matter of registering the new URL against the permanent URI in the service provider’s system.
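The forwarding principle can be sketched as a simple lookup table from permanent URI to current URL. The Python class below is a toy in-memory stand-in, not the Handle System or SurfSARA interface; the class name `PidRegistry` and both target URLs are hypothetical.

```python
class PidRegistry:
    """Toy in-memory stand-in for a resolver service (not a real API)."""

    def __init__(self) -> None:
        self._targets: dict[str, str] = {}

    def register(self, pid: str, url: str) -> None:
        """Create or update the URL that a persistent URI points to."""
        self._targets[pid] = url

    def resolve(self, pid: str) -> str:
        """Return the URL currently registered for a persistent URI."""
        return self._targets[pid]


PID = "https://hdl.handle.net/11259/collection.20356"  # example URI from the text

registry = PidRegistry()
registry.register(PID, "https://old-name.example.org/objects/20356")  # hypothetical URL
before = registry.resolve(PID)

# The organization's domain changes: only the registration is updated,
# while the persistent URI that others have stored stays the same.
registry.register(PID, "https://new-name.example.org/objects/20356")  # hypothetical URL
after = registry.resolve(PID)

print(before)
print(after)
```

Anyone who stored the persistent URI keeps a working reference, because only the entry on the resolver side changes.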
The current PID registration implementation in Axiell Collections uses Handle.net and SurfSARA as the service provider, but other systems may be implemented as well.
The URI/PID registration process in Axiell Collections allows the authorized user to select any number of records that are ready for publication as Linked Open Data or that need updating.
Publication/presentation of Linked Open Data
Making your data available as Linked Open Data requires that your data is expressed in a format that other systems can interpret. While an HTML web page of an object is fine for a human to read, a system needs to receive the data in a structured and unambiguous form.
Therefore, various standards have been developed that express data in a form that other systems can understand. Usually these are Resource Description Framework (RDF) formats; RDF is a standard model for data interchange on the web. The formats currently available in the Axiell Collections LOD implementation include Dublin Core, Europeana Data Model, Linked Art (based on CIDOC-CRM) and Schema.org.
The XSLT stylesheets for these formats can be linked to the Axiell WebAPI to transform your data into these formats. Other formats can be added when needed.
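To give a sense of what such machine-readable output can look like, the Python sketch below builds a very small RDF/XML description using Dublin Core terms. It does not use the Axiell WebAPI or its XSLT stylesheets and does not reproduce their output; the function name `describe_object`, the choice of properties and the sample title are assumptions, while the two URIs are the examples quoted earlier in this article.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"

# Use readable prefixes in the serialized XML.
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)
ET.register_namespace("dcterms", DCTERMS)


def describe_object(pid: str, title: str, place_uri: str) -> str:
    """Return an RDF/XML string describing one object identified by `pid`."""
    root = ET.Element(f"{{{RDF}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": pid}
    )
    # The title is a plain literal value.
    ET.SubElement(description, f"{{{DC}}}title").text = title
    # The place is a reference to an external concept, not a text label.
    ET.SubElement(
        description, f"{{{DCTERMS}}}spatial", {f"{{{RDF}}}resource": place_uri}
    )
    return ET.tostring(root, encoding="unicode")


rdf_xml = describe_object(
    "https://hdl.handle.net/11259/collection.20356",  # example PID from the text
    "Example object",  # hypothetical title
    "http://vocab.getty.edu/tgn/7011781",  # TGN URI for London from the text
)
print(rdf_xml)
```

The object is identified by its PID and the place is given as a link to the TGN concept, so a receiving system can follow both references without having to interpret free text.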