OBI ID Policy
Authors: Melanie Courtot, Alan Ruttenberg, Bill Bug and the OBI Consortium
Executive Summary
Background
As part of the release process, we want to produce files that have homogeneous identifiers (IDs) and stable URIs.
Like the rest of the OBO ontologies, we want to use PURL-based [1] URIs, because they can be redirected to a different URL should we want to change hosts, etc. (the current OBI URIs are based on Sourceforge).
Regarding the form of the URI itself, those who have expressed an opinion agree that we should give all entities that we define (classes, relations, and instances) IDs, and use labels for the human-readable version.
Three options proposed
Option 1: Current OBO practice
Example URI: http://purl.obofoundry.org/obo/owl/OBI#OBI_0010000
Pro: Current OBO practice.
Option 2
Example URI: http://purl.obofoundry.org/obo/OBI/OBI_0010000
Pro: "/" instead of "#" gets rid of problem named above. No "owl" in the name.
Cons: More verbose than necessary - we see "OBI" twice in the URI for classes.
Option 3: Removing the extra OBI
Example URI: http://purl.obofoundry.org/obo/OBI_0010000
Pro: As short as is sensible.
Cons: Issue with terms that don't have OBI ids, such as relations, instances, and classes that we might want to keep named.
CURRENT STATUS - approved by OBI
All agree on
- using purl based URIs.
- using IDs for everything from now on.
- option 3 (http://purl.obofoundry.org/obo/OBI_0010000) agreed upon
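To compare the three proposed forms side by side, here is a minimal sketch that builds each one for a given term. The function name candidate_uris and the split into a prefix and a numeric part are assumptions made for illustration; they are not part of the policy.

```python
# Hypothetical helper: builds the three URI forms discussed above for one term.
BASE = "http://purl.obofoundry.org/obo"


def candidate_uris(prefix: str, number: str) -> dict:
    """Return the option 1, 2 and 3 URIs for a term such as OBI_0010000."""
    term = f"{prefix}_{number}"
    return {
        "option 1": f"{BASE}/owl/{prefix}#{term}",  # current OBO practice
        "option 2": f"{BASE}/{prefix}/{term}",      # "/" instead of "#", no "owl"
        "option 3": f"{BASE}/{term}",               # the agreed form
    }


uris = candidate_uris("OBI", "0010000")
for name, uri in uris.items():
    print(name, uri)
```

Only the option 3 form is meant to be used from now on; the other two are shown for comparison.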
Basing our URIs on a domain we control/ use of DNS
Use http://purl.obofoundry.org/obo/OBI_0100102 instead of http://obi.sourceforge.net/ontology/OBI.owl#OBI_0100102
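As a rough sketch of what this replacement could look like when applied to existing annotations, the following rewrites Sourceforge-based URIs to the PURL form. The function name to_purl and the rule that only fragments of the form OBI_ followed by digits are rewritten are assumptions for illustration, not a description of the actual release scripts.

```python
import re

OLD_BASE = "http://obi.sourceforge.net/ontology/OBI.owl#"
NEW_BASE = "http://purl.obofoundry.org/obo/"
# Assumption: only terms that already have a numeric OBI ID are rewritten.
ID_PATTERN = re.compile(r"OBI_\d+")


def to_purl(uri: str) -> str:
    """Rewrite a Sourceforge-based OBI URI to its PURL form; leave others untouched."""
    if uri.startswith(OLD_BASE):
        local = uri[len(OLD_BASE):]
        if ID_PATTERN.fullmatch(local):
            return NEW_BASE + local
    return uri


print(to_purl("http://obi.sourceforge.net/ontology/OBI.owl#OBI_0100102"))
```

Names without a numeric ID, such as alternative_term_citation, are left unchanged by this sketch, since they first need an ID assigned.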
How to browse the whole ontology file?
Go to http://purl.obofoundry.org/obo/obi.owl to display the whole OBI.owl file.
What about our xml:base?
What is our default namespace?
And the ontology URI?
This is the address on the web, and the name of the ontology. It will be http://purl.obofoundry.org/obo/obi.owl in our case.
Can I still use something in the form of http://purl.obofoundry.org/obo/obi.owl#OBI_0100102?
No, you would not get the correct fragment in the file, and it wouldn't be correct to use that to annotate files.
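One general reason fragment-based URIs behave differently, offered here as background rather than as the consortium's stated rationale: an HTTP client strips the "#..." part before sending the request, so a PURL server never sees which term was asked for and cannot redirect per term. The sketch below uses the standard library's urldefrag to show what would actually be requested.

```python
from urllib.parse import urldefrag

uri = "http://purl.obofoundry.org/obo/obi.owl#OBI_0100102"

# urldefrag splits a URI into the part sent to the server and the fragment,
# which stays on the client side.
requested, fragment = urldefrag(uri)

print("requested from server:", requested)
print("kept by client:", fragment)
```

With the slash-based form http://purl.obofoundry.org/obo/OBI_0100102 there is no fragment, so the whole identifier reaches the server.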
Where to get the latest version of OBI?
When querying for a specific term, would users get anything besides the class itself?
They could. We could also include, e.g., the told superclasses, or whatever we think useful, as long as
a) the import brings in the rest of the semantics that is needed (but most browsers don't actually need the full semantics)
b) we don't say anything that conflicts with something in the ontology.
We could also add extra informative content, like a property that holds the OBO format text.
What happens to elements that don't currently have an ID (annotation property, relations...)?
<owl:AnnotationProperty rdf:ID="alternative_term_citation">
<definition>
formal citation of the source of the alternative_definition, e.g. identifier in external
database to indicate / attribute source(s) for the definition. Free text indicate / attribute source(s)
for the definition. EXAMPLE: Author Name, URI, MeSH Term C04, PUBMED ID, Wiki uri on 31.01.2007
</definition>
</owl:AnnotationProperty>
would become
<owl:AnnotationProperty rdf:about="OBI_6786789">
<rdfs:label>alternative_term_citation</rdfs:label>
<definition>formal citation of the source of the alternative_definition, e.g. identifier in external
database to indicate / attribute source(s) for the definition. Free text indicate / attribute source(s) for the
definition. EXAMPLE: Author Name, URI, MeSH Term C04, PUBMED ID, Wiki uri on 31.01.2007
</definition>
</owl:AnnotationProperty>
Note that we added the rdfs:label to keep the name of the property, and that we use rdf:about instead of rdf:ID.
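The transformation above can be sketched with Python's standard XML library. This is an illustration under stated assumptions, not the script used for the release: the function name assign_id is hypothetical, the input is reduced to a single empty element, and the new ID is simply passed in rather than allocated.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"


def assign_id(element, new_id):
    """Replace rdf:ID with rdf:about=new_id and keep the old name as rdfs:label."""
    old_name = element.attrib.pop(f"{{{RDF}}}ID")
    element.set(f"{{{RDF}}}about", new_id)
    label = ET.Element(f"{{{RDFS}}}label")
    label.text = old_name
    element.insert(0, label)  # put the label before any existing children
    return element


source = (
    f'<owl:AnnotationProperty xmlns:owl="{OWL}" xmlns:rdf="{RDF}" '
    'rdf:ID="alternative_term_citation"/>'
)
prop = assign_id(ET.fromstring(source), "OBI_6786789")
print(ET.tostring(prop, encoding="unicode"))
```

Existing child elements such as the definition are left in place; only the identifying attribute changes and a label is added.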
Does assigning IDs to everything have any consequence?
It shouldn't, modulo tool bugs. Chris Mungall also seemed to say this was OK, which reinforces that view.
What about the classes we are importing from ontologies that still use the traditional OBO format?
Nothing changes for classes we are importing; they keep their IDs and URIs.
e.g. <owl:Class rdf:about="http://purl.org/obo/owl/CL#CL_0000236">
<rdfs:label xml:lang="en">B cell</rdfs:label>
...
will stay as it is.
Later we will try to get CL to use a similar scheme: http://purl.obofoundry.org/obo/CL_0000236
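If other ontologies do adopt the similar scheme, the mapping from the traditional form would be mechanical. The sketch below is purely illustrative: the function name to_new_scheme and the regular expression are assumptions, and nothing here changes how imported classes are identified today.

```python
import re

# Traditional form: http://purl.org/obo/owl/<PREFIX>#<PREFIX>_<digits>
OLD_OBO = re.compile(r"^http://purl\.org/obo/owl/([A-Za-z]+)#\1_(\d+)$")
NEW_BASE = "http://purl.obofoundry.org/obo/"


def to_new_scheme(uri: str) -> str:
    """Map a traditional OBO URI to the proposed form; return other URIs unchanged."""
    match = OLD_OBO.match(uri)
    if match is None:
        return uri
    prefix, number = match.groups()
    return f"{NEW_BASE}{prefix}_{number}"


print(to_new_scheme("http://purl.org/obo/owl/CL#CL_0000236"))
```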
Sample file
Alan generated an archive including curation status and new IDs, http://groups.google.com/group/obi-developer/web/newids_curation_status.tgz, for all to test.
Note that this file uses thing.obofoundry.org, which we decided to change to purl.obofoundry.org at the suggestion of Chris Mungall.
The file also prototypes using a curation_status widget in Protege.
Future Work
Related discussions
changing our URIs to be purl based:
http://groups.google.com/group/obi-developer/browse_thread/thread/178768c384f4e9c6?hl=en
curation status widget:
http://groups.google.com/group/obi-developer/browse_thread/thread/e226710bcc7cd50e?hl=en
new purl(sort of) based OBI identifiers:
http://groups.google.com/group/obi-developer/browse_thread/thread/d83deb3313911cd6?hl=en
should all our string annotation property value be “untyped literals” i.e. localizable
Re: [Obi-devel] [Obi-svn-commit] SF.net SVN: obi: [346] branchDevelopment/trunk ( regarding OWL-Full with curation status)
References
[1] http://purl.org/, A PURL is a Persistent Uniform Resource Locator. Functionally, a PURL is a URL. However, instead of pointing directly to the location of an Internet resource, a PURL points to an intermediate resolution service. The PURL resolution service associates the PURL with the actual URL and returns that URL to the client. The client can then complete the URL transaction in the normal fashion. In Web parlance, this is a standard HTTP redirect.
[2] http://en.wikipedia.org/wiki/Domain_name_system, The Domain Name System (DNS) associates various sorts of information with so-called domain names; most importantly, it serves as the "phone book" for the Internet by translating human-readable computer hostnames, e.g. www.example.com, into the IP addresses, e.g. 208.77.188.166, that networking equipment needs to deliver information.
Acknowledgments
Thanks to Jonathan Rees and Chris Mungall for their help.