Version 7 (modified by tore.eriksson, 15 years ago)
Which URI to use in your RDF?
Guideline 1. Use the data provider's URL as your identifier, unless it contains GET arguments (a query string). In that case, it is more stable to let a redirect service such as LSRN resolve the identifier for you. GET arguments expose an interface that is prone to change: if it changes, every triplestore that holds the URL has to be updated, whereas with a redirect only the one redirect rule needs updating.
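As a rough illustration of Guideline 1, the check below tests whether a provider URL carries GET arguments and, if so, falls back to a redirect-style URI. This is only a sketch: the `REDIRECT_BASE` value and the `namespace:id` layout are assumptions made for the example, not a confirmed LSRN URL scheme, and the function names are hypothetical.

```python
from urllib.parse import urlsplit

# Assumed redirect base and "namespace:id" layout, for illustration only.
REDIRECT_BASE = "http://lsrn.org/"


def has_get_arguments(url: str) -> bool:
    """Return True if the URL carries a query string (GET arguments)."""
    return bool(urlsplit(url).query)


def choose_identifier(provider_url: str, namespace: str, record_id: str) -> str:
    """Keep the provider URL unless it has GET arguments (Guideline 1)."""
    if has_get_arguments(provider_url):
        # Fall back to a redirect-style URI built from namespace and ID.
        return f"{REDIRECT_BASE}{namespace}:{record_id}"
    return provider_url


if __name__ == "__main__":
    # Hypothetical provider URLs on example.org.
    print(choose_identifier("http://example.org/record/42", "demo", "42"))
    print(choose_identifier("http://example.org/lookup?id=42", "demo", "42"))
```

The point of the design is that only the redirect service needs to know the provider's current query interface; the URI stored in your triples stays the same.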
Guideline 2. A provider often offers multiple URLs for the same resource. For example, the Entrez query http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=22177139 is the same as http://www.ncbi.nlm.nih.gov/pubmed/22177139 (the community recommends using the latter), and http://www.ebi.uniprot.org/entry/P05067 is the same as http://purl.uniprot.org/uniprot/P05067 (UniProt asks you to use the latter, which will return RDF if your HTTP request headers ask for it).
Document the URI pattern you use on Freebase, to encourage uniformity: http://www.freebase.com/view/user/biohackathon/default_domain/views/namespace_1 (Is this the correct Freebase page? It doesn't seem to provide URI patterns.) This is a first-come, first-served approach to picking the preferred URL pattern: the first data provider to cross-reference a third party gets to decide which URL pattern is recorded in Freebase as "approved" (ideally with the 'blessing' of the third-party data provider).
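One way to apply Guideline 2 in a data pipeline is to keep a small table of preferred URI patterns and rewrite known alternate forms to them before writing triples. The sketch below covers only the two examples given in the guideline; the `PREFERRED_PATTERNS` table and the `normalize` function are hypothetical names, and the matching rules are assumptions derived from those two example URLs.

```python
from urllib.parse import urlsplit, parse_qs

# Preferred patterns, taken from the two examples in Guideline 2.
PREFERRED_PATTERNS = {
    "pubmed": "http://www.ncbi.nlm.nih.gov/pubmed/{id}",
    "uniprot": "http://purl.uniprot.org/uniprot/{id}",
}


def normalize(url: str) -> str:
    """Rewrite a known alternate URL to its preferred form, else return it unchanged."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)

    # Entrez query form of a PubMed record.
    if (
        parts.netloc == "www.ncbi.nlm.nih.gov"
        and parts.path == "/sites/entrez"
        and query.get("Db") == ["pubmed"]
        and "TermToSearch" in query
    ):
        return PREFERRED_PATTERNS["pubmed"].format(id=query["TermToSearch"][0])

    # Older UniProt entry page.
    prefix = "/entry/"
    if parts.netloc == "www.ebi.uniprot.org" and parts.path.startswith(prefix):
        return PREFERRED_PATTERNS["uniprot"].format(id=parts.path[len(prefix):])

    return url


if __name__ == "__main__":
    print(normalize("http://www.ebi.uniprot.org/entry/P05067"))
```

Keeping the patterns in one table also gives you something concrete to publish when documenting the pattern you chose.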
Guideline 3. Sometimes there is more than one provider, and thus more than one official URI, for a record about a common concept. This can happen with consortia; e.g. PDB is available at PDBj, PDBe and RCSB PDB. In that case, choose the one you prefer. Consider adding rdfs:seeAlso references between them, in strong preference to owl:sameAs, which gives misleading semantics.
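To make the cross-linking in Guideline 3 concrete, the sketch below emits N-Triples that connect every mirror URI to every other one. Note that the seeAlso property is defined in the RDFS vocabulary (rdfs:seeAlso). The mirror URIs are placeholders on example.org, not the real PDBj, PDBe or RCSB PDB entry URLs, and `see_also_triples` is a hypothetical helper name.

```python
from itertools import permutations

# IRI of rdfs:seeAlso in the RDF Schema vocabulary.
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def see_also_triples(uris):
    """Return N-Triples lines linking each URI to every other via rdfs:seeAlso."""
    return [
        f"<{subject}> <{RDFS_SEE_ALSO}> <{obj}> ."
        for subject, obj in permutations(uris, 2)
    ]


if __name__ == "__main__":
    # Placeholder URIs standing in for one record at three mirrors.
    mirrors = [
        "http://example.org/mirror-a/1ABC",
        "http://example.org/mirror-b/1ABC",
        "http://example.org/mirror-c/1ABC",
    ]
    for line in see_also_triples(mirrors):
        print(line)
```

Unlike owl:sameAs, these links only say that the other pages hold related information, so a reasoner will not merge the three records into one resource.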