Version 4 (modified by markw, 15 years ago) |
---|
Which URI to use in your RDF?
Guideline 1. Use the data provider's URL as your identifier, unless it uses GET arguments; in that case, it is more stable to use a service such as LSRN to do the redirect for you. Arguments expose an interface that is prone to change: if it changes, every triplestore on earth has to be updated, whereas with a redirect only one update is required.
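A minimal sketch of Guideline 1 in Python: keep the provider URL when it has no query string, otherwise fall back to a redirecting URI. The function name `choose_identifier` and the `http://lsrn.org/<namespace>:<id>` pattern are assumptions made for illustration; check the redirect service's own documentation for its actual URI scheme.

```python
from urllib.parse import urlsplit

# Assumed redirect pattern, for illustration only.
REDIRECT_PATTERN = "http://lsrn.org/{namespace}:{identifier}"


def choose_identifier(provider_url, namespace, identifier):
    """Return the provider URL unless it relies on GET arguments."""
    if urlsplit(provider_url).query:
        # A query string exposes an interface that may change, so prefer
        # a redirecting URI that can be updated in a single place.
        return REDIRECT_PATTERN.format(namespace=namespace, identifier=identifier)
    return provider_url
```

Called with the two PubMed URLs from Guideline 2, this would keep the short `/pubmed/22177139` form and replace the Entrez query form with the redirecting URI.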
Guideline 2. Often a provider offers multiple URLs for the same resource. For example, the Entrez query http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=22177139 is the same as http://www.ncbi.nlm.nih.gov/pubmed/22177139 (the community recommends using the latter), and http://www.ebi.uniprot.org/entry/P05067 is the same as http://purl.uniprot.org/uniprot/P05067 (UniProt asks you to use the latter, which returns RDF if your HTTP Accept header requests it).
Document the URI pattern you use on Freebase, to encourage uniformity: http://www.freebase.com/view/user/biohackathon/default_domain/views/namespace_1 This is a first-come-first-served approach to picking the preferred URL pattern: the first data provider to cross-reference a third party gets to decide which URL pattern is recorded in Freebase as "approved" (ideally with the 'blessing' of the third-party data provider).
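Once a preferred pattern has been agreed, alternative URLs can be rewritten to it mechanically. The sketch below is one possible way to do this in Python, using only the two URL pairs given in Guideline 2; the table `PREFERRED_PATTERNS` and the function `to_preferred` are hypothetical names, and the regular expressions are assumptions about how those alternative URLs are shaped.

```python
import re

# (alternative URL pattern, preferred URI template).
# Both pairs come from the examples in Guideline 2; the regular
# expressions themselves are illustrative assumptions.
PREFERRED_PATTERNS = [
    (
        # Entrez query restricted to Db=pubmed, capturing the PubMed ID.
        re.compile(
            r"^http://www\.ncbi\.nlm\.nih\.gov/sites/entrez\?"
            r"(?=.*\bDb=pubmed\b)(?:.*&)?TermToSearch=(\d+)"
        ),
        "http://www.ncbi.nlm.nih.gov/pubmed/{0}",
    ),
    (
        re.compile(r"^http://www\.ebi\.uniprot\.org/entry/(\w+)$"),
        "http://purl.uniprot.org/uniprot/{0}",
    ),
]


def to_preferred(url):
    """Rewrite a known alternative URL to its preferred form."""
    for pattern, template in PREFERRED_PATTERNS:
        match = pattern.match(url)
        if match:
            return template.format(match.group(1))
    # Unknown URLs are returned unchanged.
    return url
```

Keeping the patterns in one table means a newly documented pattern only has to be added in a single place.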
Guideline 3. When there is more than one official URI for a concept (e.g. PDB is available at PDBj, PDBe and RCSB PDB), choose the one that you prefer and add owl:sameAs statements linking it to each of the other URIs.
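As a rough illustration of Guideline 3, the following Python sketch emits owl:sameAs statements as N-Triples lines. The helper `same_as_triples` is a hypothetical name, and the `example.org` URIs are placeholders, not the real PDBj, PDBe or RCSB PDB patterns; substitute the URIs each provider actually publishes.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triples(preferred, alternatives):
    """Return N-Triples lines linking the preferred URI to each alternative."""
    return [
        "<{0}> <{1}> <{2}> .".format(preferred, OWL_SAME_AS, alt)
        for alt in alternatives
        if alt != preferred  # no need to state that a URI equals itself
    ]


# Placeholder URIs: the real patterns must come from each provider.
preferred = "http://example.org/rcsb/1ABC"
mirrors = ["http://example.org/pdbj/1ABC", "http://example.org/pdbe/1ABC"]
for line in same_as_triples(preferred, mirrors):
    print(line)
```

The output lines can be appended to the rest of your RDF so that consumers who chose a different official URI can still join on the same concept.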