Version 12 (modified by admin, 15 years ago)
The first ever official session of Data Providers who will offer RAW DATA
After a 1.5-hour brainstorming discussion, the group took a break and got ready for the Data Provider session.
Representatives from the following official data providers were present:
- DBCLS http://dbcls.rois.ac.jp/
- DDBJ http://www.ddbj.nig.ac.jp/
- Korean HapMap http://www.khapmap.org/
- KEGG http://www.genome.jp/
- PDBj http://www.pdbj.org/
- UniProt http://www.uniprot.org/
- TreeBASE v.1, v.2
Representatives from the following data integration projects, each offering a SPARQL endpoint, were also present:
- http://bio2rdf.org/ SPARQL endpoint list
- http://www.semantic-systems-biology.org/ SPARQL endpoint
- http://hcls.deri.ie/ SPARQL endpoint
These are the objectives that were discussed:
- RAW DATA should be produced by each data provider as an N-Triples dump and made available from its FTP or HTTP server
- A SPARQL endpoint should be made available for each dataset
- Standard URIs should be used in triples
- Standard predicates should be used in triples
RAW Data dump
UniProt already offers raw data in RDF from its own FTP server, and serves query results as RDF on the fly when '&format=rdf' is appended to the URL. TreeBASE also does the latter and can easily make an RDF dump available. DDBJ, PDBj and KEGG will do the same.
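To make the N-Triples objective concrete, here is a minimal sketch of how a provider might serialize triples into a dump file, one triple per line. It is an illustration only, not any provider's actual export code: the function names are hypothetical, the sample triple (linking a UniProt URI to an EC number URI with rdfs:seeAlso) is invented for the example, and only the most common literal escapes are handled.

```python
def escape_literal(value: str) -> str:
    """Escape backslash, double quote, newline and carriage return in a literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriple(subject: str, predicate: str, obj: str, obj_is_uri: bool = True) -> str:
    """Format one triple as a single N-Triples line (no trailing newline)."""
    obj_term = f"<{obj}>" if obj_is_uri else f'"{escape_literal(obj)}"'
    return f"<{subject}> <{predicate}> {obj_term} ."


def write_dump(triples, path: str) -> None:
    """Write (subject, predicate, object, obj_is_uri) tuples to a file, one per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for s, p, o, is_uri in triples:
            handle.write(to_ntriple(s, p, o, is_uri) + "\n")


if __name__ == "__main__":
    # Illustrative triple only; the link itself is an assumption made for this example.
    print(
        to_ntriple(
            "http://purl.uniprot.org/uniprot/P17710",
            "http://www.w3.org/2000/01/rdf-schema#seeAlso",
            "http://genome.jp/ec/2.7.1.1",
        )
    )
```

Because each line is a complete statement, such a dump can be split, concatenated or loaded in parallel without any further parsing context, which is presumably why the format suits bulk exchange over FTP or HTTP.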
SPARQL endpoint
A number of SPARQL endpoints are already available, but none of them is run by an official data provider.
There is a need for a hosting service offering SPARQL endpoints into which the official raw data published by the data providers can be loaded.
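As a rough illustration of what a client of such a hosted endpoint would do, the sketch below builds a SPARQL query request over plain HTTP GET using only the Python standard library. The endpoint address `http://example.org/sparql` is a placeholder, not a real service, and the query is an arbitrary example; the request is constructed but deliberately not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint address; replace with the URL of a real hosting service.
ENDPOINT = "http://example.org/sparql"

# Example query: list up to ten statements about one UniProt entry.
QUERY = """
SELECT ?p ?o
WHERE { <http://purl.uniprot.org/uniprot/P17710> ?p ?o }
LIMIT 10
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) an HTTP GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    request = build_sparql_request(ENDPOINT, QUERY)
    print(request.full_url)
    # To actually run the query against a live endpoint:
    #   from urllib.request import urlopen
    #   with urlopen(request) as response:
    #       print(response.read().decode("utf-8"))
```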
Standard URIs
Rule #1 When a data provider has given a dereferenceable URI to a topic, this is the only accepted URI that should be published on the Giant Global Graph (GGG, http://dig.csail.mit.edu/breadcrumbs/node/215). Other data providers must refer to it in their own datasets.
Currently, the following data providers have created their own dereferenceable URIs:
- UniProt
- Gene Ontology
The following providers will apply Rule #1 when they start to publish RDF:
- DBCLS
- DDBJ
- KEGG
- PDBj
Now that data providers will share a common naming scheme for URIs, a simple design rule for URIs is needed.
Rule #2 The syntax of a dereferenceable URI is as follows: http://providerDomainName/publicNamespace/privateId
For example, the following URIs are valid:
- http://purl.uniprot.org/uniprot/P17710
- http://pdbj.org/pdbid/2yhx
- http://dbcls.jp/insc/AAB57760
- http://genome.jp/ec/2.7.1.1
but not this one:
- http://www.genome.jp/dbget-bin/www_bget?ec:2.7.1.1
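Rule #2 can be checked mechanically. The sketch below is one possible reading of the rule, not an official validator: it assumes that a conforming URI uses the http scheme, has exactly two path segments (namespace and identifier), and carries no query string or fragment. The function names `make_uri` and `is_valid_provider_uri` are hypothetical.

```python
from urllib.parse import urlparse


def make_uri(domain: str, namespace: str, private_id: str) -> str:
    """Assemble a URI of the form http://providerDomainName/publicNamespace/privateId."""
    return f"http://{domain}/{namespace}/{private_id}"


def is_valid_provider_uri(uri: str) -> bool:
    """Return True if the URI matches the Rule #2 pattern as interpreted here."""
    parts = urlparse(uri)
    if parts.scheme != "http" or not parts.netloc:
        return False
    # A query string, fragment or path parameters fall outside the pattern.
    if parts.query or parts.fragment or parts.params:
        return False
    # The path must be exactly "/namespace/id": an empty leading piece plus two segments.
    segments = parts.path.split("/")
    return len(segments) == 3 and segments[0] == "" and all(segments[1:])


if __name__ == "__main__":
    for candidate in (
        "http://purl.uniprot.org/uniprot/P17710",
        "http://pdbj.org/pdbid/2yhx",
        "http://dbcls.jp/insc/AAB57760",
        "http://genome.jp/ec/2.7.1.1",
        "http://www.genome.jp/dbget-bin/www_bget?ec:2.7.1.1",
    ):
        print(candidate, is_valid_provider_uri(candidate))
```

Under these assumptions the four valid examples pass, and the dbget URI is rejected because its identifier is carried in the query string rather than in the path.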
Standard predicates
Tokyo Manifesto
Rules 1, 2 and 3 constitute the Tokyo Manifesto, which only data providers may endorse.