== The first ever official session of Data Providers who will offer RAW DATA ==

After a 1.5-hour brainstorming discussion, the group took a break and got ready for the Data Provider session.

Representatives from the following official data providers were present:

 * DBCLS [http://dbcls.rois.ac.jp/]
 * DDBJ [http://www.ddbj.nig.ac.jp/]
 * Korean HapMap [http://www.khapmap.org/]
 * KEGG [http://www.genome.jp/]
 * PDBj [http://www.pdbj.org/]
 * UniProt [http://www.uniprot.org/]
 * TreeBASE [http://treebase.org v.1], [http://treebase-dev.nescent.org:6666/treebase-web v.2]

Representatives from the following data integration projects, each offering a SPARQL endpoint, were also present:

 * http://bio2rdf.org/ [http://delicious.com/fbelleau/bio2rdf:sparql SPARQL endpoint list]
 * http://www.semantic-systems-biology.org/ [http://www.semantic-systems-biology.org/biogateway/endpoint SPARQL endpoint]
 * http://hcls.deri.ie/ [http://hcls.deri.ie/sparql SPARQL endpoint]

These are the objectives that were discussed:

 * RAW DATA should be produced by each data provider as an N-TRIPLES dump and made available from its FTP or HTTP server
 * A SPARQL endpoint should be made available for each dataset
 * Standard URIs should be used in triples
 * Standard predicates should be used in triples

== RAW Data dump ==

UniProt already offers raw data in RDF from its own [ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/ FTP server], and on the fly for query results by appending '&format=rdf' to the URL. TreeBASE also does the latter, and can easily make an RDF dump available. DDBJ, PDBj and KEGG will do the same.

== SPARQL endpoint ==

A number of SPARQL endpoints are already available, but none is operated by an official data provider. There is a need for a hosting service offering SPARQL endpoints into which the official raw data supplied by the data providers can be loaded.

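As a rough illustration of how a client would address such an endpoint, the sketch below builds a SPARQL protocol GET request URL for the HCLS endpoint listed above. The helper name `sparql_get_url` and the sample query are assumptions made for this example only, and no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sparql_get_url(endpoint: str, query: str) -> str:
    """Return a SPARQL protocol GET URL (hypothetical helper, no network access)."""
    # Use '&' if the endpoint URL already carries a query string, '?' otherwise.
    separator = "&" if urlsplit(endpoint).query else "?"
    # The SPARQL protocol passes the query text in the 'query' parameter.
    return endpoint + separator + urlencode({"query": query})


# Endpoint taken from the list above; the query itself is only a sample.
ENDPOINT = "http://hcls.deri.ie/sparql"
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

request_url = sparql_get_url(ENDPOINT, QUERY)
print(request_url)
```

The same helper would apply unchanged to any endpoint a data provider hosts later, since only the endpoint address differs.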
== Standard URIs ==

{{{
Rule #1
When a data provider has given a dereferenceable URI to a topic, this is the only accepted URI that should be published on the [http://dig.csail.mit.edu/breadcrumbs/node/215 GGG]. Other data providers must refer to it in their own datasets.
}}}

Currently, the following data providers have created their own dereferenceable URIs:

 * UniProt
 * Gene Ontology

The following providers will apply Rule #1 when they start to publish RDF:

 * DBCLS
 * DDBJ
 * KEGG
 * PDBj

Now that data providers will share common URI naming, it is necessary to adopt a simple design rule for URIs.

{{{
Rule #2
The syntax of a dereferenceable URI is as follows:
http://providerDomainName/publicNamespace/privateId
}}}

For example, the following URIs are valid:

 * http://purl.uniprot.org/uniprot/P17710
 * http://pdbj.org/pdbid/2yhx
 * http://dbcls.jp/insc/AAB57760
 * http://genome.jp/ec/2.7.1.1

but this one is not:

 * http://www.genome.jp/dbget-bin/www_bget?ec:2.7.1.1

== Standard predicates ==

== Tokyo Manifesto ==

Rules 1, 2 and 3 constitute the Tokyo Manifesto, which only data providers may endorse.
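Rule #2 lends itself to a mechanical check. The following is a minimal sketch, assuming the rule means an http URI with exactly two path segments (public namespace, then private id) and no query string or fragment. The function name `follows_rule_2` is hypothetical, and this reading of the rule is an interpretation, not part of the manifesto.

```python
from urllib.parse import urlsplit


def follows_rule_2(uri: str) -> bool:
    """Check a URI against the assumed reading of Rule #2 (hypothetical helper)."""
    parts = urlsplit(uri)
    # Assumption: the URI must use http and name a provider domain.
    if parts.scheme != "http" or not parts.netloc:
        return False
    # Assumption: query strings and fragments are not allowed.
    if parts.query or parts.fragment:
        return False
    # The path must be exactly /publicNamespace/privateId.
    segments = parts.path.split("/")
    return len(segments) == 3 and segments[0] == "" and all(segments[1:])


# URIs taken from the examples given under Rule #2.
examples = [
    "http://purl.uniprot.org/uniprot/P17710",
    "http://pdbj.org/pdbid/2yhx",
    "http://dbcls.jp/insc/AAB57760",
    "http://genome.jp/ec/2.7.1.1",
    "http://www.genome.jp/dbget-bin/www_bget?ec:2.7.1.1",
]

for uri in examples:
    print(uri, follows_rule_2(uri))
```

Under this reading the four URIs listed as valid pass the check, and the dbget-bin URL is rejected because it carries a query string.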