= Open Bio* Semantic Web tools =

== Participants ==

(Please add your name!)

 * Toshiaki Katayama
 * Erick Antezana
 * Shuichi Kawashima
 * Mitsuteru Nakao
 * Brad Chapman
 * Peter Cock
 * Jan Aerts
 * Kyung-Hoon Kwon
 * Raoul Bonnal
 * Christian Zmasek

=== RDF libraries in each language ===

(Please add other languages as well!)

Name of the available libraries:

 * Ruby
   * ActiveRDF: may depend on Rails
   * Many more are available; see http://raa.ruby-lang.org/search.rhtml?search=rdf
 * Python
   * RDFLib http://www.rdflib.net/
   * SPARQLWrapper http://sparql-wrapper.sourceforge.net/
 * Perl
   * RDF: generic RDF library
   * [http://search.cpan.org/dist/ONTO-PERL/ ONTO-PERL]: might be usable, although it is OBO-centric (http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btn042)

Functionality:

 * RDF reader (in)
 * RDF generator (out)

|| language || in || out ||
|| Ruby || ? || ? ||
|| Python || o || ? ||
|| Perl || o || ? ||

=== Tools to manipulate RDF data ===

 * RDF grep
 * RDF diff
 * RDF sort
 * RDF uniq
 * RDF wc (triple count)
 * RDF cat (combine)

Notes:

 * (Most of) these commands will output an RDF graph :)
 * Running "sort" before "diff" gives better results.
 * It would be great if "diff" had an option to ignore blank ("empty") nodes, whose internal IDs can shift even when the graphs are almost the same.

=== RDF converters ===

 * general:
   * RDF --> JSON (for [http://couchdb.apache.org/ CouchDB], [http://www.mongodb.org/ MongoDB] etc.)
 * biological:
   * Bio DB entries -> RDF (in [http://bioruby.org BioRuby] for [http://togows.dbcls.jp TogoWS], for example)
   * OBO <-> RDF (used in [http://cellcycleontology.org Cell Cycle Ontology] because the ".obo" file format is easy for humans to read)

=== Interface for SPARQL endpoint ===

A DBI-like interface for sending a SPARQL query and obtaining the result.

 * automatically map results to the relevant language objects
 * a SPARQL endpoint could then be used as a data source for [http://usegalaxy.org Galaxy], for example.
 * results can sometimes be huge

Language availability:

|| language || generic || [http://4store.org/trac/wiki/ClientLibraries 4store] ||
|| Ruby || o || o ||
|| Python || o || o ||
|| Perl || ? || x ||
|| Java || o || o ||
|| R || ? || ? ||

 * 4store
   * database server http://4store.org/
   * install doc in Japanese http://lifesciencedb.g.hatena.ne.jp/nakao_mitsuteru/20100105/1262713627
   * binary download for Mac http://dl.dropbox.com/u/152468/4store-1.0.2.dmg
   * [wiki:4storeQuickPrimer]
   * client library for Ruby/Python/Java http://4store.org/trac/wiki/ClientLibraries
 * Virtuoso
   * database server http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/
     * -> http://sourceforge.net/projects/virtuoso/files/

Try the "short read archive (metadata) SPARQL endpoint", which will be developed by the NextGenSeq open space discussion group.

== Galaxy integration ==

 * Plugin development
 * Documentation

== Day2 ==

RDF manipulation tools and libraries that produce RDF from biological objects are important for data providers, but most users are consumers of the data. We will therefore focus on how to access public biological SPARQL endpoints easily from each Open Bio* library.

Suggested endpoint to play with:

 * [http://www.semantic-systems-biology.org/biogateway/endpoint BioGateway endpoint]
   * More info at: [http://www.semantic-systems-biology.org/biogateway/querying Semantic Systems Biology]

Implementations:

 * Initial Python client for BioGateway: http://chapmanb.posterous.com/biohackathon-2010-day-2-python-sparql-query-b
 * Initial Python query client for InterMine: http://chapmanb.posterous.com/biohackathon-2010-day-3-fish-interoperating-a

Todo:

 * Survey the existing SPARQL libraries in each language

Todo:

 * Can we have a common interface for the major biological SPARQL endpoints?
 * Can we have a nice SPARQL query builder?
 * Can retrieved results be converted into the language's native objects?
 * What about getting not only a tabular result but also a result displayed as a graph (nodes + edges)?
   (See the [http://www.semantic-systems-biology.org/biogateway/sparql-viewer/ BioGateway browser].)

Note:

 * Half of the !BioRuby group will also work on developing the DB (e.g. KEGG) -> RDF generator (to be used in Bio2RDF and TogoWS)
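
Example sketch:

The DBI-like interface discussed under "Interface for SPARQL endpoint", together with the Todo item about converting retrieved results into language objects, could look roughly like the following in Python using only the standard library. This is an illustration under stated assumptions, not any project's actual client: `SparqlConnection`, `execute`, `iter_rows` and `to_python` are hypothetical names, and the sketch assumes that the endpoint accepts queries via HTTP GET and can return the W3C SPARQL JSON results format, which has not been checked for the endpoints listed on this page.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, Iterator

XSD = "http://www.w3.org/2001/XMLSchema#"

# Small, deliberately incomplete map from XSD datatypes to Python constructors.
_CASTS = {
    XSD + "integer": int,
    XSD + "int": int,
    XSD + "long": int,
    XSD + "decimal": float,
    XSD + "double": float,
    XSD + "float": float,
    XSD + "boolean": lambda s: s in ("true", "1"),
}


def to_python(term: Dict[str, str]) -> Any:
    """Convert one RDF term from a SPARQL JSON result into a Python value."""
    value = term["value"]
    if term.get("type") in ("literal", "typed-literal"):
        cast = _CASTS.get(term.get("datatype", ""))
        if cast is not None:
            return cast(value)
    # URIs, blank nodes and untyped literals are left as plain strings.
    return value


def iter_rows(document: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one dict per query solution, keyed by variable name.

    Unbound variables are reported as None so every row has the same keys.
    """
    variables = document["head"]["vars"]
    for binding in document["results"]["bindings"]:
        yield {
            name: to_python(binding[name]) if name in binding else None
            for name in variables
        }


class SparqlConnection:
    """Hypothetical DBI-style handle for a single SPARQL endpoint."""

    def __init__(self, endpoint_url: str, timeout: float = 30.0):
        self.endpoint_url = endpoint_url
        self.timeout = timeout

    def execute(self, query: str) -> Iterator[Dict[str, Any]]:
        """Send a SELECT query over HTTP GET and iterate over the rows."""
        url = self.endpoint_url + "?" + urllib.parse.urlencode({"query": query})
        request = urllib.request.Request(
            url, headers={"Accept": "application/sparql-results+json"}
        )
        with urllib.request.urlopen(request, timeout=self.timeout) as response:
            document = json.load(response)
        return iter_rows(document)


if __name__ == "__main__":
    # Offline demonstration with a made-up result document (no network access).
    sample = {
        "head": {"vars": ["protein", "length"]},
        "results": {
            "bindings": [
                {
                    "protein": {"type": "uri", "value": "http://example.org/P1"},
                    "length": {
                        "type": "literal",
                        "datatype": XSD + "integer",
                        "value": "42",
                    },
                }
            ]
        },
    }
    for row in iter_rows(sample):
        print(row)
```

With a live endpoint, the intended usage would be along the lines of `for row in SparqlConnection(endpoint_url).execute("SELECT ..."): ...`. Note that this sketch still reads the whole response into memory before yielding rows, so for the huge results mentioned above it would need LIMIT/OFFSET paging or a streaming parser.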