Open Bio* Semantic Web tools
Participants
(Please add your name!)
- Toshiaki Katayama
- Erick Antezana
- Shuichi Kawashima
- Mitsuteru Nakao
- Brad Chapman
- Peter Cock
- Jan Aerts
- Kyung-Hoon Kwon
- Raoul Bonnal
- Christian Zmasek
- Keun-Joon Park
- Thomas Kappler
RDF libraries in each language
(Please add other languages as well!)
Name of the available libraries:
- Ruby
- ActiveRDF: may depend on Rails
- Many more are available; see http://raa.ruby-lang.org/search.rhtml?search=rdf
- Python
- RDFLib http://www.rdflib.net/
- SPARQLWrapper http://sparql-wrapper.sourceforge.net/
- Perl
- RDF::Trine: generic, actively developed RDF library
- RDF::Redland: Perl wrapper for the C library Redland (librdf.org)
- List of modules and community at http://www.perlrdf.org
- ONTO-PERL: might also be usable, although it is OBO-centric (http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btn042)
- StateOfPerlAndRdf
Functionality
- RDF reader (in)
- RDF generator (out)
- SPARQL query helpers
language | in | out | query |
Ruby | ? | ? | ? |
Python | o | o | ? |
Perl | o | o | o |
Tools to manipulate RDF data
- RDF grep
- RDF diff
- RDF sort
- RDF uniq
- RDF wc (triple count): tkappler implemented a tiny one while experimenting with RDF in Perl; it is at http://github.com/thomas11/perl-rdf-experiments and works only with small files
- RDF cat (combine)
- (Most of) these commands will output an RDF graph :)
- It is better to run "sort" before "diff".
- It would be great if "diff" had an option to ignore blank nodes ("empty nodes"), whose internal IDs can shift even when the graphs are almost the same.
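The wc, sort, uniq and diff ideas above can be sketched as follows. This is only a minimal sketch: it assumes the input is line-oriented N-Triples (one triple per line), uses only the Python standard library, and all function names are hypothetical. Masking blank node labels is a crude approximation of "ignore empty nodes", not a real graph isomorphism check.

```python
import re

# Blank node labels in N-Triples look like _:b0, _:genid12, ...
# (crude pattern: it would also match such text inside a literal)
BNODE = re.compile(r"_:[A-Za-z0-9_]+")

def triples(text):
    """Return the non-empty, non-comment lines of an N-Triples document."""
    lines = (line.strip() for line in text.splitlines())
    return [l for l in lines if l and not l.startswith("#")]

def rdf_wc(text):
    """Triple count (duplicates included)."""
    return len(triples(text))

def rdf_sort_uniq(text):
    """Sorted triples with duplicates removed."""
    return sorted(set(triples(text)))

def mask_bnodes(line):
    """Replace every blank node label with the same placeholder."""
    return BNODE.sub("_:x", line)

def rdf_diff(a, b, ignore_bnodes=False):
    """Return (only_in_a, only_in_b) as sorted lists of triples."""
    norm = mask_bnodes if ignore_bnodes else (lambda l: l)
    sa = {norm(l) for l in triples(a)}
    sb = {norm(l) for l in triples(b)}
    return sorted(sa - sb), sorted(sb - sa)
```

Comparing sets makes the result independent of the input order, which is the same effect as running "sort" before "diff".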
RDF converters
- general:
- biological:
- Bio DB entries -> RDF (in BioRuby for TogoWS, for example)
- OBO <-> RDF (used in Cell Cycle Ontology because the ".obo" file format is easy for humans to read): now in ONTO-PERL.
- Bio2RDF converters:
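For the "Bio DB entries -> RDF" item, a converter can be as small as the sketch below. It is not the BioRuby/TogoWS or Bio2RDF implementation: the base URI, the vocabulary URIs and the function names are all hypothetical, and the entry is assumed to be a flat dictionary of fields that are emitted as plain N-Triples literals.

```python
def nt_literal(text):
    """Escape a string as an N-Triples plain literal."""
    escaped = (text.replace("\\", "\\\\").replace('"', '\\"')
                   .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"'

def entry_to_ntriples(entry_id, fields, base="http://example.org/"):
    """Emit one triple per field of a flat database entry.

    `base` is a placeholder namespace, not a real vocabulary.
    """
    subject = f"<{base}entry/{entry_id}>"
    lines = []
    for key, value in sorted(fields.items()):
        predicate = f"<{base}vocab/{key}>"
        lines.append(f"{subject} {predicate} {nt_literal(str(value))} .")
    return "\n".join(lines)

if __name__ == "__main__":
    # Illustrative entry only; the field names are made up.
    print(entry_to_ntriples("hsa:7157", {"name": "TP53", "organism": "Homo sapiens"}))
```

A real converter would map fields to an agreed vocabulary and emit URIs rather than literals for cross-references.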
Interface for SPARQL endpoint
A DBI-like interface for sending SPARQL queries and retrieving results.
- automatically map results to the relevant language objects
- a SPARQL endpoint can then be used as a data source for Galaxy, for example
- results can sometimes be huge
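As a rough illustration of the DBI-like idea, the sketch below turns a response in the SPARQL Query Results JSON format into one Python dictionary per row, and splits a query into LIMIT/OFFSET pages for large results. The function names and the datatype mapping are assumptions made for this example, not an existing Open Bio* API.

```python
import json

XSD = "http://www.w3.org/2001/XMLSchema#"
# Assumed mapping from XSD datatypes to Python constructors; extend as needed.
CASTS = {XSD + "integer": int, XSD + "int": int,
         XSD + "double": float, XSD + "float": float}

def to_python(term):
    """Map one RDF term of a SPARQL JSON result to a plain Python value."""
    value = term["value"]
    if term["type"] in ("literal", "typed-literal"):
        cast = CASTS.get(term.get("datatype"))
        return cast(value) if cast else value
    return value  # URIs and blank node labels stay strings

def rows(json_text):
    """Yield one dict per solution, like a DBI cursor yielding rows."""
    doc = json.loads(json_text)
    variables = doc["head"]["vars"]
    for binding in doc["results"]["bindings"]:
        # Unbound variables are absent from the binding; report them as None.
        yield {v: to_python(binding[v]) if v in binding else None
               for v in variables}

def paged(query, page_size):
    """Yield copies of `query` with LIMIT/OFFSET appended, page by page.

    The generator is endless; the caller stops when a page comes back empty.
    """
    offset = 0
    while True:
        yield f"{query} LIMIT {page_size} OFFSET {offset}"
        offset += page_size
```

Note that `rows` still parses the whole response in memory; paging the query is the simple way to keep each response small.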
Language availability
language | generic | 4store |
Ruby | o | o |
Python | o | o |
Perl | o | x |
Java | o | o |
R | ? | ? |
- 4store
- database server http://4store.org/
- install doc in Japanese http://lifesciencedb.g.hatena.ne.jp/nakao_mitsuteru/20100105/1262713627
- binary download for mac http://dl.dropbox.com/u/152468/4store-1.0.2.dmg
- 4storeQuickPrimer
- client library for Ruby/Python/Java http://4store.org/trac/wiki/ClientLibraries
- Supports the HTTP-based SPARQL Protocol, so it can be used from any language that can do GET requests. A wrapper is implemented by RDF::Query for Perl.
- Virtuoso
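Because the SPARQL Protocol only needs an HTTP GET with a "query" parameter, a client can be written with the standard library alone. The sketch below is an assumption-laden example, not a 4store client library: the function names are hypothetical, and the endpoint URL in the usage note is only a guess at a local server address.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

def sparql_get_request(endpoint, query,
                       accept="application/sparql-results+json"):
    """Build (but do not send) an HTTP GET request for a SPARQL query.

    Assumes `endpoint` has no query string of its own.
    """
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": accept})

def run(endpoint, query):
    """Send the request and return the raw response body as text."""
    with urlopen(sparql_get_request(endpoint, query)) as response:
        return response.read().decode("utf-8")
```

With a server listening locally one might call `run("http://localhost:8000/sparql/", "SELECT * WHERE { ?s ?p ?o } LIMIT 5")`; the host, port and path depend on how the server was started.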
Try the "short read archive (metadata) SPARQL endpoint", which will be developed by the NextGenSeq open space discussion group.
Galaxy integration
- Plugin development
- Documentation
- See ONTO-Toolkit
Day2
RDF manipulation tools and libraries that produce RDF from biological objects are important for data providers. Most users, however, are consumers of the data, so we will focus on making public biological SPARQL endpoints easy to access from each Open Bio* library.
Suggested endpoint to play with:
Implementations:
- Python SPARQL client for BioGateway:
- Initial Python query client for InterMine: http://chapmanb.posterous.com/biohackathon-2010-day-3-fish-interoperating-a
- Ruby SPARQL client:
- ActiveRDF: this is the original package, but it seems to be buggy with our endpoints. Two possible solutions:
- We tested with Bio2RDF endpoints and plan to support BioGateway as well.
- We can explore the graph dynamically.
- Agile SPARQL is available now: see the example on Raoul's blog.
Todo:
- Survey existing SPARQL libraries in each language
Todo:
- Can we have a common interface for the major biological SPARQL endpoints?
- Can we have a nice SPARQL query builder? (see below for OWL notes)
- Convert retrieved results into the language's objects?
- What about getting not only a tabular result but also a result displayed as a graph (nodes + edges)? (See the BioGateway browser)
- Map directly onto Bio* internal objects
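The "graph instead of table" item can be illustrated with a small sketch. It assumes the query returns rows with subject, predicate and object columns (here named s, p and o, an arbitrary choice) and that each row is already a Python dictionary; the function name is hypothetical.

```python
from collections import defaultdict

def to_graph(rows, s="s", p="p", o="o"):
    """Group (subject, predicate, object) rows into nodes and labelled edges."""
    nodes = set()
    edges = defaultdict(list)  # subject -> [(predicate, object), ...]
    for row in rows:
        nodes.update((row[s], row[o]))
        edges[row[s]].append((row[p], row[o]))
    return nodes, dict(edges)
```

The returned adjacency dictionary could then be handed to a drawing tool or mapped onto Bio* objects.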
Note:
- Half of the BioRuby group will also work on a DB (e.g. KEGG) -> RDF generator (to be used in Bio2RDF and TogoWS)
Using OWL to support dynamic queries and exploration of RDF data
This is a short summary of a discussion on Friday, 2010-02-12.
The plan is to work on a nice SPARQL query builder, as a joint effort between the Bio* communities. It could make use of an OWL ontology as a description of the data: it lists what things there are, and how they are related. These relations could be offered to the user, via a simple API to the programmer or even on a web page to the biologist.
Erick pointed out the Manchester OWL Syntax as a possibly related effort.
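To make the query builder idea more concrete, here is a minimal sketch. The SCHEMA dictionary is a hand-written, hypothetical stand-in for what would be read from an OWL ontology, the "ex:" namespace is a placeholder, and the class and method names are invented for this example.

```python
PREFIXES = ("PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
            "PREFIX ex: <http://example.org/>\n")

# Hypothetical description of the data: for each class, the properties
# that link it to other things (None means a literal value).
SCHEMA = {
    "ex:Protein": {"ex:encodedBy": "ex:Gene", "rdfs:label": None},
    "ex:Gene": {"ex:locatedOn": "ex:Chromosome", "rdfs:label": None},
}

def relations(cls):
    """List the properties that can be offered to the user for a class."""
    return sorted(SCHEMA.get(cls, {}))

class Query:
    """Tiny SELECT builder; every followed relation adds a new variable."""

    def __init__(self, cls, var="x"):
        self.cls = cls
        self.var = var
        self.patterns = [f"?{var} a {cls} ."]
        self.variables = [var]

    def follow(self, prop, var):
        """Add a triple pattern for `prop`, checked against the schema."""
        if prop not in SCHEMA.get(self.cls, {}):
            raise ValueError(f"{prop} is not defined for {self.cls}")
        self.patterns.append(f"?{self.var} {prop} ?{var} .")
        self.variables.append(var)
        return self  # allow chaining

    def sparql(self):
        """Render the query text, including the prefix declarations."""
        head = "SELECT " + " ".join("?" + v for v in self.variables)
        return PREFIXES + head + " WHERE { " + " ".join(self.patterns) + " }"
```

For example, `Query("ex:Protein", "prot").follow("ex:encodedBy", "gene").sparql()` yields a SELECT over ?prot and ?gene, while `relations("ex:Protein")` gives the list that a web page could offer to a biologist.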
Attachments
- sparql.patch (18 bytes), added by bonnal 15 years ago: apply this patch to fix a connection problem for SPARQL endpoints.