Changes between Version 47 and Version 48 of ImplementationBootcamp

Timestamp:
2010/02/11 22:32:33 (13 years ago)
Author:
RutgerVos
  • ImplementationBootcamp

    v47 v48  
    3030Then use [http://protege.stanford.edu/ Protege] to actually build the ontology. 
    3131 
    32 > MDW:  I highly recommend that you "make friends" with someone who has a deep understanding of OWL, and the consequences of various OWL constructs, as you go through your learning experience.  While the existing tutorials are good for telling you what is possible, they aren't always entirely clear about the consequences of choosing one encoding method versus another... and this dramatically affects your ability to "reason over" your data!! Unfortunately, there are few shortcuts - OWL is hard!   
     32It is highly recommended that you "make friends" with someone who has a deep understanding of OWL, and the consequences of various OWL constructs, as you go through your learning experience.  While the existing tutorials are good for telling you what is possible, they aren't always entirely clear about the consequences of choosing one encoding method versus another, and this dramatically affects your ability to "reason over" your data. Unfortunately, there are few shortcuts - OWL is hard!   
    3333 
    3434=== Which version of Protege should I use? === 
    3535 
    36 Why not the latest one? You get the current OWL 2. 
     36Protege 3 and Protege 4 are "philosophically" different, and represent a split in the global ontology community that runs roughly along the lines of the "OBO-fans" and the "OWL-DL-fans" (that's over-simplifying the situation, but I think it is by-and-large correct).  The two development communities had different target-audiences in mind when developing the software, and those audiences are reflected in the decisions made.  Protege 4 uses the Manchester OWL API "under the hood", and is somewhat more capable of manipulating OWL than Protege 3 is.  On the other hand, if you are planning to use Protege to generate RDF data ("individuals") manually, then Protege 3 might be more useful for you. 
    3737 
    38 > MDW:  Protege 3 and Protege 4 are "philosophically" different, and represent a split in the global ontology community that runs roughly along the lines of the "OBO-fans" and the "OWL-DL-fans" (that's over-simplifying the situation, but I think it is by-and-large correct).  The two development communities had different target-audiences in mind when developing the software, and those audiences are reflected in the decisions made.  Protege 4 uses the Manchester OWL API "under the hood", and is somewhat more capable of manipulating OWL than Protege 3 is (IMO).  On the other hand, if you are planning to use Protege to generate RDF data ("individuals") manually, then Protege 3 might be more useful for you.  This is all entirely my opinion, so please don't flame me if you are a fan of one or the other :-) 
     38=== In my generated RDF, what namespace URI do I use to prefix my terms? === 
    3939 
    40 === How do I namespace my terms? === 
    41  
    42 It is best to do this such that they can actually be resolved (unlike XML), preferably to an OWL file, e.g. "http://example.org/terms.owl#" 
    43  
    44 > MDW:  Can we re-phrase the question to be clear what we are asking?  :-) 
     40It is best to do this using a real URL so that the terms can actually be resolved (unlike XML namespace URIs, which need not resolve), preferably to an OWL file, e.g. xmlns:foo="http://example.org/terms.owl#", so that the construct "foo:bar" can be resolved. 
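
As a minimal sketch of what "resolving" a prefixed term means: the prefix is replaced by its namespace URI, and the local name is appended to give a full, dereferenceable URL. The helper name `expand_curie` and the `PREFIXES` table are illustrative assumptions, not part of any particular RDF library; only the example namespace is taken from the text above.

```python
def expand_curie(curie, prefixes):
    """Expand a prefixed name such as "foo:bar" into a full URI.

    `prefixes` maps a prefix (e.g. "foo") to its namespace URI.
    """
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local


# Hypothetical prefix table; the namespace is the example from the text.
PREFIXES = {"foo": "http://example.org/terms.owl#"}

print(expand_curie("foo:bar", PREFIXES))
# prints http://example.org/terms.owl#bar
```

If the namespace URI points at a real OWL file, a client that follows the expanded URL can retrieve the definition of the term.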
    4541 
    4642===  What are the similarities and differences between the various shared names proposals?  === 
     
    5450===  I have an analysis tool, how do I expose it as a semantic web resource? === 
    5551 
    56 > MDW:  SADI please :-)   Luke gave the Java tutorial today, and I gave the Perl tutorial.  Edward Kawas from my lab has produced movies detailing how to create services in Perl for SADI, and I will be doing the voice-over for these movies and putting them up on YouTube in the next week.  I will add a link here.  We will do the same for the Java side once we have the extra-cool Java functionalities coded and ~stable.  In particular, Luke McCarthy and Paul Gordon have been working together at the Hackathon finding simple ways to put SADI Java services into the Google Cloud... so you might not even have to consume your own compute resources to achieve this! 
     52SADI provides one solution.  Luke McCarthy gave a Java tutorial Thursday 11 February 2010, and Mark Wilkinson gave the Perl tutorial on the same day.  Edward Kawas from the Wilkinson lab has produced movies detailing how to create services in Perl for SADI, and Mark will be doing the voice-over for these movies and putting them up on YouTube in the second week of February 2010.  Mark will add a link here.  The same will be done for the Java side once we have the extra-cool Java functionalities coded and ~stable.  In particular, Luke McCarthy and Paul Gordon have been working together at the Hackathon finding simple ways to put SADI Java services into the Google Cloud... so you might not even have to consume your own compute resources to achieve this! 
    5753 
    5854===  When someone calls GET on my URLs, what should I return in order to be semantic webby? === 
     
    8682 
    8783=== Which XSL processor should I use to convert legacy XML to RDF? === 
    88 @yokofakun uses xsltproc which work fine. For example see: 
     84Pierre Lindenbaum uses xsltproc which works fine. For example see: 
    8985  * http://plindenbaum.blogspot.com/2010/02/linkedinxslt-foaf-people-from.html 
    9086  * http://plindenbaum.blogspot.com/2010/02/searching-for-genotypes-with-sparql.html 
    91 @rvosa is using a stylesheet that uses XSL2.0 features, which libxslt doesn't like. He therefore uses [http://saxon.sourceforge.net/ saxon] to transform NeXML into RDF. 
     87Rutger Vos is using a stylesheet that uses XSLT 2.0 features, which libxslt (on which xsltproc is based) doesn't support. He therefore uses [http://saxon.sourceforge.net/ saxon] to transform NeXML into RDF. 
    9288 
    9389=== What to do with RDFa metadata? === 
     
    9692 
    9793=== I have a database in an RDBMS; how can I convert it directly to RDF? ===  
    98 Use protege plug-in? or porvide web service? 
    9994 
    100 There is [http://www4.wiwiss.fu-berlin.de/bizer/d2rq/ D2RQ] which works okey but lacks a bit performance-wise. 
     95Possibly by using a Protege plug-in (which one?) or by providing a web service. There is [http://www4.wiwiss.fu-berlin.de/bizer/d2rq/ D2RQ], which works okay but lacks a bit performance-wise. However, this really depends on whether or not you intend to publish your database as a SPARQL endpoint.  The poll that Pierre Lindenbaum and Mark Wilkinson took over the past couple of days suggests that only 5 data providers (within Tweet-shot of us) currently provide SQL access to their data resources.  This does not seem to bode well for having data providers set up SPARQL endpoints:  why would they open themselves to a new, unfamiliar technology when they don't open themselves to a well-known, tested, secure, and highly powerful technology?   
    10196 
    102 > MDW:  This really depends on whether or not you intend to publish your database as a SPARQL endpoint.  The poll that Pierre and I took over the past couple of days suggests that only 5 data providers (within Tweet-shot of us) currently provide SQL access to their data resources.  IMO this does not bode well for having data providers set-up SPARQL endpoints!!  (why would they open themselves to a new, unfamiliar technology when they don't open themselves to a well-known, tested, secure, and highly powerful technology???)   We have tried to make a compelling argument that exposing resources via SADI Web Services gives you the best of both worlds - a highly-granular control over what data you expose, how you expose it, and over the distribution of large numbers of requests over your compute-resources; yet our SHARE client helps make it *appear* that the entire world is one big SPARQL endpoint (on steroids, since you can SPARQL data that doesn't even exist until you ask the question!)  My opinion (biased!) is that SADI Web Services are a better way to expose RDF data compared to SPARQL endpoints.  Moreover, it doesn't require you to change your existing data infrastructure in any way - you don't need to have a triple-store to expose your data as triples via SADI.  With a Web Service-based exposure, you can migrate your data gradually/modularly, a few properties at a time, rather than attempting to move your entire database to the Semantic Web in one shot... and gain experience as you go!  Given that it is currently not (natively) possible to SPARQL query over multiple endpoints, you aren't losing anything by going the SADI route either.  Finally, '''all''' of your resources (both database and analytical tools) are exposed in exactly the same way, meaning that they are all accessed by clients in exactly the same way, simplifying client design :-) 
     97Mark Wilkinson's team have tried to make a compelling argument that exposing resources via SADI Web Services gives you the best of both worlds - a highly-granular control over what data you expose, how you expose it, and over the distribution of large numbers of requests over your compute-resources; yet the SHARE client helps make it *appear* that the entire world is one big SPARQL endpoint (on steroids, since you can SPARQL data that doesn't even exist until you ask the question!)   
     98 
     99Mark's opinion (biased!) is that SADI Web Services are a better way to expose RDF data compared to SPARQL endpoints.  Moreover, it doesn't require you to change your existing data infrastructure in any way - you don't need to have a triple-store to expose your data as triples via SADI.  With a Web Service-based exposure, you can migrate your data gradually/modularly, a few properties at a time, rather than attempting to move your entire database to the Semantic Web in one shot... and gain experience as you go.  Given that it is currently not (natively) possible to SPARQL query over multiple endpoints, you aren't losing anything by going the SADI route either.  Finally, '''all''' of your resources (both database and analytical tools) are exposed in exactly the same way, meaning that they are all accessed by clients in exactly the same way, simplifying client design. 
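
To illustrate the idea of exposing a few properties at a time without a triple-store, here is a rough sketch that reads selected columns from an existing relational table and emits them as N-Triples. It is not the SADI or D2RQ API; the function names, the `gene` table, the `BASE` subject prefix and the in-memory SQLite database are all assumptions made up for the example.

```python
import sqlite3
from urllib.parse import quote

# Hypothetical URI prefixes; substitute namespaces that actually resolve.
BASE = "http://example.org/gene/"
TERMS = "http://example.org/terms.owl#"


def escape_literal(value):
    """Escape a value for use inside a double-quoted N-Triples literal."""
    text = str(value)
    for raw, escaped in (("\\", "\\\\"), ('"', '\\"'), ("\n", "\\n"), ("\r", "\\r")):
        text = text.replace(raw, escaped)
    return text


def rows_to_ntriples(conn, table, key_column, columns):
    """Yield one N-Triples line per (row, exposed column) pair.

    Only the columns listed in `columns` are exposed, so more properties
    can be published later without touching the database itself.
    Table and column names must come from trusted configuration,
    because they are interpolated into the SQL text.
    """
    selected = ", ".join(f'"{c}"' for c in [key_column, *columns])
    for row in conn.execute(f'SELECT {selected} FROM "{table}"'):
        subject = f"<{BASE}{quote(str(row[0]), safe='')}>"
        for column, value in zip(columns, row[1:]):
            if value is None:
                continue  # no triple for missing values
            yield f'{subject} <{TERMS}{column}> "{escape_literal(value)}" .'


# Stand-in for an existing database (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene (id TEXT PRIMARY KEY, symbol TEXT, chromosome TEXT, notes TEXT)")
conn.execute("INSERT INTO gene VALUES ('g1', 'BRCA2', '13', 'internal only')")

# Expose just one property for now; the 'notes' column stays private.
for line in rows_to_ntriples(conn, "gene", "id", ["symbol"]):
    print(line)
```

Adding `"chromosome"` to the column list later would publish a second property without any change to the underlying schema, which is the gradual migration described above.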
    103100   
    104101=== How granular should my returned RDF be? ===