Changes between Version 58 and Version 59 of ImplementationBootcamp

Timestamp:
2010/02/12 10:11:19 (14 years ago)
Author:
jerven
Comment:

--

  • ImplementationBootcamp

    v58 v59  
    5555 * URI documentation must be open so that it can be replicated and reused. 
    5656 
     57=== UniProt PURLS === 
     58 
     59UniProt uses its own PURLs not only for its own data but also for all the cross-references that we link to. 
     60 * For example, purl.uniprot.org/HGNC/37122 redirects to http://www.genenames.org/data/hgnc_data.php?hgnc_id=37122 
     61 * We do this because we have to keep links stable into the future: when one of our cross-referenced databases changes its URLs, we change the redirection, so there is a single place to maintain. 
     62 * When merging datasets, people have to collapse these different URLs into one, using either a regexp or owl:sameAs statements. 
     63 * The benefit of using UniProt PURLs when available is that there is an ongoing maintenance effort. 
     64 * They are documented in [http://www.uniprot.org/docs/dbxref dbxref] for the external datasets. 
     65 ** Work is under way to make this available in OWL/RDF, including for the internal datasets. 
     66 
     67Remember, don't get too hung up about this. Pick a URI in your dataset and change it when required. Kaizen: build something and then keep on improving it ;) 
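
As a rough illustration of collapsing cross-reference URLs with a regexp or owl:sameAs, here is a minimal Python sketch. It only covers the HGNC example given above; the `http://` scheme on the PURL, the rule table `PURL_RULES`, and the function names `to_purl` and `same_as_triple` are assumptions made for this example, not part of any UniProt tooling.

```python
import re

# Hypothetical rule table: (pattern for a source-database URL, PURL template).
# Only the HGNC rule is taken from the example above; the "http://" scheme
# on the PURL is an assumption.
PURL_RULES = [
    (re.compile(r"^http://www\.genenames\.org/data/hgnc_data\.php\?hgnc_id=(\d+)$"),
     "http://purl.uniprot.org/HGNC/{0}"),
]

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def to_purl(url):
    """Return the PURL for a known cross-reference URL, else the URL unchanged."""
    for pattern, template in PURL_RULES:
        match = pattern.match(url)
        if match:
            return template.format(match.group(1))
    return url


def same_as_triple(url):
    """Return an N-Triples owl:sameAs statement linking url to its PURL, or None."""
    purl = to_purl(url)
    if purl == url:
        return None  # no rule matched, so there is nothing to assert
    return "<{0}> <{1}> <{2}> .".format(url, OWL_SAME_AS, purl)
```

Rewriting with `to_purl` replaces the URL at merge time, whereas `same_as_triple` keeps both URIs and leaves the collapsing to the triple store; which one fits depends on whether the original URLs need to stay in the data.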
     68 
    5769== Web services == 
    5870 
     
    94106  * http://plindenbaum.blogspot.com/2010/02/linkedinxslt-foaf-people-from.html 
    95107  * http://plindenbaum.blogspot.com/2010/02/searching-for-genotypes-with-sparql.html 
    96   * http://plindenbaum.blogspot.com/2010/02/processing-large-xml-documents-with.html 
    97108 
    98109When the XML source is too large to fit in the memory of xsltproc, I (Pierre Lindenbaum) use a custom tool named '''xsltstream''' that runs a new XSLT transformation for every chunk of data. For example, say you want to convert the dbSNP XML files ( [ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/XML/], e.g. ds_ch1.xml.gz is 1099375 KB ) with dbsnp2rdf.xsl ( http://code.google.com/p/lindenb/source/browse/trunk/src/xsl/dbsnp2rdf.xsl ). Download '''xsltstream''' from http://code.google.com/p/lindenb/downloads/list
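
The chunk-by-chunk idea can also be sketched in Python with the standard library. This is not xsltstream and does not run XSLT: the `handle` callback stands in for whatever per-chunk transformation is wanted, and the function name `stream_records` and the `records`/`record` element names in the demo are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def stream_records(source, record_tag, handle):
    """Call handle(element) for each record_tag element, then free it.

    source is a file name or a binary file object. Returns the number of
    records handled. Assumes records are not nested inside each other.
    """
    count = 0
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            handle(elem)
            count += 1
            root.clear()  # drop already-processed children to bound memory use
    return count


if __name__ == "__main__":
    # Tiny in-memory document standing in for a large file.
    sample = io.BytesIO(b"<records><record id='1'/><record id='2'/></records>")
    stream_records(sample, "record", lambda e: print(e.get("id")))
```

For a compressed input, a file object from `gzip.open(path, "rb")` can be passed as `source`. Note that namespaced documents report tags in the form `{namespace}name`, so `record_tag` has to be given in that form.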