Changes between Version 2 and Version 3 of TextMiningDayFour

Timestamp:
2010/02/12 11:31:18 (14 years ago)
Author:
matthiassamwald
Comment:

--

  • TextMiningDayFour

    v2 v3  
     1== The part of the MEDIE result XML that we should focus on == 
     2 
     3{{{ 
     4<set> 
     5        <article> 
     6                <PMID>19694806</PMID> 
     7                 
     8                ... <sentence> 
     9                        Treatment with 5-aza-2'-deoxycytidine and/or the histone deacetylase inhibitor trichostatin A increased  
     10                        <entity_name gena_id="GMM040544" species="Mus musculus" db_site="EntrezGene:13138|MGI:101864|PIR:S59630|SWISS-PROT:Q62165|TrEMBL:Q8BPJ7" type="gene_prod" confidence="1" filter_confidence="0.284761" id="entity-10" gene_symbol="Dag1">Dag1</entity_name> 
     11                        mRNA expression levels in myoblasts, and methylation decreased promoter activity in vitro. 
     12                     </sentence> 
     13}}}                   
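A minimal sketch of how the entities could be pulled out of such a result, assuming the full response is well-formed XML. The sample string below is a trimmed, closed-off version of the excerpt above (attributes shortened), and `extract_entities` is a hypothetical helper name, not part of MEDIE.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed sample modelled on the excerpt above (an assumption:
# the real response is larger and carries more attributes per entity).
SAMPLE = """<set>
  <article>
    <PMID>19694806</PMID>
    <sentence>Treatment increased
      <entity_name gena_id="GMM040544" species="Mus musculus"
                   db_site="EntrezGene:13138|SWISS-PROT:Q62165|TrEMBL:Q8BPJ7"
                   type="gene_prod" id="entity-10" gene_symbol="Dag1">Dag1</entity_name>
      mRNA expression levels in myoblasts.
    </sentence>
  </article>
</set>"""


def extract_entities(xml_text):
    """Return one dict per <entity_name>, tagged with the enclosing article's PMID."""
    root = ET.fromstring(xml_text)
    entities = []
    for article in root.iter("article"):
        pmid = article.findtext("PMID")
        # iter() also reaches entity_name elements nested inside other entities.
        for ent in article.iter("entity_name"):
            record = dict(ent.attrib)
            record["pmid"] = pmid
            record["text"] = (ent.text or "").strip()
            entities.append(record)
    return entities


for entity in extract_entities(SAMPLE):
    print(entity["pmid"], entity["type"], entity["text"])
```

Each record keeps the original attributes (`db_site`, `species`, and so on), so later steps can decide which identifier to turn into a URI.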
     14                      
     15                 
     16== How to generate RDF-friendly URIs for recognized entities == 
     17 
     18 
     19{{{ <entity_name gena_id="GMM040544" species="Mus musculus" db_site="EntrezGene:13138|MGI:101864|PIR:S59630|SWISS-PROT:Q62165|TrEMBL:Q8BPJ7" }}} 
     20 
     21--->  http://purl.uniprot.org/uniprot/Q62165 
     22 
     23Note: If given the choice, prefer Swiss-Prot over TrEMBL.
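The mapping above could be sketched as follows. This assumes `db_site` is always a `|`-separated list of `DB:accession` pairs, as in the example; `uniprot_uri` is a hypothetical helper name.

```python
UNIPROT_BASE = "http://purl.uniprot.org/uniprot/"


def uniprot_uri(db_site):
    """Build a UniProt URI from a db_site attribute, preferring Swiss-Prot over TrEMBL.

    Returns None when neither a SWISS-PROT nor a TrEMBL accession is present.
    """
    refs = {}
    for item in db_site.split("|"):
        db, sep, accession = item.partition(":")
        # Keep only the first accession per database; skip malformed items.
        if sep and db not in refs:
            refs[db] = accession
    for db in ("SWISS-PROT", "TrEMBL"):
        if db in refs:
            return UNIPROT_BASE + refs[db]
    return None


print(uniprot_uri(
    "EntrezGene:13138|MGI:101864|PIR:S59630|SWISS-PROT:Q62165|TrEMBL:Q8BPJ7"
))
```

Entities without any UniProt accession yield `None`, so the caller can decide whether to skip them or fall back to another identifier.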
     24 
     25{{{ 
     26<entity_name id="disease6" facta_id="UMLS:C0878544" type="disease">cardiomyopathy</entity_name> 
     27}}} 
     28 
     29---> Unfortunately, there are no widely accepted URIs for UMLS yet. We could do a lookup and provide URIs pointing to the OBO Disease Ontology, but this is somewhat computationally expensive.
     30 
     31{{{ 
     32<entity_name id="compound18615" facta_id="CAS:79-43-6" type="compound">Dichloroacetate</entity_name> 
     33<entity_name id="enzyme1375" facta_id="EC:3.5.3.22" type="enzyme"> 
     34<entity_name id="compound18620" facta_id="CAS:61-78-9" type="compound">PAH</entity_name> 
     35</entity_name> 
     36}}} 
     37 
     38---> CAS and EC numbers would need to be mapped as well.
     39 
    140 
    241== RDF format discussion == 
     
    544<http://togows.dbcls.jp/entry/ncbi-pubmed/22333111> <ns:cites> <uniprot> 
    645<http://togows.dbcls.jp/entry/ncbi-pubmed/22333111> <ns:cites> <genbank> 
    7 <http://togows.dbcls.jp/entry/ncbi-pubmed/22333111> <ns:contains> _:hasG1 
    8  
    9 _:hasG1 <http://www.w3.org/2000/01/rdf-schema#label> "p53" 
    10 _:hasG1 <rdf:type> <something:protein> 
    11 _:hasG1 <something:taxonomy> <taxid:9399> 
    1246 
    1347<http://www.w3.org/2000/01/rdf-schema#label>
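As a rough illustration of the `cites` pattern discussed above, the following sketch emits one N-Triples line linking a PubMed entry to a UniProt entry. The `ns:` prefix is not resolved anywhere on this page, so the predicate URI used here is a placeholder assumption, and `cites_triple` is a hypothetical helper name.

```python
PUBMED_BASE = "http://togows.dbcls.jp/entry/ncbi-pubmed/"
UNIPROT_BASE = "http://purl.uniprot.org/uniprot/"
# Placeholder only: the real namespace behind "ns:" has not been decided.
CITES = "http://example.org/ns#cites"


def cites_triple(pmid, accession):
    """Return one N-Triples line stating that the article mentions the protein."""
    return "<%s%s> <%s> <%s%s> ." % (
        PUBMED_BASE, pmid, CITES, UNIPROT_BASE, accession
    )


print(cites_triple("19694806", "Q62165"))
```

Using full URIs in every position keeps the output valid N-Triples without needing any prefix declarations.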