Version 7 (modified by yy, 9 years ago)

--

The part of the MEDIE result XML that we should focus on

<set>
	<article>
		<PMID>19694806</PMID>
		
		... <sentence>
			Treatment with 5-aza-2'-deoxycytidine and/or the histone deacetylase inhibitor trichostatin A increased 
			<entity_name gena_id="GMM040544" species="Mus musculus" db_site="EntrezGene:13138|MGI:101864|PIR:S59630|SWISS-PROT:Q62165|TrEMBL:Q8BPJ7" type="gene_prod" confidence="1" filter_confidence="0.284761" id="entity-10" gene_symbol="Dag1">Dag1</entity_name>
			mRNA expression levels in myoblasts, and methylation decreased promoter activity in vitro.
		</sentence>
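
As a rough illustration of how the entity annotations could be pulled out of one sentence, here is a minimal sketch using Python's standard xml.etree.ElementTree. The function name extract_entities and the shortened sample sentence are assumptions made for this example; only the element and attribute names come from the XML above.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sentence above; the text is abbreviated for the example.
SENTENCE_XML = """<sentence>Treatment with trichostatin A increased
<entity_name gena_id="GMM040544" species="Mus musculus" db_site="EntrezGene:13138|MGI:101864|PIR:S59630|SWISS-PROT:Q62165|TrEMBL:Q8BPJ7" type="gene_prod" confidence="1" filter_confidence="0.284761" id="entity-10" gene_symbol="Dag1">Dag1</entity_name>
mRNA expression levels in myoblasts.</sentence>"""


def extract_entities(sentence_xml):
    """Return one dict per <entity_name> element (hypothetical helper).

    Nested <entity_name> elements are returned as separate entries.
    """
    root = ET.fromstring(sentence_xml)
    entities = []
    for element in root.iter("entity_name"):
        entities.append({
            "id": element.get("id"),
            "type": element.get("type"),
            # itertext() also collects the text of nested entities.
            "text": "".join(element.itertext()).strip(),
            "attributes": dict(element.attrib),
        })
    return entities


if __name__ == "__main__":
    for entity in extract_entities(SENTENCE_XML):
        print(entity["id"], entity["type"], entity["text"])
```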

How to generate RDF-friendly URIs for recognized entities

<entity_name gena_id="GMM040544" species="Mus musculus" db_site="EntrezGene:13138|MGI:101864|PIR:S59630|SWISS-PROT:Q62165|TrEMBL:Q8BPJ7"

--->  http://purl.uniprot.org/uniprot/Q62165

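Note: If a db_site value lists both, prefer the Swiss-Prot accession over the TrEMBL one (Swiss-Prot entries are manually reviewed, TrEMBL entries are not).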
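
The mapping from a db_site attribute to a UniProt URI could look roughly like the sketch below. It assumes db_site is always a "|"-separated list of DB:ID pairs, as in the example above; the function names parse_db_site and uniprot_uri are made up for this illustration.

```python
UNIPROT_PREFIX = "http://purl.uniprot.org/uniprot/"


def parse_db_site(db_site):
    """Split 'DB:ID|DB:ID|...' into an ordered list of (db, id) pairs."""
    pairs = []
    for item in db_site.split("|"):
        # Split on the first colon only, so IDs containing ':' stay intact.
        db, sep, accession = item.partition(":")
        if sep and accession:
            pairs.append((db.strip(), accession.strip()))
    return pairs


def uniprot_uri(db_site):
    """Return a UniProt URI, preferring Swiss-Prot over TrEMBL, or None."""
    pairs = parse_db_site(db_site)
    for wanted in ("SWISS-PROT", "TrEMBL"):
        for db, accession in pairs:
            if db == wanted:
                return UNIPROT_PREFIX + accession
    return None  # no UniProt accession listed for this entity


if __name__ == "__main__":
    print(uniprot_uri(
        "EntrezGene:13138|MGI:101864|PIR:S59630|SWISS-PROT:Q62165|TrEMBL:Q8BPJ7"
    ))
```

Entities without any UniProt accession yield None, so the caller can decide whether to fall back to another identifier in db_site.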

<entity_name id="disease6" facta_id="UMLS:C0878544" type="disease">cardiomyopathy</entity_name>

---> Unfortunately, there are no widely accepted URIs for UMLS yet. We could look up each concept and provide URIs pointing to the OBO disease ontology instead, but that lookup is somewhat computationally expensive.

<entity_name id="compound18615" facta_id="CAS:79-43-6" type="compound">Dichloroacetate</entity_name>
<entity_name id="enzyme1375" facta_id="EC:3.5.3.22" type="enzyme">
<entity_name id="compound18620" facta_id="CAS:61-78-9" type="compound">PAH</entity_name>
</entity_name>

---> CAS and EC numbers would need to be mapped to URIs as well.

Although TogoDB is not a primary database provider, it can supply URIs for UMLS concepts (only for the subset that may be redistributed), for example:

http://togows.dbcls.jp/entry/nlm-UMLS/C08384423
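
A minimal sketch of how facta_id values might be turned into URIs under this scheme. It assumes the URL pattern shown above applies to any UMLS concept ID, which only holds for concepts in the redistributable subset; the function name facta_uri is hypothetical. CAS and EC identifiers return None because their mapping is still undecided.

```python
TOGOWS_UMLS_PREFIX = "http://togows.dbcls.jp/entry/nlm-UMLS/"


def facta_uri(facta_id):
    """Map a facta_id such as 'UMLS:C0878544' to a URI, or None if unmapped."""
    scheme, sep, local_id = facta_id.partition(":")
    if not sep or not local_id:
        return None  # not in the expected 'SCHEME:ID' form
    if scheme == "UMLS":
        # Assumption: the concept is part of the redistributable subset.
        return TOGOWS_UMLS_PREFIX + local_id
    # CAS and EC numbers have no agreed target URI in these notes yet.
    return None


if __name__ == "__main__":
    for value in ("UMLS:C0878544", "CAS:79-43-6", "EC:3.5.3.22"):
        print(value, "->", facta_uri(value))
```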

RDF format discussion

<http://togows.dbcls.jp/entry/ncbi-pubmed/22333111> <ns:cites> <uniprot>
<http://togows.dbcls.jp/entry/ncbi-pubmed/22333111> <ns:cites> <genbank>
<http://togows.dbcls.jp/entry/ncbi-pubmed/22333111> <ns:contains> _:hasG1

_:hasG1 <http://www.w3.org/2000/01/rdf-schema#label> "p53"
_:hasG1 <rdf:type> <something:protein>
_:hasG1 <something:taxonomy> <taxid:9399>
...

<http://togows.dbcls.jp/entry/ncbi-pubmed/22333111> <*:hasSentences> _:hasS1
<http://togows.dbcls.jp/entry/ncbi-pubmed/22333111> <*:hasSentences> _:hasS2
<http://togows.dbcls.jp/entry/ncbi-pubmed/22333111> <*:hasSentences> _:hasS3
...

_:hasS1 <*:hasSubject> _:sbj1
_:hasS1 <*:hasVerb> _:verb1
_:hasS1 <*:hasObject> _:obj1
_:sbj1 <http://www.w3.org/2000/01/rdf-schema#label> "peripheral lymphocytes"
_:verb1 <http://www.w3.org/2000/01/rdf-schema#label> "secreting"
_:obj1 <http://www.w3.org/2000/01/rdf-schema#label> "a specific cytokine"
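
The sentence pattern above could be emitted as N-Triples lines roughly as follows. The namespace http://example.org/medie# stands in for the undecided "*:" prefix and is purely a placeholder, as is the function name sentence_triples; the predicate local names and blank node labels follow the draft above.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
# Placeholder for the undecided '*:' namespace in the draft above.
NS = "http://example.org/medie#"


def escape_literal(text):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def sentence_triples(article_uri, index, subject, verb, obj):
    """Return N-Triples lines for one subject/verb/object sentence."""
    sentence = f"_:hasS{index}"
    lines = [f"<{article_uri}> <{NS}hasSentences> {sentence} ."]
    parts = (("sbj", "hasSubject", subject),
             ("verb", "hasVerb", verb),
             ("obj", "hasObject", obj))
    for role, predicate, text in parts:
        node = f"_:{role}{index}"
        lines.append(f"{sentence} <{NS}{predicate}> {node} .")
        lines.append(f'{node} <{RDFS_LABEL}> "{escape_literal(text)}" .')
    return lines


if __name__ == "__main__":
    article = "http://togows.dbcls.jp/entry/ncbi-pubmed/22333111"
    for line in sentence_triples(article, 1, "peripheral lymphocytes",
                                 "secreting", "a specific cytokine"):
        print(line)
```

Plain string formatting is used here only to keep the sketch free of dependencies; an RDF library would handle escaping and serialization more robustly.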



<http://www.w3.org/2000/01/rdf-schema#label>

<http://purl.org/dc/elements/1.1/identifier> <http://pubmed.org/22333111>
<http://purl.org/dc/elements/1.1/title> "pmid:22333111"
<http://togows.dbcls.jp/entry/ncbi-pubmed/22333111>