Version 3 (modified by matthiassamwald, 9 years ago)

--

The part of the MEDIE result XML that we should focus on

<set>
	<article>
		<PMID>19694806</PMID>
		
		...
		<sentence>
			Treatment with 5-aza-2'-deoxycytidine and/or the histone deacetylase inhibitor trichostatin A increased
			<entity_name gena_id="GMM040544" species="Mus musculus" db_site="EntrezGene:13138|MGI:101864|PIR:S59630|SWISS-PROT:Q62165|TrEMBL:Q8BPJ7" type="gene_prod" confidence="1" filter_confidence="0.284761" id="entity-10" gene_symbol="Dag1">Dag1</entity_name>
			mRNA expression levels in myoblasts, and methylation decreased promoter activity in vitro.
		</sentence>
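
A minimal sketch of how the entity annotations could be pulled out of one sentence element, using Python's standard xml.etree.ElementTree. The sample string is the sentence shown above, wrapped so that it parses on its own; the function name extract_entities and the choice of returned keys are illustrative assumptions, not part of MEDIE.

```python
import xml.etree.ElementTree as ET

# The sentence from the MEDIE result above, as a standalone well-formed fragment.
SENTENCE_XML = """<sentence>
Treatment with 5-aza-2'-deoxycytidine and/or the histone deacetylase inhibitor trichostatin A increased
<entity_name gena_id="GMM040544" species="Mus musculus" db_site="EntrezGene:13138|MGI:101864|PIR:S59630|SWISS-PROT:Q62165|TrEMBL:Q8BPJ7" type="gene_prod" confidence="1" filter_confidence="0.284761" id="entity-10" gene_symbol="Dag1">Dag1</entity_name>
mRNA expression levels in myoblasts, and methylation decreased promoter activity in vitro.
</sentence>"""


def extract_entities(sentence_xml):
    """Return one dict per <entity_name> element found in the sentence.

    Hypothetical helper: it keeps only the attributes these notes care about.
    """
    root = ET.fromstring(sentence_xml)
    entities = []
    for el in root.iter("entity_name"):
        entities.append({
            "text": "".join(el.itertext()),  # surface form, including nested entities
            "type": el.get("type"),
            "species": el.get("species"),    # None when the attribute is absent
            "db_site": el.get("db_site"),    # None when the attribute is absent
        })
    return entities


entities = extract_entities(SENTENCE_XML)
```

Using root.iter rather than a direct child lookup means nested entity_name elements are also visited.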

How to generate RDF-friendly URIs for recognized entities

<entity_name gena_id="GMM040544" species="Mus musculus" db_site="EntrezGene:13138|MGI:101864|PIR:S59630|SWISS-PROT:Q62165|TrEMBL:Q8BPJ7"

--->  http://purl.uniprot.org/uniprot/Q62165

Note: If both are listed, prefer the Swiss-Prot accession over the TrEMBL one.
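
The mapping above could be sketched roughly as follows. This assumes that db_site entries are separated by "|" and that the first ":" in each entry separates the database name from the accession, as in the example; the function name uniprot_uri is hypothetical.

```python
UNIPROT_PREFIX = "http://purl.uniprot.org/uniprot/"


def uniprot_uri(db_site):
    """Return a UniProt URI for a db_site string, or None if it lists no UniProt accession.

    Swiss-Prot is preferred over TrEMBL when both are present.
    """
    refs = {}
    for entry in db_site.split("|"):
        db, sep, accession = entry.partition(":")
        # Assumption: if a database appears more than once, keep its first accession.
        if sep and db not in refs:
            refs[db] = accession
    for db in ("SWISS-PROT", "TrEMBL"):
        if db in refs:
            return UNIPROT_PREFIX + refs[db]
    return None


uri = uniprot_uri("EntrezGene:13138|MGI:101864|PIR:S59630|SWISS-PROT:Q62165|TrEMBL:Q8BPJ7")
```

For the Dag1 example this gives http://purl.uniprot.org/uniprot/Q62165. An entity with only a TrEMBL entry would fall back to that accession.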

<entity_name id="disease6" facta_id="UMLS:C0878544" type="disease">cardiomyopathy</entity_name>

---> Unfortunately, we don't have well-accepted URIs for UMLS concepts yet. We could look each concept up and provide a URI pointing to the OBO Disease Ontology instead, but this is somewhat computationally expensive.

<entity_name id="compound18615" facta_id="CAS:79-43-6" type="compound">Dichloroacetate</entity_name>
<entity_name id="enzyme1375" facta_id="EC:3.5.3.22" type="enzyme">
<entity_name id="compound18620" facta_id="CAS:61-78-9" type="compound">PAH</entity_name>
</entity_name>

---> CAS and EC numbers would need to be mapped to URIs as well.

RDF format discussion

<http://togows.dbcls.jp/entry/ncbi-pubmed/22333111> <ns:cites> <uniprot>
<http://togows.dbcls.jp/entry/ncbi-pubmed/22333111> <ns:cites> <genbank>

<http://www.w3.org/2000/01/rdf-schema#label>

<http://purl.org/dc/elements/1.1/identifier> <http://pubmed.org/22333111>
<http://purl.org/dc/elements/1.1/title> "pmid:22333111"
<http://togows.dbcls.jp/entry/ncbi-pubmed/22333111>
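
One possible way to serialize these statements is as N-Triples lines, sketched below. The notes do not say which namespace the ns: prefix stands for, so the predicate URI http://example.org/ns#cites is a placeholder assumption, and the function name article_triples is hypothetical. The subject URI pattern and the dc:title literal follow the examples above.

```python
# Placeholder: the real namespace behind "ns:" is not given in these notes.
NS_CITES = "http://example.org/ns#cites"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
TOGOWS_PUBMED = "http://togows.dbcls.jp/entry/ncbi-pubmed/"


def article_triples(pmid, entity_uris):
    """Return N-Triples lines describing one article and the entities it mentions."""
    subject = "<%s%s>" % (TOGOWS_PUBMED, pmid)
    # Title literal in the "pmid:<id>" form used in the discussion above.
    lines = ['%s <%s> "pmid:%s" .' % (subject, DC_TITLE, pmid)]
    for uri in entity_uris:
        lines.append("%s <%s> <%s> ." % (subject, NS_CITES, uri))
    return lines


# The article and the Dag1 entity from the MEDIE example at the top of this page.
lines = article_triples("19694806", ["http://purl.uniprot.org/uniprot/Q62165"])
print("\n".join(lines))
```

Plain string formatting is enough here because PMIDs and the entity URIs contain no characters that need escaping in N-Triples; arbitrary literals would need proper escaping or an RDF library.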