Version 17 (modified by linyu, 15 years ago)
GFF3 is a tab-delimited, ontology-aware format for describing genomic features. Representing these features in RDF would let us exchange data and query it with Semantic Web (SW) technologies.
Participants
- Erick Antezana
- Alberto Labarga
- Yu Lin
- Hideya KAWAJI
- Venkata Satagopam
- Jerven Bolleman
- ...
Scope
Mapping GFF2RDF (proposal)
The following table is still in a very immature state...
GFF Element | RDF (XML) | Description |
Column 1: "seqid" | <gff:seqid rdf:resource="#ctg123"/> | ? |
Column 2: "source" | <gff:source>.</gff:source> | ? |
Column 3: "type" | <gff:type rdf:resource="#SO:0000704"/> | ? |
Column 4: "start" | <gff:start>1000</gff:start> | ? |
Column 5: "end" | <gff:stop>9000</gff:stop> | ? |
Column 6: "score" | <gff:score>5.8e-42</gff:score> | ? |
Column 7: "strand" | <gff:strand>+</gff:strand> | ? |
Column 8: "phase" | <gff:phase>.</gff:phase> | ? |
Column 9: "attributes" | <gff:attributes><rdf:Description>...</rdf:Description></gff:attributes> | ? |
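To make the column mapping above more concrete, here is a minimal Python sketch that turns columns 1-8 of one GFF3 line into N-Triples rather than RDF/XML. Several things in it are assumptions and not part of the proposal: the namespace URIs `GFF_NS` and `DATA_NS` are placeholders, the function names are hypothetical, "." (undefined) values are skipped instead of being emitted, and "type" is kept as a plain literal for simplicity instead of being linked to an ontology term.

```python
from urllib.parse import quote

# Hypothetical namespaces: the proposal does not fix a URI for the gff: prefix.
GFF_NS = "http://example.org/gff#"
DATA_NS = "http://example.org/data#"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Predicate names follow the table above (note "stop" for the "end" column).
COLUMN_PREDICATES = ["seqid", "source", "type", "start", "stop", "score", "strand", "phase"]


def literal(value):
    """Format a plain N-Triples string literal, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def gff_line_to_triples(line, feature_id):
    """Map columns 1-8 of one GFF3 feature line to (subject, predicate, object) terms."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    subject = f"<{DATA_NS}{quote(feature_id, safe='')}>"
    triples = []
    # zip stops after the eight predicate names, so column 9 is ignored here.
    for name, value in zip(COLUMN_PREDICATES, fields):
        if value == ".":
            continue  # assumption: undefined values produce no triple
        if name == "seqid":
            obj = f"<{DATA_NS}{quote(value, safe='')}>"  # landmark as a resource
        elif name in ("start", "stop"):
            obj = f'"{int(value)}"^^<{XSD_INTEGER}>'
        else:
            obj = literal(value)
        triples.append((subject, f"<{GFF_NS}{name}>", obj))
    return triples


def to_ntriples(triples):
    """Serialize the term tuples as N-Triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


example = "ctg123\t.\tgene\t1000\t9000\t.\t+\t.\tID=gene00001;Name=EDEN"
print(to_ntriples(gff_line_to_triples(example, "gene00001")))
```

Emitting triples directly keeps the sketch free of third-party libraries; a real converter would more likely build the graph with an RDF toolkit and choose the serialization afterwards.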
Attribute Mapping (proposal) NEEDED?
Column 9 holds a list of feature attributes in the format tag=value. Multiple tag=value pairs are separated by semicolons. URL escaping rules are used for tags or values containing the following characters: ",=;". Spaces are allowed in this field, but tabs must be replaced with the %09 URL escape.
Attribute tags | RDF (XML) | Description |
"ID" | | |
"Name" | | |
"Alias" | | |
"Parent" | | |
"Target" | | |
"Gap" | | |
"Derives_from" | | |
"Note" | | |
"Dbxref" | | |
"Ontology_term" | | |
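Before the attribute tags can be mapped to RDF, column 9 has to be split and unescaped. The following Python sketch shows one way to do that under the rules described above. The function name `parse_attributes` is hypothetical, and the sketch additionally assumes the GFF3 convention that multiple values of one tag are separated by commas, which is why splitting happens before URL-unescaping.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Parse a GFF3 attributes column into a dict mapping tag -> list of values."""
    attributes = {}
    if column9 in ("", "."):
        return attributes  # "." means the column is undefined
    for pair in column9.split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        tag, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"attribute without '=': {pair!r}")
        # Split on "," before unescaping so an escaped comma (%2C) stays inside a value.
        values = [unquote(v) for v in value.split(",")]
        attributes.setdefault(unquote(tag), []).extend(values)
    return attributes


print(parse_attributes("ID=gene00001;Name=EDEN"))
print(parse_attributes("Parent=mRNA00001,mRNA00002;Note=tab%09separated"))
```

Each tag in the resulting dict could then become one RDF property, with one statement per value.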
Tools
- GFF to OWL (Source code not available yet?)
- Chris Mungall's code
Discussion
- Application to other genomic feature formats (BED, etc.)?
- Genomic coordinate system (0-based / 1-based)
- Dasty?
- Attributes are more important for describing an object.
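The coordinate-system question above matters as soon as GFF and BED features share one RDF model: GFF3 uses 1-based, inclusive start and end positions, while BED uses a 0-based start with an exclusive end. A minimal Python sketch of the conversion follows. The function names are hypothetical, and zero-length features are deliberately rejected because the two formats represent them differently.

```python
def gff_to_bed(start, end):
    """Convert 1-based inclusive (GFF3) coordinates to 0-based half-open (BED)."""
    if start < 1 or end < start:
        raise ValueError(f"invalid GFF3 interval: {start}-{end}")
    return start - 1, end


def bed_to_gff(chrom_start, chrom_end):
    """Convert 0-based half-open (BED) coordinates to 1-based inclusive (GFF3)."""
    if chrom_start < 0 or chrom_end <= chrom_start:
        raise ValueError(f"invalid or zero-length BED interval: {chrom_start}-{chrom_end}")
    return chrom_start + 1, chrom_end


# The feature spanning 1000..9000 in GFF3 covers 8001 bases in either system.
bed_start, bed_end = gff_to_bed(1000, 9000)
print(bed_start, bed_end, bed_end - bed_start)
```

Whichever convention the RDF vocabulary adopts, stating it explicitly in the schema would avoid off-by-one errors when data from both formats are merged.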
Milestones