The CCGG (Collaborative Cross Graphical Genome) is a novel data structure designed to simplify and streamline the storage of genomic data. It can store multiple genomes at once and has many uses in comparative genomics. The current version is a prototype, with plans to generalize it for open-source use in the near future.
This is a simple Python class that exposes the Graphical Genome as a basic data structure. With a few simple commands you can load, store, explore, and analyze genomic data represented as a directed graph.
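To make the directed-graph idea concrete, here is a minimal sketch of how several genomes can share one graph. It is not the CCGG class or its API: `ToyGraphicalGenome`, `add_node`, `add_edge`, `walk`, and the strain labels are all hypothetical names chosen for illustration, and the toy graph is assumed to be acyclic.

```python
from collections import defaultdict


class ToyGraphicalGenome:
    """Toy directed graph of sequence segments (hypothetical, not the CCGG API)."""

    def __init__(self):
        self.nodes = {}                 # node name -> sequence string
        self.edges = defaultdict(list)  # source node -> list of (destination, strains)

    def add_node(self, name, seq):
        self.nodes[name] = seq

    def add_edge(self, src, dst, strains):
        # Each edge records which genomes (strains) pass through it.
        self.edges[src].append((dst, set(strains)))

    def walk(self, start, strain):
        """Concatenate the sequence along the path followed by one strain."""
        parts, current = [], start
        while current is not None:
            parts.append(self.nodes[current])
            # Follow the first outgoing edge labelled with this strain, if any.
            current = next(
                (dst for dst, strains in self.edges[current] if strain in strains),
                None,
            )
        return "".join(parts)


# Two made-up genomes share the first and last segments and differ in the middle.
graph = ToyGraphicalGenome()
graph.add_node("n1", "ACGT")
graph.add_node("n2a", "TT")
graph.add_node("n2b", "GA")
graph.add_node("n3", "CCA")
graph.add_edge("n1", "n2a", ["strainA"])
graph.add_edge("n1", "n2b", ["strainB"])
graph.add_edge("n2a", "n3", ["strainA"])
graph.add_edge("n2b", "n3", ["strainB"])

print(graph.walk("n1", "strainA"))  # ACGTTTCCA
print(graph.walk("n1", "strainB"))  # ACGTGACCA
```

Shared segments are stored once, and each genome is recovered by following the edges labelled with its strain. That is the general appeal of a graph representation for comparative genomics, though the actual CCGG layout may differ.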
Docstrings are included for all methods in the Graphical Genome. Running help(<methodname>) prints the documentation for the given method.
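As a small illustration of how docstrings surface through `help`, the sketch below uses a hypothetical stand-in class. `ToyGenome` and `get_sequence` are invented for this example and are not methods of the Graphical Genome; only `help` and `inspect.getdoc` are real Python facilities.

```python
import inspect


class ToyGenome:
    """Hypothetical stand-in used only to demonstrate docstring lookup."""

    def get_sequence(self, start, end):
        """Return the sequence between two positions (illustrative docstring)."""
        return ""


# Prints the signature and docstring, the same way it would for a real method.
help(ToyGenome.get_sequence)

# The docstring can also be read programmatically.
doc = inspect.getdoc(ToyGenome.get_sequence)
print(doc)
```

With the real class, the same pattern applies: pass the method object (for example, `ClassName.method_name`) to `help`.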
A basic command line interface lets you perform simple commands, such as specifying sequence information and a file location, then dumping that sequence from the Graphical Genome to that location for further examination.
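The project's actual command names and flags are not shown here, so the following is only a sketch of the general shape such a dump command could take, built on Python's standard `argparse`. The `--name` and `--out` flags, the `TOY_SEQUENCES` table, and the FASTA-style output are all assumptions made for illustration.

```python
import argparse
import tempfile
from pathlib import Path

# Toy in-memory stand-in; a real tool would read from the Graphical Genome instead.
TOY_SEQUENCES = {"segment1": "ACGTTTCCA", "segment2": "ACGTGACCA"}


def build_parser():
    parser = argparse.ArgumentParser(
        description="Dump a sequence to a file (illustrative sketch)."
    )
    parser.add_argument("--name", required=True, help="hypothetical sequence identifier")
    parser.add_argument("--out", required=True, type=Path, help="file to write the sequence to")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    if args.name not in TOY_SEQUENCES:
        raise SystemExit(f"unknown sequence: {args.name}")
    # FASTA-style output: a header line, then the sequence.
    args.out.write_text(f">{args.name}\n{TOY_SEQUENCES[args.name]}\n")
    return 0


# Call the entry point directly with an explicit argument list.
out_path = Path(tempfile.mkdtemp()) / "toy_dump.fa"
main(["--name", "segment1", "--out", str(out_path)])
print(out_path.read_text())
```

Passing the argument list to `main` explicitly keeps the sketch easy to run from a script or notebook. A real command line tool would usually call `main()` with no arguments so that `argparse` reads `sys.argv`.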
This dataset lets you try the tools described above and is meant to showcase the utility of the Graphical Genome.
Found on our project website.