Preliminary framework for biology-related packages
DDG: These are just some starters; I will add snittypes for GeneOntology and LocusLink data as well. My suggestion is to use the following libraries:
- oomk as an RDBMS layer for Mk4tcl, which is built into tclkit. oomk provides joins, intersections, unions and so on for easy access to the data. A database system is necessary to get data out of large files: you scan the file once, record the position of each record via tell, store those positions in the database, and later jump straight to a record via seek.
- snit as an object-oriented framework, because it is pure Tcl, it feels Tcl-like, and it is required by oomk.
- md5pure: in order to web-enable the packages it might be necessary to convert some identifiers to URL-safe strings. MD5 hex strings fulfill this criterion.
If we stick to those packages, everything can easily be bundled into starkits and run on a variety of platforms [L1 ].
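The tell/seek idea from the oomk item can be sketched in plain Tcl. This is only a minimal sketch under some assumptions: the proc names fastaIndex and fastaFetch are made up for illustration, the first word of a FASTA header line is taken as the identifier, and a Tcl dict stands in for the oomk/Mk4tcl table that would hold the offsets in a real package.

```tcl
# Scan a FASTA file once and remember where each record starts.
# Returns a dict: identifier -> byte offset of the ">" header line.
# (hypothetical helper; a real package would store this in Mk4tcl)
proc fastaIndex {path} {
    set index [dict create]
    set fh [open $path r]
    # binary translation keeps tell/seek offsets exact byte positions
    fconfigure $fh -translation binary
    while {1} {
        set offset [tell $fh]
        if {[gets $fh line] < 0} break
        if {[string index $line 0] eq ">"} {
            set header [string trim [string range $line 1 end]]
            dict set index [lindex [split $header] 0] $offset
        }
    }
    close $fh
    return $index
}

# Jump to a stored offset and read one record.
# Returns a two-element list: header text and concatenated sequence.
proc fastaFetch {path offset} {
    set fh [open $path r]
    fconfigure $fh -translation binary
    seek $fh $offset start
    gets $fh header
    set seq ""
    while {[gets $fh line] >= 0} {
        if {[string index $line 0] eq ">"} break
        append seq [string trim $line]
    }
    close $fh
    return [list [string trim [string range $header 1 end]] $seq]
}
```

Typical use would be something like set idx [fastaIndex genome.fa] followed by fastaFetch genome.fa [dict get $idx someId], where genome.fa and someId are placeholders. Only the index has to be kept around, so the sequence data itself never needs to be loaded as a whole.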
Now let's start. Candidates for inclusion in a biotcl package:
- FastaFile snittype for parsing FASTA files
- BlastFile snittype for parsing BLAST output files
- LocusLinkFile snittype for getting information for LocusLink IDs
- GeneOntol snittype for querying Gene Ontology [L2 ]
- Progress snittype for a console progress bar, required to get feedback while parsing gigabytes of biological data
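To give an idea of what the Progress candidate has to do, here is a rough sketch as plain procs rather than a snittype. The names progressBar and showProgress, the bar layout and the default width of 40 are assumptions for illustration only, not the interface of an existing package.

```tcl
# Format a text progress bar for "done of total" units of work.
# Returns a string such as "[#####-----]  50%" (for width 10).
# (hypothetical helper; layout and width are arbitrary choices)
proc progressBar {done total {width 40}} {
    # treat an unknown or empty total as "nothing done yet"
    set frac [expr {$total > 0 ? double($done) / $total : 0.0}]
    # clamp to the range 0..1 so the bar never over- or underflows
    if {$frac > 1.0} {set frac 1.0}
    if {$frac < 0.0} {set frac 0.0}
    set filled [expr {int($frac * $width)}]
    set bar [string repeat "#" $filled][string repeat "-" [expr {$width - $filled}]]
    return [format {[%s] %3d%%} $bar [expr {int($frac * 100)}]]
}

# Redraw the bar in place on the console; the carriage return
# moves the cursor back to the start of the line.
proc showProgress {done total} {
    puts -nonewline stderr "\r[progressBar $done $total]"
    flush stderr
}
```

While parsing a large file, one plausible choice is to call showProgress [tell $fh] [file size $path] every few thousand lines, so the bar reflects how far the parser has read into the file.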
Other relevant software for biologists:
- GRS graphic tool for genome segment visualization
- MASIA bio sequence pattern searching
- tkDCSE dedicated comparative sequence editor
- bioTk widgets for computational biology and genomes
- Biowish sequence editing, translations, etc.