What is Computational Biology?
Computational biology is a new discipline whose domain is the quantitative analysis of biological data, the elucidation of biological principles, and the engineering of biological systems. Its central themes have been shaped by radical breakthroughs in biotechnology, which have enabled high-throughput gathering of data about DNA, RNA, and proteins in cells and systems. Its tools have been assembled from ideas in established scientific disciplines, ranging from mathematics and statistics to computer science, physics, and chemistry. However, research in computational biology has consisted of more than just the application of existing ideas in new settings. As an emerging discipline in its own right, computational biology has paid numerous dividends in other fields. Within mathematics, for example, the algebraic view of the discrete statistical models used in biological sequence analysis has had a direct impact on the development of algebraic statistics.
Example: Hidden Markov models (HMMs) were first developed for speech recognition applications. However, their utility in biological sequence analysis, most notably in gene prediction, has resulted in major extensions and developments in theory as well as in practice. In the gene finding application, HMMs are used to model the exon/intron structure of genes, and DNA sequences are considered to be the "observed output" from the model. The genetic code and the exon structure of known genes impose non-trivial constraints. Modeling these constraints effectively requires extending standard HMMs to allow for generalized states (such models are also called hidden semi-Markov models). HMMs have been applied to numerous other biological sequence analysis problems, ranging from alignment to protein classification, and their algebraic interpretation provides a unifying viewpoint for describing the diverse models and for addressing novel algorithmic questions.
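To make the idea of DNA as the "observed output" of a hidden exon/intron model concrete, here is a minimal sketch of Viterbi decoding on a two-state HMM. Everything in it is an assumption made for illustration: the names STATES, START, TRANS, EMIT, and viterbi are hypothetical, and the probabilities are invented toy values (exons GC-rich, introns AT-rich), not parameters estimated from real genes. Practical gene finders use generalized HMMs with many more states, so this should be read as a toy instance of the standard HMM, not as an actual gene finding method.

```python
import math

# Hidden states of the toy model (an assumption for illustration).
STATES = ("exon", "intron")

# Invented toy parameters, not estimated from any real data.
START = {"exon": 0.5, "intron": 0.5}
TRANS = {
    "exon": {"exon": 0.9, "intron": 0.1},
    "intron": {"exon": 0.1, "intron": 0.9},
}
EMIT = {
    "exon": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "intron": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def viterbi(seq):
    """Return the most probable hidden-state path for a DNA string."""
    if not seq:
        return []

    # score[s] is the best log-probability of any path ending in state s.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []

    for symbol in seq[1:]:
        new_score = {}
        pointers = {}
        for s in STATES:
            # Choose the predecessor state that maximizes the path score.
            best_prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointers[s] = best_prev
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][symbol])
            )
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    dna = "GCGCGCATATAT"
    for base, state in zip(dna, viterbi(dna)):
        print(base, state)
```

Working in log space avoids numerical underflow on long sequences. In a standard HMM such as this one, the time spent in a state is implicitly geometrically distributed; generalized states relax that restriction by letting a state emit a whole segment with an explicit length distribution.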
