Biological sequences are the fundamental data type through which scientists interpret biology. Despite the exponential increase in the amount of sequence data, we are limited in our ability to predict functions for the vast majority of sequences.
For instance, fewer than 1% of sequenced genes have laboratory-validated functions, and fewer than half can be associated with even a hypothesized function. We need a more intelligent and efficient solution to propagate functional information across biological sequences.
Tatta Bio is building a new data infrastructure and a search engine to map sequences to function. We first target protein functions, focusing on highly diverse sequences.
We are building new features on top of Gaia Search. Sign up here if you would like early access!