Introduction
Humans are characterised by their curiosity, which has led them to great discoveries. Human nature is one of the great mysteries that humans have not yet solved. The field of biology has indeed made great discoveries, but as we go on, it seems that we have only scratched the surface of understanding our complexity. Let us look at how AI can be implemented in bioinformatics to help scientists get closer to answering the question, ‘Where do we come from?’
What are we made of?
You probably know that we are made of cells that contain DNA in their nucleus. However, have you ever wondered what DNA is made of? Simply put, DNA is the foundation of our existence. It exists in every living being, and it consists of 4 building blocks called nucleotides, namely: Adenine (A), Thymine (T), Guanine (G), and Cytosine (C). These building blocks are arranged in series to form DNA and are held together by chemical bonds. The order in which they are placed determines how our genes are expressed, and proper gene expression is what keeps our body functioning properly. But what if something goes wrong? What if there is an error in the code and we have a so-called ‘mutation’? Would that raise any risks, or does it not really matter? Mostly it doesn’t, thanks to the coping mechanisms of our fascinating body. If a mutation does affect how our body functions, though, how can we know in advance? Keep reading to find out more.
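To make the idea of an ‘error in the code’ concrete, here is a minimal Python sketch that treats DNA as a string of the four letters and lists the positions where a sample differs from a reference. The function name and both sequences are made up for illustration, and the sketch only handles single-letter substitutions in sequences of equal length; real mutations can also insert or delete letters.

```python
def find_point_mutations(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for every mismatch.

    Positions are counted from 0.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Made-up sequences, for illustration only.
reference = "ATGGTCA"
sample = "ATGCTCA"
print(find_point_mutations(reference, sample))  # [(3, 'G', 'C')]
```

An empty list means the two sequences are identical at every position.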
Get in Line
DNA sequencing is the process of determining the exact order of the nucleotides that we mentioned earlier. In a nutshell, to perform sequencing, we need to isolate DNA fragments and attach fluorescent labels to them. Then, using PCR, we amplify them and elongate the strands. Last but not least, we need to use a computer to measure the fluorescence of the labels to determine the sequence. You might think, ‘OK, it doesn’t sound that complicated!’ We will not be the ones to answer that, but would you change your mind if we told you that the human genome contains approximately 3 billion nucleotides that spell out the instructions for creating and maintaining a human being?
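The last step, turning fluorescence measurements into letters, can be sketched in a few lines of Python. This is a deliberately simplified toy: we assume one intensity reading per label at each position, and the order of the channels is our own assumption, not a property of any particular sequencing machine. Real base calling has to deal with noise and overlapping signals.

```python
# Assumed channel order: which reading belongs to which base is hypothetical.
CHANNELS = ("A", "C", "G", "T")


def call_bases(intensities: list[tuple[float, float, float, float]]) -> str:
    """For each position, pick the base whose label glows brightest."""
    sequence = []
    for reading in intensities:
        brightest = max(range(len(CHANNELS)), key=lambda channel: reading[channel])
        sequence.append(CHANNELS[brightest])
    return "".join(sequence)


# Made-up readings, one tuple of four intensities per position.
readings = [
    (0.9, 0.1, 0.0, 0.1),  # brightest channel: A
    (0.1, 0.0, 0.2, 0.8),  # brightest channel: T
    (0.0, 0.1, 0.9, 0.2),  # brightest channel: G
]
print(call_bases(readings))  # ATG
```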
You can now appreciate how complex such a process is, not to mention how much time it can take. This is where AI can help. AI-accelerated sequencing pipelines can sequence an entire human genome in about 5 hours (Fastest DNA sequencing technique helps undiagnosed patients find answers in mere hours - News Center - Stanford Medicine), while established tools such as BLAST and HMMER help search and compare the resulting sequences. This has reportedly cut the cost from $7,000 to a little less than $1,000, and all it takes is proper Edge Computing equipment. Yes, Edge Computing can be quite expensive because of the hardware needed to process large amounts of data locally; however, it is better to look at it as a long-term investment. Research centres and medical units specialising in DNA sequencing would surely benefit from such an investment to optimise local performance. Imagine how many cancer cases could be addressed with precision using this method!
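BLAST is, at its heart, a fast way of comparing sequences: it looks for stretches where two sequences line up well. The classic exact method that it approximates is the Smith-Waterman local alignment algorithm. The sketch below is a plain-Python version of Smith-Waterman scoring, not BLAST itself; the scoring values (2 for a match, -1 for a mismatch, -2 for a gap) are our own assumptions chosen for illustration.

```python
def local_alignment_score(
    a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2
) -> int:
    """Smith-Waterman: best score over all local alignments of a and b."""
    previous = [0] * (len(b) + 1)  # scores for the previous row of the table
    best = 0
    for char_a in a:
        current = [0]  # first column is always 0
        for j, char_b in enumerate(b, start=1):
            diagonal = previous[j - 1] + (match if char_a == char_b else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(score)
            best = max(best, score)
        previous = current
    return best


# Made-up sequences that share the stretch "ACG".
print(local_alignment_score("TTACGTT", "GGACGGG"))  # 6
```

The higher the score, the longer or cleaner the shared stretch; two sequences with nothing in common score 0.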
Taking things up a notch, why not incorporate Virtual Reality (VR) into the process? Reading a long string of G, A, T, and C is one thing, but visualising it with the help of technology would be a game changer! We might be able to identify the gene affected by a mutation in no time and give the appropriate treatment before the faulty gene is expressed and the damage is done. Not only would it be beneficial for laboratory tasks, but it would also be a huge addition to how we learn about and interact with biological systems.
Of course, such processes require a significant amount of machinery and laboratory equipment and, as you can imagine, plenty of personnel. Automation is the key to cutting costs and ensuring proper execution of all processes. As mentioned, Edge Computing can solve this problem with localised hardware that can handle demanding processes. If this is not an option, as it can be somewhat costly depending on the application, cloud computing is the ideal candidate for this task. Having all your machines communicating with each other will not only minimise errors and execution time but also maximise efficiency.
Shape-Shifting
Guess the Protein
What is DNA really used for? Why is it so important? In simple words, DNA carries the instructions for a process called protein synthesis. Proteins, which consist of amino acids, are what get things done in our body. They are in charge of everything: our height, hair colour, face shape, and how we think, and they even control our feelings and appetite! Although biology has made great strides, it is surprising that we have not discovered all the existing proteins. How many have we found? So far, scientists have discovered 90% of the proteins in our bodies; however, it is suggested that approximately 90% of the ways in which the proteins of the proteome (the full set of our proteins) interact with each other could still be hidden (Many of our proteins remain hidden in the dark proteome)! After all this time and effort, we only know a fraction of how proteins work. As if this weren’t enough, knowing the amino acid sequence alone is not enough to identify a protein. It may not have occurred to you, but proteins have a 3D structure, and that structure determines what the protein is and which task it performs. But we are no quitters! Technology is on our side!
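How does a string of A, T, G, and C become a chain of amino acids? The letters are read three at a time, and each triplet (a ‘codon’) stands for one amino acid or for a stop signal. The Python sketch below builds the standard genetic code table and translates a short, made-up DNA sequence. It is a simplification: in the cell, DNA is first copied into RNA, and here we read the coding strand directly from the first letter.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA into one-letter amino acid codes, stopping at a stop codon."""
    protein = []
    # Step through complete triplets only; leftover letters at the end are ignored.
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up sequence: ATG (M), GCC (A), TAA (stop).
print(translate("ATGGCCTAA"))  # MA
```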
The powerful tool named Computer Vision (CV) is our greatest ally in this battle. CV models are GPU-accelerated algorithms used for object detection and classification. They have a wide range of applications, including CCTV systems, motion tracking, autonomous vehicles, and face recognition. A properly trained CV model could not only identify the structure of a protein based on the amino acids that it consists of but also determine its orientation in space. This is almost as good as telling us exactly what we are looking at, if not pinpointing exactly what it does and how it interacts with other proteins. Moreover, by extracting the key features of a protein, we could even design our own novel proteins for a wide range of laboratory applications!
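A model cannot read letters, so before any training can happen the amino acid sequence has to be turned into numbers. One common way to do this, and it is our assumption here rather than a description of any specific product, is ‘one-hot’ encoding: every amino acid becomes a row of twenty 0s and 1s, and the whole protein becomes a grid that a model can process much like the pixels of an image.

```python
# One-letter codes of the 20 standard amino acids.
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot_encode(protein: str) -> list[list[int]]:
    """Turn a protein sequence into a len(protein) x 20 grid of 0s and 1s."""
    column_of = {amino_acid: i for i, amino_acid in enumerate(STANDARD_AMINO_ACIDS)}
    grid = []
    for amino_acid in protein:
        row = [0] * len(STANDARD_AMINO_ACIDS)
        row[column_of[amino_acid]] = 1  # mark the column for this amino acid
        grid.append(row)
    return grid


grid = one_hot_encode("MA")  # made-up two-letter sequence
print(len(grid), len(grid[0]))  # 2 20
```

Each row contains exactly one 1, so no amino acid is treated as ‘bigger’ or ‘smaller’ than another.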
Where did we come from?
We have all heard at least once the riddle, ‘Did the chicken come first or the egg?’ Science has answered this (the chicken that laid the egg came first), but that’s not the point. This expression has a deeper biological meaning than just a riddle for children. It pretty much describes the entire evolution of species on Earth in approximately 800 million years (give or take). Year after year, scientists have studied animals both in the lab and in their natural habitats, trying to find correlations between species according to physical characteristics and behaviours and drawing conclusions about their evolution. DNA sequencing can be applied not only to humans but also to animals, and it could completely change how we study them. It only makes sense, then, that evolutionary biology can benefit from AI. In addition to unlocking the secrets of different animals’ genes through AI-driven DNA sequencing, we could also implement CV. How? By training it to detect the different characteristics of animals. The great advantage of a CV model compared to human vision is that it does not get tired. A properly trained model could identify even the tiniest details that reveal where an animal comes from geographically, what its ancestors are, and what morphological changes it went through in the course of time, or even make hypotheses about its future evolution!
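One simple way to put a number on how closely two species are related is to line up the same stretch of DNA from each and count the fraction of positions that differ. The Python sketch below does exactly that and picks the closest pair. The sequences and species names are invented placeholders, and real studies use far longer sequences and more careful statistical models.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(base_a != base_b for base_a, base_b in zip(a, b)) / len(a)


# Invented sequences and placeholder names, for illustration only.
sequences = {
    "species_a": "ATGGTCAA",
    "species_b": "ATGGTCAT",
    "species_c": "ATCCTGAT",
}

closest = min(
    combinations(sequences, 2),
    key=lambda pair: p_distance(sequences[pair[0]], sequences[pair[1]]),
)
print(closest)  # ('species_a', 'species_b')
```

The smaller the distance, the fewer differences separate the two sequences, which hints at a more recent common ancestor.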
Say What?!
Perhaps we could find out even more about animals in another way. Now, this might sound like sci-fi, but what if we could communicate with them in their language? That’s not a joke! Did you know that chimps have a vocabulary? It is not as complex as ours, but it has approximately 400 words that are formed by combining different screams and gestures. This makes sense, if you think about it, as the complexity of a language is mainly based on what we want to express. In the wild, survival comes first, and therefore it is logical that a much simpler vocabulary is used, as it is just enough to get by. Now what would happen if we could study not only the gestures and sounds that animals make but also the movement of their vocal cords? Natural Language Processing (NLP) models are highly capable of human-machine interaction, right? This hints that something similar could be achieved between humans and the animal kingdom. Imagine how much we could learn from hearing the animals speak for themselves rather than just studying them by observation. Also, it would be wonderful if every time your cat meows or your dog barks, you could understand them, don’t you think?
Summing Up
AI can be priceless in the fields of bioinformatics and biology. It can help sequence an entire human genome in less than an 8-hour shift, analyse and even design proteins, and study the animal kingdom in far greater detail than the human eye can perceive. All it takes is enough data for it to be trained. Then, it can provide us with knowledge worth its weight in gold.
What We Offer
At TechnoLynx, we like to innovate. We specialise in delivering custom-tailored tech solutions for your needs. We understand the benefits of integrating AI into bioinformatics applications, ensuring safety in human-machine interactions, managing and analysing large data sets, and addressing ethical considerations.
We offer precise software solutions designed to empower AI-driven algorithms in a plethora of fields and industries. We are committed to innovating, which drives us to adapt to the ever-changing AI landscape. Our solutions are designed to increase efficiency, accuracy, and productivity. Feel free to contact us. We will be more than happy to answer any questions!
List of references
- B.Pharm, Y.S. (2017) DNA Sequencing, News-Medical (Accessed: 13 March 2024).
- Fastest DNA sequencing technique helps undiagnosed patients find answers in mere hours | News Center | Stanford Medicine (no date) (Accessed: 26 March 2024).
- First comprehensive tree of life shows how related you are to millions of species (no date) (Accessed: 13 March 2024).
- From Function to Form | Harvard Medical School (2019) (Accessed: 13 March 2024).
- Google’s DeepMind AI Predicts 3D Structure of Nearly Every Protein Known to Science (no date) CNET (Accessed: 13 March 2024).
- Many of our proteins remain hidden in the dark proteome (no date) Chemical & Engineering News (Accessed: 13 March 2024).