Image credit: doodlia/shutterstock.com
In 2011, Marc Andreessen wrote a Wall Street Journal (WSJ) article and coined the famous phrase: “Software is eating the world.” He argued that several critical technologies had matured enough to deliver software solutions at a global scale and transform industries:
“We are in the middle of a dramatic and broad technological and economic shift in which software companies are poised to take over large swathes of the economy.”
A decade later, it is clear that Andreessen put his money in the right place.
In 2019, three partners at Andreessen Horowitz had a similar idea. But not about software. They published an article and, borrowing Andreessen’s wordplay, coined a new phrase: “Biology is eating the world.” Thanks to scientific breakthroughs in recent decades, we have accumulated enough knowledge and developed enough technology to disrupt industry after industry, including healthcare, food, agriculture, textiles, and manufacturing.
In their words, bio today is where information technology was 50 years ago: on the precipice of touching all of our lives. Just like software — and because of it — biology will one day become part of every industry.
So what exactly happened? What scientific and technological breakthroughs fuel this optimism? Let me give you a layman’s summary of the main events that got us here.
Early Beginnings: From Fermentation to Biotech
Biology lab — Westlea, 1950s
Biotechnology (or biotech) is any technology that uses living organisms to develop or make products. The first recorded use of biotechnology dates back some 13,000 years, when Natufians fermented grains with wild yeast to make beer. But today, biotech is tightly associated with genetic engineering (methods and processes that alter an organism’s genes).
For most of that history, our methods of genetic modification were indirect and primitive, selective breeding being the prime example. We did not learn to manipulate genes directly until the 1970s. And to understand what made this possible, we must travel back through many Nobel Prizes.
Discovering Nucleic Acid from Discarded Bandages
DNA was first discovered in 1869 by the Swiss physician Friedrich Miescher. As a scholar at Felix Hoppe-Seyler’s laboratory, Miescher studied the composition of white blood cells in the pus from discarded bandages. He isolated a new molecule from cell nuclei with unique chemical properties — much higher phosphorus content and resistance to protein digestion. He called it nuclein.
Miescher knew he had discovered something important, but his colleagues were skeptical, and his findings were not published until 1871. It would take decades for scientists to clarify the structure of nuclein (later renamed nucleic acid) and uncover the spatial arrangement of its components. Nothing was known about nuclein’s function until the mid-twentieth century, either. For a long time, scientists believed proteins carried hereditary information and DNA was just a backbone that provided structure and support for chromosomes.
Building Blocks: DNA Established as a Long-Chain Molecule
Even after scientists clarified the composition of nuclein, no one knew how the parts of nucleic acid were joined together. It was a mystery. Albrecht Kossel first presented the idea that nucleic acids are joined to protein and carbohydrate molecules in specific building blocks. Between 1885 and 1901, Kossel also discovered that these acids contain five nitrogen bases: adenine, cytosine, guanine, thymine, and uracil. But it was Russian-born chemist Phoebus Levene who worked out how the pieces fit together.
Between 1905 and 1939, Levene worked at the Rockefeller Institute for Medical Research in New York City. Although his passion was studying various organic compounds, he became known for isolating nucleotides, the basic building blocks of the nucleic acid molecule, and for showing that these components link up in chains.
At first, he successfully isolated the five-carbon sugar D-ribose from RNA and erroneously announced that all nucleic acids contained D-ribose. Levene later realized his error: not all nucleic acids have D-ribose. About 20 years later, Levene and his associates identified another sugar derived from D-ribose, known as 2-deoxyribose, which we now know is an integral part of DNA. Levene and other scientists then established that DNA is essentially a long-chain molecule made up of four different nucleotides, each a nitrogen base joined to a deoxyribose sugar and a phosphate, all linked up in series. But even after Levene’s discoveries, scientists still believed proteins, not DNA, carried hereditary information.
More Than What It Seems: DNA Finally Identified as the Transforming Molecule
In the late 1920s, British bacteriologist Frederick Griffith showed that dead bacteria could transfer traits to live bacteria via an unidentified transforming principle. But what was this transforming principle?
In 1944, another researcher, Oswald Avery, initially skeptical of Griffith’s findings, conducted his own experiments. And something interesting happened. Avery not only replicated Griffith’s results but also determined that DNA, not cell proteins, transferred traits from one bacterium to another. Avery and his colleagues published a paper in the Journal of Experimental Medicine in which they referred to DNA as the transforming principle.
Finally, in 1952, Alfred Hershey and Martha Chase, building on Avery’s work, confirmed that DNA is the genetic material in what became known as the Hershey–Chase experiments. They demonstrated that when bacteriophages infect bacteria, they inject their DNA into the bacterial cells. Hershey would later receive a Nobel Prize for this work; Martha Chase, for some reason, was never credited.
The First Picture of DNA Reveals Its Helical Structure
In May 1952, Rosalind Franklin, an expert in x-ray crystallography, and her team captured the famous Photo 51 — an x-ray diffraction image of DNA — revealing its helical structure.
Photo 51. Image credit: King’s College London Archives/CC BY-NC 4.0
James Watson and Francis Crick saw this photo, along with Franklin’s crystallographic calculations. It played a crucial role in helping them build the first correct model of DNA — the beautiful double helix we are familiar with today.
Image credit: ShadeDesign/shutterstock.com
Watson and Crick’s model showed that the DNA molecule is a spiral of two strands held together by complementary base pairing with adenine (A) always binding to thymine (T), and cytosine (C) always binding to guanine (G).
They also proposed that DNA replicates by separating into two strands, each strand serving as a template for assembling a new complementary strand according to the base-pairing rules above.
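To make the base-pairing rule concrete, here is a minimal Python sketch (my own illustration, with an invented example sequence, not anything from the original papers) of how one strand fully determines the other:

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement_strand(strand: str) -> str:
    """Build the complementary strand, read in the opposite (antiparallel) direction."""
    return "".join(PAIRING[base] for base in reversed(strand))

template = "ATGCGTAC"
print(complement_strand(template))  # GTACGCAT
```

Applying complement_strand twice returns the original sequence, which is why each separated strand carries enough information to rebuild the whole molecule.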
Publishing this model marked a significant milestone in the history of science. “We discovered the secret of life,” announced Crick at The Eagle — one of the oldest pubs in Cambridge. Watson and Crick became known as the fathers of DNA. They shared a Nobel Prize in 1962, just four years after the death of Rosalind Franklin, whose contribution went uncredited at the time.
DNA makes RNA, and RNA makes protein
The new DNA model opened the door to new opportunities for scientific inquiry. Other studies followed and revealed more details about how DNA works. Studies of various organisms — bacteria, viruses, and animals — showed they share the same DNA structure. Yet scientists still had no idea what type of information the genetic code carried or how it transferred from one organism to another. They also did not fully understand the role of RNA, thought proteins were amorphous, and lacked clarity about how proteins related to DNA.
Then a notable pioneer and future two-time Nobel Prize winner, Frederick Sanger, determined the sequence of all the amino acids in insulin — a protein required for glucose absorption by cells. Contrary to the prevailing belief, Sanger concluded that proteins have clearly defined structures and unique amino acid sequences.
Inspired by Sanger’s discovery, Watson and Crick attended a series of his lectures and came up with a sequence hypothesis: genetic information is encoded in the sequence of nucleotide bases of DNA and determines the sequence of amino acids, which, in turn, determines a protein’s structure and function.
“I shall…argue that the main function of the genetic material is to control (not necessarily directly) the synthesis of proteins. There is a little direct evidence to support this, but to my mind the psychological drive behind this hypothesis is at the moment independent of such evidence.” [source]
Despite the uncertainty and the lack of experimental evidence, Crick continued developing his theory and eventually arrived at a model with great explanatory power, built on two main ideas: the sequence hypothesis and the central dogma of molecular biology:
DNA makes RNA, and RNA makes protein.
It means that genetic information can move from DNA to RNA and from RNA to protein, but not from protein back to protein, RNA, or DNA.
More discoveries followed. In 1955, George E. Palade discovered the ribosome — the cellular machine that “reads” the messenger RNA (mRNA) sequence and translates it into a chain of linked amino acids. He was awarded a Nobel Prize in 1974 for this work. In 1960, Charles Loe, Audrey Stevens, and Jerard Hurwitz discovered RNA polymerase, the enzyme that catalyzes RNA synthesis from a DNA strand. In 1961, Crick and South African biologist Sydney Brenner showed that a triplet of nucleotide bases (A, C, G, and T) is the smallest unit of the genetic code — a codon — and that each codon maps to a specific amino acid.
By the 1960s, the details of the cellular machinery for reading DNA and creating proteins had become clear. Every cell makes proteins by copying DNA into RNA (transcription) and assembling a sequence of amino acids using the RNA as a template (translation). The cell then folds the amino acid chain into a protein, driven mainly by hydrophobic interactions, and stabilizes the fold with thousands of noncovalent bonds between amino acids.
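As a toy model of that two-step flow (a deliberate simplification: real transcription reads the template strand rather than doing a character swap, and the real codon table covers all 64 codons), here is roughly what transcription and translation look like in Python:

```python
# Transcription: copy the DNA coding strand into mRNA (T becomes U).
def transcribe(dna: str) -> str:
    return dna.replace("T", "U")

# Translation: read the mRNA three bases (one codon) at a time.
# Truncated codon table for this example -- the real one maps all 64 codons.
CODON_TABLE = {
    "AUG": "Met",  # also the start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "UAA": "STOP",
}

def translate(mrna: str) -> list[str]:
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "?")
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

gene = "ATGTTTGGCTAA"
mrna = transcribe(gene)   # "AUGUUUGGCUAA"
print(translate(mrna))    # ['Met', 'Phe', 'Gly']
```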
Genome editing
Image credit: elenabsl/shutterstock.com
Cutting DNA
The whole premise of genetic engineering is editing DNA: we need to somehow cut the existing DNA chain and insert another piece into it. And as so often in this history, the solution came from nature. It turns out some bacteria have a defense mechanism against viruses: enzymes that cut foreign DNA at specific sites. This is what Salvador Luria, Jean Weigle, and Giuseppe Bertani found in their labs in the early 1950s. They discovered that certain strains of E. coli could reduce the activity of phage λ — a virus that infects bacteria.
Later, in the 1960s, Werner Arber and Matthew Meselson cut (or cleaved) viral DNA using a type of enzyme they called a restriction enzyme. The problem was that it cut DNA in random places, and this inconsistency was not very useful for genetic engineering.
In 1970, Hamilton O. Smith, Thomas Kelly, and Kent Wilcox discovered another restriction enzyme, HindII, which cuts DNA at a specific recognition site. In 1978, the Nobel Prize in Physiology or Medicine was awarded to Werner Arber, Daniel Nathans, and Hamilton O. Smith for this line of work.
Inserting code into DNA
After cutting DNA, we need to insert our fragment into the cut. But how? The answer came in 1967 from Gellert, Lehman, Richardson, and Hurwitz, who discovered, purified, and characterized an enzyme called DNA ligase, a natural repair agent found in all organisms that joins broken DNA strands back together.
Natural cellular processes sometimes generate breaks in DNA, which can quickly accumulate, compromise the integrity of the genome, and lead to a loss of genetic information. DNA ligase repairs single-strand breaks by using the complementary strand of the double helix as a template, and some types of DNA ligase can repair double-strand breaks.
Today, we can cut a DNA molecule at a specific site, mix it with many DNA fragments, add DNA ligase into the soup, and, if certain conditions are met, the ligase will reseal the DNA with a new fragment in place of the cut.
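In software terms, the workflow is simple to caricature. Here is a hedged sketch that treats DNA as a string, uses EcoRI’s real recognition site (GAATTC) but otherwise invented sequences, and glosses over sticky ends and the actual ligation chemistry:

```python
# Toy model: DNA as a string, a restriction enzyme as a fixed recognition
# site, and DNA ligase as string concatenation.
RECOGNITION_SITE = "GAATTC"  # EcoRI's recognition site

def cut(dna: str, site: str = RECOGNITION_SITE) -> tuple[str, str]:
    """Cut the strand at the first occurrence of the recognition site.
    (EcoRI cuts between the G and the first A.)"""
    pos = dna.index(site) + 1
    return dna[:pos], dna[pos:]

def ligate(left: str, insert: str, right: str) -> str:
    """DNA ligase: join the fragments into one continuous strand."""
    return left + insert + right

host = "CCTAGGAATTCTTGCA"
left, right = cut(host)                # "CCTAGG", "AATTCTTGCA"
print(ligate(left, "ATGAAA", right))   # splice a new fragment into the cut
```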
Delivering altered DNA into a living cell
Another discovery that would become crucial for the future of biotech is plasmids — small, usually circular DNA molecules naturally found in bacteria. Plasmids often carry just a handful of genes, which can give bacteria superpowers like resistance to antibiotics.
Bacteria can transfer plasmids to one another through a process called conjugation. Bacteria can also pick up plasmids from the environment and incorporate them into their cells. And because plasmids have very short DNA sequences (which are easier to cut and modify), they make a perfect vehicle for delivering modified DNA into bacteria. Much of the credit for our understanding of plasmids goes to Joshua Lederberg, who won a Nobel Prize for discovering that bacteria can mate and exchange genes.
Genentech and the Birth of Genetic Engineering
Image credit: VectorMine/shutterstock.com
In November 1972, Herbert Boyer, a biochemist from the University of California, attended the U.S.-Japan joint meeting on plasmids. Boyer was an expert on restriction enzymes and presented his method for cutting DNA molecules so that the ends were not blunt but “sticky,” making them easier to join to other DNA fragments. There he met Stanley Cohen, a Stanford medical professor who worked with plasmids and who presented a method for getting bacteria to take up plasmids that would then replicate along with the bacteria.
Both scientists became interested in each other’s work and realized they could collaborate on a project that would give birth to genetic engineering. They isolated plasmids, cut them with the restriction enzyme EcoRI, and inserted a new gene into the cut, forming a new loop. The plasmid with the modified DNA was then introduced into bacteria, giving them resistance to the antibiotic tetracycline. The bacteria were grown in a culture containing tetracycline, and only those carrying the new gene survived.
This “new” DNA containing added genes was called recombinant DNA, and the set of methods used to assemble it was called molecular cloning. These findings were a big deal because they allowed scientists to insert a gene of interest into bacteria and use the bacteria as factories to produce proteins.
Sensing a great opportunity, Herbert Boyer left academia in 1976 and founded Genentech with venture capitalist Robert A. Swanson. Genentech became the biotech company that used molecular cloning to produce the first synthetic human insulin and, later, the first synthetic human growth hormone.
Interest in biotech skyrocketed, and the first wave of hype hit the markets. Genentech was the first biotech company to go public and had tremendous success on its first day. Within the hour, its stock price went from $35 to $88, closing at $71.25. A new era began.
Image credit: iQoncept/shutterstock.com
Other major biotech companies were founded in the 80s, such as Amgen, Gilead Sciences, Celgene, Vertex Pharmaceuticals, and Regeneron Pharmaceuticals. More scientific breakthroughs were made, including:
- The Sanger chain-termination method for sequencing DNA by Fred Sanger and colleagues (1977)
- A protein sequencer by Leroy Hood and Mike Hunkapiller (1980–1981)
- PCR by Kary Mullis (1983)
- DNA fingerprinting by Professor Sir Alec Jeffreys (1984)
- The first genetically engineered vaccine, against hepatitis B, by scientists at Merck & Co. (1986)
- The first genetically engineered crop — genetically altered tobacco (approved in 1986)
- The first field release of genetically modified bacteria (1987)
- The Human Genome Project for mapping the entire human genome (1988)
- Genetically modified potatoes, first introduced by Monsanto (1995)
By the 90s, investors were already pouring money into biotech stocks and buying into the promise of bringing new drugs to the market.
The number of biotechnology companies worldwide expanded from a few hundred in the 1980s to more than 4,000 by the end of the 1990s. Publicly-held US biotech companies numbered 300–400 in the 1990s and included such major firms as Amgen, Genentech, Genzyme, Immunex, and Biogen Idec. Biotechnology revenues increased nearly threefold in the decade, from $8.3 billion in 1991 to $22.3 billion in 1999, while product sales increased sixfold, from $2.7 billion to $16.1 billion. Most large pharmaceutical companies became involved in biotechnology in the 1990s. Alliances between biotech and pharmaceutical companies became the norm for product development and marketing. [source]
Scientists made much progress in understanding various human diseases, which helped develop new medicines. New genomics companies emerged that did not produce drugs but generated DNA data and sold it to drug development companies. There was already hope that this data would help discover better drugs and significantly reduce the time to bring a new drug to the market.
The crash
The year 2000 was a spectacular, record-breaking year for biotech in total IPOs, funding, and market capitalization. The industry was full of optimism and excitement. The Human Genome Project was expected to be completed soon — a huge achievement for humanity that promised to deliver a lot of value.
However, in March 2000, US President Bill Clinton and British Prime Minister Tony Blair announced that genomic data would be free and available to everyone who wanted to research the sequence. Investors went into panic mode. The stocks of leading biotech companies crashed, and the industry lost about $50 billion in market capitalization within two days of the statement. Within a few weeks, the biotech index had fallen almost by half. And that was just the beginning: the dot-com bubble burst and began to drag the biotech sector down with it.
Markets suffered, but that did not stop the advance of science and technology. The Human Genome Project was declared complete on April 14, 2003. Sequencing kept getting cheaper. Another major project, the Human Microbiome Project, began in 2007 to study the microbial flora of humans.
The cost of genome sequencing has kept dropping, giving scientists access to repositories of DNA sequences from millions of organisms. Sequencing the first human genome took more than ten years and almost $3 billion; today it costs around $300 and takes just a few days.
CRISPR and modern times
Image credit: MicroOne/shutterstock.com
In 2012 came what is widely considered one of the most significant discoveries in the history of biology. Jennifer Doudna and Emmanuelle Charpentier characterized a bacterial defense system called CRISPR-Cas9 — “genetic scissors” that bacteria use to cut viral DNA — and developed a method that allows CRISPR-Cas9 to cut and modify DNA at virtually any location in a living cell.
But doesn’t this sound like what scientists could already do when Genentech emerged? You guessed right. There’s one essential difference from the old method of using restriction enzymes: precision. With restriction enzymes, scientists could only make cuts at specific restriction sites or near them. For example, the EcoRI restriction enzyme used by Genentech recognizes the sequence GAATTC and cuts between the G and the A on both strands. But what if you want to target a different sequence, or replace a single base pair? With CRISPR-Cas9, this is possible.
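A toy comparison makes the difference concrete (sequences invented for illustration; real Cas9 also requires a short PAM motif next to the target, which I omit here). A restriction enzyme’s target is hard-wired into the protein, while Cas9’s target is effectively a parameter you pass in:

```python
# Restriction enzyme: the recognition site is fixed by the enzyme itself.
def ecori_cut_sites(dna: str) -> list[int]:
    return [i for i in range(len(dna)) if dna.startswith("GAATTC", i)]

# Cas9: the target is programmable -- supply any ~20-nucleotide guide sequence.
def cas9_cut_sites(dna: str, guide: str) -> list[int]:
    return [i for i in range(len(dna)) if dna.startswith(guide, i)]

genome = "TTGACGAATTCCATGCTGGATCGTTACCGAAT"
print(ecori_cut_sites(genome))                         # [5]  only where GAATTC occurs
print(cas9_cut_sites(genome, "CATGCTGGATCGTTACCGAA"))  # [11] anywhere we choose to target
```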
The CRISPR-Cas9 method opened up enormous opportunities for genetic research, medicine, and agriculture. Because it is now possible to switch a single gene on or off or adjust its activity, various genetic diseases like sickle cell disease, hemophilia, cystic fibrosis, and Huntington’s disease can potentially be cured. CRISPR-Cas9 can also help treat different types of cancer and infectious diseases, and it has applications in tissue engineering and regenerative medicine. For this discovery, Jennifer Doudna and Emmanuelle Charpentier received the Nobel Prize in Chemistry in 2020.
Applications in Computational Biology and Bioinformatics
An interdisciplinary field that is rapidly developing is computational biology and bioinformatics. It has pioneered efforts to solve problems like sequence alignment, gene finding, genome assembly, protein structure alignment, protein structure prediction, prediction of gene expression, and protein-protein interactions.
The amount of genomics data available to scientists has grown exponentially since the 80s. At the same time, the tech industry has developed hardware and technologies that deal with big data. Storage, memory, and computing power have gotten cheaper and more accessible.
Software and services that allow us to process terabytes of data have become available to everyone, and it is now straightforward to build and run complex distributed computational systems in the cloud. With advances in machine learning and deep learning, researchers have started applying these methods to problems in biology.
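To give a flavor of the problems listed above, here is a compact sketch of one of the oldest: scoring a global alignment of two sequences with the Needleman–Wunsch dynamic program, the algorithm at the heart of many alignment tools (the match, mismatch, and gap scores are arbitrary choices for this example):

```python
def alignment_score(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> int:
    """Needleman-Wunsch: score of the best global alignment of a and b."""
    # dp[i][j] = best score for aligning a[:i] with b[:j]
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        dp[i][0] = i * gap  # a[:i] aligned entirely against gaps
    for j in range(1, len(b) + 1):
        dp[0][j] = j * gap  # b[:j] aligned entirely against gaps
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = dp[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            dp[i][j] = max(diag, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[len(a)][len(b)]

print(alignment_score("GATTACA", "GCATGCU"))  # 0 with these parameters
```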
Here are some case studies. Google’s DeepMind developed AlphaFold and AlphaFold2, models that predict a protein’s structure from its amino acid sequence with far better accuracy than older methods. Companies like Insitro, Atomwise, and Insilico Medicine have built artificial intelligence-powered drug discovery platforms that identify potential disease targets and accelerate the development of drug candidates. This area is especially promising because getting a new drug to market is still a costly and risky endeavor: on average, it takes about $2.5 billion and ten years to develop a new drug. These high costs have chased many investors away, hurting pharma innovation.
Applications in Synthetic Biology
Many advancements have also occurred in the field of synthetic biology. These include different applications ranging from studying the function of the genes, protein design, and drug discovery to creating biomaterials, biocomputers, and synthetic life.
With the decreased costs of sequencing, DNA synthesis, and PCR, it has become possible to synthesize progressively more complex DNA sequences. In 2010, scientists at the J. Craig Venter Institute built a synthetic genome and inserted it into the bacterium Mycoplasma capricolum, creating the first living self-replicating organism with a fully synthetic genome. In 2016, the same lab designed a bacterium with the smallest genome yet — just 473 genes instead of thousands. In 2019, scientists at ETH Zurich created Caulobacter ethensis-2.0, a bacterial genome designed entirely by a computer.
Applications in Immune Cell Therapy
Another application that became very promising in the late 2010s is immune cell therapy, in which scientists engineer a patient’s own immune cells to produce artificial receptors that let them detect and fight cancer cells. The US Food and Drug Administration (FDA) approved the first immune cell therapy in 2017, and more therapies are in the pipeline.
Science and technology have made a lot of progress, and the ecosystem around biotech companies has kept pace. Jared Friedman put it well in his YC blog post How Biotech Startup Funding Will Change in the Next 10 Years:
Today, founders can make real progress proving a concept for a biotech company for much less, often as little as $100K. There are low cost CROs that will do scientific work for a fee. Companies like Science Exchange make access to CROs and scientific supplies instantaneous and cost effective to small companies. It’s easy to rent fully equipped lab space by the bench, and companies are willing to help you stock it. Affordable lab robots from companies like OpenTrons make it possible to automate batch experiments, and computational drug discovery from companies like Atomwise allows some experiments to be done completely in silico. Companies like Cognition IP are bringing down the cost of filing patents, and companies like Enzyme are streamlining FDA submission.
Conclusion
Although this is not a comprehensive overview, I hope this article has helped you understand how bio is taking over the world.
We live in an exciting time for biotech. Many experts in science, technology, and finance have high hopes for the industry’s future. Science keeps on advancing, technology is evolving, and we will likely see an explosion in the number of biotech companies soon.
But there are still a lot of challenges. The price of developing new drugs remains extremely high, and there are many problems to solve in drug design and development. Advances in personalized medicine have created new challenges for biomanufacturing. Tons of data are moved around and analyzed manually. Biotech is way behind in terms of software, and many gaps can be closed by the convergence of two industries — biotech and tech. I believe this convergence will be a noticeable trend in the next decade. This blog will focus on investigating this space and discovering problems that can be solved with software and data.