Microbes are the silent assassins of the human world. They’re everywhere: Scientists have identified microbial DNA of all sorts in our homes and on the subway, from innocuous bugs to scary ones like those that cause Legionnaires’ disease and plague. But bacteria have to beware their own predators, too: a special class of viruses called phages.

Though phages are the most abundant and diverse organisms on Earth, scientists have had a hard time studying them. Phages attack by inserting their genetic material into bacteria. Now, researchers have unearthed 12,500 new viruses in one go, the largest-ever addition to the viral family tree, by mining the genetic sequences of those unsuspecting microbial hosts. And they’re getting ready to add a whole lot more.

Scientists have known about viruses that attack bacteria since the early twentieth century. In 1917, Felix d’Herelle, a French-Canadian microbiologist, and his colleagues successfully isolated phages that kill bacteria like E. coli and salmonella, as well as the bacteria that cause dysentery, and doctors used his so-called phage therapies to treat disease. But phage research lost steam in the 1950s in the face of penicillin and other powerful new antibiotics.

With the surge of microbiome research, scientists have begun to acknowledge phages’ enormous impact on microbial communities and by extension, the environment. Microbes critically cycle nutrients like carbon, sulfur, and nitrogen through ecosystems. But phages can bring all that activity to a screeching halt. A particularly ambitious phalanx of phages in a bacterial ocean community can kill up to half of its microbes in one day. “In any environment we look, viruses are playing an important role in killing cells, moving genes around, or changing the metabolism of the cell,” says Matthew Sullivan, one of the microbiologists who discovered the new phages.

Despite their abundance, phages are really hard to study. Like microbes, they need to be nurtured in Petri dishes before microbiologists can study them. But lots of bacteria can’t be grown in a lab—the conditions aren’t similar enough to their home environments—and the same is true with phages.

Electron micrographs of aquatic viruses. (These viruses do not have names or associated hosts. They are images of viruses occurring in natural aquatic environments.) J. R. Brum and M. B. Sullivan, Tucson Marine Phage Lab, University of Arizona

That’s why scientists have turned to metagenomics. Instead of isolating specific organisms in a handful of soil or bucketful of water, scientists can extract all the DNA in the sample and sequence it. That’s the kind of DNA jumble a team of scientists based at Ohio State University mined—along with some curated, complete microbial genomes—to collect 12,500 new, individual viral genome sequences.

It wasn’t easy. Sequencing all the DNA in a sample is pretty straightforward—except then you have to sort out the DNA. “It’s almost like you took hundreds of different puzzles and threw all the pieces together,” says Tanja Woyke, a microbiologist at the DOE Joint Genome Institute and project contributor. “Now you have to put those puzzles together and figure out which pieces come from which puzzle.”

To make it even more difficult, phages profligately exchange DNA with their bacterial hosts. They’ll hijack genes and incorporate them into their own genome or leave genes behind that the host then absorbs. One way scientists get around this mess is to pick out sequences by comparing them to ones that already exist in public databases. But though more than 30,000 microbes have had their genomes sequenced, only about 1,300 viruses have. The trick was to find a way to identify and isolate new virus sequences in the mixed bag of microbial DNA.

Fortunately, phage genomes have a couple of specific tells. First, they carry genes that encode the capsid, the protein shell that protects their genetic material; no other organisms have that naturally. Second, they carry lots of new genes. “Sixty to 80 percent of the genes look like nothing we have ever seen before,” says Simon Roux, a virologist on the project. Combining those two signals (capsid genes surrounded by a sea of novel genes), Roux coded a program called VirSorter that sifted through close to 15,000 public microbe genomes, searching for viral sequences.
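To make the two-signal idea concrete, here is a toy sketch in Python. It is not VirSorter's actual code or algorithm. It assumes each gene along a stretch of DNA has already been labeled as "capsid" (a gene for the virus's protective shell), "novel" (no match in public databases), or "known". The function names, the window size, and the 60 percent cutoff are all illustrative assumptions, the last loosely borrowed from Roux's "sixty to 80 percent" remark.

```python
from typing import List, Tuple


def merge_overlaps(ranges: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping or touching (start, end) ranges into larger ones."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def find_candidate_regions(
    genes: List[str],
    window: int = 5,                  # hypothetical window size, in genes
    min_novel_fraction: float = 0.6,  # hypothetical cutoff
) -> List[Tuple[int, int]]:
    """Return (start, end) gene-index ranges, end exclusive, that look viral.

    A window qualifies only if it holds at least one capsid gene AND most
    of its remaining genes are novel.
    """
    hits = []
    for start in range(len(genes) - window + 1):
        chunk = genes[start:start + window]
        others = [g for g in chunk if g != "capsid"]
        novel_fraction = (
            sum(g == "novel" for g in others) / len(others) if others else 0.0
        )
        if "capsid" in chunk and novel_fraction >= min_novel_fraction:
            hits.append((start, start + window))
    return merge_overlaps(hits)


# A made-up contig: familiar bacterial genes flanking a suspicious island.
contig = ["known"] * 4 + ["novel", "novel", "capsid", "novel", "novel"] + ["known"] * 4
print(find_candidate_regions(contig))
```

Requiring both signals at once is the point of the sketch: a lone capsid-like gene or a patch of unfamiliar genes on its own does not flag a region.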

Other programs can mine phage sequences from complete bacterial genomes, which aren’t mixed up with DNA from other organisms, but VirSorter is the only one that can accurately pull out phage sequences from the jungle of metagenomics. Using the program and other sorting techniques, Sullivan expects the number of known phages to grow 100-fold in just the next couple of years. That’s huge. Remember those 1,300 phage genomes? “They were sequenced over many, many years. And now in one study, you way more than double the sequence information,” says Woyke.

As the phages start rolling in, scientists will be busy organizing them into a taxonomic hierarchy—basically building a new tree of life. If two viruses share a good number of genes, they are probably related. The fewer genes they share, the more divergent their evolutionary paths. “From what we have right now, it seems to mirror the tree of life,” says Roux.
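The shared-genes rule of thumb can be sketched in a few lines of Python. This is only an illustration, not the researchers' method: it assumes each virus is reduced to a set of gene-family labels and scores a pair by the fraction of their combined genes that both carry (the Jaccard index, swapped in here as a simple stand-in). The phage names and gene labels are invented.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def shared_gene_fraction(a: Set[str], b: Set[str]) -> float:
    """Fraction of the two genomes' combined genes that both genomes carry."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_pairs(genomes: Dict[str, Set[str]]) -> List[Tuple[float, str, str]]:
    """Score every pair of genomes and list the most similar pairs first."""
    pairs = [
        (shared_gene_fraction(genomes[x], genomes[y]), x, y)
        for x, y in combinations(sorted(genomes), 2)
    ]
    return sorted(pairs, reverse=True)


# Invented example data: three phages described by made-up gene families.
genomes = {
    "phage_A": {"g1", "g2", "g3", "g4"},
    "phage_B": {"g1", "g2", "g3", "g5"},
    "phage_C": {"g1", "g6", "g7"},
}

for score, x, y in rank_pairs(genomes):
    print(f"{x} vs {y}: {score:.2f}")
```

Pairs with high scores would sit close together on the tree; pairs with low scores would sit on distant branches.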

As this big picture forms, microbiologists and virologists can start to track the co-evolution of phages and their hosts—and understand what all the novel phage genes actually do. For example, you might start to see the same new genes popping up in phages that prey on ocean microbes that live in only the top layer of the ocean. That layer has light, so perhaps those genes have something to do with photosynthesis. “I’m sure there are patterns,” says Sullivan, “but at this point, we have very little of the big picture.”

That’s about as far as the scientists can go at this point—the genes are pretty alien—but as the virosphere grows, so does the knowledge of those functions. “We knew there were integrated phages in the bacterial genomes, but we didn’t have any good methods to identify the borders of the phage sequences,” says Nikos Kyrpides, a Metagenomics Program head at the DOE Joint Genome Institute. “This is something we needed, but didn’t have until now.”

Any microbiologist with a genome dataset can use VirSorter to search for new phages, and Sullivan says they made the program easy to use for that reason—you don’t need to be in a virus lab to use it. “These days most microbial ecologists, whether you’re studying soils, the gut, the oceans, are doing large scale metagenomic sequencing. And now they can use this tool to see virus stuff,” says Sullivan. The phages have been lurking in the dark, but you should prepare to hear a whole lot more about them in the next few years. And give bacteria a break—they have a lot more to deal with than you thought.



A Menagerie of Viruses Lurks in Microbial DNA