Topics: Informatics, Research Resources

Panning for Gold in Research Data

Collaborative offers free bioinformatics training, research support.

Hidden knowledge lurks in the massive data sets generated by high-throughput technologies like the microarray, which can measure the expression of tens of thousands of genes simultaneously. A bioinformatics services program at Countway Library of Medicine, in collaboration with Harvard Catalyst, the Harvard Clinical and Translational Science Center, trains researchers and students to use powerful software, algorithms and databases. These tools can find nuggets of gold in rivers of research data—a mutated protein sequence that raises the risk of colon cancer, for example, or a new target for an existing drug.

Bioinformatics is the application of computer science and information technology to the generation of knowledge about medicine and the life sciences. Common activities include comparing gene expression under different conditions, such as before and after the application of a drug; mapping, aligning and analyzing DNA and protein sequences; or mining the genomes of a patient population, thousands of medical records or the global research literature for fresh insights.

Bioinformatics was once used primarily in genetics, but it “has become a core tool for every field of biology, including clinical research,” says David Osterbur, public and access services librarian and head of the bioinformatics services team based at Countway. The group is known as C3 Bioinformatics (C3 stands for Countway, the Center for Biomedical Informatics, and Harvard Catalyst). The C3 collaboration is a dream team of bioinformatics experts in genetics, molecular and cellular biology, clinical medicine, software engineering and library science.

Workshops and consults

In the true spirit of a library, Osterbur’s team offers nearly 30 different training workshops, as well as private consultations, all free of charge. Classes include “BLAST Tips and Tricks,” “Illumina Microarray Data Analysis Using R/Bioconductor” and “Next Generation Sequencing Analysis Using JMP Genomics.”

In the five years since Osterbur founded this program, thousands of Harvard researchers and students have taken hands-on courses and found them indispensable. Reddy Gali, a Harvard Catalyst bioinformatics educator, does much of the teaching and often works one-on-one with researchers.

After consulting with Gali on how to analyze microarray and NanoString data, Oleg Butovsky, a research fellow in neurology at Brigham and Women’s Hospital, discovered therapeutic targets in the peripheral immune system that he said could potentially slow the development of amyotrophic lateral sclerosis, or ALS. For his critical assistance, Gali will be credited as a co-author of a paper on Butovsky’s findings.

Amanda Nottke, an HMS postdoctoral student in pathology, consulted with Osterbur on how to run an analysis in a sequence alignment program called ClustalW. She had tried using other tools on her own with no luck, but after meeting with Osterbur for an hour, she left with a solution.

“What may have taken a researcher a week to do, with frustrating results, we can usually help them do in a couple of hours,” Gali said.

Highly responsive

C3 Bioinformatics is a critical resource that increases researchers’ efficiency, points out Douglas MacFadden, director of informatics technology for the Center for Biomedical Informatics, the research arm of the library’s bioinformatics enterprise. The team consists of MacFadden, Gali, Osterbur and Paul Bain, a reference and education librarian whose expertise includes data-access tools for eukaryotic and other species, such as the Ensembl browser, the University of California Santa Cruz genome browser, and BioMart.

“We can be highly responsive to the changing needs of Harvard’s community and in a matter of months develop a new course to teach precisely what our researchers need,” said MacFadden. For example, after people started clamoring to learn next-generation sequencing, the C3 team designed a course and offered it for the first time on October 27, 2011.

Most researchers don’t have the time to sit down and teach themselves how to use these complex tools. They can hire a consultant to perform data analyses, or they can turn to C3 Bioinformatics and learn how to do it.

Bioinformatics tools are constantly changing, becoming ever more powerful. Microarray technology was new in the late 1990s, says Osterbur. Today, it is about to be eclipsed by next-generation sequencing.

Starting with this year’s incoming class, as part of the new Scholars in Medicine Program, all medical students are required to conduct a research project. The bioinformatics staff and reference librarians will introduce them to the Countway’s rich resources and explain how to manage their research.

“Medical students typically learn about nucleic acids and sequence alignment, and why those analyses are done,” said Osterbur, “but no class teaches them how to actually do these things.”

That’s where C3 comes in. “This is stuff you can only learn hands-on, at the computer,” Osterbur said. “That’s where it comes alive.” Best of all, participants leave class knowing what to do next with all that raw data.
