NEWS

Proteomics Factories

High-throughput techniques will likely change the field of structural biology

By Eugene Russo

Figure: X-ray crystal structure of human basic fibroblast growth factor. (Gaetano Montelione and Yuanpeng Huang of Rutgers University)

With a bit of luck and sometimes decades of dedication, scientists have in recent years revealed fascinating vistas of biological structures at the atomic level using X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy. In 1997, Timothy Richmond, a professor of X-ray crystallography at the Swiss Federal Institute of Technology in Zurich, completed an 18-year undertaking that produced one of the largest structures yet, that of the nucleosome (1). In 1998, after several years of painstaking work, Rockefeller University investigator Roderick MacKinnon, a 1999 Lasker Award winner, pulled off the incredibly tricky--some might say unlikely--feat of using X-ray crystallography to obtain a "snapshot" of a potassium channel, one of those purported "Holy Grails" of neuroscience (2). But in the wake of the Human Genome Project (HGP), structural genomics has become the focus, and more and more crystallographers feel the need for speed. A number of pilot projects are currently looking to use so-called high-throughput methods, aided by robotics, to find boatloads of protein structures using NMR spectroscopy and X-ray crystallography, a practice that could forever change structural biology.

To make optimal use of the human gene sequences the HGP is cranking out, researchers, especially those interested in drug design, need to determine the functions of the proteins those genes encode. To get valuable clues, they often look to protein structure. Thus the stage has been set for what could become a Human Proteomics Project, a collaboration involving several labs churning out not sequences, but atomic structures using NMR spectroscopy and X-ray crystallography. The two methods work in concert: NMR is limited to small proteins or protein domains but can help determine structures of proteins that can't be crystallized; it can also be useful as a screening tool to detect a protein's degree of foldedness.

High Throughput

All of the pilot projects have similar aims: to systematically determine a large number of protein structures using computational techniques and, based on protein similarity, organize them into families. "The idea [is] to do the structure of one member of a family so you can get a representative structure for all the members to a certain level of accuracy," explains John Norvell, assistant director for research training programs at the National Institute of General Medical Sciences (NIGMS) and the head of the application process for a budding government-funded structural genomics program. And all of the projects emphasize high-throughput techniques, using robotics to speed up the usually laborious tasks of protein purification, protein crystallization, and data collection. Researchers are intent on minimizing the time necessary to go from protein expression to structure determination.
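The family-based strategy Norvell describes can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, the toy sequences, and the 0.8 similarity cutoff are hypothetical, and the standard-library string comparison is a crude stand-in for the sequence-alignment methods a real project would use. Each protein joins the first family whose representative it resembles closely enough; otherwise it founds a new family and becomes that family's target for structure determination.

```python
from difflib import SequenceMatcher


def similarity(seq_a, seq_b):
    """Crude stand-in for sequence identity (assumption: not a real alignment)."""
    return SequenceMatcher(None, seq_a, seq_b, autojunk=False).ratio()


def group_into_families(sequences, threshold=0.8):
    """Greedily cluster proteins; each key is a family's representative protein."""
    families = {}
    for name, seq in sequences.items():
        for representative in families:
            if similarity(seq, sequences[representative]) >= threshold:
                families[representative].append(name)
                break
        else:
            # No existing representative is similar enough: start a new family.
            families[name] = [name]
    return families


# Hypothetical toy sequences, not real proteins.
toy_sequences = {
    "protein_A": "MKTAYIAKQR",
    "protein_B": "MKTAYIAKQK",
    "protein_C": "GGGGSSSSPP",
}
families = group_into_families(toy_sequences)
print(sorted(families))  # the proteins whose structures would be solved first
```

In this toy case only two structures would need to be solved to cover all three proteins; the payoff of the approach grows with the size of each family.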

Figure: X-ray crystal structure of the E1-beta subunit of the branched-chain 2-oxo acid dehydrogenase from the archaeon Pyrobaculum aerophilum. (Gary Kleiger of UCLA)

But what of the Herculean efforts sometimes necessary to solve complex structures such as Richmond's nucleosome? Will the increasingly popular practice of high-throughput structural genomics encourage researchers to only pick the "low-hanging" fruit, the structures that are easy and solvable? Norvell's quick to point out that the NIH request for applications includes a request for some projects focused on hard-to-determine proteins such as membrane proteins, and that structural genomics efforts will not absorb funds from other established structural biology programs. Also, improvements in technology made through these initiatives should aid researchers involved in more complex projects. And knowing the structures of simpler proteins could ease the process of piecing together more complex structures. "If you have the structures of the individual domains, for example, or proteins that have structures similar to those domains, you can use these to accelerate the structure determination of the more complex system," explains Gaetano Montelione, a professor of molecular biology and biochemistry at Rutgers University.

Montelione is the director of one of the first pilot proteomics projects in the United States, a five-year effort at Rutgers that began in 1998 and is funded by the New Jersey Commission on Science and Technology (NJCST). Several other projects are under way. The NIGMS is accepting applications for five-year grants for pilot proteomics research centers through Feb. 11, 2000, as part of a large structural genomics initiative. A separate, small NIH-funded effort is in progress at the University of Maryland's Center for Advanced Research in Biotechnology. More than a year ago, the Department of Energy (DOE) started funding planned three-year pilot proteomics projects at several locations including the University of California, Los Angeles, and Rockefeller University. David Eisenberg, DOE grant recipient and director of the UCLA DOE laboratory of structural biology and molecular medicine, goes so far as to call his project a "pilot to a pilot."

If all goes well, and a human proteomics project becomes a reality, these pilot centers and others like them will become part of a larger research network. "With the huge amount of information in the genome, it's really the only way to tackle the problem," says Thomas E. Ellenberger, an associate professor of biological chemistry and molecular pharmacology at Harvard University. In fact, NJCST-funded researchers at Rutgers and several other universities including Columbia, Cornell, Toronto, and Yale are already actively trying to form a Northeast Structural Genomics Consortium to integrate those universities' bioinformatics, protein production, robotic crystallization, X-ray crystallography, and protein NMR efforts via the Web, videoconferencing, and a common database.

Scientists might, for example, use gene targets identified by the bioinformatics groups and posted on the consortium Web site to generate expression vectors at the University of Toronto or Rutgers University. Researchers at the Hauptman-Woodward Institute in Buffalo would then screen the corresponding protein samples to identify conditions for crystallization and post those conditions in the project database. In a final step, crystallographers at Columbia would use those conditions to generate samples suitable for data collection. "This combination of expertise is not available in any single institute and can only be consolidated by forming a multi-institute, multidisciplinary consortium," claims Montelione.
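One way to picture the hand-offs in such a consortium is as a record in the common database that moves through a fixed series of stages. The Python sketch below is purely illustrative: the stage names, the class, and the gene identifier are hypothetical, and nothing here describes the consortium's actual database design.

```python
# Hypothetical stage names paraphrasing the hand-offs described above.
STAGES = (
    "target_posted",                       # bioinformatics groups post a gene target
    "expression_vector_made",              # an expression vector is generated
    "crystallization_conditions_posted",   # screening site reports conditions
    "crystals_ready_for_data_collection",  # crystallographers grow usable samples
)


class TargetRecord:
    """Tracks one gene target as it passes from site to site."""

    def __init__(self, gene_id):
        self.gene_id = gene_id
        self.stage_index = 0
        self.history = []  # (stage reached, site, note) tuples

    @property
    def stage(self):
        return STAGES[self.stage_index]

    def advance(self, site, note=""):
        """Move the target to the next stage and log which site did the work."""
        if self.stage_index == len(STAGES) - 1:
            raise ValueError("target has already reached the final stage")
        self.stage_index += 1
        self.history.append((self.stage, site, note))
        return self.stage


record = TargetRecord("hypothetical_gene_001")
record.advance("University of Toronto")
record.advance("Hauptman-Woodward Institute", note="conditions posted")
record.advance("Columbia University")
print(record.stage)
```

Keeping the history alongside the current stage means any participating lab can see both where a target stands and which site produced each intermediate result.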

Genes vs. Proteins

According to Ellenberger, a potential Human Proteomics Project has greater support in the scientific community than did the HGP in its early days. "Many people initially looked at structural genomics as not the best investment," says Ellenberger, one of several applicants for the upcoming NIGMS-sponsored structural genomics initiative. But scientists now seem more willing to take the leap of faith necessary for a huge, costly initiative akin to the HGP. "There's a similarity [between projects] in that there's an attempt to get a complete picture," comments Eisenberg.


Figure: X-ray crystal structure of the enzyme adenylosuccinate lyase from Pyrobaculum aerophilum. The structure was solved by Eric A. Toth in the lab of Todd O. Yeates at UCLA.

Of course, there are major differences as well. The HGP's medical payoff has to do with diagnosis, finding susceptibilities in an individual's genome; the medical payoff of a proteomics project relates to treatment, finding therapeutic proteins, drugs, or vaccines. Another big difference: Automating protein analysis is a much taller order than automating DNA analysis. The chemical diversity of proteins relative to nucleic acids is going to make broad-based studies much more challenging. "We're going to have to figure out creative ways to deal with the diversity of folds and surface chemistries of these proteins," says Ellenberger. He adds that membrane proteins like those studied by MacKinnon will be overlooked in the first "pass" of structural genomics studies, since no one knows how to crystallize them in a high-throughput mode.

For More Information
Structural Genomics Initiatives, National Institute of General Medical Sciences
www.nih.gov/nigms/funding/psi.html

Initiative in Structural Genomics and Bioinformatics, New Jersey Commission on Science and Technology
www-nmr.cabm.rutgers.edu:80

UCLA-DOE Lab of Structural Biology and Molecular Medicine
www.doe-mbi.ucla.edu/Overview.html

Structure to Function Pilot Project, Center for Advanced Research in Biotechnology (CARB) and The Institute For Genomic Research
s2f.carb.nist.gov

Despite the emphasis on high throughput, many of the top structural biology labs may still gravitate in the future toward the "big" biological problems, big in both the molecular weight and the complexity of their targets. Richmond, for one, isn't interested in cranking out tons of protein structures. "My hope would be to continue ... to pick a biological question--for example, how does transcription depend on chromatin--and then try to provide structural data," he explains.

From Cutting Edge to Common Tool

But the nature of structural biology and X-ray crystallography projects has changed significantly in recent years. As crystallography and NMR evolve from cutting-edge techniques to everyday laboratory tools, more and more biochemists and biologists can, with a little help, crystallize their favorite proteins. Ellenberger likens the field's evolution to that of molecular biology. "When it started out, only a few labs could purify the required enzymes and/or knew how to work with DNA," he explains. "Now we buy kits." Although "crystallography kits" may never materialize, many of the tools and technologies developed through structural genomics projects should simplify structural studies and enable more interested scientists to get into the crystallography game.

Richmond points out, though, that the process is still by no means automatic: "You can get lucky," he says. Not-so-well-versed scientists can have the synchrotron station masters collect their data for them, run it through a program called SOLVE, and come up with a structure. But it's rarely that routine. One likely scenario: Full-time structural biologists will focus on the larger, more difficult problems; other investigators will do their own structural work for the routine, well-defined problems. Says Ellenberger, "I think the structural biology community welcomes that."

Eugene Russo can be contacted at erusso@the-scientist.com.

References

1. E. Russo with comments by Timothy J. Richmond, Hot Papers, The Scientist, 13[24]:15, Nov. 22, 1999.

2. J. Wilson with comments by R. MacKinnon, Hot Papers, The Scientist, 14[1]:15, Jan. 10, 2000.

Finding the Right Model

Though the idea of creating a huge bank of human protein structures has put a twinkle in the eye of many a structural biologist, researchers disagree as to which model organism would best facilitate the first step. Indeed, some even see no problem with going directly to human proteins. The general objective: get better at predicting protein structure from Human Genome Project sequence information so that scientists can infer gene function. Then, eventually, get higher-resolution structural information to inspect the differences between proteins. There's a lot of similarity among kingdoms of organisms--some folds are even identical. But the significance of the discrepancies is unclear. Where's the best place to start?

The organisms of choice for Department of Energy (DOE) projects currently under way belong to a group of microorganisms called Archaea, one of the three domains of life along with bacteria and eukaryotes. The thermophilic archaeans under study, first found in thermal vents off the coast of Italy, thrive at extreme temperatures, so their proteins are unusually heat-stable. That property enables researchers to maximize the efficiency of protein isolation and purification, one of the most difficult steps in structure determination. Researchers express each of the organism's 2,000 proteins separately in Escherichia coli. They then grow the altered E. coli, break open the bacteria, spin down the cell debris, and separate the archaean protein from the unwanted E. coli proteins by heating the resulting solution until the E. coli proteins denature. "This is much, much simpler than dealing with a human," explains David Eisenberg, DOE grant recipient and director of the University of California, Los Angeles DOE laboratory of structural biology and molecular medicine. "It's much simpler even than dealing with another sort of bacterium because of this high-temperature feature." The DOE efforts, says Eisenberg, are intended primarily to investigate human proteome project feasibility.

A five-year pilot proteomics project at Rutgers University is geared toward metazoan organisms like Caenorhabditis elegans and Drosophila. "In choosing our genes [our targets], we try to be guided by the other kinds of functional genomics that are going on," says project leader Gaetano Montelione, a professor of molecular biology and biochemistry at Rutgers. His group wants to tie in the project's structural genomics data with functional genomics data from the mountain of cell and developmental biology, genetic screening, and knockout studies done in yeast, Drosophila, and C. elegans. "Although [Archaea] are more straightforward, you don't have this worldwide community of people doing the functional genomics characterization," he comments. And, says Montelione, metazoans present technical obstacles that will no doubt have to be met if researchers ever hope to focus on human proteins. Rutgers researchers, whose current efforts emphasize technology development over cranking out tons of structures, would like to tackle these problems sooner rather than later.

Other investigators see no good reason not to go directly to human proteins. As part of a grant proposal for the nascent structural genomics project at the National Institute of General Medical Sciences (NIGMS), a group of researchers from Harvard Medical School and the Dana-Farber Cancer Institute hopes to target human proteins associated with various human cancers. They expect to find folds and combinations of folds in the human genome that aren't represented in lower organisms. But human proteins are tricky: They don't always fold properly when expressed in bacteria like E. coli. However, Thomas E. Ellenberger, an associate professor of biological chemistry and molecular pharmacology at Harvard, notes that the pharmaceutical industry has succeeded in producing many human proteins in lower organisms by denaturing and refolding them. And he disputes the notion, held by many, that proteins from thermophiles like Archaea are somehow less conformationally flexible and therefore more suited to crystallization. Some studies have even suggested that Archaea aren't quite the perfect model organisms for high-throughput structural studies. Some archaeal proteins have proven difficult to express in E. coli; others can't tolerate rapid purification by heat denaturation of the host proteins. "These are not unlike the problems facing high-throughput expression of human proteins," says Ellenberger.

Of course, the ongoing collection of structural genomics projects will likely make room for all model organisms. Montelione is quick to emphasize that while his project prioritizes human homologues, it's not merely focusing on the metazoans with an eye toward humans. "We're doing the metazoans because the metazoans are interesting in and of themselves," he explains. And the ultimate goal is not just screening or drug design. "The goal is more profound than that," says Montelione. "I think it's to understand, at an atomic level, the way that biology works."

--Eugene Russo


The Scientist 14[3]:1, Feb. 7, 2000

© Copyright 2000, The Scientist, Inc. All rights reserved.