F1000 contributions


A draft human pangenome reference.

Nature. 2023 May; 617(7960):312-324

The Human Pangenome Reference Consortium describes a draft of the human pangenome reference, providing highly accurate and diverse genetic information and capturing more variations between individuals to improve variant analysis and structural variant detection compared to the existing linear genomic references.


100,000 Genomes Pilot on Rare-Disease Diagnosis in Health Care - Preliminary Report.

The New England Journal of Medicine. 2021 11 11; 385(20):1868-1880

The paper demonstrates the opportunity of using genome sequencing to diagnose rare diseases. Some key take-home messages include:

  • a surprising number of diagnoses are immediately actionable.
  • many patients seek care regularly and at great expense, resources that can be better spent with the underlying cause known.
  • linking variants to diseases increases our understanding and can guide future research.


Global burden of bacterial antimicrobial resistance in 2019: a systematic analysis.

Lancet 2022 Feb 12; 399(10325):629-655

This article estimates the global burden of anti-microbial resistance (AMR) across regions and pathogen-drug combinations. The findings show that the burden of AMR is already alarmingly high, and, as previously known, the negative impact of AMR is growing rapidly. The authors discuss urgently needed actions: prevent infections through better sanitation and other disease-preventing measures, prevent infections through vaccines, reduce antibiotics in agriculture, only use antibiotics when needed, and increase our efforts in developing new antibiotics.


Deciphering interaction fingerprints from protein molecular surfaces using geometric deep learning

Gainza and colleagues present MaSIF, a geometric deep learning tool that captures both structural and geometric features of protein surfaces to create so-called surface fingerprints. MaSIF is tested in three separate applications: prediction of protein-protein interactions, prediction of protein-protein interaction interfaces, and classification of ligand-binding sites. MaSIF performs well compared to specialized tools in each area and is at times remarkably faster.


Highly accurate protein structure prediction for the human proteome

Nature. 2021 Aug;596(7873):590-596. doi: 10.1038/s41586-021-03828-1

Tunyasuvunakool and colleagues present a high-accuracy data set of AlphaFold models covering the majority of the human proteome, mainly excluding proteins longer than 2700 amino acids. The models are made freely available to the community, and accuracy predictions accompany each model. The contribution is significant in several respects, the most apparent being the models predicted to be highly accurate for proteins whose experimental structures are not yet available.

Further, many of the low-accuracy regions are intrinsically disordered or only ordered in the presence of interaction partners. However, low-accuracy regions that do not correspond to disordered parts can either be genuine failures of AlphaFold or, importantly, indicate features of proteins that are not yet fully understood. They can also result from inaccurate or incomplete information in the databases used as input to the prediction.


A proximity-dependent biotinylation map of a human cell.

Nature. 2021 Jul; 595(7865):120-124

Go and colleagues used a proximity-dependent biotinylation technique to discover over 4,100 proteins residing close to one of the 192 bait proteins. In addition, the authors demonstrate that their data is of high quality and high resolution. Finally, they make this data easily available via both an online tool and downloadable files.


Quadrivalent influenza nanoparticle vaccines induce broad protection.

Nature. 2021 Apr;592(7855):623-628. doi: 10.1038/s41586-021-03365-x

Boyoglu-Barnum and colleagues report on the computational design of immunogenic nanoparticles. The nanoparticles were generated using a two-component strategy: a pentameric scaffold was mixed with an equimolar mixture of fused influenza HA ectodomains, resulting in mosaic nanoparticles that both elicit a humoral response and contain several different antigens simultaneously. The authors showed that the nanoparticles performed similarly to commercial influenza vaccines on recent influenza strains and better on historical strains. Notably, the nanoparticles were compared to a similar design in which the pentameric scaffold could not assemble, thereby demonstrating the importance of the larger particles. The two-component strategy is generic and makes it simpler to generate nanoparticles with many different antigens.


RefMet: a reference nomenclature for metabolomics.

Nat Methods. 2020 12; 17(12):1173-1174

RefMet is a reference nomenclature for metabolomics that aims to solve the problem of a single compound having multiple different names, which makes cross-referencing challenging. RefMet has four annotation groups: 1. complete structural annotations; 2. RegioChemistry (chiral centers or double bonds not fully characterized); 3. some structural features; 4. metabolic class and sum-composition information. Community efforts to provide RefMet nomenclature in major data resources are underway. As of September 2020, RefMet contained 138k metabolic species.


Array programming with NumPy.

Nature 2020 09; 585(7825):357-362

NumPy is a Python software package used to carry out calculations on arrays of arbitrary dimensionality. These types of calculations are common in scientific computing and NumPy has become a cornerstone of a wide array of scientific software. This review touches on the basic capabilities of NumPy but also describes its foundational role in the scientific Python ecosystem.


Antibody-mediated disruption of the SARS-CoV-2 spike glycoprotein.

Nat Commun 2020 10 21; 11(1):5337

A detailed understanding of how antibodies impact the bound antigen can provide essential insights, but data can be challenging to obtain. Wrobel and colleagues present a study where they use cryo-electron microscopy (cryo-EM) to study conformational changes of the SARS-CoV-2 spike membrane glycoprotein upon binding of CR3022. CR3022 cannot bind the open or closed conformation of the spike, and the authors demonstrate that the trimer dissociates upon antibody binding. Despite this dissociation, CR3022 does not neutralize SARS-CoV-2.


Structural basis of a shared antibody response to SARS-CoV-2.

Science. 2020 08 28; 369(6507):1119-1123

This paper by Yuan and colleagues provides an in-depth view of the antibodies that bind to the receptor-binding domain (RBD) of the SARS-CoV-2 spike protein. The authors used previous studies to identify the most common IGHV gene, IGHV3-53. Two members of the group were co-crystallized with the RBD; the crystal structures were compared to the binding interface of ACE2, and the authors demonstrated that the antibodies target the same interface. This study provides insight into why some antibodies can be neutralizing, and this information is useful to guide vaccine development.


A lipidome atlas in MS-DIAL 4

Tsugawa et al. Nat Biotechnol. 2020 Jun 15. doi: 10.1038/s41587-020-0531-2

Tsugawa and colleagues present mass spectrometry-data independent analysis software version 4 (MS-DIAL 4) with extensive lipidomics support. The authors used data collected from over 1000 samples on 10 different mass spectrometry platforms to construct a lipidome atlas. The atlas includes tandem mass spectra, collision cross-sections, and retention times, and covers 117 lipid subclasses.


Structural basis of Cullin 2 RING E3 ligase regulation by the COP9 signalosome.

Faull et al. Nat Commun. 2019 08 23; 10(1):3814

The paper by Faull and colleagues uses three complementary methods to gain structural and functional insights into protein complexes that are traditionally difficult to study. Cryo-electron microscopy (cryo-EM) is used to determine the quaternary structure in several conformational states. Cross-linking mass spectrometry (XL-MS) is used to confirm the relative positioning of the individual protein chains and, together with the cryo-EM images, serves as input to model the quaternary structures. Finally, hydrogen-deuterium exchange (HDX)-MS is used to study the conformational changes the protein complex undergoes. The three techniques are used to study the COP9-mediated deactivation of Cullin-RING E3 ligases.


1,520 reference genomes from cultivated human gut bacteria enable functional microbiome analyses.

Zou et al. Nat Biotechnol. 2019 Feb;37(2):179-185.

High-quality microbial reference genomes play essential roles in several research fields. This study by Zou and colleagues describes a project intended to increase the number of high-quality reference genomes for bacteria found in the human gut. The study collected samples from over 150 subjects and used eleven different cultivation conditions, yielding over 6000 isolates, each analyzed by DNA sequencing. The effort resulted in over 1500 high-quality reference genomes. The large number of previously uncharacterized bacterial species indicates that we still have work to do to increase reference genome coverage of the bacteria found in the human population. It also indicates that the human gut microbiota is both complex and diverse.


Commonality despite exceptional diversity in the baseline human antibody repertoire.

Briney et al. Nature. 2019 Feb;566(7744):393-397.

This paper by Briney and colleagues describes an experiment where circulating B-cells were isolated in 10 biological replicates from 10 human subjects. The antibody heavy chains were sequenced, roughly at a 1:4 IgG to IgM ratio. The paper characterizes the data in detail, convincing the reader that the quality of the dataset is high. The authors then use the biological replicates to estimate the diversity of circulating B-cells. Finally, the authors make their data available, thereby creating a high-value dataset that can be used to better understand the complexity that underpins our immune system.


Moving beyond P values: data analysis with estimation graphics.

Ho et al. Nat Methods. 2019 Jul;16(7):565-566

In this correspondence, Ho and colleagues suggest using estimation graphics to supplement p-value calculations to improve the communication of scientific results. The authors provide multiple ways to produce the estimation graphs easily.
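
As a rough illustration of what an estimation graphic reports, the sketch below computes the mean difference between two groups together with a percentile-bootstrap 95% confidence interval. It uses only the Python standard library rather than the authors' own tools, and the function name, sample values, and resampling settings are assumptions made up for this example.

```python
import random
import statistics


def bootstrap_mean_difference(control, treated, n_resamples=5000, alpha=0.05, seed=0):
    """Return (observed difference, CI low, CI high) for mean(treated) - mean(control)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = statistics.mean(treated) - statistics.mean(control)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping the group sizes.
        c = rng.choices(control, k=len(control))
        t = rng.choices(treated, k=len(treated))
        diffs.append(statistics.mean(t) - statistics.mean(c))
    diffs.sort()
    # Percentile interval: cut off alpha/2 of the resampled differences at each end.
    low = diffs[int((alpha / 2) * n_resamples)]
    high = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return observed, low, high


# Hypothetical measurements, invented for illustration only.
control = [4.1, 5.0, 4.7, 5.3, 4.9, 4.4, 5.1, 4.8]
treated = [5.6, 6.1, 5.2, 6.4, 5.9, 5.5, 6.0, 5.8]

effect, ci_low, ci_high = bootstrap_mean_difference(control, treated)
print(f"mean difference = {effect:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

Reporting the effect size with its interval, rather than a p-value alone, shows both how large the difference is and how precisely it has been estimated.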


Single-molecule kinetics of pore assembly by the membrane attack complex.

Parsons et al Nat Commun. 2019 May 6;10(1):2066

This paper by Parsons and colleagues describes an atomic force microscopy experiment to determine the kinetics of the formation of single membrane attack complexes (MACs), multi-component protein complexes that form inside cell membranes to create pores. The rate-limiting step is the incorporation of the first C9 component, which happens after the sequential addition of components C5b, C6, C7, and C8 that are permanently inserted into the target membrane. The first C9 component is rapidly followed by 17 additional C9 components, incorporated at rates that are two orders of magnitude higher. The comparatively slow incorporation of the first C9 component is potentially important because it allows time for the CD59 MAC inhibitor to bind to C5b-C8 and prevent the MAC from making pores in host cells.


Universal protection against influenza infection by a multidomain antibody to influenza hemagglutinin.

Laursen et al. Science. 2018 Nov 2;362(6414):598-602

Laursen and colleagues present compelling evidence that passive immunization is more efficient using multidomain antibodies compared to single-domain antibodies. This might play a significant role in combating diseases common in elderly patients as passive immunization is relatively more effective compared to active immunization in aging populations. Multidomain antibodies are constructed by combining several broadly neutralizing antibodies. This paper demonstrates both that these have a broad application area and that the combination is more powerful than the constituent parts.


Within-host competition can delay evolution of drug resistance in malaria.

Bushman et al PLoS Biol. 2018 Aug 21;16(8):e2005712

This paper by Mary Bushman and colleagues describes a mathematical model that sheds light on the complicated relationship between within-host competition and between-host transmission of malaria parasites. A detailed understanding of how the cost of sustaining resistance affects a resistant strain's ability to compete with non-resistant strains is of high importance in addressing the increasing resistance burden from parasites, bacteria, and viruses. The model provides insight into why resistance emerges in areas with low disease burden but spreads more quickly in areas with high disease burden.


A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life.

Parks et al Nat Biotechnol. 2018 Nov;36(10):996-1004

This paper by Parks and colleagues proposes yet another data-driven way to better deal with genome phylogeny, a problem exacerbated by the explosion of genomes from bacteria that cannot be cultured and are therefore difficult to classify using standard methods. It is worth reading this paper and using the authors' taxonomy because, while data-driven, it puts a lot of emphasis on preserving as much of the existing taxonomies as possible, thereby minimizing the break from the literature. Further, the authors make their software available under an open-source license, which enables anybody to benefit from automated classification without having to make their data public before publication.


Academic research in the 21st century: maintaining scientific integrity in a climate of perverse incentives and hypercompetition.

Edwards and Roy Environ Eng Sci. 2017 Jan 1;34(1):51-61

Incentives intended to increase scientific performance can promote negative behavior that ultimately can damage the reputation of the scientific endeavor. The problem is growing, and it is fundamentally important to increase awareness among scientists and others so that the problem can be adequately addressed. I hence urge everybody to read this article (or similar as there is a growing body of literature that is addressing or quantifying the problem) and spread the word.


A protein activity assay to measure global transcription factor activity reveals determinants of chromatin accessibility.

Wei et al Nat Biotechnol. 2018 Jul;36(6):521-529

This article by Wei and colleagues describes a technology (active transcription factor identification [ATI]) that is capable of measuring transcription factor activity in cells or tissues. They apply ATI to several experimental systems, such as different mouse tissues and various species, and conclude that transcriptional regulatory networks might be simpler than previously thought. The described technique is significantly more broadly applicable than previous techniques and therefore holds the promise of shedding new light on the activity of transcription factors in various tissues and under different perturbations.


Molecular mechanism of extreme mechanostability in a pathogen adhesin.

Milles et al Science. 2018 Mar 30;359(6383):1527-1533

Pathogenic adhesins such as SdrG bind their target host proteins through a mechanism called dock, lock, and latch (DLL). In short, the adhesin and target molecule dock in a sequence-dependent manner; the adhesin then goes through a conformational change where a strand from one domain is folded over the target molecule and forms a beta sheet with an adjacent domain. Milles and colleagues used atomic force microscopy and molecular modeling to determine the binding strength to host proteins and demonstrated that the bonds formed by DLL adhesins are as strong as some covalent bonds; remarkably, this strength is sequence-independent as it depends on backbone interactions.


Dietary trehalose enhances virulence of epidemic Clostridium difficile.

Collins et al Nature. 2018 Jan 18;553(7688):291-294

This paper shows that cheaper production and the subsequent approval of trehalose as a food additive pre-dated the spread and epidemic outbreaks of Clostridium difficile strains capable of metabolizing trehalose at low concentrations. Further, the authors show that the capability of metabolizing trehalose existed at least 15 years before the broad introduction of this chemical in human diets; normal modern diets contain sufficient amounts of trehalose to be detected by C. difficile in the human GI tract. A link between the hypervirulence of C. difficile ribotype RT027 and its capability of metabolizing trehalose was demonstrated using mice. The authors conclude that the addition of trehalose to human diets contributed to the emergence of epidemic and hypervirulent strains.


Singularity: Scientific containers for mobility of compute.

Kurtzer et al PLoS One. 2017 May 11;12(5):e0177459

This paper by Kurtzer and colleagues describes Singularity, a software container solution well suited to computational resources of the kind often found at academic institutions. Users can install one or many software tools, and all the software libraries that these tools depend on are inside the container. The containers can then be shared with colleagues or used on central computer infrastructure to address larger computational problems. As the containers are files, they can be preserved and thereby allow users to reproduce results produced in the past or elsewhere. If containers are widely adopted in the scientific community, they will contribute to solving the reproducibility problem and save time and effort.


Potent peptidic fusion inhibitors of influenza virus.

Kadam et al. Science. 2017 Oct 27;358(6362):496-502.

Influenza is a deadly virus, challenging to combat due to rapid evolution; broadly neutralizing antibodies (bnAbs) often bind to epitopes that play essential roles in the function of the protein and therefore are more conserved. The influenza hemagglutinin (HA) protein undergoes a conformational change as the virus particle fuses with the endosome membrane. The hinge region that enables the conformational change is one of these conserved epitopes that is the target of several characterized bnAbs.

Kadam and colleagues report on the development and characterization of small cyclized peptides that were designed using experimental crystal structures of bnAbs as guides; as a result, the peptides also bind to the conserved hinge region, preventing the conformational change needed for fusion. The peptides described here were co-crystallized with HA, and the neutralizing effects were evaluated. The results of the paper indicate that it is possible to mimic the binding of a bnAb with small, stable peptides. This is an important step forward, as peptides have many properties that make them more suitable as drugs compared to bnAbs.


Antibiotic tolerance facilitates the evolution of resistance.

Levin-Reisman et al. Science 2017 02 24; 355(6327):826-830

Bacterial antibiotic tolerance is the ability to survive exposure to antibiotics through slow growth, for example by extending the lag phase; tolerance is independent of resistance, as the minimum inhibitory concentration for tolerant strains under growth can be the same as for non-tolerant strains. Bacterial antibiotic resistance, on the other hand, is the ability not only to survive in the presence of antibiotics but to grow and divide. It is of vital importance to understand the relationship between tolerance and resistance. The paper by Levin-Reisman et al. sheds light on this relationship and shows that, in this particular case, several different mutations can lead to tolerance. These mutations allow the tolerant strains more time to develop resistance; the authors show that this additional time is crucial because the mutations that increase resistance are fewer and therefore take longer to arise, on average. The authors conclude that the most likely evolutionary path to resistance goes via tolerance. This finding raises the question of whether clinical labs should also measure and report tolerance in addition to the resistance profiles commonly reported today.


Analysis of protein-coding genetic variation in 60,706 humans.

Lek et al Nature 2016 08 18; 536(7616):285-91

Lek and colleagues report on a meta-analysis of a large cohort of human genomes to build a massive resource of protein-coding genetic variations. The technical challenges in processing data at this scale are non-trivial; as the authors provide detailed descriptions of their computational analysis, they enable others to carry out larger experiments in the future. Importantly, the authors make their results readily accessible.


Enhancing reproducibility for computational methods.

Stodden et al. Science 2016 12 09; 354(6317):1240-1241

Reproducibility of scientific results is a cornerstone of the scientific endeavor; scientific and other news sources have reported that the reproducibility of many studies published in high-impact journals and elsewhere is lower than expected. This article by Stodden and colleagues lists a number of recommendations that, if implemented by everybody, would improve the reproducibility of computational methods. Some of the recommendations are challenging, but this paper is nevertheless a must-read for anybody who uses computational methods.


The cytotoxic Staphylococcus aureus PSMα3 reveals a cross-α amyloid-like fibril.

Tayeb-Fligelman et al. Science 2017 02 24; 355(6327):831-833

Pathogens interact with their hosts in a wide variety of ways; understanding how and why can play a vital role in solving the antimicrobial resistance problem. This paper by Tayeb-Fligelman and colleagues reports on the 1.45 Å crystal structure of the PSMα3 amyloid fibril. The fibril is cytotoxic to host cells, in contrast to the monomer. Interestingly, this fibril is formed by two 'sheets' made up of α-helices and thereby differs from previously solved fibril structures, whose cores are made mostly from β-sheets. The authors speculate that the cytotoxicity results from the fibrils deforming the host cell membrane.


SPLASH, a hashed identifier for mass spectra.

Wohlgemuth et al. Nat Biotechnol 2016 Nov 08; 34(11):1099-1101

Wohlgemuth and colleagues have written SPLASH, a small application implemented in several programming languages that generates database-independent hashes for MS spectra. It is already supported by several online data repositories and hence facilitates cross-database searches. This technology, if widely adopted, will make referring to a specific spectrum simpler, and this can play an important role in tracing data sources used to, for example, create spectral libraries. Spectral libraries are increasingly used to increase the sensitivity of both DDA and DIA experiments. I believe that SPLASH will provide an opportunity to more easily trace the flow of information through the increasingly complex MS analysis workflows.
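
To make the idea of a database-independent hash concrete, here is a toy sketch that derives an identifier from the peak list alone, so the same spectrum gets the same identifier wherever it is stored. This is explicitly not the SPLASH specification: the canonical text form, the intensity rescaling, the `toy-` prefix, and the function name are all assumptions chosen for illustration.

```python
import hashlib


def toy_spectrum_id(peaks, mz_decimals=4):
    """Return a short identifier for a list of (m/z, intensity) peaks.

    NOT the SPLASH algorithm: a simplified stand-in that hashes a
    canonical text form of the spectrum.
    """
    if not peaks:
        raise ValueError("spectrum has no peaks")
    max_intensity = max(intensity for _, intensity in peaks)
    if max_intensity <= 0:
        raise ValueError("spectrum has no positive intensity")
    # Canonical form: peaks sorted by m/z, intensities rescaled to 0-100 integers,
    # so peak order and absolute intensity scale do not change the identifier.
    canonical = " ".join(
        f"{mz:.{mz_decimals}f}:{round(100 * intensity / max_intensity)}"
        for mz, intensity in sorted(peaks)
    )
    digest = hashlib.sha256(canonical.encode("ascii")).hexdigest()
    return "toy-" + digest[:20]


# Hypothetical peak lists: the second is the first reordered and rescaled.
spectrum_a = [(100.0, 50.0), (200.5, 100.0), (150.25, 25.0)]
spectrum_b = [(150.25, 250.0), (200.5, 1000.0), (100.0, 500.0)]
spectrum_c = [(100.0, 50.0), (200.5, 100.0), (151.25, 25.0)]

print(toy_spectrum_id(spectrum_a))
print(toy_spectrum_id(spectrum_a) == toy_spectrum_id(spectrum_b))
print(toy_spectrum_id(spectrum_a) == toy_spectrum_id(spectrum_c))
```

Because the identifier depends only on the spectrum's content, two repositories can compute it independently and still agree, which is what makes cross-database lookups possible.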


An interbacterial NAD(P)(+) glycohydrolase toxin requires elongation factor Tu for delivery to target cells.

Whitney et al. Cell 2015 Oct 22; 163(3):607-19

This paper by Whitney and colleagues describes the function of Tse6, a Pseudomonas aeruginosa toxin that is delivered to neighboring recipient cells via type VI secretion. Tse6 forms a complex with elongation factor Tu of the recipient cell to enter through the inner membrane; once inside, it depletes NAD(P)+ through catalysis. This article provides an example of how complex the interactions between bacterial cells can be; P. aeruginosa alters the energy state of recipient cells competing for resources using a system that a) needs the donor and recipient cells to be in physical contact; b) requires type VI secretion to deliver the toxin to the periplasm; and c) requires Tse6 to form a complex with an essential and conserved host protein to gain access to the cytosol.


Targeted proteomics identifies liquid-biopsy signatures for extracapsular prostate cancer.

Kim et al. Nat Commun 2016; 7:11906

This article by Kim and colleagues describes a rigorous study developing prostate cancer biomarkers. They applied quantitative, targeted mass spectrometry on a 74-patient cohort to identify a panel of 34 candidate peptides. Importantly, the performance of the biomarkers was evaluated in a larger, independent cohort consisting of 207 patients. This paper is a worthwhile read for anybody developing biomarkers using mass spectrometry.


Rapid antibiotic-resistance predictions from genome sequence data for Staphylococcus aureus and Mycobacterium tuberculosis.

Bradley et al. Nat Commun 2015 Dec 21; 6:10063

This article by Bradley and colleagues describes a software application, Mykrobe predictor, that uses DNA sequencing data for fast and accurate antibiotic resistance profile predictions. While the ideas are not new, Mykrobe predictor is both faster and more accurate than previous software and, importantly, can detect minor resistant populations. Both demonstrated models (S. aureus and M. tuberculosis) show significant overall speed increases compared to the technologies in use today. As vastly more sequencing data become available over time, resource-efficient tools like Mykrobe predictor will become important to survey these datasets and track the spread of resistance genes; this will be of importance in our efforts to avoid entering the looming post-antibiotic era.


Single-molecule sequencing to track plasmid diversity of hospital-associated carbapenemase-producing Enterobacteriaceae.

Conlan et al. Sci Transl Med 2014 Sep 17; 6(254):254ra126

The looming threat of the post-antibiotic era urgently calls for studying the spread of both virulent strains and plasmids; plasmids can play a critical role as they can carry antibiotic resistance genes. This article by Conlan and colleagues describes a thorough study where long-term in-patients and their environment were regularly surveyed using a sequencing technique well suited to sequencing complete plasmids. Their findings reveal a complex story described at length. Key findings include a surprising diversity among the isolated plasmids and a lower-than-expected rate of horizontal transfer. In general, the complexity that this article hints at might mean that additional effort needs to be invested to identify important reservoirs in society at large, not just in the health-care institutions that were studied here.


Unusual biology across a group comprising more than 15% of domain Bacteria.

Brown et al. Nature 2015 Jul 09; 523(7559):208-11

Our inability to culture a large fraction of all bacterial strains is biasing our information about and understanding of the bacterial communities that colonize our environment. Technological advances, such as growing bacteria in their natural environment and directly sequencing environmental samples, are partly mitigating this. In this particular study, the authors isolated tiny cells from an aquifer and sequenced them. Analysis of the resulting genomes revealed that these bacteria seem to belong to uncharacterized phyla and that they lack important metabolic pathways. They also had unusual introns in genes such as 16S rRNA, and the authors crucially observed that many of these genomes would not be detected using standard 16S rRNA gene amplicon surveys.


Origins of major archaeal clades correspond to gene acquisitions from bacteria.

Nelson-Sathi et al. Nature 2015 Jan 01; 517(7532):77-80

Understanding bacterial evolution and speciation is challenging because rapid adaptation through point mutations or other DNA alterations is accompanied by the influx of foreign DNA via lateral gene transfers. Sequenced genomes are snapshots in time and offer little direct information on the evolutionary processes involved in shaping them. Studying large cohorts of genomes can reveal some of the history shaping individual genomes. Nelson-Sathi and colleagues use this approach in this study and were able to work out the origin of a subset of genes transferred laterally between archaeal bacteria and eubacteria. The authors observed that the transfers more commonly originated from the eubacteria. Finally, the authors found that these transfers seem to coincide with speciation. This paper represents a step forward in understanding bacterial evolution and speciation and is a worthwhile read.


A new antibiotic kills pathogens without detectable resistance.

Ling et al Nature. 2015 Jan 22;517(7535):455-9

This exceptional paper by Ling and colleagues describes the results of a relatively simple, yet remarkably powerful strategy to grow bacteria in their natural environment and screen for desirable properties of the resulting cultures. In this case, the authors detect and characterize a new antimicrobial compound produced by a previously uncultured bacterium related to the Aquabacteria. The compound, teixobactin, might allow us to avoid the catastrophic consequences of pan-resistant bacteria becoming more prevalent; perhaps more importantly, the strategy used is generally applicable and will provide an opportunity to find more antimicrobial compounds.


Genome-wide identification of genes required for fitness of group A Streptococcus in human blood.

Le Breton et al Infect Immun. 2013 Mar;81(3):862-75

The paper by Le Breton et al. describes a high-throughput technology to find genes that are important in a clinically relevant selection experiment. The authors disrupt random genes in the genome using transposons, then grow the pool of mutants in rich growth media to reduce or remove any transposon insertions that affect growth. Next, they grow the pool in three successive four-hour incubations in blood donated from three individuals and compare the relative abundance of each mutant in the output pool with that in the input pool. The authors show that the genes found to reduce viability in blood are highly relevant for transporting and metabolizing nutrients in blood (high in peptides and fat, low in carbohydrates and nucleotides), or are involved in protecting the bacteria from antimicrobial factors. One of the key concepts highlighted in this paper is the strategy to remove mutants with growth defects in rich media, as these otherwise show up as positives in the screen despite having little to do with fitness in blood, a highly relevant parameter in clinical settings.
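
The comparison between input and output pools can be sketched as a per-gene log2 fold change of normalized read counts. This is a generic illustration rather than the authors' actual pipeline; the gene names, counts, pseudocount, and function name are hypothetical.

```python
import math


def log2_fold_changes(input_counts, output_counts, pseudocount=1.0):
    """Per-gene log2(output fraction / input fraction) from insertion read counts."""
    input_total = sum(input_counts.values())
    output_total = sum(output_counts.values())
    changes = {}
    for gene, in_count in input_counts.items():
        out_count = output_counts.get(gene, 0)
        # Normalize to pool size; the pseudocount avoids log(0) for lost mutants.
        in_frac = (in_count + pseudocount) / input_total
        out_frac = (out_count + pseudocount) / output_total
        changes[gene] = math.log2(out_frac / in_frac)
    return changes


# Hypothetical read counts per disrupted gene (placeholder gene names).
input_pool = {"geneA": 500, "geneB": 480, "geneC": 510, "geneD": 510}
output_pool = {"geneA": 490, "geneB": 30, "geneC": 520, "geneD": 960}

for gene, lfc in sorted(log2_fold_changes(input_pool, output_pool).items()):
    print(f"{gene}: log2 fold change = {lfc:+.2f}")
```

A strongly negative value, as for the placeholder `geneB`, marks a mutant that was depleted during selection and hence a gene that may be needed for fitness in that condition.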


Molecular anatomy of a trafficking organelle.

Takamori et al Cell. 2006 Nov 17;127(4):831-46

We are inching closer to atomic-resolution models of entire molecular systems, and hybrid structural approaches combining two or more orthogonal technologies are gaining popularity. The paper by Takamori and colleagues is a landmark paper that combined pseudo-atomic models of proteins and membranes with quantitative data to seed a starting model. They then used molecular dynamics to minimize the energy of the system and arrived at a model of a synaptic vesicle. While the model is not predictive, it provides a graphical model of the entire system, and this, of course, can result in new insights.


Epistasis and allele specificity in the emergence of a stable polymorphism in Escherichia coli.

Plucain et al Science. 2014 Mar 21;343(6177):1366-9

The role of epistasis in evolution is becoming increasingly well supported; to understand the role of a given single-nucleotide polymorphism (SNP), one needs to consider all the epistatic interactions, many of which remain unknown. In this paper, Plucain and co-authors provide further evidence of epistasis as they explore the mutations underlying the stable coexistence of two strains of Escherichia coli. By comparing the fitness (through an invasion assay) of several mutations, they could quantitatively show that the effect of a mutation depends strongly on the genetic background.


Bacterial cell wall. MurJ is the flippase of lipid-linked precursors for peptidoglycan biogenesis.

Sham et al Science 2014 Jul 11;345(6193):220-2

Nearly all bacteria synthesize a cell wall that protects them from the environment. Punching holes in the cell wall or inhibiting its biosynthesis kills bacteria, and it is therefore no surprise that several bactericidal antibiotics target the cell wall and its biosynthesis. Detailed understanding of this machinery gives us more opportunities to disrupt it. Here, Sham and colleagues report that MurJ is an essential flippase in Escherichia coli. The flippase transports lipid II from the cytosol into the periplasmic space. While MurJ has been a suspect for a while, the data presented in this paper make a compelling case that this is indeed true.


Functional discovery via a compendium of expression profiles.

Hughes et al Cell. 2000 Jul 7;102(1):109-26

This paper by Hughes and colleagues is important for numerous reasons, in addition to the many new insights into yeast biology that it reported. The paper was cited over 2400 times between the years 2000 and 2013, and it is still frequently cited, with over 80 citations in 2013. The basic idea of the study was to create a compendium of transcription profiles measured on yeast strains with specific genes knocked out. The compendium was then used to generate hypotheses about the functions of uncharacterized gene knock-outs or chemical perturbations. What set this study apart was the use of 67 control experiments, which were used to create error models accounting for gene-specific fluctuations. These error models allowed the authors to detect relatively small changes in genes that are usually transcribed stably and to down-weight changes in highly volatile genes. Despite the success of this study, gene-specific error models remain little used, most likely because of the cost involved. I believe that gene-specific error models bring a lot of power to studies and hope that their use will increase as the cost of many high-throughput experiments is decreasing.
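The idea behind a gene-specific error model can be sketched as follows. This is a simplified illustration, not the error model used by Hughes and colleagues: the function name, the `min_sd` floor and the toy values are assumptions. Each gene's change in a treated sample is scaled by the spread that the same gene shows across control experiments.

```python
import statistics


def gene_specific_z_scores(control_log_ratios, treatment_log_ratios, min_sd=1e-6):
    """Scale each gene's log ratio by that gene's variability in controls.

    control_log_ratios: gene -> list of log ratios from control experiments.
    treatment_log_ratios: gene -> log ratio in the perturbed sample.
    min_sd (an assumption) guards against a zero standard deviation.
    """
    scores = {}
    for gene, value in treatment_log_ratios.items():
        controls = control_log_ratios[gene]
        center = statistics.mean(controls)
        spread = max(statistics.stdev(controls), min_sd)
        scores[gene] = (value - center) / spread
    return scores


# Toy data: the same raw change of 0.5 in a stable and a volatile gene.
controls = {
    "stable_gene": [0.0, 0.05, -0.05, 0.02, -0.02],
    "volatile_gene": [1.0, -1.0, 0.8, -0.9, 0.1],
}
scores = gene_specific_z_scores(
    controls, {"stable_gene": 0.5, "volatile_gene": 0.5}
)
```

With these toy values the stable gene receives a much larger score than the volatile gene, although both changed by the same raw amount.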


Evolutionary pathway to increased virulence and epidemic group A Streptococcus disease derived from 3,615 genome sequences.

Nasser et al Proc Natl Acad Sci U S A. 2014 Apr 29;111(17):E1768-76

Nasser and colleagues present the largest bacterial genome sequencing study to date: 3,615 clinical isolates of a human pathogen. The earliest isolates were collected in the 1920s and the latest in 2013. The data allowed the authors to pinpoint key evolutionary events in time and could thus shed light on the factors that led to a sharp increase in infections in the 1980s. Analysis of single nucleotide polymorphisms and mobile genetic elements provides insight into which genes are under high evolutionary pressure, and this information can be used to select proteins that are likely to play key roles in the infection process.


The innate growth bistability and fitness landscapes of antibiotic-resistant bacteria.

Deris et al Science. 2013 Nov 29;342(6162):1237435

Detailed understanding of the mechanisms of antibiotic drug resistance is crucial when developing new strategies to address this global and growing problem. Deris and colleagues present a quantitative model, backed by experimental data, showing that resistance is coupled to the growth state of the bacterium. Importantly, the model shows and explains that growth rate at certain drug concentrations is bistable: growth of individual bacteria is either completely absent, i.e. the bacterium is in a persistent state, or not affected much.


Naturally occurring single amino acid replacements in a regulatory protein alter streptococcal gene expression and virulence in mice.

Carroll et al J Clin Invest. 2011 May;121(5):1956-68

Large genome sequencing projects of many strains of the same bacterial species show that some proteins are significantly more mutated than is expected by chance. In this paper, Musser and colleagues follow up on a large genome sequencing study by investigating the consequences of non-synonymous mutations of one of the most mutated genes, ropB. Interestingly, RopB regulates, among many targets, various virulence factors. The study is interesting in that it found that RopB regulates different target genes depending on the isoform and genetic background and that this has a significant effect on the virulence. This hints at a complicated relationship between specific isoforms of the regulatory protein and the promoter regions.


Activated ClpP kills persisters and eradicates a chronic biofilm infection.

Conlon et al Nature. 2013 Nov 21;503(7476):365-70

Antibiotics are not always equally effective in killing all bacteria underlying an infection, even when the bacteria are close to genetically identical. One of the most studied causes of this is persisters, bacteria in a low-activity or dormant phenotypic state, where the low energy levels of the cell render the antibiotics less effective. Conlon et al. show that activating a protease using the acyldepsipeptide antibiotic ADEP4 in combination with other antibiotics effectively cleared infections, whereas the individual antibiotics by themselves were unable to clear the infection completely. This is explained by the drugs targeting different sub-populations. Importantly, Conlon et al. showed that they were able to cure deep-rooted infections in a mouse model with the drug combination. Targeting all sub-populations is important, as described here, and should be systematically investigated as more facets of infections are described and more possibilities to treat infection become available.


High-resolution mapping of the spatial organization of a bacterial chromosome.

Le et al Science. 2013 Nov 8;342(6159):731-4

Chromosome conformation capture, together with deep sequencing, is a technology that can measure DNA-DNA interactions in vivo. Le et al. report on the structure of a bacterial chromosome during cell cycle progression and compare wild-type cells with various perturbations. These perturbations include treatment with rifampicin and knockouts of proteins known to have an impact on the organization of chromosomes, such as the histone-like protein HU and the structural maintenance of chromosomes (SMC) protein. The authors observe several important features: the chromosome seems to have regions with many local interactions, called chromosomal interaction domains (CIDs), separated by regions with few interactions. Regions with few interactions correlate with highly expressed genes, and this is supported by the observation that these low-interaction regions seem to disappear upon treatment with rifampicin. These findings give an insight into the organization of bacterial chromosomes but, perhaps more importantly, they set the stage to test which perturbations have an impact on chromosome organization in a more systematic way.


Genomically recoded organisms expand biological functions.

Lajoie et al Science. 2013 Oct 18;342(6156):357-60

Genomically recoded organisms (GROs) are likely to 1) be resistant to viral infections and 2) be resistant to horizontal gene transfer, both in terms of incorporating genes from the environment and in terms of making engineered traits available to naturally occurring organisms. Both features are useful when engineering organisms for use outside of labs. Lajoie and colleagues recoded a nonsense codon in Escherichia coli to a sense codon, which allowed for the deletion of a release factor, and showed that one of two phages had attenuated virulence in the GRO compared to wild type. The recoding was accomplished on a relatively small budget, and it is likely that more heavily modified organisms will be created in the near future. This work indicates that they will be even more resistant to infections and, by extension, to horizontal gene transfer.


Use of collateral sensitivity networks to design drug cycling protocols that avoid resistance development.

Imamovic and Sommer Sci Transl Med. 2013 Sep 25;5(204):204ra132

This paper demonstrates that anti-microbial resistance against one antibiotic can lead to so-called collateral sensitivity to antibiotics of one or more different antibiotic classes. The authors systematically mapped out the collateral sensitivity relationships for 23 antibiotics by first eliciting resistance in wild-type Escherichia coli by increasing drug concentration over time until the resistant bacteria were growing faster than wild-type bacteria. The growth rate of the resistant bacteria was then compared to that of wild-type bacteria with a different drug present. Slower-growing resistant bacteria meant that bacteria resistant to the first drug were collaterally sensitive to the second drug. The authors defined optimal cycles of drugs that can be used to increase the overall drug efficacy. Importantly, two clinical isolates were shown to have largely the same pattern of collateral sensitivity as the laboratory strains. Antibiotic resistance is a growing clinical problem, and strategies like the one described in this paper can alleviate some of these problems without the need for developing and approving new drugs.
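The growth-rate comparison described above can be sketched in a few lines. This is an illustration only and not the authors' analysis: the function name, the 10% tolerance, the labels and the toy growth rates are assumptions.

```python
def classify_second_drug_response(wild_type_growth, resistant_growth, tolerance=0.1):
    """Label each second drug by comparing resistant and wild-type growth rates.

    Both arguments map a drug name to a growth rate measured in that drug.
    A resistant strain that grows clearly slower than wild type is labeled
    collaterally sensitive; the tolerance band is an assumption.
    """
    labels = {}
    for drug, wt_rate in wild_type_growth.items():
        ratio = resistant_growth[drug] / wt_rate
        if ratio < 1.0 - tolerance:
            labels[drug] = "collaterally sensitive"
        elif ratio > 1.0 + tolerance:
            labels[drug] = "grows faster than wild type"
        else:
            labels[drug] = "no clear change"
    return labels


# Toy data: a strain resistant to a first drug, tested in two other drugs.
labels = classify_second_drug_response(
    {"drug_B": 1.0, "drug_C": 1.0},
    {"drug_B": 0.5, "drug_C": 1.02},
)
```

Repeating such a comparison for every pair of drugs would give a table of relationships from which drug cycles could be chosen.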


Structural systems biology evaluation of metabolic thermotolerance in Escherichia coli

Chang et al Science. 2013 Jun 7; 340(6137):1220-3

Chang et al. combined iJO1366, a network model of Escherichia coli metabolism, with experimental and predicted protein structures. They used the model to predict temperature-sensitive enzymes catalyzing key reactions that limit growth rates at supra-optimal temperatures. The predictions were tested experimentally and reflected the experimental data reasonably well. This paper is an important contribution to the field of structural systems biology and it is likely that the strategies presented here and variations thereof will play a bigger role as structural coverage of model organisms increases.


A "dock, lock, and latch" structural model for a staphylococcal adhesin binding to fibrinogen.

Ponnuraj et al Cell. 2003 Oct 17;115(2):217-28

This important paper describes a novel structural mechanism for how a surface-associated bacterial protein, the receptor, can achieve a high binding affinity to a human molecule, the ligand, even in cases where the bound ligand's amino acid sequence segment does not optimally fit into the binding pocket. The authors call this the dock-lock-and-latch mechanism, and it depends on the ligand binding in a groove between two domains. A beta-strand (the latch) is then donated from one of the domains, N3, and becomes strand-paired on the second domain, N2. The authors further show that the particular receptor studied, SdrG, has a higher affinity for peptides other than the co-crystallized fibrinogen peptide. This can be the result of SdrG binding to molecules other than fibrinogen. It is tempting to speculate that this could also be a mechanism for the receptor to bind tightly to several fibrinogen isoforms likely to be present in the human population. Finally, the authors described the conserved sequence motif for the latch and used it to show that the dock-lock-and-latch mechanism is present in several other bacteria.


Stabilization of cooperative virulence by the expression of an avirulent phenotype

Diard et al Nature. 2013 Feb 21;494(7437):353-6

Pathogens that produce and secrete expensive virulence factors (producers) often grow slower than individuals in the same population that do not produce them, whether genetically identical ‘non-producers’ or similar individuals such as ‘defectors’ that are unable to produce the virulence factors. Evolutionary theory states that the producers will eventually be outcompeted, given that the producers and others within the same population benefit equally from the secreted factors. Diard and colleagues demonstrate that carefully controlled bistable expression of these virulence factors provides a means to keep the defectors at bay long enough to allow the transmission of the more virulent wild type, despite its slower growth.


An integrated map of genetic variation from 1,092 human genomes.

Abecasis et al Nature. 2012 Nov 1;491(7422):56-65

The 1000 Genomes Project aims to map genomic variation in humans by low-coverage whole-genome and deep-coverage exome sequencing of over 1000 genomes selected from 14 distinct populations across the world. The scale of the project allows single nucleotide polymorphisms (SNPs) present in < 1% of the population to be mapped confidently, and this in turn allows the team to estimate how local these mutations are and, to some extent, to infer human migrations between the various geographical locations. This paper describes how the data were processed to ensure high quality and gives some examples of how the data can be used. Among the highlights are 38 million SNPs, 1.4 million small insertions and deletions and more than 14,000 larger deletions. Additionally, they describe some global trends that can be read out of the data set. All of the project data are available on the web both for download and through various other means, such as a custom Ensembl browser. The data in the project are a treasure trove for anybody interested in the evolution of human genes and genomes and provide a valuable resource to which one's own data can be compared.


Global landscape of HIV-human protein complexes.

Jaeger et al Nature. 2011 Dec 21;481(7381):365-70

Viruses rely on hijacking various cellular functions of the host cell to ultimately produce and release new copies of the virus. Mechanistically, this hijacking is presumably heavily reliant on protein-protein interactions, where virus proteins physically interact with human proteins to alter their function. In this paper, the complete interactome between all HIV proteins and the proteins from two cell lines is presented. These interactions can be used in many ways, such as gaining an understanding of which interactions likely play a key role and which protein-protein interfaces are important. This paper is the first complete interaction map between a pathogen and its host, making it an important contribution to the field of infectious diseases.


Epistasis as the primary factor in molecular evolution.

Breen et al Nature. 2012 Oct 25;490(7421):535-8

Molecular evolution is the study of the evolution of RNA, DNA and proteins. Several studies have reported that the evolutionary rate of amino acid substitutions cannot be explained without including the genetic background of the organism. In other words, the evolutionary rate of individual amino acids is dependent on long-range contacts within the genome, a phenomenon often referred to as epistasis. In this study, the authors use deep sequence alignments to compare the expected evolutionary rate to the observed rate and conclude that the actual rate is indeed much smaller than that predicted, indicating that the evolution of these molecules is constrained by the genetic background. However, it is worth noting that the proteins in this study are all highly conserved and therefore might not be representative of proteins in general.


The dynamics of cooperative bacterial virulence in the field.

Raymond et al Science. 2012 Jul 6;337(6090):85-8

Bacterial populations undergo rapid adaptation when under selective pressure. These adaptations can be genetic (mutations, for example), but they can also be changes in the relative prevalence of one genetic trait compared to another, as sub-populations might experience different selective pressures depending on population attributes such as density. Secreted virulence factors, exotoxins for example, can be expensive to produce but benefit all bacteria occupying the same clinical site. It is therefore conceivable that bacteria in a population can benefit from not producing the toxin and instead rely on their neighbors. This paper demonstrated that there is an optimal ratio between bacteria producing the toxin and bacteria that are not, and that this ratio is reached from different starting points (under constant density). Selective pressure on an individual is hence dependent on population density and composition: bacteria not producing toxins would be at an advantage at high density but at a disadvantage at lower densities, assuming the relative fraction of producers and non-producers is the same.


Population genomics of early events in the ecological differentiation of bacteria.

Shapiro et al Science. 2012 Apr 6;336(6077):48-51

This article demonstrates how genomic traits spread through bacterial populations in oceans, and the authors find that certain genetic elements sweep through the populations similarly to how genetic traits sweep through populations of sexual eukaryotes. The authors conclude that ecological differentiation is driven by gene-specific sweeps and not genome-wide sweeps. These findings are important as they demonstrate that evolution in bacteria is a complex interplay between mutations, selection, and specific and unspecific exchange of genetic material.


High-throughput decoding of antitrypanosomal drug efficacy and resistance.

Alsford et al Nature. 2012 Jan 25;482(7384):232-6

The authors of this paper utilized high-throughput RNA interference (RNAi) in combination with the five drugs used to treat human African trypanosomiasis (sleeping sickness), which is caused by African trypanosomes, to reveal the proteins involved in the drug mechanisms. The underlying mechanism of the drug efficacy is partly unknown. This paper provides deep insight into the protein networks involved by using high-throughput RNAi to knock down genes in a systematic fashion while monitoring pathogen viability. Proteins involved in the drug action could be identified because knocking them down increased the viability of the protozoa. The paper nicely demonstrates how to use high-throughput technologies to arrive at relatively detailed information on how a particular drug is operating.


Evolution of increased complexity in a molecular machine.

Finnigan GC et al Nature.2012 Jan 19; 481(7381):360-4

This paper provides credible evidence on how complexity in protein complexes can increase based on high-probability mutations and gene duplications. Plausible ancient proteins were re-created and shown to functionally replace the more complex modern version of the yeast vacuolar-type H(+)-ATPase (V-ATPase) proton pump.

Protein complexes carry out fundamental processes in all living systems. Experimentally verified accounts of how these complexes evolve are rare since the ancestral proteins are lost. This paper used maximum likelihood phylogeny to identify key mutations and then reconstruct plausible ancestral proteins. The authors show, through a simple functional assay, how an ancient gene duplication and subsequent evolution altered the structure of the yeast V-ATPase proton pump from a two-paralog hexamer to a three-paralog hexamer in which all three paralogs are essential.


Chromosome organization by a nucleoid-associated protein in live bacteria.

Wang W et al Science.2011 Sep 9; 333(6048):1445-)

This paper demonstrates the application of super-resolution microscopy to elucidate the spatial distribution of chromosome-organizing proteins in live bacterial cells that are too small to study with more conventional microscopy methods.

Cells display a high degree of organization and yet re-organize themselves in response to various stimuli. Elucidating the mechanisms underlying this organization and its dynamics is likely required to achieve a deeper understanding of how cells function. The super-resolution microscopy technology used in this paper allows the authors to study how the bacteria organize their chromosome and to demonstrate that this organization has an impact on gene expression. This paper is important as it directly links the spatial distribution of a protein to its function. We believe that super-resolution microscopy will contribute valuable knowledge as the technology becomes more readily available.


Phenotypic landscape of a bacterial cell.

Nichols RJ et al Cell.2011 Jan 7; 144(1):143-56

This paper systematically screens 3979 knock-out strains of Escherichia coli under 324 conditions for growth phenotypes. This large dataset, which is available for download, is an invaluable resource for anybody interested in environmental perturbation/protein interactions.

This work describes a systematic study in which 3979 gene knockouts are tested under 324 conditions for growth phenotypes. The authors attempt to be as inclusive as possible in both relevant conditions and gene knockouts. This makes the resulting data set a unique resource for hypothesis generation. The simple readout of growth phenotype is surprisingly informative and is currently one of the few technologies accessible to studies of this magnitude, where almost 1.3 million environmental perturbation/gene interactions were tested. Focused follow-up experiments, where selected conditions are tested by more involved technologies such as transcript or proteomic analyses, will be easier to design and hence benefit from this study.


Distinct signatures of diversifying selection revealed by genome analysis of respiratory tract and invasive bacterial populations.

Shea PR et al Proc Natl Acad Sci U S A.2011 Mar 22; 108(12):5039-4)

Large-scale full-genome sequencing of the pathogen Streptococcus pyogenes reveals that strains causing pharyngitis can also become invasive.

S. pyogenes is a common pathogen that causes pharyngitis but can also become invasive, in which case the mortality rate increases significantly. Musser and colleagues genome-sequenced 86 strains isolated from pharyngitis patients and over 200 invasive strains isolated in the same geographical area at the same time. The comparison revealed that all four primary genetic lineages identified in the study caused pharyngitis and all four could also become invasive. Furthermore, the study revealed that certain genes diverged more strongly between the pharyngitis isolates and the invasive isolates than expected by chance. This can be explained by different selective pressures exerted on the bacteria in the different anatomical sites. This study gives insight into disease progression and also provides information on genes that play key roles for survival in the distinct anatomical sites, all using a technology that is rapidly becoming accessible to all.


Second-order selection for evolvability in a large Escherichia coli population.

Woods RJ et al Science.2011 Mar 18; 331(6023):1433-6

This paper shows that genomic epistasis plays a key role in evolution where initially favorable mutations alter the effect of subsequent synergistic mutations.

Fitness in bacteria can be evaluated by the competitiveness of one sub-strain among many under a certain condition. Descendants of the fittest bacteria will constitute the majority in a population after enough time has passed. In this paper, initially less-fit bacteria (eventual winner [EW]) eventually achieved higher fitness than bacteria that were initially more fit (eventual loser [EL]). The authors show that this is because EL bacteria cannot benefit from a mutation in spoT to the same degree as EW. The paper convincingly attributes this to epistasis: in the genetic background of EL (specifically, in the gene topA), the spoT mutation is less beneficial than it is in EW.

In other words, EW has a higher degree of evolvability or evolutionary potential compared to EL under the tested condition. This is an important finding with implications for, among other fields, directed evolution experiments, as early beneficial mutations might exclude parts of the evolutionary space where even more-fit bacteria could be possible.


The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA.

Garneau JE et al Nature.2010 Nov 4; 468(7320):67-71

This important paper shows that the adaptive immune system of some bacteria can cut double-stranded DNA (dsDNA) sequences, and that the cut site is specified by a short nucleic acid sequence.

The 'clustered regularly interspaced short palindromic repeats' (CRISPR)/CRISPR-associated (Cas) immune system of Streptococcus thermophilus can acquire specific CRISPR spacers from various sources, such as self-replicating plasmids. Plasmids that match these spacers are cut at a specific site determined by the spacer, producing blunt ends. The now linear plasmid is rapidly lost. This is important because it can be exploited by manipulating the CRISPR of bacteria so that they cannot acquire plasmids that, for example, carry antibiotic resistance genes. This paper also shows that the CRISPR/Cas system cleaves dsDNA at specific sites. It is plausible that this system can also be exploited in the future to develop technology in which specifying where to cut DNA is as simple as synthesizing a matching CRISPR spacer.


Systematic analysis of human protein complexes identifies chromosome segregation proteins.

Hutchins JR et al Science.2010 Apr 30; 328(5978):593-9

In this exciting work, high-throughput technologies were used to characterize protein complexes involved in chromosome segregation, and new members of these complexes were identified and verified.

Proteins organize into complexes to perform many cellular tasks. Knowledge about the composition and dynamics of these complexes will further the understanding of fundamental processes in biology. In this study, protein complexes involved in chromosome segregation were characterized using a combination of high-throughput imaging and immunoprecipitation (IP)-based mass spectrometry. Some of the more interesting findings were followed up and shown to be correct. The authors were able to use a genome-wide RNA interference (RNAi) screen (from which testable hypotheses are difficult to generate) to identify potential targets, process all of them with high-throughput technologies and generate hypotheses that could be tested using traditional biochemical approaches. They tackled a well-studied system and came up with novel information.


Evolution of MRSA during hospital transmission and intercontinental spread.

Harris SR et al Science.2010 Jan 22; 327(5964):469-74

Genome sequencing of methicillin-resistant Staphylococcus aureus (MRSA) from multiple temporal and spatial isolates has revealed information about infection paths and mutation rates.

Antibiotic-resistant bacteria are a growing global health problem. An obvious strategy to prevent the spread of such strains is to reduce transmission between individuals. This ultimately relies on our understanding of the transmission paths. In this paper, multiple strains were sequenced and single-nucleotide polymorphisms (SNPs) were analyzed in great detail, generating a high-resolution dendrogram. The study demonstrates how genome sequencing of bacterial isolates (isolated across time and space) of a particular strain of MRSA, TW20, provides a high-resolution map of how bacteria spread between patients at the same hospital and provides information about intercontinental spread. The applied technology lays the foundation for investigating how the bacteria spread long after the outbreak has occurred, provided that a sufficient number of isolates were collected and catalogued.


Structure and mechanisms of a protein-based organelle in Escherichia coli.

Tanaka S et al Science.2010 Jan 1; 327(5961):81-4

In this study, X-ray structures of four microcompartment proteins that are the components of the protein-based organelle capsule explain how bent conformations are achieved and reveal an active gating function.

There is growing evidence that bacteria are, in fact, not devoid of internal compartmentalization but are rather highly organized. For example, bacterial proteins are to a large extent organized in protein complexes {1} and are sometimes localized to specific regions of the cytosol {2}; in addition, there is also evidence for protein-encapsulated organelles (see ref {3}, on which Ruedi Aebersold is an author). In this paper, Tanaka and colleagues reveal how one of the shell proteins has a bent conformation which allows these microcompartments to close. They also show that there is an active gate that presumably regulates the in- and out-flow of various molecular species. This paper adds to the evidence that the proteomes of bacteria are highly structured and compartmentalized.

References: {1} Kuehner et al. Science 2009, 326:1235-40. {2} Shapiro et al. Science 2009, 326:1225-8. {3} Kerfeld et al. Science 2005, 309:936-8.