The Archaeal Proteome Project (ArcPP) is a community effort that works towards a comprehensive analysis of archaeal proteomes.
Modern proteomics approaches can explore whole proteomes within a single mass spectrometry (MS) run. However, the enormous amount of MS data generated often remains incompletely analyzed, because comprehensive analysis requires sophisticated bioinformatic tools and expertise from a diverse array of fields. In microbiology in particular, efforts to combine large-scale proteomic datasets have so far largely been missing. Thus, despite their relatively small genomes, the proteomes of most archaea remain incompletely characterized. This in turn limits our understanding of archaeal cell biology.
Therefore, we have initiated the ArcPP, which collects a diverse set of MS datasets, analyzes them comprehensively with state-of-the-art bioinformatic tools, and draws on expert knowledge from a broad range of fields to interpret the results. Starting with the model archaeon Haloferax volcanii, we have:
- reanalyzed more than 26 million spectra
- optimized the analysis using parameter sweeps, multiple search engines implemented in Ursgal, and the combination of results through the combined PEP approach
- thoroughly controlled false discovery rates (FDR) for high-confidence protein identifications using the picked protein FDR approach and limiting the FDR to 0.5%
- identified more than 45,000 peptides, corresponding to 3,069 proteins (>75% of the proteome) with a median sequence coverage of 55%
- identified the largest archaeal glycoproteome described so far
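The picked protein FDR step listed above can be illustrated with a minimal sketch. This is not the ArcPP or Ursgal implementation; it only shows the general idea of the picked approach, assuming each target protein has one paired decoy and that a higher score is better. The function name `picked_protein_fdr`, its arguments, and the example scores are hypothetical.

```python
from typing import Dict, List, Tuple


def picked_protein_fdr(
    target_scores: Dict[str, float],
    decoy_scores: Dict[str, float],
    fdr_threshold: float = 0.005,  # 0.5%, the limit quoted above
) -> List[str]:
    """Return target proteins accepted at the given picked protein FDR.

    Assumptions (illustrative only): scores are "higher is better", and
    decoy_scores is keyed by the ID of the target protein it pairs with.
    """
    picked: List[Tuple[float, bool, str]] = []  # (score, is_decoy, protein)
    for protein in set(target_scores) | set(decoy_scores):
        t = target_scores.get(protein)
        d = decoy_scores.get(protein)
        # "Picking": keep only the better-scoring member of each pair.
        # Ties go to the decoy, which is the conservative choice.
        if d is None or (t is not None and t > d):
            picked.append((t, False, protein))
        else:
            picked.append((d, True, protein))

    # Best score first; at equal scores, decoys are ranked before targets.
    picked.sort(key=lambda entry: (-entry[0], not entry[1], entry[2]))

    # Running FDR estimate: decoys seen so far / targets seen so far.
    n_targets = 0
    n_decoys = 0
    fdrs: List[float] = []
    for _, is_decoy, _ in picked:
        if is_decoy:
            n_decoys += 1
        else:
            n_targets += 1
        fdrs.append(n_decoys / max(n_targets, 1))

    # q-value: the lowest FDR at this rank or at any lower-scoring rank.
    q_values = fdrs[:]
    for i in range(len(q_values) - 2, -1, -1):
        q_values[i] = min(q_values[i], q_values[i + 1])

    return [
        protein
        for (_, is_decoy, protein), q in zip(picked, q_values)
        if not is_decoy and q <= fdr_threshold
    ]


# Hypothetical scores: protein C loses to its decoy and is discarded.
example_targets = {"A": 10.0, "B": 8.0, "C": 3.0, "D": 1.0}
example_decoys = {"A": 2.0, "B": 1.0, "C": 5.0, "D": 0.5}
accepted = picked_protein_fdr(example_targets, example_decoys)
```

Because only one member of each target/decoy pair survives, decoys do not accumulate simply because the database is large, which is the motivation usually given for the picked approach on large datasets.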
You can explore these results through an interactive web database or in several other ways:
- All result files can be downloaded for further processing here
- All scripts used for the analysis are available on GitHub, where you can also find new information about additional datasets, metadata and further analyses.
- This work has been published in: Schulze et al. (2020) Nature Communications 11
- The identification of the largest archaeal glycoproteome is described in: Schulze et al. (2021) PLOS Biology
With this established bioinformatic infrastructure, we have set the stage for further analyses, including proteogenomics as well as the characterization of various post-translational modifications. Furthermore, ArcPP will integrate quantitative results obtained from the individual datasets in order to identify common regulatory mechanisms. While our work has so far focused on the H. volcanii proteome, we plan to integrate results from a diverse range of archaea. If you want to contribute to this community effort, please contact us and check out our GitHub page.
Please contact us with any questions, contributions or issues. Feel free to open issues and pull requests on our GitHub page or contact us at email@example.com