Seminar: Modelling error in next-generation sequencing data - applications in metagenomics

2012-11-28: by Inge Jonassen, University of Bergen, Norway.

Next-generation sequencing instruments generate data in high volumes and have opened new avenues for molecular life science research. Each technology and platform has its own error characteristics, which should be taken into account in order to use the data to its full potential. Systematic analysis of pyrosequencing data from the Roche 454 platform has allowed us to construct empirical error models. These have been used to build a sequence simulator, FlowSim, that generates more realistic pyrosequencing data. An important application of next-generation sequencing, and of pyrosequencing in particular, is in metagenomics. We will discuss how error models for pyrosequencing data allow improved interpretation of metagenomic data sets. Finally, we will describe experiments in which we have performed systematic comparisons of alternative approaches to characterizing molecular diversity in real biological samples.

Balzer, Malde, Lanzen, Sharma, Jonassen. 2010. Characteristics of 454 pyrosequencing data — enabling realistic simulation with flowsim. Bioinformatics 26, i420-i425.
Balzer, Malde, Jonassen. 2011. Systematic exploration of error sources in pyrosequencing flowgram data. Bioinformatics 27, i304-i309.