Diagenode

MGcount: a total RNA-seq quantification tool to address multi-mapping and multi-overlapping alignments ambiguity in non-coding transcripts


Hita Andrea, Brocart Gilles, Fernandez Ana, Rehmsmeier Marc, Alemany Anna, Schvartzman Sol

Background: Total-RNA sequencing (total-RNA-seq) allows the simultaneous study of both the coding and the non-coding transcriptome. Yet, computational pipelines have traditionally focused on particular biotypes, making assumptions that are not fulfilled by total-RNA-seq datasets. Transcripts from distinct RNA biotypes vary in length, biogenesis, and function, can overlap in a genomic region, and may be present in the genome with a high copy number. Consequently, reads from total-RNA-seq libraries may cause ambiguous genomic alignments, calling for flexible quantification approaches.

Results: Here we present Multi-Graph count (MGcount), a total-RNA-seq quantification tool combining two strategies for handling ambiguous alignments. First, MGcount assigns reads hierarchically to small-RNA and long-RNA features to account for length disparity when transcripts overlap in the same genomic position. Next, using a graph-based approach, MGcount aggregates RNA products with similar sequences to which reads systematically multi-map. MGcount outputs a transcriptomic count matrix compatible with RNA-sequencing downstream analysis pipelines, with both bulk and single-cell resolution, and the graphs that model repeated transcript structures for different biotypes. The software can be used as a Python module or as a single-file executable program.

Conclusions: MGcount is a flexible total-RNA-seq quantification tool that successfully integrates reads that align to multiple genomic locations or that overlap with multiple gene features. Its approach is suitable for the simultaneous estimation of protein-coding, long non-coding and small non-coding transcript concentration, in both precursor and processed forms. Both source code and compiled software are available at https://github.com/hitaandrea/MGcount.

Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04544-3.
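
The graph-based aggregation mentioned in the Results can be pictured with a small sketch. This is not MGcount's code or API: the function names, the `read_hits` input (a hypothetical mapping from read ID to the features the read aligns to), the `min_shared` threshold, and the use of plain connected components as the grouping rule are all assumptions made for illustration. Features are nodes, an edge counts the reads shared by two features, and features linked by enough shared reads are counted as one group.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_multimap_graph(read_hits):
    """Return edge weights: number of reads aligning to both features of a pair."""
    edges = Counter()
    for features in read_hits.values():
        for a, b in combinations(sorted(set(features)), 2):
            edges[(a, b)] += 1
    return edges


def group_features(read_hits, min_shared=1):
    """Map each feature to a group root (connected components via union-find).

    Assumption: two features are merged when at least `min_shared` reads
    multi-map to both. This is a simplification chosen for the sketch.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Register every feature, including those hit only by unique reads.
    for features in read_hits.values():
        for f in features:
            find(f)
    # Merge features that share enough multi-mapping reads.
    for (a, b), n in build_multimap_graph(read_hits).items():
        if n >= min_shared:
            parent[find(a)] = find(b)
    return {f: find(f) for f in list(parent)}


def count_by_group(read_hits, min_shared=1):
    """Count one per read when all of its features fall in a single group."""
    root = group_features(read_hits, min_shared)
    members = defaultdict(list)
    for f, r in root.items():
        members[r].append(f)
    # Deterministic group label: sorted member names joined by "|".
    label = {r: "|".join(sorted(fs)) for r, fs in members.items()}
    counts = Counter()
    for features in read_hits.values():
        groups = {label[root[f]] for f in features}
        if len(groups) == 1:  # reads spanning several groups stay unassigned
            counts[groups.pop()] += 1
    return dict(counts)
```

Calling `count_by_group(read_hits, min_shared=2)` on a handful of reads where two tRNA copies share most of their alignments would report them as a single group, while a read that also touches an unrelated feature only once is left unassigned. Raising or lowering `min_shared` trades group purity against the number of reads recovered.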

Tags
D-Plex small RNA-seq

Published
January, 2022

Products used in this publication

  • Small RNA library preparation with UMI for Illumina
    C05030001
    D-Plex Small RNA-seq Kit for Illumina
  • G02030005
    RNA Data Analysis


