Improvement of detection performance of fusion genes from RNA-seq data by clustering short reads

Yoshiaki Sota*, Shigeto Seno, Hironori Shigeta, Naoki Osato, Masafumi Shimoda, Shinzaburo Noguchi, Hideo Matsuda


Research output: Article, peer-reviewed

1 citation (Scopus)


Fusion genes are involved in cancer, but their detection from RNA-Seq data remains insufficient because of the relatively short read length. Therefore, we proposed a shifted short-read clustering (SSC) method, which focuses on overlapping reads from the same loci and extends them into a representative sequence. To verify its usefulness, we applied the SSC method to RNA-Seq data from four cell lines (BT-474, MCF-7, SKBR-3, and T-47D). As the slide width of the SSC method increased to one, two, five, or ten bases, the read length was extended from 201 bases to 217 (108%), 234 (116%), 282 (140%), or 317 (158%) bases, respectively. Furthermore, fusion genes were investigated using STAR-Fusion, a fusion gene detection tool, with and without the SSC method. When reads were shifted by one base with the SSC method, the proportion of reads mapped to multiple loci decreased from 9.7% to 4.6%, and the sensitivity of fusion gene detection improved from 47% to 54% on average (BT-474: 48% to 57%, MCF-7: 49% to 53%, SKBR-3: 50% to 57%, and T-47D: 43% to 50%) compared with the original data. When the reads were shifted by more bases, the positive predictive value also improved. The SSC method could therefore be an effective approach for fusion gene detection.
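The abstract does not give the details of the SSC algorithm, so the following is only a minimal illustrative sketch of the general idea of extending a read with overlapping reads that are offset by a small number of bases. The function name `shifted_extend`, the parameter `max_shift` (standing in for the slide width), and the greedy exact-match strategy are all assumptions made for illustration, not the authors' implementation.

```python
def shifted_extend(seed, reads, max_shift):
    """Greedily extend `seed` to the right (hypothetical helper).

    A read is merged when its prefix exactly matches the current end of
    the sequence after shifting by 1..max_shift bases. Real data would
    also need mismatch tolerance, quality handling, and left extension;
    none of that is modeled here.
    """
    contig = seed
    pool = [r for r in reads if r != seed]
    progress = True
    while progress:
        progress = False
        # Try the smallest shift first, so the longest overlap wins.
        for shift in range(1, max_shift + 1):
            for read in pool:
                overlap = len(read) - shift
                if overlap <= 0 or overlap > len(contig):
                    continue
                if contig.endswith(read[:overlap]):
                    contig += read[overlap:]  # append the non-overlapping bases
                    pool.remove(read)
                    progress = True
                    break
            if progress:
                break
    return contig


# Toy example: 8-base reads taken from one locus, each shifted by one base.
locus = "ACGTTGCAAGGCTTAC"
toy_reads = [locus[1:9], locus[2:10]]
print(shifted_extend(locus[0:8], toy_reads, max_shift=1))
```

In this sketch, a larger `max_shift` accepts reads that overlap the current sequence by fewer bases, which lets the representative sequence grow further per merged read; this mirrors, in spirit only, the reported increase in extended read length as the slide width grows.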

Journal: Journal of Bioinformatics and Computational Biology
Publication status: Published - 1 June 2019

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications

