Detection of fusion genes from human breast cancer cell-line RNA-seq data using shifted short read clustering

Yoshiaki Sota, Shigeto Seno, Hironori Shigeta, Naoki Osato, Masafumi Shimoda, Shinzaburo Noguchi, Hideo Matsuda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Fusion genes are one of the mechanisms of tumorigenesis, and their identification by RNA-Seq has attracted attention. Various methods for detecting fusion genes have been proposed, but their accuracy is not sufficient. One cause of this problem is the relatively short read length of RNA-Seq data. We therefore propose a method based on shifted short-read clustering (SSC) that, before mapping, identifies shifted reads of the same origin and extends them into representative sequences. We hypothesized that this would increase the percentage of uniquely mapped reads and improve the detection rate of fusion genes. To test these hypotheses, we applied the SSC method to RNA-Seq data from three cell lines (BT-474, MCF-7, and SKBR-3). When only one base was shifted, the average read lengths of BT-474, MCF-7, and SKBR-3 were extended from 201 to 223 bases (111%), 201 to 214 bases (106%), and 201 to 213 bases (106%), respectively. We further evaluated the SSC method by comparing the results of a fusion gene detection tool, STAR-Fusion, with and without SSC applied to the reads. The percentages of uniquely mapped reads for BT-474, MCF-7, and SKBR-3 improved from 88% to 93%, 88% to 94%, and 92% to 95%, respectively. Finally, the fusion gene detection rates for BT-474, MCF-7, and SKBR-3 increased from 48% to 57%, 49% to 53%, and 50% to 53%, respectively. These results suggest that the SSC method is effective not only for improving the percentage of uniquely mapped reads but also for fusion gene detection.
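
The core idea of extending reads that are shifted copies of one another can be illustrated with a toy sketch. This is not the authors' implementation (the abstract does not give one, and the clustering step that finds candidate pairs is not shown here). The function name `merge_shifted` and its parameters `shift` and `max_mismatches` are hypothetical, chosen only for illustration. The sketch assumes two reads from the same strand, where the second read starts `shift` bases downstream of the first.

```python
from typing import Optional


def merge_shifted(read_a: str, read_b: str, shift: int = 1,
                  max_mismatches: int = 0) -> Optional[str]:
    """Extend read_a with read_b, assuming read_b starts `shift` bases
    downstream of read_a. Returns None if the overlap does not agree.

    Hypothetical helper for illustration only.
    """
    if shift < 0 or shift >= len(read_a):
        return None  # no overlap is possible at this shift

    overlap_a = read_a[shift:]             # tail of read_a
    overlap_b = read_b[:len(overlap_a)]    # head of read_b
    if len(overlap_b) < len(overlap_a):
        return None  # read_b is shorter than the overlap window

    # Count positions where the two reads disagree inside the overlap.
    mismatches = sum(x != y for x, y in zip(overlap_a, overlap_b))
    if mismatches > max_mismatches:
        return None

    # Keep read_a and append the part of read_b that sticks out past it.
    return read_a + read_b[len(overlap_a):]


# Two 10-base reads shifted by one base merge into an 11-base sequence.
extended = merge_shifted("ACGTACGTAC", "CGTACGTACG", shift=1)
print(extended, len(extended))
```

Each one-base merge lengthens the representative sequence by one base, so chaining several shifted reads of the same origin is what would produce longer sequences such as those reported above.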

Original language: English
Title of host publication: Proceedings - 2018 IEEE 18th International Conference on Bioinformatics and Bioengineering, BIBE 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 159-162
Number of pages: 4
ISBN (Electronic): 9781538662168
DOIs
Publication status: Published - 2018 Dec 6
Externally published: Yes
Event: 18th IEEE International Conference on Bioinformatics and Bioengineering, BIBE 2018 - Taichung, Taiwan, Province of China
Duration: 2018 Oct 29 → 2018 Oct 31

Publication series

Name: Proceedings - 2018 IEEE 18th International Conference on Bioinformatics and Bioengineering, BIBE 2018

Conference

Conference: 18th IEEE International Conference on Bioinformatics and Bioengineering, BIBE 2018
Country/Territory: Taiwan, Province of China
City: Taichung
Period: 18/10/29 → 18/10/31

Keywords

  • Cancer
  • Fusion gene
  • RNA-seq
  • SlideSort

ASJC Scopus subject areas

  • Genetics (clinical)
  • Health Informatics
  • Oncology
  • Biomedical Engineering
  • Cardiology and Cardiovascular Medicine
