TY - JOUR
T1 - Generative aptamer discovery using RaptGen
AU - Iwano, Natsuki
AU - Adachi, Tatsuo
AU - Aoki, Kazuteru
AU - Nakamura, Yoshikazu
AU - Hamada, Michiaki
N1 - Funding Information:
Computation for this study was performed in part on the NIG supercomputer at ROIS National Institute of Genetics. N.I. and M.H. thank members of Hamada Laboratory for their valuable comments. This work was supported by JST CREST, Japan (grant nos. JPMJCR1881 and JPMJCR21F1).
Publisher Copyright:
© 2022, The Author(s).
PY - 2022/6
Y1 - 2022/6
AB - Nucleic acid aptamers are generated by an in vitro molecular evolution method known as systematic evolution of ligands by exponential enrichment (SELEX). The candidates that can be identified are limited to the sequences actually observed in an experiment's sequencing data. Here we developed RaptGen, a variational autoencoder for in silico aptamer generation. RaptGen uses a profile hidden Markov model decoder to represent motif sequences effectively. We showed that RaptGen embedded simulated sequence data into a low-dimensional latent space on the basis of motif information. We also performed sequence embedding using two independent SELEX datasets. RaptGen successfully generated aptamers from the latent space even though they were not included in the high-throughput sequencing data. RaptGen could also generate a truncated aptamer with a short learning model. We demonstrated that RaptGen could be applied to activity-guided aptamer generation using Bayesian optimization. We conclude that the RaptGen generative method and its latent representation are useful for aptamer discovery.
UR - http://www.scopus.com/inward/record.url?scp=85131324974&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85131324974&partnerID=8YFLogxK
U2 - 10.1038/s43588-022-00249-6
DO - 10.1038/s43588-022-00249-6
M3 - Article
AN - SCOPUS:85131324974
SN - 2662-8457
VL - 2
SP - 378
EP - 386
JO - Nature Computational Science
JF - Nature Computational Science
IS - 6
ER -