PDB-scale analysis of known and putative ligand-binding sites with structural sketches

Jun Ichi Ito, Yasuo Tabei, Kana Shimizu, Kentaro Tomii*, Koji Tsuda

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

21 Citations (Scopus)


Computational investigation of protein functions is one of the most urgent and demanding tasks in the field of structural bioinformatics. Exhaustive pairwise comparison of known and putative ligand-binding sites, across protein families and folds, is essential in elucidating the biological functions and evolutionary relationships of proteins. Given the vast amounts of data available now, existing 3D structural comparison methods are not adequate due to their computation time complexity. In this article, we propose a new bit string representation of binding sites called structural sketches, which is obtained by random projections of triplet descriptors. It allows us to use ultra-fast all-pair similarity search methods for strings with strictly controlled error rates. Exhaustive comparison of 1.2 million known and putative binding sites finished in ∼30 h on a single core to yield 88 million similar binding site pairs. Careful investigation of 3.5 million pairs verified by TM-align revealed several notable analogous sites across distinct protein families or folds. In particular, we succeeded in finding highly plausible functions of several pockets via strong structural analogies. These results indicate that our method is a promising tool for functional annotation of binding sites derived from structural genomics projects.
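
The idea of turning a numeric descriptor into a bit string by random projections, then comparing bit strings instead of 3D structures, can be illustrated with a minimal sketch. It assumes sign-based random projections (one bit per random hyperplane) and a brute-force Hamming-distance comparison; the abstract specifies neither detail. The function names, the 4-dimensional feature vector, the 32-bit sketch length, and the distance threshold are all hypothetical and chosen for illustration only. The brute-force pair loop merely stands in for the fast all-pair search the abstract mentions and would not scale to 1.2 million sites.

```python
import random


def make_projections(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian directions of length dim (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def sketch(vector, projections):
    """Map a feature vector to an integer bit string: one sign bit per projection."""
    bits = 0
    for row in projections:
        dot = sum(w * x for w, x in zip(row, vector))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two sketches."""
    return bin(a ^ b).count("1")


def similar_pairs(sketches, max_distance):
    """Brute-force all-pair comparison (illustration only, quadratic time)."""
    pairs = []
    for i in range(len(sketches)):
        for j in range(i + 1, len(sketches)):
            if hamming(sketches[i], sketches[j]) <= max_distance:
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    # Assumed toy descriptors; real binding-site descriptors would be far larger.
    proj = make_projections(dim=4, n_bits=32, seed=1)
    sites = [
        [0.5, 1.0, 0.0, 2.0],
        [0.6, 1.1, 0.1, 1.9],    # close to the first site
        [-0.5, -1.0, 0.0, -2.0], # opposite direction to the first site
    ]
    sketches = [sketch(v, proj) for v in sites]
    print(similar_pairs(sketches, max_distance=4))
```

Under this assumed scheme, vectors pointing in similar directions agree on most sign bits, so a small Hamming distance between sketches serves as a cheap proxy for descriptor similarity. Candidate pairs found this way can then be checked with a slower structural method, in the same spirit as the TM-align verification described in the abstract.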

Original language: English
Pages (from-to): 747-763
Number of pages: 17
Journal: Proteins: Structure, Function and Bioinformatics
Issue number: 3
Publication status: Published - 2012 Mar
Externally published: Yes

Keywords


  • Ligand-binding site
  • Neighbor search algorithm
  • Pocketome
  • Structure and function

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology

