Application of VBGMM for pitch type classification: analysis of TrackMan's pitch tracking data

Kazuhiro Umemura*, Toshimasa Yanai, Yasushi Nagata

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)


In baseball, each pitcher throws several types of pitches, such as the cutter, curveball, slider, and splitter. Although the type of a given pitch may be inferred by the audience or obtained from the TrackMan data, the actual pitch type (i.e., the pitch type declared by the pitcher) may not be known. Classifying pitch types is challenging because pitches with the same self-declared type may have different kinematic characteristics across pitchers. Conversely, pitches with different self-declared types may have identical kinematic characteristics. In this study, we aimed to classify TrackMan data of pitched baseballs into pitch types by applying a Variational Bayesian Gaussian Mixture Model (VBGMM). We also aimed to analyze the kinematic characteristics of the classified pitch types and the indices related to batting performance for each pitch type. The results showed that pitch types could not be classified accurately from kinematic characteristics alone, but that accuracy improved substantially when the characteristics of the fastball were taken into account. This study could provide a basis for developing a more accurate automatic pitch type classification system.
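As a rough illustration of the technique named in the abstract, the sketch below fits a variational Bayesian Gaussian mixture to synthetic pitch-like data using scikit-learn's `BayesianGaussianMixture`. It is not the authors' pipeline: the three features, the cluster centers, the component cap of 8, the prior value, and the 0.05 weight threshold are all assumptions chosen for illustration, and the numbers are not real TrackMan measurements.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-in for pitch tracking data (assumed values, not TrackMan output).
# Columns: release speed (km/h), horizontal break (cm), vertical break (cm).
centers = np.array([
    [148.0, -15.0, 45.0],   # fastball-like
    [132.0, 10.0, 5.0],     # slider-like
    [118.0, 15.0, -30.0],   # curveball-like
])
spread = np.array([2.0, 4.0, 4.0])
X = np.vstack([c + rng.normal(0.0, 1.0, size=(200, 3)) * spread for c in centers])

# Standardize each feature so speed and break are on comparable scales.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Deliberately allow more components than pitch types; the variational
# posterior can shrink the weights of components it does not need.
vbgmm = BayesianGaussianMixture(
    n_components=8,
    covariance_type="full",
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=0.01,  # assumed value; small values favor fewer clusters
    max_iter=500,
    random_state=0,
)
labels = vbgmm.fit_predict(X_std)

# Count components that kept a non-negligible share of the data
# (the 0.05 cutoff is an arbitrary illustrative threshold).
effective = int(np.sum(vbgmm.weights_ > 0.05))
print("clusters with non-negligible weight:", effective)
```

The reason to consider a variational Bayesian mixture over a plain Gaussian mixture here is that the number of pitch types a pitcher throws is not known in advance, so the number of clusters is inferred from the data rather than fixed by hand.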

Original language: English
Pages (from-to): 41-71
Number of pages: 31
Journal: Japanese Journal of Statistics and Data Science
Issue number: 1
Publication status: Published - 2021 Jul


Keywords

  • Baseball
  • Clustering
  • Machine learning
  • Sports statistics
  • TrackMan

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Statistics and Probability

