TY - JOUR
T1 - Stock portfolio selection balancing variance and tail risk via stock vector representation acquired from price data and texts
AU - Du, Xin
AU - Tanaka-Ishii, Kumiko
N1 - Publisher Copyright: © 2022 Elsevier B.V.
PY - 2022/8/5
Y1 - 2022/8/5
AB - Recent works on portfolio selection report ways to incorporate textual data in addition to price movements. Prices, texts, and the events underlying them take heterogeneous data forms and have therefore been processed without any consistent mathematical formulation. In this article, we propose to generalize portfolio selection by representing all related objects (stocks, news, events) in an embedding vector space, which we call a NEws-STock space with Event Distribution (NESTED). A NESTED forms an inner product vector space (Hilbert space), in which texts and stocks are represented as vectors (embeddings) acquired through a distribution of events. We first theoretically reformulate Markowitz's portfolio optimization problem on NESTED. We then show how the new formulation has the potential to better incorporate tail risk, which is represented better in textual data. One typical method to acquire such embeddings is via neural computing. Our experimental results, obtained by applying this method to 24 news-price datasets across three markets, showed that the Pareto exponent in the negative tail of the generated portfolios increased in all markets, which is evidence that the stock embeddings captured the tail risks. Our method showed a large improvement in balancing tail risk and non-tail risk, with up to 45.5% larger gain and a 59.4% larger information ratio.
KW - Mean–variance minimization
KW - Neural network
KW - News text
KW - Portfolio optimization
KW - Stock embedding
KW - Tail risk
UR - http://www.scopus.com/inward/record.url?scp=85130352195&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85130352195&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2022.108917
DO - 10.1016/j.knosys.2022.108917
M3 - Article
AN - SCOPUS:85130352195
SN - 0950-7051
VL - 249
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 108917
ER -