Publications


Targeted Financial-Oriented Social Media Sentiment Measurement: Natural Language Processing Approach

Gilles Caporossi, Pan Liu, Feng Zhan, and Xiaozhou Zhou

Abstract This study develops a natural language processing model that measures financial-oriented sentiment targeted toward specific firms in social media texts. First, we create a human-annotated social media targeted financial sentiment dataset. Then, we propose a prompt-based model architecture that achieves state-of-the-art performance on multiple benchmark datasets for general targeted sentiment analysis. Subsequently, we fine-tune this model using our annotated dataset, which allows it to measure targeted financial sentiment with high accuracy. We apply it to 23 million financial-oriented social media posts from different platforms to measure financial sentiment toward 24 meme stocks (stocks that gain frenetic attention from retail investors on social media, often accompanied by dramatic price movements) and 30 Dow Jones constituent stocks. Our results show that the sentiment measured by our model is positively correlated with price return and negatively correlated with price volatility, and that this correlation is stronger for meme stocks than for Dow Jones stocks. We further demonstrate that our model’s sentiment measurement economically outperforms other representative financial sentiment measurements by comparing the returns of the same trading strategy built on each of them.


Order-Driven Markets: Spoofing the Spoofers

Michèle Breton and Amin Nejat

Abstract Spoofing is a form of market manipulation that consists of placing orders with the intent to cancel them before execution, in order to artificially influence the market price and mislead investors. This paper uses cryptocurrency market data to propose price-impact models, which are the foundation of market manipulation strategies. In addition, we show how order-book data could be used to detect and/or prevent manipulation in order-driven markets.
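
For readers unfamiliar with price-impact models, the following is a minimal sketch of the simplest textbook variant: a linear model in which the price change over an interval is regressed on the signed order flow over the same interval. It is not the model proposed in the paper; the function name, the input layout, and the choice of ordinary least squares are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def linear_price_impact(
    signed_volume: Sequence[float],
    price_change: Sequence[float],
) -> Tuple[float, float]:
    """Fit price_change = intercept + slope * signed_volume by least squares.

    Hypothetical helper for illustration only. `signed_volume` is assumed to
    be buy volume minus sell volume per interval; `price_change` is the
    mid-price change over the same interval.
    """
    n = len(signed_volume)
    if n != len(price_change) or n < 2:
        raise ValueError("need two equally sized series with at least 2 points")

    mean_x = sum(signed_volume) / n
    mean_y = sum(price_change) / n

    # Sum of squared deviations of the regressor.
    sxx = sum((x - mean_x) ** 2 for x in signed_volume)
    if sxx == 0:
        raise ValueError("signed_volume has no variation")

    # Sum of cross deviations between regressor and response.
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(signed_volume, price_change)
    )

    slope = sxy / sxx  # estimated impact per unit of signed volume
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

In this sketch, a larger slope means that a given amount of net buying moves the price more, which is the quantity a manipulator would try to exploit and a monitor would want to track.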


Detection and Prevention of Key-Compromise Related Fraudulence in Crypto-assets Through AI-Empowered Smart Contract: A Novel Framework for Asset Protection and Key-leakage Prevention

Zonglun Li, Hanqing Zhao, and Xue Liu

Abstract This report introduces a new framework aimed at preventing key-leakage hacking events in blockchain systems. These events have emerged as a significant concern in recent years, leading to substantial financial losses and undermining trust in blockchain technologies. The proposed framework is distinguished by allowing users to freely allocate assets, choose their security level by adjusting a trust threshold, and save significantly on gas expenses compared with other trading frameworks for safety-critical environments. By addressing these aspects, the framework improves both the security and efficiency of transactions on the blockchain, thereby fostering enhanced stability and reliability of blockchain systems. This report provides a comprehensive analysis of the proposed framework, demonstrating its effectiveness and efficiency through empirical evaluation.


Deep Unsupervised Anomaly Detection in High-Frequency Markets

Cédric Poutré, Didier Chételat, and Manuel Morales

Abstract Inspired by recent advances in the deep learning literature, this article introduces a novel hybrid anomaly detection framework specifically designed for limit order book (LOB) data. A modified Transformer autoencoder architecture is proposed to learn rich temporal LOB subsequence representations, which eases the separability of normal and fraudulent time series. A dissimilarity function is then learned in the representation space to characterize normal LOB behavior, enabling the detection of any anomalous subsequences out-of-sample. We also develop a complete trade-based manipulation simulation methodology able to generate a variety of scenarios derived from actual trade-based fraud cases. The complete framework is tested on LOB data of five NASDAQ stocks in which we randomly insert synthetic quote stuffing, layering, and pump-and-dump manipulations. We show that the proposed asset-independent approach achieves new state-of-the-art fraud detection performance, without requiring any prior knowledge of manipulation patterns.

Keywords: Limit order book; Time series anomaly detection; Deep learning; Trade-based manipulation; Dissimilarity model; Unsupervised learning.


Deep Semi-Supervised Anomaly Detection for Finding Fraud in the Futures Market

Timothy DeLise

Abstract Modern financial electronic exchanges are exciting and fast-paced marketplaces where billions of dollars change hands every day. They are also rife with manipulation and fraud. Detecting such activity is a major undertaking, which has historically been a job reserved exclusively for humans. Recently, more research and resources have been focused on automating these processes via machine learning and artificial intelligence. Fraud detection is overwhelmingly associated with the greater field of anomaly detection, which is usually performed via unsupervised learning techniques because of the lack of labeled data needed for supervised learning. However, a small quantity of labeled data does often exist. This research article aims to evaluate the efficacy of a deep semi-supervised anomaly detection technique, called Deep SAD, for detecting fraud in high-frequency financial data. We use exclusive proprietary limit order book data from the TMX exchange in Montréal, with a small set of true labeled instances of fraud, to evaluate Deep SAD against its unsupervised predecessor. We show that incorporating a small amount of labeled data into an unsupervised anomaly detection framework can greatly improve its accuracy.


The Role of Twitter in Cryptocurrency Pump-and-Dumps

David Ardia and Keven Bluteau

Abstract We examine the influence of Twitter promotion on cryptocurrency pump-and-dump events. By analyzing abnormal returns, trading volume, and tweet activity, we find that Twitter effectively garners attention for pump-and-dump schemes, leading to notable effects on abnormal returns before the event. Our results indicate that investors relying on Twitter information exhibit delayed selling behavior during the post-dump phase, resulting in significant losses compared to other participants. These findings shed light on the pivotal role of Twitter promotion in cryptocurrency manipulation, offering valuable insights into participant behavior and market dynamics.

Keywords: Cryptocurrencies, Event-study, Pump-and-dump, Twitter
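
As background on the event-study terminology, the sketch below shows one common way to compute abnormal returns: the mean-adjusted approach, in which the expected return is the average return over a pre-event estimation window. This is an illustrative assumption and not necessarily the benchmark model used in the paper; the function name and the input layout are hypothetical.

```python
from itertools import accumulate
from statistics import fmean
from typing import List, Sequence, Tuple


def cumulative_abnormal_returns(
    estimation_returns: Sequence[float],
    event_returns: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Mean-adjusted abnormal returns and their running sum.

    Hypothetical helper for illustration only. `estimation_returns` are
    returns from a window before the event, used to estimate the "normal"
    return; `event_returns` are returns in the window around the event.
    """
    if not estimation_returns:
        raise ValueError("estimation window must not be empty")

    # Expected (normal) return: simple average over the estimation window.
    expected = fmean(estimation_returns)

    # Abnormal return: realized return minus the expected return.
    abnormal = [r - expected for r in event_returns]

    # Cumulative abnormal return up to and including each period.
    cumulative = list(accumulate(abnormal))
    return abnormal, cumulative
```

Under this sketch, a run-up in cumulative abnormal returns before the event followed by a sharp reversal would be the signature of a pump followed by a dump.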