Market making is the problem of optimally placing limit orders on both sides of the limit order book (LOB) with the goal of maximizing the trader’s terminal wealth while minimizing the related risks. These risks include, in particular, inventory, execution, latency, adverse selection, and model uncertainty risks. Especially salient is inventory risk, which arises from fluctuations in the value of the asset held in the market maker’s inventory; this inventory is typically non-zero, since it depends on whether and when the placed orders get executed. Consequently, effective market making requires dynamic adaptation to changes in the current inventory level and in other relevant market and market maker-related variables. The underlying stochastic optimal control problem can be naturally cast as a discrete Markov Decision Process (MDP). Existing analytical approaches to market making tend to be predicated upon a set of naïve assumptions and are ill-suited to market making on order-driven markets, because they generally fail to account for the discreteness of the limit order book. Moreover, they do not factor in the market microstructure dynamics, especially the time variability of order arrival intensities. Promisingly, methods based on (deep) reinforcement learning are known to lend themselves well to solving problems formulated as MDPs and hence offer a potential alternative for tackling market making. Furthermore, since the model of the market maker’s environment is typically unknown, model-free deep reinforcement learning methods, which can learn directly from data without any explicit modeling of the underlying dynamics or prior knowledge, are of pivotal importance. Bearing this in mind, as well as the shortcomings of the current approaches, this thesis proposes novel model-free deep reinforcement learning methods for market making on order-driven markets with time-varying order arrival intensities.
The first method is based on two standalone supervised learning-based signal-generating units and a deep reinforcement learning unit for market making that exploits the generated signals. Special attention is paid to ensuring sufficient granularity of the resulting market making policies and to the methods’ robustness to variations in the market microstructure dynamics. To this end, a procedure for training market making agents robust to such variations, based on adversarial reinforcement learning, is also proposed. In addition, an evaluation framework is proposed for testing the method with respect to interpretability and risk-adjusted return metrics. The second method is concerned with market making under a weakly consistent limit order book (LOB) model based on a multivariate Hawkes process. The experimental results are discussed, analyzed, and juxtaposed against the results of several market making benchmarks. It is found that the proposed methods outperform the benchmarks with respect to multiple risk-adjusted reward performance metrics.
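To make the notion of time-varying order arrival intensities concrete, the following is a minimal sketch of a univariate Hawkes process with an exponential kernel, simulated by Ogata's thinning algorithm. It is an illustration only and not the multivariate LOB model used in the thesis: the exponential kernel, the function names `hawkes_intensity` and `simulate_hawkes`, and the parameter values (`mu`, `alpha`, `beta`) are all assumptions introduced here.

```python
import math
import random


def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity at time t given event times strictly before t.

    Assumed form: lambda(t) = mu + sum_i alpha * exp(-beta * (t - t_i)).
    """
    return mu + sum(alpha * math.exp(-beta * (t - s)) for s in events if s < t)


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate order arrival times on [0, horizon) by Ogata's thinning.

    Stationarity requires alpha / beta < 1 (assumed to hold for the inputs).
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    while True:
        # Between events the intensity only decays, so its value just after
        # the current time (counting an event at t itself) is an upper bound.
        lam_bar = mu + sum(alpha * math.exp(-beta * (t - s)) for s in events)
        t += rng.expovariate(lam_bar)  # candidate arrival time
        if t >= horizon:
            break
        # Accept the candidate with probability lambda(t) / lam_bar.
        if rng.random() * lam_bar <= hawkes_intensity(t, events, mu, alpha, beta):
            events.append(t)
    return events


if __name__ == "__main__":
    arrivals = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, horizon=100.0, seed=1)
    print(len(arrivals), "arrivals; intensity at horizon:",
          hawkes_intensity(100.0, arrivals, 0.5, 0.8, 1.2))
```

Each accepted arrival raises the intensity by `alpha`, after which it decays at rate `beta` back toward the baseline `mu`. This produces the clustering of order arrivals that a constant-intensity Poisson model cannot capture.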
|Abstract (translated from Croatian)|| |
Market making is the problem of optimally placing limit orders on both sides of the limit order book with the goal of maximizing terminal wealth while minimizing the associated risks, in particular inventory risk, which arises from fluctuations in the value of the asset held in the market maker’s inventory. Existing analytical approaches to market making are most often based on naïve assumptions and are not suited to market making on order-driven markets. Furthermore, they do not take into account the market microstructure dynamics, including the time variability of order arrival intensities. This dissertation proposes new approaches, based on deep reinforcement learning, to market making on order-driven markets under time-varying order arrival intensities. The experimental results are discussed in detail, analyzed, and compared with the results of several benchmark market making strategies. The proposed approaches are found to outperform the benchmark strategies with respect to several risk-adjusted return measures.