# Statistical Models for Large Tick Assets

In this section we briefly present the statistical models recently introduced by Curato and Lillo (2013) to describe the high-frequency dynamics of price changes for a large tick size asset in trade time. We want to show that these models reproduce the clustering of log-returns and the scaling of the hypercumulants $\Lambda_q(n)$ in trade time.

The building blocks of these models are simple: the distribution of price changes caused by one transaction, i.e. $\Delta p(i, n = 1)$, and the statistical properties of the dynamics of the bid-ask spread $s(i)$. In our model we impose a coupling between the process of price changes and that of the spread in order to reproduce price-change clustering.

We first consider a benchmark model, hereafter called the i.i.d. model, in which this coupling is absent and where we use only the information contained in the distribution of $\Delta p(i, n = 1)$. Our empirical analysis indicates that for a large tick asset the distribution of $\Delta p$ is mainly concentrated on $\Delta p = 0$. This observation allows us to limit the discrete set on which we define the distribution at the scale of one transaction, i.e. $\Delta p \in \{-2, -1, 0, 1, 2\}$ in units of half tick size. In the i.i.d. model the $\Delta p(i)$ process is simply an i.i.d. process in which each observation has the distribution estimated from data. Numerical simulations and analytical considerations show that this model is unable to reproduce price change clustering at any scale: when we aggregate $n$ values we recover a bell-shaped distribution for $\Delta p(i, n) = \sum_{l=1}^{n} \Delta p(i + l - 1)$.
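The behaviour of the i.i.d. benchmark can be checked with a short simulation. The sketch below is only illustrative: the one-transaction probabilities in `PROBS` are hypothetical placeholders (not the values estimated for CSCO), and the function names are ours. It draws i.i.d. price changes on the five-point support, sums them over non-overlapping windows of $n$ transactions, and measures how close the aggregated distribution is to a bell shape through its excess kurtosis.

```python
import numpy as np

# Support of the one-transaction price change, in units of half tick size.
SUPPORT = np.array([-2, -1, 0, 1, 2])
# Hypothetical probabilities, concentrated on 0 as for a large tick asset.
# These are placeholders, not estimates from data.
PROBS = np.array([0.01, 0.04, 0.90, 0.04, 0.01])


def simulate_iid_changes(n_trades, rng):
    """Draw i.i.d. price changes from the one-transaction distribution."""
    return rng.choice(SUPPORT, size=n_trades, p=PROBS)


def aggregate(changes, n):
    """Sum the changes over non-overlapping windows of n transactions."""
    n_windows = len(changes) // n
    return changes[: n_windows * n].reshape(n_windows, n).sum(axis=1)


def excess_kurtosis(x):
    """Sample excess kurtosis (0 for a Gaussian)."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4) - 3.0


rng = np.random.default_rng(0)
dp = simulate_iid_changes(1_000_000, rng)
dp_agg = aggregate(dp, 128)

kurt_1 = excess_kurtosis(dp)        # strongly peaked at one transaction
kurt_128 = excess_kurtosis(dp_agg)  # much closer to 0 after aggregation
```

With these placeholder probabilities the one-transaction variance is 0.16, so the variance of the aggregated changes should be close to $128 \times 0.16 = 20.48$, as expected for a sum of independent draws.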

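In the model with coupling, the spread $s(i)$ is a Markov(1) process:

$$
P\big(s(i) = k \mid s(i-1) = j,\, s(i-2) = l,\, \dots\big) = P\big(s(i) = k \mid s(i-1) = j\big) = p_{jk}, \tag{6}
$$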

where $j, k, l \in \{1, 2\}$ are spread values and $i \in \mathbb{N}$ is the trade time. The spread process is described by the transition matrix $S \in M(2,2)$:

$$
S = \begin{pmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{pmatrix}
$$

where the normalization is given by $\sum_{k=1}^{2} p_{jk} = 1$. For example, for CSCO we estimate $p_{11} = 0.97$ and $p_{21} = 0.58$, i.e. transitions in which the spread changes are infrequent. In this model the spread can take two values, so there are four possible transitions $t(i)$ between two subsequent transactions, which we identify with an integer from 1 to 4. For example, the transition $s(i) = 1 \to s(i+1) = 1$ is described by the state $t(i) = 1$, etc. In this way we derive a new Markov(1) process that describes $t(i)$. At this point the mechanistic constraint imposed by the price grid, defined by the value of the tick size, allows us to couple the price changes $\Delta p(i)$ with the process of transitions $t(i)$. We can thus define a Markov-switching model (Hamilton 2008) for the price changes $\Delta p(i)$ conditioned on the Markov process $t(i)$ through the conditional probabilities

$$
P\big(\Delta p(i) = x \mid t(i) = m\big), \qquad x \in \{-2, -1, 0, 1, 2\}, \tag{7}
$$

where $m \in \{1, 2, 3, 4\}$. The estimation of these conditional probabilities enables us to simulate the process of price changes.

In order to compute log-returns we follow a simple procedure. We generate the simulated series of price changes from the Markov-switching model calibrated on data, then we integrate it, choosing as starting point the first mid-price recorded in the measured series. At this point we have a synthetic discrete series of mid-prices on which we can compute the log-returns $r(i, 1)$ corresponding to one transaction. Then we aggregate individual transaction returns on non-overlapping windows of width $n$ to recover the process at a generic time scale $n$. This model reproduces clustering for price changes and for log-returns: as we can observe in Fig. 6, it reproduces the clustering of returns at different time scales. The clustering starts to disappear beyond the aggregation time scale $n = 512$.
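The simulation procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the calibrated model: only $p_{11} = 0.97$ and $p_{21} = 0.58$ come from the text, while the labelling of the transition states other than $1 \to 1$, the conditional probabilities in `COND` and the starting mid-price are hypothetical. The rows of `COND` encode our reading of the grid constraint: in half ticks the mid-price has the same parity as the spread in ticks, so a price change is odd exactly when the spread changes.

```python
import numpy as np

# Spread transition matrix S (rows: current spread, columns: next spread).
# p11 and p21 are the CSCO estimates quoted above; p12 and p22 follow from
# the normalization of each row.
S = np.array([[0.97, 0.03],
              [0.58, 0.42]])

# Transition states t(i) in {1, 2, 3, 4}. Only (1 -> 1) = 1 is fixed by the
# text; the labelling of the other three transitions is an assumption.
STATE = {(1, 1): 1, (1, 2): 2, (2, 1): 3, (2, 2): 4}

SUPPORT = np.array([-2, -1, 0, 1, 2])  # price changes in half ticks

# Hypothetical conditional probabilities P(dp = x | t = m), one row per m.
# Unchanged spread: even changes only. Changed spread: odd changes only.
COND = np.array([
    [0.02, 0.00, 0.96, 0.00, 0.02],  # t = 1: spread 1 -> 1
    [0.00, 0.50, 0.00, 0.50, 0.00],  # t = 2: spread 1 -> 2
    [0.00, 0.50, 0.00, 0.50, 0.00],  # t = 3: spread 2 -> 1
    [0.05, 0.00, 0.90, 0.00, 0.05],  # t = 4: spread 2 -> 2
])


def simulate_spread(n_trades, rng, s0=1):
    """Simulate the Markov(1) spread process s(i) with values in {1, 2}."""
    s = np.empty(n_trades, dtype=int)
    s[0] = s0
    u = rng.random(n_trades)
    for i in range(1, n_trades):
        # Column 0 of S holds the probability of moving to spread 1.
        s[i] = 1 if u[i] < S[s[i - 1] - 1, 0] else 2
    return s


def simulate_changes(spread, rng):
    """Draw one price change per spread transition from the matching row of COND."""
    t = np.array([STATE[(a, b)] for a, b in zip(spread[:-1], spread[1:])])
    dp = np.zeros(len(t), dtype=int)
    for m in (1, 2, 3, 4):
        idx = np.flatnonzero(t == m)
        if idx.size:
            dp[idx] = rng.choice(SUPPORT, size=idx.size, p=COND[m - 1])
    return t, dp


def aggregated_log_returns(mid, n):
    """Sum one-transaction log-returns over non-overlapping windows of width n."""
    r1 = np.diff(np.log(mid))
    k = len(r1) // n
    return r1[: k * n].reshape(k, n).sum(axis=1)


rng = np.random.default_rng(1)
spread = simulate_spread(100_000, rng)
t, dp = simulate_changes(spread, rng)

# Integrate the changes from a hypothetical first mid-price of 4001 half ticks
# (odd, to match an initial spread of one tick).
mid = 4001 + np.concatenate(([0], np.cumsum(dp)))
r_128 = aggregated_log_returns(mid, 128)
```

Because the switching state is drawn first and the price change second, the synthetic mid-price never leaves the grid implied by the simulated spread, which is the coupling that the i.i.d. benchmark lacks.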

The Markov-switching model is not able to explain the empirically observed correlation of squared price changes, which is related to the presence of volatility clustering. In financial econometrics an autoregressive conditional heteroskedasticity (ARCH) model (Bera and Higgins 1993; Engle et al. 2008) is usually employed to account for volatility clustering and non-Gaussianity of returns. We do not make use of this class of models because they are defined on continuous stochastic variables. Instead, we have seen in Sect. 3.2 that the high-frequency return distribution is characterized by discretization and clustering. For this reason we have chosen to define our model directly on discrete variables, namely price changes, but this choice prevents us from using models like ARCH. Therefore in Curato and Lillo (2013) we develop a second model based on an autoregressive switching model for price changes, which preserves the idea of ARCH-type models that past squared returns affect the current return distribution. This means that the conditional probabilities of Eq. (7) can depend not only on the last spread transition, but also on the recent past values of price changes. In the case in which the regressors are defined only by past squared price changes, our model can be viewed as a higher-order double chain Markov model of order $p$ (Berchtold 1999). For our purposes we recall only that we have fitted this model with order $p = 50$ on our data. Here we use it to generate a series in order to study the scaling of the hypercumulants $\Lambda_q(n)$. We refer to our original article (Curato and Lillo 2013) for the details and definitions of this autoregressive model.

In order to fit our three models we split our daily time series into two series: the first displays low volatility, while the second displays high volatility. Here we focus on the low volatility series, which starts at 10:30 and ends at 15:45, but our findings are the same for the series with high volatility. We compute the log-returns from the Monte Carlo simulations of our models and then compute the normalized log-return $g(i, n)$ in trade time. The Monte Carlo simulations generate 5961600 data points, which correspond to 1 year of transactions for a mean duration time, i.e. the interval of time between two transactions, of 1 s and 6 h of trading activity each day. The sample for the stock CSCO instead covers 2 months of trading for a total of 275879 transactions. We can observe from Figs. 7 and 8 that the proposed models converge to a Gaussian behavior. The Markov-switching model and the double chain Markov model (DCMM in the figures) reproduce the scaling of the hypercumulants slightly better than the simple i.i.d. model, although all models are very close to the empirical moments. A small difference between simulation and real data is visible in Fig. 8 for high values of $n$. We think that this distortion arises from the smaller number of data points used to compute $\Lambda_q$ for the stock CSCO compared with the number of data points available for simulated data. The Markov-switching model turns out to be the simplest model able to reproduce at the same time the clustering of price changes and log-returns together with the correct scaling of the hypercumulants toward a Gaussian behavior.

Fig. 8 Log-linear plot of the scaling of the hypercumulants $\Lambda_q(n)$, $q = 1, 2.5, 3, 3.5$, for the stock CSCO and the related simulated return processes
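The sample sizes quoted above can be checked with simple arithmetic, which also illustrates why the empirical estimates become noisier at large $n$. The sketch below assumes non-overlapping windows of $n$ transactions and ignores the single observation lost when taking differences; the variable and function names are ours.

```python
# One trade per second for 6 h of trading each day.
TRADES_PER_DAY = 6 * 3600
N_SIMULATED = 5_961_600   # simulated data points quoted in the text
N_EMPIRICAL = 275_879     # CSCO transactions quoted in the text

# Number of trading days implied by the simulated sample size.
trading_days = N_SIMULATED / TRADES_PER_DAY


def n_windows(n_points, n):
    """Number of non-overlapping windows of width n in a series of n_points."""
    return n_points // n


# Windows available at each aggregation scale, empirical versus simulated.
windows = {n: (n_windows(N_EMPIRICAL, n), n_windows(N_SIMULATED, n))
           for n in (1, 10, 100, 1000)}
```

At $n = 1000$ the empirical series provides 275 windows against 5961 for the simulation, so any moment-based quantity computed on the empirical sample rests on roughly twenty times fewer aggregated returns.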


The discrete nature of prices also affects the distribution of log-returns. The effect of discretization on log-returns disappears when the time scale of observation of the mid-price process is long enough, for both large and small effective tick sizes. The difference between the two kinds of stock is that for a large tick asset we also have the effect of returns clustering. For example, if we observe the price process in continuous time, the effect of discretization and clustering starts to disappear at a time scale around 15 min for a large effective tick, whereas for small effective tick sizes the discretization disappears at shorter time scales, i.e. 30 s. The analysis of the hypercumulants of the distributions of log-returns indicates a convergence toward a Gaussian behavior for stocks with large and small effective tick sizes. The presence of a large effective tick seems to slow down the convergence toward a Gaussian behavior, consistently with a progressive disappearance of the clustering of returns. The analysis of convergence by means of the Hill estimator shows a crossover time scale after which large and small tick size assets behave in the same way. However, our analysis shows how the blind use of estimators can lead to errors in the determination of the tail exponents.

We develop statistical models in trade time for large tick size assets that are able to reproduce the presence of clustering for price changes and log-returns. We find that in order to reproduce the clustering effect we need a model in which the dynamics of mid-price changes is coupled with the dynamics of the bid-ask spread. A Markov-switching model, where the switching process is defined by the possible transitions between subsequent values of the spread, is able to reproduce the effect of clustering and the scaling of hypercumulants computed from empirical data. This simple high-frequency statistical microstructural model, defined by quantities like bid and ask prices on a discrete price grid determined by the value of the tick size, is able to recover a Gaussian behavior for returns at the macroscopic time scale of hours.

Acknowledgements The authors acknowledge partial support by the grant SNS11LILLB "Price formation, agents heterogeneity, and market efficiency".

Ahn, H. J., Cai, J., Chan, K., & Hamao, Y. (2007). Tick size change and liquidity provision on the Tokyo Stock Exchange. Journal of Japanese and International Economies, 21, 173-194.

Ane, T., & Geman, H. (2000). Order flow, transaction clock and normality of asset returns. The Journal of Finance, 55, 2259-2284.

Ascioglu, A., Comerton-Forde, C., & McInish, T. H. (2010). An examination of minimum tick sizes on the Tokyo Stock Exchange. Japan and the World Economy, 22, 40-48.

Bera, A. K., & Higgins, M. L. (1993). Arch models: properties, estimation and testing. Journal of Economic Surveys, 7(4), 305-366.

Berchtold, A. (1999). The double chain Markov model. Communications in Statistics – Theory and Methods, 25(11), 2569-2589.

Bessembinder, H. (2000). Tick size, spreads, and liquidity: An analysis of Nasdaq Securities Trading near Ten Dollars. Journal of Financial Intermediation, 9, 213-239.

Bessembinder, H. (1997). The degree of price resolution and equity trading costs. Journal of Financial Economics, 45, 9-34.

Bessembinder, H. (1999). Trade execution costs on NASDAQ and the NYSE: A post-reform comparison. The Journal of Financial and Quantitative Analysis, 34(3), 387-407.

Bessembinder, H. (2003). Trade execution costs and market quality after decimalization. The Journal of Financial and Quantitative Analysis, 35(4), 747-777.

Bollen, N. P. B., & Busse, J. A. (2006). Tick size and institutional trading costs: evidence from mutual funds. The Journal of Financial and Quantitative Analysis, 41(4), 915-937.

Bouchaud, J. P., & Potters, M. (2009). Theory of financial risk and derivative pricing: from statistical physics to risk management, (2nd ed.). Cambridge, UK: Cambridge University Press.

Christie, W. G., & Schultz, P. H. (1994). Why do NASDAQ market makers avoid odd-eighth quotes? The Journal of Finance, 49(5), 1813-1840.

Christie, W. G., Harris, J. H., & Schultz, P. H. (1994). Why did NASDAQ market makers stop avoiding odd-eighth quotes? The Journal of Finance, 49(5), 1841-1860.

Chung, K. H., Chuwonganant, C., & McCormick, D. T. (2004). Order preferencing and market quality on NASDAQ before and after decimalization. Journal of Financial Economics, 71, 581-612.

Clark, P. K. (1973a). A subordinated stochastic process model with finite variance for speculative prices. Econometrica, 41(1), 135-155.

Clauset, A., Shalizi, C. R., & Newman, M. E. J. (2009). Power-law distributions in empirical data. SIAM Review, 51(4), 661-703.

Cont, R. (2001). Empirical properties of asset returns: stylized facts and statistical issues. Quantitative Finance, 1, 223-236.

Dacorogna, M. M., Gençay, R., Müller, U. A., Olsen, R. B., & Pictet, O. V. (2001). An introduction to high-frequency finance. San Diego, California, USA: Academic Press.

Dayri, K. A., Bacry, E., & Muzy, J. F. (2011). The nature of price returns during periods of high market activity. In F. Abergel, B. K. Chakrabarti, A. Chakraborti & M. Mitra (Eds.), Econophysics of order-driven markets (pp. 155-172). Milano, Italy: Springer-Verlag Italia.

Eisler, Z., Bouchaud, J. P., & Kockelkoren, J. (2012). The price impact of order book events: market orders, limit orders and cancellations. Quantitative Finance, 12(9), 1395-1419.

Engle, R. F., Focardi, S. M., & Fabozzi, F. J. (2008). ARCH/GARCH models in applied financial econometrics. In F. J. Fabozzi (Ed.), Handbook of Finance. New York, NY: John Wiley & Sons.

Gibson, S., Singh, R., & Yerramilli, V. (2003). The effect of decimalization on the components of the bid-ask spread. Journal of Financial Intermediation, 12, 121-148.

Gillemot, L., Farmer, J. D., & Lillo, F. (2006). There’s more to volatility than volume. Quantitative Finance, 6, 371-384.

Goldstein, M. A., & Kavajecz, K. A. (2000). Eights, sixteenths, and market depth: changes in tick size and liquidity provision on the NYSE. Journal of Financial Economics, 56, 125-149.

Hamilton, J. D. (2008). Regime-Switching models. In S. N. Durlauf, & L. E. Blume (Eds.), The new palgrave dictionary of economics, 2nd edn. Basingstoke, UK: Palgrave Macmillan.

Harris, L. (1991). Stock price clustering and discreteness. The Review of Financial Studies, 4(3), 389-415.

Hautsch, N. (2012). Econometrics of financial high-frequency data. Berlin: Springer-Verlag.

He, Y., & Wu, C. (2004). Price rounding and bid-ask spreads before and after decimalization. International Review of Economics and Finance, 13, 19-41.

Huang, R. D., & Stoll, H. R. (2001). Tick size, bid-ask spreads, and market structure. The Journal of Financial and Quantitative Analysis, 36(4), 503-522.

La Spada, G., Farmer, J. D., & Lillo, F. (2011). Tick size and price diffusion. In F. Abergel, B. K. Chakrabarti, A. Chakraborti & M. Mitra (Eds.), Econophysics of order-driven markets (pp. 173-187). Milano, Italy: Springer-Verlag Italia.

Loistl, O., Schossmann, B., & Veverka, A. (2004). Tick size and spreads: The case of Nasdaq’s decimalization. European Journal of Operational Research, 155, 317-334.

MacKinnon, G., & Nemiroff, H. (2004). Tick size and the returns to providing liquidity. International Review of Economics and Finance, 13, 57-73.

Mandelbrot, B., & Taylor, H. M. (1967). On the distributions of stock price differences. Operations Research, 15(6), 1057-1062.

Onnela, J. P., Töyli, J., & Kaski, K. (2009). Tick size and stock returns. Physica A, 388, 441-454.

Osborne, M. F. M. (1962). Periodic structure in the Brownian Motion of stock prices. Operations Research, 10(3), 345-379.

Plerou, V., Gopikrishnan, P., & Stanley, H. E. (2005). Quantifying fluctuations in market liquidity: Analysis of the bid-ask spread. Physical Review E, 71, 046131-1/046131-8.

Plerou, V., & Stanley, H. E. (2007). Test of scaling and universality of the distributions of trade size and share volume: Evidence from three distinct markets. Physical Review E, 76, 046109-1/046109-10.

Ponzi, A., Lillo, F., & Mantegna, R. N. (2009). Market reaction to a bid-ask spread change: A power-law relaxation dynamics. Physical Review E, 80, 016112-1/016112-12.

U.S. Securities and Exchange Commission. (2012). Report to Congress on Decimalization. Retrieved from http://www.sec.gov/news/studies/2012/decimalization-072012.pdf.

Robert, C. Y., & Rosenbaum, M. (2011). A new approach for the dynamics of ultra-high-frequency data: the model with uncertainty zones. Journal of Financial Econometrics, 9(2), 344-366.