# Empirical Analysis

In this section we study how the effective tick size affects the distributional properties of price changes and log-returns at different time scales. We use two simple definitions of the effective tick size: the first is the tick-to-price ratio, and the second is the unconditional frequency with which the bid-ask spread equals one tick. The two are usually treated as equivalent when classifying assets into large and small tick assets (Eisler et al. 2012; Dayri et al. 2011). We mainly use the first definition in Sect. 3, while the second is used to define a statistical model in Sect. 4.

In this paper we study high-frequency data for highly liquid stocks traded on the NASDAQ market in the period from 01/07/2009 to 31/08/2009 (42 trading days). We analyze the following stocks: Apple Inc. (AAPL), Amazon (AMZN), Microsoft Corporation (MSFT) and Cisco Systems (CSCO). Our data contain the time stamps of order executions, trade prices, bid-ask quotes, trade volumes and trade directions, with a time resolution of one millisecond. Trading activity at NASDAQ starts at 9:30 and ends at 16:00. We discard all transaction data from the first and last 6 min of the day, because during these minutes we observe bursts of trading activity and abnormally high price fluctuations that could affect the statistical analysis of the return distributions.

When a market order hits several limit orders, several trades are reported. We therefore aggregate all such transactions and consider them as one trade whenever their millisecond time stamps in our database coincide. We use these aggregated transactions as our “events”, meaning that all relevant quantities are calculated at the time just before each transaction. Hereafter we define the transaction time, or trade time, as an integer counter of events, each defined by the execution of a market order.

For each asset, we define the following time series in trade time:

• $t_i$ is the time of the $i$-th trade; $i \in \mathbb{N}$ is the transaction or trade time.

• $b(i) = b(t_i)$ and $a(i) = a(t_i)$ are, respectively, the best bid and ask prices just before the $i$-th trade.

• $p(i) = p(t_i) = (b(t_i) + a(t_i))/2$ is the midpoint price just before the $i$-th trade; $p \in \mathbb{N}$ is measured in units of half a tick.

• $s(i) = s(t_i) = a(t_i) - b(t_i)$ is the spread just before the $i$-th trade; $s \in \mathbb{N}$ is measured in units of one tick.

• $\Delta p(i) = \Delta p(t_i) = p(t_{i+1}) - p(t_i)$ is the price change caused by the $i$-th trade; $\Delta p \in \mathbb{Z}$ is measured in units of half a tick.

• $r(i) = r(t_i) = \log p(t_{i+1}) - \log p(t_i)$ is the log-return; $r \in \mathbb{R}$.

• $\Delta p(i, n) = p(t_{i+n}) - p(t_i)$ is the price change caused by $n$ consecutive trades; $n$ is the trade time scale at which we observe the price change process.

• $r(i, n) = \log p(t_{i+n}) - \log p(t_i)$ is the log-return caused by $n$ consecutive trades.
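As an illustration only, the series defined above could be computed as in the following sketch. The function name and the input arrays `bid_ticks` and `ask_ticks` are hypothetical: they are assumed to hold the best bid and ask, in units of one tick, sampled just before each aggregated trade.

```python
import numpy as np

def trade_time_series(bid_ticks, ask_ticks, n=1):
    """Build trade-time series from quotes sampled just before each trade.

    bid_ticks, ask_ticks: integer arrays (hypothetical inputs) with the best
    bid and ask in units of one tick, one entry per aggregated trade.
    n: trade time scale, i.e. number of consecutive trades.
    """
    b = np.asarray(bid_ticks, dtype=np.int64)
    a = np.asarray(ask_ticks, dtype=np.int64)
    p = b + a                            # midprice in half ticks: (b + a)/2 ticks
    s = a - b                            # spread in ticks
    dp = p[n:] - p[:-n]                  # price change over n trades, half ticks
    r = np.log(p[n:]) - np.log(p[:-n])   # log-return over n trades
    return p, s, dp, r
```

Because the log-return is a difference of logarithms, it does not depend on whether the midprice is expressed in ticks or in half ticks.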

We also want to study the price process in continuous time. To this end we define the midprice process $p_c(t)$, assuming that the price between two transactions is given by the midprice just before the second transaction. This defines a piecewise constant function like the one shown in Fig. 1.
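A minimal sketch of this construction, assuming trade times sorted in increasing order; the function name and arguments are hypothetical. For a time $t$ between the $(i-1)$-th and the $i$-th trade, the function returns the midprice just before the $i$-th trade.

```python
import numpy as np

def midprice_continuous(trade_times, p, t):
    """Evaluate the piecewise-constant midprice at the times in t.

    For t in (t_{i-1}, t_i] the value is p(t_i), the midprice just before
    the i-th trade. Times after the last trade are left undefined (NaN).
    """
    trade_times = np.asarray(trade_times, dtype=float)
    p = np.asarray(p, dtype=float)
    t = np.atleast_1d(np.asarray(t, dtype=float))
    # index of the first trade whose time stamp is >= t
    idx = np.searchsorted(trade_times, t, side="left")
    out = np.full(t.shape, np.nan)
    valid = idx < len(p)
    out[valid] = p[idx[valid]]
    return out
```

Sampling this function on a regular grid with step $\Delta t$ gives the continuous-time analogue of the trade-time series.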

Fig. 1 Midprice process for the stock AAPL is the black piecewise constant curve. The price grid is measured in units of one tick. The time is the number of seconds from the beginning of the trading day. Circles indicate executions of market orders. We can observe trades that do not cause price changes

Table 1 Sample statistics for log-returns

| Stock | Time scale | Mean | Std. deviation | Skewness | Ex. kurtosis |
|---|---|---|---|---|---|
| AAPL | 1 trade | 6.602e-08 | 7.553e-05 | 0.01434 | 5.268 |
| | 1 s | 6.641e-08 | 8.043e-05 | -0.00105 | 22.16 |
| AMZN | 1 trade | 6.497e-08 | 1.484e-04 | 0.0442 | 8.177 |
| | 1 s | 4.590e-08 | 1.097e-04 | 0.3164 | 41.25 |
| MSFT | 1 trade | 1.038e-07 | 1.190e-04 | 0.0112 | 7.773 |
| | 1 s | 5.949e-08 | 8.406e-05 | -0.0146 | 50.50 |
| CSCO | 1 trade | 1.173e-07 | 1.427e-04 | 0.00505 | 7.008 |
| | 1 s | 5.587e-08 | 9.791e-05 | 0.1285 | 50.34 |

We make use of two definitions of the effective tick size:

• $T_r = 1/\langle p_t \rangle$ is the tick-to-price ratio, where the mean trade price $\langle p_t \rangle$ is measured in units of 0.01\$, i.e. the value of one tick.

• $T_s = \#[s(i) = 1]/N_t$ is the fraction of trades for which the spread is equal to one tick; $N_t$ is the total number of trades in 42 days.
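The two measures can be sketched as follows. The function and argument names are hypothetical; trade prices are assumed to be expressed in ticks and spreads in ticks, one entry per trade.

```python
import numpy as np

def effective_tick_sizes(trade_prices_ticks, spread_ticks):
    """Return (T_r, T_s) for one stock.

    T_r: tick-to-price ratio, one over the mean trade price in ticks.
    T_s: fraction of trades for which the spread equals one tick.
    """
    prices = np.asarray(trade_prices_ticks, dtype=float)
    spreads = np.asarray(spread_ticks)
    t_r = 1.0 / prices.mean()
    t_s = float(np.mean(spreads == 1))
    return t_r, t_s
```

For example, a stock trading at about 25 \$ has a mean price of 2500 ticks, so $T_r = 4 \times 10^{-4}$, i.e. 4 basis points.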

The symbol $\langle \cdots \rangle$ denotes a temporal average over the entire length of the time series. Table 2 shows that the two measures lead to the same classification of the stocks into two groups: AAPL and AMZN are small tick size stocks, whereas MSFT and CSCO are large tick size stocks. As we will see in the following, these two classes differ in the statistical properties of their price change and return distributions, from small to large time scales of observation.

Table 2 Effective tick size for NASDAQ stocks

| Stock | Tick size | # Trades $N_t$ (a) | Duration (b) | $T_r$ (c) | $T_s$ | Class |
|---|---|---|---|---|---|---|
| AAPL | 0.01\$ | 918294 | 1.037 | 0.64 | 0.256 | SMALL |
| AMZN | 0.01\$ | 530076 | 1.797 | 1.2 | 0.243 | SMALL |
| MSFT | 0.01\$ | 532795 | 1.788 | 4.1 | 0.932 | LARGE |
| CSCO | 0.01\$ | 420963 | 2.263 | 4.9 | 0.932 | LARGE |

(a) 42 days of transactions. (b) Mean time in seconds between two trades. (c) Measured in basis points.

We start by studying how the shape of the distributions of price changes $\Delta p(i, n)$ and log-returns $r(i, n)$ depends on the tick-to-price ratio. In this qualitative discussion we refer to price changes and returns computed in trade time, because we only want to describe the differences between discrete distributions and distributions defined on a continuous support. Our findings are the same in the continuous time case. We want to show that the effect of a discrete tick size is more substantial for a high tick-to-price ratio. We show the results for AAPL and MSFT because they exemplify the two qualitatively different types of behavior.

The first important observation is the presence of price change clustering when we observe the process in trade time or in continuous time. Price change clustering is the phenomenon by which the fractions of the price grid are used unevenly. In Fig. 2 we show the histogram of price changes at an aggregation scale $n = 128$ for a large and a small tick asset. A large tick asset has a distribution of price changes in which odd values are less populated than even values. Our empirical observations indicate that, for large tick assets, the process $\Delta p(i, n)$ shows clustering for each value of $n$. For example, the clustering is still present in the process $\Delta p(i, n = 8192)$, and 8192 transactions are a significant part of the total number of transactions in one trading day; in the case of MSFT, for instance, they correspond to an average execution time of about four hours. This effect is therefore not only visible at high-frequency time scales, and we stress that it originates from the price dynamics observed at the scale of single transactions.

Price change clustering is entirely absent in the case of a small tick-to-price ratio. From the smallest time scale to the largest, i.e. from 1 to 8192 transactions, we observe a usual occupation of even and odd levels of price changes.
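One simple way to quantify this clustering, offered here as a sketch rather than as the procedure used for the figures, is the fraction of odd price changes at a given aggregation scale. The function name is hypothetical, and the midprice is assumed to be an integer array in half ticks. As a purely arithmetic remark, when the spread equals one tick before both trades, $a = b + 1$ and the midprice $b + a = 2b + 1$ is odd in half ticks, so the price change between the two is even.

```python
import numpy as np

def odd_fraction(p_half_ticks, n):
    """Fraction of n-trade price changes that are odd (in half ticks).

    Values well below 0.5 indicate clustering on even price changes;
    values near 0.5 indicate a roughly even occupation of both parities.
    """
    p = np.asarray(p_half_ticks, dtype=np.int64)
    dp = p[n:] - p[:-n]           # overlapping n-trade price changes
    return float(np.mean(dp % 2 != 0))
```

Evaluating this quantity for $n$ between 1 and 8192 gives a single number per time scale that can be compared across stocks.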

Fig. 2 Histograms of price changes $\Delta p(i, n = 128)$ and log-returns $r(i, n = 128)$ for AAPL and MSFT stocks. Price changes for MSFT are clustered on even values. The effect of clustering for log-returns is clearly visible

By a usual occupation we mean the absence of systematic differences in the populations of price changes and the presence of a smooth discrete distribution, such as a binomial or a Poisson distribution. The differences in the shape of the price change distributions between small and large tick size assets are clear in the right column of Fig. 2. The observation of clustering at a daily time scale is already known in the literature (Onnela et al. 2009; Munnix and Schafer 2010; Harris 1991), but it has not been clearly connected to a measure of the effective tick size. Our observations, instead, connect this property of prices directly to the effective tick size.

We therefore conclude that the distributions of price changes do not have a universal shape, but depend on the effective tick size, measured by $T_r$ or $T_s$. There is a growing consensus that the distributional properties of returns are quite universal, i.e. the shape of the distribution is the same for all assets, especially for relatively large time scales (Cont 2001). How can we reconcile the empirical observation of price change clustering with a universal shape of return distributions? The presence of a discrete tick size has an effect on the distribution of returns. Our empirical analysis shows that return distributions are affected by discretization and by returns clustering. For small tick size assets only the discretization effect is present, while for large tick size assets there is the additional effect of returns clustering. The discretization effect weakens as the time scale $n$ increases, whereas returns clustering weakens as the tick-to-price ratio decreases. We stress that, for a high tick-to-price ratio, discretization coupled with returns clustering disappears only at time scales $n$ larger than those required for a low tick-to-price ratio.

In a small tick size asset like AAPL, discretization effects on returns are visible at the time scale of one transaction, but they disappear when $n \gtrsim 128$. For an asset like MSFT, instead, the effects of discretization and clustering are still present at the time scale $n \gtrsim 128$, as we can see in Fig. 2. Our hypothesis is that there should be some time scale, in trade or continuous time, at which the discretization and clustering of returns disappear and a universal shape for the distributions of price returns emerges. This means that we should study the scaling properties of the distributions of returns. We carry out this analysis by means of the empirical hypercumulants $A_q$ of the distributions and of the tail exponent $\alpha$ describing the asymptotic power-law behavior of the distributions.

In order to compare the behavior of the distributions at different time scales, i.e. $n$ in trade time or $\Delta t$ in continuous time, we define a normalized return $g$:

$$g(i, n) = \frac{r(i, n) - \langle r(i, n) \rangle}{\sqrt{\langle r^2(i, n) \rangle - \langle r(i, n) \rangle^2}}.$$

The definition for the continuous case is similar. We analyze the scaling through the moments defined by a fractional index $q$, i.e. the hypercumulants (Bouchaud and Potters 2009; Gopikrishnan 1999; Plerou and Stanley 2007), of the distributions of normalized returns $g(i, n)$:

$$A_q(n) = \langle |g(i, n)|^q \rangle,$$

so that this quantity is defined as a function of the time scale $n$.
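The normalization and the hypercumulants can be sketched as below. The function names and the input `log_p` (an array of log midprices in trade time) are hypothetical. The last function gives the value of $\langle |x|^q \rangle$ for a standard Gaussian, $2^{q/2}\,\Gamma((q+1)/2)/\sqrt{\pi}$, which is a convenient reference when checking convergence toward Gaussian behavior.

```python
import math
import numpy as np

def normalized_returns(log_p, n):
    """Normalized n-trade returns g: zero mean and unit standard deviation."""
    log_p = np.asarray(log_p, dtype=float)
    r = log_p[n:] - log_p[:-n]
    return (r - r.mean()) / r.std()

def hypercumulant(g, q):
    """Empirical absolute moment of order q of the normalized returns."""
    return float(np.mean(np.abs(g) ** q))

def gaussian_abs_moment(q):
    """E|x|^q for a standard Gaussian: 2^(q/2) Gamma((q+1)/2) / sqrt(pi)."""
    return 2.0 ** (q / 2.0) * math.gamma((q + 1.0) / 2.0) / math.sqrt(math.pi)
```

By construction the moment of order $q = 2$ of $g$ equals one, so differences between distributions show up only at other values of $q$.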

We also estimate the tail exponent $\alpha$ of the normalized returns $g$. This exponent describes the asymptotic power-law behavior of the probability density function in the following way:

$$P(x) \sim \frac{1}{|x|^{1+\alpha}}, \qquad |x| \to \infty, \tag{3}$$

where $\alpha > 0$. This estimate gives information on the existence of the moments of a given distribution. A necessary condition for the $q$-th moment to exist is that the probability density $P(x)$ decays faster than $1/|x|^{q+1}$ as $|x|$ goes to infinity, so that all the moments with $q > \alpha$ are infinite. The asymptotic behavior of the density $P(x)$ is also connected to the properties of random variables under summation. Consider the sum $S_m = \sum_{i=1}^{m} x_i$ of independent identically distributed (i.i.d.) random variables $x_i$. If the $x_i$ have finite second moments, the central limit theorem holds and $S_m$ is distributed as a Gaussian in the limit $m \to \infty$. If the random variables $x_i$ are characterized by a distribution with asymptotic power-law behavior like that in Eq. (3) with $0 < \alpha < 2$, then $S_m$ converges to a Lévy stable stochastic process of exponent $0 < \alpha < 2$ in the limit $m \to \infty$.

A common problem when studying a distribution that decays as a power law is how to obtain an accurate estimate of the exponent characterizing its asymptotic behavior. We use a method developed by Clauset et al. (2009) to estimate the exponent $\alpha$ and the value $x_{\min}$ of $x$ beyond which the power-law behavior holds. Their approach combines maximum-likelihood fitting methods with goodness-of-fit tests based on the Kolmogorov-Smirnov statistic and likelihood ratios. They studied the following probability density function defined on $x \geq x_{\min}$:

$$p(x) = \frac{\alpha - 1}{x_{\min}} \left( \frac{x}{x_{\min}} \right)^{-\alpha}, \tag{4}$$

where $x_{\min}$ is for us the lower bound of the power-law behavior and $\alpha > 1$. Here we should pay attention to the range of values of $\alpha$, because the power-law exponent of Eq. (4) corresponds to $\alpha + 1$ of Eq. (3). They use the well-known Hill maximum likelihood estimator:

$$\hat{\alpha} = 1 + N \left[ \sum_{i=1}^{N} \ln \frac{x_i}{x_{\min}} \right]^{-1},$$

where $x_1, \dots, x_N$ are the observations with $x_i \geq x_{\min}$.
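The following sketch illustrates the idea of this procedure; it is not the reference implementation of Clauset et al. (2009). The maximum-likelihood exponent is computed for each candidate $x_{\min}$, and the candidate that minimizes the Kolmogorov-Smirnov distance between the empirical tail and the fitted power law is kept. Function names are hypothetical, and restricting the candidates to a coarse grid of quantiles is a simplification assumed here for speed.

```python
import numpy as np

def hill_alpha(x, xmin):
    """Maximum-likelihood (Hill) estimate of the density exponent for x >= xmin."""
    tail = x[x >= xmin]
    return 1.0 + len(tail) / np.sum(np.log(tail / xmin))

def ks_distance(x, xmin, alpha):
    """Kolmogorov-Smirnov distance between the empirical tail and the fitted power law."""
    tail = np.sort(x[x >= xmin])
    n = len(tail)
    model_cdf = 1.0 - (tail / xmin) ** (1.0 - alpha)
    upper = np.arange(1, n + 1) / n   # empirical CDF just after each point
    lower = np.arange(0, n) / n       # empirical CDF just before each point
    return max(np.max(np.abs(upper - model_cdf)),
               np.max(np.abs(lower - model_cdf)))

def fit_power_law(x, quantiles=np.linspace(0.0, 0.9, 46)):
    """Return (alpha, xmin, ks) for the candidate xmin with the smallest KS distance.

    Candidates are taken on a quantile grid (an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    x = x[x > 0]
    best = None
    for xmin in np.unique(np.quantile(x, quantiles)):
        alpha = hill_alpha(x, xmin)
        d = ks_distance(x, xmin, alpha)
        if best is None or d < best[2]:
            best = (alpha, xmin, d)
    return best
```

The value returned by `hill_alpha` is the exponent of the density; subtracting one gives the tail exponent in the convention of Eq. (3).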

Fig. 3 Linear-log plot of the scaling of the hypercumulants $A_q$ of normalized returns $g$ for the stocks AAPL and MSFT, i.e. a small and a large tick size asset, respectively. On the left we have the case of normalized returns $g(i, n)$ in trade time and on the right the case of normalized returns $g(t, \Delta t)$ in continuous time. The different line styles indicate the different time scales $\Delta t$. The vertical lines indicate the values of $q$, i.e. $q = 1, 3$, for which we illustrate the time scale dependence $A_q(n)$ or $A_q(\Delta t)$ in Fig. 4

time scales but that they seem to converge at higher time scales. In these figures we also compute the corresponding values of $A_q$ for the standard Gaussian distribution, in order to monitor the convergence of the distributions of price returns as a function of the aggregation time scale $n$ or $\Delta t$. Figure 3 shows the dependence of the hypercumulants on the power index $q$ for some fixed values of the time scale. For a large effective tick asset, the convergence of the normalized returns $g$ to Gaussian behavior is slower than for a small effective tick asset. A possible explanation is the presence of returns clustering for large tick size assets, i.e. a distortion of the distribution that is almost absent for small tick size assets. We make this hypothesis on the basis of the computed histograms of the return distributions. There we observe that the discretization effects start to disappear after $n = 32$ for a small tick size stock, whereas this threshold is higher for large tick size stocks (where clustering is also present), i.e. we find a value in the range $n = 512$ to $1024$. If we observe the returns process in continuous time, the discretization effect disappears after a time scale of around $\Delta t \simeq 30$ s for small tick stocks, whereas for a large tick stock it starts to disappear after around 16 min.

Fig. 4 Log-linear plot of the scaling of the hypercumulants $A_{q=1,3}(n)$ and $A_{q=1,3}(\Delta t)$ for the stocks AAPL, AMZN, MSFT and CSCO. We observe a different speed of convergence to Gaussian behavior between small and large tick size stocks. There is also a different behavior if we observe the price process in trade time (left panels) or continuous time (right panels)

The behavior of $A_q$ as a function of trade time or of continuous time is shown in Fig. 4 for two values of $q$. We observe that, independently of the effective tick size, the convergence toward Gaussian behavior is slower in continuous time than in trade time. This observation may be explained by the subordination hypothesis. The original idea dates back to a paper by Mandelbrot and Taylor (1967) and was later developed by Clark (1973a,b). Mandelbrot and Taylor proposed that prices could be modeled as a subordinated random process $Y(t) = X(\tau(t))$, where $Y$ is the random process generating returns, $X$ is a Brownian motion and $\tau(t)$ is a stochastic time clock whose increments are i.i.d. and uncorrelated with the process $X$. Clark hypothesized that the time clock $\tau(t)$ is the cumulative trading volume in time $t$, but more recent works indicated that the number of transactions, i.e. trade time, is more important than their size (Ane and Geman 2000). Gillemot et al. (2006) and La Spada et al. (2011) showed that the role of the subordination hypothesis in the fat tails of returns is strongly dependent on the tick size. From our point of view it is important to stress that the stochastic clock $\tau(t)$ could modify the moments of the distributions of the increments $\Delta X$ and $\Delta Y$. For example, Clark (1973a) showed that if $X$ is a Gaussian stochastic process with stationary independent increments, and $\tau(t)$ has stationary independent positive increments with finite second moment which are independent of $X$, then the kurtosis of the increments of $X(\tau(t))$ is an increasing function of the variance of the increments of $\tau(t)$. We can observe such an effect in Table 1 of Sect. 3.1. Our hypothesis is that a similar effect could explain the distortion of the value of the hypercumulants at the same aggregation time scale, i.e. the distribution of the sum of $n = 100$ identically distributed one-transaction returns is not the same as the distribution obtained by summing 100 identically distributed 1 s returns.
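The kurtosis effect can be illustrated with a small simulation, given here as a sketch under simplifying assumptions: $X$ is a driftless standard Brownian motion, and the clock increments are either constant or equal to one plus a Poisson variable, a choice made only for illustration. In this driftless case the increment of $X(\tau(t))$ is a variance mixture of Gaussians, and its excess kurtosis is $3\,\mathrm{Var}(\Delta\tau)/\langle \Delta\tau \rangle^2$.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis (zero for a Gaussian)."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return float(np.mean(z ** 4) / np.mean(z ** 2) ** 2 - 3.0)

def subordinated_increments(clock_increments, rng):
    """Increments of X(tau(t)) for a driftless standard Brownian motion X.

    Conditional on a clock increment d_tau, the increment is Gaussian with
    variance d_tau, so we draw sqrt(d_tau) * Z with Z standard normal.
    """
    d_tau = np.asarray(clock_increments, dtype=float)
    return np.sqrt(d_tau) * rng.standard_normal(d_tau.shape)

rng = np.random.default_rng(0)
size = 200_000
fixed_clock = np.full(size, 5.0)               # mean 5, variance 0
random_clock = 1 + rng.poisson(4.0, size)      # mean 5, variance 4 (illustrative)

k_fixed = excess_kurtosis(subordinated_increments(fixed_clock, rng))
k_random = excess_kurtosis(subordinated_increments(random_clock, rng))
# Mixture formula for the random clock: 3 * 4 / 5**2 = 0.48
print(k_fixed, k_random)
```

With the same mean clock increment, only the clock with non-zero variance produces excess kurtosis, in line with the qualitative statement attributed to Clark (1973a) above.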

Let us now discuss the results of the estimation of the tail exponent $\alpha$ of the distribution of returns at different aggregation time scales. Figure 5 shows the estimated values of $\alpha$ with error bars for aggregation in transaction time (left panel) and in real time (right panel). For small values of aggregation (in real or transaction time) a clear difference appears between large and small tick size assets. The former display a large estimated value of $\alpha$, while for the latter the exponent is already quite small. When the aggregation scale increases, the estimated tail exponent for large tick size assets rapidly decays, and around $n \simeq 30$ or $\Delta t \simeq 30$ s their behavior becomes indistinguishable from that of small tick assets. This suggests that the same underlying latent price process characterizes large and small tick size assets, but for the former class the large tick size hides the process, at least up to the crossover time scale.

Fig. 5 Log-linear plot of the scaling of the asymptotic tail exponent $\alpha$, defined in Eq. (3), for the distributions of normalized returns $g$ as a function of trade time $n$ and continuous time $\Delta t$ for the stocks AAPL, AMZN, MSFT and CSCO. The straight dashed line $\alpha = 2$ is the upper bound of Lévy behavior, i.e. $0 < \alpha < 2$. The horizontal dotted line is the upper bound of the index for which we computed $A_q$, $q \in [1, 4]$

After this crossover, the estimated exponent decreases monotonically (at least up to the largest investigated time scale). It is worth noticing that at the largest aggregation time scales the estimated exponent, both in real and in transaction time, is smaller than 2, indicating a Lévy-like regime. Before commenting on this result, it is important to stress that the values shown in Fig. 5 are estimated exponents, i.e. there is no guarantee that the distributions have a true power-law tail. The algorithm of Clauset et al. (2009) gives the best estimate of $\alpha$ and $x_{\min}$ assuming that the tail is a power law. Moreover, as noted in Clauset et al. (2009), the method can give incorrect estimates when the sample is small. This is the case for the last two points of Fig. 5, where the sample size is around 100 data points. These two considerations highlight the problems that can arise when estimating the tail exponent of a distribution that has a discrete (and finite) support. In this case the tail is obviously not a power law, but the method returns an estimated value in any case. In fact, numerical simulations of i.i.d. models with finite support and thus finite variance (the i.i.d. model discussed in the next section) display a similar behavior of the estimated tail exponent, including a value smaller than 2 for large aggregation. This is clearly a misestimation.
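The point that the estimator returns a number even when no power-law tail exists can be illustrated with a toy example. This is a sketch only: the bounded three-valued step distribution, the aggregation scale of 64 and the fixed 90% quantile threshold are all hypothetical choices, and this is not the i.i.d. model of the next section.

```python
import numpy as np

def hill_tail_exponent(x, xmin):
    """Hill estimate of the tail exponent of Eq. (3) for observations x >= xmin."""
    tail = x[x >= xmin]
    return len(tail) / np.sum(np.log(tail / xmin))

rng = np.random.default_rng(1)
# Hypothetical i.i.d. price changes with bounded, discrete support
steps = rng.choice([-2, 0, 2], size=(20_000, 64), p=[0.25, 0.5, 0.25])
agg = steps.sum(axis=1)                        # aggregated change, also bounded
g = np.abs((agg - agg.mean()) / agg.std())     # absolute normalized changes
xmin = np.quantile(g, 0.9)                     # arbitrary threshold, not a KS-selected one
alpha_hat = hill_tail_exponent(g, xmin)
print(alpha_hat)
```

The aggregated variable cannot exceed 128 in absolute value, so its distribution has no power-law tail, yet `alpha_hat` is a finite positive number. The estimate alone therefore says nothing about whether a power law is an adequate description.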

In conclusion, the analysis of the tail exponent shows two regimes: one in which large and small tick size assets behave markedly differently, and one in which the tick-to-price ratio does not play any role. Moreover, numerical simulations and empirical analyses suggest great caution when estimating the tail exponent of a distribution that is either defined on a discrete support (as for price changes) or has a hidden discretization (as for log-returns). Arbitrarily small values of the exponent could be (mis)estimated as a result of an improper use of statistical methods.