When dealing with the currency market, its movements can be separated into high-frequency variations [3] and slower movements responsible for the trends [2]. This paper deals with the latter, exploring its complexity through the concept of entropy as defined in Information Theory and in Kolmogorov complexity.
In the field of Information Theory, entropy is a quantity that measures the uncertainty of a random variable. In addition, this paper follows the approach of Chaitin [4] and Kolmogorov [5] by defining complexity as in algorithmic information theory, which takes into account the order of the points in a sequence. In this view, a string is random if its Kolmogorov complexity is at least equal to its length.
The benefit of the connection between information content and randomness is that it provides a way to quantify the complexity of a dataset without relying on models or hypotheses about the process generating the data. By comparing the entropy of our system with the maximum possible entropy rate, we can determine the degree of randomness of a series; a completely random process is defined as one lacking pattern repetition.
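For concreteness, recall the standard definitions (the notation here is ours, added only as a reminder): for a discrete source with symbol probabilities $p_i$, the Shannon entropy and its maximum are
$$H = -\sum_{i=1}^{n} p_i \log p_i, \qquad H_{\max} = \log n,$$
where the maximum is attained by the uniform distribution $p_i = 1/n$. The closer $H$ is to $H_{\max}$, the closer the series is to total randomness.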
The use of the entropy rate to study the complexity of a time series is not limited to stochastic processes. Sinai [6] introduced the concept of entropy to describe the structural similarity between different measure-preserving dynamical systems, providing a generalization of Shannon entropy for dynamical systems known as Kolmogorov–Sinai (KS) entropy. Unfortunately, KS entropy is sometimes undefined for the limited and noisy measurements of a signal available in a data series.
To overcome that limitation, Grassberger and Procaccia [7] used the Rényi entropy to define the correlation integral, which in turn was used by Eckmann and Ruelle [8] to define the φ functions as conditional probabilities. This ER entropy is an exact estimate of the entropy of the system. Building upon those φ functions, Pincus [9] described the methodology of Approximate Entropy (ApEn), useful for limited and noisy data, which provides a hierarchy of randomness based on the different patterns and their repetitions.
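As a reminder of the standard construction that the rest of the paper builds on (we follow the usual formulation in the literature [9,10]; the specific notation is ours), for a series $u(1), \ldots, u(N)$ embedded in vectors $x(i) = [u(i), \ldots, u(i+m-1)]$, the φ functions and ApEn are written as
$$C_i^m(r) = \frac{\#\{\, j \le N-m+1 : d[x(i), x(j)] \le r \,\}}{N-m+1}, \qquad \Phi^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln C_i^m(r),$$
$$\mathrm{ApEn}(m, r, N) = \Phi^m(r) - \Phi^{m+1}(r),$$
where $d$ denotes the maximum componentwise distance between vectors.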
ApEn measures the logarithmic probability that nearby pattern runs remain close in the next incremental comparison: low ApEn values indicate that the system is persistent, repetitive and predictable, with apparent patterns that repeat themselves throughout the series, while high values indicate complexity in the sense of independence between the data points and a low number of repeated patterns. Readers are encouraged to consult a recent comprehensive tutorial on these algorithms [10].
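A minimal NumPy sketch of this calculation follows; the function name and the brute-force pairwise distance computation are our own illustrative choices, not the authors' code.

```python
import numpy as np

def apen(u, m, r):
    """Approximate Entropy of the 1-D series u for embedding dimension m
    and tolerance r (given in the same units as u)."""
    u = np.asarray(u, dtype=float)
    N = len(u)

    def phi(m):
        # Embedded vectors x(i) = [u(i), ..., u(i+m-1)]
        x = np.array([u[i:i + m] for i in range(N - m + 1)])
        # Chebyshev (max-norm) distances between all pairs of vectors
        d = np.max(np.abs(x[:, None, :] - x[None, :, :]), axis=2)
        # C_i^m(r): fraction of vectors within tolerance r of x(i)
        C = np.mean(d <= r, axis=1)
        return np.mean(np.log(C))

    return phi(m) - phi(m + 1)
```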
To use Approximate Entropy, it is necessary to specify two parameters: the embedding dimension (m) and the tolerance of the measure (r), determined as a percentage of the standard deviation. Once the calculations have been performed, the result of the algorithm is a positive real number, with higher values indicating more randomness. However, those values depend on the characteristics of the dataset, such as the influence of past prices on future prices or the volatility of the prices.
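As an illustration of these parameter choices, reusing the apen sketch above, a common choice in the ApEn literature is m = 2 with r set to 20% of the standard deviation; the toy random-walk series below is ours and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
prices = np.cumsum(rng.normal(size=500))   # toy random-walk "price" series
m, r = 2, 0.2 * np.std(prices)             # common choice: r as 20% of the SD
print(apen(prices, m, r))                  # higher values -> more irregularity
```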
In order to obtain a measure of randomness suitable for comparisons between evolving datasets, the Pincus Index (PI) was introduced as a measure of the distance between a dataset and the maximum possible randomness of that system [11]. A value of PI equal to zero implies a totally ordered and completely predictable system, whereas a value equal to or greater than one implies total randomness and unpredictability. The added benefit of the Pincus Index is that, unlike ApEn, it is suitable for comparisons between different markets. This paper completes the development of that index by introducing different kinds of multidimensionality into the measure. Knowledge of the PI is therefore useful to fully understand the concepts presented here and how the several levels of complexity of this measure are captured.
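A sketch of one way such an index can be obtained, consistent with the description above, compares the maximum of ApEn over a grid of tolerances with the same quantity computed on shuffled copies of the data; the tolerance grid, the number of shuffles, and the simple average over surrogates are our own illustrative assumptions, not the exact recipe of [11].

```python
import numpy as np

def max_apen(u, m, r_grid):
    """MaxApEn: maximum of ApEn over a grid of tolerance values."""
    return max(apen(u, m, r) for r in r_grid)

def pincus_index(u, m=2, n_shuffles=20, seed=0):
    """PI as the ratio of the series' MaxApEn to the average MaxApEn
    of shuffled (maximally random) versions of the same data."""
    rng = np.random.default_rng(seed)
    r_grid = np.std(u) * np.linspace(0.05, 0.5, 10)   # illustrative tolerance grid
    numerator = max_apen(u, m, r_grid)
    shuffled = [rng.permutation(u) for _ in range(n_shuffles)]
    denominator = np.mean([max_apen(s, m, r_grid) for s in shuffled])
    return numerator / denominator
```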
The Pincus Index was designed [11] to be independent of the parameter r by choosing the maximum value of Approximate Entropy (MaxApEn), but the index is still dependent on the selection of the embedding dimension (m). This parameter is related to the memory of the system and accounts for the length of the patterns compared in the sequence. Techniques to determine the optimum value of the embedding dimension include the use of mutual information and the false nearest neighbors method [12–14], but since different markets may have different embedding dimensions, comparisons with a fixed m could be biased. To account for that possibility, we follow Bolea and coauthors [15] in the definition of a Multidimensional index. Since that index was based on MaxApEn, its extrapolation to a Multidimensional Pincus Index is straightforward and provides a parameter-free index that allows for comparisons between evolving systems.
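Continuing the sketch above, one simple way to remove the dependence on m in this spirit is to aggregate the index over a range of embedding dimensions; the range of m and the plain average used here are illustrative assumptions and not the exact definition of [15].

```python
def multidimensional_pi(u, m_values=(1, 2, 3, 4, 5), **kwargs):
    """Aggregate the Pincus Index over several embedding dimensions so that
    no single m has to be fixed a priori (averaging is an illustrative choice)."""
    return float(np.mean([pincus_index(u, m=m, **kwargs) for m in m_values]))
```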
Besides multidimensionality in the embedding dimension, dynamical systems may be composed of processes at different frequencies with correlations at multiple time scales. Therefore, in the characterization of complexity, comparisons across different frequencies may lead to incorrect conclusions. Costa and coauthors [16] proposed a multiscale procedure to capture those correlations, showing its efficiency in distinguishing complexities in different dynamical regimes. To describe the complexity of a time series at different levels, Costa and coauthors [17] generalized the multiscale procedure to consider the complexity of higher statistical moments of the time series. Here, we extend that methodology to create a new Multiscale Pincus Index, showing how it is useful to correctly quantify the complexity of trading in different timeframes and different statistical moments.
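A minimal sketch of this idea, assuming the standard coarse-graining of [16] (non-overlapping window averages) and, for a higher statistical moment, window variances in the spirit of [17]; the set of scales and the combination with the PI sketch above are our own illustrative choices.

```python
def coarse_grain(u, scale, moment="mean"):
    """Coarse-grain u over non-overlapping windows of length `scale`.
    moment="mean" is the standard multiscale procedure; moment="var"
    illustrates the generalization to a higher statistical moment."""
    u = np.asarray(u, dtype=float)
    n = (len(u) // scale) * scale
    windows = u[:n].reshape(-1, scale)
    return windows.mean(axis=1) if moment == "mean" else windows.var(axis=1)

def multiscale_pi(u, scales=(1, 2, 4, 8), moment="mean", **kwargs):
    """Pincus Index of the coarse-grained series at each time scale
    (for moment="var", scales greater than 1 should be used)."""
    return {s: pincus_index(coarse_grain(u, s, moment), **kwargs) for s in scales}
```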
2. Methods and results
2.1. On the calculation of the Pincus Index
The Pincus Index (PI) captures the distance from a situation of total randomness for a given dataset, measured against
shuffled versions of the same data. To better quantify complexity and provide an index that is independent of the tolerance