Time series analysis is mainly carried out from these three aspects.
Time series analysis is a distinctive research field. It began in the financial industry, with applications such as stock market trend prediction and investment risk assessment, and later spread into other fields: it now has a place in market forecasting, dynamic pricing, electricity consumption forecasting, biomedicine and so on.
A mathematical definition is generally a short, rigorous and abstract description of a concept. A time series is a set of random variables ordered by time, $\{X_t, t = 1, 2, \dots\}$; it represents the evolution of a random event over time and is abbreviated as $\{X_t\}$.
In time series prediction, every data point, that is, every numerical value we see, is actually the observation of a random variable, and that random variable obeys some distribution. The values we see, also called observations, are a realization of the random time series, in other words one instance of it. All the historical data we see form one sample of that random time series.
What we are really doing through analysis is grasping the essence of this random time series. Since every point obeys the underlying distribution, once we infer the properties of the random time series from the data, we have captured the behavior of the random variables. This is essentially a process of mathematical statistics, somewhat similar to a generative model in machine learning.
The overall scheme of a time series task has been briefly described above. With that overall plan, we follow the steps one by one, fill in the details of the requirement, and complete the time series prediction.
The key step of the task is time series analysis. So what is time series analysis? In a word, time series analysis is the statistical analysis of time series.
So what are the specific analysis methods? There are mainly two kinds: descriptive time series analysis and statistical time series analysis.
There are two definitions of stationarity in time series analysis theory.
So-called strict stationarity means that all statistical properties of the series do not change with time; this is both the key property and the definition of strict stationarity. You can try to describe this concept in mathematical language yourself.
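As a sketch of that mathematical description, using the notation $\{X_t\}$ introduced above: the series is strictly stationary if every finite-dimensional joint distribution is invariant under a shift in time,
$$F_{X_{t_1}, \dots, X_{t_k}}(x_1, \dots, x_k) = F_{X_{t_1+s}, \dots, X_{t_k+s}}(x_1, \dots, x_k)$$
for any times $t_1 < t_2 < \dots < t_k$ and any lag $s$.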
Weak stationarity, also known as covariance stationarity, second-order stationarity or wide-sense stationarity, requires that the first and second moments of the time series do not change with time.
Judging the stationarity of a time series helps with the later choice of model, so stationarity is an important property of time series and can be used to classify them.
Let us talk about the relationship between strict stationarity and weak stationarity. A strictly stationary series is also weakly stationary provided its second moments exist, but strict stationarity does not imply weak stationarity in general. Why not? Because a series of i.i.d. Cauchy random variables is a strictly stationary time series, yet the Cauchy distribution has no second moment, and not even a first moment, so that series is strictly stationary but does not satisfy weak stationarity.
On the other hand, when the time series is Gaussian, all statistical properties of the normal distribution are determined by the first two moments, so a weakly stationary Gaussian series is also strictly stationary.
Because in practice most time series analysis works with weak stationarity, today we will also focus on weak stationarity.
A time series $\{X_t\}$ is weakly stationary if its second moment is finite, $E[X_t^2] < \infty$, and the following hold. (1) As time changes, the mean of the series stays a constant: $E[X_t] = \mu$. (2) The variance, like the mean, is a constant, $\mathrm{Var}(X_t) = \sigma^2$; the variance is a second moment. (3) The covariance between points at different times, which is also a second-order moment, depends only on the time interval: for any $t$ and lag $s$, $\mathrm{Cov}(X_t, X_{t+s}) = \gamma(s)$. In other words, we are looking at the relationship between $X_t$ and $X_{t+s}$, where $s$ denotes the time interval; the covariance is the same whenever the interval $s$ is the same, and it changes only when $s$ changes.
So the autocovariance of a weakly stationary time series is related only to the time delay $s$ and has nothing to do with the starting time $t$. The autocovariance can therefore be abbreviated as a univariate function of the lag, $\gamma(s)$, and $\gamma(0)$ is exactly the variance. The autocorrelation coefficient of a stationary time series can likewise be written simply as a univariate function of the lag, $\rho(s) = \gamma(s)/\gamma(0)$.
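As a small illustration of these quantities (a sketch added here, assuming numpy is available; the white-noise data is simulated), we can check that the sample mean, variance and autocovariance of a weakly stationary series behave as the definition says:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=5000)   # white noise: weakly stationary

def sample_autocovariance(x, s):
    """Sample autocovariance gamma(s) = Cov(X_t, X_{t+s})."""
    x = np.asarray(x, dtype=float)
    xbar = x.mean()
    n = len(x)
    return np.sum((x[: n - s] - xbar) * (x[s:] - xbar)) / n

print("mean :", x.mean())                        # close to a constant (0)
print("var  :", sample_autocovariance(x, 0))     # gamma(0) = variance (about 1)
for s in (1, 2, 3):
    g = sample_autocovariance(x, s)
    print(f"gamma({s}) = {g:.4f}, rho({s}) = {g / sample_autocovariance(x, 0):.4f}")
```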
If the time series produced by a model is stationary, then we call the model stationary; otherwise it is non-stationary.
Here is a remark worth keeping in mind: AR, MA and ARMA models are the models commonly used to fit stationary series, but not every AR or ARMA model is stationary; whether it is depends on its parameters.
OK, let's go back to the linear difference equation. We will focus on two ways of writing the difference equation. Before that, let's talk about what the lag operator is.
Assume the terms of a known time series are related to their past values as follows: the lag operator shifts the series back one step, $B y_t = y_{t-1}$, and more generally $B^k y_t = y_{t-k}$. (Some texts write the lag operator as $L$; in programs, and here, we use $B$ to denote the lag, and so on for higher powers.) Polynomials can then be expressed in terms of the lag operator, for example $\Phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \dots - \phi_p B^p$. The typical $p$-th order linear difference equation is
$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + w_t,$$
which in lag-operator form reads $\Phi(B)\, y_t = w_t$.
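A minimal sketch of this in code (the helper name `lag`, the coefficients and the simulated data are all made up for illustration; it assumes numpy): applying $B^k$ is just shifting the series back $k$ steps, and the difference equation is a recursion on lagged values.

```python
import numpy as np

def lag(y, k=1):
    """Apply the lag operator B^k: (B^k y)_t = y_{t-k}; undefined entries are NaN."""
    y = np.asarray(y, dtype=float)
    if k == 0:
        return y
    out = np.full_like(y, np.nan)
    out[k:] = y[:-k]
    return out

# A p-th order linear difference equation y_t = phi_1 y_{t-1} + ... + phi_p y_{t-p} + w_t
phi = [0.5, -0.2]                    # p = 2, illustrative coefficients
rng = np.random.default_rng(1)
w = rng.normal(size=200)
y = np.zeros_like(w)
for t in range(len(w)):
    y[t] = sum(phi[j] * y[t - j - 1] for j in range(len(phi)) if t - j - 1 >= 0) + w[t]

# Check that Phi(B) y_t = w_t, i.e. y_t - 0.5*y_{t-1} + 0.2*y_{t-2} equals w_t
residual = y - 0.5 * lag(y, 1) + 0.2 * lag(y, 2)
print(np.allclose(residual[2:], w[2:]))   # True
```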
Today I will mainly talk about some derivations in time series. I had read some material before, and the derivations behind the commonly used AR and MA models are deep and hard to understand. Recently I read some more material and have summarized it properly.
Although a time series looks simple, it still takes some effort to really understand it. It can be broken down into the following form, an additive model
$$X_t = T_t + S_t + R_t,$$
where $T_t$ is the trend component, $S_t$ the seasonal component and $R_t$ the residual. The trend and seasonal components have regular patterns to follow, so we can fit them with a model; that is what the model needs to learn.
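As an illustrative sketch of this additive decomposition (the data here is synthetic, and it assumes pandas and statsmodels are available), the trend and seasonal components can be estimated directly:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic daily series: linear trend + weekly seasonality + noise
idx = pd.date_range("2023-01-01", periods=140, freq="D")
rng = np.random.default_rng(2)
trend = 0.1 * np.arange(len(idx))
seasonal = np.tile([0, 0, 0, 0, 1, 3, 3], len(idx) // 7)   # weekend peak
series = pd.Series(trend + seasonal + rng.normal(scale=0.3, size=len(idx)), index=idx)

# Additive model: series = trend + seasonal + residual
result = seasonal_decompose(series, model="additive", period=7)
print(result.trend.dropna().head())
print(result.seasonal.head(7))     # the estimated weekly pattern
```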
GDP is an example of a trend component: it grows roughly exponentially with time.
The flow of people in a supermarket is an example of seasonality: weekly traffic is higher on weekends than from Monday to Friday, and there are more people in the afternoon than in the morning.
As we discussed before, a time series is a random process, that is, a joint distribution. Studying the full joint distribution directly is usually a rather difficult problem.
This is similar to one of the earliest approaches in NLP: in a statistical language model, the joint probability is expressed with the chain rule. Anyone who has learned conditional probability knows the idea: the random variable at each moment of the time series is conditioned on the random variables at all the time points before it, and multiplying these conditional probabilities together gives the joint probability. Computing this joint probability requires a considerable amount of calculation.
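Written out, the chain-rule factorization of the joint probability is
$$P(X_1, X_2, \dots, X_T) = P(X_1)\prod_{t=2}^{T} P(X_t \mid X_1, \dots, X_{t-1}),$$
and each factor conditions on the entire past, which is what makes the computation heavy.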
For the coefficient $a$ in the model $y_t = a\,y_{t-1} + w_t$: when $|a| < 1$ the model is stable, otherwise it is unstable. Why is there such a conclusion? We can build intuition from the picture of a ball that bounces lower and lower each time it lands, and make it precise by solving the difference equation.
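A quick sketch (assuming numpy; the coefficients are illustrative) that simulates the recursion $y_t = a\,y_{t-1} + w_t$ for two values of $a$ shows the difference in behavior:

```python
import numpy as np

def simulate_ar1(a, n=100, seed=0):
    """Simulate y_t = a * y_{t-1} + w_t with standard normal noise w_t."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = a * y[t - 1] + w[t]
    return y

stable = simulate_ar1(a=0.5)     # |a| < 1: fluctuates around 0 with bounded spread
explosive = simulate_ar1(a=1.1)  # |a| > 1: magnitude blows up as t grows
print("max |y| with a=0.5 :", np.max(np.abs(stable)))
print("max |y| with a=1.1 :", np.max(np.abs(explosive)))
```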
In fact our model is a non-homogeneous difference equation, $y_t = a\,y_{t-1} + w_t$. The following is the general solution of this difference equation. In lag-operator form (again, the lag operator $L$ is denoted by $B$ here) it reads
$$(1 - aB)\,y_t = w_t .$$
Next we solve it by moving the characteristic factor on the left-hand side over to the other side:
$$y_t = \frac{1}{1 - aB}\,w_t .$$
The form $1/(1-aB)$, which can stand for a sum over infinitely many terms, should look familiar: it is just like a geometric series, so we replace it by the sum of the geometric series,
$$y_t = \sum_{j=0}^{\infty} a^j B^j\, w_t = \sum_{j=0}^{\infty} a^j\, w_{t-j},$$
which converges when $|a| < 1$.
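We can also verify the geometric-series solution numerically (a sketch, assuming numpy): when $|a| < 1$, the sum $\sum_j a^j w_{t-j}$ reproduces the value obtained from the recursion.

```python
import numpy as np

a, n = 0.5, 500
rng = np.random.default_rng(3)
w = rng.normal(size=n)

# Simulate the AR(1) recursion directly, starting from y_0 = w_0
y = np.zeros(n)
y[0] = w[0]
for t in range(1, n):
    y[t] = a * y[t - 1] + w[t]

# Reconstruct y_t from the geometric-series form y_t = sum_j a^j w_{t-j}
t = n - 1
ma_form = sum(a ** j * w[t - j] for j in range(t + 1))
print(np.isclose(y[t], ma_form))   # True
```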
A key point concerns correlation and computability: the autocorrelation of an AR series decays at a negative-exponential (geometric) rate, while the MA(q) model has only finite correlation, its autocorrelation cutting off after lag q.
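A small sketch of these two patterns (assuming numpy and statsmodels; the coefficients are illustrative): the AR(1) autocorrelation decays geometrically, while the MA(2) autocorrelation is essentially zero beyond lag 2.

```python
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(4)
n = 20000
w = rng.normal(size=n + 2)

# AR(1): y_t = 0.7 y_{t-1} + w_t  -> autocorrelation ~ 0.7**s (exponential decay)
ar = np.zeros(n)
for t in range(1, n):
    ar[t] = 0.7 * ar[t - 1] + w[t]

# MA(2): y_t = w_t + 0.5 w_{t-1} + 0.3 w_{t-2}  -> autocorrelation cuts off after lag 2
ma = w[2:] + 0.5 * w[1:-1] + 0.3 * w[:-2]

print("AR(1) acf:", np.round(acf(ar, nlags=5), 3))   # roughly 1, .70, .49, .34, ...
print("MA(2) acf:", np.round(acf(ma, nlags=5), 3))   # roughly 1, r1, r2, then ~0
```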
Predictions are then made according to the minimum mean squared error principle. That is the AR model we discussed, and the AR model can therefore be used for time series analysis.
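As a sketch of how this looks in practice (assuming statsmodels; the order, data and seed are illustrative), fitting an AR model and forecasting gives the minimum mean squared error prediction under the fitted model:

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

# Simulate an AR(2) series as stand-in data
rng = np.random.default_rng(5)
n = 1000
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.6 * y[t - 1] - 0.2 * y[t - 2] + rng.normal()

model = AutoReg(y, lags=2).fit()               # estimate the AR coefficients
print(model.params)                            # intercept, phi_1, phi_2
forecast = model.predict(start=n, end=n + 4)   # next 5 steps under the fitted model
print(forecast)
```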
In this way the distribution of the time series is the same at every moment, so the time series is a stationary one. The linear filter is another model for studying time series; it studies the series through the frequency domain.