4 September 2017
Trend line.
Dispersion changing.
Assume that our data follows some trend, and that the spikes around it are
caused by many random factors that affect the data. The number of served
requests, for example, is described very well by this approach: garbage
collection, cache misses, OS paging, and many other things affect the time it
takes to serve a particular response.
Let’s take a half-hour slice of our data, from 2017-08-27 12:00 to 12:30.
We can see that this data has a trend and some oscillations around it.
Let’s fit a regression line to find the slope of this trend line.
const 916.269951
dy/dx 11.599507
The result means that const is the level (intercept) of the trend line, and
dy/dx is its slope, which defines how fast the level grows over time.
So we have effectively reduced the dimension of the data from 31 parameters
to 2 parameters.
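The fit described above can be sketched with ordinary least squares. This is
only an illustration: the series below is synthetic (31 one-minute samples
generated around a level of 916 and a slope of 11.6), not the original data,
and the names x, y, const and slope are assumptions made for the example.

```python
import numpy as np

# Hypothetical stand-in for the half-hour slice: 31 one-minute samples.
rng = np.random.default_rng(0)
x = np.arange(31)                      # minutes since the start of the slice
y = 916.0 + 11.6 * x + rng.normal(0.0, 20.0, size=x.size)

# Ordinary least squares with the design matrix [1, x].
X = np.column_stack([np.ones(x.size), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
const, slope = coef

# Two numbers now summarize the trend of all 31 points.
print(f"const {const:.6f}")
print(f"dy/dx {slope:.6f}")
```

With unit spacing between samples, slope is simply the average growth of the
level per sample.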
If we subtract the regression function values from our initial data, we get
a process that looks like a stationary stochastic process.
Here, we can see that the Dickey-Fuller test value is really small, so we
reject the null hypothesis that this time series slice is non-stationary. The
autocorrelation functions also look good.
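A minimal sketch of the detrending step and a sample autocorrelation check,
using only NumPy. The data is synthetic and the helper names detrend and
autocorr are hypothetical. A formal test is not implemented here; the
augmented Dickey-Fuller test is available, for instance, as adfuller in
statsmodels.tsa.stattools.

```python
import numpy as np

def detrend(y):
    """Subtract the least-squares line from y and return the residuals."""
    y = np.asarray(y, dtype=float)
    x = np.arange(len(y))
    slope, const = np.polyfit(x, y, deg=1)   # highest degree first
    return y - (const + slope * x)

def autocorr(r, lag):
    """Sample autocorrelation of r at a non-negative integer lag."""
    r = np.asarray(r, dtype=float)
    r = r - r.mean()
    if lag == 0:
        return 1.0
    return float(np.dot(r[:-lag], r[lag:]) / np.dot(r, r))

# Hypothetical slice: linear trend plus independent noise.
rng = np.random.default_rng(1)
x = np.arange(31)
y = 916.0 + 11.6 * x + rng.normal(0.0, 20.0, size=x.size)

resid = detrend(y)
for lag in range(1, 6):
    print(lag, round(autocorr(resid, lag), 3))
```

For a residual that behaves like stationary noise, the autocorrelation at
non-zero lags is expected to stay close to zero.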
Thus we have applied a transformation to our data: we can rotate the data
according to the slope of the trend line.
In fact, the slope is a discrete derivative of our non-stationary time series.
Because the interval between metric points is constant, we do not need to take
dx into account. Hence we can approximate our data as a piecewise function
computed from the discrete derivatives of the time series, that is, the slopes
of the regression trends.
It looks like there is linear autocorrelation within every slice, so if we
find a regression line for every slice, we can build a model of our time
series using the assumptions that we made.
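The per-slice idea can be sketched as a piecewise linear fit. This is an
illustration under stated assumptions, not the exact model used here: the
function name piecewise_trend, the fixed slice length, and the synthetic
series are all hypothetical.

```python
import numpy as np

def piecewise_trend(y, slice_len):
    """Fit a separate least-squares line to each consecutive slice.

    Returns a list of (start, const, slope) tuples and the fitted values.
    Samples are assumed to be equally spaced, so dx is taken as 1.
    """
    y = np.asarray(y, dtype=float)
    params = []
    fitted = np.empty_like(y)
    for start in range(0, len(y), slice_len):
        seg = y[start:start + slice_len]
        t = np.arange(len(seg))
        if len(seg) < 2:
            # A single trailing point has no slope to estimate.
            const, slope = float(seg[0]), 0.0
        else:
            slope, const = np.polyfit(t, seg, deg=1)
        params.append((start, float(const), float(slope)))
        fitted[start:start + len(seg)] = const + slope * t
    return params, fitted

# Hypothetical series: the level rises for 30 samples, then falls for 30.
rng = np.random.default_rng(2)
t = np.arange(30)
series = np.concatenate([900.0 + 10.0 * t, 1190.0 - 5.0 * t])
series = series + rng.normal(0.0, 5.0, size=series.size)

params, fitted = piecewise_trend(series, slice_len=30)
for start, const, slope in params:
    print(f"slice at {start}: const {const:.2f}, dy/dx {slope:.2f}")
```

Each slice is then described by two numbers, and the residual series minus
fitted can be inspected separately for stationarity.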