Professional Documents
Culture Documents
Topics
Dynamic linear models (state space models) Sequential context, Bayesian framework Standard classes of models, model decompositions Models and methods in physical science applications Time series decompositions, latent structure Neurophysiology - climatology - speech processing Multivariate time series: Financial applications Latent structure, volatility models
= Gt t1 + t
signal xt , state vector t = (t1 , . . . , td ) regression vector Ft and state matrix Gt zero mean measurement errors t and state innovations t often zero-mean and normally distributed
Examples
Slowly varying level observed with noise: yt = x t + t Dynamic linear regression: yt = x t + t xt = F t t
t
xt = xt1 + t
= t1 + t
Error models for t , t normal distributions mixtures of normals: outliers and abrupt structural changes
yt = x t + t
x t = a t + b t Xt
190 180 170 160 150 140 130 120 110 100 80 75 70 65 60 55 50 45 1/75 1/77 1/79
MARKET
SALES
1/81
1/83
1/85
QTR YEAR
1 75
1 77
1 79
1 81
1 83
1 85
Relative to static model, dynamic regression delivers: improved estimation via adaptation for local regression parameters and increased (honest) uncertainty about regression parameters adaptability to (small) changes improved point forecasts partitions variation: parameter vs observation error increased precision of stated forecasts i.e., improved prediction: point forecasts AND precision
yt = x t + t yt1 xt1
xt = F t t yt xt
= Gt t1 + t
yt+1 xt+1
t1 t t+1 Sequential model denition : Markov evolution structure CI structure : t sucient for future at time t
Bayesian Forecasting
Key concepts: Bayesian: modelling & learning is probabilistic Time-varying parameter models: often non-stationary Sequential view, sequential model denitions encourages interaction, intervention Statistical framework: Forecasting: What might happen? and What if?
Data processing and statistical learning from observations Updating of models and probabilistic summaries of belief Time series analysis ... Retrospection: What happened?
Bayesian Machinery
Inferences based on information Dt = {(y1 , . . . , yt ), It } Find and summarise p(t |Dt ) and p(1 , . . . , t |Dt ) and update as t t + 1 Forecasts: p(yt+1 , . . . , yt+k |Dt ) and update as t t + 1 Implementation & computations: Linear/normal models: neat theory, Kalman ltering Extend to infer variance components, non-normal errors .. need approximations, simulation methods, MCMC
Commercial Applications
Short term forecasting of consumer sales and demand Monitoring: stocks and inventories of consumer products Many items or sectors: Aggregation and multi-level models Standard models for commercial applications: Data = Trend + Seasonal + Regression + Error yt = x1t + x2t + x3t + t
t = t1 + t
where (aj,t , bj,t ) wander through time Key concept: Model (De)Composition Modelling, prior specication, interventions: component-wise Posterior inference: detrending, deseasonalisation, etc
= Gt1 + t
Includes all standard point-forecasting methods (exponential smoothing, variants, ... ) Polynomial trend and seasonal components in commercial models Includes all practically useful ARIMA models Multiple representations:
t
= Et G EGE1
General class: Time-varying Ft , Gt Includes non-stationary models, time-varying ARIMA models, etc., ...
Simulation-Based Computation
yt = x t + t
xt = F t t
= Gt t1 + t
Normal error models p(t ), p(t ) Fixed time window t = 1, . . . , n State vector set: = {1 , . . . , n } Require: full posterior sample from p(|Dn ) Available via Forward-ltering: Backward-sampling algorithm
Carter and Kohn (1994) Biometrika Fr hwirth-Schnatter (1994) J Time Series Anal u West and Harrison 1997
FFBS Algorithm
Forward-ltering: Standard normal/linear analysis: Kalman lter delivers normal p(t |Dt ) at each t = 1, . . . , n Backward-sampling: at t = n : sample from p(n |Dn ) n for t = n 1, n 2, . . . , 1 : sample from normal distribution t p(t |Dt , ) t+1 Builds up as , , . . . n n1 Exploits Markovian/CI model structure
MCMC in DLMs
DLM parameters: e.g., constant model yt = x t + t xt = F t Parameters: Variances (variance matrices) of t , t Elements of F, G Indicators in normal mixture models for errors Add parameters to analysis: MCMC utilising FFBS
= Gt1 + t
MCMC in DLMs
Parameters (may depend on sample size) Gibbs sampling: p(, |Dn ) iteratively resampled via Apply FFBS algorithm to draw from p(| , Dn ) Draw new value of from p(| , Dn ) Iterate Standard Gibbs sampling: MCMC May need creativity in sampling : Metropolis-Hastings, etc Often easy: as in Autoregressive DLM
y t = x t + t xt =
d j=1
j xtj +
xt = (1, 0, . . . , 0)xt , 2 0 1 0 d1 0 0 .. . 0 1 3 0 0
xt = Gxt1 + t d 0 0 , . . . 0
t
t
0 0 . . . 0
Autoregressive DLM
Latent AR(d) process: Time series: DLM for for xt : 1 1 G= 0 . . . 0
xt =
d j=1
j xtj +
0 0 . . . 0
xt =
j=1
zt,j +
j=1
at,j
z terms: one for each pair of complex conjugate eigenvalues a terms: one for each real eigenvalue underlying latent processes zt,j and at,j follow simple models at,j is AR(1) process - short-term correlations zt,j is quasi-periodic ARMA(2,1) - noisy sine wave with randomly time-varying amplitude & phase, xed frequency
Time-varying Autoregression
TV-AR(d) model: xt =
d j=1
t,j xtj +
= t1 + t
Flexible representations: non-stationary process, time-varying spectral properties latent component structure other evolutions for t (e.g., Godsill et al on speech processing)
with
t,1 1 G(t ) = 0 . . . 0
t} t}
Component models: yt = x0t + x1t + t infer latent TV-AR processes too: {x0t , xt :
TVAR Decomposition
dz da
xt =
j=1
zt,j +
j=1
at,j
z terms: one for each pair of complex conjugate eigenvalues a terms: one for each real eigenvalue underlying latent processes: at,j is TV-AR(1) process - short-term correlations + time-varying correlation zt,j is TV-ARMA(2,1) - noisy sine wave with randomly time-varying amplitude & phase + time-varying frequency (2/wavelength) Number of complex/real eigenvalues can vary over time too!
Time:Frequency Analysis
Fit exible (high order?) AR or TV-AR models Estimate latent components and their frequencies, amplitudes over time Time domain representation of spectral structure Often, some zt,j physically meaningful, some (high frequency) represent noise, model approximation at,j noise, model approximation and and (possibly) low frequency trend
Paleoclimatology Example
deep ocean cores: relative abundance of 18 O 18 O as global temperatures (smaller ice mass) reverse sign: higher recent global temperatures well known periodicities: earth orbital dynamics impact on solar insolation Milankovitch; Shackleton et al since 1976 eccentricity: obliquity: precession: 95-120 kyear 40-42 kyear 19-25 kyear (1 or 2?)
oxygen level
trend
comp 1
comp 2
comp 3
comp 4
500
1000
1500
2000
2500
Oxygen
Wavelengths vary only modestly Estimated periods/wavelengths consistent with geological determinations .... 108120, peak 110 .... 40.841.6, peak 41.5 .... 22.223, peak 22.8 Switch due to order of estimated amplitude Geological interpretation? Structural climate change 1.1m yrs?
-100
-200
-300
-300
0
-200
-100
200
50
100 time
150
200
100
200
500
1000 time
1500
2000
beta
alpha
alpha:theta
theta
delta
data
100
200 time
300
400
500
beta
alpha
theta2
theta1
delta2
delta1
data
500
1000
1500 time
2000
2500
3000
S26.Low Patient/Treatment
Two EEG series on one individual: S26.Low cf S26.Mod Repeat seizures with varying ECT treatment
2 TVAR(20) with time-varying t
Evident high frequency structure and spiky traces higher order models Several frequency bands inuenced by seizure several components Identication issues
EEG Comparisons
10 S26.Low S26.Mod
Particle Filtering
Key Goal: sequentially update posteriors p(t |Dt ) p(t+1 |Dt+1 ) Numerical Approximations (points/weights): {t , t : j = 1, . . . , Nt } Theoretical update: p(t+1 |Dt+1 ) p(yt+1 |t+1 )p(t+1 |Dt ) MC approximation to prior:
Nt (j) (j)
p(t+1 |Dt )
k=1
t p(t+1 |t )
(k)
(k)
(j)
yt = (y1t , . . . , ypt ) e.g., yit is pvector of returns on investment i (exchange rate futures, etc.)
JPY
Returns % 0
-10
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
-10 7602
Returns % 0
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
GBP
10 10
AUD
Returns % 0
-10
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
-10 7602
Returns % 0
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
CAD
10 10
CHF
Returns % 0
-10
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
-10 7602
Returns % 0
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
FRF
10 10
NLG
Returns % 0
-10
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
-10 7602
Returns % 0
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
DEM
0.10 0.10
JPY
IS 0.0
-0.10
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
-0.10
7602
IS 0.0
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
GBP
0.10 0.10
AUD
IS 0.0
-0.10
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
-0.10
7602
IS 0.0
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
CAD
0.10 0.10
CHF
IS 0.0
-0.10
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
-0.10
7602
IS 0.0
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
FRF
0.10 0.10
NLG
IS 0.0
-0.10
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
-0.10
7602
IS 0.0
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
residualt = Xt ft + et ft et ft ... kvector of latent factors N (ft |0, Ft ) N (et |0, Et ) et ... qvector of idiosyncracies
Ft = diag(exp(1,t ), . . . , exp(k,t )) Et = diag(exp(k+1,t ), . . . , exp(k+q,t )) t = Xt Ft Xt + Et Factor and idiosyncratic latent log volatilities: t = (1,t , . . . , k+q,t )
Constant factor loadings matrix Xt = X or slowly varying over time Identication constraints on X :
xk,1 x k+1,1 . . .
1 x2,1 . . .
X=
xq,1
0 1 xk+1,k . . .
0 0
xq,k
Multivariate SV models vector AR(1) model for latent log volatilities t volatility persistence: AR(1) coecients (diagonal) marginal volatility levels: time-varying t dependent innovations: shocks to volatilities related across series global/sector eects represented through correlations t = t + (t1 t1 ) + t t N (0, U) t random walk
Inference based on a xed sample over t = 1, . . . , T Bayesian analysis via posterior simulations Monte Carlo samples from joint posterior for model parameters, dynamic regression parameters, AND latent processes: factors & volatilities { ft , t : t = 1, . . . , T } examples of highly structured, hierarchical models with many latent variables and parameters custom MCMC built of standard modules simple MC evaluation of forecast/predictive distributions
Sequential particle ltering over t = T + 1, T + 2, , . . . particulate approximation to posterior distributions prior: cloud of particles, associated importance weights posterior: sampling/importance resampling to revise posterior samples p(|Dt1 ) p(|Dt ) sample regeneration by local smoothing of particulate prior time-varying latent variables and model parameters
JPY
stdev 0.10
0.05
0.0
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
0.0
7602
0.05
stdev 0.10
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
GBP
0.15 0.15
AUD
stdev 0.10
0.05
0.0
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
0.0
7602
0.05
stdev 0.10
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
CAD
0.15 0.15
CHF
stdev 0.10
0.05
0.0
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
0.0
7602
0.05
stdev 0.10
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
FRF
0.15 0.15
NLG
stdev 0.10
0.05
0.0
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
0.0
7602
0.05
stdev 0.10
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
-0.10
0.0
0.04
7602
Factor 1 0.08
8102
8602
9102
9602
JPY
0.10 Factor 2 0.0 0.05 0.12
7602 8102 8602 9102 9602
-0.10
0.0
0.04
7602
Factor 2 0.08
8102
8602
9102
9602
0.05
Factor 1 0.0
-0.05
-0.10
0.0
7602
0.02
Factor 1 0.04
0.06
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
JPY
0.10 0.08
7602 7710 7906 8102 8210 8406 8602 8710 8906 9102 9210 9406 9602 9710
0.05
Factor 2 0.0
-0.05
-0.10
0.0
7602
0.02
Factor 2 0.04
0.06
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
GBP
0.10 0.08
7602 7710 7906 8102 8210 8406 8602 8710 8906 9102 9210 9406 9602 9710
0.05
Factor 3 0.0
-0.05
-0.10
0.0
7602
0.02
Factor 3 0.04
0.06
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
Factor 1 DEM 1.00 (0.00) JPY GBP AUD CAD CHF FRF ESP 0.55 (0.06) 0.68 (0.05) 0.23 (0.03) 0.04 (0.02) 1.04 (0.03)
Factor 2 0.00 (0.00) 1.00 (0.00) 0.48 (0.14) 0.35 (0.07) 0.03 (0.08) 0.23 (0.07)
Factor 1 DEM FRF ESP CHF GBP JPY AUD CAD 1 1 1 1 0.7 0.5 0.2
JPY
0.0
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
0.0
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
GBP
Idiosyncratic 0.05 0.10 0.15 Idiosyncratic 0.05 0.10 0.15
AUD
0.0
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
0.0
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
CAD
Idiosyncratic 0.05 0.10 0.15 Idiosyncratic 0.05 0.10 0.15
CHF
0.0
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
0.0
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
FRF
Idiosyncratic 0.05 0.10 0.15 Idiosyncratic 0.05 0.10 0.15
NLG
0.0
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
0.0
7602
7710
7906
8102
8210
8406
8602
8710
8906
9102
9210
9406
9602
9710
Models/forecasts feed into portfolio decision management Quintana et al invited talk, Valencia VII Live/Operational assessments of dynamic factor models Improved sequential particle ltering: parameters Time-varying loadings matrices Xt , parameters Uncertainty about number of factors