Stockholm University, VT2022
In package Ecdat there is the data set
Macrodat, used in exercise 2 above, where variable
lhur is the unemployment rate and punew is a
quarterly price index.
Define a data set for the unemployment rate and the yearly inflation rate.
According to ?Ecdat::Macrodat, punew is the CPI.
library(TS)
library(dynlm)
library(vars)
library(forecast)
data(Macrodat, package = "Ecdat")
cpi <- Macrodat[, "punew"]
plot(cpi)
infl_qoq <- diff(log(cpi))
Why diff(log())?
It’s a computationally efficient method to approximate percentage changes.
\[ \Delta \log(x_t) = \log(x_t) - \log(x_{t - 1}) = \log \left(\frac{x_t}{x_{t - 1}} \right) = \log \left( 1 + \frac{x_t - x_{t - 1}}{x_{t - 1}} \right) \approx \frac{x_t - x_{t - 1}}{x_{t - 1}} \]
The smaller the percentage change, the better the approximation.
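A quick numerical check of the approximation (toy numbers, not the Macrodat series):

```r
# Toy price series: the log difference is close to the exact relative change
p <- c(100, 101, 103, 102)
log_change   <- diff(log(p))            # what diff(log()) computes
exact_change <- diff(p) / head(p, -1)   # (x_t - x_{t-1}) / x_{t-1}
round(cbind(log_change, exact_change), 5)
```

For the 1% step the two agree to four decimals; the gap grows with the size of the change.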
infl_qoq is inflation in decimal form (\(1\% \widehat{=} 0.01\)). The quarterly inflation rate \(\pi_t\) relates consecutive price levels via
\[ P_t = P_{t - 1} + \pi_t P_{t - 1} = (1 + \pi_t) P_{t - 1}. \]
Iterating this over four quarters gives
\[ P_t = (1 + \pi_t)(1 + \pi_{t - 1}) P_{t - 2} = (1 + \pi_t)(1 + \pi_{t - 1})(1 + \pi_{t - 2}) P_{t - 3} = (1 + \pi_t)(1 + \pi_{t - 1})(1 + \pi_{t - 2})(1 + \pi_{t - 3}) P_{t - 4}. \]
If the quarterly rate stayed constant at \(\pi_t\) for a whole year, this compounds to
\[ P_t = (1 + \pi_t)^4 P_{t - 4}, \]
so the annualized rate is \((1 + \pi_t)^4 - 1\).
infl_ann <- (1 + infl_qoq)^4 - 1
infl_ann <- infl_ann * 100
infl_qoq <- infl_qoq * 100
infl_y <- diff(log(cpi), lag = 4)
infl_y <- infl_y * 100
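As a consistency check (with made-up numbers, not the actual CPI): a year-on-year log difference is exactly the sum of the four quarterly log differences ending in that quarter, which is why infl_y and the compounded infl_qoq track each other closely.

```r
# Made-up quarterly price index, 6 observations
p <- c(100, 102, 103, 105, 108, 110)
q <- diff(log(p))           # quarter-on-quarter log changes (5 values)
y <- diff(log(p), lag = 4)  # year-on-year log changes (2 values)
# each yearly log change equals the sum of the four quarterly ones
cbind(y, c(sum(q[1:4]), sum(q[2:5])))
```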
ts.plot(
infl_qoq, infl_ann, infl_y,
lty = 1:3,
main = "Inflation rate comparison"
)
legend(
"topright",
legend = c("QoQ", "QoQ annualized", "Over last year"),
lty = 1:3
)
The series have different lengths after differencing; ts.intersect() takes care of that, creating a multivariate time series by restricting the longer series to the common sample.
dat <- ts.intersect(infl = infl_ann, unemp = Macrodat[, "lhur"])
ts.plot(dat, lty = 1:2)
legend(
"topright",
legend = c("Inflation", "Unemployment"),
lty = 1:2
)
Determine the order of integration for the two variables using an ADF-test.
We use ADF.test() from the TS library:
ADF.test(dat[, "infl"])
## # # # # # # # # # # # # # # # # # # # #
## # # # # # # # # # # # # # # # # # # # #
## Augmented Dickey--Fuller test
## # # # # # # # # # # # # # # # # # # # #
## type: trend drift none
## AR1: 0.88 0.89 0.97
## lags: 12 12 12
## # # # # # # # # # # # # # # # # # # # #
## statistic 1pct 5pct 10pct
## tau3 -2.72 -3.99 -3.43 -3.13
## phi3 3.88 8.43 6.49 5.47
## tau2 -2.62 -3.46 -2.88 -2.57
## phi1 3.44 6.52 4.63 3.81
## tau1 -1.31 -2.58 -1.95 -1.62
tau2 rejects the unit root only at the 10% level of significance, so we also test the first difference:
ADF.test(diff(dat[, "infl"]))
## # # # # # # # # # # # # # # # # # # # #
## # # # # # # # # # # # # # # # # # # # #
## Augmented Dickey--Fuller test
## # # # # # # # # # # # # # # # # # # # #
## type: trend drift none
## AR1: -1.2 -0.62 -0.61
## lags: 12 12 12
## # # # # # # # # # # # # # # # # # # # #
## statistic 1pct 5pct 10pct
## tau3 -5.45 -3.99 -3.43 -3.13
## phi3 14.84 8.43 6.49 5.47
## tau2 -5.16 -3.46 -2.88 -2.57
## phi1 13.32 6.52 4.63 3.81
## tau1 -5.18 -2.58 -1.95 -1.62
ADF.test(dat[, "unemp"])
## # # # # # # # # # # # # # # # # # # # #
## # # # # # # # # # # # # # # # # # # # #
## Augmented Dickey--Fuller test
## # # # # # # # # # # # # # # # # # # # #
## type: trend drift none
## AR1: 0.97 0.97 1
## lags: 12 12 12
## # # # # # # # # # # # # # # # # # # # #
## statistic 1pct 5pct 10pct
## tau3 -2.51 -3.99 -3.43 -3.13
## phi3 3.31 8.43 6.49 5.47
## tau2 -2.58 -3.46 -2.88 -2.57
## phi1 3.34 6.52 4.63 3.81
## tau1 -0.65 -2.58 -1.95 -1.62
Again, tau2 rejects only
at the 10% level, so we test the first difference:
ADF.test(diff(dat[, "unemp"]))
## # # # # # # # # # # # # # # # # # # # #
## # # # # # # # # # # # # # # # # # # # #
## Augmented Dickey--Fuller test
## # # # # # # # # # # # # # # # # # # # #
## type: trend drift none
## AR1: 0.42 0.43 0.43
## lags: 12 12 12
## # # # # # # # # # # # # # # # # # # # #
## statistic 1pct 5pct 10pct
## tau3 -5.16 -3.99 -3.43 -3.13
## phi3 13.33 8.43 6.49 5.47
## tau2 -5.12 -3.46 -2.88 -2.57
## phi1 13.12 6.52 4.63 3.81
## tau1 -5.13 -2.58 -1.95 -1.62
Both first differences are clearly stationary, so we treat both series as I(1).
Determine whether the variables are cointegrated or not using the Engle-Granger ADF-test. Hint: Estimate S&W 16.24.
Cointegration
\[ \hat{e}_t = y_t - \hat{\beta}_0 - \hat{\beta}_1 x_t \]
We use dynlm() to preserve the
ts class of the residuals; plain lm() also works correctly here (remember lab 3).
model_eg <- dynlm(infl ~ unemp, data = dat)
summary(model_eg)
##
## Time series regression with "ts" data:
## Start = 1959(2), End = 2000(4)
##
## Call:
## dynlm(formula = infl ~ unemp, data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.6776 -1.9511 -0.9432 1.4511 11.9018
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.7489 1.0150 2.708 0.00748 **
## unemp 0.2778 0.1655 1.679 0.09509 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.195 on 165 degrees of freedom
## Multiple R-squared: 0.01679, Adjusted R-squared: 0.01083
## F-statistic: 2.818 on 1 and 165 DF, p-value: 0.09509
res_eg <- resid(model_eg)
plot(res_eg)
ADF.test(res_eg)
## # # # # # # # # # # # # # # # # # # # #
## # # # # # # # # # # # # # # # # # # # #
## Augmented Dickey--Fuller test
## # # # # # # # # # # # # # # # # # # # #
## type: trend drift none
## AR1: 0.87 0.87 0.87
## lags: 12 12 12
## # # # # # # # # # # # # # # # # # # # #
## statistic 1pct 5pct 10pct
## tau3 -3.01 -3.99 -3.43 -3.13
## phi3 4.64 8.43 6.49 5.47
## tau2 -2.90 -3.46 -2.88 -2.57
## phi1 4.20 6.52 4.63 3.81
## tau1 -2.89 -2.58 -1.95 -1.62
We use tau2 here, i.e. the ADF specification with an intercept
(drift) but no trend. The critical values, however, are the EG-ADF ones from Table 17.1 in Stock and Watson (4th ed., p. 665):
| | 10% | 5% | 1% |
|---|---|---|---|
| One regressor | −3.12 | −3.41 | −3.96 |
| Two regressors | −3.52 | −3.80 | −4.36 |
| Three regressors | −3.84 | −4.16 | −4.73 |
| Four regressors | −4.20 | −4.49 | −5.07 |

With one regressor, the 10% critical value is −3.12. The test statistic −2.90 does not fall below it, so we cannot reject a unit root in the residuals: there is no evidence of cointegration.
You commented in the lecture that the EG-ADF test is kind of tedious to do. I have not found a good implementation of it to run as a single function. This test does however have some undesirable theoretical properties as well, which is why the Johansen procedure is typically used to test for cointegration. We cover this in the last exercise in a later lab.
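A minimal sketch of wrapping the two EG-ADF steps into one helper. The function name eg_adf is my own, and adf.test() from the tseries package is used as a stand-in for the residual test; note that it reports standard ADF critical values, not the EG-ADF ones from Table 17.1, so only the test statistic should be read from its output.

```r
library(tseries)  # provides adf.test()

# Hypothetical two-step Engle-Granger helper (own naming, not a standard API)
eg_adf <- function(y, x) {
  e_hat <- resid(lm(y ~ x))  # step 1: cointegrating regression
  adf.test(e_hat)            # step 2: ADF test on the residuals
  # caution: the printed p-value uses ordinary ADF critical values;
  # compare the statistic to the EG-ADF table instead
}
```

With our data, eg_adf(dat[, "infl"], dat[, "unemp"]) would reproduce the two steps above, and the statistic would be compared to −3.12 (one regressor, 10%).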
Estimate a VAR model where the lag order is determined by AIC.
We use the vars package. The default lag order in the VAR() function is \(p = 1\); VARselect() determines the optimal lag order. Since both variables are I(1) and not cointegrated, we estimate the VAR on first differences.
ddat <- diff(dat)
VARselect(ddat, type = "const")
## $selection
## AIC(n) HQ(n) SC(n) FPE(n)
## 4 4 2 4
##
## $criteria
## 1 2 3 4 5 6
## AIC(n) -1.883240 -2.0364174 -2.0474062 -2.1208223 -2.106279 -2.0618174
## HQ(n) -1.835597 -1.9570123 -1.9362390 -1.9778931 -1.931587 -1.8553642
## SC(n) -1.765938 -1.8409138 -1.7737011 -1.7689158 -1.676171 -1.5535081
## FPE(n) 0.152098 0.1305011 0.1290848 0.1199637 0.121747 0.1273212
## 7 8 9 10
## AIC(n) -2.0736933 -2.0713295 -2.0391405 -2.0009913
## HQ(n) -1.8354780 -1.8013522 -1.7374011 -1.6674899
## SC(n) -1.4871825 -1.4066173 -1.2962268 -1.1798762
## FPE(n) 0.1258705 0.1262373 0.1304572 0.1356465
model_var <- VAR(ddat, p = 4, type = "const")
summary(model_var)
##
## VAR Estimation Results:
## =========================
## Endogenous variables: infl, unemp
## Deterministic variables: const
## Sample size: 162
## Log Likelihood: -275.682
## Roots of the characteristic polynomial:
## 0.7735 0.7735 0.7724 0.7724 0.6014 0.6014 0.5258 0.5258
## Call:
## VAR(y = ddat, p = 4, type = "const")
##
##
## Estimation results for equation infl:
## =====================================
## infl = infl.l1 + unemp.l1 + infl.l2 + unemp.l2 + infl.l3 + unemp.l3 + infl.l4 + unemp.l4 + const
##
## Estimate Std. Error t value Pr(>|t|)
## infl.l1 -0.35181 0.07839 -4.488 1.41e-05 ***
## unemp.l1 -2.78566 0.45227 -6.159 6.16e-09 ***
## infl.l2 -0.24230 0.08320 -2.912 0.00412 **
## unemp.l2 1.21188 0.57085 2.123 0.03537 *
## infl.l3 0.05618 0.08063 0.697 0.48696
## unemp.l3 0.53037 0.57256 0.926 0.35574
## infl.l4 -0.07154 0.07573 -0.945 0.34629
## unemp.l4 -1.58340 0.47966 -3.301 0.00120 **
## const -0.00260 0.10988 -0.024 0.98115
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 1.398 on 153 degrees of freedom
## Multiple R-Squared: 0.3953, Adjusted R-squared: 0.3637
## F-statistic: 12.5 on 8 and 153 DF, p-value: 1.025e-13
##
##
## Estimation results for equation unemp:
## ======================================
## unemp = infl.l1 + unemp.l1 + infl.l2 + unemp.l2 + infl.l3 + unemp.l3 + infl.l4 + unemp.l4 + const
##
## Estimate Std. Error t value Pr(>|t|)
## infl.l1 0.042168 0.013675 3.084 0.00243 **
## unemp.l1 0.728097 0.078896 9.229 < 2e-16 ***
## infl.l2 0.003257 0.014513 0.224 0.82274
## unemp.l2 0.001224 0.099582 0.012 0.99021
## infl.l3 0.017220 0.014065 1.224 0.22270
## unemp.l3 -0.080847 0.099879 -0.809 0.41952
## infl.l4 0.021240 0.013211 1.608 0.10994
## unemp.l4 -0.028794 0.083675 -0.344 0.73123
## const -0.004894 0.019168 -0.255 0.79883
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.2438 on 153 degrees of freedom
## Multiple R-Squared: 0.4778, Adjusted R-squared: 0.4505
## F-statistic: 17.5 on 8 and 153 DF, p-value: < 2.2e-16
##
##
##
## Covariance matrix of residuals:
## infl unemp
## infl 1.9539 -0.02480
## unemp -0.0248 0.05946
##
## Correlation matrix of residuals:
## infl unemp
## infl 1.00000 -0.07276
## unemp -0.07276 1.00000
ts.plot(resid(model_var), main = "Residuals of VAR(4)", lty = 1:2)
legend(
"bottomright",
legend = c("Inflation", "Unemployment"),
lty = 1:2
)
The vars package has some functions for diagnostics,
e.g. a multivariate portmanteau (Ljung-Box-type) test for autocorrelation in the
residuals.
serial.test(model_var)
##
## Portmanteau Test (asymptotic)
##
## data: Residuals of VAR object model_var
## Chi-squared = 50.826, df = 48, p-value = 0.3629
normality.test(model_var)
## $JB
##
## JB-Test (multivariate)
##
## data: Residuals of VAR object model_var
## Chi-squared = 34.938, df = 4, p-value = 4.784e-07
##
##
## $Skewness
##
## Skewness only (multivariate)
##
## data: Residuals of VAR object model_var
## Chi-squared = 14.303, df = 2, p-value = 0.0007838
##
##
## $Kurtosis
##
## Kurtosis only (multivariate)
##
## data: Residuals of VAR object model_var
## Chi-squared = 20.635, df = 2, p-value = 3.305e-05
Determine whether any of the variables Granger-causes the other. Hint: One does, the other not.
Granger causality can be tested with causality() from the vars package, here with a heteroskedasticity-robust covariance matrix:
# from inflation to unemployment
cause_i_u <- causality(
model_var,
cause = "infl",
vcov. = sandwich::vcovHC(model_var)
)
# from unemployment to inflation
cause_u_i <- causality(
model_var,
cause = "unemp",
vcov. = sandwich::vcovHC(model_var)
)
causality() runs two different tests; we only want the
Granger causality part.
cause_i_u$Granger
##
## Granger causality H0: infl do not Granger-cause unemp
##
## data: VAR object model_var
## F-Test = 1.8569, df1 = 4, df2 = 306, p-value = 0.1179
cause_u_i$Granger
##
## Granger causality H0: unemp do not Granger-cause infl
##
## data: VAR object model_var
## F-Test = 9.7371, df1 = 4, df2 = 306, p-value = 1.999e-07
Plot all four possible impulse response functions using the orthogonal innovations.
The default of irf() is to compute all IRFs (each
variable on all the others, 2 × 2 in our case); ortho = TRUE gives orthogonal shocks. plot() has a method for objects of class
varirf and creates nice plots.
var_irf <- irf(model_var, ortho = TRUE)
plot(var_irf)
Produce a (one) plot, a so-called fan chart, of the forecasts for ten years ahead for the variable that is Granger-caused. The title of the plot should read “Forecasts for the XXX rate.”, where XXX is replaced with the variable’s name and the label on the vertical axis should be a “%”.
predict() produces a forecast
n.ahead time steps ahead, and fanchart() from the vars package produces
exactly the plot we want.
hmax <- 10 * 4  # ten years of quarterly data
var_forecast <- predict(model_var, n.ahead = hmax)
fanchart(
var_forecast,
names = "infl",
main = "Forecast of the inflation rate",
ylab = "%"
)
Alternatively, with the forecast package:
var_forecast2 <- forecast(
model_var,
h = hmax,
fan = TRUE
)
plot(
var_forecast2$forecast$infl,
main = "Forecast of the inflation rate",
ylab = "%"
)