Stockholm University, VT2022
wbstats and WDI: World Bank
eurostat: Eurostat
fredr: Federal Reserve Economic Data
imfr: International Monetary Fund
pxweb: list of available statistics
vignette: vignette("pxweb"), or go to packages → pxweb → “User guides, package vignettes and other documentation.”
Data come in an R-friendly format (often data.frames) and require minimal cleaning
Text files (.csv, .tsv, .txt, …)
R data file:
saveRDS() for a single object (e.g. a single data.frame or mts); format is .rds
save() for multiple objects (e.g. multiple data.frames, and some lists); format is .rdata
(saveRDS() or save())
(Pun intended)
Cleaning the time variable is a critical step in any data preparation. Since you will do this in every project whose data have a time dimension, let’s spend some time on it. Cleaning other variables is usually more of a case-by-case business.
2000Q1
2000qtr1
2000m01
2000 January
2000-03-21
Keep the data in data.frames, and turn them into ts or mts objects only immediately before running time series functions
This requires a date column in our data.frame
Use substr() to extract parts of the combined string by position, and then use as.numeric()
d <- "2000Q1"
y <- substr(d, start = 1, stop = 4) |> as.numeric()  # characters 1-4: the year
q <- substr(d, start = 6, stop = 6) |> as.numeric()  # character 6: the quarter
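The same position-based extraction works on a whole column at once, because substr() and as.numeric() are vectorised. A minimal sketch, assuming a hypothetical data.frame df with a character column date in the "2000Q1" layout; the numeric index year + (quarter - 1) / 4 is one possible convention, chosen here for illustration:

```r
# Hypothetical example data; the column names and values are assumptions
df <- data.frame(
  date  = c("2000Q1", "2000Q2", "2000Q3", "2000Q4", "2001Q1"),
  value = c(1.2, 1.5, 1.1, 1.7, 1.9)
)

# Characters 1-4 hold the year, character 6 holds the quarter
df$year    <- as.numeric(substr(df$date, start = 1, stop = 4))
df$quarter <- as.numeric(substr(df$date, start = 6, stop = 6))

# One possible numeric time index: 2000Q1 -> 2000.00, 2000Q2 -> 2000.25, ...
df$time <- df$year + (df$quarter - 1) / 4
```

This only works if every string has exactly the same layout; a value such as "2000 Q1" would shift the positions.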
R does not know we want to turn the character into numerics representing dates
zoo offers classes yearqtr and yearmon to store quarterly and monthly dates
They are created from character using pattern matching
We tell R where the date components are within the string, and it will extract them into a standardized time variable
%Y for 4-digit years, %m for 2-digit months, %B for written-out months (English)
See ?strptime for a list
library(zoo)
dq1 <- "2000Q2"
dq2 <- "2000qtr2"
dq3 <- "2000 Quarter 2"
dateq1 <- as.yearqtr(dq1, format = "%YQ%q")
dateq2 <- as.yearqtr(dq2, "%Yqtr%q")
dateq3 <- as.yearqtr(dq3, "%Y Quarter %q")
dateq1
## [1] "2000 Q2"
dateq2
## [1] "2000 Q2"
dateq3
## [1] "2000 Q2"
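In practice the conversion is applied to a whole date column, and the data stay in a data.frame until a time series function is needed. A sketch under assumptions: df, its column names and its values are made up for illustration, and passing the numeric value of the first date as start to ts() is only one of several ways to set the start.

```r
library(zoo)

# Hypothetical quarterly data with a character date column
df <- data.frame(
  date = c("2000Q1", "2000Q2", "2000Q3", "2000Q4"),
  gdp  = c(100, 101, 103, 102)
)

# Parse the whole column into yearqtr in one call
df$date <- as.yearqtr(df$date, format = "%YQ%q")

# Convert to ts only right before running time series functions
gdp_ts <- ts(df$gdp, start = as.numeric(df$date[1]), frequency = 4)
```

Keeping the date as a column means the data can still be filtered, merged and reshaped like any other data.frame before the conversion.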
zoo stores the time index internally like base R: 2000 Q1 → 2000.00; 2000 Q2 → 2000.25; …
as.numeric(dateq3)
## [1] 2000.25
This is the same index as in a ts object, but stored as a variable
dm1 <- "2000m03"
dm2 <- "2000 March"
datem1 <- as.yearmon(dm1, "%Ym%m")
datem2 <- as.yearmon(dm2, "%Y %B")
datem1
## [1] "Mar 2000"
datem2
## [1] "Mar 2000"
Again the same as the base R ts index for monthly data:
as.numeric(datem2)
## [1] 2000.167
y1 <- 2000
m1 <- 11
# Combine the separate year and month into one string, then parse it
datec1 <- as.yearmon(
  x = paste(y1, m1, sep = " "),
  format = "%Y %m"
)
datec1
## [1] "Nov 2000"
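The same idea carries over to a data.frame in which year and month arrive as separate columns. A minimal sketch, assuming a hypothetical data.frame df with numeric columns year and month; the names and values are made up for illustration:

```r
library(zoo)

# Hypothetical data with year and month in separate numeric columns
df <- data.frame(
  year  = c(2000, 2000, 2001),
  month = c(11, 12, 1),
  cpi   = c(100.0, 100.4, 100.1)
)

# paste() is vectorised, so the whole column is converted at once
df$date <- as.yearmon(paste(df$year, df$month, sep = " "), format = "%Y %m")
```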
In base R, dates are stored as Date class internally
For date-times, see POSIXct if you are interested
dd1 <- "2000-24-12"
dated1 <- as.Date(dd1, format = "%Y-%d-%m")
dated1
## [1] "2000-12-24"
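Because a Date is stored as the number of days since 1970-01-01, arithmetic on dates works directly. A short sketch in base R; the second date and its format are made-up example values:

```r
d1 <- as.Date("2000-24-12", format = "%Y-%d-%m")
d2 <- as.Date("31.12.2000", format = "%d.%m.%Y")  # hypothetical second date

class(d1)    # "Date"
unclass(d1)  # the underlying number of days since 1970-01-01

# Differences between dates are measured in days
days_between <- as.numeric(d2 - d1)

# Adding a number moves the date forward by that many days
d1_plus_week <- d1 + 7
```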
R prints dates as Year-Month-Day (or "%Y-%m-%d"; the ISO 8601 format)
format(datem1, "%Y.%m")
## [1] "2000.03"
format(datem2, "%b %y")
## [1] "Mar 00"
format(datem2, "%b %Y")
## [1] "Mar 2000"
format(datem2, "%B %Y")
## [1] "March 2000"
format(dated1, "%A, %d %B %Y")
## [1] "Sunday, 24 December 2000"
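format() also makes it possible to write dates back in the layout of the raw data, for example when exporting. A sketch that reuses input strings of the kind shown earlier; only numeric format codes are used, so the result does not depend on the locale:

```r
library(zoo)

datem <- as.yearmon("2000m03", format = "%Ym%m")
dateq <- as.yearqtr("2000Q2", format = "%YQ%q")

# Write the dates back in their original layout
month_string   <- format(datem, "%Ym%m")
quarter_string <- format(dateq, "%YQ%q")
```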