Authors
Steffen Moritz, Alexis Sardá, Thomas Bartz-Beielstein, Martin Zaefferer, Jörg Stork
Publication date
2015/10/13
Journal
arXiv preprint arXiv:1510.03924
Description
Missing values in datasets are a well-known problem, and many R packages offer imputation functions. But while imputation in general is well covered within R, it is hard to find functions for imputing univariate time series. The problem is that most standard imputation techniques cannot be applied directly: most algorithms rely on inter-attribute correlations, whereas univariate time series imputation must exploit time dependencies. This paper provides an overview of univariate time series imputation in general and a detailed look at the respective implementations within R packages. Furthermore, we experimentally compare the R functions on different time series using four different ratios of missing data. Our results show that either an interpolation with a seasonal Kalman filter from the zoo package or a linear interpolation on seasonal loess decomposed data from the forecast package was the most effective method for dealing with missing data in most of the scenarios assessed in this paper.
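To illustrate what "employing time dependencies" can mean for a single series, here is a minimal sketch of plain linear interpolation over missing values. It is written in standard-library Python for illustration only and is not the implementation found in the zoo or forecast packages; the function name `interpolate_linear` and the choice to fill leading and trailing gaps with the nearest observation are assumptions of this sketch.

```python
import math


def interpolate_linear(series):
    """Fill NaN gaps in a univariate series by linear interpolation in time.

    Only the series' own neighbouring observations are used; no other
    attributes are needed. Leading and trailing gaps have a neighbour on
    one side only, so they are filled with the nearest observed value
    (an assumption of this sketch).
    """
    values = list(series)
    observed = [i for i, v in enumerate(values) if not math.isnan(v)]
    if not observed:
        raise ValueError("series has no observed values")

    first, last = observed[0], observed[-1]
    # Carry the nearest observation outward into leading/trailing gaps.
    for i in range(first):
        values[i] = values[first]
    for i in range(last + 1, len(values)):
        values[i] = values[last]

    # Interior gaps: draw a straight line between the two bounding observations.
    for left, right in zip(observed, observed[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            values[i] = values[left] + step * (i - left)
    return values


nan = float("nan")
series = [1.0, nan, nan, 4.0, 5.0, nan]
filled = interpolate_linear(series)
print(filled)
```

The seasonal approaches named in the abstract go further than this: they model or remove the seasonal component first, so that the interpolation operates on a smoother signal.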
Total citations
[Bar chart: citations per year, 2016 to 2024]
Scholar articles
S Moritz, A Sardá, T Bartz-Beielstein, M Zaefferer… - arXiv preprint arXiv:1510.03924, 2015