TY - JOUR
T1 - Selego
T2 - robust variate selection for accurate time series forecasting
AU - Tiwaskar, Manoj
AU - Garg, Yash
AU - Li, Xinsheng
AU - Candan, K. Selçuk
AU - Sapino, Maria Luisa
N1 - Funding Information:
This work is partially supported by NSF#1827757 “Building Doctor’s Medicine Cabinet (BDMC): Data-Driven Services for High Performance and Sustainable Buildings”, NSF#1610282 “DataStorm: A Data Enabled System for End-to-End Disaster Planning and Response”, NSF#1633381 “BIGDATA: Discovering Context-Sensitive Impact in Complex Systems”, NSF#1909555 “pCAR: Discovering and Leveraging Plausibly Causal (p-causal) Relationships to Understand Complex Dynamic Systems”, and DOE grant “Securing Grid-interactive Efficient Buildings (GEB) through Cyber Defense and Resilient System (CYDRES)”. Part of the research was carried out using the Chameleon testbed supported by the NSF.
Publisher Copyright:
© 2021, The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature.
PY - 2021/9
Y1 - 2021/9
AB - Naïve extensions of uni-variate prediction techniques lead to an unwelcome increase in the cost of multi-variate model learning and significant deterioration in model performance. In this paper, we first argue that (a) one can learn a more accurate forecasting model by leveraging temporal alignments among variates to quantify the importance of the recorded variates with respect to a target variate. We further argue that (b) for this purpose we need to quantify temporal correlation, not in terms of series similarity, but in terms of temporal alignments of key “events” impacting these series. Finally, we argue that (c) while learning a temporal model using recurrence-based techniques (such as RNN and LSTM, even when leveraging attention strategies) is difficult and costly, we can achieve better performance by coupling simpler CNNs with an adaptive variate selection strategy. Relying on these arguments, we propose the Selego framework (Selego is a word of Latin origin meaning “selection”) for variate selection and experimentally evaluate the proposed approach on various forecasting models, such as LSTM, RNN, and CNN, for different top-X% variates and different forecasting lead times on multiple real-world datasets. Experiments show that the proposed framework can offer significant (90–98%) reductions in the number of recorded variates needed to train predictive models, while simultaneously boosting accuracy.
KW - Forecasting
KW - Recurrent and convolutional networks
KW - Variate selection
UR - http://www.scopus.com/inward/record.url?scp=85111486448&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85111486448&partnerID=8YFLogxK
U2 - 10.1007/s10618-021-00777-1
DO - 10.1007/s10618-021-00777-1
M3 - Article
AN - SCOPUS:85111486448
SN - 1384-5810
VL - 35
SP - 2141
EP - 2167
JO - Data Mining and Knowledge Discovery
JF - Data Mining and Knowledge Discovery
IS - 5
ER -