TY - JOUR
T1 - Analysis of Large Heterogeneous Repairable System Reliability Data With Static System Attributes and Dynamic Sensor Measurement in Big Data Environment
AU - Liu, Xiao
AU - Pan, Rong
N1 - Publisher Copyright: © 2019 American Statistical Association and the American Society for Quality.
PY - 2020/4/2
Y1 - 2020/4/2
AB - In the age of Big Data, one pressing challenge facing engineers is to perform reliability analysis for a large fleet of heterogeneous repairable systems with covariates. In addition to static covariates, which include time-invariant system attributes such as nominal operating conditions, geo-locations, etc., the recent advances of sensing technologies have also made it possible to obtain dynamic sensor measurement of system operating and environmental conditions. As a common practice in the Big Data environment, the massive reliability data are typically stored in some distributed storage systems. Leveraging the power of modern statistical learning, this article investigates a statistical approach which integrates the random forests algorithm and the classical data analysis methodologies for repairable system reliability, such as the nonparametric estimator for the mean cumulative function and the parametric models based on the nonhomogeneous Poisson process. We show that the proposed approach effectively addresses some common challenges arising from practice, including system heterogeneity, covariate selection, model specification and data locality due to the distributed data storage. The large sample properties as well as the uniform consistency of the proposed estimator are established. Two numerical examples and a case study are presented to illustrate the application of the proposed approach. The strengths of the proposed approach are demonstrated by comparison studies. Datasets and computer code have been made available on GitHub.
KW - Mean cumulative function
KW - Nonhomogeneous Poisson process
KW - Random forests
KW - Recurrence data
KW - Repairable reliability data analysis
UR - http://www.scopus.com/inward/record.url?scp=85084621360&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85084621360&partnerID=8YFLogxK
U2 - 10.1080/00401706.2019.1609584
DO - 10.1080/00401706.2019.1609584
M3 - Article
AN - SCOPUS:85084621360
SN - 0040-1706
VL - 62
SP - 206
EP - 222
JO - Technometrics
JF - Technometrics
IS - 2
ER -