TY - GEN
T1 - Topological Knowledge Distillation for Wearable Sensor Data
AU - Jeon, Eun Som
AU - Choi, Hongjun
AU - Shukla, Ankita
AU - Wang, Yuan
AU - Buman, Matthew P.
AU - Turaga, Pavan
N1 - Funding Information:
This research was funded by NIH R01GM135927, as part of the Joint DMS/NIGMS Initiative to Support Research at the Interface of the Biological and Mathematical Sciences. Y. Wang is partially supported by the Pilot Project Program of the Big Data Health Science Center at the University of South Carolina.
Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Converting wearable sensor data into actionable health insights has attracted broad interest in recent years. Deep learning methods have achieved considerable success in various applications involving wearables. However, wearable sensor data presents unique challenges: sensitivity and variability between subjects, and dependency on sampling rate for analysis. To mitigate these issues, topological data analysis (TDA) has shown promise as a complementary approach. TDA captures robust features, such as persistence images (PIs), from complex data through the persistent homology algorithm, which holds the promise of boosting machine learning performance. However, because of the computational load TDA methods require for large-scale data, their integration and implementation have lagged behind. Further, many applications involving wearables require models compact enough for deployment on edge devices. In this context, knowledge distillation (KD) has been widely applied to generate a small model (student model) using a pre-trained high-capacity network (teacher model). In this paper, we propose a new KD strategy using two teacher models: one that uses the raw time series and another that uses persistence images derived from the time series. These two teachers then train a student using KD; in essence, the student learns from heterogeneous teachers providing different knowledge. To account for the differing properties of the teachers' features, we apply an annealing strategy and an adaptive temperature in KD. Finally, a robust student model is distilled that uses the time-series data only. We find that incorporating persistence features via the second teacher leads to significantly improved performance. This approach provides a unique way of fusing deep learning with topological features to develop effective models.
AB - Converting wearable sensor data into actionable health insights has attracted broad interest in recent years. Deep learning methods have achieved considerable success in various applications involving wearables. However, wearable sensor data presents unique challenges: sensitivity and variability between subjects, and dependency on sampling rate for analysis. To mitigate these issues, topological data analysis (TDA) has shown promise as a complementary approach. TDA captures robust features, such as persistence images (PIs), from complex data through the persistent homology algorithm, which holds the promise of boosting machine learning performance. However, because of the computational load TDA methods require for large-scale data, their integration and implementation have lagged behind. Further, many applications involving wearables require models compact enough for deployment on edge devices. In this context, knowledge distillation (KD) has been widely applied to generate a small model (student model) using a pre-trained high-capacity network (teacher model). In this paper, we propose a new KD strategy using two teacher models: one that uses the raw time series and another that uses persistence images derived from the time series. These two teachers then train a student using KD; in essence, the student learns from heterogeneous teachers providing different knowledge. To account for the differing properties of the teachers' features, we apply an annealing strategy and an adaptive temperature in KD. Finally, a robust student model is distilled that uses the time-series data only. We find that incorporating persistence features via the second teacher leads to significantly improved performance. This approach provides a unique way of fusing deep learning with topological features to develop effective models.
KW - knowledge distillation
KW - time series data analysis
KW - topological data analysis
KW - wearable sensor data
UR - http://www.scopus.com/inward/record.url?scp=85150222712&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85150222712&partnerID=8YFLogxK
U2 - 10.1109/IEEECONF56349.2022.10052019
DO - 10.1109/IEEECONF56349.2022.10052019
M3 - Conference contribution
AN - SCOPUS:85150222712
T3 - Conference Record - Asilomar Conference on Signals, Systems and Computers
SP - 837
EP - 842
BT - 56th Asilomar Conference on Signals, Systems and Computers, ACSSC 2022
A2 - Matthews, Michael B.
PB - IEEE Computer Society
T2 - 56th Asilomar Conference on Signals, Systems and Computers, ACSSC 2022
Y2 - 31 October 2022 through 2 November 2022
ER -