Topological persistence guided knowledge distillation for wearable sensor data

Eun Som Jeon; Hongjun Choi; Ankita Shukla; Yuan Wang; Hyunglae Lee; Matthew P. Buman; Pavan Turaga

doi:10.1016/j.engappai.2023.107719

Research output: Contribution to journal › Article › peer-review

Abstract

Deep learning methods have achieved considerable success in applications that convert wearable sensor data into actionable health insights. A common application area is activity recognition, where deep-learning methods still suffer from limitations such as sensitivity to signal quality, sensor characteristic variations, and variability between subjects. To mitigate these issues, robust features obtained by topological data analysis (TDA) have been suggested as a potential solution. However, there are two significant obstacles to using topological features in deep learning: (1) the large computational load of extracting topological features using TDA, and (2) the different signal representations obtained from deep learning and TDA, which make fusion difficult. In this paper, to enable integration of the strengths of topological methods in deep learning for time-series data, we propose to use two teacher networks: one trained on the raw time-series data, and another trained on persistence images generated by TDA methods. These two teachers are jointly used to distill a single student model, which utilizes only the raw time-series data at test time. This approach addresses both issues. The use of knowledge distillation (KD) with multiple teachers utilizes complementary information and results in a compact model with strong supervisory features and an integrated, richer representation. To assimilate desirable information from different modalities, we design new constraints, including orthogonality imposed on feature correlation maps, to improve feature expressiveness and allow the student to easily learn from the teacher. We also apply an annealing strategy in KD for fast saturation and better accommodation of different features, while the knowledge gap between the teachers and student is reduced. Finally, a robust student model is distilled, which at test time uses only the time-series data as input while implicitly preserving topological features.
The experimental results demonstrate the effectiveness of the proposed method on wearable sensor data. The proposed method achieves 71.74% classification accuracy on GENEActiv with a WRN16-1 (1D CNNs) student, which outperforms baselines and takes much less processing time (less than 17 sec) than the teachers on 6k testing samples.
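
As a rough illustration of the two-teacher idea described in the abstract, the following is a minimal sketch of a logit-level distillation loss in which a student is pulled toward two teachers at once. It is not the paper's formulation: the function names, the weights `alpha` and `lam`, the temperature, and the use of a KL term on softened logits are all assumptions chosen for illustration, and the orthogonality constraint on feature correlation maps and the annealing strategy are not modeled here.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def two_teacher_kd_loss(student_logits, ts_teacher_logits, pi_teacher_logits,
                        label, alpha=0.5, lam=0.9, temperature=4.0):
    """Hypothetical loss: cross-entropy plus a weighted KD term from two teachers.

    ts_teacher_logits: logits from a teacher trained on raw time series.
    pi_teacher_logits: logits from a teacher trained on persistence images.
    alpha balances the two teachers; lam balances KD against cross-entropy.
    All three hyperparameter defaults are illustrative assumptions.
    """
    # Hard-label cross-entropy on the student's unsoftened prediction.
    ce = -math.log(softmax(student_logits)[label] + 1e-12)
    # Softened student distribution, compared against each softened teacher.
    s_soft = softmax(student_logits, temperature)
    kd_ts = kl_divergence(softmax(ts_teacher_logits, temperature), s_soft)
    kd_pi = kl_divergence(softmax(pi_teacher_logits, temperature), s_soft)
    # The T^2 factor is the usual scaling for temperature-softened KD terms.
    kd = temperature ** 2 * (alpha * kd_ts + (1 - alpha) * kd_pi)
    return (1 - lam) * ce + lam * kd

# Usage with made-up logits for a 3-class problem.
loss = two_teacher_kd_loss([2.0, 0.5, -1.0], [3.0, 0.0, -2.0], [1.5, 1.0, -0.5], label=0)
print(loss)
```

Only the student's forward pass is needed at test time in such a setup, which is consistent with the abstract's point that the persistence-image pipeline is not required for inference.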

Original language	English (US)
Article number	107719
Journal	Engineering Applications of Artificial Intelligence
Volume	130
DOIs	https://doi.org/10.1016/j.engappai.2023.107719
State	Published - Apr 2024

Keywords

Deep learning
Feature orthogonality
Knowledge distillation
Topological data analysis
Wearable sensor data

ASJC Scopus subject areas

Control and Systems Engineering
Artificial Intelligence
Electrical and Electronic Engineering

Access to Document

10.1016/j.engappai.2023.107719

Cite this

@article{5702bb111c9e41f289d3faf14ed6a012,

title = "Topological persistence guided knowledge distillation for wearable sensor data",

abstract = "Deep learning methods have achieved considerable success in applications that convert wearable sensor data into actionable health insights. A common application area is activity recognition, where deep-learning methods still suffer from limitations such as sensitivity to signal quality, sensor characteristic variations, and variability between subjects. To mitigate these issues, robust features obtained by topological data analysis (TDA) have been suggested as a potential solution. However, there are two significant obstacles to using topological features in deep learning: (1) the large computational load of extracting topological features using TDA, and (2) the different signal representations obtained from deep learning and TDA, which make fusion difficult. In this paper, to enable integration of the strengths of topological methods in deep learning for time-series data, we propose to use two teacher networks: one trained on the raw time-series data, and another trained on persistence images generated by TDA methods. These two teachers are jointly used to distill a single student model, which utilizes only the raw time-series data at test time. This approach addresses both issues. The use of knowledge distillation (KD) with multiple teachers utilizes complementary information and results in a compact model with strong supervisory features and an integrated, richer representation. To assimilate desirable information from different modalities, we design new constraints, including orthogonality imposed on feature correlation maps, to improve feature expressiveness and allow the student to easily learn from the teacher. We also apply an annealing strategy in KD for fast saturation and better accommodation of different features, while the knowledge gap between the teachers and student is reduced. Finally, a robust student model is distilled, which at test time uses only the time-series data as input while implicitly preserving topological features.
The experimental results demonstrate the effectiveness of the proposed method on wearable sensor data. The proposed method achieves 71.74% classification accuracy on GENEActiv with a WRN16-1 (1D CNNs) student, which outperforms baselines and takes much less processing time (less than 17 sec) than the teachers on 6k testing samples.",

keywords = "Deep learning, Feature orthogonality, Knowledge distillation, Topological data analysis, Wearable sensor data",

author = "Jeon, {Eun Som} and Hongjun Choi and Ankita Shukla and Yuan Wang and Hyunglae Lee and Buman, {Matthew P.} and Pavan Turaga",

note = "Publisher Copyright: {\textcopyright} 2023 Elsevier Ltd",

year = "2024",

month = apr,

doi = "10.1016/j.engappai.2023.107719",

language = "English (US)",

volume = "130",

journal = "Engineering Applications of Artificial Intelligence",

issn = "0952-1976",

publisher = "Elsevier Limited",

}

TY - JOUR

T1 - Topological persistence guided knowledge distillation for wearable sensor data

AU - Jeon, Eun Som

AU - Choi, Hongjun

AU - Shukla, Ankita

AU - Wang, Yuan

AU - Lee, Hyunglae

AU - Buman, Matthew P.

AU - Turaga, Pavan

PY - 2024/4

Y1 - 2024/4

N2 - Deep learning methods have achieved considerable success in applications that convert wearable sensor data into actionable health insights. A common application area is activity recognition, where deep-learning methods still suffer from limitations such as sensitivity to signal quality, sensor characteristic variations, and variability between subjects. To mitigate these issues, robust features obtained by topological data analysis (TDA) have been suggested as a potential solution. However, there are two significant obstacles to using topological features in deep learning: (1) the large computational load of extracting topological features using TDA, and (2) the different signal representations obtained from deep learning and TDA, which make fusion difficult. In this paper, to enable integration of the strengths of topological methods in deep learning for time-series data, we propose to use two teacher networks: one trained on the raw time-series data, and another trained on persistence images generated by TDA methods. These two teachers are jointly used to distill a single student model, which utilizes only the raw time-series data at test time. This approach addresses both issues. The use of knowledge distillation (KD) with multiple teachers utilizes complementary information and results in a compact model with strong supervisory features and an integrated, richer representation. To assimilate desirable information from different modalities, we design new constraints, including orthogonality imposed on feature correlation maps, to improve feature expressiveness and allow the student to easily learn from the teacher. We also apply an annealing strategy in KD for fast saturation and better accommodation of different features, while the knowledge gap between the teachers and student is reduced. Finally, a robust student model is distilled, which at test time uses only the time-series data as input while implicitly preserving topological features.
The experimental results demonstrate the effectiveness of the proposed method on wearable sensor data. The proposed method achieves 71.74% classification accuracy on GENEActiv with a WRN16-1 (1D CNNs) student, which outperforms baselines and takes much less processing time (less than 17 sec) than the teachers on 6k testing samples.

AB - Deep learning methods have achieved considerable success in applications that convert wearable sensor data into actionable health insights. A common application area is activity recognition, where deep-learning methods still suffer from limitations such as sensitivity to signal quality, sensor characteristic variations, and variability between subjects. To mitigate these issues, robust features obtained by topological data analysis (TDA) have been suggested as a potential solution. However, there are two significant obstacles to using topological features in deep learning: (1) the large computational load of extracting topological features using TDA, and (2) the different signal representations obtained from deep learning and TDA, which make fusion difficult. In this paper, to enable integration of the strengths of topological methods in deep learning for time-series data, we propose to use two teacher networks: one trained on the raw time-series data, and another trained on persistence images generated by TDA methods. These two teachers are jointly used to distill a single student model, which utilizes only the raw time-series data at test time. This approach addresses both issues. The use of knowledge distillation (KD) with multiple teachers utilizes complementary information and results in a compact model with strong supervisory features and an integrated, richer representation. To assimilate desirable information from different modalities, we design new constraints, including orthogonality imposed on feature correlation maps, to improve feature expressiveness and allow the student to easily learn from the teacher. We also apply an annealing strategy in KD for fast saturation and better accommodation of different features, while the knowledge gap between the teachers and student is reduced. Finally, a robust student model is distilled, which at test time uses only the time-series data as input while implicitly preserving topological features.
The experimental results demonstrate the effectiveness of the proposed method on wearable sensor data. The proposed method achieves 71.74% classification accuracy on GENEActiv with a WRN16-1 (1D CNNs) student, which outperforms baselines and takes much less processing time (less than 17 sec) than the teachers on 6k testing samples.

KW - Deep learning

KW - Feature orthogonality

KW - Knowledge distillation

KW - Topological data analysis

KW - Wearable sensor data

UR - http://www.scopus.com/inward/record.url?scp=85180407551&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85180407551&partnerID=8YFLogxK

U2 - 10.1016/j.engappai.2023.107719

DO - 10.1016/j.engappai.2023.107719

M3 - Article

AN - SCOPUS:85180407551

SN - 0952-1976

VL - 130

JO - Engineering Applications of Artificial Intelligence

JF - Engineering Applications of Artificial Intelligence

M1 - 107719

ER -
