Application of data science tools to determine feature correlation and cluster metal hydrides for hydrogen storage

Alireza Rahnama, Seetharaman Sridhar

Research output: Contribution to journal › Article › peer-review

8 Scopus citations


In this paper, the openly available database provided by the US Department of Energy on metal hydrides for hydrogen energy was studied through unsupervised machine learning to identify similarities among samples that are originally assigned to different material classes. We employed the k-means algorithm to investigate the similar behaviour of different material classes in relation to hydrogen weight percent and operating parameters. It was found that the optimal number of clusters within the dataset is 3, even though the data points are classified into eight different material classes. We employed a discrete linear convolution method for anomaly detection, to identify irregularities and outliers in the dataset. In addition, kernel density estimation was employed, and the results showed that most data points lie in the temperature range 0 to 200 °C, the pressure range 0 to 5 atm and the hydrogen weight percent range 0 to 2 wt.%. Our investigation showed that most of the outliers belong to the complex material class. The analysis of clustering behaviour showed that A2B, complex hydrides and Mg-based alloys clustered together, which is supported by the fact that many samples with the same structures belong to these three classes simultaneously. It was also found that the removal of temperature or heat of formation significantly changes the clustering behaviour. The proposed method in this study can be used to find the closest material chemistry for a desired set of properties.
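The abstract does not say how the discrete linear convolution was configured, so the following is only a minimal sketch of one common way such an anomaly detector can be built: the series is convolved with a uniform kernel to obtain a moving average, and points whose residual from that average exceeds a multiple of the residual standard deviation are flagged. The function names (`convolve_same`, `find_anomalies`), the window length, the `n_sigma` threshold and the edge normalisation are all assumptions made for illustration, not details taken from the paper.

```python
from statistics import pstdev


def convolve_same(signal, kernel):
    """Discrete linear convolution, cropped to the length of `signal`.

    Samples outside the signal are treated as zero (zero padding).
    For an even-length kernel the crop is offset by one sample.
    """
    n, m = len(signal), len(kernel)
    full = [0.0] * (n + m - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            full[i + j] += s * k
    start = (m - 1) // 2
    return full[start:start + n]


def find_anomalies(values, window=3, n_sigma=2.0):
    """Return indices whose residual from a moving average is unusually large.

    `window` and `n_sigma` are illustrative defaults (assumptions).
    """
    kernel = [1.0 / window] * window  # uniform (moving-average) kernel
    raw = convolve_same(values, kernel)
    # Convolving a series of ones gives the kernel weight that actually
    # overlapped the data at each position; dividing by it removes the
    # downward bias that zero padding introduces at the edges.
    weights = convolve_same([1.0] * len(values), kernel)
    smoothed = [r / w for r, w in zip(raw, weights)]
    residuals = [v - s for v, s in zip(values, smoothed)]
    sigma = pstdev(residuals)
    if sigma == 0:
        return []  # perfectly flat residuals: nothing stands out
    return [i for i, r in enumerate(residuals) if abs(r) > n_sigma * sigma]
```

For example, `find_anomalies([1.0, 1.1, 0.9, 1.0, 6.0, 1.0, 1.1, 0.9, 1.0, 1.0])` flags only index 4, the isolated spike. Note that the neighbours of a spike also receive inflated residuals, because the spike pulls their moving average upward, so the threshold has to be chosen with that in mind.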

Original language: English (US)
Article number: 100366
State: Published - Sep 2019
Externally published: Yes


  • Artificial intelligence
  • Clustering
  • Hydrogen storage materials
  • Machine-learning
  • Metal hydrides

ASJC Scopus subject areas

  • General Materials Science

