TY - GEN
T1 - A new optimization criterion for generalized discriminant analysis on undersampled problems
AU - Ye, Jieping
AU - Janardan, Ravi
AU - Park, Cheong Hee
AU - Park, Haesun
N1 - Copyright:
Copyright 2010 Elsevier B.V., All rights reserved.
PY - 2003
Y1 - 2003
N2 - A new optimization criterion for discriminant analysis is presented. The new criterion extends the optimization criteria of the classical linear discriminant analysis (LDA) by introducing the pseudo-inverse when the scatter matrices are singular. It is applicable regardless of the relative sizes of the data dimension and sample size, overcoming a limitation of the classical LDA. Recently, a new algorithm called LDA/GSVD for structure-preserving dimension reduction has been introduced, which extends the classical LDA to very high-dimensional undersampled problems by using the generalized singular value decomposition (GSVD). The solution from the LDA/GSVD algorithm is a special case of the solution for our generalized criterion in this paper, which is also based on GSVD. We also present an approximate solution for our GSVDbased solution, which reduces computational complexity by finding sub-clusters of each cluster, and using their centroids to capture the structure of each cluster. This reduced problem yields much smaller matrices of which the GSVD can be applied efficiently. Experiments on text data, with up to 7000 dimensions, show that the approximation algorithm produces results that are close to those produced by the exact algorithm.
AB - A new optimization criterion for discriminant analysis is presented. The new criterion extends the optimization criteria of the classical linear discriminant analysis (LDA) by introducing the pseudo-inverse when the scatter matrices are singular. It is applicable regardless of the relative sizes of the data dimension and sample size, overcoming a limitation of the classical LDA. Recently, a new algorithm called LDA/GSVD for structure-preserving dimension reduction has been introduced, which extends the classical LDA to very high-dimensional undersampled problems by using the generalized singular value decomposition (GSVD). The solution from the LDA/GSVD algorithm is a special case of the solution for our generalized criterion in this paper, which is also based on GSVD. We also present an approximate solution for our GSVDbased solution, which reduces computational complexity by finding sub-clusters of each cluster, and using their centroids to capture the structure of each cluster. This reduced problem yields much smaller matrices of which the GSVD can be applied efficiently. Experiments on text data, with up to 7000 dimensions, show that the approximation algorithm produces results that are close to those produced by the exact algorithm.
UR - http://www.scopus.com/inward/record.url?scp=78149326379&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78149326379&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:78149326379
SN - 0769519784
SN - 9780769519784
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 419
EP - 426
BT - Proceedings - 3rd IEEE International Conference on Data Mining, ICDM 2003
T2 - 3rd IEEE International Conference on Data Mining, ICDM '03
Y2 - 19 November 2003 through 22 November 2003
ER -