A new optimization criterion for generalized discriminant analysis on undersampled problems

Jieping Ye; Ravi Janardan; Cheong Hee Park; Haesun Park

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A new optimization criterion for discriminant analysis is presented. The new criterion extends the optimization criteria of the classical linear discriminant analysis (LDA) by introducing the pseudo-inverse when the scatter matrices are singular. It is applicable regardless of the relative sizes of the data dimension and sample size, overcoming a limitation of the classical LDA. Recently, a new algorithm called LDA/GSVD for structure-preserving dimension reduction has been introduced, which extends the classical LDA to very high-dimensional undersampled problems by using the generalized singular value decomposition (GSVD). The solution from the LDA/GSVD algorithm is a special case of the solution for our generalized criterion in this paper, which is also based on GSVD. We also present an approximate solution for our GSVD-based solution, which reduces computational complexity by finding sub-clusters of each cluster, and using their centroids to capture the structure of each cluster. This reduced problem yields much smaller matrices to which the GSVD can be applied efficiently. Experiments on text data, with up to 7000 dimensions, show that the approximation algorithm produces results that are close to those produced by the exact algorithm.
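
The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the paper solves its criterion through the GSVD, whereas this sketch simply takes the leading eigenvectors of pinv(S_t) S_b, which is one common way to write a pseudo-inverse LDA criterion and is an assumption here. The sub-cluster step uses a plain k-means loop with a deterministic start, also an assumption, since the abstract does not say how sub-clusters are found. All function names (scatter_matrices, pinv_lda, subcluster_centroids, reduce_by_centroids) and the rcond cutoff are hypothetical.

```python
import numpy as np


def scatter_matrices(X, y):
    """Between-class (S_b) and total (S_t) scatter; rows of X are samples."""
    mean = X.mean(axis=0)
    S_b = np.zeros((X.shape[1], X.shape[1]))
    for label in np.unique(y):
        Xc = X[y == label]
        diff = (Xc.mean(axis=0) - mean)[:, None]
        S_b += Xc.shape[0] * (diff @ diff.T)
    centered = X - mean
    return S_b, centered.T @ centered


def pinv_lda(X, y, n_components):
    """Leading eigenvectors of pinv(S_t) @ S_b (illustrative, not the GSVD route)."""
    S_b, S_t = scatter_matrices(X, y)
    # The pseudo-inverse stands in for the inverse when S_t is singular,
    # which is always the case when there are fewer samples than dimensions.
    # rcond=1e-10 is an arbitrary cutoff chosen for this sketch.
    M = np.linalg.pinv(S_t, rcond=1e-10) @ S_b
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(-eigvals.real)[:n_components]
    return eigvecs[:, order].real


def subcluster_centroids(Xc, n_sub, n_iter=10):
    """Plain k-means on one class, started from its first n_sub samples."""
    k = min(n_sub, len(Xc))
    centroids = Xc[:k].copy()
    for _ in range(n_iter):
        dists = ((Xc[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = Xc[assign == j]
            if len(members):  # leave a centroid unchanged if its cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def reduce_by_centroids(X, y, n_sub):
    """Replace each class by its sub-cluster centroids, keeping the class label."""
    parts, labels = [], []
    for label in np.unique(y):
        C = subcluster_centroids(X[y == label], n_sub)
        parts.append(C)
        labels.append(np.full(len(C), label))
    return np.vstack(parts), np.concatenate(labels)
```

A possible usage, under the same assumptions: call reduce_by_centroids(X, y, n_sub) to obtain a much smaller labeled set, then pass that set to pinv_lda in place of the full data. The scatter matrices here are still d by d, so the sketch shows the structure of the approximation rather than the savings the paper obtains by applying the GSVD to the smaller factor matrices.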

Original language	English (US)
Title of host publication	Proceedings - 3rd IEEE International Conference on Data Mining, ICDM 2003
Pages	419-426
Number of pages	8
State	Published - 2003
Externally published	Yes
Event	3rd IEEE International Conference on Data Mining, ICDM '03 - Melbourne, FL, United States
Duration	Nov 19 2003 → Nov 22 2003

Publication series

Name	Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print)	1550-4786

Other

Other	3rd IEEE International Conference on Data Mining, ICDM '03
Country/Territory	United States
City	Melbourne, FL
Period	11/19/03 → 11/22/03

ASJC Scopus subject areas

General Engineering

Cite this

A new optimization criterion for generalized discriminant analysis on undersampled problems. / Ye, Jieping; Janardan, Ravi; Park, Cheong Hee et al.
Proceedings - 3rd IEEE International Conference on Data Mining, ICDM 2003. 2003. p. 419-426 (Proceedings - IEEE International Conference on Data Mining, ICDM).

Ye, J, Janardan, R, Park, CH & Park, H 2003, A new optimization criterion for generalized discriminant analysis on undersampled problems. in Proceedings - 3rd IEEE International Conference on Data Mining, ICDM 2003. Proceedings - IEEE International Conference on Data Mining, ICDM, pp. 419-426, 3rd IEEE International Conference on Data Mining, ICDM '03, Melbourne, FL, United States, 11/19/03.

@inproceedings{66a4eae1d4a842929cd61b84440fb3bb,

title = "A new optimization criterion for generalized discriminant analysis on undersampled problems",

abstract = "A new optimization criterion for discriminant analysis is presented. The new criterion extends the optimization criteria of the classical linear discriminant analysis (LDA) by introducing the pseudo-inverse when the scatter matrices are singular. It is applicable regardless of the relative sizes of the data dimension and sample size, overcoming a limitation of the classical LDA. Recently, a new algorithm called LDA/GSVD for structure-preserving dimension reduction has been introduced, which extends the classical LDA to very high-dimensional undersampled problems by using the generalized singular value decomposition (GSVD). The solution from the LDA/GSVD algorithm is a special case of the solution for our generalized criterion in this paper, which is also based on GSVD. We also present an approximate solution for our GSVD-based solution, which reduces computational complexity by finding sub-clusters of each cluster, and using their centroids to capture the structure of each cluster. This reduced problem yields much smaller matrices to which the GSVD can be applied efficiently. Experiments on text data, with up to 7000 dimensions, show that the approximation algorithm produces results that are close to those produced by the exact algorithm.",

author = "Jieping Ye and Ravi Janardan and Park, {Cheong Hee} and Haesun Park",

year = "2003",

language = "English (US)",

isbn = "0769519784",

series = "Proceedings - IEEE International Conference on Data Mining, ICDM",

pages = "419--426",

booktitle = "Proceedings - 3rd IEEE International Conference on Data Mining, ICDM 2003",

}

TY - GEN

T1 - A new optimization criterion for generalized discriminant analysis on undersampled problems

AU - Ye, Jieping

AU - Janardan, Ravi

AU - Park, Cheong Hee

AU - Park, Haesun

PY - 2003

Y1 - 2003

N2 - A new optimization criterion for discriminant analysis is presented. The new criterion extends the optimization criteria of the classical linear discriminant analysis (LDA) by introducing the pseudo-inverse when the scatter matrices are singular. It is applicable regardless of the relative sizes of the data dimension and sample size, overcoming a limitation of the classical LDA. Recently, a new algorithm called LDA/GSVD for structure-preserving dimension reduction has been introduced, which extends the classical LDA to very high-dimensional undersampled problems by using the generalized singular value decomposition (GSVD). The solution from the LDA/GSVD algorithm is a special case of the solution for our generalized criterion in this paper, which is also based on GSVD. We also present an approximate solution for our GSVD-based solution, which reduces computational complexity by finding sub-clusters of each cluster, and using their centroids to capture the structure of each cluster. This reduced problem yields much smaller matrices to which the GSVD can be applied efficiently. Experiments on text data, with up to 7000 dimensions, show that the approximation algorithm produces results that are close to those produced by the exact algorithm.

AB - A new optimization criterion for discriminant analysis is presented. The new criterion extends the optimization criteria of the classical linear discriminant analysis (LDA) by introducing the pseudo-inverse when the scatter matrices are singular. It is applicable regardless of the relative sizes of the data dimension and sample size, overcoming a limitation of the classical LDA. Recently, a new algorithm called LDA/GSVD for structure-preserving dimension reduction has been introduced, which extends the classical LDA to very high-dimensional undersampled problems by using the generalized singular value decomposition (GSVD). The solution from the LDA/GSVD algorithm is a special case of the solution for our generalized criterion in this paper, which is also based on GSVD. We also present an approximate solution for our GSVD-based solution, which reduces computational complexity by finding sub-clusters of each cluster, and using their centroids to capture the structure of each cluster. This reduced problem yields much smaller matrices to which the GSVD can be applied efficiently. Experiments on text data, with up to 7000 dimensions, show that the approximation algorithm produces results that are close to those produced by the exact algorithm.

UR - http://www.scopus.com/inward/record.url?scp=78149326379&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78149326379&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:78149326379

SN - 0769519784

SN - 9780769519784

T3 - Proceedings - IEEE International Conference on Data Mining, ICDM

SP - 419

EP - 426

BT - Proceedings - 3rd IEEE International Conference on Data Mining, ICDM 2003

T2 - 3rd IEEE International Conference on Data Mining, ICDM '03

Y2 - 19 November 2003 through 22 November 2003

ER -
