TY - GEN
T1 - Evaluating distributional semantic and feature selection for extracting relationships from biological text
AU - Emadzadeh, Ehsan
AU - Jonnalagadda, Siddhartha
AU - Gonzalez, Graciela
PY - 2011
Y1 - 2011
N2 - The constant flow of biomolecular findings being published each day challenges our ability to develop methods to automatically extract the knowledge expressed in text to potentially influence new discoveries. Finding relations between the biological entities (e.g. proteins and genes) in text is a challenging task. To facilitate the extraction process, a relation can be decomposed into a trigger and the complementary arguments (e.g. theme, site). Several approaches have been proposed based on machine learning which generally use a common set of features for all trigger types. Here we evaluate the impact of applying a feature selection method for trigger classification. Our proposed method uses a greedy feature selection algorithm to find an optimal set of attributes for each trigger type. We show that using the customized set of features can improve classification results significantly (up to 53.96% in f-measure). In addition, we evaluated different settings for including semantic features in the classifiers. We found that using semantic features can improve classification results and found the best setting for each trigger type.
KW - Distributional Semantic
KW - Feature selection
KW - NLP
KW - Relation Extraction
UR - http://www.scopus.com/inward/record.url?scp=84857874139&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84857874139&partnerID=8YFLogxK
U2 - 10.1109/ICMLA.2011.65
DO - 10.1109/ICMLA.2011.65
M3 - Conference contribution
AN - SCOPUS:84857874139
SN - 9780769546070
T3 - Proceedings - 10th International Conference on Machine Learning and Applications, ICMLA 2011
SP - 66
EP - 71
BT - Proceedings - 10th International Conference on Machine Learning and Applications, ICMLA 2011
T2 - 10th International Conference on Machine Learning and Applications, ICMLA 2011
Y2 - 18 December 2011 through 21 December 2011
ER -