TY - GEN
T1 - Story detection using generalized concepts and relations
AU - Ceran, Betul
AU - Kedia, Nitesh
AU - Corman, Steven
AU - Davulcu, Hasan
N1 - Funding Information:
Acknowledgment: This research was supported by Office of Naval Research grants N00014-09-1-0872 and N00014-14-1-0477 and was performed at Arizona State University. Some of the material presented here was sponsored by the Department of Defense and is approved for public release, case number: 15-467.
Publisher Copyright:
© 2015 ACM.
PY - 2015/8/25
Y1 - 2015/8/25
N2 - A major challenge in automated text analysis is that different words are used for related concepts. Analyzing text at the surface level treats related concepts (i.e., actors, actions, targets, and victims) as different objects, potentially missing common narrative patterns. Shallow parsers reveal the semantic roles of words, yielding subject-verb-object triplets. We developed a novel algorithm that extracts information from triplets by clustering them into generalized concepts, using syntactic criteria based on common contexts and semantic, corpus-based statistical criteria based on "contextual synonyms". We show that a generalized-concept representation of text (1) overcomes surface-level differences (which arise when different keywords are used for related concepts) without drift, (2) leads to a higher-level semantic network representation of related stories, and (3) when used as features, yields a significant 36% boost in performance on the story detection task.
AB - A major challenge in automated text analysis is that different words are used for related concepts. Analyzing text at the surface level treats related concepts (i.e., actors, actions, targets, and victims) as different objects, potentially missing common narrative patterns. Shallow parsers reveal the semantic roles of words, yielding subject-verb-object triplets. We developed a novel algorithm that extracts information from triplets by clustering them into generalized concepts, using syntactic criteria based on common contexts and semantic, corpus-based statistical criteria based on "contextual synonyms". We show that a generalized-concept representation of text (1) overcomes surface-level differences (which arise when different keywords are used for related concepts) without drift, (2) leads to a higher-level semantic network representation of related stories, and (3) when used as features, yields a significant 36% boost in performance on the story detection task.
UR - http://www.scopus.com/inward/record.url?scp=84962530669&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84962530669&partnerID=8YFLogxK
U2 - 10.1145/2808797.2809312
DO - 10.1145/2808797.2809312
M3 - Conference contribution
AN - SCOPUS:84962530669
T3 - Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015
SP - 942
EP - 949
BT - Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015
A2 - Pei, Jian
A2 - Tang, Jie
A2 - Silvestri, Fabrizio
PB - Association for Computing Machinery, Inc
T2 - IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015
Y2 - 25 August 2015 through 28 August 2015
ER -