TY - GEN
T1 - Incorporating emoji descriptions improves tweet classification
AU - Singh, Abhishek
AU - Blanco, Eduardo
AU - Jin, Wei
N1 - Funding Information:
This material is based upon work supported by the National Science Foundation under Grant Nos. 1734730, 1832267 and 1845757. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. The Titan Xp used for this research was donated by the NVIDIA Corporation.
Publisher Copyright:
© 2019 Association for Computational Linguistics
PY - 2019
Y1 - 2019
AB - Tweets are short messages that often include specialized language such as hashtags and emojis. In this paper, we present a simple strategy to process emojis: replace them with their natural language description and use pretrained word embeddings, as is normally done with standard words. We show that this strategy is more effective than using pretrained emoji embeddings for tweet classification. Specifically, we obtain new state-of-the-art results in irony detection and sentiment analysis even though our neural network is simpler than previous proposals.
UR - http://www.scopus.com/inward/record.url?scp=85085529066&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85085529066&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85085529066
T3 - NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
SP - 2096
EP - 2101
BT - Long and Short Papers
PB - Association for Computational Linguistics (ACL)
T2 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2019
Y2 - 2 June 2019 through 7 June 2019
ER -