TY - GEN
T1 - Generating Topic-Preserving Synthetic News
AU - Mosallanezhad, Ahmadreza
AU - Shu, Kai
AU - Liu, Huan
N1 - Funding Information:
This material is based upon work supported, in part, by ONR N00014-21-1-4002 and AFRL FA8650-15-D-6583/FA8650-17-F-6820. Kai Shu is supported by the NSF award #2109316.
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Text generation methods have witnessed great success in text summarization, machine translation, and synthetic news generation. However, these techniques may be abused to generate disinformation and fake news. To better understand the potential threats of synthetic news, we develop a novel generation method, RLTG, for producing topic-preserving news content. Most existing text generation methods are either controlled by specific attributes or lack topic consistency between the input claims and the output news, making the synthetic news less coherent and realistic. In this paper, we study the problem of topic-preserving synthetic news generation by proposing a novel deep reinforcement learning-based method to control the output of large pre-trained language models. Experimental results on real-world datasets demonstrate that the news content generated by RLTG is topic-consistent and realistic.
AB - Text generation methods have witnessed great success in text summarization, machine translation, and synthetic news generation. However, these techniques may be abused to generate disinformation and fake news. To better understand the potential threats of synthetic news, we develop a novel generation method, RLTG, for producing topic-preserving news content. Most existing text generation methods are either controlled by specific attributes or lack topic consistency between the input claims and the output news, making the synthetic news less coherent and realistic. In this paper, we study the problem of topic-preserving synthetic news generation by proposing a novel deep reinforcement learning-based method to control the output of large pre-trained language models. Experimental results on real-world datasets demonstrate that the news content generated by RLTG is topic-consistent and realistic.
KW - Adversarial Training
KW - Reinforcement Learning
KW - Text Generation
UR - http://www.scopus.com/inward/record.url?scp=85125362758&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85125362758&partnerID=8YFLogxK
U2 - 10.1109/BigData52589.2021.9671623
DO - 10.1109/BigData52589.2021.9671623
M3 - Conference contribution
AN - SCOPUS:85125362758
T3 - Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021
SP - 490
EP - 499
BT - Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021
A2 - Chen, Yixin
A2 - Ludwig, Heiko
A2 - Tu, Yicheng
A2 - Fayyad, Usama
A2 - Zhu, Xingquan
A2 - Hu, Xiaohua Tony
A2 - Byna, Suren
A2 - Liu, Xiong
A2 - Zhang, Jianping
A2 - Pan, Shirui
A2 - Papalexakis, Vagelis
A2 - Wang, Jianwu
A2 - Cuzzocrea, Alfredo
A2 - Ordonez, Carlos
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Conference on Big Data, Big Data 2021
Y2 - 15 December 2021 through 18 December 2021
ER -