Pharmacovigilance on Twitter? Mining tweets for adverse drug reactions

Karen O'Connor; Pranoti Pimpalkhute; Azadeh Nikfarjam; Rachel Ginn; Karen L. Smith; Graciela Gonzalez

Research output: Contribution to journal › Article › peer-review

Abstract

Recent research has shown that Twitter data analytics can have broad implications for public health research. However, its value for pharmacovigilance has been scantly studied, with health-related forums and community support groups preferred for the task. We present a systematic study of tweets collected for 74 drugs to assess their value as sources of potential signals for adverse drug reactions (ADRs). We created an annotated corpus of 10,822 tweets. Each tweet was annotated for the presence or absence of ADR mentions, with the span and Unified Medical Language System (UMLS) concept ID noted for each ADR present. Using Cohen's kappa, we calculated the inter-annotator agreement (IAA) for the binary annotations to be 0.69. To demonstrate the utility of the corpus, we attempted a lexicon-based approach for concept extraction, with promising success (54.1% precision, 62.1% recall, and 57.8% F-measure). A subset of the corpus is freely available at: http://diego.asu.edu/downloads.
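
The two statistics quoted in the abstract can be illustrated with a minimal sketch. It assumes the reported F-measure is the balanced F1 (harmonic mean of precision and recall) and that the kappa is the standard two-annotator Cohen's kappa; the abstract does not spell out either formula. The function names are hypothetical and are not taken from the authors' code.

```python
from collections import Counter


def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        (freq_a[label] / n) * (freq_b[label] / n)
        for label in set(freq_a) | set(freq_b)
    )
    if expected == 1:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


# Precision and recall as reported in the abstract.
print(f"{f_measure(0.541, 0.621):.1%}")  # prints 57.8%
```

Under the F1 assumption, the reported precision and recall reproduce the reported 57.8% F-measure. The kappa function takes two equal-length lists of binary ADR/no-ADR labels, one per annotator; the annotation data itself is not part of this record, so no kappa value is recomputed here.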

Original language	English (US)
Pages (from-to)	924-933
Number of pages	10
Journal	AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
Volume	2014
State	Published - 2014

ASJC Scopus subject areas

General Medicine

Cite this

@article{f4e6d05245c54f248fd809bcb30f28a7,

title = "Pharmacovigilance on {T}witter? Mining tweets for adverse drug reactions",

abstract = "Recent research has shown that Twitter data analytics can have broad implications for public health research. However, its value for pharmacovigilance has been scantly studied, with health-related forums and community support groups preferred for the task. We present a systematic study of tweets collected for 74 drugs to assess their value as sources of potential signals for adverse drug reactions (ADRs). We created an annotated corpus of 10,822 tweets. Each tweet was annotated for the presence or absence of ADR mentions, with the span and Unified Medical Language System (UMLS) concept ID noted for each ADR present. Using Cohen's kappa, we calculated the inter-annotator agreement (IAA) for the binary annotations to be 0.69. To demonstrate the utility of the corpus, we attempted a lexicon-based approach for concept extraction, with promising success (54.1\% precision, 62.1\% recall, and 57.8\% F-measure). A subset of the corpus is freely available at: http://diego.asu.edu/downloads.",

author = "Karen O'Connor and Pranoti Pimpalkhute and Azadeh Nikfarjam and Rachel Ginn and Smith, {Karen L.} and Graciela Gonzalez",

year = "2014",

language = "English (US)",

volume = "2014",

pages = "924--933",

journal = "AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium",

issn = "1559-4076",

publisher = "American Medical Informatics Association",

}

TY - JOUR

T1 - Pharmacovigilance on Twitter? Mining tweets for adverse drug reactions

AU - O'Connor, Karen

AU - Pimpalkhute, Pranoti

AU - Nikfarjam, Azadeh

AU - Ginn, Rachel

AU - Smith, Karen L.

AU - Gonzalez, Graciela

PY - 2014

Y1 - 2014

N2 - Recent research has shown that Twitter data analytics can have broad implications for public health research. However, its value for pharmacovigilance has been scantly studied, with health-related forums and community support groups preferred for the task. We present a systematic study of tweets collected for 74 drugs to assess their value as sources of potential signals for adverse drug reactions (ADRs). We created an annotated corpus of 10,822 tweets. Each tweet was annotated for the presence or absence of ADR mentions, with the span and Unified Medical Language System (UMLS) concept ID noted for each ADR present. Using Cohen's kappa, we calculated the inter-annotator agreement (IAA) for the binary annotations to be 0.69. To demonstrate the utility of the corpus, we attempted a lexicon-based approach for concept extraction, with promising success (54.1% precision, 62.1% recall, and 57.8% F-measure). A subset of the corpus is freely available at: http://diego.asu.edu/downloads.

AB - Recent research has shown that Twitter data analytics can have broad implications for public health research. However, its value for pharmacovigilance has been scantly studied, with health-related forums and community support groups preferred for the task. We present a systematic study of tweets collected for 74 drugs to assess their value as sources of potential signals for adverse drug reactions (ADRs). We created an annotated corpus of 10,822 tweets. Each tweet was annotated for the presence or absence of ADR mentions, with the span and Unified Medical Language System (UMLS) concept ID noted for each ADR present. Using Cohen's kappa, we calculated the inter-annotator agreement (IAA) for the binary annotations to be 0.69. To demonstrate the utility of the corpus, we attempted a lexicon-based approach for concept extraction, with promising success (54.1% precision, 62.1% recall, and 57.8% F-measure). A subset of the corpus is freely available at: http://diego.asu.edu/downloads.

UR - http://www.scopus.com/inward/record.url?scp=84964312897&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84964312897&partnerID=8YFLogxK

M3 - Article

C2 - 25954400

AN - SCOPUS:84964312897

SN - 1559-4076

VL - 2014

SP - 924

EP - 933

JO - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium

JF - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium

ER -
