TY - JOUR
T1 - Pivoting approaches for bulk extraction of Entity-Attribute-Value data
AU - Dinu, Valentin
AU - Nadkarni, Prakash
AU - Brandt, Cynthia
N1 - Funding Information: This work was supported in part by NIH Grants U01 CA78266, K23 RR16042 and institutional funds from Yale University School of Medicine. The authors would like to thank Perry Miller for comments that improved this manuscript.
PY - 2006/4
Y1 - 2006/4
AB - Entity-Attribute-Value (EAV) data, as present in repositories of clinical patient data, must be transformed (pivoted) into one-column-per-parameter format before it can be used by a variety of analytical programs. Pivoting approaches have not been described in depth in the literature, and existing descriptions are dated. We describe and benchmark three alternative algorithms to perform pivoting of clinical data in the context of a clinical study data management system. We conclude that when the number of attributes to be returned is not too large, it is feasible to use static SQL as the basis for views on the data. An alternative but more complex approach that utilizes hash tables and the presence of abundant random-access memory can achieve improved performance by reducing the load on the database server.
KW - Clinical patient record systems
KW - Clinical study data management systems
KW - Databases
KW - Entity-Attribute-Value
UR - http://www.scopus.com/inward/record.url?scp=33645274632&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33645274632&partnerID=8YFLogxK
U2 - 10.1016/j.cmpb.2006.02.001
DO - 10.1016/j.cmpb.2006.02.001
M3 - Article
C2 - 16556470
AN - SCOPUS:33645274632
SN - 0169-2607
VL - 82
SP - 38
EP - 43
JO - Computer Methods and Programs in Biomedicine
JF - Computer Methods and Programs in Biomedicine
IS - 1
ER -