TY - JOUR
T1 - Estimation with selected binomial information or do you really believe that Dave Winfield is batting .471?
AU - Casella, George
AU - Berger, Roger L.
N1 - Funding Information:
George Casella is Professor, Biometrics Unit, Cornell University, Ithaca, NY 14853. Roger L. Berger is Professor, Department of Statistics, North Carolina State University, Raleigh, NC 27695. This research was supported by National Science Foundation Grant DMS-9100839 and National Security Agency Grant 90F-073. The authors thank Steve Hirdt of the Elias Sports Bureau for providing detailed data for the 1992 Major League Baseball season. They also thank Marty Wells for numerous conversations concerning the Gibbs/EM algorithm implementation and the editors and referees for many constructive comments on an earlier version of this article.
PY - 1994/9
Y1 - 1994/9
N2 - Often sports announcers, particularly in baseball, provide the listener with exaggerated information concerning a player’s performance. For example, we may be told that Dave Winfield, a popular baseball player, has hit safely in 8 of his last 17 chances (a batting average of .471). This is biased, or selected information, as the “17” was chosen to maximize the reported percentage. We model this as observing a maximum success rate of a Bernoulli process and show how to construct the likelihood function for a player’s true batting ability. The likelihood function is a high-degree polynomial, but it can be computed exactly. Alternatively, the problem yields to solutions based on either the EM algorithm or Gibbs sampling. Using these techniques, we compute maximum likelihood estimators, Bayes estimators, and associated measures of error. We also show how to approximate the likelihood using a Brownian motion calculation. We find that although constructing good estimators from selected information is difficult, we seem to be able to estimate better than expected, particularly when using prior information. The estimators are illustrated with data from the 1992 Major League Baseball season.
AB - Often sports announcers, particularly in baseball, provide the listener with exaggerated information concerning a player’s performance. For example, we may be told that Dave Winfield, a popular baseball player, has hit safely in 8 of his last 17 chances (a batting average of .471). This is biased, or selected information, as the “17” was chosen to maximize the reported percentage. We model this as observing a maximum success rate of a Bernoulli process and show how to construct the likelihood function for a player’s true batting ability. The likelihood function is a high-degree polynomial, but it can be computed exactly. Alternatively, the problem yields to solutions based on either the EM algorithm or Gibbs sampling. Using these techniques, we compute maximum likelihood estimators, Bayes estimators, and associated measures of error. We also show how to approximate the likelihood using a Brownian motion calculation. We find that although constructing good estimators from selected information is difficult, we seem to be able to estimate better than expected, particularly when using prior information. The estimators are illustrated with data from the 1992 Major League Baseball season.
KW - Brownian motion
KW - EM algorithm
KW - Gibbs sampling
KW - Selection bias
UR - http://www.scopus.com/inward/record.url?scp=21844513757&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=21844513757&partnerID=8YFLogxK
U2 - 10.1080/01621459.1994.10476846
DO - 10.1080/01621459.1994.10476846
M3 - Article
AN - SCOPUS:21844513757
SN - 0162-1459
VL - 89
SP - 1080
EP - 1090
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 427
ER -