Multimodel Inference
Understanding AIC and BIC in Model Selection
Kenneth P. Burnham
David R. Anderson
Colorado Cooperative Fish and Wildlife Research Unit (USGS-BRD)
The model selection literature has been generally poor at reflecting the deep foundations of the Akaike information criterion (AIC) and at making appropriate comparisons to the Bayesian information criterion (BIC). There is a clear philosophy, a sound criterion based in information theory, and a rigorous statistical foundation for AIC. AIC can be justified as Bayesian using a "savvy" prior on models that is a function of sample size and the number of model parameters. Furthermore, BIC can be derived as a non-Bayesian result. Therefore, arguments about using AIC versus BIC for model selection cannot be from a Bayes versus frequentist perspective. The philosophical context of what is assumed about reality, approximating models, and the intent of model-based inference should determine whether AIC or BIC is used. Various facets of such multimodel inference are presented here, particularly methods of model averaging.
Key Words: AIC; BIC; model averaging; model selection; multimodel inference
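As a minimal sketch of the quantities the abstract refers to, the Python below computes AIC and BIC from a maximized log-likelihood, turns criterion differences into model weights, and forms a weighted (model-averaged) estimate. It uses the standard textbook definitions, AIC = -2 log L + 2K and BIC = -2 log L + K log n. The function names, the three candidate models, and all numeric values are hypothetical illustrations and are not taken from the article.

```python
import math


def aic(log_lik, k):
    """AIC = -2 log L + 2K, with K the number of estimated parameters."""
    return -2.0 * log_lik + 2.0 * k


def bic(log_lik, k, n):
    """BIC = -2 log L + K log n, with n the sample size."""
    return -2.0 * log_lik + k * math.log(n)


def criterion_weights(scores):
    """Weights proportional to exp(-delta_i / 2), where delta_i is the
    difference between model i's score and the smallest score."""
    best = min(scores)
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]


def model_average(estimates, weights):
    """Weighted average of per-model estimates of the same quantity."""
    return sum(e * w for e, w in zip(estimates, weights))


if __name__ == "__main__":
    # Hypothetical fits: (maximized log-likelihood, parameter count, estimate).
    fits = [(-120.4, 2, 0.81), (-118.9, 3, 0.77), (-118.7, 5, 0.74)]
    n = 50  # hypothetical sample size

    aic_scores = [aic(ll, k) for ll, k, _ in fits]
    bic_scores = [bic(ll, k, n) for ll, k, _ in fits]
    w_aic = criterion_weights(aic_scores)
    w_bic = criterion_weights(bic_scores)

    estimates = [est for _, _, est in fits]
    print("AIC weights:", [round(w, 3) for w in w_aic])
    print("BIC weights:", [round(w, 3) for w in w_bic])
    print("AIC-averaged estimate:", round(model_average(estimates, w_aic), 3))
```

Because log n exceeds 2 once n is 8 or larger, BIC penalizes each added parameter more heavily than AIC at most practical sample sizes, so the two sets of weights can rank the same candidate models differently.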
Sociological Methods & Research, Vol. 33, No. 2, 261-304 (2004)
DOI: 10.1177/0049124104268644

This article has been cited by other articles:

- J. F. Soechting, J. Z. Juveli, and H. M. Rao. "Models for the Extrapolation of Target Motion for Manual Interception." J Neurophysiol, September 1, 2009; 102(3):1491-1502.
- T. H.G. Ezard, S. D. Cote, and F. Pelletier. "Eco-evolutionary dynamics: disentangling phenotypic, environmental and population fluctuations." Phil Trans R Soc B, June 12, 2009; 364(1523):1491-1498.
- B. J. Balas and P. Sinha. "The role of sequence order in determining view canonicality for novel wire-frame objects." Atten Percept Psychophys, May 1, 2009; 71(4):712-723.
- S. J. Schwartz, C. A. Mason, H. Pantin, and J. Szapocznik. "Longitudinal Relationships Between Family Functioning and Identity Development in Hispanic Adolescents: Continuity and Change." The Journal of Early Adolescence, April 1, 2009; 29(2):177-211.
- J. Lavoue and P. O. Droz. "Multimodel Inference and Multimodel Averaging in Empirical Modeling of Occupational Exposure Levels." Ann. Hyg., March 1, 2009; 53(2):173-180.
- J. Cui, N. de Klerk, M. Abramson, A. Del Monaco, G. Benke, M. Dennekamp, A. W. Musk, and M. Sim. "Fractional Polynomials and Model Selection in Generalized Estimating Equations Analysis, With an Application to a Longitudinal Epidemiologic Study in Australia." Am. J. Epidemiol., January 1, 2009; 169(1):113-121.
- K. Meyer and M. Kirkpatrick. "Perils of Parsimony: Properties of Reduced-Rank Estimates of Genetic Covariance Matrices." Genetics, October 1, 2008; 180(2):1153-1166.
- K. Hossler and V. Bouchard. "The Joint Estimation of Soil Trace Gas Fluxes." Soil Sci. Soc. Am. J., August 20, 2008; 72(5):1382-1393.
- S. M. DeSantis, E. A. Houseman, B. A. Coull, A. Stemmer-Rachamimov, and R. A. Betensky. "A penalized latent class model for ordinal data." Biostat., April 1, 2008; 9(2):249-262.
- J. W. Hollister, P. V. August, J. F. Paul, and H. A. Walker. "Predicting Estuarine Sediment Metal Concentrations and Inferred Ecological Conditions: An Information Theoretic Approach." J. Environ. Qual., January 4, 2008; 37(1):234-244.
- B. Kusterer, H.-P. Piepho, H. F. Utz, C. C. Schon, J. Muminovic, R. C. Meyer, T. Altmann, and A. E. Melchinger. "Heterosis for Biomass-Related Traits in Arabidopsis Investigated by Quantitative Trait Loci Analysis of the Triple Testcross Design With Recombinant Inbred Lines." Genetics, November 1, 2007; 177(3):1839-1850.
- B. M. Fitzpatrick and H. B. Shaffer. "Hybrid vigor between native and introduced salamanders raises new challenges for conservation." PNAS, October 2, 2007; 104(40):15793-15798.
- M. A. Diuk-Wasser, M. B. Toure, G. Dolo, M. Bagayoko, N. Sogoba, I. Sissoko, S. F. Traore, and C. E. Taylor. "Effect of Rice Cultivation Patterns on Malaria Vector Abundance in Rice-Growing Villages in Mali." Am J Trop Med Hyg, May 1, 2007; 76(5):869-874.
- J. M. Markle, R. A. Schincariol, J. H. Sass, and J. W. Molson. "Characterizing the Two-Dimensional Thermal Conductivity Distribution in a Sand and Gravel Aquifer." Soil Sci. Soc. Am. J., June 21, 2006; 70(4):1281-1294.
- D. L. Weakliem. "Introduction to the Special Issue on Model Selection." Sociological Methods & Research, November 1, 2004; 33(2):167-187.