Sociological Methods & Research
Model Selection Using Information Theory and the MDL Principle

Robert A. Stine

University of Pennsylvania

Information theory offers a coherent, intuitive view of model selection. This perspective arises from thinking of a statistical model as a code, an algorithm for compressing data into a sequence of bits. The description length is the length of this code for the data plus the length of a description of the model itself. The length of the code for the data measures the fit of the model to the data, whereas the length of the code for the model measures its complexity. The minimum description length (MDL) principle picks the model with smallest description length, balancing fit versus complexity. Variations on MDL reproduce other well-known methods of model selection. Going further, information theory allows one to choose from among various types of models, permitting the comparison of tree-based models to regressions. A running example compares several models for the well-known Boston housing data.

Key Words: Akaike information criterion (AIC) • Bayes information criterion (BIC) • risk inflation criterion (RIC) • cross-validation • model selection • stepwise regression • regression tree

Sociological Methods & Research, Vol. 33, No. 2, 230-260 (2004)
DOI: 10.1177/0049124103262064
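
The two-part description length summarized in the abstract can be illustrated with a minimal sketch. It is not the coding scheme used in the article: it assumes an idealized Gaussian code for the residuals and a cost of (k/2) log2(n) bits for k fitted parameters, a common asymptotic approximation that matches BIC. The function names and the toy data are hypothetical, invented for illustration, and unrelated to the Boston housing data.

```python
import math

def data_code_length_bits(residuals):
    """Idealized bits to encode residuals with a Gaussian code (MLE variance).

    For continuous data this is defined only up to a constant that depends on
    the recording precision; the constant is the same for every model, so it
    cancels when models are compared.
    """
    n = len(residuals)
    sigma2 = max(sum(r * r for r in residuals) / n, 1e-12)
    nats = 0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    return nats / math.log(2)

def model_code_length_bits(k, n):
    """Assumed model cost: (k/2) * log2(n) bits for k fitted parameters."""
    return 0.5 * k * math.log2(n)

def fit_mean(xs, ys):
    """Constant model: returns residuals and parameter count (mean, variance)."""
    mean = sum(ys) / len(ys)
    return [y - mean for y in ys], 2

def fit_line(xs, ys):
    """Least-squares line: residuals and count (intercept, slope, variance)."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)], 3

def description_length(fit, xs, ys):
    """Total description length: bits for the data plus bits for the model."""
    residuals, k = fit(xs, ys)
    return data_code_length_bits(residuals) + model_code_length_bits(k, len(ys))

def mdl_choice(xs, ys):
    """Return the name of the candidate with the smaller description length."""
    candidates = {"mean": fit_mean, "line": fit_line}
    return min(candidates, key=lambda name: description_length(candidates[name], xs, ys))

# Toy data (invented): a deterministic wiggle, with and without a linear trend.
xs = list(range(20))
wiggle = [0.5 if i % 2 == 0 else -0.5 for i in xs]
ys_trend = [2 * x + w for x, w in zip(xs, wiggle)]
ys_flat = wiggle

print(mdl_choice(xs, ys_trend))
print(mdl_choice(xs, ys_flat))
```

In this sketch the extra slope parameter is worth its cost only when it shortens the code for the data by more than (1/2) log2(n) bits, which is the fit-versus-complexity balance the abstract describes.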


This article has been cited by other articles:


G. C. Fonarow, K. F. Adams Jr, W. T. Abraham, C. W. Yancy, and W. J. Boscardin
Risk Stratification for In-Hospital Mortality in Acutely Decompensated Heart Failure--Reply
JAMA, May 25, 2005; 293(20): 2468.


D. L. Weakliem
Introduction to the Special Issue on Model Selection
Sociological Methods & Research, November 1, 2004; 33(2): 167-187.