The Effective Sample Size in Bayesian Information Criterion for Level-Specific Fixed and Random Effects Selection in a Two-Level Nested Model
Published 2022-06-23, Version 1
Popular statistical software reports the Bayesian information criterion (BIC) for multilevel models or linear mixed models. However, the statistical literature and software documentation disagree on the formula for the BIC, which leaves it unclear how the BIC should be used to select level-specific fixed and random effects in a multilevel model. These discrepancies and uncertainties stem from different specifications of the sample size in the BIC's penalty term for multilevel models. In this study, we derive the BIC's penalty term for level-specific fixed and random effect selection in a two-level nested design. In this new version of the BIC, called BIC_E, the penalty term decomposes into two parts when the random effect variance-covariance matrix has full rank: (a) a term in the log of the average sample size per cluster, whose multiplier involves the number of overlapping dimensions between the column spaces of the random and fixed effect design matrices, and (b) the total number of parameters times the log of the total number of clusters. Furthermore, we study the behavior of BIC_E in the presence of redundant random effects. The use of BIC_E is illustrated with a textbook example data set, and a numerical demonstration shows that the derived formulae agree with empirical values.
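
To make the role of the sample size concrete, the following minimal Python sketch compares the classic BIC penalty under two common sample-size choices (total observations versus number of clusters) with a schematic two-part penalty of the shape described in the abstract. It is an illustration under stated assumptions, not the paper's formula: the abstract does not give the exact multiplier of the log-average-cluster-size term, so `multiplier` is a hypothetical free argument, and the function names and design sizes are invented for the example.

```python
import math


def bic_penalty(n_params, sample_size):
    """Classic BIC penalty: number of parameters times log(sample size)."""
    return n_params * math.log(sample_size)


def two_part_penalty(n_params, n_clusters, avg_cluster_size, multiplier):
    """Schematic two-part penalty with the shape described in the abstract.

    Part (a): multiplier * log(average sample size per cluster).
    Part (b): total number of parameters * log(number of clusters).

    ASSUMPTION: `multiplier` is a placeholder. The abstract only says it
    involves the overlap between the column spaces of the random and fixed
    effect design matrices; its exact form is not reproduced here.
    """
    part_a = multiplier * math.log(avg_cluster_size)
    part_b = n_params * math.log(n_clusters)
    return part_a + part_b


# Hypothetical balanced two-level design (values chosen for illustration only).
n_clusters = 50
avg_cluster_size = 20
n_total = n_clusters * avg_cluster_size
n_params = 6

# Two sample-size conventions give two different penalties for the same model.
penalty_total = bic_penalty(n_params, n_total)
penalty_clusters = bic_penalty(n_params, n_clusters)

# The two-part form spans both conventions as the multiplier varies.
penalty_mult_zero = two_part_penalty(n_params, n_clusters, avg_cluster_size, 0)
penalty_mult_all = two_part_penalty(n_params, n_clusters, avg_cluster_size, n_params)

print(f"penalty with total N:        {penalty_total:.4f}")
print(f"penalty with cluster count:  {penalty_clusters:.4f}")
print(f"two-part, multiplier = 0:    {penalty_mult_zero:.4f}")
print(f"two-part, multiplier = k:    {penalty_mult_all:.4f}")
```

In a balanced design the total sample size is the number of clusters times the cluster size, so log(N) = log(m) + log(n̄). The two-part form therefore reduces to the cluster-count penalty when the multiplier is zero and to the total-sample-size penalty when the multiplier equals the number of parameters. Intermediate multipliers fall between the two conventions.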