Title: | Loglikelihood Adjustments for Econometric Models |
---|---|
Description: | Adjusts the loglikelihood of common econometric models for clustered data based on the estimation process suggested in Chandler and Bate (2007) <doi:10.1093/biomet/asm015>, using the 'chandwich' package <https://cran.r-project.org/package=chandwich>, and provides convenience functions for inference on the adjusted models. |
Authors: | Theo Bruckbauer [aut, cre] |
Maintainer: | Theo Bruckbauer <[email protected]> |
License: | EUPL (>=1.2) |
Version: | 1.0.0 |
Built: | 2025-02-28 03:33:01 UTC |
Source: | https://github.com/maebruck/chantrics |
This function adjusts the loglikelihood of fitted model objects based on Chandler and Bate (2007). It is a generic function for different types of models, which are listed in Supported models. This section also contains links to function-specific help pages.
adj_loglik(x, cluster = NULL, use_vcov = TRUE, use_mle = TRUE, ...)
x |
A supported fitted model object, see Supported models |
cluster |
A vector or factor indicating the cluster the corresponding
loglikelihood contribution belongs to. It is required to have the same
length as the vector returned by |
use_vcov |
A logical scalar. By default, the |
use_mle |
A logical scalar. By default, the MLE from |
... |
Further arguments to be passed to |
If use_vcov = TRUE, the current default, the function will test whether a vcov S3 method exists for x, and will take the variance-covariance matrix from there. Otherwise, or if use_vcov = FALSE, the variance-covariance matrix of the MLE is estimated inside chandwich::adjust_loglik() using stats::optimHess().
An object of class "chantrics" inheriting from class "chandwich".
See the documentation provided with chandwich::adjust_loglik().
"chantrics"
objects have the following methods available to them:
alrtest - adjusted likelihood ratio tests
lmtest::coeftest - \(z\) tests for all coefficients
confint and plot.confint - confidence intervals for all coefficients, and diagnostic plots for confint()
conf_intervals - enhanced confidence interval reports
conf_region - two-dimensional confidence regions
See the model-specific pages in the supported models section.
R. E. Chandler and S. Bate, Inference for clustered data using the independence loglikelihood, Biometrika, 94 (2007), pp. 167–183. doi:10.1093/biomet/asm015.
lax::alogLik() supports adjustment for user-supplied objects.
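As a minimal sketch of the workflow described on this page, the following simulates clustered count data, fits a Poisson GLM and adjusts its loglikelihood with a cluster vector. The data, the variable names and the cluster layout (50 clusters of 5 observations each) are made up for illustration; only adj_loglik(), its cluster argument and the listed methods are taken from this page.

```r
library(chantrics)

set.seed(123)
# Hypothetical clustered data: 50 clusters with 5 observations each
n_clusters <- 50
cluster_id <- rep(seq_len(n_clusters), each = 5)
# A shared cluster effect induces dependence within each cluster
cluster_effect <- rnorm(n_clusters, sd = 0.5)[cluster_id]
x <- rnorm(length(cluster_id))
y <- rpois(length(cluster_id), lambda = exp(1 + 0.5 * x + cluster_effect))

fm_pois <- glm(y ~ x, family = poisson)

# One cluster label per loglikelihood contribution (here: per observation)
fm_pois_adj <- adj_loglik(fm_pois, cluster = cluster_id)

summary(fm_pois_adj) # adjusted standard errors
confint(fm_pois_adj) # confidence intervals based on the adjusted loglikelihood
```

If cluster is left at its default of NULL, no cluster labels are supplied; see chandwich::adjust_loglik() for how that case is handled.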
alrtest
is a helper function to simulate the functions lmtest::waldtest()
and lmtest::lrtest()
for adjusted chantrics
objects. The method can be
employed to compare nested models (see details).
alrtest(object, ...)
object |
a |
... |
further object specifications (see details), as well as named
parameters that will be passed to |
This helper function provides an interface to anova.chantrics() that is similar to lmtest::waldtest() and lmtest::lrtest().
The standard method is to compare the fitted model object object with the models supplied in the ... argument. Instead of passing fitted models into ..., other specifications are possible. Note that the types of specifications cannot be mixed, except between numerics/characters. The type of the second object supplied determines the algorithm used.
"chantrics" objects: When supplying two or more "chantrics" objects, they will be sorted as in anova.chantrics(). Then, the ALRTS will be computed consecutively between each pair of neighbouring models. Note that all models must be nested. For details, refer to anova.chantrics().
"numeric": If the second object is "numeric" or "character", then each "numeric" object is replaced by the corresponding element of attr(terms(object1), "term.labels") and handled as in "character" below.
"character": If the second object is "numeric" or "character", then the "character" objects are consecutively included in an update formula like update(object1, . ~ . - object2).
"formula": If the second object is a "formula", then the second model will be computed as update(object1, object2).
Then, the adjusted likelihood ratio test statistic (ALRTS), as described in Section 3.5 of Chandler and Bate (2007), is computed by anova.chantrics().
If a single unnamed object is passed in ..., sequential ANOVA is performed on object.
An object of class "anova" inheriting from class "data.frame".
The columns are as follows:
Resid.df |
The residual number of degrees of freedom in the model. |
df |
The increase in residual degrees of freedom with respect to the model in the row above. |
ALRTS |
The adjusted likelihood ratio statistic. |
Pr(>ALRTS) |
The p-value of the test that the model above is a "significantly better" model than the one in the current row. |
R. E. Chandler and S. Bate, Inference for clustered data using the independence loglikelihood, Biometrika, 94 (2007), pp. 167–183. doi:10.1093/biomet/asm015.
anova.chantrics() for the implementation of the computations of the test statistics.
lmtest::waldtest() and lmtest::lrtest() for syntax.
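The specification types described above can be sketched as follows. The setup reuses the simulated Poisson example from this manual. The four alrtest() calls are an illustration of the documented rules rather than output-checked examples; under those rules each of them should request the same comparison, the full model against the model without I(x^2).

```r
library(chantrics)

set.seed(123)
x <- rnorm(250)
y <- rnbinom(250, mu = exp(1 + x), size = 1)
fm_pois <- glm(y ~ x + I(x^2), family = poisson)
fm_pois_adj <- adj_loglik(fm_pois)
fm_pois_small_adj <- update(fm_pois_adj, formula = . ~ . - I(x^2))

# 1. "chantrics" objects: compare two nested adjusted models directly
res_models <- alrtest(fm_pois_adj, fm_pois_small_adj)
res_models

# 2. "character": the named term is removed via an update formula
alrtest(fm_pois_adj, "I(x^2)")

# 3. "numeric": position of the term among the model's term labels
#    (here the terms are "x" and "I(x^2)", so 2 refers to I(x^2))
alrtest(fm_pois_adj, 2)

# 4. "formula": the second model is computed with update()
alrtest(fm_pois_adj, . ~ . - I(x^2))
```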
anova method for chantrics objects
## S3 method for class 'chantrics'
anova(object, ...)
object |
Object of class |
... |
Further objects of class |
Create an analysis of adjusted deviance table for one object (sequential), or two or more nested models that have been adjusted using the adj_loglik() method. It uses the adjusted likelihood ratio test statistic (ALRTS), as described in Section 3.5 of Chandler and Bate (2007).
Each line represents the model as given above the table, with each line (except for the first line) showing the residual degrees of freedom of that model, the change in degrees of freedom, the ALRTS and the associated p-value in comparison to the model in the line above.
When a single model is specified, the function returns a sequential analysis of deviance table, where, iteratively, one term is being removed from the right of the full formula. This process is continued until the "intercept only" model is left. The row names are the names of the dropped term in comparison to the model in the line above.
If more than one model is specified, the function sorts the models by their number of variables as returned by adj_loglik() in attr(x, "p_current").
Details of the ALRT can be found in chandwich::compare_models() and in Chandler and Bate (2007).
An object of class "anova" inheriting from class "data.frame".
The columns are as follows:
Resid.df |
The residual number of degrees of freedom in the model. |
df |
The increase in residual degrees of freedom with respect to the model in the row above. |
ALRTS |
The adjusted likelihood ratio statistic. |
Pr(>ALRTS) |
The p-value of the test that the model above is a "significantly better" model than the one in the current row. |
R. E. Chandler and S. Bate, Inference for clustered data using the independence loglikelihood, Biometrika, 94 (2007), pp. 167–183. doi:10.1093/biomet/asm015.
chandwich::compare_models: implementation of the comparison mechanism
# from Introducing Chandwich.
set.seed(123)
x <- rnorm(250)
y <- rnbinom(250, mu = exp(1 + x), size = 1)
fm_pois <- glm(y ~ x + I(x^2), family = poisson)
fm_pois_adj <- adj_loglik(fm_pois)
fm_pois_small_adj <- update(fm_pois_adj, formula = . ~ . - I(x^2))
fm_pois_smallest_adj <- update(fm_pois_adj, formula = . ~ 1)

anova(fm_pois_adj, fm_pois_small_adj, fm_pois_smallest_adj)
# use different types of adjustment with type, default is "vertical"
anova(fm_pois_adj, fm_pois_small_adj, fm_pois_smallest_adj, type = "cholesky")
# sequential anova
anova(fm_pois_adj)
chantrics adjusts the loglikelihood of common econometric models for clustered data based on the estimation process suggested in Chandler and Bate (2007), using the chandwich package, and provides convenience functions for inference on the adjusted models.
adj_loglik() adjusts the model's parameter covariance matrix to incorporate clustered data, and can mitigate model misspecification by wrapping chandwich::adjust_loglik for the supported models.
The returned model of class chantrics can be plugged into standard model evaluation and model comparison methods, for example, summary(), confint() and anova(), and a hypothesis test framework provided by alrtest().
See vignette("chantrics-vignette", package = "chantrics") for an overview of the package.
R. E. Chandler and S. Bate, Inference for clustered data using the independence loglikelihood, Biometrika, 94 (2007), pp. 167–183. doi:10.1093/biomet/asm015.
In a generalised linear model (glm), the user can choose between a range of distributions of a response \(y\), and can allow for non-linear relations between the mean outcome for a particular combination of covariates \(x\), \(E(y_i \mid x_i) = \mu_i\), and the linear predictor, \(\eta_i = x_i^T \beta\). The two are connected by the link function \(g\), with \(g(\mu_i) = \eta_i\). The link function is required to be monotonic. (For a quick introduction, see Kleiber and Zeileis (2008, Ch. 5.1); for more complete coverage of the topic, see, for example, Davison (2003, Ch. 10.3).)
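The role of the link function can be checked in base R, where each family object stores the link and its inverse. This is a small illustration using only the stats package; the numeric values are arbitrary.

```r
# Family objects carry the link function g() and its inverse
fam_pois <- poisson(link = "log")
fam_probit <- binomial(link = "probit")

mu <- c(0.5, 1, 4)
eta <- fam_pois$linkfun(mu) # linear predictor scale: g(mu) = log(mu)
all.equal(eta, log(mu))
all.equal(fam_pois$linkinv(eta), mu) # the inverse link recovers the mean

# For the probit link, g is the standard normal quantile function
p <- c(0.1, 0.5, 0.9)
all.equal(fam_probit$linkfun(p), qnorm(p))
```

Both links are strictly increasing, which is the monotonicity requirement mentioned above.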
For more usage examples and more information on glm models, see the Introducing chantrics vignette by running vignette("chantrics-vignette", package = "chantrics")
gaussian
poisson
binomial
MASS::negative.binomial
Also works for MASS::glm.nb(); note that the standard errors of theta are not adjusted.
Davison, A. C. 2003. Statistical Models. Cambridge Series on Statistical and Probabilistic Mathematics 11. Cambridge University Press, Cambridge.
Kleiber, Christian, and Achim Zeileis. 2008. Applied Econometrics with R. Edited by Robert Gentleman, Kurt Hornik, and Giovanni Parmigiani. Use R! New York: Springer-Verlag.
# binomial example from Applied Econometrics with R, Kleiber/Zeileis (2008)
# == probit ==
data("SwissLabor", package = "AER")
swiss_probit <- glm(participation ~ . + I(age^2),
  data = SwissLabor,
  family = binomial(link = "probit")
)
summary(swiss_probit)
swiss_probit_adj <- adj_loglik(swiss_probit)
summary(swiss_probit_adj)

# == logit ==
swiss_logit <- glm(participation ~ . + I(age^2),
  data = SwissLabor,
  family = binomial(link = "logit")
)
summary(swiss_logit)
swiss_logit_adj <- adj_loglik(swiss_logit)
summary(swiss_logit_adj)
Generic function for calculating the loglikelihood contributions from individual observations for a fitted model.
logLik_vec(object, ...)
object |
A fitted model object. |
... |
Further arguments. |
An object of class "logLik_vec", which is a numeric vector of length nobs(object) (i.e. the number of observations in object) of the loglikelihood of each observation. Additionally, it contains the attributes df (model degrees of freedom) and nobs (number of observations).
The methods stats::logLik() and stats::nobs() are available.
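To make the idea of per-observation loglikelihood contributions concrete, the sketch below computes them by hand for a Poisson GLM in base R and checks that they sum to the full loglikelihood. The simulated data are hypothetical. The final call assumes that chantrics is installed and that logLik_vec() has a method for glm objects; it is guarded so that the rest of the block runs without the package.

```r
set.seed(123)
x <- rnorm(250)
y <- rpois(250, lambda = exp(1 + x))
fm_pois <- glm(y ~ x, family = poisson)

# Loglikelihood contribution of each observation, evaluated at the MLE
ll_manual <- dpois(y, lambda = fitted(fm_pois), log = TRUE)

# One contribution per observation
length(ll_manual) == nobs(fm_pois)
# The contributions add up to the loglikelihood reported by logLik()
all.equal(sum(ll_manual), as.numeric(logLik(fm_pois)))

# The packaged generic (assumption: chantrics is installed)
if (requireNamespace("chantrics", quietly = TRUE)) {
  ll_vec <- chantrics::logLik_vec(fm_pois)
  str(ll_vec)
}
```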
Obtains predictions from chantrics objects. The function can currently only supply predictions of the link and the response values of the data used for the fit.
## S3 method for class 'chantrics'
predict(object, newdata = NULL, type = c("response", "link"), ...)
object |
Object of class |
newdata |
optionally, a data frame in which to look for variables with which to predict. If omitted, the fitted linear predictors are used. Supplying new data is currently not supported. |
type |
the type of prediction required. The default |
... |
unused. |
If newdata is omitted, the predictions are based on the data used for the fit. Any instances of NA will return NA.
A vector of predictions.
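A short sketch of the two prediction types, reusing the simulated Poisson example from this manual. The comparison at the end relies on the log link of the Poisson family and is an illustration, not documented output.

```r
library(chantrics)

set.seed(123)
x <- rnorm(250)
y <- rnbinom(250, mu = exp(1 + x), size = 1)
fm_pois <- glm(y ~ x + I(x^2), family = poisson)
fm_pois_adj <- adj_loglik(fm_pois)

# Predictions on the scale of the linear predictor
eta_hat <- predict(fm_pois_adj, type = "link")
# Predictions on the scale of the response (the default type)
mu_hat <- predict(fm_pois_adj, type = "response")

# With a log link, the response scale is exp() of the link scale
all.equal(as.numeric(mu_hat), as.numeric(exp(eta_hat)))
```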
residuals() returns the residuals specified in type from a "chantrics" object.
## S3 method for class 'chantrics'
residuals(object, type = c("response", "working", "pearson"), ...)
object |
an object of class |
type |
the type of residuals which should be returned. The alternatives
are: |
... |
further arguments passed to or from other methods |
The different types of residuals are as in stats::residuals.glm().
A vector of residuals.
A. C. Davison and E. J. Snell, Residuals and diagnostics. In: Statistical Theory and Modelling. In Honour of Sir David Cox, FRS, 1991. Eds. Hinkley, D. V., Reid, N. and Snell, E. J., Chapman & Hall.
M. Döring, Interpreting Generalised Linear Models. In: Data Science Blog, 2018. https://www.datascienceblog.net/post/machine-learning/interpreting_generalized_linear_models/
adj_loglik() for model fitting, stats::residuals.glm(), and stats::residuals().
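The residual types can be related to the fitted means as sketched below, assuming the definitions of stats::residuals.glm() carry over as stated on this page. The data are the simulated Poisson example used elsewhere in this manual, and the Pearson check uses the Poisson variance function V(mu) = mu.

```r
library(chantrics)

set.seed(123)
x <- rnorm(250)
y <- rnbinom(250, mu = exp(1 + x), size = 1)
fm_pois <- glm(y ~ x + I(x^2), family = poisson)
fm_pois_adj <- adj_loglik(fm_pois)

res_response <- residuals(fm_pois_adj, type = "response")
res_pearson <- residuals(fm_pois_adj, type = "pearson")

# Response residuals: observed minus fitted mean
mu_hat <- predict(fm_pois_adj, type = "response")
all.equal(as.numeric(res_response), as.numeric(y - mu_hat))

# Pearson residuals: response residuals scaled by sqrt(V(mu)) = sqrt(mu)
all.equal(as.numeric(res_pearson), as.numeric((y - mu_hat) / sqrt(mu_hat)))
```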
update.chantrics() will update a model that has been adjusted by adj_loglik(). It passes all arguments to the standard stats::update() function.
## S3 method for class 'chantrics'
update(object, ...)
object |
A |
... |
Additional arguments to the call, passed to |
The function cannot change any arguments passed to the adj_loglik() function. To change any of these arguments, re-run adj_loglik().
Passing evaluate = FALSE is not supported; if this is required, run stats::update() on the unadjusted object.
The fitted, adjusted "chantrics" object.
# from Introducing Chandwich.
set.seed(123)
x <- rnorm(250)
y <- rnbinom(250, mu = exp(1 + x), size = 1)
fm_pois <- glm(y ~ x + I(x^2), family = poisson)
fm_pois_adj <- adj_loglik(fm_pois)
fm_pois_small_adj <- update(fm_pois_adj, formula = . ~ . - I(x^2))
summary(fm_pois_small_adj)
fm_pois_smallest_adj <- update(fm_pois_adj, formula = . ~ 1)
summary(fm_pois_smallest_adj)
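When an unevaluated call is needed, the workaround mentioned in the details can be sketched as follows: run stats::update() with evaluate = FALSE on the unadjusted glm object, evaluate the resulting call, and adjust the refitted model. The object names small_call and fm_pois_small are chosen for illustration only.

```r
library(chantrics)

set.seed(123)
x <- rnorm(250)
y <- rnbinom(250, mu = exp(1 + x), size = 1)
fm_pois <- glm(y ~ x + I(x^2), family = poisson)

# Build the updated call on the unadjusted object without evaluating it
small_call <- update(fm_pois, . ~ . - I(x^2), evaluate = FALSE)
small_call

# Evaluate the call to refit the smaller model, then adjust it
fm_pois_small <- eval(small_call)
fm_pois_small_adj <- adj_loglik(fm_pois_small)
summary(fm_pois_small_adj)
```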