| tobit {VGAM} | R Documentation |
Fits a Tobit model.
tobit(Lower = 0, Upper = Inf, lmu = "identitylink", lsd = "loge",
nsimEIM = 250, imu = NULL, isd = NULL,
type.fitted = c("uncensored", "censored", "mean.obs"),
imethod = 1, zero = -2)
Lower |
Numeric. This is the value L described below. Any value of the linear model x_i^T beta that is less than this lower bound is assigned this value. Hence this should be the smallest possible value of the response variable. May be a vector (see below for more information). |
Upper |
Numeric. This is the value U described below. Any value of the linear model x_i^T beta that is greater than this upper bound is assigned this value. Hence this should be the largest possible value of the response variable. May be a vector (see below for more information). |
lmu, lsd |
Parameter link functions for the mean and standard deviation parameters.
See |
imu, isd |
See |
type.fitted |
Type of fitted value returned.
The first choice is default and is the ordinary uncensored or
unbounded linear model.
If |
imethod |
Initialization method. Either 1 or 2; each value selects a different method for obtaining initial values for the parameters. |
nsimEIM |
Used for the nonstandard Tobit model.
See |
zero |
An integer vector containing the value 1 or 2. If so,
the mean or the standard deviation, respectively, is modelled
as intercept-only.
Setting |
The Tobit model can be written
y_i^* = x_i^T beta + e_i
where the e_i ~ N(0, sigma^2) independently and i = 1,...,n. However, we observe y_i = y_i^* only if L < y_i^* < U for some cutpoints L and U. Otherwise we let y_i = L or y_i = U, whichever is closer. The Tobit model is thus a multiple linear regression in which the response is censored whenever it falls below or above certain cutpoints.
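The censoring mechanism just described can be sketched in base R as follows. This is only an illustration and does not use VGAM: the parameter values, cutpoints and variable names (beta0, beta1, sigma, ystar, cenL, cenU) are hypothetical choices made for this sketch. In practice rtobit generates such data directly.

```r
# Minimal base-R sketch of the Tobit censoring mechanism (no VGAM needed).
set.seed(123)                          # arbitrary seed, for reproducibility only
n     <- 200
x     <- seq(-1, 1, length.out = n)
beta0 <- 2; beta1 <- 2; sigma <- 1     # hypothetical true parameter values
L     <- 1; U <- 4                     # hypothetical cutpoints

ystar <- beta0 + beta1 * x + rnorm(n, sd = sigma)  # latent response y_i^*
y     <- pmin(pmax(ystar, L), U)                   # observed (censored) response y_i

cenL <- ystar <= L   # TRUE where the observation is left-censored
cenU <- ystar >= U   # TRUE where the observation is right-censored
table(cenL, cenU)    # cross-tabulate the censoring indicators
```

Every uncensored y_i equals its latent value, while censored values pile up exactly at L or U, which is why an ordinary lm() fit to y is biased.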
The defaults for Lower, Upper and
lmu correspond to the standard Tobit model.
In that case Fisher scoring is used; otherwise simulated Fisher scoring is used.
By default, the mean x_i^T beta is
the first linear/additive predictor, and the log of
the standard deviation is the second linear/additive
predictor. The Fisher information matrix for uncensored
data is diagonal. The fitted values are the estimates
of x_i^T beta.
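Written out, the default parameterisation and the usual censored-normal log-likelihood take the form sketched below. The symbols eta_1, eta_2, phi and Phi (the standard normal density and distribution function) are notation introduced here for illustration. This is the textbook form for scalar cutpoints L and U, not a transcription of the VGAM source.

```latex
\eta_{1i} = \mu_i = x_i^T \beta, \qquad \eta_{2i} = \log \sigma,

\ell(\beta, \sigma) =
  \sum_{i:\, L < y_i < U}
    \log\left[ \frac{1}{\sigma}\,
      \phi\!\left( \frac{y_i - x_i^T \beta}{\sigma} \right) \right]
  + \sum_{i:\, y_i = L}
    \log \Phi\!\left( \frac{L - x_i^T \beta}{\sigma} \right)
  + \sum_{i:\, y_i = U}
    \log\left[ 1 - \Phi\!\left( \frac{U - x_i^T \beta}{\sigma} \right) \right].
```

With the defaults L = 0 and U = Inf the third sum is empty, which gives the standard Tobit model.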
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm
and vgam.
Convergence is often slow. Setting crit = "coeff"
is recommended since premature convergence of the log-likelihood
is common.
Simulated Fisher scoring is implemented for the nonstandard
Tobit model. For this, the working weight matrices for
some observations are prone to not being positive-definite;
if this happens, checking the final model and/or supplying
some initial values is recommended.
The response can be a matrix.
If so, then Lower and Upper
are recycled into a matrix with the number of columns equal
to the number of responses,
and the recycling is done row-wise (byrow = TRUE).
For example, these are returned in fit4@misc$Lower and
fit4@misc$Upper below.
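The row-wise recycling can be illustrated with base R alone. The sketch below mimics the behaviour described above using matrix(); it is not VGAM's internal code, and the vector Lower.vec and its values are hypothetical. It shows why, with two responses, observation-specific bounds need each = 2 so that both columns receive the same bound for a given observation.

```r
# Sketch of row-wise (byrow = TRUE) recycling of bounds for a 2-column response.
n <- 4
Lower.vec <- c(0.8, 1.1, 0.9, 1.3)   # hypothetical per-observation lower bounds

# Repeating each bound twice fills row i with Lower.vec[i] in both columns.
Lower.mat <- matrix(rep(Lower.vec, each = 2), nrow = n, ncol = 2, byrow = TRUE)
Lower.mat

# Without each = 2, consecutive observations' bounds land in the same row,
# so the rows no longer line up with the observations.
Lower.bad <- matrix(Lower.vec, nrow = n, ncol = 2, byrow = TRUE)
Lower.bad
```

A scalar bound is unaffected by this issue, since recycling a single value yields the same matrix in either direction.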
If there is no censoring then
uninormal is recommended instead. Any value of the
response less than Lower or greater than Upper will
be assigned the value Lower or Upper, respectively,
and a warning will be issued.
The fitted object has components censoredL and censoredU
in the extra slot, which specify whether each observation
is censored in that direction.
The function cens.normal is an alternative
to tobit().
When obtaining initial values, if the algorithm would otherwise want to fit an underdetermined system of equations, then it uses the entire data set instead. This might result in rather poor quality initial values, and consequently, monitoring convergence is advised.
Thomas W. Yee
Tobin, J. (1958) Estimation of relationships for limited dependent variables. Econometrica 26, 24–36.
rtobit,
cens.normal,
uninormal,
double.cens.normal,
posnormal,
rnorm.
## Not run:
# Here, fit1 is a standard Tobit model and fit2 is a nonstandard Tobit model
tdata <- data.frame(x2 = seq(-1, 1, length.out = (nn <- 100)))
set.seed(1)
Lower <- 1; Upper <- 4 # For the nonstandard Tobit model
tdata <- transform(tdata,
Lower.vec = rnorm(nn, Lower, 0.5),
Upper.vec = rnorm(nn, Upper, 0.5))
meanfun1 <- function(x) 0 + 2*x
meanfun2 <- function(x) 2 + 2*x
meanfun3 <- function(x) 2 + 2*x
meanfun4 <- function(x) 3 + 2*x
tdata <- transform(tdata,
y1 = rtobit(nn, mean = meanfun1(x2)), # Standard Tobit model
y2 = rtobit(nn, mean = meanfun2(x2), Lower = Lower, Upper = Upper),
y3 = rtobit(nn, mean = meanfun3(x2), Lower = Lower.vec, Upper = Upper.vec),
y4 = rtobit(nn, mean = meanfun3(x2), Lower = Lower.vec, Upper = Upper.vec))
with(tdata, table(y1 == 0)) # How many censored values?
with(tdata, table(y2 == Lower | y2 == Upper)) # How many censored values?
with(tdata, table(attr(y2, "cenL")))
with(tdata, table(attr(y2, "cenU")))
fit1 <- vglm(y1 ~ x2, tobit, data = tdata, trace = TRUE,
crit = "coeff") # crit = "coeff" is recommended
coef(fit1, matrix = TRUE)
summary(fit1)
fit2 <- vglm(y2 ~ x2, tobit(Lower = Lower, Upper = Upper, type.f = "cens"),
data = tdata, crit = "coeff", trace = TRUE) # ditto
table(fit2@extra$censoredL)
table(fit2@extra$censoredU)
coef(fit2, matrix = TRUE)
fit3 <- vglm(y3 ~ x2,
tobit(Lower = with(tdata, Lower.vec),
Upper = with(tdata, Upper.vec), type.f = "cens"),
data = tdata, crit = "coeff", trace = TRUE) # ditto
table(fit3@extra$censoredL)
table(fit3@extra$censoredU)
coef(fit3, matrix = TRUE)
# fit4 is fit3 but with a two-column response and type.fitted = "uncensored" (the default).
fit4 <- vglm(cbind(y3, y4) ~ x2,
tobit(Lower = rep(with(tdata, Lower.vec), each = 2),
Upper = rep(with(tdata, Upper.vec), each = 2)),
data = tdata, crit = "coeff", trace = TRUE) # ditto
head(fit4@extra$censoredL) # A matrix
head(fit4@extra$censoredU) # A matrix
head(fit4@misc$Lower) # A matrix
head(fit4@misc$Upper) # A matrix
coef(fit4, matrix = TRUE)
## End(Not run)
## Not run: # Plot the results
par(mfrow = c(2, 2))
# Plot fit1
plot(y1 ~ x2, tdata, las = 1, main = "Standard Tobit model",
col = as.numeric(attr(y1, "cenL")) + 3,
pch = as.numeric(attr(y1, "cenL")) + 1)
legend(x = "topleft", leg = c("censored", "uncensored"),
pch = c(2, 1), col = c("blue", "green"))
legend(-1.0, 2.5, c("Truth", "Estimate", "Naive"),
col = c("purple", "orange", "black"), lwd = 2, lty = c(1, 2, 2))
lines(meanfun1(x2) ~ x2, tdata, col = "purple", lwd = 2)
lines(fitted(fit1) ~ x2, tdata, col = "orange", lwd = 2, lty = 2)
lines(fitted(lm(y1 ~ x2, tdata)) ~ x2, tdata, col = "black",
lty = 2, lwd = 2) # This is simplest but wrong!
# Plot fit2
plot(y2 ~ x2, data = tdata, las = 1, main = "Tobit model",
col = as.numeric(attr(y2, "cenL")) + 3 +
as.numeric(attr(y2, "cenU")),
pch = as.numeric(attr(y2, "cenL")) + 1 +
as.numeric(attr(y2, "cenU")))
legend(x = "topleft", leg = c("censored", "uncensored"),
pch = c(2, 1), col = c("blue", "green"))
legend(-1.0, 3.5, c("Truth", "Estimate", "Naive"),
col = c("purple", "orange", "black"), lwd = 2, lty = c(1, 2, 2))
lines(meanfun2(x2) ~ x2, tdata, col = "purple", lwd = 2)
lines(fitted(fit2) ~ x2, tdata, col = "orange", lwd = 2, lty = 2)
lines(fitted(lm(y2 ~ x2, tdata)) ~ x2, tdata, col = "black",
lty = 2, lwd = 2) # This is simplest but wrong!
# Plot fit3
plot(y3 ~ x2, data = tdata, las = 1,
main = "Tobit model with nonconstant censor levels",
col = as.numeric(attr(y3, "cenL")) + 3 +
as.numeric(attr(y3, "cenU")),
pch = as.numeric(attr(y3, "cenL")) + 1 +
as.numeric(attr(y3, "cenU")))
legend(x = "topleft", leg = c("censored", "uncensored"),
pch = c(2, 1), col = c("blue", "green"))
legend(-1.0, 3.5, c("Truth", "Estimate", "Naive"),
col = c("purple", "orange", "black"), lwd = 2, lty = c(1, 2, 2))
lines(meanfun3(x2) ~ x2, tdata, col = "purple", lwd = 2)
lines(fitted(fit3) ~ x2, tdata, col = "orange", lwd = 2, lty = 2)
lines(fitted(lm(y3 ~ x2, tdata)) ~ x2, tdata, col = "black",
lty = 2, lwd = 2) # This is simplest but wrong!
# Plot fit4
plot(y3 ~ x2, data = tdata, las = 1,
main = "Tobit model with nonconstant censor levels",
col = as.numeric(attr(y3, "cenL")) + 3 +
as.numeric(attr(y3, "cenU")),
pch = as.numeric(attr(y3, "cenL")) + 1 +
as.numeric(attr(y3, "cenU")))
legend(x = "topleft", leg = c("censored", "uncensored"),
pch = c(2, 1), col = c("blue", "green"))
legend(-1.0, 3.5, c("Truth", "Estimate", "Naive"),
col = c("purple", "orange", "black"), lwd = 2, lty = c(1, 2, 2))
lines(meanfun3(x2) ~ x2, data = tdata, col = "purple", lwd = 2)
lines(fitted(fit4)[, 1] ~ x2, tdata, col = "orange", lwd = 2, lty = 2)
lines(fitted(lm(y3 ~ x2, tdata)) ~ x2, data = tdata, col = "black",
lty = 2, lwd = 2) # This is simplest but wrong!
## End(Not run)