There are roughly three and a half distinct uses of the term weights in statistical methodology, and it’s a problem for software documentation and software development. Here, I want to distinguish the different uses and clarify when the differences are a problem. I also want to talk about the settings where we know how to use these sorts of weights, and the ones where we don’t.

You can read the whole thing at the above link.

Weighting causes no end of confusion both in applied and theoretical statistics. People just assume because something has one name (“weights”), it is one thing. So then we get questions like, “How do you do weighted regression in Stan,” and we have to reply, “What is it that you actually want to do?”

And then there’s this whole thing where people do poststratification weighting and think they’re doing inverse probability weighting; see Section 3.3 of this article with John Carlin to see why these two sorts of weights are different.

Here’s what we wrote about weighting in Section 10.8 of Regression and Other Stories:

Three models leading to weighted regression

Weighted least squares can be derived from three different models:

1. Using observed data to represent a larger population. This is the most common way that regression weights are used in practice. A weighted regression is fit to sample data in order to estimate the (unweighted) linear model that would be obtained if it could be fit to the entire population. For example, suppose our data come from a survey that oversamples older white women, and we are interested in estimating the population regression. Then we would assign to each survey respondent a weight that is proportional to the number of people of that type in the population represented by that person in the sample. In this example, men, younger people, and members of ethnic minorities would have higher weights. Including these weights in the regression is a way to approximately minimize the sum of squared errors with respect to the population rather than the sample.

2. Duplicate observations. More directly, suppose each data point can represent one or more actual observations, so that i represents a collection of w_i data points, all of which happen to have x_i as their vector of predictors, and where y_i is the average of the corresponding w_i outcome variables. Then weighted regression on the compressed dataset, (x, y, w), is equivalent to unweighted regression on the original data.

3. Unequal variances. From a completely different direction, weighted least squares is the maximum likelihood estimate for the regression model with independent normally distributed errors with unequal variances, where sd(ε_i) is proportional to 1/√w_i. That is, measurements with higher variance get lower weight when fitting the model. As discussed further in Section 11.1, unequal variances are not typically a major issue for the goal of estimating regression coefficients, but they become more important when making predictions about individual cases.

These three models all result in the same point estimate but imply different standard errors and different predictive distributions.
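The duplicate-observations model is the easiest of the three to check numerically. The sketch below is not from the book. It uses made-up data, a hypothetical helper `wls`, and NumPy rather than Stan or R. It fits an unweighted regression to raw data with repeated predictor values, then a weighted regression to the compressed dataset (x, y, w), where y_i is the group mean and w_i is the group size.

```python
import numpy as np

def wls(X, y, w):
    """Hypothetical helper: minimize sum_i w_i * (y_i - X_i @ beta)**2."""
    sw = np.sqrt(w)
    # Scaling rows by sqrt(w_i) turns the weighted problem into ordinary least squares.
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

rng = np.random.default_rng(0)

# Made-up data: 4 distinct predictor values, each observed w_i times.
x_unique = np.array([0.0, 1.0, 2.0, 3.0])
counts = np.array([1, 3, 2, 5])                      # w_i
x_full = np.repeat(x_unique, counts)
y_full = 1.0 + 2.0 * x_full + rng.normal(0, 1, size=x_full.size)

# Compressed dataset: y_i is the average outcome among the w_i rows sharing x_i.
group = np.repeat(np.arange(len(x_unique)), counts)
y_mean = np.bincount(group, weights=y_full) / counts

X_full = np.column_stack([np.ones_like(x_full), x_full])
X_comp = np.column_stack([np.ones_like(x_unique), x_unique])

beta_full = wls(X_full, y_full, np.ones(x_full.size))  # unweighted fit, raw data
beta_comp = wls(X_comp, y_mean, counts.astype(float))  # weighted fit, compressed data

print(beta_full)
print(beta_comp)
```

The two coefficient vectors should agree up to floating-point error. Only the point estimates are compared here; the standard errors a fitting routine reports depend on which of the three models it assumes the weights represent.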