Multiple regression models

Comparing exposure prevalence across sub-groups of a population can reveal exposures that have pronounced effects on disease, but disease outcomes often depend on several factors at once. In these cases, multiple regression analysis can be used to identify the exposure factors that differ between groups. If the outcome is recorded at the individual level (for example, "diseased" or "non-diseased"), a logit or probit model with several predictors may be appropriate. If the outcome is recorded at the group level as a ratio or percentage, a logistic regression on the grouped proportions may be preferred, although if most ratios or percentages lie between 0.3 and 0.7, a linear regression can often give a good fit, because the logistic curve is close to a straight line over that range. Standard statistical packages (SAS, R, etc.) make disease modelling fairly straightforward for datasets that are complete, that is, datasets in which every needed exposure measure is present for each "diseased" and "non-diseased" epidemiological unit. Cross-sectional studies, however, frequently have incomplete data.
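
The remark about proportions between 0.3 and 0.7 can be checked numerically. The short Python sketch below is an illustration only, not part of any package named above: it compares the log-odds (logit) of a proportion with its straight-line approximation around 0.5. The function names and the grid size are assumptions chosen for this example.

```python
import math

def logit(p):
    """Log-odds of a proportion p in (0, 1)."""
    return math.log(p / (1.0 - p))

def linear_approx(p):
    """First-order Taylor expansion of logit(p) around p = 0.5 (slope 4)."""
    return 4.0 * (p - 0.5)

def max_abs_gap(lo, hi, steps=400):
    """Largest |logit(p) - linear_approx(p)| on an evenly spaced grid over [lo, hi]."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(abs(logit(p) - linear_approx(p)) for p in grid)

# Hypothetical ranges chosen for illustration: the central band from the text
# and a much wider band that reaches into the tails.
mid_gap = max_abs_gap(0.3, 0.7)
wide_gap = max_abs_gap(0.05, 0.95)

print(f"max gap on [0.30, 0.70]: {mid_gap:.3f}")
print(f"max gap on [0.05, 0.95]: {wide_gap:.3f}")
```

Within the 0.3 to 0.7 band the straight line stays within about 0.05 log-odds units of the logit, whereas over the wider band the gap exceeds one unit. This is why a linear fit tends to work for mid-range proportions and to break down when many groups have proportions near 0 or 1.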