The Department of Biostatistics at the University of Washington is a leading center of research in developing biostatistical methods, consistently ranked as one of the top biostatistics departments in research productivity.
The scientific ability to measure biological processes at the molecular level has exploded in recent decades, including measurements of genetic transcription and protein expression. These biomarker measurements have enormous potential to improve public health through disease prevention and early detection, and to improve the clinical treatment of disease. For example, biomarkers might enable disease to be detected at earlier, more treatable stages. In other settings biomarkers track disease progression, or provide early indication about whether a treatment is effective. For some conditions biomarkers can predict patients’ varied responses to therapy, which can guide treatment decisions and improve care through “personalized” treatment plans. Faculty in UW Biostatistics develop methodology for the development and evaluation of biomarkers for disease diagnosis, prognosis, and treatment selection.
Causal inference addresses scientific questions of interest using a formal language of causation. For example, observational studies face challenges of selection bias due to confounding, and causal inference studies the definition, identification, and estimation of the causal effect of an exposure on a disease endpoint, with interpretation as if the study were a randomized trial with no missing data. Causal inference also allows asking many questions in randomized trials: Does a treatment effect vary over patient subgroups defined by variables measured after randomization (such as adherence, becoming infected with a pathogen, or having a biomarker response to treatment)? Does a biomarker response mediate a treatment effect? Our faculty develop statistical methods tackling these types of problems emerging from many application areas including health policy research, epidemiology, mental health, cancer, and infectious diseases.
Randomized controlled clinical trials are widely recognized as providing the highest level of evidence in evaluating benefits and risks of interventions for treatment and prevention of diseases. Clinical trials should be carefully designed, conducted, analyzed, and interpreted to enhance their ability to provide timely and reliable insights obtained in an ethically proper manner. It takes a motivated and enlightened “village” to accomplish those goals, and UW Biostatistics is part of that village. Faculty, staff and students in the Department of Biostatistics develop and enhance clinical trials methodology; collaborate with government and industry sponsors who are conducting clinical trials; serve on data monitoring committees; provide scientific and regulatory oversight in evaluating the safety and efficacy of interventions; and engage the broader academic community through lectures and courses that address the most compelling issues in current clinical trials.
The spatial structure of health and environmental data presents special opportunities and challenges in public health research. In some settings, the spatial structure of data is an advantage because it allows one to leverage local dependence to improve prediction, for example in prevalence mapping with point-referenced disease data, small area estimation with area-referenced health data, and geostatistical kriging with point-referenced environmental data. Spatial methods are especially important in the developing world, where health data sources are limited. Similarly, in studies of associations between health outcomes and exposures in air, water and soil, specialized spatial methods address the sparse sampling of the exposure. An additional advantage of spatial data is that modern geographic information systems (GIS) provide extensive demographic, geographic, and land-use information at every location. On the other hand, spatial structure often presents challenges by introducing correlation or unmeasured confounding, often without the benefit of replicate independent samples. Statistical development in this area includes methods to incorporate complex sampling schemes, reconstruction of underlying surfaces, spatio-temporal kriging, and dimension reduction for multivariate spatial data.
Epidemiologic studies in human populations examine how infectious agents, environmental exposures, lifestyle choices, and genetic variants contribute to disease and injury. The goals of epidemiologic studies range from simple disease surveillance to increasing scientific knowledge of disease processes. Some epidemiologic study designs, such as case-cohort designs and two-phase designs, originated from research in UW Biostatistics. Department researchers have also made major contributions to statistical methods used to analyze data from these designs as well as epidemiologic case-control and cohort studies. Faculty and students collaborate with epidemiologists to advance science and public health in a broad range of disease areas, including cancer, cardiovascular disease, diabetes, traumatic injury, and infectious disease.
In biomedical research, simple models governed by only a few parameters may not describe underlying systems well enough to be useful. Because of this, modern statistical inference often relies instead on much more flexible semiparametric and nonparametric models, in which infinite-dimensional parameters are present. However, inference in these models can be complicated. In some cases, a classical likelihood-based approach can be used; in many others, more sophisticated techniques must be employed to construct efficient estimators.
UW Biostatistics faculty have been at the cutting edge of work on these flexible models for many years. They have produced many important general results on efficient estimation in infinite-dimensional models, and have led the application of these methods in contexts such as survival analysis, causal inference, and missing data problems. Other recent innovations include connecting frequentist semiparametric work with Bayesian approaches, and investigating the theory and use of targeted maximum likelihood estimation. Faculty also work to make these techniques widely accessible by promoting their innovations through books, teaching material, and free software packages.
Functional magnetic resonance imaging, electronic health records, high-throughput molecular biology, and other new technologies have resulted in a deluge of complex data in recent years. Such rich data sources have the potential to inform important questions in public health, biology, and medicine. However, these new data sources share many statistical challenges. For instance, the number of features in modern biomedical data often exceeds the available sample size. As a result, classical statistical methods do not work reliably, and it can be very hard to answer even seemingly simple questions on the basis of these data.
Faculty in UW Biostatistics are developing new statistical learning methods for the analysis of large-scale data sets, often by exploiting the data’s inherent structure, such as sparsity and smoothness. These new methods can be used to perform prediction, estimation, and inference in complex big-data settings. This work has applications in many areas, including estimating biological networks and inferring changes in network structure in disease conditions; correcting for multiplicity in high throughput studies; predicting treatment response; and inferring feature importance in structured models. New approaches combine statistical, computational, and domain expertise in order to tackle problems across public health, biology, and medicine.
The data collected in many biomedical studies have a clustered structure. For example, in longitudinal studies, data on study participants are collected over time, so each participant constitutes a ‘cluster’ of observations. The clustering structure can also have multiple levels: in a multicenter clinical trial, study participants are recruited in hospitals which, in turn, could be organized within states or countries. These examples share the feature that the data are likely to be correlated; data from the same individual or hospital are likely to be more similar than data from different individuals or hospitals. These characteristics motivate research on the design of longitudinal or multilevel studies and on the development of statistical methodology for these types of data. Relevant methodology includes mixed-effects models, joint modeling of survival and longitudinal data, modeling with missing data, and multi-state modeling.
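As a rough illustration of why within-cluster correlation matters for study design, the sketch below computes the standard design effect for equally sized clusters, 1 + (m - 1) * rho, where m is the cluster size and rho is the intraclass correlation. The function names and the example numbers are hypothetical, chosen only for illustration; this is not a description of any particular method developed in the Department.

```python
def design_effect(cluster_size, icc):
    """Variance inflation for a mean estimated from equally sized clusters.

    cluster_size: number of observations per cluster (m)
    icc: intraclass correlation (rho), assumed common to all clusters
    """
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations carrying the same information."""
    total = n_clusters * cluster_size
    return total / design_effect(cluster_size, icc)


# Hypothetical multicenter study: 20 hospitals, 10 participants each,
# with a modest within-hospital correlation of 0.1.
deff = design_effect(10, 0.1)
ess = effective_sample_size(20, 10, 0.1)
print(f"design effect: {deff:.2f}, effective sample size: {ess:.1f}")
```

Under these assumed values, 200 correlated observations carry roughly the information of 105 independent ones, which is one reason clustered data call for specialized design and analysis methods.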
This is an exciting time for statistical genetics and genomics. Recent technological advancements have launched a new era of biomedical research in genomics. Genetics researchers now have access to sequence data on entire genomes as well as other high-dimensional genomic data on transcription, protein expression, etc. There is great potential for these data to provide unprecedented insight into the genetic underpinnings of human health and disease. UW Biostatistics conducts world-class research in statistical genetics and is a leader in the development of statistical and computational approaches for extracting useful knowledge from these data. The Department’s Genetic Analysis Center coordinates analyses of data sets with millions of genetic variants genotyped or sequenced on tens of thousands of individuals. Many genetics projects investigated by faculty and students in UW Biostatistics are computationally demanding due to the large size and complexities of the data, requiring the development of carefully optimized algorithms and efficient computing implementations.
Many biomedical studies examine time-to-event outcomes. For example, an epidemiological study might investigate disease etiology by studying time-to-disease-onset; or a clinical trial may assess treatment efficacy for extending survival. The collection of time-to-event data may be subject to sophisticated study designs. Most notably, such data are often only partially observed when subjects drop out of a study or do not experience the event of interest during the study period. Faculty and students in UW Biostatistics have made many important contributions to methods development in Survival Analysis, also known as Event History Analysis. These methods include efficient estimation of univariate and multivariate survival functions, powerful hypothesis testing procedures, and regression methods under various study designs. A number of faculty and students continue to dedicate effort to developing survival analysis methodology for many applications including biomarker development, genomic research, and vaccine efficacy assessment.
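To make the idea of partially observed (censored) time-to-event data concrete, here is a minimal sketch of the classical Kaplan-Meier estimator of a survival function. It is a textbook method shown purely for illustration, not a specific contribution of the Department; the function name and the small data set are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times:  follow-up time for each subject
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (event_time, estimated_survival) pairs.
    """
    # Count observed events at each distinct event time.
    event_counts = Counter(t for t, e in zip(times, events) if e)
    survival = 1.0
    curve = []
    for t in sorted(event_counts):
        # Subjects still under observation just before time t,
        # including those censored or failing exactly at t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1 - event_counts[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: five subjects, two of them censored (event = 0).
times = [2, 3, 3, 5, 8]
events = [1, 0, 1, 1, 0]
for t, s in kaplan_meier(times, events):
    print(f"t = {t}: S(t) = {s:.2f}")
```

Censored subjects contribute to the at-risk counts up to the time they leave the study, so their partial information is used rather than discarded.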