Biostatistics (a portmanteau of biology and statistics; sometimes referred to as biometry or biometrics) is the application of statistics to a wide range of topics in biology. The science of biostatistics encompasses the design of biological experiments, especially in medicine, agriculture and fisheries; the collection, summarization, and analysis of data from those experiments; and the interpretation of, and inference from, the results. A major branch is medical biostatistics,[1] which is exclusively concerned with medicine and health.
Biostatistics and the history of biological thought
Biostatistical reasoning and modeling were of critical importance to the foundational theories of modern biology. In the early 1900s, after the rediscovery of Mendel's work, the gaps in understanding between genetics and evolutionary Darwinism led to vigorous debate between biometricians, such as Walter Weldon and Karl Pearson, and Mendelians, such as Charles Davenport, William Bateson and Wilhelm Johannsen. By the 1930s, statisticians and models built on statistical reasoning had helped to resolve these differences and to produce the neo-Darwinian modern evolutionary synthesis.
The leading figures in the establishment of this synthesis all relied on statistics and developed its use in biology.
- Sir Ronald A. Fisher developed several basic statistical methods in support of his work The Genetical Theory of Natural Selection.
- Sewall G. Wright used statistics in the development of modern population genetics.
- J. B. S. Haldane's book, The Causes of Evolution, reestablished natural selection as the premier mechanism of evolution by explaining it in terms of the mathematical consequences of Mendelian genetics.
The work of these individuals and of other biostatisticians, mathematical biologists, and statistically inclined geneticists helped bring together evolutionary biology and genetics into a consistent, coherent whole that could begin to be quantitatively modeled.
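To illustrate what it means to express natural selection as a mathematical consequence of Mendelian genetics, the following is a minimal sketch of the standard textbook recursion for one locus with two alleles under viability selection. It is not taken from any of the works named above; the function names, the starting frequency and the fitness values are assumptions chosen only for illustration. With allele frequency p (and q = 1 - p) and genotype fitnesses w_AA, w_Aa and w_aa, the frequency in the next generation is (p² w_AA + p q w_Aa) divided by the mean fitness p² w_AA + 2 p q w_Aa + q² w_aa.

```python
def next_allele_frequency(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of viability selection.

    Assumes random mating, so genotype frequencies before selection are
    the Hardy-Weinberg proportions p*p, 2*p*q and q*q.
    """
    q = 1.0 - p
    mean_fitness = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    # AA individuals carry only A; Aa individuals carry A in half their alleles.
    return (p * p * w_AA + p * q * w_Aa) / mean_fitness


def simulate(p0, generations, w_AA, w_Aa, w_aa):
    """Return the list of allele frequencies from generation 0 onward."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_allele_frequency(freqs[-1], w_AA, w_Aa, w_aa))
    return freqs


# Hypothetical fitness values: each copy of A adds a small advantage.
trajectory = simulate(p0=0.01, generations=200, w_AA=1.05, w_Aa=1.025, w_aa=1.0)
print(f"start: {trajectory[0]:.3f}, after 200 generations: {trajectory[-1]:.3f}")
```

When all three fitnesses are equal the recursion leaves p unchanged, which is the Hardy-Weinberg result that allele frequencies stay constant in the absence of selection.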
In parallel to this overall development, the pioneering work of D'Arcy Thompson in On Growth and Form also helped to add quantitative discipline to biological study.
Despite the fundamental importance and frequent necessity of statistical reasoning, there may nonetheless have been a tendency among biologists to distrust or deprecate results which are not qualitatively apparent. One anecdote describes Thomas Hunt Morgan banning the Friden calculator from his department at Caltech, saying "Well, I am like a guy who is prospecting for gold along the banks of the Sacramento River in 1849. With a little intelligence, I can reach down and pick up big nuggets of gold. And as long as I can do that, I'm not going to let any people in my department waste scarce resources in placer mining."[2]
Education and training programs
Almost all educational programmes in biostatistics are at postgraduate level. They are most often found in schools of public health, affiliated with schools of medicine, forestry, or agriculture, or as a focus of application in departments of statistics.
In the United States, several universities have dedicated biostatistics departments, while many other top-tier universities integrate biostatistics faculty into statistics or other departments, such as epidemiology. Thus, departments carrying the name "biostatistics" may exist under quite different structures. For instance, relatively new biostatistics departments have been founded with a focus on bioinformatics and computational biology, whereas older departments, typically affiliated with schools of public health, have more traditional lines of research involving epidemiological studies and clinical trials as well as bioinformatics. In larger universities where both a statistics and a biostatistics department exist, the degree of integration between the two may range from the bare minimum to very close collaboration. In general, the difference between a statistics program and a biostatistics program is twofold: (i) statistics departments often host theoretical and methodological research, which is less common in biostatistics programs, and (ii) statistics departments have lines of research that may include biomedical applications but also cover other areas such as industry (quality control), business and economics, and biological areas other than medicine.
Applications of biostatistics
- Public health, including epidemiology, health services research, nutrition, environmental health and healthcare policy & management.
- Design and analysis of clinical trials in medicine
- Population genetics and statistical genetics, which link variation in genotype with variation in phenotype. This has been used in agriculture to improve crops and farm animals (animal breeding). In biomedical research, this work can assist in finding candidate gene alleles that can cause or influence predisposition to disease in human genetics.
- Analysis of genomics data, for example from microarray or proteomics experiments,[3][4] often concerning diseases or disease stages.[5]
- Ecology, ecological forecasting
- Biological sequence analysis[6]
- Systems biology for gene network inference or pathways analysis.[7]
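As a concrete illustration of the clinical-trial application listed above, the following is a minimal sketch of a two-sided permutation test comparing the mean outcome in a treatment arm and a control arm. The data are hypothetical, and the function name, the number of permutations and the seed are assumptions made for the example; a real trial analysis would follow a pre-specified statistical plan.

```python
import random
from statistics import mean


def permutation_test(treatment, control, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference and an estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(treatment) - mean(control)
    pooled = list(treatment) + list(control)
    n_treat = len(treatment)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are exchangeable,
        # so reshuffle them and recompute the difference in means.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treat]) - mean(pooled[n_treat:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical data: reduction in systolic blood pressure (mmHg) per patient.
treatment = [12.1, 9.8, 14.3, 11.0, 13.5, 10.2, 12.9, 11.7]
control = [8.4, 7.9, 10.1, 6.5, 9.3, 8.8, 7.2, 9.0]

observed, p_value = permutation_test(treatment, control)
print(f"observed difference in means: {observed:.4f}")
print(f"estimated permutation p-value: {p_value:.4f}")
```

A permutation test is used here because it relies only on the random assignment of patients to arms and not on an assumed distribution for the outcome. A t-test or a regression model would be equally reasonable choices under other assumptions.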
See also
- Group size measures
- Health indicator
- List of biostatistics journals
- Quantitative parasitology
References
- ^ Abhaya Indrayan (2012). Medical Biostatistics. CRC Press. ISBN 978-1-4398-8414-0.
- ^ Charles T. Munger (2003-10-03). "Academic Economics: Strengths and Faults After Considering Interdisciplinary Needs".
- ^ Helen Causton, John Quackenbush and Alvis Brazma (2003). Statistical Analysis of Gene Expression Microarray Data. Wiley-Blackwell.
- ^ Terry Speed (2003). Microarray Gene Expression Data Analysis: A Beginner's Guide. Chapman & Hall/CRC.
- ^ Frank Emmert-Streib and Matthias Dehmer (2010). Medical Biostatistics for Complex Diseases. Wiley-Blackwell. ISBN 3-527-32585-9.
- ^ Warren J. Ewens and Gregory R. Grant (2004). Statistical Methods in Bioinformatics: An Introduction. Springer.
- ^ Matthias Dehmer, Frank Emmert-Streib, Armin Graber and Armindo Salvador (2011). Applied Statistics for Network Biology: Methods in Systems Biology. Wiley-Blackwell. ISBN 3-527-32750-9.
External links
- The International Biometric Society
- The Collection of Biostatistics Research Archive
- Guide to Biostatistics (MedPageToday.com)