WordNet

undergo a test; "She doesnt test well"
any standardized procedure for measuring sensitivity or memory or intelligence or aptitude or personality etc; "the test was standardized on a large sample of students" (同)mental test, mental testing, psychometric test
the act of undergoing testing; "he survived the great test of battle"; "candidates must compete in a trial of skill" (同)trial
the act of testing something; "in the experimental trials the amount of carbon was measured separately"; "he called each flip of the coin a new trial" (同)trial, run
a hard outer covering as of some amoebas and sea urchins
put to the test, as for its quality, or give experimental use to; "This approach has been tried with good results"; "Test this recipe" (同)prove, try, try out, examine, essay
achieve a certain score or rating on a test; "She tested high on the LSAT and was admitted to all the good law schools"
determine the presence or properties of (a substance)
show a certain characteristic when tested; "He tested positive for HIV"
marked by strict and particular and complete accordance with fact; "an exact mind"; "an exact copy"; "hit the exact center of the target"
an examination of the characteristics of something; "there are laboratories for commercial testing"; "it involved testing thousands of children for smallpox"
the act of subjecting to experimental test in order to determine how well something works; "they agreed to end the testing of atomic weapons"
large dark brown North American arboreal carnivorous mammal (同)pekan, fisher cat, black cat, Martes pennanti
tested and proved useful or correct; "a tested method" (同)tried, well-tried
tested and proved to be reliable (同)time-tested, tried, tried and true

PrepTutorEJDIC

(人の能力などの)『試験』,考査,テスト / (物事の)『試験』,検済,試錬,実験《+of+名》 / 化学分析;試薬 / =test match / …‘を'『試験する』,検査する / …‘を'化学分析する / (…の)試験を受ける,試験をする《+for+名》
《名紙の前にのみ用いて》(数量など)『正確な』,きっかりの;(物事が)そのままの,そっくりの / (人が)『厳密な』;(機械など)精密な / 〈事柄が〉…‘を'要求する / (人に)〈支払・服従など〉‘を'強要する,強いる《+『名』+『from』+『名』〈人〉》
《古》漁夫 / 魚を捕食する動物;(特に北米産の)フィッシャーテン

Wikipedia preview

出典(authority):フリー百科事典『ウィキペディア（Wikipedia）』「2015/08/23 18:41:22」(JST)

wiki en

Biologist and statistician Ronald Fisher

Fisher's exact test^[1]^[2]^[3] is a statistical significance test used in the analysis of contingency tables. Although in practice it is employed when sample sizes are small, it is valid for all sample sizes. It is named after its inventor, Ronald Fisher, and is one of a class of exact tests, so called because the significance of the deviation from a null hypothesis (e.g., P-value) can be calculated exactly, rather than relying on an approximation that becomes exact in the limit as the sample size grows to infinity, as with many statistical tests. Fisher is said to have devised the test following a comment from Dr Muriel Bristol, who claimed to be able to detect whether the tea or the milk was added first to her cup; see lady tasting tea.

Purpose and scope

The test is useful for categorical data that result from classifying objects in two different ways; it is used to examine the significance of the association (contingency) between the two kinds of classification. So in Fisher's original example, one criterion of classification could be whether milk or tea was put in the cup first; the other could be whether Dr Bristol thinks that the milk or tea was put in first. We want to know whether these two classifications are associated – that is, whether Dr Bristol really can tell whether milk or tea was poured in first. Most uses of the Fisher test involve, like this example, a 2 × 2 contingency table. The p-value from the test is computed as if the margins of the table are fixed, i.e. as if, in the tea-tasting example, Dr Bristol knows the number of cups with each treatment (milk or tea first) and will therefore provide guesses with the correct number in each category. As pointed out by Fisher, this leads under a null hypothesis of independence to a hypergeometric distribution of the numbers in the cells of the table.

With large samples, a chi-squared test can be used in this situation. However, the significance value it provides is only an approximation, because the sampling distribution of the test statistic that is calculated is only approximately equal to the theoretical chi-squared distribution. The approximation is inadequate when sample sizes are small, or the data are very unequally distributed among the cells of the table, resulting in the cell counts predicted on the null hypothesis (the "expected values") being low. The usual rule of thumb for deciding whether the chi-squared approximation is good enough is that the chi-squared test is not suitable when the expected values in any of the cells of a contingency table are below 5, or below 10 when there is only one degree of freedom (this rule is now known to be overly conservative^[4]). In fact, for small, sparse, or unbalanced data, the exact and asymptotic p-values can be quite different and may lead to opposite conclusions concerning the hypothesis of interest.^[5]^[6] In contrast the Fisher test is, as its name states, exact as long as the experimental procedure keeps the row and column totals fixed, and it can therefore be used regardless of the sample characteristics. It becomes difficult to calculate with large samples or well-balanced tables, but fortunately these are exactly the conditions where the chi-squared test is appropriate.

For hand calculations, the test is only feasible in the case of a 2 × 2 contingency table. However the principle of the test can be extended to the general case of an m × n table,^[7]^[8] and some statistical packages provide a calculation (sometimes using a Monte Carlo method to obtain an approximation) for the more general case.^[9]

Example

For example, a sample of teenagers might be divided into male and female on the one hand, and those that are and are not currently dieting on the other. We hypothesize, for example, that the proportion of dieting individuals is higher among the women than among the men, and we want to test whether any difference of proportions that we observe is significant. The data might look like this:

	Men	Women	Row total
Dieting	1	9	10
Non-dieting	11	3	14
Column total	12	12	24

The question we ask about these data is: knowing that 10 of these 24 teenagers are dieters, and that 12 of the 24 are female, and assuming the null hypothesis that men and women are equally likely to diet, what is the probability that these 10 dieters would be so unevenly distributed between the women and the men? If we were to choose 10 of the teenagers at random, what is the probability that 9 or more of them would be among the 12 women, and only 1 or fewer from among the 12 men?

Before we proceed with the Fisher test, we first introduce some notation. We represent the cells by the letters a, b, c and d, call the totals across rows and columns marginal totals, and represent the grand total by n. So the table now looks like this:

	Men	Women	Row Total
Dieting	a	b	a + b
Non-dieting	c	d	c + d
Column Total	a + c	b + d	a + b + c + d (=n)

Fisher showed that the probability of obtaining any such set of values was given by the hypergeometric distribution:

where is the binomial coefficient and the symbol ! indicates the factorial operator. With the data above, this gives:

The formula above gives the exact hypergeometric probability of observing this particular arrangement of the data, assuming the given marginal totals, on the null hypothesis that men and women are equally likely to be dieters. To put it another way, if we assume that the probability that a man is a dieter is P, the probability that a woman is a dieter is p, and we assume that both men and women enter our sample independently of whether or not they are dieters, then this hypergeometric formula gives the conditional probability of observing the values a, b, c, d in the four cells, conditionally on the observed marginals (i.e., assuming the row and column totals shown in the margins of the table are given). This remains true even if men enter our sample with different probabilities than women. The requirement is merely that the two classification characteristics—gender, and dieter (or not) -- are not associated.

For example, suppose we knew probabilities P,Q,p,q with P+Q=p+q=1 such that (male dieter, male non-dieter, female dieter, female non-dieter) had respective probabilities (Pp,Pq,Qp,Qq) for each individual encountered under our sampling procedure. Then still, were we to calculate the distribution of cell entries conditional given marginals, we would obtain the above formula in which neither p nor P occurs. Thus, we can calculate the exact probability of any arrangement of the 24 teenagers into the four cells of the table, but Fisher showed that to generate a significance level, we need consider only the cases where the marginal totals are the same as in the observed table, and among those, only the cases where the arrangement is as extreme as the observed arrangement, or more so. (Barnard's test relaxes this constraint on one set of the marginal totals.) In the example, there are 11 such cases. Of these only one is more extreme in the same direction as our data; it looks like this:

	Men	Women	Row Total
Dieting	0	10	10
Non-dieting	12	2	14
Column Total	12	12	24

For this table (with extremely unequal dieting proportions) the probability is .

In order to calculate the significance of the observed data, i.e. the total probability of observing data as extreme or more extreme if the null hypothesis is true, we have to calculate the values of p for both these tables, and add them together. This gives a one-tailed test, with p approximately 0.001346076 + 0.000033652 = 0.001379728. (For example, in the R statistical computing environment, this value can be obtained as fisher.test(rbind(c(1,9),c(11,3)), alternative="less")$p.value. This value can be interpreted as the sum of evidence provided by the observed data—or any more extreme table—for the null hypothesis (that there is no difference in the proportions of dieters between men and women). The smaller the value of p, the greater the evidence for rejecting the null hypothesis; so here the evidence is strong that men and women are not equally likely to be dieters.

For a two-tailed test we must also consider tables that are equally extreme, but in the opposite direction. Unfortunately, classification of the tables according to whether or not they are 'as extreme' is problematic. An approach used by the fisher.test function in R is to compute the p-value by summing the probabilities for all tables with probabilities less than or equal to that of the observed table. In the example here, the 2-sided p-value is twice the 1-sided value—but in general these can differ substantially for tables with small counts, unlike the case with test statistics that have a symmetric sampling distribution.

An example of the Fisher Exact Test used on a 2x3 matrix is provided here. This fictitious example varies high, middle, and low income with owning or not owning at least one dog. The example contains a p-value calculator for a 2x3 matrix in which all of the work is shown. The formulas and rules used are the same as are used for the 2x2 matrix example. All possible matrices, while keeping the row and column sums the same as the original matrix, are calculated in the second sheet of the example. The p-values for those matrices were calculated using the p-value calculator. Finally, all p-values less than or equal to the p-value cutoff (the p-value of the original matrix) are summed to create the final p-value. Since the p-value is smaller than the 0.05, the null hypothesis can be rejected and it can be decided that high, middle, and low income households do not have the same dog-owning trends.

As noted above, most modern statistical packages will calculate the significance of Fisher tests, in some cases even where the chi-squared approximation would also be acceptable. The actual computations as performed by statistical software packages will as a rule differ from those described above, because numerical difficulties may result from the large values taken by the factorials. A simple, somewhat better computational approach relies on a gamma function or log-gamma function, but methods for accurate computation of hypergeometric and binomial probabilities remains an active research area.

Controversies

Despite the fact that Fisher's test gives exact p-values, some authors have argued that it is conservative, i.e. that its actual rejection rate is below the nominal significance level.^[10]^[11]^[12] The apparent contradiction stems from the combination of a discrete statistic with fixed significance levels.^[13]^[14] To be more precise, consider the following proposal for a significance test at the 5%-level: reject the null hypothesis for each table to which Fisher's test assigns a p-value equal to or smaller than 5%. Because the set of all tables is discrete, there may not be a table for which equality is achieved. If is the largest p-value smaller than 5% which can actually occur for some table, then the proposed test effectively tests at the -level. For small sample sizes, might be significantly lower than 5%.^[10]^[11]^[12] While this effect occurs for any discrete statistic (not just in contingency tables, or for Fisher's test), it has been argued that the problem is compounded by the fact that Fisher's test conditions on the marginals.^[15] To avoid the problem, many authors discourage the use of fixed significance levels when dealing with discrete problems.^[13]^[14]

Another early discussion revolved around the necessity to condition on the marginals.^[16]^[17] Fisher's test gives exact p-values both for fixed and for random marginals. Other tests, most prominently Barnard's, require random marginals. Some authors^[13]^[14]^[17] (including, later, Barnard himself^[13]) have criticized Barnard's test based on this property. They argue that the marginal totals are an (almost^[14]) ancillary statistic, containing (almost) no information about the tested property.

Alternatives

An alternative exact test, Barnard's exact test, has been developed and proponents of it suggest that this method is more powerful, particularly in 2 × 2 tables. Another alternative is to use maximum likelihood estimates to calculate a p-value from the exact binomial or multinomial distributions and reject or fail to reject based on the p-value^{[citation needed]}.

References

^ Fisher, R. A. (1922). "On the interpretation of χ² from contingency tables, and the calculation of P". Journal of the Royal Statistical Society 85 (1): 87–94. doi:10.2307/2340521. JSTOR 2340521.
^ Fisher, R.A. (1954). Statistical Methods for Research Workers. Oliver and Boyd. ISBN 0-05-002170-2.
^ Agresti, Alan (1992). "A Survey of Exact Inference for Contingency Tables". Statistical Science 7 (1): 131–153. doi:10.1214/ss/1177011454. JSTOR 2246001.
^ Larntz, Kinley (1978). "Small-sample comparisons of exact levels for chi-squared goodness-of-fit statistics". Journal of the American Statistical Association 73 (362): 253–263. doi:10.2307/2286650. JSTOR 2286650.
^ Mehta, Cyrus R; Patel, Nitin R; Tsiatis, Anastasios A (1984). "Exact significance testing to establish treatment equivalence with ordered categorical data". Biometrics 40 (3): 819–825. doi:10.2307/2530927. JSTOR 2530927. PMID 6518249.
^ Mehta, C. R. 1995. SPSS 6.1 Exact test for Windows. Englewood Cliffs, NJ: Prentice Hall.
^ Mehta C.R., Patel N.R. (1983). "A Network Algorithm for Performing Fisher's Exact Test in r Xc Contingency Tables". Journal of the American Statistical Association 78 (382): 427–434. doi:10.2307/2288652.
^ mathworld.wolfram.com Page giving the formula for the general form of Fisher's exact test for m × n contingency tables
^ Cyrus R. Mehta and Nitin R. Patel (1986). "ALGORITHM 643: FEXACT: a FORTRAN subroutine for Fisher's exact test on unordered r×c contingency tables". ACM Trans. Math. Softw. 12 (2): 154–161. doi:10.1145/6497.214326.
^ ^a ^b Liddell, Douglas (1976). "Practical tests of 2x2 contingency tables". The Statistician 25 (4): 295–304. doi:10.2307/2988087. JSTOR 2988087.
^ ^a ^b Berkson, Joseph (1978). "In dispraise of the exact test". Journal of Statistic Planning and Inference 2: 27–42. doi:10.1016/0378-3758(78)90019-8.
^ ^a ^b D'Agostino, R. B., Chase, W., and Belanger, A. (1988). "The Appropriateness of Some Common Procedures for Testing Equality of Two Independent Binomial Proportions". The American Statistician 42 (3): 198–202. doi:10.2307/2685002. JSTOR 2685002.
^ ^a ^b ^c ^d Yates, F. (1984). "Tests of Significance for 2 x 2 Contingency Tables (with discussion)". Journal of the Royal Statistical Society, Series A 147 (3): 426–463. doi:10.2307/2981577. JSTOR 2981577.
^ ^a ^b ^c ^d Roderick J. A. Little (1989). "Testing the Equality of Two Independent Binomial Proportions". The American Statistician 43 (4): 283–288. doi:10.2307/2685390. JSTOR 2685390.
^ Cyrus R. Mehta and Pralay Senchaudhuri, "Conditional versus Unconditional Exact Tests for Comparing Two Binomials" (4 September 2003). Retrieved 20 November 2009 from http://www.cytel.com/Papers/twobinomials.pdf
^ Barnard, G.A (1945). "A New Test for 2×2 Tables". Nature 156 (3954): 177. doi:10.1038/156177a0.
^ ^a ^b Fisher (1945). "A New Test for 2 × 2 Tables". Nature 156 (3961): 388. doi:10.1038/156388a0. ; Barnard, G.A (1945). "A New Test for 2×2 Tables". Nature 156 (3974): 783. doi:10.1038/156783b0.

External links

Calculate Fishers Exact Test Online
Fisher Exact Test calculator for Android devices

UpToDate Contents

全文を閲覧するには購読必要です。 To read the full text you will need to subscribe.

1. 視覚変化が認められた小児患者に対するアプローチ approach to the pediatric patient with vision change
2. 成人におけるギラン・バレー症候群の臨床的特徴および診断 clinical features and diagnosis of guillain barre syndrome in adults
3. アレルギーのin vitro検査の概要 overview of in vitro allergy tests
4. 重症筋無力症の診断 diagnosis of myasthenia gravis
5. 小児におけるギラン・バレー症候群の疫学、臨床症状、および診断 epidemiology clinical features and diagnosis of guillain barre syndrome in children

English Journal

Factors Associated with Contraceptive Satisfaction in Adolescent Women Using the IUD.

Friedman JO1.
Journal of pediatric and adolescent gynecology.J Pediatr Adolesc Gynecol.2015 Feb;28(1):38-42. doi: 10.1016/j.jpag.2014.02.015.
STUDY OBJECTIVE: To estimate satisfaction and to identify factors contributing to an adolescent woman's satisfaction with the levonorgestrel-containing or copper intrauterine device (IUD).DESIGN: Adolescent women presenting to an urban clinic within 1 month of IUD insertion completed survey question
PMID 25555299

A Randomized Controlled Trial Comparing Dexamethasone with Loteprednol Etabonate on Postoperative Photorefractive Keratectomy.

Thanathanee O1, Sriphon P, Anutarapongpan O, Athikulwongse R, Thongphiew P, Rangsin R, Suwan-Apichon O.
Journal of ocular pharmacology and therapeutics : the official journal of the Association for Ocular Pharmacology and Therapeutics.J Ocul Pharmacol Ther.2015 Jan 2. [Epub ahead of print]
Abstract Purpose: To compare loteprednol etabonate 0.5%/tobramycin 0.3% (Zylet®) with dexamethasone 0.1%/tobramycin 0.3% (Tobradex®) in terms of the epithelial healing time, postoperative visual acuity, corneal haziness score, and intraocular pressure (IOP) in postoperative treatment after photore
PMID 25555173

Cilostazol administration with combination enteral and parenteral nutrition therapy remarkably improves outcome after subarachnoid hemorrhage.

Kimura H1, Okamura Y, Chiba Y, Shigeru M, Ishii T, Hori T, Shiomi R, Yamamoto Y, Fujimoto Y, Maeyama M, Kohmura E.
Acta neurochirurgica. Supplement.Acta Neurochir Suppl.2015;120:147-52. doi: 10.1007/978-3-319-04981-6_25.
OBJECTIVE: In order to prevent cerebral vasospasm (VS) following aneurysmal subarachnoid hemorrhage (SAH), we introduced combined enteral nutrition (EN) and parenteral nutrition (PN) with oral cilostazol administration to the postoperative patient after SAH and investigated the effect on VS.METHODS:
PMID 25366615

Japanese Journal

結核患者の治療中断に関するリスク要因の分析

渡部ゆう
保健医療科学 63(2), 179-180, 2014-04
… Chi-square test or Fisher's exact test was used for analysis of the relationship between defaulting from tuberculosis treatment and factors of the Risk Assessment Seat.Results: The defaulting rate of "Homeless", "No supporters", "Has economical problem" and "Has problem of regular outpatient treatment" were significantly higher than that of control group, "Smear positive" and "Has complications" were significantly lower than that of control group. …
NAID 110009808530

わが国の統計学解説書に見られるFisherの直接確率法の両側確率と片側確率をめぐる混乱

竹森幸一,三上聖治,仁平將 [他],富田恵
弘前医療福祉大学紀要 5(1), 77-81, 2014-03-31
… わが国の統計学解説書において、Fisherの直接確率法の解説が両側確率と片側確率をめぐって混乱していることを述べた。 … 解説の主な相違点はFisherの直接確率の分布は左右対称ではなく、例外的に行あるいは列の合計が等しい場合に左右対称になるとするもの（解説類型1）と、分布は左右対称であるので、両側確率は片側確率の2 倍となるというもの（解説類型2）である。 …
NAID 120005436706

Impact of planning of pregnancy in women with epilepsy on seizure control during pregnancy and on maternal and neonatal outcomes

Abe Kanako,Hamada Hiromi,Yamada Takahiro,Obata-Yasuoka Mana,Minakami Hisanori,Yoshikawa Hiroyuki
Seizure 23(2), 112-116, 2014-02
… treatment profile for epilepsy, seizure control, and maternal and neonatal outcomes in both groups were compared using Chi-square test or Fisher's exact test and Mann–Whitney U test.ResultsCompared to the unplanned-pregnancy group, the planned-pregnancy group showed a significantly greater proportion of patients receiving monotherapy with antiepileptic drugs (80% vs. 61%: planned vs. unplanned, P = 0.049) and those not requiring valproic acid …
NAID 120005399274