WordNet

purely outward or superficial; "external composure"; "an external concern for reputation"- A.R.Gurney,Jr.
outward features; "he enjoyed the solemn externals of religion"
coming from the outside; "extraneous light in the camera spoiled the photograph"; "relying upon an extraneous income"; "disdaining outside pressure groups" (syn.) extraneous, outside
from or between other countries; "external commerce"; "international trade"; "developing nations need outside help" (syn.) international, outside
happening or arising or located outside or beyond some limits or especially surface; "the external auditory canal"; "external pressures"
the quality of having legal force or effectiveness (syn.) validness
a nonresident doctor or medical student; connected with a hospital but not living there (syn.) medical extern

PrepTutorEJDIC

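outer, external / (of an effect, action, etc.) from the outside, coming from outside / superficial, apparent / outside the body; for use outside the body / foreign / outward appearance, external form (of ...) [+ of + noun]
(of a theory, reason, etc.) validity, soundness [+ of + noun] / (of a contract, etc.) validity, legality [+ of + noun]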

Wikipedia preview

Source (authority): Wikipedia, the free encyclopedia, 2016/04/19 17:43:48 (JST)

wiki en

External validity is the validity of generalized (causal) inferences in scientific research, usually based on experiments as experimental validity.^[1] In other words, it is the extent to which the results of a study can be generalized to other situations and to other people.^[2] Mathematical analysis of external validity concerns a determination of whether generalization across heterogeneous populations is feasible, and devising statistical and computational methods that produce valid generalizations.^[3]

1 Threats to external validity
2 Disarming threats to external validity
3 External, internal, and ecological validity
4 Qualitative research
5 External validity in experiments
- 5.1 Generalizability across situations
- 5.2 Generalizability across people
- 5.3 Replications
6 The basic dilemma of the social psychologist
7 See also
8 Notes

Threats to external validity

"A threat to external validity is an explanation of how you might be wrong in making a generalization."^[4] Generally, generalizability is limited when the cause (i.e. the independent variable) depends on other factors; therefore, all threats to external validity interact with the independent variable - a so-called background factor x treatment interaction.^[5]

Aptitude–treatment Interaction: The sample may have certain features that may interact with the independent variable, limiting generalizability. For example, inferences based on comparative psychotherapy studies often employ specific samples (e.g. volunteers, highly depressed, no comorbidity). If psychotherapy is found effective for these sample patients, will it also be effective for non-volunteers or the mildly depressed or patients with concurrent other disorders?
Situation: All situational specifics of a study (e.g. treatment conditions, time, location, lighting, noise, treatment administration, investigator, timing, scope and extent of measurement, etc.) potentially limit generalizability.
Pre-test effects: If cause-effect relationships can only be found when pre-tests are carried out, then this also limits the generality of the findings.
Post-test effects: If cause-effect relationships can only be found when post-tests are carried out, then this also limits the generality of the findings.
Reactivity (placebo, novelty, and Hawthorne effects): If cause-effect relationships are found they might not be generalizable to other settings or situations if the effects found only occurred as an effect of studying the situation.
Rosenthal effects: Inferences about cause-consequence relationships may not be generalizable to other investigators or researchers.

Cook and Campbell^[6] made the crucial distinction between generalizing to some population and generalizing across subpopulations defined by different levels of some background factor. Lynch has argued that it is almost never possible to generalize to meaningful populations except as a snapshot of history, but it is possible to test the degree to which the effect of some cause on some dependent variable generalizes across subpopulations that vary in some background factor. That requires a test of whether the treatment effect being investigated is moderated by interactions with one or more background factors.^[5]^[7]
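The moderation test described above can be illustrated with a minimal sketch. It assumes a simple 2 x 2 layout (one background factor with two levels, crossed with treatment versus control) and uses invented scores; the function name interaction_contrast and the level labels are hypothetical. The sketch only computes the interaction contrast; a real analysis would also test it for statistical significance, for example through the interaction term of an analysis of variance.

```python
from statistics import mean


def interaction_contrast(cells):
    """Compare treatment effects across two levels of a background factor.

    cells maps (background_level, condition) -> list of outcome scores,
    where condition is "treated" or "control". Exactly two background
    levels are assumed.
    """
    levels = sorted({level for level, _ in cells})
    if len(levels) != 2:
        raise ValueError("this sketch assumes exactly two background levels")
    # Treatment effect within each level of the background factor.
    effects = {
        level: mean(cells[(level, "treated")]) - mean(cells[(level, "control")])
        for level in levels
    }
    first, second = levels
    # A contrast of zero means the effect is the same at both levels.
    return effects, effects[second] - effects[first]


# Hypothetical outcome scores, invented for illustration only.
data = {
    ("mild", "control"): [10, 12, 11, 11],
    ("mild", "treated"): [11, 12, 12, 13],
    ("severe", "control"): [20, 22, 21, 21],
    ("severe", "treated"): [14, 16, 15, 15],
}

effects, contrast = interaction_contrast(data)
print(effects, contrast)
```

In these invented data the treatment raises scores slightly in one subpopulation and lowers them substantially in the other, so the effect found at one level of the background factor would not generalize to the other.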

Disarming threats to external validity

Whereas enumerating threats to validity may help researchers avoid unwarranted generalizations, many of those threats can be disarmed, or neutralized in a systematic way, so as to enable a valid generalization. Specifically, experimental findings from one population can be "re-processed", or "re-calibrated", so as to circumvent population differences and produce valid generalizations in a second population, where experiments cannot be performed. Pearl and Bareinboim^[3] classified generalization problems into two categories: (1) those that lend themselves to valid re-calibration, and (2) those where external validity is theoretically impossible. Using graph-based calculus,^[8] they derived a necessary and sufficient condition for a problem instance to enable a valid generalization, and devised algorithms that automatically produce the needed re-calibration whenever one exists.^[9] This reduces the external validity problem to an exercise in graph theory, and has led some philosophers to conclude that the problem is now solved.^[10]
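As a rough illustration of what re-calibration can look like, the sketch below covers only the simplest case, under an assumption stated here rather than taken from the cited work's algorithms: the two populations differ solely in the distribution of one background factor Z, and the Z-specific treatment effects measured in the study carry over unchanged. Under that assumption the effect in the target population is the Z-specific effects re-weighted by the target's distribution of Z. The function name transport_effect and all numbers are hypothetical.

```python
def transport_effect(stratum_effects, target_distribution):
    """Re-weight stratum-specific effects by a population's distribution of Z.

    stratum_effects: effect of the treatment within each stratum z,
        as estimated in the study population.
    target_distribution: proportion of each stratum z in the population
        to which the result is to be generalized (must sum to 1).
    """
    total = sum(target_distribution.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("target distribution must sum to 1")
    return sum(stratum_effects[z] * p for z, p in target_distribution.items())


# Hypothetical numbers, invented for illustration only.
effect_by_age = {"young": 0.30, "old": 0.10}   # effect within each age stratum
study_mix = {"young": 0.80, "old": 0.20}       # age mix of the study sample
target_mix = {"young": 0.25, "old": 0.75}      # age mix of the target population

study_estimate = transport_effect(effect_by_age, study_mix)
target_estimate = transport_effect(effect_by_age, target_mix)
print(round(study_estimate, 3), round(target_estimate, 3))
```

The same stratum-level findings yield a noticeably smaller average effect once they are re-weighted to a population with a different age mix. When the populations differ in ways that such re-weighting cannot account for, this simple approach does not apply.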

An important variant of the external validity problem deals with selection bias, also known as sampling bias, that is, bias created when studies are conducted on non-representative samples of the intended population. For example, if a clinical trial is conducted on college students, an investigator may wish to know whether the results generalize to the entire population, where attributes such as age, education, and income differ substantially from those of a typical student. The graph-based method of Bareinboim and Pearl identifies conditions under which sample selection bias can be circumvented. The main difference between generalization from improperly sampled studies and generalization across disparate populations lies in the fact that disparities among populations are usually caused by preexisting factors, such as age or ethnicity, whereas selection bias is often caused by post-treatment conditions, for example, patients dropping out of the study, or patients selected by severity of injury. When selection is governed by post-treatment factors, unconventional re-calibration methods are required to ensure bias-free estimation, and these methods are readily obtained from the problem's graph.^[11]^[12]

External, internal, and ecological validity

In many studies and research designs, there may be a "trade-off" between internal validity and external validity: when measures are taken or procedures implemented to increase internal validity, these measures may also limit the generalizability of the findings. This situation has led many researchers to call for "ecologically valid" experiments. By that they mean that experimental procedures should resemble "real-world" conditions. They criticize the lack of ecological validity in many laboratory-based studies, which focus on artificially controlled and constricted environments. Some researchers think external validity and ecological validity are closely related in the sense that causal inferences based on ecologically valid research designs often allow for higher degrees of generalizability than those obtained in an artificially produced lab environment. However, this again relates to the distinction between generalizing to some population (closely related to concerns about ecological validity) and generalizing across subpopulations that differ on some background factor. Some findings produced in ecologically valid research settings may hardly be generalizable, and some findings produced in highly controlled settings may claim near-universal external validity. Thus, external and ecological validity are independent: a study may possess external validity but not ecological validity, and vice versa.

Qualitative research

Within the qualitative research paradigm, external validity is replaced by the concept of transferability. Transferability is the ability of research results to transfer to situations with similar parameters, populations and characteristics.^[13]

External validity in experiments

It is common for researchers to claim that experiments are by their nature low in external validity. Some claim that the experimental method carries many drawbacks: by gaining enough control over the situation to randomly assign people to conditions and rule out the effects of extraneous variables, the experimenter can make the situation somewhat artificial and distant from real life.

There are two kinds of generalizability at issue:

The extent to which we can generalize from the situation constructed by an experimenter to real-life situations (generalizability across situations),^[2] and
The extent to which we can generalize from the people who participated in the experiment to people in general (generalizability across people)^[2]

However, both of these considerations pertain to Cook and Campbell's concept of generalizing to some target population rather than the arguably more central task of assessing the generalizability of findings from an experiment across subpopulations that differ from the specific situation studied and people who differ from the respondents studied in some meaningful way.^[6]

Critics of experiments suggest that external validity could be improved by use of field settings (or, at a minimum, realistic laboratory settings) and by use of true probability samples of respondents. However, if one's goal is to understand generalizability across subpopulations that differ in situational or personal background factors, these remedies do not have the efficacy in increasing external validity that is commonly ascribed to them. If background factor X treatment interactions exist of which the researcher is unaware (as seems likely), these research practices can mask a substantial lack of external validity. Dipboye and Flanagan (1979), writing about industrial and organizational psychology, note that the evidence is that findings from one field setting and from one lab setting are equally unlikely to generalize to a second field setting.^[14] Thus, field studies are not by their nature high in external validity and laboratory studies are not by their nature low in external validity. It depends in both cases whether the particular treatment effect studied would change with changes in background factors that are held constant in that study. If one's study is "unrealistic" on the level of some background factor that does not interact with the treatments, it has no effect on external validity. It is only if an experiment holds some background factor constant at an unrealistic level and if varying that background factor would have revealed a strong Treatment x Background factor interaction, that external validity is threatened.^[15]

Generalizability across situations

Psychology experiments conducted in universities are often criticized for taking place in artificial situations whose results cannot be generalized to real life.^[16] To address this problem, social psychologists attempt to increase the generalizability of their results by making their studies as realistic as possible. As noted above, this is in the hope of generalizing to some specific population. Realism per se does not help one make statements about whether the results would change if the setting were somehow more realistic, or if study participants were placed in a different realistic setting. If only one setting is tested, it is not possible to make statements about generalizability across settings.^[5]^[7]

However, many authors conflate external validity and realism. There is more than one way that an experiment can be realistic:

The similarity of an experimental situation to events that occur frequently in everyday life. By this standard many experiments are decidedly unreal, because people are placed in situations they would rarely encounter in everyday life.

The extent to which an experiment resembles real-life situations in this sense is referred to as its mundane realism.^[16]

It is more important to ensure that a study is high in psychological realism—how similar the psychological processes triggered in an experiment are to psychological processes that occur in everyday life.^[17]

Psychological realism is heightened if people find themselves engrossed in a real event. To accomplish this, researchers sometimes tell the participants a cover story, a false description of the study's purpose. If, however, the experimenters were to tell the participants the purpose of the experiment, for example that it concerns reactions to an emergency, the procedure would be low in psychological realism. In everyday life, no one knows when emergencies are going to occur and people do not have time to plan responses to them. Forewarned participants would have that time, so the kinds of psychological processes triggered would differ widely from those of a real emergency, reducing the psychological realism of the study.^[2]

People do not always know why they do what they do, or what they will do until it happens. Therefore, describing an experimental situation to participants and then asking them to respond normally will produce responses that may not match the behavior of people who are actually in the same situation. We cannot depend on people's predictions about what they would do in a hypothetical situation; we can only find out what people will really do when we construct a situation that triggers the same psychological processes as occur in the real world.

Generalizability across people

Social psychologists study the way in which people in general are susceptible to social influence. Several experiments have documented an interesting, unexpected example of social influence, whereby the mere knowledge that others were present reduced the likelihood that people helped.

The only way to be certain that the results of an experiment represent the behaviour of a particular population is to ensure that participants are randomly selected from that population. In practice, samples in experiments cannot be randomly selected the way they are in surveys, because selecting random samples for social psychology experiments is impractical and expensive. It is difficult enough to convince a random sample of people to agree to answer a few questions over the telephone as part of a political poll, and such polls can cost thousands of dollars to conduct. Moreover, even if one were somehow able to recruit a truly random sample, there can be unobserved heterogeneity in the effects of the experimental treatments: a treatment can have a positive effect on some subgroups but a negative effect on others. The effects shown in the treatment averages may not generalize to any subgroup.^[5]^[18]
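The point about unobserved heterogeneity can be made concrete with a small sketch. The per-person treatment effects below are invented for illustration, and the subgroup labels and the function name summarize_effects are hypothetical; in a real study the subgroups would be unobserved, which is precisely the difficulty.

```python
from statistics import mean


def summarize_effects(effects_by_subgroup):
    """Return the mean treatment effect per subgroup and over the whole sample."""
    subgroup_means = {
        name: mean(values) for name, values in effects_by_subgroup.items()
    }
    pooled = [v for values in effects_by_subgroup.values() for v in values]
    return subgroup_means, mean(pooled)


# Hypothetical per-person treatment effects, invented for illustration only.
effects = {
    "subgroup_a": [4, 5, 6, 5],      # treatment helps these participants
    "subgroup_b": [-5, -4, -6, -5],  # treatment harms these participants
}

subgroup_means, overall = summarize_effects(effects)
print(subgroup_means, overall)
```

Here the overall average effect is zero even though the treatment has a sizeable effect in both subgroups, so the average describes neither of them.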

Many researchers address this problem by studying basic psychological processes that make people susceptible to social influence, assuming that these processes are so fundamental that they are universally shared. Some social psychological processes do vary across cultures, and in those cases diverse samples of people have to be studied.^[19]

Replications

The ultimate test of an experiment's external validity is replication: conducting the study over again, generally with different subject populations or in different settings. Researchers will often use different methods to see if they still get the same results.

When many studies of one problem are conducted, the results can vary. Several studies might find an effect of the number of bystanders on helping behaviour, whereas a few do not. To make sense of this, researchers use a statistical technique called meta-analysis, which averages the results of two or more studies to see if the effect of an independent variable is reliable. A meta-analysis essentially tells us the probability that the findings across many studies are attributable to chance rather than to the independent variable. If an independent variable is found to have an effect in only one of 20 studies, the meta-analysis will tell us that that one study was an exception and that, on average, the independent variable is not influencing the dependent variable. If an independent variable has an effect in most of the studies, the meta-analysis is likely to tell us that, on average, it does influence the dependent variable.
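The averaging step can be sketched as follows. The text does not say how the studies are weighted, so this example assumes one common choice, a fixed-effect meta-analysis with inverse-variance weights and a normal approximation for the p-value. The effect estimates and standard errors are invented, and the function name fixed_effect_meta is hypothetical.

```python
import math


def fixed_effect_meta(estimates, standard_errors):
    """Pool study estimates with inverse-variance weights (fixed-effect model).

    Returns the pooled estimate, its standard error, the z statistic, and
    a two-sided p-value based on the normal approximation.
    """
    if len(estimates) != len(standard_errors):
        raise ValueError("need one standard error per estimate")
    # More precise studies (smaller standard error) get larger weights.
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled / pooled_se
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled, pooled_se, z, p_value


# Hypothetical effect estimates from four studies, invented for illustration.
estimates = [0.5, 0.3, 0.4, 0.6]
standard_errors = [0.2, 0.2, 0.2, 0.2]

pooled, pooled_se, z, p_value = fixed_effect_meta(estimates, standard_errors)
print(round(pooled, 3), round(pooled_se, 3), round(z, 2), p_value)
```

With equal standard errors the pooled estimate is simply the mean of the four estimates, but its standard error is half that of any single study, which is why a set of individually modest results can add up to strong evidence. When effects genuinely differ between studies, a random-effects model is generally the more appropriate choice.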

There can be reliable phenomena that are not limited to the laboratory. For example, increasing the number of bystanders has been found to inhibit helping behaviour with many kinds of people, including children, university students, and future ministers;^[19] in Israel;^[20] in small towns and large cities in the U.S.;^[21] in a variety of settings, such as psychology laboratories, city streets, and subway trains;^[22] and with a variety of types of emergencies, such as seizures, potential fires, fights, and accidents,^[23] as well as with less serious events, such as having a flat tire.^[24] Many of these replications have been conducted in real-life settings where people could not possibly have known that an experiment was being conducted.

The basic dilemma of the social psychologist

When conducting experiments in psychology, some believe that there is always a trade-off between internal and external validity:

having enough control over the situation to ensure that no extraneous variables are influencing the results and to randomly assign people to conditions, and
ensuring that the results can be generalized to everyday life.

Some researchers believe that a good way to increase external validity is by conducting field experiments. In a field experiment, people's behavior is studied outside the laboratory, in its natural setting. A field experiment is identical in design to a laboratory experiment, except that it is conducted in a real-life setting. The participants in a field experiment are unaware that the events they experience are in fact an experiment. Some claim that the external validity of such an experiment is high because it is taking place in the real world, with real people who are more diverse than a typical university student sample. However, as real-world settings differ dramatically, findings in one real world setting may or may not generalize to another real world setting.^[14]

Neither internal nor external validity is captured in a single experiment. Some social psychologists opt first for internal validity, conducting laboratory experiments in which people are randomly assigned to different conditions and all extraneous variables are controlled. Others prefer external validity to control, conducting most of their research in field studies. And many do both. Taken together, both types of studies meet the requirements of the perfect experiment. Through replication, researchers can study a given research question with maximal internal and external validity.^[25]

Notes

1. Mitchell, M. & Jolley, J. (2001). Research Design Explained (4th ed.). New York: Harcourt.
2. Aronson, E., Wilson, T. D., Akert, R. M., & Fehr, B. (2007). Social psychology (4th ed.). Toronto, ON: Pearson Education.
3. Pearl, Judea; Bareinboim, Elias (2014). "External validity: From do-calculus to transportability across populations". Statistical Science 29 (4): 579–595.
4. Trochim, William M. The Research Methods Knowledge Base, 2nd Edition.
5. Lynch, John (1982). "On the External Validity of Experiments in Consumer Research". Journal of Consumer Research 9 (3): 225–239. doi:10.1086/208919.
6. Cook, Thomas D.; Campbell, Donald T. (1979). Quasi-Experimentation: Design & Analysis Issues for Field Settings. Chicago: Rand McNally College Publishing Company. ISBN 978-0395307908.
7. Lynch, John (1999). "Theory and External Validity". Journal of the Academy of Marketing Science 27 (3): 367–376.
8. Pearl, Judea (1995). "Causal diagrams for empirical research". Biometrika 82 (4): 669–710.
9. Bareinboim, Elias; Pearl, Judea (2013). "A general algorithm for deciding transportability of experimental results". Journal of Causal Inference 1 (1): 107–134.
10. Marcellesi, Alexandre (December 2015). "External validity: Is there still a problem?". Philosophy of Science 82: 1308–1317.
11. Pearl, Judea (2015). "Generalizing experimental findings". Journal of Causal Inference 3 (2): 259–266.
12. Bareinboim, Elias; Tian, Jin; Pearl, Judea (2014). "Recovering from selection bias in causal and statistical inference". Proceedings of the Twenty-eighth AAAI Conference on Artificial Intelligence (Palo Alto, CA: AAAI Press): 2410–2416.
13. Lincoln, Y.S. & Guba, E.G. (1986). But is it rigorous? Trustworthiness and authenticity in naturalistic evaluation. In D.D. Williams (Ed.), Naturalistic evaluation (pp. 73-84). New Directions for Program Evaluation, 30. San Francisco, CA: Jossey-Bass.
14. Dipboye, Robert L.; Flanagan, Michael F. (1979). "Research Settings in Industrial and Organizational Psychology: Are Findings in the Field More Generalizable than the Laboratory". American Psychologist 34 (2): 141–150. doi:10.1037/0003-066x.34.2.141.
15. Lynch, John (1982). "On the External Validity of Experiments in Consumer Research". Journal of Consumer Research 9 (3): 225–239. doi:10.1086/208919.
16. Aronson, E., & Carlsmith, J.M. (1968). Experimentation in social psychology. In G. Lindzey & E. Aronson (Eds.), The handbook of social psychology (Vol. 2, pp. 1-79). Reading, MA: Addison-Wesley.
17. Aronson, E., Wilson, T.D., & Brewer, M. (1998). Experimental methods. In D. Gilbert, S. Fiske, & G. Lindzey (Eds.), The handbook of social psychology (4th ed., Vol. 1, pp. 99-142). New York: Random House.
18. Hutchinson, J. Wesley; Kamakura, Wagner A.; Lynch, John G. (2000). "Unobserved Heterogeneity as an Alternative Explanation for "Reversal" Effects in Behavioral Research". Journal of Consumer Research 27 (3): 324–344. doi:10.1086/317588.
19. Darley, J.M., & Batson, C.D. (1973). From Jerusalem to Jericho: A study of situational and dispositional variables in helping behaviour. Journal of Personality and Social Psychology, 27, 100-108.
20. Schwartz, S.H., & Gottlieb, A. (1976). Bystander reactions to a violent theft: Crime in Jerusalem. Journal of Personality and Social Psychology, 34, 1188-1199.
21. Latane, B., & Dabbs, J.M. (1975). Sex, group size, and helping in three cities. Sociometry, 38, 108-194.
22. Harrison, J.A., & Wells, R.B. (1991). Bystander effects on male helping behaviour: Social comparison and diffusion of responsibility. Representative Research in Social Psychology, 96, 187-192.
23. Latane, B., & Darley, J.M. (1968). Group inhibition of bystander intervention. Journal of Personality and Social Psychology, 10, 215-221.
24. Hurley, D., & Allen, B.P. (1974). The effect of the number of people present in a nonemergency situation. Journal of Social Psychology, 92, 27-29.
25. Latane, B., & Darley, J.M. (1970). The unresponsive bystander: Why doesn't he help? Englewood Cliffs, NJ: Prentice Hall.

UpToDate Contents

1. Glossary of common biostatistical and epidemiological terms
2. Evaluating diagnostic tests
3. Evidence based medicine and clinical trials in renal transplantation
4. Treatment and prevention of sudden cardiac arrest in dialysis patients
5. Effect of antidepressants on suicide risk in adults

English Journal

Evaluation of multivariate calibration models with different pre-processing and processing algorithms for a novel resolution and quantitation of spectrally overlapped quaternary mixture in syrup.

Moustafa AA1, Hegazy MA1, Mohamed D2, Ali O3.
Spectrochimica acta. Part A, Molecular and biomolecular spectroscopy. Spectrochim Acta A Mol Biomol Spectrosc. 2016 Feb 5;154:76-83. doi: 10.1016/j.saa.2015.10.010. Epub 2015 Oct 22.
A novel approach for the resolution and quantitation of severely overlapped quaternary mixture of carbinoxamine maleate (CAR), pholcodine (PHL), ephedrine hydrochloride (EPH) and sunset yellow (SUN) in syrup was demonstrated utilizing different spectrophotometric assisted multivariate calibration me
PMID 26519913

Validation of the Hearing Implant Sound Quality Index (HISQUI19) to assess Spanish-speaking cochlear implant users' auditory abilities in everyday communication situations.

Calvino M1, Gavilán J1, Sánchez-Cuadrado I1, Pérez-Mora RM1, Muñoz E1, Lassaletta L1.
Acta oto-laryngologica. Acta Otolaryngol. 2016 Jan;136(1):48-55. doi: 10.3109/00016489.2015.1086021. Epub 2015 Sep 25.
CONCLUSION: The Spanish-language HISQUI19 is a reliable and easy-to-use tool for quantifying the self-perceived level of auditory benefit that cochlear implant (CI) users experience in everyday listening situations. OBJECTIVES: To validate the Spanish-language version of The Hearing Implant Sound Qua
PMID 26406547

An empirically based model for knowledge management in health care organizations.

Sibbald SL1, Wathen CN, Kothari A.
Health care management review. Health Care Manage Rev. 2016 Jan-Mar;41(1):64-74. doi: 10.1097/HMR.0000000000000046.
BACKGROUND: Knowledge management (KM) encompasses strategies, processes, and practices that allow an organization to capture, share, store, access, and use knowledge. Ideal KM combines different sources of knowledge to support innovation and improve performance. PURPOSES: Despite the importance of KM
PMID 25734604

Japanese Journal

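Development of an occupational change guidebook for older adults with dementia, for occupation-based practice (2nd report): examining the internal and external validity of the prototype guidebook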

鎌田陽之,京極真
作業療法 = Japanese occupational therapy research : JOTR 33(5), 389-400, 2014-10
NAID 40020237509

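Examining the validity of the Japanese version of the Social Behaviour Schedule (SBS): the case of nurses assessing the social behaviour of long-stay inpatients in psychiatric hospitals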

岡本典子,田中有紀
日本精神保健看護学会誌 23(1), 91-100, 2014-06-20
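We examined the validity of assessments made by ward nurses, using the Japanese version of the Social Behaviour Schedule (SBS), of the social behaviour of patients with schizophrenia who had been hospitalized in psychiatric wards for one year or longer. The scale <oddness in interpersonal interaction> <excessive and inappropriate behaviour> <inappropriate behaviour due to decline> <antisocial behaviour> <self-display in interpersonal interaction> <anxiety and lowered mood …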
NAID 110009818787

A reaching-movement learning model for force-field adaptation using reinforcement learning and feedback error learning

清水遥,神原裕行,吉村奈津江,辛徳,小池康晴
情報処理学会研究報告. MPS, 数理モデル化と問題解決研究報告 2014-MPS-98(38), 1-7, 2014-06-18
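In everyday life we perform reaching movements, such as extending a hand toward an object on a desk. Human reaching movements are characterized by a nearly straight trajectory and a bell-shaped velocity profile, and it is known that the same movement can be performed under an external force as in its absence. Various mathematical models that reproduce these characteristics have been proposed; recently, by combining reinforcement learning, a learning model of the basal ganglia, with feedback error learning, a learning model of the cerebellum, reaching movements …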
NAID 110009795516