WordNet

set up or distributed in a deliberately random way (syn.) randomised
lacking any definite plan or order or purpose; governed by or depending on chance; "a random choice"; "bombs fell at random"; "random movements"
(statistics) the selection of a suitable sample for study
measurement at regular intervals of the amplitude of a varying waveform (in order to convert it to digital form)
the basic unit of money in South Africa; equal to 100 cents

PrepTutorEJDIC

random, haphazard; arbitrary; hit-or-miss
rand (the monetary unit of the Republic of South Africa)

Wikipedia preview

Source (authority): the free encyclopedia Wikipedia, "2015/11/30 15:56:47" (JST)

wiki en

In statistics, a simple random sample is a subset of individuals (a sample) chosen from a larger set (a population). Each individual is chosen randomly and entirely by chance, such that each individual has the same probability of being chosen at any stage during the sampling process, and each subset of k individuals has the same probability of being chosen for the sample as any other subset of k individuals.^[1] This process and technique is known as simple random sampling, and should not be confused with systematic random sampling. A simple random sample is an unbiased surveying technique.

Simple random sampling is a basic type of sampling, since it can be a component of other more complex sampling methods. The principle of simple random sampling is that every object has the same probability of being chosen. For example, suppose N college students want to get a ticket for a basketball game, but there are only X < N tickets for them, so they decide to have a fair way to see who gets to go. Then, everybody is given a number in the range from 0 to N-1, and random numbers are generated, either electronically or from a table of random numbers. Numbers outside the range from 0 to N-1 are ignored, as are any numbers previously selected. The first X numbers would identify the lucky ticket winners.
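The ticket lottery can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the function name draw_ticket_winners and the fixed-width "random number table" simulated with randrange are assumptions made for the example.

```python
import random


def draw_ticket_winners(n_students, n_tickets, rng=None):
    """Return n_tickets distinct winners from students numbered 0..n_students-1."""
    if not 0 <= n_tickets <= n_students:
        raise ValueError("need 0 <= n_tickets <= n_students")
    rng = rng or random.Random()

    # Simulate a table of fixed-width random numbers, so that some of the
    # numbers drawn fall outside the range 0..n_students-1.
    digits = len(str(max(n_students - 1, 0)))
    upper = 10 ** digits

    winners = []
    seen = set()
    while len(winners) < n_tickets:
        number = rng.randrange(upper)
        # Ignore out-of-range numbers and numbers already selected.
        if number >= n_students or number in seen:
            continue
        seen.add(number)
        winners.append(number)
    return winners


if __name__ == "__main__":
    print(draw_ticket_winners(n_students=30, n_tickets=5))
```

Because rejected numbers are simply skipped, every student who has not yet won remains equally likely to be drawn next.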

In small populations and often in large ones, such sampling is typically done "without replacement", i.e., one deliberately avoids choosing any member of the population more than once. Although simple random sampling can be conducted with replacement instead, this is less common and would normally be described more fully as simple random sampling with replacement. Draws made without replacement are no longer independent, but they still satisfy exchangeability, so many results still hold. Further, for a small sample from a large population, sampling without replacement is approximately the same as sampling with replacement, since the probability of choosing the same individual twice is low.

An unbiased random selection of individuals is important so that if a large number of samples were drawn, the average sample would accurately represent the population. However, this does not guarantee that a particular sample is a perfect representation of the population. Simple random sampling merely allows one to draw externally valid conclusions about the entire population based on the sample.

Conceptually, simple random sampling is the simplest of the probability sampling techniques. It requires a complete sampling frame, which may not be available or feasible to construct for large populations. Even if a complete frame is available, more efficient approaches may be possible if other useful information is available about the units in the population.

Advantages are that it is free of classification error, and it requires minimum advance knowledge of the population other than the frame. Its simplicity also makes it relatively easy to interpret data collected in this manner. For these reasons, simple random sampling best suits situations where not much information is available about the population and data collection can be efficiently conducted on randomly distributed items, or where the cost of sampling is small enough to make efficiency less important than simplicity. If these conditions do not hold, stratified sampling or cluster sampling may be a better choice.

Algorithms

Several efficient algorithms for simple random sampling have been developed.^[2]^[3] A naive algorithm is the draw-by-draw algorithm, where at each step we remove an item from the set with equal probability and put it in the sample. We continue until we have a sample of the desired size k. The drawback of this method is that it requires random access to the set.
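A minimal Python sketch of the draw-by-draw idea is given here. The function name draw_by_draw is hypothetical, and the swap-with-last trick is just one convenient way to remove an item in constant time; it also shows why random access to the set is needed.

```python
import random


def draw_by_draw(items, k, rng=None):
    """Draw k items one at a time, each uniformly from those still remaining."""
    rng = rng or random.Random()
    pool = list(items)  # random access is needed to index into the pool
    if not 0 <= k <= len(pool):
        raise ValueError("need 0 <= k <= number of items")

    sample = []
    for _ in range(k):
        i = rng.randrange(len(pool))
        # Move the chosen item to the end and pop it, so removal is O(1).
        pool[i], pool[-1] = pool[-1], pool[i]
        sample.append(pool.pop())
    return sample


if __name__ == "__main__":
    print(draw_by_draw(range(1000), 10))
```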

The selection-rejection algorithm developed by Fan et al. in 1962^[4] requires only a single pass over the data; however, it is a sequential algorithm and requires knowledge of the total count of items n, which is not available in streaming scenarios.
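One common way to realise a single-pass selection-rejection scheme is sketched below. It is written in the spirit of the method described above and is not claimed to match the formulation in the cited paper; the function name selection_rejection is hypothetical. Each item is accepted with probability (items still needed) / (items still to be seen), which requires knowing n in advance.

```python
import random


def selection_rejection(stream, n, k, rng=None):
    """Select exactly k of the n items in stream in one sequential pass."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    rng = rng or random.Random()

    sample = []
    for seen, item in enumerate(stream):
        needed = k - len(sample)
        if needed == 0:
            break
        remaining = n - seen  # items not yet examined, including this one
        # Accept with probability needed / remaining.
        if rng.random() * remaining < needed:
            sample.append(item)
    return sample


if __name__ == "__main__":
    print(selection_rejection(iter(range(100)), n=100, k=10))
```

The selected items come out in their original order, and the acceptance probability reaches 1 whenever every remaining item is needed, so the sample always has exactly k items when the stream really contains n items.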

A very simple random sort algorithm was proved by Sunter in 1977;^[5] it assigns each item a key drawn at random from the uniform distribution on (0, 1), sorts all items by key, and selects the k items with the smallest keys.
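The random sort idea fits in a few lines of Python. This is only a sketch; the function name random_sort_sample is hypothetical, and the item index is included in each tuple purely so that ties between keys never force a comparison of the items themselves.

```python
import random


def random_sort_sample(items, k, rng=None):
    """Attach a uniform random key to each item and keep the k smallest keys."""
    rng = rng or random.Random()
    keyed = [(rng.random(), index, item) for index, item in enumerate(items)]
    if not 0 <= k <= len(keyed):
        raise ValueError("need 0 <= k <= number of items")
    keyed.sort()  # sorts by random key first, then by index on ties
    return [item for _key, _index, item in keyed[:k]]


if __name__ == "__main__":
    print(random_sort_sample(["a", "b", "c", "d", "e"], 2))
```

A full sort costs O(n log n); when k is much smaller than n, a partial selection such as heapq.nsmallest would give the same sample with less work.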

J. Vitter proposed the reservoir sampling algorithm in 1985,^[6] and it is now widely used. This algorithm does not require advance knowledge of n, and the space it uses depends only on the sample size k, not on n.
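The basic reservoir scheme, commonly called Algorithm R, can be sketched as follows. The cited paper also describes faster variants, which are not shown here; the function name reservoir_sample is hypothetical.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = rng or random.Random()

    reservoir = []
    for seen, item in enumerate(stream, start=1):
        if len(reservoir) < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a random slot with probability k / seen.
            j = rng.randrange(seen)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    print(reservoir_sample((line for line in range(10_000)), 5))
```

Only the k-slot reservoir is stored, so memory use does not grow with the length of the stream. If the stream has fewer than k items, the function returns all of them.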

Distinction between a systematic random sample and a simple random sample

Consider a school with 1000 students, divided equally into boys and girls, and suppose that a researcher wants to select 100 of them for further study. All their names might be put in a bucket and then 100 names might be pulled out. Not only does each person have an equal chance of being selected, we can also easily calculate the probability P of a given person being chosen, since we know the sample size (n) and the population (N):

1. In the case that any given person can only be selected once (i.e., after selection a person is removed from the selection pool):

\begin{align} P &= 1 - \frac{N-1}{N} \cdot \frac{N-2}{N - 1} \cdot \cdots \cdot \frac{N-n}{N - (n - 1)} \\[8pt] &\stackrel{\text{Canceling:}}{=} 1 - \frac{N - n}N \\[8pt] &= \frac nN \\[8pt] &= \frac{100}{1000} \\[8pt] &= 10\% \end{align}

2. In the case that any selected person is returned to the selection pool (i.e., can be picked more than once):

P = 1-\left(1-\frac{1}{N}\right)^n = 1 - \left(\frac{999}{1000}\right)^{100} = 0.0952\dots \approx 9.5\%

This means that every student in the school has in any case approximately a 1 in 10 chance of being selected using this method. Further, all combinations of 100 students have the same probability of selection.
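The two probabilities can be checked numerically. The following sketch evaluates the product for the without-replacement case exactly with fractions and the with-replacement case in floating point; the function names are hypothetical.

```python
from fractions import Fraction


def p_without_replacement(N, n):
    """P(a given person is chosen) when n of N are drawn without replacement."""
    p_never_chosen = Fraction(1)
    for i in range(n):
        # Probability of missing the person on draw i+1, given earlier misses.
        p_never_chosen *= Fraction(N - i - 1, N - i)
    return 1 - p_never_chosen


def p_with_replacement(N, n):
    """P(a given person is chosen at least once) in n draws with replacement."""
    return 1 - (1 - 1 / N) ** n


if __name__ == "__main__":
    print(p_without_replacement(1000, 100))          # exact fraction n/N
    print(round(p_with_replacement(1000, 100), 4))   # about 0.0952
```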

If a systematic pattern is introduced into random sampling, it is referred to as "systematic (random) sampling". An example would be if the students in the school had numbers attached to their names ranging from 0001 to 1000, and we chose a random starting point, e.g. 0533, and then picked every 10th name thereafter to give us our sample of 100 (starting over with 0003 after reaching 0993). In this sense, this technique is similar to cluster sampling, since the choice of the first unit will determine the remainder. This is no longer simple random sampling, because some combinations of 100 students have a larger selection probability than others – for instance, {3, 13, 23, ..., 993} has a 1/10 chance of selection, while {1, 2, 3, ..., 100} cannot be selected under this method.
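A short sketch makes the difference concrete by enumerating every sample that the systematic scheme can produce for the school example. The function name systematic_sample is hypothetical; students are numbered 1 to 1000 as in the text.

```python
def systematic_sample(start, population=1000, step=10):
    """Take start, then every step-th number, wrapping around past the end."""
    size = population // step
    return sorted(((start - 1 + i * step) % population) + 1 for i in range(size))


if __name__ == "__main__":
    # Starting at 0533 gives the sample {3, 13, 23, ..., 993}.
    print(systematic_sample(533)[:5])

    # All 1000 starting points collapse into only 10 distinct samples.
    distinct = {tuple(systematic_sample(s)) for s in range(1, 1001)}
    print(len(distinct))

    # The block {1, 2, ..., 100} is not among them.
    print(tuple(range(1, 101)) in distinct)
```

Each of the 10 possible samples is produced by 100 of the 1000 starting points, which is where the 1/10 selection probability comes from.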

Sampling a dichotomous population

If the members of the population come in two kinds, say "red" and "black", the number of red elements in a sample of given size will vary by sample and hence is a random variable whose distribution can be studied. That distribution depends on the numbers of red and black elements in the full population. For a simple random sample with replacement, the distribution is a binomial distribution. For a simple random sample without replacement, one obtains a hypergeometric distribution.
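The two distributions can be compared directly for a population of N members of which R are red. This is a sketch using the standard probability mass functions; the function names and the example numbers (N = 1000, R = 500, n = 100) are assumptions chosen for illustration.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(r red in n draws) when sampling with replacement, p = R / N."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


def hypergeom_pmf(r, n, R, N):
    """P(r red in n draws) when sampling without replacement from N, R red."""
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible outcomes automatically get probability 0.
    return comb(R, r) * comb(N - R, n - r) / comb(N, n)


if __name__ == "__main__":
    N, R, n = 1000, 500, 100
    for r in (40, 50, 60):
        print(r, binomial_pmf(r, n, R / N), hypergeom_pmf(r, n, R, N))
```

With the sample a small fraction of the population the two sets of probabilities are close, and they move further apart as n approaches N.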

See also

Multistage sampling
Nonprobability sampling
Opinion poll
Quantitative marketing research

References

[1] Yates, Daniel S.; David S. Moore; Daren S. Starnes (2008). The Practice of Statistics, 3rd Ed. Freeman. ISBN 978-0-7167-7309-2.
[2] Sampling Algorithms. Springer. 2006-01-01. doi:10.1007/0-387-34240-0. ISBN 978-0-387-30814-2.
[3] Meng, Xiangrui (2013). "Scalable Simple Random Sampling and Stratified Sampling" (PDF). Proceedings of the 30th International Conference on Machine Learning (ICML-13): 531–539.
[4] Fan, C. T.; Muller, Mervin E.; Rezucha, Ivan (1962-06-01). "Development of Sampling Plans by Using Sequential (Item by Item) Selection Techniques and Digital Computers". Journal of the American Statistical Association 57 (298): 387–402. doi:10.1080/01621459.1962.10480667. ISSN 0162-1459.
[5] Sunter, A. B. (1977-01-01). "List Sequential Sampling with Equal or Unequal Probabilities without Replacement". Applied Statistics 26 (3). doi:10.2307/2346966.
[6] Vitter, Jeffrey S. (1985-03-01). "Random Sampling with a Reservoir". ACM Trans. Math. Softw. 11 (1): 37–57. doi:10.1145/3147.3165. ISSN 0098-3500.

UpToDate Contents

1. Fetal blood sampling
2. Chorionic villus sampling: risks, complications, and techniques
3. Proof, p-values, and hypothesis testing
4. Systematic review and meta-analysis
5. Cancer of the ovary, fallopian tube, and peritoneum: staging and initial surgical management

English Journal

The study of KBP of road construction workers of highway AIDS prevention project before and after intervention.

Liu D, Dong SP, Gao GM, Fan MY, Zhang ZJ, Fang PQ. Source: School of Medicine and Health Management, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, Hubei, China.
Asian Pacific Journal of Tropical Medicine. Asian Pac J Trop Med. 2013 Oct;6(10):817-22. doi: 10.1016/S1995-7645(13)60144-3.
OBJECTIVE: To get scientific basis for further health education through the research of the road construction workers' KBP before and after the interventions of highway AIDS prevention project. METHODS: Multi-stage random sampling method was employed to select workers of 8 sites from 14 sites along
PMID 23870472

An intrinsic algorithm for parallel poisson disk sampling on arbitrary surfaces.

Ying X, Xin SQ, Sun Q, He Y. Source: Nanyang Technological University, Singapore.
IEEE Transactions on Visualization and Computer Graphics. IEEE Trans Vis Comput Graph. 2013 Sep;19(9):1425-37. doi: 10.1109/TVCG.2013.63.
Poisson disk sampling has excellent spatial and spectral properties, and plays an important role in a variety of visual computing. Although many promising algorithms have been proposed for multidimensional sampling in Euclidean space, very few studies have been reported with regard to the problem of
PMID 23846089

The effectiveness of a health education intervention on self-care of traumatic wounds.

Chen YC, Wang YC, Chen WK, Smith M, Huang HM, Huang LC. Source: Department of Emergency Medicine, China Medical University Hospital, Taichung, Taiwan.
Journal of Clinical Nursing. J Clin Nurs. 2013 Sep;22(17-18):2499-508. doi: 10.1111/j.1365-2702.2012.04295.x. Epub 2012 Nov 2.
AIMS AND OBJECTIVES: To explore the effectiveness of wound care programme for emergency traumatic patient in Taiwan. BACKGROUND: Wound care is one of the most major issues for trauma patients at home. Wound infection has been alerted mostly on medical treatment. Little is known about how healthcare e
PMID 23121467

Japanese Journal

Noise statistics of phase-resolved optical coherence tomography imaging: single-and dual-beam-scan Doppler optical coherence tomography

Makita Shuichi, Jaillon Franck, Jahan Israt, Yasuno Yoshiaki, 巻田修一, 安野嘉晃
Optics Express 22(4), 4830-4848, 2014-02
… In this paper, the statistical properties of phase shift between two OCT signals that contain additive random noises and speckle noises are presented. … As expected, phase shift noise in the case of the dual-beam-scan method is less than that for the single-beam method when the transversal sampling step is large. …
NAID 120005418337

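The "mythical" nature of public works benefits as seen in residents' attitudes, and its constituent factors: using the results of a questionnaire survey on the Tomonoura harbor bridge issue (住民意識にみる公共事業効果の「神話」性とその構成要因 : 鞆の浦港湾架橋問題に関するアンケート調査結果を用いて)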

鈴木晃志郎
歴史地理学 56(1), 1-20, 2014-01
… This survey includes the 441 respondents out of 4,434 residents in Tamonoura area through random sampling. …
NAID 120005423518

Efficient Sampling Method for Monte Carlo Tree Search Problem

TERAOKA Kazuki, HATANO Kohei, TAKIMOTO Eiji
IEICE Transactions on Information and Systems E97.D(3), 392-398, 2014
… We consider Monte Carlo tree search problem, a variant of Min-Max tree search problem where the score of each leaf is the expectation of some Bernoulli variables and not explicitly given but can be estimated through (random) playouts. …
NAID 130003394853