
# Statistics Homework Help | Hypergeometric Distribution

## Statistics Exam Help

3.4 Hypergeometric
If we have an urn filled with $w$ white and $b$ black balls, then drawing $n$ balls out of the urn with replacement yields a $\operatorname{Bin}(n, w /(w+b))$ distribution for the number of white balls obtained in $n$ trials, since the draws are independent Bernoulli trials, each with probability $w /(w+b)$ of success. If we instead sample without replacement, as illustrated in Figure 3.7, then the number of white balls follows a Hypergeometric distribution.
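The contrast between the two sampling schemes can be seen in a small simulation. This is only a sketch: the function names, the seed, and the parameter values $w=2$, $b=8$, $n=5$ are illustrative choices, not part of the text. With replacement, the count of white balls can reach $n$; without replacement, it can never exceed $w$.

```python
import random

def count_white_with_replacement(w, b, n, rng):
    """Draw n balls with replacement; each draw is white with probability w/(w+b)."""
    urn = ["white"] * w + ["black"] * b
    return sum(rng.choice(urn) == "white" for _ in range(n))

def count_white_without_replacement(w, b, n, rng):
    """Draw n distinct balls; rng.sample makes every size-n subset equally likely."""
    urn = ["white"] * w + ["black"] * b
    return sum(ball == "white" for ball in rng.sample(urn, n))

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
w, b, n = 2, 8, 5       # illustrative values (assumption, not from the text)
with_repl = [count_white_with_replacement(w, b, n, rng) for _ in range(10_000)]
without_repl = [count_white_without_replacement(w, b, n, rng) for _ in range(10_000)]

# Without replacement the count is capped at w = 2; with replacement it is capped only at n = 5.
print(max(with_repl), max(without_repl))
```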

Story 3.4.1 (Hypergeometric distribution). Consider an urn with $w$ white balls and $b$ black balls. We draw $n$ balls out of the urn at random without replacement, such that all $\binom{w+b}{n}$ samples are equally likely. Let $X$ be the number of white balls in the sample. Then $X$ is said to have the Hypergeometric distribution with parameters $w, b$, and $n$; we denote this by $X \sim \operatorname{HGeom}(w, b, n)$.
FIGURE 3.7
Hypergeometric story. An urn contains $w=6$ white balls and $b=4$ black balls. We sample $n=5$ without replacement. The number $X$ of white balls in the sample is Hypergeometric; here we observe $X=3$.

As with the Binomial distribution, we can obtain the PMF of the Hypergeometric distribution from the story.

Theorem 3.4.2 (Hypergeometric PMF). If $X \sim \operatorname{HGeom}(w, b, n)$, then the PMF of $X$ is
$$P(X=k)=\frac{\binom{w}{k}\binom{b}{n-k}}{\binom{w+b}{n}}$$
for integers $k$ satisfying $0 \leq k \leq w$ and $0 \leq n-k \leq b$, and $P(X=k)=0$ otherwise.
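
As a quick check of the formula, take the urn from Figure 3.7, with $w=6$, $b=4$, $n=5$, and the observed value $k=3$:
$$P(X=3)=\frac{\binom{6}{3}\binom{4}{2}}{\binom{10}{5}}=\frac{20 \cdot 6}{252}=\frac{10}{21} \approx 0.476.$$
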
Proof. To get $P(X=k)$, we first count the number of possible ways to draw exactly $k$ white balls and $n-k$ black balls from the urn (without distinguishing between different orderings for getting the same set of balls). If $k>w$ or $n-k>b$, then the draw is impossible. Otherwise, there are $\binom{w}{k}\binom{b}{n-k}$ ways to draw $k$ white and $n-k$ black balls by the multiplication rule, and there are $\binom{w+b}{n}$ total ways to draw $n$ balls. Since all samples are equally likely, the naive definition of probability gives
$$P(X=k)=\frac{\binom{w}{k}\binom{b}{n-k}}{\binom{w+b}{n}}$$
for integers $k$ satisfying $0 \leq k \leq w$ and $0 \leq n-k \leq b$. This PMF is valid because the numerator, summed over all $k$, equals $\binom{w+b}{n}$ by Vandermonde's identity (Example 1.5.3), so the PMF sums to 1.
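
The validity argument can be checked numerically. The following is a minimal sketch using exact rational arithmetic; the function name `hgeom_pmf` is a hypothetical helper introduced here, and the parameters are those of the urn in Figure 3.7.

```python
from fractions import Fraction
from math import comb

def hgeom_pmf(k, w, b, n):
    """P(X = k) for X ~ HGeom(w, b, n), returned as an exact fraction."""
    if not (0 <= k <= w and 0 <= n - k <= b):
        return Fraction(0)  # k is outside the support
    return Fraction(comb(w, k) * comb(b, n - k), comb(w + b, n))

w, b, n = 6, 4, 5  # the urn in Figure 3.7

# Vandermonde's identity: the numerators sum to the denominator.
numerator_total = sum(
    comb(w, k) * comb(b, n - k) for k in range(max(0, n - b), min(w, n) + 1)
)
print(numerator_total, comb(w + b, n))  # 252 252

# Consequently the PMF sums to exactly 1.
pmf = {k: hgeom_pmf(k, w, b, n) for k in range(n + 1)}
print(sum(pmf.values()))  # 1
```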

The Hypergeometric distribution comes up in many scenarios which, on the surface, have little in common with white and black balls in an urn. The essential structure of the Hypergeometric story is that items in a population are classified using two sets of tags: in the urn story, each ball is either white or black (this is the first set of tags), and each ball is either sampled or not sampled (this is the second set of tags). Furthermore, at least one of these sets of tags is assigned completely at random (in the urn story, the balls are sampled randomly, with all sets of the correct size equally likely). Then $X \sim \operatorname{HGeom}(w, b, n)$ represents the number of twice-tagged items: in the urn story, balls that are both white and sampled.
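
The two-tag structure can be sketched with sets. This is an illustration under assumed values: the item IDs, the seed, and the choice $w=6$, $b=4$, $n=5$ are hypothetical. The first tag is fixed, the second is assigned at random, and $X$ is the size of the overlap.

```python
import random

rng = random.Random(1)         # arbitrary seed for reproducibility
population = range(10)         # hypothetical item IDs 0..9, so w + b = 10
white = set(range(6))          # first set of tags: fixed, w = 6 items are "white"

# Second set of tags: n = 5 items chosen at random, all size-5 subsets equally likely.
sampled = set(rng.sample(population, 5))

x = len(white & sampled)       # number of twice-tagged items
print(x)
```

Because only $b=4$ items lack the first tag, any sample of 5 must contain at least one item carrying both tags, which matches the support condition $0 \leq n-k \leq b$.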
