We may recall the discussion of repeated experimentation. In each of $N$ repetitions of an experiment, we observe whether or not a given event $A$ occurs, and we write $N(A)$ for the total number of occurrences of $A$. One possible philosophical underpinning of probability theory requires that the proportion $N(A) / N$ settles down as $N \rightarrow \infty$ to some limit interpretable as the ‘probability of $A$’. Is our theory to date consistent with such a requirement?

With this question in mind, let us suppose that $A_{1}, A_{2}, \ldots$ is a sequence of independent events having equal probability $\mathbb{P}\left(A_{i}\right)=p,$ where $0<p<1$; such an assumption requires of course the existence of a corresponding probability space $(\Omega, \mathcal{F}, \mathbb{P}),$ but we do not plan to get bogged down in such matters here. We think of $A_{i}$ as being the event ‘that $A$ occurs on the $i$th experiment’. We write $S_{n}=\sum_{i=1}^{n} I_{A_{i}},$ the sum of the indicator functions of $A_{1}, A_{2}, \ldots, A_{n}$; $S_{n}$ is a random variable which counts the number of occurrences of $A_{i}$ for $1 \leq i \leq n$ (certainly $S_{n}$ is a function on $\Omega,$ since it is the sum of such functions, and it is left as an exercise to show that $S_{n}$ is $\mathcal{F}$-measurable). The following result concerning the ratio $n^{-1} S_{n}$ was proved by James Bernoulli before 1692.
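As a concrete (non-rigorous) illustration, $S_{n}$ can be mimicked in code as a sum of 0/1 indicators; the values $p = 0.25$ and $n = 1000$ below are arbitrary choices for the sketch, not taken from the text:

```python
import random

random.seed(1)
p, n = 0.25, 1000                     # illustrative values, not from the text

# I_{A_i} = 1 if A_i occurs and 0 otherwise; random.random() < p plays
# the role of the independent event A_i.
indicators = [1 if random.random() < p else 0 for _ in range(n)]
S_n = sum(indicators)                 # counts occurrences among A_1, ..., A_n
print(S_n, S_n / n)
```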

(1) Theorem. It is the case that $n^{-1} S_{n}$ converges to $p$ as $n \rightarrow \infty$, in the sense that, for all $\epsilon>0$,

$$
\mathbb{P}\left(p-\epsilon \leq n^{-1} S_{n} \leq p+\epsilon\right) \rightarrow 1 \quad \text { as } \quad n \rightarrow \infty .
$$

There are certain technicalities involved in the study of the convergence of random variables, and this is the reason for the careful statement of the theorem. For the time being, we encourage the reader to interpret the theorem as asserting simply that the proportion $n^{-1} S_{n}$ of times that the events $A_{1}, A_{2}, \ldots, A_{n}$ occur converges as $n \rightarrow \infty$ to their common probability $p$. We shall see later how important it is to be careful when making such statements. Interpreted in terms of tosses of a fair coin, the theorem implies that the proportion of heads is (with large probability) near to $\frac{1}{2}$. As a caveat regarding the difficulties inherent in studying the convergence of random variables, we remark that it is not true that, in a ‘typical’ sequence of tosses of a fair coin, heads outnumber tails about one-half of the time.
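A small Monte Carlo sketch of the theorem's assertion; the value $p = 0.3$, the sample sizes, and the function name `proportion` are our illustrative assumptions, not part of the text:

```python
import random

random.seed(0)

def proportion(n, p):
    """Simulate n independent events of probability p and return S_n / n."""
    s = sum(1 for _ in range(n) if random.random() < p)
    return s / n

# The simulated proportion should settle near p as n grows.
for n in (100, 10_000, 1_000_000):
    print(n, proportion(n, p=0.3))
```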

Proof. Suppose that we toss a coin repeatedly, and heads occurs on each toss with probability $p$. The random variable $S_{n}$ has the same probability distribution as the number $H_{n}$ of heads which occur during the first $n$ tosses, which is to say that $\mathbb{P}\left(S_{n}=k\right)=\mathbb{P}\left(H_{n}=k\right)$ for all $k$. It follows that, for small positive values of $\epsilon$,

$$

\mathbb{P}\left(\frac{1}{n} S_{n} \geq p+\epsilon\right)=\sum_{k \geq n(p+\epsilon)} \mathbb{P}\left(H_{n}=k\right) .

$$

We have that

$$
\mathbb{P}\left(H_{n}=k\right)=\binom{n}{k} p^{k}(1-p)^{n-k} \quad \text { for } \quad 0 \leq k \leq n
$$

and hence

(2)

$$
\mathbb{P}\left(\frac{1}{n} S_{n} \geq p+\epsilon\right)=\sum_{k=m}^{n}\binom{n}{k} p^{k}(1-p)^{n-k}
$$

where $m=\lceil n(p+\epsilon)\rceil,$ the least integer not less than $n(p+\epsilon)$. The following argument is standard in probability theory. Let $\lambda>0$ and note that $e^{\lambda k} \geq e^{\lambda n(p+\epsilon)}$ if $k \geq m .$ Writing $q=1-p,$ we have that

$$
\begin{aligned}
\mathbb{P}\left(\frac{1}{n} S_{n} \geq p+\epsilon\right) & \leq \sum_{k=m}^{n} e^{\lambda[k-n(p+\epsilon)]}\binom{n}{k} p^{k} q^{n-k} \\
& \leq e^{-\lambda n \epsilon} \sum_{k=0}^{n}\binom{n}{k}\left(p e^{\lambda q}\right)^{k}\left(q e^{-\lambda p}\right)^{n-k} \\
&=e^{-\lambda n \epsilon}\left(p e^{\lambda q}+q e^{-\lambda p}\right)^{n},
\end{aligned}
$$

by the binomial theorem. It is a simple exercise to show that $e^{x} \leq x+e^{x^{2}}$ for $x \in \mathbb{R}$. With the aid of this inequality, we obtain

(3)

$$
\begin{aligned}
\mathbb{P}\left(\frac{1}{n} S_{n} \geq p+\epsilon\right) & \leq e^{-\lambda n \epsilon}\left[p e^{\lambda^{2} q^{2}}+q e^{\lambda^{2} p^{2}}\right]^{n} \\
& \leq e^{\lambda^{2} n-\lambda n \epsilon},
\end{aligned}
$$

since $p^{2} \leq 1$ and $q^{2} \leq 1$.
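The elementary inequality $e^{x} \leq x+e^{x^{2}}$ invoked in this step can be spot-checked numerically; the sample grid below is an arbitrary choice (equality holds at $x=0$):

```python
import math

# Check e^x <= x + e^(x^2) on a grid of points in [-5, 5], step 0.1.
points = [i / 10 for i in range(-50, 51)]
ok = all(math.exp(x) <= x + math.exp(x * x) for x in points)
print(ok)
```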

We can pick $\lambda$ to minimize the right-hand side, namely $\lambda=\frac{1}{2} \epsilon,$ giving

(4)

$$

\mathbb{P}\left(\frac{1}{n} S_{n} \geq p+\epsilon\right) \leq e^{-\frac{1}{4} n \epsilon^{2}} \quad \text { for } \quad \epsilon>0,

$$

an inequality that is known as ‘Bernstein’s inequality’. It follows immediately that $\mathbb{P}\left(n^{-1} S_{n} \geq p+\epsilon\right) \rightarrow 0$ as $n \rightarrow \infty$. An exactly analogous argument shows that $\mathbb{P}\left(n^{-1} S_{n} \leq p-\epsilon\right) \rightarrow 0$ as $n \rightarrow \infty,$ and thus the theorem is proved.

Bernstein’s inequality (4) is rather powerful, asserting that the chance that $S_{n}$ exceeds its mean by a quantity of order $n$ tends to zero exponentially fast as $n \rightarrow \infty$; such an inequality is known as a ‘large-deviation estimate’. We may use the inequality to prove rather more than the conclusion of the theorem. Instead of estimating the chance that, for a specific value of $n$, $S_{n}$ lies between $n(p-\epsilon)$ and $n(p+\epsilon),$ let us estimate the chance that this occurs for all large $n$. Writing $A_{n}=\left\{p-\epsilon \leq n^{-1} S_{n} \leq p+\epsilon\right\},$ we wish to estimate $\mathbb{P}\left(\bigcap_{n=m}^{\infty} A_{n}\right)$. Now the complement of this intersection is the event $\bigcup_{n=m}^{\infty} A_{n}^{\mathrm{c}},$ and the probability of this union satisfies, by the inequalities of Boole and Bernstein (the latter applied to each tail of $S_{n}$, whence the factor 2),

(5)

$$

\mathbb{P}\left(\bigcup_{n=m}^{\infty} A_{n}^{\mathrm{c}}\right) \leq \sum_{n=m}^{\infty} \mathbb{P}\left(A_{n}^{\mathrm{c}}\right) \leq \sum_{n=m}^{\infty} 2 e^{-\frac{1}{4} n \epsilon^{2}} \rightarrow 0 \quad \text { as } \quad m \rightarrow \infty,

$$

giving that, as required,

(6)

$$

\mathbb{P}\left(p-\epsilon \leq \frac{1}{n} S_{n} \leq p+\epsilon \text { for all } n \geq m\right) \rightarrow 1 \quad \text { as } \quad m \rightarrow \infty .

$$
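The series in (5) is geometric with common ratio $e^{-\epsilon^{2} / 4}$, so it sums in closed form to $2 e^{-m \epsilon^{2} / 4} /\left(1-e^{-\epsilon^{2} / 4}\right)$. A quick numerical check, with the illustrative choice $\epsilon=0.2$ (the function name is ours):

```python
import math

def union_bound(m, eps):
    """Closed form of the geometric series in (5): sum_{n>=m} 2 exp(-n eps^2/4)."""
    r = math.exp(-eps**2 / 4)       # common ratio of the geometric series
    return 2 * r**m / (1 - r)

# The bound on P(some A_n fails for n >= m) vanishes as m grows.
for m in (1000, 5000, 10000):
    print(m, union_bound(m, 0.2))
```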
