# Concentration inequalities – Part I

### Markov’s inequality

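$$\underbrace{P(X \geq \alpha)}_{\text{Probability of being greater than constant } \alpha} \leq \underbrace{\frac{\mathbb{E}\left[X\right]}{\alpha}}_{\text{Bounded above by expectation over constant } \alpha}$$

Example: a grocery store sells an average of 40 beers per day (it’s summer!). What is the probability that it will sell 80 or more beers tomorrow?

\begin{align} P(X \geq \alpha) & \leq \frac{\mathbb{E}\left[X\right]}{\alpha} \\\\ P(X \geq 80) & \leq\frac{40}{80} = 0.5 = 50\% \end{align}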
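To get a feel for how loose or tight this 50% figure can be, here is a minimal Python sketch. The two sales distributions are assumptions made purely for illustration (the article only gives the mean): an exponential distribution with mean 40, and a two-point distribution that sells either 0 or 80 beers with equal probability.

```python
import math


def markov_bound(mean, alpha):
    """Markov's upper bound on P(X >= alpha) for a non-negative X with this mean."""
    return min(1.0, mean / alpha)


mean_sales, threshold = 40, 80
bound = markov_bound(mean_sales, threshold)

# Assumption 1: sales follow an exponential distribution with mean 40,
# whose tail is P(X >= x) = exp(-x / mean).
exponential_tail = math.exp(-threshold / mean_sales)

# Assumption 2: sales are 0 or 80 beers, each with probability 0.5.
# The mean is still 40, and the tail probability equals the bound.
two_point_tail = 0.5

print(f"Markov bound:     {bound:.3f}")
print(f"Exponential tail: {exponential_tail:.3f}")
print(f"Two-point tail:   {two_point_tail:.3f}")
```

Under the exponential assumption the true probability is well below the bound, while the two-point distribution attains it exactly, so the bound cannot be tightened using the mean alone.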

### Chebyshev’s inequality

When we have information about the underlying distribution of a random variable, we can take advantage of properties of this distribution to know more about the concentration of this variable. Let’s take for example a normal distribution with mean $$\mu = 0$$ and unit standard deviation $$\sigma = 1$$ given by the probability density function (PDF) below:

$$f(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}$$

Integrating the PDF from -1 to 1, $$\int_{-1}^{1} \frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,dx$$, we find that about 68% of the data is within $$1\sigma$$ (one standard deviation) of the mean $$\mu$$, and about 95% is within $$2\sigma$$ of the mean. However, when it’s not possible to assume normality, these figures no longer apply: a different fraction of the data may be concentrated within $$1\sigma$$ or $$2\sigma$$.
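These percentages can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+) rather than integrating the PDF by hand; the helper name `mass_within` is just an illustrative choice.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def mass_within(k):
    """P(-k < X < k) for a standard normal X, computed from the CDF."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


within_one_sigma = mass_within(1)
within_two_sigma = mass_within(2)

print(f"within 1 sigma: {within_one_sigma:.4f}")
print(f"within 2 sigma: {within_two_sigma:.4f}")
```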

Chebyshev’s inequality provides a way to bound the concentration of any distribution, without assuming any underlying property except a finite mean and variance. Chebyshev’s inequality also holds for any random variable, not only for non-negative variables as in Markov’s inequality.

$$P(\mid X - \mu \mid \geq k\sigma) \leq \frac{1}{k^2}$$

which can also be rewritten as:

$$P(\mid X - \mu \mid < k\sigma) \geq 1 - \frac{1}{k^2}$$
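The inequality follows from Markov’s inequality in one step. As a short sketch, assuming the variance $$\sigma^2 = \mathbb{E}\left[(X - \mu)^2\right]$$ is finite and positive and $$k > 0$$, apply Markov’s inequality to the non-negative random variable $$(X - \mu)^2$$ with the constant $$k^2\sigma^2$$:

$$P(\mid X - \mu \mid \geq k\sigma) = P\left((X - \mu)^2 \geq k^2\sigma^2\right) \leq \frac{\mathbb{E}\left[(X - \mu)^2\right]}{k^2\sigma^2} = \frac{\sigma^2}{k^2\sigma^2} = \frac{1}{k^2}$$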

For the concrete case of $$k = 2$$, Chebyshev’s inequality tells us that at least 75% of the data is concentrated within 2 standard deviations of the mean. And this holds for any distribution.

Now, when we compare this result for $$k = 2$$ with the 95% concentration of the normal distribution within $$2\sigma$$, we can see how conservative Chebyshev’s bound is. However, one must not forget that it holds for any distribution, not only for a normally distributed random variable, and all that Chebyshev’s inequality needs is the first and second moments of the data. It is important to note that, in the absence of more information about the random variable, this bound cannot be improved.
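To see why the bound cannot be improved without more information, consider a distribution that attains it. The three-point distribution below is an illustrative assumption, not something from the article: it puts probability $$1/(2k^2)$$ on each of $$-k$$ and $$+k$$ and the remaining mass on 0, which gives mean 0 and variance 1. Exact fractions are used so the comparison is not blurred by floating-point rounding.

```python
from fractions import Fraction


def chebyshev_tail_bound(k):
    """Chebyshev's upper bound on P(|X - mu| >= k * sigma)."""
    return Fraction(1, k * k)


def three_point_distribution(k):
    """Map each value in {-k, 0, +k} to its probability (illustrative choice)."""
    p = Fraction(1, 2 * k * k)
    return {-k: p, 0: 1 - 2 * p, k: p}


k = 2
dist = three_point_distribution(k)

mean = sum(x * p for x, p in dist.items())
variance = sum((x - mean) ** 2 * p for x, p in dist.items())

# The variance is exactly 1 by construction, so sigma = 1 and k * sigma = k.
tail = sum(p for x, p in dist.items() if abs(x - mean) >= k)

print(f"mean = {mean}, variance = {variance}")
print(f"P(|X - mu| >= {k} sigma) = {tail}")
print(f"Chebyshev bound          = {chebyshev_tail_bound(k)}")
```

For this distribution the tail probability equals $$1/k^2$$, so any sharper bound would have to rely on assumptions beyond the mean and variance.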

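### Chebyshev’s inequality and the weak law of large numbers

Chebyshev’s inequality can also be used to prove the weak law of large numbers, which says that the sample mean converges in probability towards the true mean.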

That can be done as follows:

• Consider a sequence of i.i.d. (independent and identically distributed) random variables \(X_1, X_2, X_3, \ldots\) with mean \(\mu\) and variance \(\sigma^2\);
• The sample mean is $$M_n = \frac{X_1 + \ldots + X_n}{n}$$ and the true mean is $$\mu$$;
• For the expectation of the sample mean we have: $$\mathbb{E}\left[M_n\right] = \frac{\mathbb{E}\left[X_1\right] + \ldots + \mathbb{E}\left[X_n\right]}{n} = \frac{n\mu}{n} = \mu$$
• For the variance of the sample mean we have (using independence, so that the variances add): $$Var\left[M_n\right] = \frac{Var\left[X_1\right] + \ldots + Var\left[X_n\right]}{n^2} = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$
• By applying Chebyshev’s inequality we have: $$P(\mid M_n - \mu \mid \geq \epsilon) \leq \frac{\sigma^2}{n\epsilon^2}$$ for any (fixed) $$\epsilon > 0$$. As $$n$$ increases, the right side of the inequality goes to zero. Intuitively, this means that for large $$n$$ the distribution of $$M_n$$ concentrates around $$\mu$$.
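The steps above can be illustrated with a small simulation. This is a sketch under assumed settings, none of which come from the article: the $$X_i$$ are Uniform(0, 1) (so $$\mu = 1/2$$ and $$\sigma^2 = 1/12$$), $$\epsilon = 0.1$$, and 2000 trials per value of $$n$$. It compares how often $$\mid M_n - \mu \mid \geq \epsilon$$ actually happens with the bound $$\sigma^2 / (n\epsilon^2)$$.

```python
import random


def deviation_frequency(n, epsilon, trials, rng):
    """Fraction of trials in which |M_n - mu| >= epsilon for Uniform(0, 1) draws."""
    mu = 0.5
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        if abs(sample_mean - mu) >= epsilon:
            hits += 1
    return hits / trials


rng = random.Random(0)   # fixed seed so the run is reproducible
sigma_squared = 1 / 12   # variance of Uniform(0, 1)
epsilon = 0.1

results = []
for n in (10, 100, 1000):
    empirical = deviation_frequency(n, epsilon, trials=2000, rng=rng)
    bound = min(1.0, sigma_squared / (n * epsilon ** 2))
    results.append((n, empirical, bound))
    print(f"n = {n:5d}  empirical = {empirical:.4f}  Chebyshev bound = {bound:.4f}")
```

The bound shrinks like $$1/n$$, and the observed frequency of large deviations stays below it, which is the behaviour the weak law of large numbers describes.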

### Improving on Markov’s and Chebyshev’s with Chernoff bounds

Before getting into the Chernoff bound, let’s understand the motivation behind it and how one can improve on Chebyshev’s bound. To do that, we first need to understand the difference between pairwise independence and mutual independence. For pairwise independence, we have the following for events A, B, and C:

$$P(A \cap B) = P(A)P(B) \\ P(A \cap C) = P(A)P(C) \\ P(B \cap C) = P(B)P(C)$$

This means that any pair of events is independent, but not necessarily that:

$$P(A \cap B\cap C) = P(A)P(B)P(C)$$

which, together with the pairwise conditions, is called “mutual independence” and is a stronger form of independence. By definition, mutual independence implies pairwise independence, but the converse isn’t always true. This is the case where we can improve on Chebyshev’s bound, which is not possible without making these further assumptions (stronger assumptions lead to stronger bounds).
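A classic example makes the difference concrete. The sketch below enumerates two fair coin flips and defines three events chosen for illustration (they are not from the article): A is “the first flip is heads”, B is “the second flip is heads”, and C is “the two flips differ”.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin flips, four equally likely outcomes.
outcomes = list(product("HT", repeat=2))


def prob(event):
    """Probability of an event, given as a predicate over outcomes."""
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))


def event_a(outcome):
    return outcome[0] == "H"      # first flip is heads


def event_b(outcome):
    return outcome[1] == "H"      # second flip is heads


def event_c(outcome):
    return outcome[0] != outcome[1]  # the two flips differ


def intersection(*events):
    """Event that holds only when all the given events hold."""
    return lambda outcome: all(event(outcome) for event in events)


p_a, p_b, p_c = prob(event_a), prob(event_b), prob(event_c)

pairwise_ok = (
    prob(intersection(event_a, event_b)) == p_a * p_b
    and prob(intersection(event_a, event_c)) == p_a * p_c
    and prob(intersection(event_b, event_c)) == p_b * p_c
)
p_all = prob(intersection(event_a, event_b, event_c))

print(f"pairwise independent: {pairwise_ok}")
print(f"P(A and B and C) = {p_all}, P(A)P(B)P(C) = {p_a * p_b * p_c}")
```

Every pair satisfies the product rule, yet all three events can never occur together, so the triple product rule fails: the events are pairwise independent but not mutually independent.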

Cite this article as: Christian S. Perone, "Concentration inequalities – Part I," in Terra Incognita, 23/08/2018, 188asia.net

## 2 thoughts to “Concentration inequalities – Part I”

1. Anonymous says:

Is there a typo in equation 9? Should be $\frac{\sigma^2}{n}$, right?

1. Christian S. Perone says:

You’re right, thanks for seeing this, will fix it now.