Probability - a pretty neat thing that humans have invested their thoughts in, eh? Ignoring its origin (gambling), of course.
Although it starts as a trivial topic (something you could probably skip studying much of during high school), it turns out to be much nastier as the level of study progresses, from undergrad to post-doc.
The inclusion of statistics, as well as a couple of other mathematical chapters such as combinatorics (typical PnC, or rather \(^{n}C_{r} = \frac{n!}{(n-r)!\,r!}\) as you might remember) and number theory, adds to its complexity.
Altogether, it is a very significant topic - one that is broadly required for data science and competitive programming alike.
The very basic notion of probability is a value for the occurrence of an event - one that is either of the binary forms (i.e. 0 and 1, or boolean false and true if I were to interpret it in my domain of computer science!), or a value in between them. (That binary pair is also reminiscent of a Bernoulli random variable, which can only take the values 0 or 1 - though those are the variable’s outcomes, not its probability.) The two extreme values do not fall under the category of ‘an event that is likely/probable to occur’, because if the event cannot happen, its probability is a 0, and if it must happen, it’s a 1. These cases are denoted as an ‘impossible’ (although nothing is, as they say) and a ‘certain’ event respectively.
Imagine someone asking you - ‘What proportion of estimated attendants will attend your event today?’ - to which your answer might be ‘around 90%’ or ‘around 99.99%’, but never ‘around 100%’. Why? Because it sounds grammatically incorrect? No - because the phrase contradicts itself. 100% implies it’s ‘certain’, denoting the probability of a certain event, whereas ‘around’ here means that it’s not certain - in fact, it’s a probability strictly between 0 and 1, or between 0% and 100%.
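Here’s a minimal simulation sketch of that idea in Python (the die and the function name are my own illustration): an empirical estimate of an ordinary event’s probability hovers strictly between 0 and 1.

```python
import random

# A quick sketch: estimate P(even roll) for a fair six-sided die by
# simulation. The empirical proportion hovers near 0.5 - strictly
# between 0 and 1 - since the event is neither impossible nor certain.
def estimate_even_probability(trials: int = 100_000) -> float:
    hits = sum(1 for _ in range(trials) if random.randint(1, 6) % 2 == 0)
    return hits / trials

print(f"Estimated P(even) ~ {estimate_even_probability():.4f}")
```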
The sum of the probabilities of two events occurring together (intersection) and of either event occurring (union) equals the sum of the individual probabilities of those events, i.e. \(P(A \cap B) + P(A \cup B) = P(A) + P(B)\).
The sum of the probabilities of an event and its complement must equal 1, i.e. say for an event A, \(P(A) + P(A') = 1\) or \(P(A) + P(A^c) = 1\), depending upon how one writes/denotes the complement.
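A quick check of both rules with a fair die (an example of my own): let \(A\) be ‘the roll is even’ and \(B\) be ‘the roll is at most 3’. Then \(P(A) = P(B) = \frac{1}{2}\), \(P(A \cap B) = P(\{2\}) = \frac{1}{6}\), and \(P(A \cup B) = P(\{1, 2, 3, 4, 6\}) = \frac{5}{6}\), so \(P(A \cap B) + P(A \cup B) = \frac{1}{6} + \frac{5}{6} = 1 = P(A) + P(B)\); likewise \(P(A) + P(A') = \frac{1}{2} + \frac{1}{2} = 1\).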
Conditional probability gives the measure of an event occurring, given that another event has already occurred. Considering two events ‘A’ and ‘B’, the probability of A given B is given by \(P(A \mid B) = \frac{P(A \cap B)}{P(B)}\), and conversely, the probability of B given A is given by \(P(B \mid A) = \frac{P(B \cap A)}{P(A)}\).
If A and B are independent (unrelated events), the probability of them occurring together (intersection) becomes the product of their respective probabilities, and hence each conditional probability is simply the probability of the event itself (knowing that the other event has already occurred changes nothing); i.e. \(P(A \cap B) = P(A) \cdot P(B) \Rightarrow P(A \mid B) = P(A)\) and \(P(B \mid A) = P(B)\).
If A and B are disjoint (mutually exclusive), they cannot occur together - \(A \cap B = \emptyset\) - so the probability of their intersection is 0, and so are their conditional probabilities; i.e. \(P(A \cap B) = 0 \Rightarrow P(A \mid B) = P(B \mid A) = 0\).
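All three cases can be checked by brute-force enumeration; here’s a sketch over two fair dice (the event definitions are mine, purely for illustration):

```python
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
space = list(product(range(1, 7), repeat=2))

def prob(event) -> float:
    # P(E) = (favorable outcomes) / (total outcomes), all equally likely.
    return sum(1 for o in space if event(o)) / len(space)

def cond_prob(a, b) -> float:
    # P(A|B) = P(A and B) / P(B)
    return prob(lambda o: a(o) and b(o)) / prob(b)

A = lambda o: o[0] % 2 == 0   # first die even
B = lambda o: o[1] % 2 == 0   # second die even (independent of A)
C = lambda o: o[0] % 2 == 1   # first die odd (disjoint from A)

print(cond_prob(A, B), prob(A))   # independent: both print 0.5
print(cond_prob(A, C))            # disjoint: prints 0.0
```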
For discrete random variables:
- The expected value (weighted average of the possible values, where the weights are the proportions with which those values occur) and the variance (weighted average of the squared deviations from that mean) are respectively given by: \(\mu=E[S] = \sum_{\textrm{possible }s}\;s\cdot P\left(S=s\right)\) and \(\sigma^{2}=\mathrm{Var}[S] = \sum_{\textrm{possible }s}\;(s-\mu)^{2}\cdot P\left(S=s\right)\) (both are computed in the sketch after this list)
- Their probabilities are additive across distinct outcomes, e.g. \(P(S \in \{1, 2\}) = P(S=1) + P(S=2)\)
- There exists a probability mass function, which describes how the probability is spread across the possible outcomes.
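As promised above, a small sketch computing both quantities for a fair six-sided die (my own running example):

```python
# PMF of a fair die: each face occurs with probability 1/6.
pmf = {s: 1 / 6 for s in range(1, 7)}

# Weighted average of the values -> the expected value.
mu = sum(s * p for s, p in pmf.items())                 # 3.5
# Weighted average of the squared deviations -> the variance.
var = sum((s - mu) ** 2 * p for s, p in pmf.items())    # ~2.9167

print(mu, var)
```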
For instance, the probability mass function for a:
- Binomial random variable \(X\) taken from \(n\) trials, each with probability of success \(\pi\), is: \(P(X=x)=\frac{n!}{x!(n-x)!}\;\pi^{x}\;(1-\pi)^{n-x}\)
- Poisson random variable for some count variable \(Y\) with expected number of events per unit of time/space \(\lambda\) (\(e\) is a constant with a value \(2.718\dots\)) is: \(P(Y=y)=\frac{\lambda^{y}e^{-\lambda}}{y!}\) (both PMFs are implemented in the sketch below)
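Both PMFs translate directly from the formulas above; a sketch using only the standard library (the parameter values in the checks are arbitrary picks of mine):

```python
from math import comb, exp, factorial

def binomial_pmf(x: int, n: int, pi: float) -> float:
    # n!/(x!(n-x)!) * pi^x * (1-pi)^(n-x)
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

def poisson_pmf(y: int, lam: float) -> float:
    # lambda^y * e^(-lambda) / y!
    return lam**y * exp(-lam) / factorial(y)

# Sanity check: each PMF sums to 1 over its support
# (the Poisson support is truncated here, so the sum is ~1).
print(sum(binomial_pmf(x, 10, 0.3) for x in range(11)))   # 1.0
print(sum(poisson_pmf(y, 4.0) for y in range(100)))       # ~1.0
```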
On the other hand, for continuous random variables there exists a probability density function instead - a function whose value at any given sample in the sample space can be interpreted as the relative likelihood that the random variable would take a value close to that sample.
The probability density function for the:
- Uniform (the continuous analog of the discrete uniform) distribution is: \(f(x)=\frac{1}{b-a}\) for \(a \le x \le b\), and \(0\) elsewhere
- Standard uniform distribution (the special case with \(a=0\), \(b=1\)) is: \(f(x)=1\) for \(0 \le x \le 1\)
- Exponential (the continuous analog of the Poisson, modeling the waiting time between events) distribution is: \(f(x)=\lambda e^{-\lambda x}\) for \(x \ge 0\)
- Normal distribution is: \(f(x)=\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right]\) (all four densities are implemented in the sketch below)
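And the densities themselves, as a sketch (the parameter names \(a\), \(b\), \(\lambda\), \(\mu\), \(\sigma\) are mine; the crude Riemann sum at the end just illustrates that a density integrates - rather than sums - to 1):

```python
from math import exp, pi, sqrt

def uniform_pdf(x: float, a: float = 0.0, b: float = 1.0) -> float:
    # 1/(b-a) on [a, b]; the defaults a=0, b=1 give the standard uniform.
    return 1 / (b - a) if a <= x <= b else 0.0

def exponential_pdf(x: float, lam: float = 1.0) -> float:
    # lambda * e^(-lambda * x) for x >= 0.
    return lam * exp(-lam * x) if x >= 0 else 0.0

def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sqrt(2 * pi) * sigma)

# Crude numerical check: the standard normal density integrates to ~1.
dx = 0.001
print(sum(normal_pdf(-10 + i * dx) * dx for i in range(20_001)))  # ~1.0
```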
Will update later
Anirban | 12/30/2019 |