Denoted $2^{\Omega}$, a power set consists of all subsets of $\Omega$.
Its cardinality is given by $|2^{\Omega}| = 2^{|\Omega|}$
Def. Sigma Algebra/Field
A sigma algebra $\mathcal{A}$ on $\Omega$ is a set (containing subsets of $\Omega$) that:
contains the null set: $\emptyset \in \mathcal{A}$
is closed under countable unions: $A_1, A_2, \ldots \in \mathcal{A} \Rightarrow \bigcup_{i=1}^{\infty} A_i \in \mathcal{A}$
is closed under complementation: $A \in \mathcal{A} \Rightarrow A^c \in \mathcal{A}$
Def. Probability Measure
A probability measure defined on a set $\Omega$ with sigma algebra $\mathcal{A}$ is a function $P : \mathcal{A} \to [0, 1]$ with the following properties:
normed: $P(\Omega) = 1$
countably additive:
$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$, for mutually disjoint $A_i \in \mathcal{A}$
To apply countable additivity to a finite disjoint collection, pad it with infinitely many null sets; finite additivity then follows.
Many sample spaces are infinite sets, and a probability measure $P$ cannot always be defined for every subset in $2^{\Omega}$. We thus restrict the domain of $P$ to a sub-collection $\mathcal{A} \subset 2^{\Omega}$.
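For intuition, here is a minimal sketch (an assumed finite example, not from the notes) that builds $2^{\Omega}$ and checks the sigma-algebra axioms directly; on a finite $\Omega$, closure under countable unions reduces to closure under pairwise unions.

```python
# Minimal sketch (assumed example): the power set of a finite Omega, and a
# direct check of the sigma-algebra axioms. Sets are encoded as frozensets.
from itertools import combinations

def power_set(omega):
    """All subsets of omega; |2^Omega| = 2^|Omega|."""
    elems = list(omega)
    return {frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)}

def is_sigma_algebra(A, omega):
    """Check: contains the null set, closed under complements and unions."""
    omega = frozenset(omega)
    if frozenset() not in A:
        return False
    if any(omega - s not in A for s in A):           # complementation
        return False
    if any(s | t not in A for s in A for t in A):    # unions (pairwise suffices here)
        return False
    return True

omega = {1, 2, 3}
print(len(power_set(omega)))                      # 8 = 2^3
print(is_sigma_algebra(power_set(omega), omega))  # True: 2^Omega is a sigma algebra
```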
Prop 1.2.2 (Some Event Must Occur)
If $(\Omega, \mathcal{A}, P)$ is a probability model, then $P(\emptyset) = 0$
Proof
Let $A_i = \emptyset$ for $i = 1, 2, \ldots$, so the $A_i$ are mutually disjoint, and $\bigcup_{i=1}^{\infty} A_i = \emptyset$
(Contradiction) Suppose that $P(\emptyset) = p > 0$, then $P(\emptyset) = P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} p = \infty$, which contradicts $P(\emptyset) \le 1$, so $P(\emptyset) = 0$
Lecture 2
Hierarchy: elements ($\omega \in \Omega$) -> sets of elements (events, or $A \subset \Omega$) -> sigma algebras ($\mathcal{A} \subset 2^{\Omega}$) -> Borel sets ($\mathcal{B}$)
Prop 1.3.1 (Intersection of Sigma Algebras)
If $\{\mathcal{A}_{\lambda} : \lambda \in \Lambda\}$ is a family/set of $\sigma$-algebras on $\Omega$, then $\bigcap_{\lambda \in \Lambda} \mathcal{A}_{\lambda}$ is a $\sigma$-algebra on $\Omega$
Proof
$\bigcap_{\lambda \in \Lambda} \mathcal{A}_{\lambda}$ must have the properties of a $\sigma$-algebra:
Each $\mathcal{A}_{\lambda}$ contains $\emptyset$ and is closed under unions and complementation, and these properties pass to any set belonging to every $\mathcal{A}_{\lambda}$. Since the intersection contains the null set, is closed under unions and complementation, it is a $\sigma$-algebra.
Def. Sigma Algebra Generated by C
$\sigma(\mathcal{C})$ is obtained by intersecting all $\sigma$-algebras containing $\mathcal{C}$. It is thus the smallest $\sigma$-algebra on $\Omega$ containing all subsets in $\mathcal{C}$.
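A brute-force sketch (hypothetical example; `omega` and `C` are assumptions) that computes $\sigma(\mathcal{C})$ exactly as defined, by intersecting every $\sigma$-algebra on a small finite $\Omega$ that contains $\mathcal{C}$:

```python
# Sketch (assumed example): sigma(C) on a finite Omega, computed by
# intersecting all sigma algebras that contain C.
from itertools import combinations

omega = frozenset({1, 2, 3})
subsets = [frozenset(c) for r in range(4) for c in combinations(omega, r)]

def is_sigma_algebra(A):
    return (frozenset() in A
            and all(omega - s in A for s in A)
            and all(s | t in A for s in A for t in A))

C = {frozenset({1})}  # generate from the single set {1}
# Enumerate all 2^8 collections of subsets; keep sigma algebras containing C.
candidates = [set(col) for r in range(len(subsets) + 1)
              for col in combinations(subsets, r)
              if C <= set(col) and is_sigma_algebra(set(col))]
sigma_C = set.intersection(*candidates)
print(sorted(tuple(sorted(s)) for s in sigma_C))
# -> [(), (1,), (1, 2, 3), (2, 3)]: the smallest sigma algebra containing {1}
```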
Def. Borel Set
$\mathcal{B}$ is the $\sigma$-algebra generated by the open sets. Formally: $\mathcal{B} = \sigma(\{B \subset \mathbb{R} : B \text{ open}\})$
It is the smallest $\sigma$-algebra on $\mathbb{R}$ containing all rectangles of the form $(a, b]$ where $a \le b$
$\sigma(\{(a, b] : a \le b\}) \subset \mathcal{B}$, since $\mathcal{B}$ contains all such rectangles. $\mathcal{B} \ne 2^{\mathbb{R}}$, since there is a subset of $\mathbb{R}$ that is not a Borel set.
Loosely speaking, any set that can be defined explicitly is a Borel set. (Nice) transformations of Borel sets are also Borel sets.
Def. Ellipsoidal Region
A ball of radius $r$ centered at $\mu \in \mathbb{R}^k$ is given by $B_r(\mu) = \{x \in \mathbb{R}^k : (x - \mu)'(x - \mu) \le r^2\}$
The set that forms its boundary is denoted $\partial B_r(\mu)$, and obtained by replacing $\le$ with $=$.
Applying an affine transformation $x \mapsto Ax + \mu$ on $B_r(0)$, where $A$ is an invertible matrix,
we obtain an ellipsoidal region centered at $\mu$, whose axes and orientation are determined by $A$ and $r$
Recall: A matrix $A$ is…
- symmetric if $A' = A$
- invertible if $\det(A) \ne 0$ (the 0 matrix is not invertible)
- positive definite if $w'Aw > 0$ for all $w \ne 0$
$AA'$ is…
symmetric since $(AA')' = (A')'A' = AA'$
invertible since $\det(AA') = \det(A)\det(A') = (\det A)^2 \ne 0$
positive definite since for any $w \ne 0$, $w'AA'w = (A'w)'(A'w) = \|A'w\|^2 \ge 0$,
and $\|A'w\|^2 = 0$ iff $A'w = 0$ iff $w = 0$, since $A$ is invertible (cannot be the 0 matrix) and $A'$ is invertible (the transpose of an invertible matrix is invertible)
Note for the multivariate normal, $\mu$ is the mean vector and $\Sigma = AA'$ is the variance matrix.
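A quick numeric illustration (assumed example using numpy; the matrix is arbitrary) of the three properties of $\Sigma = AA'$:

```python
# Numeric sketch (assumed example): for an invertible A, Sigma = A A' is
# symmetric, invertible, and positive definite.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))       # generically invertible
Sigma = A @ A.T

print(np.allclose(Sigma, Sigma.T))            # symmetric: (AA')' = AA'
print(abs(np.linalg.det(Sigma)) > 1e-10)      # invertible: det(AA') = det(A)^2 != 0
w = rng.standard_normal(3)
print(w @ Sigma @ w > 0)                      # w'AA'w = ||A'w||^2 > 0 for w != 0
print(np.linalg.eigvalsh(Sigma).min() > 0)    # equivalently, all eigenvalues positive
```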
Lecture 3
Def. Limit inferior/superior of a Sequence
For a sequence of sets $A_1, A_2, \ldots$:
$\liminf_{n \to \infty} A_n = \bigcup_{i=1}^{\infty} \bigcap_{j=i}^{\infty} A_j$: $\omega \in \liminf_n A_n$ iff $\omega$ is a member of at least one of the intersections $\bigcap_{j=i}^{\infty} A_j$
$\limsup_{n \to \infty} A_n = \bigcap_{i=1}^{\infty} \bigcup_{j=i}^{\infty} A_j$: $\omega \in \limsup_n A_n$ iff $\omega$ is a member of all the unions $\bigcup_{j=i}^{\infty} A_j$
Properties:
If $\liminf_n A_n = \limsup_n A_n = A$, then $\lim_{n \to \infty} A_n = A$
Monotone Increasing/Decreasing Sequences
$B_i = \bigcap_{j=i}^{\infty} A_j$ is an increasing sequence of sets (as $i$ increases, fewer sets are intersected, so the resulting intersection gets bigger)
$C_i = \bigcup_{j=i}^{\infty} A_j$ is a decreasing sequence of sets (as $i$ increases, fewer sets are unioned, so the resulting union gets smaller)
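A small sketch (assumed example) computing $\liminf$ and $\limsup$ for the alternating sequence $A_n = \{0\}$ ($n$ even), $\{1\}$ ($n$ odd); truncating the tails at $N$ is exact here because the sequence is periodic:

```python
# Sketch (assumed example): approximate liminf/limsup for an alternating
# sequence of sets by truncating the tail at N.
from functools import reduce

def A(n):
    return frozenset({0}) if n % 2 == 0 else frozenset({1})

N = 50
tail = lambda i: [A(j) for j in range(i, N)]
liminf = reduce(frozenset.union,
                (reduce(frozenset.intersection, tail(i)) for i in range(N - 1)))
limsup = reduce(frozenset.intersection,
                (reduce(frozenset.union, tail(i)) for i in range(N - 1)))
print(sorted(liminf), sorted(limsup))  # [] [0, 1]: the sequence does not converge
```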
Prop 1.4.1 (Monotone Sequences Converge)
A monotone decreasing sequence of sets converges to their intersection.
If $A_1 \supset A_2 \supset \cdots$, and $A = \bigcap_{i=1}^{\infty} A_i$, then $\lim_{n \to \infty} A_n = A$
Proof
Need to prove that lim inf = lim sup:
(1) Since $A_1 \supset A_2 \supset \cdots$, we have that $\bigcup_{j=i}^{\infty} A_j = A_i$, so $\limsup_n A_n = \bigcap_{i=1}^{\infty} A_i = A$
(2) Also, $\bigcap_{j=i}^{\infty} A_j = \bigcap_{j=1}^{\infty} A_j = A$ for every $i$, so $\liminf_n A_n = \bigcup_{i=1}^{\infty} A = A$ (if we union the same set over and over again, we get that set)
Optional subproof: $\bigcap_{j=1}^{\infty} A_j \subset \bigcap_{j=i}^{\infty} A_j$, since the intersection of many sets $\subset$ intersection of fewer sets. Other direction: let $\omega \in \bigcap_{j=i}^{\infty} A_j$, so $\omega \in A_i \subset A_{i-1} \subset \cdots \subset A_1$, i.e. $\omega \in \bigcap_{j=1}^{\infty} A_j$. Since they are subsets of each other, $\bigcap_{j=i}^{\infty} A_j = \bigcap_{j=1}^{\infty} A_j$
(1 & 2) Since $\limsup_n A_n = \liminf_n A_n = A$. Hence we have convergence: $\lim_{n \to \infty} A_n = A$
A monotone increasing sequence of sets converges to their union.
If $A_1 \subset A_2 \subset \cdots$, and $A = \bigcup_{i=1}^{\infty} A_i$, then $\lim_{n \to \infty} A_n = A$
Proof
(1) Since $A_1 \subset A_2 \subset \cdots$, we have that $\bigcap_{j=i}^{\infty} A_j = A_i$, so $\liminf_n A_n = \bigcup_{i=1}^{\infty} A_i = A$
(2) Also, $\bigcup_{j=i}^{\infty} A_j = \bigcup_{j=1}^{\infty} A_j = A$ for every $i$, so $\limsup_n A_n = \bigcap_{i=1}^{\infty} A = A$ (intersecting the same set over and over again gives that set)
(1 & 2) Since $\liminf_n A_n = \limsup_n A_n = A$, we have convergence: $\lim_{n \to \infty} A_n = A$.
Prop 1.4.2 (Continuity of P)
If $\{A_n\}$ is monotone (increasing or decreasing) with limit $A$ and $A_n, A \in \mathcal{A}$, then $P(A_n) \to P(A)$ as $n \to \infty$
Note The converse is also true (see Prop 1.4.3)
Proof
By the previous proposition, we know (1) & (2):
(1) If $\{A_n\}$ is a monotone decreasing sequence, it converges to the intersection of the sets, i.e. $A = \bigcap_{i=1}^{\infty} A_i$
(2) If $\{A_n\}$ is a monotone increasing sequence, it converges to the union of the sets, i.e. $A = \bigcup_{i=1}^{\infty} A_i$
By (1) & (2), $A$ is a countable intersection or union of elements of $\mathcal{A}$, and so $A \in \mathcal{A}$
So it remains to show $P(A_n) \to P(A)$:
Suppose $\{A_n\}$ is a monotone increasing sequence, so $A = \bigcup_{i=1}^{\infty} A_i$
Now create mutually disjoint $B_i$ like so: $B_1 = A_1$ and $B_i = A_i \cap A_{i-1}^c$ for $i \ge 2$, such that $\bigcup_{i=1}^{n} B_i = A_n$ and $\bigcup_{i=1}^{\infty} B_i = A$
So $P(A) = \sum_{i=1}^{\infty} P(B_i) = \lim_{n \to \infty} \sum_{i=1}^{n} P(B_i) = \lim_{n \to \infty} P(A_n)$
Suppose $\{A_n\}$ is a monotone decreasing sequence, so $\{A_n^c\}$ is monotone increasing (with limit $A^c$).
Hence $P(A_n) = 1 - P(A_n^c) \to 1 - P(A^c) = P(A)$
Prop 1.4.3 (Prob Measure on a Sigma Algebra)
$P$ is a probability measure on $\mathcal{A}$ if and only if $P$ satisfies
(1) $P(\Omega) = 1$
(2) $P$ is finitely additive
(3) $P(A_n) \to 0$ as $n \to \infty$ whenever $A_1 \supset A_2 \supset \cdots$ and $\bigcap_{n=1}^{\infty} A_n = \emptyset$
Proof
(1) and (2) are contained in the def of probability measure (normed and countably additive), and (3) follows from continuity (Prop 1.4.2)
Combining additivity (2) with continuity (3), we have that P is countably additive:
(3) can also be written as: $P(A_n) \downarrow 0$ whenever $A_n \downarrow \emptyset$
Let $A = \bigcup_{i=1}^{\infty} A_i$, where the $A_i \in \mathcal{A}$ are mutually disjoint.
Then $B_n = \bigcup_{i=1}^{n} A_i$ is a monotone increasing sequence of events with $A \cap B_n^c \downarrow \emptyset$
Since $P(A) = P(B_n) + P(A \cap B_n^c) = \sum_{i=1}^{n} P(A_i) + P(A \cap B_n^c)$ by (2), letting $n \to \infty$ and applying (3) gives $P(A) = \sum_{i=1}^{\infty} P(A_i)$
So continuity + finite additivity $\Rightarrow$ countable additivity
Important Note Countable additivity $\Leftrightarrow$ continuity of P (given (1) and (2)). By ensuring countable additivity, we ensure continuity of P, which is needed when we have an infinite sample space.
Def. Conditional Probability Model
If $(\Omega, \mathcal{A}, P)$ is a probability model and $C \in \mathcal{A}$ has $P(C) > 0$, then the conditional probability model given $C$ is $(\Omega, \mathcal{A}, P(\cdot \mid C))$, where $P(\cdot \mid C)$ is given by $P(A \mid C) = \frac{P(A \cap C)}{P(C)}$
Proof
(1) $P(\Omega \mid C) = \frac{P(\Omega \cap C)}{P(C)} = \frac{P(C)}{P(C)} = 1$
(2) If $A_1, A_2, \ldots \in \mathcal{A}$ are mutually disjoint,
then $P\left(\bigcup_{i=1}^{\infty} A_i \mid C\right) = \frac{P\left(\bigcup_{i=1}^{\infty} (A_i \cap C)\right)}{P(C)} = \sum_{i=1}^{\infty} \frac{P(A_i \cap C)}{P(C)} = \sum_{i=1}^{\infty} P(A_i \mid C)$
Since $P(\cdot \mid C)$ is normed and countably additive, $(\Omega, \mathcal{A}, P(\cdot \mid C))$ is a probability model.
Note The model can also be presented as $(C, \{A \cap C : A \in \mathcal{A}\}, P(\cdot \mid C))$
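A sketch (assumed fair-die example) checking that $P(\cdot \mid C)$ is normed and additive:

```python
# Sketch (assumed example): the conditional measure P(. | C) on a fair die.
from fractions import Fraction

omega = range(1, 7)
P = lambda A: Fraction(len(set(A) & set(omega)), 6)   # uniform measure

C = {2, 4, 6}                                         # condition: "even"
P_given_C = lambda A: P(set(A) & C) / P(C)

print(P_given_C(omega) == 1)                          # normed
A1, A2 = {2}, {3, 4}
print(P_given_C(A1 | A2) == P_given_C(A1) + P_given_C(A2))  # additive (disjoint)
print(P_given_C({2}))                                 # 1/3
```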
Prop 1.5.1 (LOTP / Thm of Total Prob.)
Suppose $C_1, C_2, \ldots \in \mathcal{A}$ form a partition of $\Omega$ with $P(C_i) > 0$ for each $i$; then for any $A \in \mathcal{A}$, $P(A) = \sum_{i} P(A \mid C_i) P(C_i)$
Proof
Since $A = \bigcup_i (A \cap C_i)$ where the $A \cap C_i$ are mutually disjoint, $P(A) = \sum_i P(A \cap C_i) = \sum_i P(A \mid C_i) P(C_i)$
Fact If $\{C_i\}$ is a partition of $\Omega$, then $A = \bigcup_i (A \cap C_i)$ and the sets $A \cap C_i$ are mutually disjoint
Proof
Since $C_i \cap C_j = \emptyset$ when $i \ne j$, we have $(A \cap C_i) \cap (A \cap C_j) = \emptyset$, and $\bigcup_i (A \cap C_i) = A \cap \left(\bigcup_i C_i\right) = A \cap \Omega = A$ (also $A \cap C_i \in \mathcal{A}$)
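A numeric check of LOTP (assumed example) on the fair die with partition $C_1 = \{1,2\}$, $C_2 = \{3,4\}$, $C_3 = \{5,6\}$:

```python
# Sketch (assumed example): law of total probability on a fair die.
from fractions import Fraction

P = lambda A: Fraction(len(A), 6)
parts = [{1, 2}, {3, 4}, {5, 6}]
A = {2, 3, 5}

lhs = P(A)
rhs = sum(P(A & C) / P(C) * P(C) for C in parts)  # sum of P(A|C_i) P(C_i)
print(lhs == rhs == Fraction(1, 2))
```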
Lecture 4
Def. Statistically Independent
If $(\Omega, \mathcal{A}, P)$ is a probability model and $A, C \in \mathcal{A}$, then $A$ and $C$ are statistically independent if $P(A \cap C) = P(A)P(C)$
It follows that when $P(C) > 0$, $P(A \mid C) = P(A)$
Statistically Independent Sigma Algebras
$A$ and $B$ are statistically independent as $\sigma$-algebras if every element of the $\sigma$-algebra generated by $A$ is statistically independent of every element of the $\sigma$-algebra generated by B: here $\sigma(\{A\}) = \{\emptyset, A, A^c, \Omega\}$ and $\sigma(\{B\}) = \{\emptyset, B, B^c, \Omega\}$
Proof
$A$ and $B^c$ are statistically independent since $P(A \cap B^c) = P(A) - P(A \cap B) = P(A) - P(A)P(B)$, and so $P(A \cap B^c) = P(A)(1 - P(B)) = P(A)P(B^c)$
$A^c$ and $B$ are statistically independent since $P(A^c \cap B) = P(B) - P(A \cap B) = P(B) - P(A)P(B)$, and so $P(A^c \cap B) = P(A^c)P(B)$
$A^c$ and $B^c$ are statistically independent since $P(A^c \cap B^c) = P(A^c) - P(A^c \cap B) = P(A^c) - P(A^c)P(B)$, and so $P(A^c \cap B^c) = P(A^c)P(B^c)$
The remaining pairs of nontrivial elements are statistically independent in the same vein.
$\emptyset$ and $\Omega$ are statistically independent of every event since $P(\emptyset \cap C) = 0 = P(\emptyset)P(C)$ and $P(\Omega \cap C) = P(C) = P(\Omega)P(C)$
Def. Mutually Statistically Independent
When $(\Omega, \mathcal{A}, P)$ is a probability model and $\{\mathcal{A}_{\lambda} : \lambda \in \Lambda\}$ is a collection of sub-$\sigma$-algebras of $\mathcal{A}$, the $\mathcal{A}_{\lambda}$ are mutually statistically independent if $P(A_{\lambda_1} \cap \cdots \cap A_{\lambda_n}) = P(A_{\lambda_1}) \cdots P(A_{\lambda_n})$, where $\lambda_1, \ldots, \lambda_n \in \Lambda$ are distinct, and $A_{\lambda_i} \in \mathcal{A}_{\lambda_i}$.
Notes
Pairwise Independence $\nRightarrow$ Mutual Independence
i.e. $P(A_i \cap A_j) = P(A_i)P(A_j)$ for all pairs $i \ne j$ does not guarantee $P(A_1 \cap \cdots \cap A_n) = P(A_1) \cdots P(A_n)$; see the counterexample sketched below.
Without pairwise independence, mutual independence cannot hold (pairs are among the required sub-collections)
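The standard counterexample (a sketch, not from the notes): two fair coin flips with A = first is heads, B = second is heads, C = exactly one head. The three events are pairwise independent but not mutually independent.

```python
# Classic counterexample (assumed example): two fair coin flips.
from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=2))
P = lambda E: Fraction(len(E), len(omega))

A = {w for w in omega if w[0] == "H"}
B = {w for w in omega if w[1] == "H"}
C = {w for w in omega if (w[0] == "H") != (w[1] == "H")}

# Pairwise independent:
print(all(P(X & Y) == P(X) * P(Y) for X, Y in [(A, B), (A, C), (B, C)]))  # True
# ... but not mutually independent:
print(P(A & B & C), P(A) * P(B) * P(C))  # 0 vs 1/8
```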
Union of 3 events (Inclusion-Exclusion Principle): $P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$
Proof
Generalized to n events: $P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i} P(A_i) - \sum_{i < j} P(A_i \cap A_j) + \cdots + (-1)^{n+1} P(A_1 \cap \cdots \cap A_n)$
Proof
Base The result is true for n=2: $P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2)$
I.H. Assume it's true for n
Consider $P\left(\bigcup_{i=1}^{n+1} A_i\right) = P\left(\bigcup_{i=1}^{n} A_i\right) + P(A_{n+1}) - P\left(\bigcup_{i=1}^{n} (A_i \cap A_{n+1})\right)$ (by the base case)
Combining the above (expanding both $n$-fold unions with the I.H.), we have the result for $n + 1$ events
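A direct numeric verification (assumed example) of inclusion-exclusion for n = 3 under the uniform measure on $\{0, \ldots, 9\}$:

```python
# Sketch (assumed example): verify inclusion-exclusion for three events.
from fractions import Fraction

P = lambda E: Fraction(len(E), 10)
A, B, C = {0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 7, 9}

lhs = P(A | B | C)
rhs = (P(A) + P(B) + P(C)
       - P(A & B) - P(A & C) - P(B & C)
       + P(A & B & C))
print(lhs == rhs)  # True
```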
Intersection of 3 events: $P(A \cap B \cap C) = P(A)\, P(B \mid A)\, P(C \mid A \cap B)$ (when the conditional probabilities are defined)
Proof
Generalized to n events: $P(A_1 \cap \cdots \cap A_n) = P(A_1)\, P(A_2 \mid A_1) \cdots P(A_n \mid A_1 \cap \cdots \cap A_{n-1})$
2. Random Variables and Stochastic Processes
Lecture 5
Motivation If we have a population $\Omega$, a measurement of some sort $X : \Omega \to \mathbb{R}$, and we want to assign probabilities to events such as $\{X \le b\}$ or $\{X \in B\}$, the probabilities live on $\mathbb{R}$ instead of $\Omega$, which is difficult to handle directly. To navigate this, we use inverse images.
Def. Inverse Image
Under the function $X : \Omega \to \mathbb{R}$, the inverse image of the set $B \subset \mathbb{R}$ is given by $X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\}$
By $\subset$ and $\supset$, we have $X^{-1}\left(\bigcup_i B_i\right) = \bigcup_i X^{-1}(B_i)$, since they are subsets of each other
Proof for Complements
Let $\omega \in X^{-1}(B^c)$, then $X(\omega) \in B^c$, i.e. $X(\omega) \notin B$
So $\omega \notin X^{-1}(B)$, i.e. $\omega \in (X^{-1}(B))^c$, giving $X^{-1}(B^c) \subset (X^{-1}(B))^c$
Suppose $\omega \in (X^{-1}(B))^c$, then $X(\omega) \notin B$, i.e. $X(\omega) \in B^c$
So $\omega \in X^{-1}(B^c)$, giving $(X^{-1}(B))^c \subset X^{-1}(B^c)$
By $\subset$ and $\supset$, $X^{-1}(B^c) = (X^{-1}(B))^c$
Property If $B_1 \cap B_2 = \emptyset$, then $X^{-1}(B_1)$ and $X^{-1}(B_2)$ are also disjoint.
Proof
Suppose $\omega \in X^{-1}(B_1) \cap X^{-1}(B_2)$, then $X(\omega) \in B_1 \cap B_2 = \emptyset$, a contradiction
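A sketch (assumed example: $X(\omega) = \omega \bmod 3$ on $\Omega = \{0, \ldots, 8\}$) illustrating that inverse images respect complements, unions, and disjointness:

```python
# Sketch (assumed example): inverse images commute with set operations.
omega = set(range(9))
X = lambda w: w % 3
inv = lambda B: {w for w in omega if X(w) in B}

B = {0, 1}
print(inv({0, 1, 2} - B) == omega - inv(B))   # X^{-1}(B^c) = (X^{-1}(B))^c
B1, B2 = {0}, {2}                              # disjoint in the range
print(inv(B1) & inv(B2) == set())              # preimages are disjoint too
print(inv(B1 | B2) == inv(B1) | inv(B2))       # unions are preserved
```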
Def. Random Variable
A random variable is a function $X : \Omega \to \mathbb{R}$ with the property that for any $B \in \mathcal{B}$ (i.e. Borel set in $\mathbb{R}$), $X^{-1}(B) \in \mathcal{A}$.
Thus, when X is a random variable, $P(X \in B) = P(X^{-1}(B))$ is defined for every $B \in \mathcal{B}$
Prop 2.1.1 (Marginal Probability Measure)
When X is a r.v., the marginal probability measure of X is $P_X$, which is defined on $(\mathbb{R}, \mathcal{B})$ by $P_X(B) = P(X^{-1}(B))$
Proof
Normed: $P_X(\mathbb{R}) = P(X^{-1}(\mathbb{R})) = P(\Omega) = 1$
Countably additive: If $B_1, B_2, \ldots$ are mutually disjoint elements of $\mathcal{B}$, then $P_X\left(\bigcup_i B_i\right) = P\left(X^{-1}\left(\bigcup_i B_i\right)\right) = P\left(\bigcup_i X^{-1}(B_i)\right) = \sum_i P(X^{-1}(B_i)) = \sum_i P_X(B_i)$
Note The probability model for a random variable X is $(\mathbb{R}, \mathcal{B}, P_X)$
Prop 2.1.2 (Determine whether X is a random variable)
If $X^{-1}((-\infty, b]) \in \mathcal{A}$ for every $b \in \mathbb{R}$, then $X$ is a random variable.
Proof
Let $\mathcal{B}' = \{B \subset \mathbb{R} : X^{-1}(B) \in \mathcal{A}\}$
Since $X^{-1}(\emptyset) = \emptyset$ and $\emptyset \in \mathcal{A}$, we know $\emptyset \in \mathcal{B}'$
If $B \in \mathcal{B}'$, then $X^{-1}(B) \in \mathcal{A}$
Since $X^{-1}(B^c) = (X^{-1}(B))^c$ and $\mathcal{A}$ is closed under complementation, we know $B^c \in \mathcal{B}'$
If $B_1, B_2, \ldots \in \mathcal{B}'$, then $X^{-1}(B_i) \in \mathcal{A}$ for each $i$
Since $X^{-1}\left(\bigcup_i B_i\right) = \bigcup_i X^{-1}(B_i)$ and $\mathcal{A}$ is closed under unions, we know $\bigcup_i B_i \in \mathcal{B}'$
By 1 (contains null set), 2 (closed under comp), & 3 (closed under union), we know $\mathcal{B}'$ is a sub-$\sigma$-algebra of $2^{\mathbb{R}}$
By hypothesis, $(-\infty, b] \in \mathcal{B}'$ for every $b$, so $\mathcal{B} \subset \mathcal{B}'$, since $\mathcal{B}$ is the smallest $\sigma$-algebra containing all such intervals
Hence $X^{-1}(B) \in \mathcal{A}$ for every $B \in \mathcal{B}$, so X is a random variable.
Examples
$X(\omega) = c$ (a constant) is a r.v. since for any $b$, $X^{-1}((-\infty, b])$ is $\emptyset$ (if $b < c$) or $\Omega$ (if $b \ge c$)
$X = I_A$ with $A \in \mathcal{A}$ is a r.v. since for any $b$, $X^{-1}((-\infty, b])$ is $\emptyset$, $A^c$, or $\Omega$
$X(\omega) = \omega$ (with $\Omega = \mathbb{R}$, $\mathcal{A} = \mathcal{B}$) is a r.v. since for any $b$, $X^{-1}((-\infty, b]) = (-\infty, b] \in \mathcal{B}$
A piecewise-defined $X$ is a r.v. whenever for any b, $X^{-1}((-\infty, b])$ can be written out explicitly,
with one form of the preimage if $\lfloor b \rfloor$ is even, and
another form if $\lfloor b \rfloor$ is odd
$X(\omega) = \omega_i$ (projection on the ith coordinate, with $\Omega = \mathbb{R}^k$, $\mathcal{A} = \mathcal{B}^k$) is a r.v. since for any b, $X^{-1}((-\infty, b]) = \mathbb{R}^{i-1} \times (-\infty, b] \times \mathbb{R}^{k-i}$
Also, $X^{-1}((-\infty, b])$ is closed (hence Borel) when X is continuous on $\mathbb{R}^k$, so it must be a r.v.
Note When $\mathcal{A} = 2^{\Omega}$, then any $X : \Omega \to \mathbb{R}$ is a random variable.
Prop 2.1.3 (Sum & Prod of R.V.s are R.V.s)
If X, Y are random variables defined on $(\Omega, \mathcal{A}, P)$, then (1) W = X+Y and (2) W = XY are both random variables.
Proof of (1) W = X + Y
Suppose $W(\omega) = X(\omega) + Y(\omega) > b$
Let $q \in \mathbb{Q}$ be such that $b - Y(\omega) < q < X(\omega)$; then $\exists q \in \mathbb{Q}$ such that $X(\omega) > q$ and $Y(\omega) > b - q$,
We can take the intersection to get that $\omega \in X^{-1}((q, \infty)) \cap Y^{-1}((b - q, \infty))$
We can express the set of all such $\omega$ as $W^{-1}((b, \infty)) = \bigcup_{q \in \mathbb{Q}} \left[X^{-1}((q, \infty)) \cap Y^{-1}((b - q, \infty))\right]$, so
Since $\mathbb{Q}$ is countable, and $W^{-1}((b, \infty))$ is a countable union of elements of $\mathcal{A}$, we have that $W^{-1}((b, \infty)) \in \mathcal{A}$
Then $W^{-1}((-\infty, b]) = \left(W^{-1}((b, \infty))\right)^c \in \mathcal{A}$ for every $b$, so $W$ is a r.v.
Proof of (2) W = XY
Suppose b = 0, then $W^{-1}((0, \infty)) = \left[X^{-1}((0, \infty)) \cap Y^{-1}((0, \infty))\right] \cup \left[X^{-1}((-\infty, 0)) \cap Y^{-1}((-\infty, 0))\right] \in \mathcal{A}$
Suppose b > 0, then $xy > b$ forces $x$ and $y$ to share a sign, so $\{W > b\}$ splits into a part where $X, Y > 0$ and a part where $X, Y < 0$ (the two branches of the hyperbola $xy = b$)
We've shown how $\{W > b\}$ decomposes, so we just need to show the other part: that each branch's event is in $\mathcal{A}$.
Since xy=b is symmetrical over the line y=-x, proving the argument for one of the two branches will suffice.
Suppose $X(\omega) > 0$ and $X(\omega)Y(\omega) > b$, and let $q \in \mathbb{Q}$ be such that $b / Y(\omega) < q < X(\omega)$; then $X(\omega) > q > 0$ and $Y(\omega) > b/q$,
so $\{W > b, X > 0\} = \bigcup_{q \in \mathbb{Q},\, q > 0} \left[X^{-1}((q, \infty)) \cap Y^{-1}((b/q, \infty))\right] \in \mathcal{A}$, since $\mathbb{Q}$ is countable.
Since both branches' events are in $\mathcal{A}$, $W^{-1}((b, \infty)) \in \mathcal{A}$ for b > 0
A similar argument holds for b < 0. For any b, $W^{-1}((b, \infty)) \in \mathcal{A}$, so W=XY is a r.v.
E.g. $W = a_0 + a_1 X + \cdots + a_n X^n$ (a polynomial in X) is a r.v. if X is a r.v.:
Any constant function is a r.v., so all $a_i$ are r.v.'s.
The product of r.v.'s is a r.v., so all $a_i X^i$ are r.v.'s
The sum of r.v.'s is a r.v., so $W$ is a r.v.
Prop 2.1.4 (Sigma Algebra generated by X)
When X is a random variable, $\mathcal{A}_X = \{X^{-1}(B) : B \in \mathcal{B}\}$ is a sub-$\sigma$-algebra of $\mathcal{A}$, called the $\sigma$-algebra on $\Omega$ generated by X.
Alternative notation: $\sigma(X)$
Proof
If $A_1, A_2, \ldots \in \mathcal{A}_X$, then $\exists B_1, B_2, \ldots \in \mathcal{B}$ such that $A_i = X^{-1}(B_i)$.
So $\bigcup_i A_i = \bigcup_i X^{-1}(B_i) = X^{-1}\left(\bigcup_i B_i\right) \in \mathcal{A}_X$ (since $\bigcup_i B_i \in \mathcal{B}$)
If $A \in \mathcal{A}_X$, then $\exists B \in \mathcal{B}$ such that $A = X^{-1}(B)$.
So $A^c = (X^{-1}(B))^c = X^{-1}(B^c) \in \mathcal{A}_X$ (since $B^c \in \mathcal{B}$)
Also $\emptyset = X^{-1}(\emptyset) \in \mathcal{A}_X$. By 1 (contains null), 2 (closed under unions), & 3 (closed under complementation), $\mathcal{A}_X$ is a sub-$\sigma$-algebra of $\mathcal{A}$
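A sketch (assumed example) constructing $\mathcal{A}_X$ on a finite space, where every subset of the (finite) range stands in for a Borel set:

```python
# Sketch (assumed example): A_X = {X^{-1}(B) : B in B} on a finite space.
from itertools import combinations

omega = frozenset(range(6))
X = lambda w: w % 2                      # range {0, 1}
rng_vals = sorted({X(w) for w in omega})
borel = [frozenset(c) for r in range(len(rng_vals) + 1)
         for c in combinations(rng_vals, r)]

A_X = {frozenset(w for w in omega if X(w) in B) for B in borel}
print(sorted(tuple(sorted(s)) for s in A_X))
# -> [(), (0, 1, 2, 3, 4, 5), (0, 2, 4), (1, 3, 5)]: a sub sigma algebra of 2^Omega
```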
Def. Random Vector
Recall
A random variable is a function $X : \Omega \to \mathbb{R}$ with the property that for any $B \in \mathcal{B}$, $X^{-1}(B) \in \mathcal{A}$.
Thus, when X is a random variable, $P_X(B) = P(X^{-1}(B))$ is well-defined, since $X^{-1}(B) \in \mathcal{A}$
A random vector is a function $X : \Omega \to \mathbb{R}^k$ with the property that for any $B \in \mathcal{B}^k$, $X^{-1}(B) \in \mathcal{A}$.
Thus, when X is a random vector, $P_X(B) = P(X^{-1}(B))$ is well-defined, since $X^{-1}(B) \in \mathcal{A}$
Properties
$X = (X_1, \ldots, X_k)'$ is a random vector iff each $X_i$ is a random variable
The marginal probability measure of $X$ is given by $P_X(B) = P(X^{-1}(B))$ for $B \in \mathcal{B}^k$
The $\sigma$-algebra generated by $X$ is $\mathcal{A}_X = \{X^{-1}(B) : B \in \mathcal{B}^k\}$
Example (Pt. 1)
Suppose we have a finite $\Omega$ with $\mathcal{A} = 2^{\Omega}$, and the uniform prob measure $P$
Let $X = (X_1, X_2)' : \Omega \to \mathbb{R}^2$ be given by $X(\omega) = (X_1(\omega), X_2(\omega))'$, where $X_1, X_2$ are defined as indicators of two events
Example (Pt. 2)
What if we change the def of $X$? If $X_1, X_2$ are now $I_A$ and $I_{A^c}$, what is $\mathcal{A}_X$?
Only 2 possible outputs now: (0,1) and (1, 0)
Then for any $B \in \mathcal{B}^2$, $X^{-1}(B)$ is one of $\emptyset$, $A$, $A^c$, $\Omega$, so $\mathcal{A}_X = \{\emptyset, A, A^c, \Omega\}$
Example (Pt. 3)
If P is not uniform, but instead defined with unequal point masses, what is $P_X$?
Prop 2.1.5 (Cartesian Prod of Borel Sets is a Borel Set)
If $B_1, \ldots, B_k \in \mathcal{B}$, then $B_1 \times \cdots \times B_k \in \mathcal{B}^k$, and $\mathcal{B}^k$ is the smallest $\sigma$-algebra on $\mathbb{R}^k$ containing all such sets
Proof
Consider the sets that only restrict the ith coord.: $\mathbb{R}^{i-1} \times B \times \mathbb{R}^{k-i}$
Then $\mathcal{C}_i = \{B \subset \mathbb{R} : \mathbb{R}^{i-1} \times B \times \mathbb{R}^{k-i} \in \mathcal{B}^k\}$ is a $\sigma$-algebra containing $\mathcal{B}$
Sub-proof Let $B, B_1, B_2, \ldots \in \mathcal{C}_i$; note $\mathcal{C}_i$ contains every interval $(a, b]$, so once it is a $\sigma$-algebra it contains $\mathcal{B}$
If $B_j \in \mathcal{C}_i$ for $j = 1, 2, \ldots$, then $\bigcup_j B_j \in \mathcal{C}_i$, since $\mathbb{R}^{i-1} \times \left(\bigcup_j B_j\right) \times \mathbb{R}^{k-i} = \bigcup_j \left(\mathbb{R}^{i-1} \times B_j \times \mathbb{R}^{k-i}\right) \in \mathcal{B}^k$
If $B \in \mathcal{C}_i$, then $B^c \in \mathcal{C}_i$, since $\mathbb{R}^{i-1} \times B^c \times \mathbb{R}^{k-i} = \left(\mathbb{R}^{i-1} \times B \times \mathbb{R}^{k-i}\right)^c \in \mathcal{B}^k$.
So $B_1 \times \cdots \times B_k = \bigcap_{i=1}^{k} \left(\mathbb{R}^{i-1} \times B_i \times \mathbb{R}^{k-i}\right) \in \mathcal{B}^k$
Since each k-cell is of this form, there is no $\sigma$-algebra on $\mathbb{R}^k$ containing all such sets that is smaller than $\mathcal{B}^k$.
Prop 2.1.6 (A Vector of R.V.s is a Random Vector)
If $X_i$ is a random variable for $i = 1, \ldots, k$, then $X = (X_1, \ldots, X_k)'$ is a random vector.
Proof
Suppose $B_1, \ldots, B_k \in \mathcal{B}$. By the previous proposition, $B_1 \times \cdots \times B_k \in \mathcal{B}^k$. Then we have $X^{-1}(B_1 \times \cdots \times B_k) = \bigcap_{i=1}^{k} X_i^{-1}(B_i) \in \mathcal{A}$
Since such products generate $\mathcal{B}^k$, $X$ is a random vector.
Lecture 6
Def. K-cells
$(a, b] = (a_1, b_1] \times \cdots \times (a_k, b_k]$, or $\{x \in \mathbb{R}^k : a_i < x_i \le b_i,\ i = 1, \ldots, k\}$
K-cells are the basic sets we want to assign probabilities to (using random vectors)
For k = 2, (a, b] = $(a_1, b_1] \times (a_2, b_2]$, a rectangle
Def. Cumulative Distribution Function (CDF)
The cumulative distribution function for random vector $X$ is given by $F_X(x) = P_X((-\infty, x]) = P(X_1 \le x_1, \ldots, X_k \le x_k)$
Def. Difference Operator
For any $a_i \le b_i$, the i-th difference operator is given by $\Delta_{(a_i, b_i]} F(x_1, \ldots, x_k) = F(x_1, \ldots, x_{i-1}, b_i, x_{i+1}, \ldots, x_k) - F(x_1, \ldots, x_{i-1}, a_i, x_{i+1}, \ldots, x_k)$
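For example (a worked instance consistent with the definition above), for $k = 2$ applying both difference operators to a cdf recovers the probability of a rectangle: $\Delta_{(a_1, b_1]} \Delta_{(a_2, b_2]} F_X(x_1, x_2) = F_X(b_1, b_2) - F_X(a_1, b_2) - F_X(b_1, a_2) + F_X(a_1, a_2) = P_X((a, b])$.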
Prop 2.2.1 (Properties of Distribution Functions)
Any distribution function $F_X$ satisfies
If $a_i \le b_i$ for $i = 1, \ldots, k$, then $\Delta_{(a_1, b_1]} \cdots \Delta_{(a_k, b_k]} F_X(x) = P_X((a, b]) \ge 0$
$F_X$ is right continuous
If $x_i \to -\infty$ for some $i$, then $F_X(x) \to 0$; if $x_i \to \infty$ for all $i$, then $F_X(x) \to 1$
Proof for (1)
Proof for (2)
Proof for (3)
Thm 2.2.1 (Extension Theorem)
If $F$ satisfies the 3 properties of distribution functions, then $\exists$ a unique probability measure $P_F$ on $(\mathbb{R}^k, \mathcal{B}^k)$, such that $F$ is the distribution function of $P_F$
Note such an $F$ determines a probability model and we can define a random vector with this model by taking $\Omega = \mathbb{R}^k$, $\mathcal{A} = \mathcal{B}^k$, and $X(\omega) = \omega$
Now we can present $P_X$ by a function of points (rather than sets)
Def. Marginal Distributions
Def. Discrete Probability Models
Prop 2.3.1 (Countably Many Points with Positive Prob)
Prop 2.3.2 (Prob Measure Defined by p)
Def. Multinomial Distribution
Def. Multivariate Hypergeometric Distribution
Lecture 7
Def. Continuous Probability Models
Def. Absolutely Continuous Probability Models
Def. Probability Density Functions (PDF)
Prop 2.4.1 (Properties of A.C. Models)
with probability 1
Prop 2.4.2 (Properties of PDFs)
$f : \mathbb{R}^k \to \mathbb{R}$ is a density function for a.c. model if $f \ge 0$ and $\int_{\mathbb{R}^k} f(x)\, dx = 1$
Def. Multivariate Normal Distribution
Lecture 8 & 9
Suppose we transform the random vector $X$ to the random vector $Y = T(X)$
Discrete case
If $X$ is discrete (with prob function $p_X$), then $p_Y(y) = P(T(X) = y) = \sum_{x : T(x) = y} p_X(x)$
Def. Projections (& their Prob Functions)
Suppose $X = (X_1, \ldots, X_k)'$, then the projection on the first 2 coordinates is $T(x_1, \ldots, x_k) = (x_1, x_2)'$
Prob Function Derivation: $p_{(X_1, X_2)}(x_1, x_2) = \sum_{x_3, \ldots, x_k} p_X(x_1, x_2, x_3, \ldots, x_k)$
To find the probability functions of projections, take the joint probability function, and sum out unwanted variables.
The projection on the second coordinate is $T(x_1, \ldots, x_k) = x_2$
Prob Function Derivation: $p_{X_2}(x_2) = \sum_{x_1, x_3, \ldots, x_k} p_X(x_1, x_2, x_3, \ldots, x_k)$
Marginal of a Multinomial Random Vector
Let $(X_1, \ldots, X_k)' \sim$ multinomial$(n, p_1, \ldots, p_k)$, then $p_X(x_1, \ldots, x_k) = \frac{n!}{x_1! \cdots x_k!} p_1^{x_1} \cdots p_k^{x_k}$, where $\sum_{i=1}^{k} x_i = n$, $\sum_{i=1}^{k} p_i = 1$
Suppose $k = 3$, and we want to find the distribution of $(X_1, X_2)'$
By the defined constraints, $x_3 = n - x_1 - x_2$ and $p_3 = 1 - p_1 - p_2$
and $p_{(X_1, X_2)}(x_1, x_2) = \frac{n!}{x_1!\, x_2!\, (n - x_1 - x_2)!}\, p_1^{x_1} p_2^{x_2} (1 - p_1 - p_2)^{n - x_1 - x_2}$
Thus, $(X_1, X_2)' \sim$ multinomial$(n, p_1, p_2, 1 - p_1 - p_2)$
Binomial(n, p) = Multinomial(n, p, 1-p)
If $(X_1, \ldots, X_k)' \sim$ multinomial$(n, p_1, \ldots, p_k)$, then prove $X_i \sim$ binomial$(n, p_i) =$ multinomial$(n, p_i, 1 - p_i)$. Note this is easy to see intuitively since the multinomial arises by placing $n$ ind. observations into $k$ mutually disjoint categories, and when we project onto fewer coordinates we are now categorizing into fewer mutually disjoint categories
So $X_i \sim$ binomial$(n, p_i)$
Sum of sub-Multinomial Random Vector ~ Binomial
Use the previous note to determine the distribution of $Y = X_1 + \cdots + X_l$ (for $l < k$) when $(X_1, \ldots, X_k)' \sim$ multinomial$(n, p_1, \ldots, p_k)$
Note in the discrete case, if $T$ is 1-1 and $Y = T(X)$, then $p_Y(y) = p_X(T^{-1}(y))$
$Y$ is the number of responses falling in the first $l$ categories.
A response falls into one of these categories with probability $p_1 + \cdots + p_l$.
So $Y \sim$ binomial$(n, p_1 + \cdots + p_l)$
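A simulation sketch (assumed example using numpy) consistent with this: for $X \sim$ multinomial$(n, p_1, \ldots, p_4)$, the sum $X_1 + X_2$ should match binomial$(n, p_1 + p_2)$ in mean and variance.

```python
# Simulation sketch (assumed example): sum of sub-multinomial counts.
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, np.array([0.1, 0.2, 0.3, 0.4])
X = rng.multinomial(n, p, size=100_000)

Y = X[:, 0] + X[:, 1]                  # sum over the first two categories
print(Y.mean(), n * (p[0] + p[1]))     # ~6.0 vs 6.0 = np
print(Y.var(), n * 0.3 * 0.7)          # ~4.2 vs 4.2 = np(1-p)
```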
Def. Indicator Function
For $A \subset \Omega$, the indicator function $I_A : \Omega \to \mathbb{R}$ is given by $I_A(\omega) = 1$ if $\omega \in A$, and $I_A(\omega) = 0$ otherwise
Indicator Variable ~ Bernoulli(P(A))
Prove: if $(\Omega, \mathcal{A}, P)$ is a probability model and $A \in \mathcal{A}$, then $I_A$ is a random variable with $I_A \sim$ Bernoulli$(P(A))$
$P(I_A = 1) = P(A)$, and $P(I_A = 0) = P(A^c) = 1 - P(A)$
Since for any $b$, $I_A^{-1}((-\infty, b]) \in \{\emptyset, A^c, \Omega\} \subset \mathcal{A}$, we know $I_A$ is a r.v.
Transformation Determines Distribution Type
$Y = T(X)$ could have a discrete distribution no matter how $X$ is distributed.
E.g. Suppose $T(x) = c$ for every $x$, then $P(Y = c) = 1$
and the distribution of $Y$ is degenerate at $c$
E.g. Suppose $T = I_B$ for some Borel set $B$, so $Y = I_B(X)$
$Y \sim$ Bernoulli$(P_X(B))$
Absolutely continuous case
Suppose $X$ has density function $f_X$, and $Y = T(X)$ where $T : \mathbb{R}^k \to \mathbb{R}^l$.
Suppose $Y$ is also absolutely continuous, with density $f_Y$, which we want to determine.
Cdf Method
Generally, the cdf method works (e.g. with projections) when there is a formula for the cdf of $Y$: differentiate $F_Y$ to obtain $f_Y$:
E.g. Define $F : \mathbb{R}^2 \to [0, 1]$ by $F(x_1, x_2) = (1 - e^{-x_1})(1 - e^{-x_2})$ for $x_1, x_2 \ge 0$ (and $F(x_1, x_2) = 0$ otherwise)
It was proved (in a lec 6 exercise) that this is a cdf (using thm 2.2.1),
so $f(x_1, x_2) = \frac{\partial^2 F(x_1, x_2)}{\partial x_1 \partial x_2} = e^{-x_1} e^{-x_2}$ for $x_1, x_2 > 0$
Check that $f$ is a valid density:
(i) $f(x_1, x_2) \ge 0$ for all $(x_1, x_2)$
(ii) f is normed: $\int_0^{\infty} \int_0^{\infty} e^{-x_1} e^{-x_2}\, dx_1\, dx_2 = 1$
so it is valid and we obtain $P_X(B) = \int_B f(x_1, x_2)\, dx_1\, dx_2$
Therefore, if $T(x_1, x_2) = x_1$, then $F_{X_1}(x_1) = \lim_{x_2 \to \infty} F(x_1, x_2) = 1 - e^{-x_1}$,
so $f_{X_1}(x_1) = e^{-x_1}$, and by symmetry $f_{X_2}(x_2) = e^{-x_2}$
Thus, both $X_1$ and $X_2$ have exponential(1) distributions
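A symbolic sketch of the same cdf-method computation (assuming sympy), differentiating $F$ to get the density and recovering a marginal cdf:

```python
# Symbolic sketch (assumed example): cdf method via sympy.
import sympy as sp

x1, x2 = sp.symbols("x1 x2", positive=True)
F = (1 - sp.exp(-x1)) * (1 - sp.exp(-x2))    # joint cdf for x1, x2 >= 0

f = sp.diff(F, x1, x2)                       # joint density: exp(-x1)*exp(-x2)
print(sp.simplify(f))
print(sp.integrate(f, (x1, 0, sp.oo), (x2, 0, sp.oo)))  # 1: normed
print(sp.limit(F, x2, sp.oo))                # marginal cdf of X1: 1 - exp(-x1)
```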
E.g. Suppose $X = (X_1, X_2)'$ is uniform on the unit square, and $Y = X_1 + X_2$ (so $T(x_1, x_2) = x_1 + x_2$) has the triangular density $f_Y(y) = y$ for $0 \le y \le 1$ and $f_Y(y) = 2 - y$ for $1 < y \le 2$
Change of Variable Method
Suppose $T : \mathbb{R}^k \to \mathbb{R}^k$ is 1-1 and smooth (i.e. all 1st order partial derivatives exist and are continuous),
so $T^{-1}$ exists and we can find the Jacobian $J_T(x) = \det\left(\frac{\partial T_i(x)}{\partial x_j}\right)_{i, j = 1}^{k}$
Since $T$ maps a small neighborhood of $x$ to a region whose volume is scaled by the factor $|J_T(x)|$, $|J_T(x)|$ indicates how $T$ is changing volume at $x$,
so $|J_T(x)| > 1$ means $T$ expands volume at $x$, and $|J_T(x)| < 1$ means $T$ contracts volume at $x$
If $Y = T(X)$, then for a small set $A$ around $x$, $P(Y \in T(A)) = P(X \in A) \approx f_X(x) \operatorname{vol}(A) \approx \frac{f_X(x)}{|J_T(x)|} \operatorname{vol}(T(A))$
This intuitive argument can be made rigorous to prove the following.
Prop 2.5.1 (Change of Variable)
If $T$ is 1-1 and smooth, and $Y = T(X)$ where $X$ has an a.c. distribution with density $f_X$, then $Y$ has an a.c. distribution with density $f_Y(y) = f_X(T^{-1}(y)) \left|J_T(T^{-1}(y))\right|^{-1}$
E.g. If we have a uniform dist for $X$ on $(0, 2)$, find the density for $Y = T(X) = X^2$.
$f_Y(y) = f_X(\sqrt{y}) \left|2\sqrt{y}\right|^{-1} = \frac{1}{2} \cdot \frac{1}{2\sqrt{y}} = \frac{1}{4\sqrt{y}}$, for $0 < y < 4$
Note: solving $|T'(x)| = |2x| = 1$, we see that T contracts lengths on (0, 1/2) and expands lengths on (1/2, 2)
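A Monte Carlo sketch (assumed example) checking $f_Y(y) = 1/(4\sqrt{y})$ away from the singularity at $y = 0$:

```python
# Numeric sketch (assumed example): empirical density of Y = X^2 for
# X ~ Uniform(0, 2), compared with the change-of-variable answer.
import numpy as np

rng = np.random.default_rng(2)
X = rng.uniform(0, 2, size=1_000_000)
Y = X**2                                        # values in (0, 4)

counts, edges = np.histogram(Y, bins=36, range=(0.4, 4.0))
width = edges[1] - edges[0]
est = counts / (len(Y) * width)                 # empirical density on (0.4, 4)
mids = (edges[:-1] + edges[1:]) / 2
print(np.max(np.abs(est - 1 / (4 * np.sqrt(mids)))))  # close to 0
```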
E.g. Prove for $\varphi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$, the $N(0, 1)$ pdf, that $\int_{-\infty}^{\infty} \varphi(z)\, dz = 1$.
Consider $I^2 = \left(\int_{-\infty}^{\infty} e^{-z^2/2}\, dz\right)^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(z_1^2 + z_2^2)/2}\, dz_1\, dz_2$
Make the polar coordinate change of variable $z_1 = r\cos\theta$, $z_2 = r\sin\theta$, where $r > 0$, $\theta \in [0, 2\pi)$
Since $|J| = r$, and $z_1^2 + z_2^2 = r^2$, $I^2 = \int_0^{2\pi} \int_0^{\infty} r e^{-r^2/2}\, dr\, d\theta = 2\pi \left[-e^{-r^2/2}\right]_0^{\infty} = 2\pi$,
so $I = \sqrt{2\pi}$ and $\int_{-\infty}^{\infty} \varphi(z)\, dz = 1$
Def. Affine Transformation
(Affine transformations are linear transformations plus a constant.)
$T$ is an affine transformation if $T(x) = Ax + b$ where $A \in \mathbb{R}^{k \times k}$, $b \in \mathbb{R}^k$
So $J_T(x) = \det(A)$
Note: $Ax_1 + b = Ax_2 + b$ iff $A(x_1 - x_2) = 0$, so $T$ is 1-1 iff $A$ is a nonsingular (invertible) matrix, in which case, $T^{-1}(y) = A^{-1}(y - b)$
If $Y = T(X) = AX + b$, then $f_Y(y) = f_X(A^{-1}(y - b)) \left|\det(A)\right|^{-1}$
Multivariate Normal
Suppose $Z = (Z_1, \ldots, Z_k)'$ with $Z_i \sim N(0, 1)$ i.i.d., so $f_Z(z) = \prod_{i=1}^{k} \frac{1}{\sqrt{2\pi}} e^{-z_i^2/2} = (2\pi)^{-k/2} e^{-z'z/2}$ for $z \in \mathbb{R}^k$
Let $X = T(Z) = AZ + \mu$ where $A$ is nonsingular and $\mu \in \mathbb{R}^k$, then since $T$ is an affine transformation, we know $X$ has an a.c. distribution with density:
$f_X(x) = (2\pi)^{-k/2} \left|\det(\Sigma)\right|^{-1/2} \exp\left(-\tfrac{1}{2}(x - \mu)' \Sigma^{-1} (x - \mu)\right)$, where $\Sigma = AA'$
If a random vector has this pdf, we write $X \sim N_k(\mu, \Sigma)$
Note $\Sigma$ is symmetric, invertible, and positive definite (see note from lecture 2)
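A closing numeric sketch (assumed example): construct $X = AZ + \mu$ from i.i.d. $N(0,1)$ coordinates and compare the sample mean and covariance with $\mu$ and $\Sigma = AA'$.

```python
# Sketch (assumed example): the multivariate normal as an affine image of
# iid standard normals.
import numpy as np

rng = np.random.default_rng(3)
A = np.array([[2.0, 0.0], [1.0, 1.0]])        # nonsingular
mu = np.array([1.0, -1.0])
Sigma = A @ A.T

Z = rng.standard_normal((500_000, 2))
X = Z @ A.T + mu                               # X = AZ + mu, row by row

print(X.mean(axis=0).round(2))                 # ~ mu
print(np.cov(X, rowvar=False).round(2))        # ~ Sigma = [[4, 2], [2, 2]]
```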