{ "cells": [ { "cell_type": "code", "execution_count": 8, "metadata": { "tags": [ "remove_cell" ] }, "outputs": [], "source": [ "# HIDDEN\n", "import warnings\n", "warnings.filterwarnings('ignore')\n", "from datascience import *\n", "from prob140 import *\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "plt.style.use('fivethirtyeight')\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Expectation by Conditioning ##" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let $T$ be a random variable, and let $S$ be a random variable defined on the same space as $T$. As we have seen, conditioning on $S$ might be a good way to find probabilities for $T$ if $S$ and $T$ are related. In this section we will see that conditioning on $S$ can also be a good way to find the expectation of $T$.\n", "\n", "We will start with a simple example to illustrate the ideas. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [ "remove-input", "hide-output" ] }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# VIDEO: Expectation by Conditioning\n", "from IPython.display import YouTubeVideo\n", "\n", "YouTubeVideo('GrM0Ve-a010')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let the joint distribution of $T$ and $S$ be as in the table below." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
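<table>\n", "<tr><th></th><th>T=3</th><th>T=4</th></tr>\n", "<tr><th>S=7</th><td>0.3</td><td>0.1</td></tr>\n", "<tr><th>S=6</th><td>0.2</td><td>0.2</td></tr>\n", "<tr><th>S=5</th><td>0.1</td><td>0.1</td></tr>\n", "</table>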
\n", "
" ], "text/plain": [ " T=3 T=4\n", "S=7 0.3 0.1\n", "S=6 0.2 0.2\n", "S=5 0.1 0.1" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t = [3, 4]\n", "s = [5, 6, 7]\n", "pp = [0.1, 0.2, 0.3, 0.1, 0.2, 0.1]\n", "jt_dist = Table().values('T', t, 'S', s).probabilities(pp)\n", "jt_dist" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How can $S$ be involved in the calculation of $E(T)$? \n", "\n", "Notice that to find $E(T)$, you could use the joint distribution table and the definition of expectation as follows:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3.4" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "3*(0.3 + 0.2 + 0.1) + 4*(0.1 + 0.2 + 0.1) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is equivalent to going to each cell of the table, weighting the value of $T$ in that cell with the probability in the cell, and then adding. Here's another way of looking at this.\n", "\n", "Let's condition on $S$:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
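<table>\n", "<tr><th></th><th>T=3</th><th>T=4</th><th>Sum</th></tr>\n", "<tr><th>Dist. of T | S=7</th><td>0.75</td><td>0.25</td><td>1.0</td></tr>\n", "<tr><th>Dist. of T | S=6</th><td>0.50</td><td>0.50</td><td>1.0</td></tr>\n", "<tr><th>Dist. of T | S=5</th><td>0.50</td><td>0.50</td><td>1.0</td></tr>\n", "<tr><th>Marginal of T</th><td>0.60</td><td>0.40</td><td>1.0</td></tr>\n", "</table>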
\n", "
" ], "text/plain": [ " T=3 T=4 Sum\n", "Dist. of T | S=7 0.75 0.25 1.0\n", "Dist. of T | S=6 0.50 0.50 1.0\n", "Dist. of T | S=5 0.50 0.50 1.0\n", "Marginal of T 0.60 0.40 1.0" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "jt_dist.conditional_dist('T', 'S')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each of the three conditional distributions is a distribution in its own right. Therefore its histogram has a balance point, just as the marginal distribution of $T$ does." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
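<table>\n", "<tr><th></th><th>T=3</th><th>T=4</th><th>Sum</th><th>EV</th></tr>\n", "<tr><th>Dist. of T | S=7</th><td>0.75</td><td>0.25</td><td>1.0</td><td>3.25</td></tr>\n", "<tr><th>Dist. of T | S=6</th><td>0.50</td><td>0.50</td><td>1.0</td><td>3.50</td></tr>\n", "<tr><th>Dist. of T | S=5</th><td>0.50</td><td>0.50</td><td>1.0</td><td>3.50</td></tr>\n", "<tr><th>Marginal of T</th><td>0.60</td><td>0.40</td><td>1.0</td><td>3.40</td></tr>\n", "</table>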
\n", "
" ], "text/plain": [ " T=3 T=4 Sum EV\n", "Dist. of T | S=7 0.75 0.25 1.0 3.25\n", "Dist. of T | S=6 0.50 0.50 1.0 3.50\n", "Dist. of T | S=5 0.50 0.50 1.0 3.50\n", "Marginal of T 0.60 0.40 1.0 3.40" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "jt_dist.conditional_dist('T', 'S', show_ev=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can see $E(T) = 3.4$ in the row corresponding to the distribution of $T$. And you can also see the *conditional expectation of $T$* given each possible value of $S$:\n", "- $~E(T \\mid S=5) = 3.5$\n", "- $~E(T \\mid S=6) = 3.5$\n", "- $~E(T \\mid S=7) = 3.25$\n", "\n", "This defines a *function of $S$*: for each value $s$ of $S$, the function returns $E(T \\mid S=s)$." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
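<table>\n", "<tr><th>s</th><th>E(T | S = s)</th><th>P(S = s)</th></tr>\n", "<tr><td>5</td><td>3.5</td><td>0.2</td></tr>\n", "<tr><td>6</td><td>3.5</td><td>0.4</td></tr>\n", "<tr><td>7</td><td>3.25</td><td>0.4</td></tr>\n", "</table>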
" ], "text/plain": [ "s | E(T | S = s) | P(S = s)\n", "5 | 3.5 | 0.2\n", "6 | 3.5 | 0.4\n", "7 | 3.25 | 0.4" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ev_T_given_S = Table().with_columns(\n", " 's', s,\n", " 'E(T | S = s)', [3.5, 3.5, 3.25],\n", " 'P(S = s)', [0.2, 0.4, 0.4]\n", ")\n", "ev_T_given_S" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This function of $S$ is called the *conditional expectation of $T$ given $S$* and is denoted $E(T \\mid S)$. Unlike expectation, which is a number, conditional expectation is a random variable.\n", "\n", "As it's a random variable, it has an expectation, which we can calculate using the non-linear function rule. The answer is a quantity that you will recognize." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3.4000000000000004" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Weight each conditional expectation E(T | S = s) by P(S = s), then add\n", "ev = sum(ev_T_given_S.column('E(T | S = s)')*ev_T_given_S.column('P(S = s)'))\n", "ev" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's right: it's the expectation of $T$.\n", "\n", "What we have learned from this is that $E(T)$ is the *average of the conditional expectations of $T$ given the different values of $S$, weighted by the probabilities of those values*. \n", "\n", "In short, $E(T)$ is the *expectation of the conditional expectation of $T$ given $S$*." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Conditional Expectation as a Random Variable ###\n", "In general, suppose $T$ and $S$ are two random variables on a probability space.\n", "\n", "Then for each fixed value $s$ of $S$, $T$ has a conditional distribution given $S=s$. This is an ordinary distribution and has an expectation. That is called the *conditional expectation of $T$ given $S=s$* and is denoted $E(T \\mid S = s)$. \n", "\n", "So for each $s$, there is a value $E(T \\mid S=s)$. 
This defines a function of the random variable $S$. It is called the *conditional expectation of $T$ given $S$*, and is denoted $E(T \\mid S)$.\n", "\n", "The key difference between expectation and conditional expectation:\n", "\n", "- $E(T)$, the expectation of $T$, is a real number.\n", "- $E(T \\mid S)$, the conditional expectation of $T$ given $S$, is a function of $S$ and hence is a random variable." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```{admonition} Quick Check\n", "A class has three sections. \n", "