Section 4.4 Probability
Subsection 4.4.1 One Random Variable
Subsubsection 4.4.1.1 Discrete Example
You perhaps have at least a rudimentary understanding of discrete probability, which measures the likelihood of an “event” when there are a finite number of possibilities. For example, when an ordinary six-sided die is rolled, the probability of getting any particular number is \(1/6\text{.}\) In general, when all outcomes are equally likely, the probability of an event is the number of ways the event can happen divided by the number of ways that “anything” can happen.
For a slightly more complicated example, consider the case of two six-sided dice. The dice are physically distinct, which means that rolling a 2–5 is different from rolling a 5–2; each is an equally likely event out of a total of 36 ways the dice can land, so each has a probability of \(1/36\text{.}\)
Most interesting events are not so simple. More interesting is the probability of rolling a certain sum out of the possibilities 2 through 12. It is clearly not true that all sums are equally likely: the only way to roll a 2 is to roll 1–1, while there are many ways to roll a 7. Because the number of possibilities is quite small, and because a pattern quickly becomes evident, it is easy to see that the probabilities of the various sums are:
\begin{equation*} \begin{split} P(2) = P(12) \amp = 1/36\\ P(3) = P(11) \amp = 2/36\\ P(4) = P(10) \amp = 3/36\\ P(5) = P(9) \amp = 4/36\\ P(6) = P(8) \amp = 5/36\\ P(7) \amp = 6/36 \end{split} \end{equation*}
Here we use \(P(n)\) to mean “the probability of rolling an \(n\text{.}\)” Since we have correctly accounted for all possibilities, the sum of all these probabilities is \(36/36=1\text{;}\) the probability that the sum is one of 2 through 12 is 1, because there are no other possibilities.
The study of probability is concerned with more difficult questions as well; for example, suppose the two dice are rolled many times. On average, what sum will come up? In the language of probability, this average is called the expected value of the sum. This is at first a little misleading, as it does not tell us what to “expect” when the two dice are rolled, but what we expect the long-term average will be.
Suppose that two dice are rolled 36 million times. Based on the probabilities, we would expect about 1 million rolls to be 2, about 2 million to be 3, and so on, with a roll of 7 topping the list at about 6 million. The sum of all rolls would be 1 million times 2 plus 2 million times 3, and so on, and dividing by 36 million we would get the average:
\begin{equation*} \begin{split} \bar{x} \amp = \left(2\cdot 10^6 + 3(2\cdot 10^6) + \cdots + 7(6\cdot 10^6) + \cdots + 12\cdot 10^6\right)\frac{1}{36\cdot 10^6}\\ \amp = 2\frac{10^6}{36\cdot 10^6} + 3\frac{2\cdot 10^6}{36\cdot 10^6} + \cdots + 7\frac{6\cdot 10^6}{36\cdot 10^6} + \cdots + 12\frac{10^6}{36\cdot 10^6}\\ \amp = 2P(2) + 3P(3) + \cdots + 7P(7) + \cdots + 12P(12)\\ \amp = \sum_{i=2}^{12} iP(i) = 7\text{.} \end{split} \end{equation*}
There is nothing special about the 36 million in this calculation. No matter what the number of rolls, once we simplify the average, we get the same \(\ds\sum_{i=2}^{12} iP(i)\text{.}\) While the actual average value of a large number of rolls will not be exactly 7, the average should be close to 7 when the number of rolls is large. Turning this around, if the average is not close to 7, we should suspect that the dice are not fair.
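The two-dice computation can be checked by brute force. The following is a minimal sketch that enumerates all 36 ordered rolls and uses exact fractions; the function names are illustrative and not part of the text.

```python
from fractions import Fraction
from itertools import product


def two_dice_distribution():
    """Return {sum: probability} for two fair six-sided dice."""
    counts = {}
    # Each of the 36 ordered rolls (a, b) is equally likely.
    for a, b in product(range(1, 7), repeat=2):
        counts[a + b] = counts.get(a + b, 0) + 1
    return {s: Fraction(n, 36) for s, n in counts.items()}


def expected_value(dist):
    """Weighted average: the sum of value times probability."""
    return sum(value * p for value, p in dist.items())


dist = two_dice_distribution()
print(sum(dist.values()))    # total probability of all sums
print(expected_value(dist))  # long-run average of the sum
```

Working with `Fraction` rather than floating point keeps the probabilities exact, so the total probability and the expected value can be compared with the hand computation without rounding.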
Subsubsection 4.4.1.2 Discrete and Continuous Random Variables
In this section we will introduce several concepts from probability concerning a single random variable for the purpose of showing yet another application of integration. In a subsequent section we extend the ideas presented here to showcase the use of double integrals.
Definition 4.23. Random Variable.
A random variable \(X\) is a variable that can take certain values, each with a corresponding probability.
In the discrete example above, the random variable was the sum of the two dice.
Definition 4.24. Discrete Random Variable.
When the number of possible values for \(X\) is finite, we say that \(X\) is a discrete random variable.
Definition 4.25. Continuous Random Variable.
When the number of possible values for \(X\) is infinite, we say that \(X\) is a continuous random variable.
In many applications of probability, the number of possible values of a random variable is very large, perhaps even infinite. To deal with the infinite case we need a different approach, and since there is a sum involved, it should not be wholly surprising that integration turns out to be a useful tool. It then turns out that even when the number of possibilities is large but finite, it is frequently easier to pretend that the number is infinite. Suppose, for example, that a dart is thrown at a dart board. Since the dart board consists of a finite number of atoms, there are in some sense only a finite number of places for the dart to land, but it is easier to explore the probabilities involved by pretending that the dart can land on any point in the usual \(x\)-\(y\)-plane. Therefore, for the rest of this chapter we are concerned with continuous random variables.
Subsubsection 4.4.1.3 Probability Density and Cumulative Distribution
Unlike for a discrete random variable, for a continuous random variable we have
\begin{equation*} P(X = x) = 0 \end{equation*}
for all \(x\text{.}\) We need to approach this differently and instead find the probability that \(X\) falls in some interval \([a,b]\text{.}\) In other words, we need the density of probability of a continuous random variable, which is described by the probability density function and allows us to calculate the probability that a value \(x\) of \(X\) falls in a given interval \(I\text{.}\)
Definition 4.26. Probability Density Function.
Let \(f\) be an integrable function. Then \(f\) is the probability density function of a continuous random variable \(X\) if \(f\) satisfies the following two properties:
\(f(x) \geq 0\) for all \(x\text{.}\)
\(\ds\int_{-\infty}^\infty f(x)\,dx = 1\text{.}\)
Note:

We associate a probability density function with a random variable \(X\) by stipulating that the probability that \(X\) is between \(a\) and \(b\) is
\begin{equation*} P(a \leq X \leq b) = \ds\int_a^b f(x)\,dx\text{.} \end{equation*} 
Since \(P(X=x)= 0\) for all \(x\text{,}\) we have
\begin{equation*} P(a\leq X \leq b) = P(a \lt X \leq b) = P(a \leq X \lt b ) = P(a \lt X \lt b)\text{.} \end{equation*} Because of the requirement that the integral from \(-\infty\) to \(\infty\) be 1, all probabilities are less than or equal to 1, and the probability that \(X\) takes on some value between \(-\infty\) and \(\infty\) is 1, as it should be.
Example 4.27. Constructing a Probability Density Function.
Construct a probability density function \(f\) from the following function \(g\text{:}\)
First, a probability density function must be nonnegative, and since \(g(x)\geq 0\) for all \(x\text{,}\) this is true.
Second, we need that \(\displaystyle \int_{-\infty}^{\infty} f(x)\,dx = 1\text{.}\) Since
we let
Example 4.28. Verifying a Probability Density Function I.
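Show that the following function \(f\) is a probability density function for \(a \lt b\text{:}\)
\begin{equation*} f(x) = \begin{cases} \dfrac{1}{b-a} \amp \text{if } a \leq x \leq b\text{,}\\ 0 \amp \text{otherwise.} \end{cases} \end{equation*}
First, we need to show that \(f\) is nonnegative:
Since \(a\lt b\) we have that \(b-a > 0\text{,}\) and so \(\frac{1}{b-a} > 0\text{.}\) Thus, we have indeed that \(f(x) \geq 0\text{.}\)
Second, we verify that \(\ds \int_{-\infty}^{\infty}f(x)\,dx = 1:\)
\begin{equation*} \int_{-\infty}^{\infty} f(x)\,dx = \int_a^b \frac{1}{b-a}\,dx = \frac{x}{b-a}\bigg\vert_a^b = \frac{b-a}{b-a} = 1\text{.} \end{equation*}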
Hence, \(f\) is a probability density function.
Example 4.29. Verifying a Probability Density Function II.
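Show that the following function \(f\) is a probability density function for \(c>0\text{:}\)
\begin{equation*} f(x) = \begin{cases} ce^{-cx} \amp \text{if } x \geq 0\text{,}\\ 0 \amp \text{if } x \lt 0\text{.} \end{cases} \end{equation*}
First, we need to show that \(f\) is nonnegative:
Since \(c>0\) and \(e^{-cx} > 0\) for all \(x\text{,}\) we have indeed that \(f(x) \geq 0\text{.}\)
Second, we verify that \(\displaystyle \int_{-\infty}^{\infty}f(x)\,dx = 1:\)
\begin{equation*} \int_{-\infty}^{\infty} f(x)\,dx = \int_0^\infty ce^{-cx}\,dx = \lim_{t\to\infty} \left(-e^{-cx}\right)\bigg\vert_0^t = \lim_{t\to\infty}\left(1 - e^{-ct}\right) = 1\text{.} \end{equation*}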
Hence, \(f\) is a probability density function.
The entire collection of probabilities for a random variable \(X\text{,}\) namely \(P(X\leq x)\) for all \(x\text{,}\) is called a cumulative distribution.
Definition 4.30. Cumulative Distribution Function.
Suppose \(f\) is the probability density function of a random variable \(X\text{.}\) Then the cumulative distribution function is
\begin{equation*} F(x) = P(X \leq x) = \int_{-\infty}^x f(t)\,dt\text{.} \end{equation*}
Note:

For a continuous random variable \(X\) we have
\begin{equation*} F(x) = P(X \leq x) = \ds\int_{-\infty}^x f(t)\,dt\text{,} \end{equation*} and so taking the derivative with respect to \(x\) of both sides of the above equation, we see that
\begin{equation*} \frac{dF(x)}{dx} = f(x)\text{.} \end{equation*} 
The probability that the random variable \(X\) belongs to an interval \([a,b] \subseteq \R\) is given by
\begin{equation*} P(a \leq X \leq b) = F(b)-F(a)\text{.} \end{equation*} At times, the cumulative distribution function of a random variable \(X\) is written as \(F_X(x)\text{.}\)
We introduce three specific distribution functions, namely uniform distribution, or rectangular distribution, exponential distribution and normal distribution. In a uniform distribution (see Figure 4.3), all subintervals of equal length are equally probable.
Definition 4.31. Uniform Distribution.
Suppose that \(a\lt b\) and
\begin{equation*} f(x) = \begin{cases} \dfrac{1}{b-a} \amp \text{if } a \leq x \leq b\text{,}\\ 0 \amp \text{otherwise.} \end{cases} \end{equation*}
Then \(f(x)\) is the uniform probability density function on \([a,b]\) and the corresponding distribution is the uniform distribution on \([a,b]\text{.}\)
If the probability that an event occurs during a certain time interval is proportional to the length of that time interval, then the wait time until the event occurs has an exponential distribution (see Figure 4.4). This type of distribution allows us to answer questions such as “What is the wait time in a shopping queue?” or “What is the wait time when calling some agency?”
Definition 4.32. Exponential Distribution.
Suppose \(c\) is a positive constant and
\begin{equation*} f(x) = \begin{cases} ce^{-cx} \amp \text{if } x \geq 0\text{,}\\ 0 \amp \text{if } x \lt 0\text{.} \end{cases} \end{equation*}
Then \(f(x)\) is the exponential probability density function and the corresponding distribution is the exponential distribution.
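As an illustration of how the cumulative distribution function is used for wait times, here is a minimal sketch. It assumes the exponential density \(ce^{-cx}\) for \(x \geq 0\text{,}\) whose cumulative distribution function is \(F(x) = 1 - e^{-cx}\) by direct integration. The rate \(c = 0.5\) per minute and the function names are hypothetical choices made for the example.

```python
import math


def exponential_cdf(x, c):
    """F(x) = P(X <= x) = 1 - exp(-c x) for x >= 0, and 0 for x < 0."""
    return 1.0 - math.exp(-c * x) if x >= 0 else 0.0


def interval_probability(a, b, c):
    """P(a <= X <= b) computed as F(b) - F(a)."""
    return exponential_cdf(b, c) - exponential_cdf(a, c)


c = 0.5  # hypothetical rate: on average one event every 2 minutes
p = interval_probability(1.0, 3.0, c)
print(p)  # chance that the wait is between 1 and 3 minutes
```

With these assumed numbers the result equals \(e^{-0.5} - e^{-1.5}\text{,}\) which is a little under 40%.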
Example 4.33. Calculating Probabilities.
Given the probability density function
\begin{equation*} f(x) = \begin{cases} 5x^4 \amp \text{if } 0 \leq x \leq 1\text{,}\\ 0 \amp \text{otherwise} \end{cases} \end{equation*}
of a continuous random variable \(X\text{,}\) calculate the following probabilities:
\(P(X=1/2)\)
\(P(1/2 \leq X \leq 1)\)
\(P(X=1/2) = \ds\int_{1/2}^{1/2} 5x^4\,dx = x^5 \big\vert_{1/2}^{1/2} = \left(\frac{1}{2}\right)^5-\left(\frac{1}{2}\right)^5 = 0\text{.}\)

We begin by graphing \(f\) and the interval of interest \([1/2,1]\text{:}\)
We are interested in the probability that \(X\) falls between \(1/2\) and 1, which is the shaded area under the curve of \(f\) in the above graph: \begin{equation*} P\left(\frac{1}{2} \leq X \leq 1\right) = \int_{1/2}^1 5x^4\,dx =x^5 \bigg\vert_{1/2}^1 = 1^5-\left(\frac{1}{2}\right)^5 = 0.96875\text{.} \end{equation*}
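The area interpretation can be checked numerically. The sketch below assumes the density is \(5x^4\) on \([0,1]\) and zero elsewhere, as the integrand in the computation suggests, and approximates the integrals with composite Simpson's rule. The helper `simpson` is written here for illustration and is not a library routine.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 and weight 2.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def density(x):
    """Assumed density of the example: 5x^4 on [0, 1], zero elsewhere."""
    return 5 * x**4 if 0 <= x <= 1 else 0.0


total_area = simpson(density, 0.0, 1.0)  # should be close to 1
prob = simpson(density, 0.5, 1.0)        # should be close to 0.96875
print(total_area, prob)
```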
Example 4.34. Constructing a Special Probability Density Function.
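Consider the function \(\ds f(x) = e^{-x^2/2}\text{.}\) What can we say about
\begin{equation*} \int_{-\infty}^\infty e^{-x^2/2}\,dx\text{?} \end{equation*}
Use this information to construct a probability density function \(g\) from \(f\text{.}\)
First, it is easy to see that \(f\) is positive for all \(x\text{.}\) Next, we analyze
\begin{equation*} \int_{-\infty}^\infty e^{-x^2/2}\,dx\text{.} \end{equation*}
We cannot find an antiderivative of \(f\text{,}\) but we can see that this integral is some finite number. Notice that \(\ds 0\lt f(x) = e^{-x^2/2} \leq e^{-x/2}\) for \(x > 1\text{.}\) This implies that the area under \(\ds e^{-x^2/2}\) is less than the area under \(\ds e^{-x/2}\text{,}\) over the interval \([1,\infty)\text{.}\) It is easy to compute the latter area, namely
\begin{equation*} \int_1^\infty e^{-x/2}\,dx = \frac{2}{\sqrt{e}}\text{.} \end{equation*}
By the Comparison Test, \(\ds \int_1^\infty e^{-x^2/2}\,dx\) is some finite number smaller than \(\ds 2/\sqrt{e}\text{.}\) Because \(f\) is symmetric around the \(y\)-axis,
\begin{equation*} \int_{-\infty}^{-1} e^{-x^2/2}\,dx = \int_1^\infty e^{-x^2/2}\,dx\text{.} \end{equation*}
This means that
\begin{equation*} \int_{-\infty}^\infty e^{-x^2/2}\,dx = \int_{-\infty}^{-1} e^{-x^2/2}\,dx + \int_{-1}^{1} e^{-x^2/2}\,dx + \int_1^\infty e^{-x^2/2}\,dx = A \end{equation*}
for some finite positive number \(A\text{.}\) Now if we let \(g(x) = f(x)/A\text{,}\)
\begin{equation*} \int_{-\infty}^\infty g(x)\,dx = \frac{1}{A}\int_{-\infty}^\infty e^{-x^2/2}\,dx = \frac{1}{A}A = 1\text{,} \end{equation*}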
so \(g\) is a probability density function.
We have shown that \(A\) is some finite number without computing it. By using some techniques from multivariable calculus, it can be shown that \(\ds A=\sqrt{2\pi}\text{.}\)
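Although the value of \(A\) cannot be found with an antiderivative, it is easy to approximate. The following sketch applies the midpoint rule to \(e^{-x^2/2}\) on \([-10,10]\text{;}\) truncating the range is an assumption that is harmless here because the integrand is below \(e^{-50}\) outside that interval. The helper name is illustrative.

```python
import math


def midpoint_rule(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Truncated range: the tails beyond |x| = 10 contribute a negligible amount.
A = midpoint_rule(lambda x: math.exp(-x * x / 2), -10.0, 10.0)
print(A, math.sqrt(2 * math.pi))  # the two values should agree closely
```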
The function
\begin{equation*} g(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2} \end{equation*}
from the above example is the probability density function of a continuous random variable \(X\) which has a standard normal distribution. Before we say more about the standard normal distribution, let us introduce three more concepts, namely expected value, variance and standard deviation. Given a cumulative distribution, these concepts inform us about its central tendency and provide us with a measure of dispersion from this central tendency.
Subsubsection 4.4.1.4 Expected Value, Variance and Standard Deviation
We ended our earlier discussion about the sum of two dice with a brief analysis of the average of such a sum. In probability, the average is often referred to as the mean or the expected value. This quantity is essentially calculated as the weighted average of all possible values of a random variable based on their probabilities. This means that if more and more values of a random variable were collected by repeated trials of a probability activity, then the sample mean becomes closer to the expected value, and as such, the expected value is the long-run mean of a random variable. For example, suppose you want to know how well you would perform on a multiple choice exam if you guessed all the answers. Then the expected value tells you how many questions you might get right. We now formally introduce this concept for a discrete random variable.
Definition 4.35. Expected Value for a Discrete Random Variable.
Suppose \(X\) is a discrete random variable. Then the expected value of \(X\) is
\begin{equation*} E(X) = \sum_{i=1}^n x_iP(x_i)\text{,} \end{equation*}
where \(x_i\) are the values of \(X\) and \(P(x_i)\) are the associated probabilities.
It comes as no surprise that for the calculation of the expected value of a continuous random variable the sum is extended to an integral.
Definition 4.36. Expected Value for a Continuous Random Variable.
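Suppose \(X\) is a continuous random variable with probability density function \(f\text{.}\) Then the expected value of \(X\) is
\begin{equation*} E(X) = \int_{-\infty}^\infty xf(x)\,dx\text{,} \end{equation*}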
provided the integral converges.
Note:
The expected value is often denoted by the Greek symbol \(\mu\) (read “mu”).
The expected value does not always exist.
The expected value is essentially a type of centrality measure as it indicates the typical value for a probability distribution.
Example 4.37. Expected Value of the Standard Normal Distribution.
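Calculate the expected value of the standard normal distribution, where the probability density function is
\begin{equation*} f(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}\text{.} \end{equation*}
The expected value of the standard normal distribution is
\begin{equation*} E(X) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty xe^{-x^2/2}\,dx\text{.} \end{equation*}
We compute the two halves:
\begin{equation*} \int_{-\infty}^0 x\frac{e^{-x^2/2}}{\sqrt{2\pi}}\,dx = \lim_{D\to-\infty} \left(-\frac{e^{-x^2/2}}{\sqrt{2\pi}}\right)\bigg\vert_D^0 = -\frac{1}{\sqrt{2\pi}} \end{equation*}
and
\begin{equation*} \int_0^\infty x\frac{e^{-x^2/2}}{\sqrt{2\pi}}\,dx = \lim_{D\to\infty} \left(-\frac{e^{-x^2/2}}{\sqrt{2\pi}}\right)\bigg\vert_0^D = \frac{1}{\sqrt{2\pi}}\text{.} \end{equation*}
Hence,
\begin{equation*} E(X) = -\frac{1}{\sqrt{2\pi}} + \frac{1}{\sqrt{2\pi}} = 0\text{.} \end{equation*}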
Therefore, the expected value of the standard normal distribution is zero.
Example 4.38. Expected Value of the Uniform Distribution.
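Calculate the expected value of the uniform distribution, where the probability density function is
\begin{equation*} f(x) = \begin{cases} \dfrac{1}{b-a} \amp \text{if } a \leq x \leq b\text{,}\\ 0 \amp \text{otherwise,} \end{cases} \end{equation*}
with \(a \lt b\text{.}\)
The expected value of the uniform distribution is
\begin{equation*} E(X) = \int_{-\infty}^\infty xf(x)\,dx = \int_a^b \frac{x}{b-a}\,dx\text{.} \end{equation*}
Therefore,
\begin{equation*} E(X) = \frac{x^2}{2(b-a)}\bigg\vert_a^b = \frac{b^2-a^2}{2(b-a)} = \frac{a+b}{2}\text{.} \end{equation*}
And so the expected value of the uniform distribution is the midpoint of the interval \([a,b]\text{.}\)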
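The interpretation of the expected value as a long-run average can be illustrated by simulation. The sketch below draws pseudo-random values from a uniform distribution and compares their average with \((a+b)/2\text{,}\) the standard value of the mean of a uniform distribution on \([a,b]\text{.}\) The endpoints, the sample size and the seed are arbitrary assumptions.

```python
import random


def sample_mean_uniform(a, b, n, seed=0):
    """Average of n pseudo-random draws from the uniform distribution on [a, b]."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    return sum(rng.uniform(a, b) for _ in range(n)) / n


a, b = 2.0, 5.0  # illustrative endpoints
estimate = sample_mean_uniform(a, b, 100_000)
midpoint = (a + b) / 2
print(estimate, midpoint)  # the sample mean should be near the midpoint
```

The agreement is only approximate: a different seed or sample size gives a slightly different estimate, and the gap typically shrinks as the number of draws grows.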
Example 4.39. Expected Value of the Exponential Distribution.
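Calculate the expected value of the exponential distribution, where the probability density function is
\begin{equation*} f(x) = \begin{cases} ce^{-cx} \amp \text{if } x \geq 0\text{,}\\ 0 \amp \text{if } x \lt 0\text{,} \end{cases} \end{equation*}
for \(c > 0\text{.}\)
The expected value of the exponential distribution is
\begin{equation*} E(X) = \int_{-\infty}^\infty xf(x)\,dx = \int_0^\infty cxe^{-cx}\,dx\text{.} \end{equation*}
We calculate the indefinite integral using Integration by Parts, with \(u=x\) and \(dv = ce^{-cx}\,dx\text{:}\)
\begin{equation*} \int cxe^{-cx}\,dx = -xe^{-cx} + \int e^{-cx}\,dx = -xe^{-cx} - \frac{1}{c}e^{-cx} + C\text{.} \end{equation*}
Thus,
\begin{equation*} E(X) = \lim_{t\to\infty}\left(-xe^{-cx} - \frac{1}{c}e^{-cx}\right)\bigg\vert_0^t = 0 - \left(-\frac{1}{c}\right) = \frac{1}{c}\text{.} \end{equation*}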
While the expected value is very useful, it typically is not enough information to properly evaluate a situation. For example, suppose we could manufacture an 11-sided die, with the faces numbered 2 through 12 so that each face is equally likely to be down when the die is rolled. The value of a roll is the value on this lower face. Rolling the die gives the same range of values as rolling two ordinary dice, but now each value occurs with probability \(1/11\text{.}\) The expected value of a roll is
\begin{equation*} \frac{2}{11} + \frac{3}{11} + \cdots + \frac{12}{11} = 7\text{,} \end{equation*}
which is the same value as the expected value of the earlier two-dice experiment. Therefore, the expected value does not distinguish the two cases, though of course they are quite different.
If \(f\) is a probability density function for a random variable \(X\text{,}\) with expected value \(\mu\text{,}\) we would like to measure how far a typical value of \(X\) is from \(\mu\text{.}\) One way to measure this distance is \(\ds(X-\mu)^2\text{;}\) we square the difference so as to measure all distances as positive. To get the typical such squared distance, we compute the mean, which is referred to as the variance of a discrete random variable. For two dice, for example, we get
\begin{equation*} (2-7)^2\frac{1}{36} + (3-7)^2\frac{2}{36} + \cdots + (7-7)^2\frac{6}{36} + \cdots + (11-7)^2\frac{2}{36} + (12-7)^2\frac{1}{36} = \frac{35}{6}\text{.} \end{equation*}
Because we squared the differences this does not directly measure the typical distance we seek; if we take the square root of this we do get such a measure, \(\ds\sqrt{35/6}\approx 2.42\text{.}\) The square root of the variance is called the standard deviation and denoted by the Greek letter \(\sigma\) (read “sigma”). Doing the computation for the strange 11-sided die we get
\begin{equation*} (2-7)^2\frac{1}{11} + (3-7)^2\frac{1}{11} + \cdots + (7-7)^2\frac{1}{11} + \cdots + (11-7)^2\frac{1}{11} + (12-7)^2\frac{1}{11} = 10\text{,} \end{equation*}
with square root approximately 3.16. Comparing 2.42 to 3.16 tells us that the two-dice rolls clump somewhat more closely near 7 than the rolls of the weird die, which of course we already knew because these examples are quite simple.
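The comparison between the two dice and the 11-sided die can be reproduced exactly. This is a minimal sketch; the closed form \((6-|s-7|)/36\) for the two-dice probabilities is a compact way of writing the table of sums, and the helper names are illustrative.

```python
import math
from fractions import Fraction


def mean(dist):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Mean squared distance from the expected value."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


# Sum of two fair dice: 1 way to roll a 2, rising to 6 ways to roll a 7.
two_dice = {s: Fraction(6 - abs(s - 7), 36) for s in range(2, 13)}
# Hypothetical 11-sided die with faces 2 through 12, all equally likely.
odd_die = {s: Fraction(1, 11) for s in range(2, 13)}

for name, dist in (("two dice", two_dice), ("11-sided die", odd_die)):
    v = variance(dist)
    print(name, mean(dist), v, round(math.sqrt(v), 2))
```

Both distributions share the mean 7, while the variances \(35/6\) and \(10\) separate them.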
Definition 4.40. Variance of a Discrete Random Variable.
Suppose \(X\) is a discrete random variable. Then the variance of \(X\) is
\begin{equation*} V(X) = \sum_{i=1}^n (x_i-\mu)^2P(x_i)\text{,} \end{equation*}
where \(x_i\) are the values of \(X\text{,}\) \(P(x_i)\) are the associated probabilities, and \(\mu\) is the expected value of \(X\text{.}\)
Definition 4.41. Standard Deviation of a Discrete Random Variable.
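Suppose \(X\) is a discrete random variable with expected value \(\mu\text{.}\) Then the standard deviation of \(X\) is
\begin{equation*} \sigma(X) = \sqrt{V(X)} = \sqrt{\sum_{i=1}^n (x_i-\mu)^2P(x_i)}\text{,} \end{equation*}
where \(x_i\) are the values of \(X\text{,}\) \(P(x_i)\) are the associated probabilities, and \(V(X)\) is the variance of \(X\text{.}\)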
To perform the same computation for a probability density function the sum is replaced by an integral, just as in the computation of the mean.
Definition 4.42. Variance of a Continuous Random Variable.
Suppose \(X\) is a continuous random variable with probability density function \(f\) and expected value \(\mu\text{.}\) Then the variance of \(X\) is
\begin{equation*} V(X) = \int_{-\infty}^\infty (x-\mu)^2f(x)\,dx\text{.} \end{equation*}
Definition 4.43. Standard Deviation of a Continuous Random Variable.
Suppose \(X\) is a continuous random variable with probability density function \(f\) and variance \(V(X)\text{.}\) Then the standard deviation of \(X\) is
\begin{equation*} \sigma(X) = \sqrt{V(X)}\text{.} \end{equation*}
Note:
The variance \(V\) of \(X\) is the dispersion from the mean.

The calculation of the variance is based on the mean, and so
\begin{equation*} V(X)= E((X-\mu)^2) = E((X-E(X))^2)\text{.} \end{equation*} The variance is the mean of a squared number, and so \(V(X) \geq 0\text{.}\)
The larger the distance \((X-\mu)^2\) is on average, the higher the variance.
The variance of a constant random variable is zero, since then \(E(X)=X\text{.}\)
Example 4.44. Standard Deviation of the Standard Normal Distribution.
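Calculate the standard deviation of the standard normal distribution, where the probability density function is
\begin{equation*} f(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}\text{.} \end{equation*}
We begin by finding the variance. Since \(\mu = 0\text{,}\)
\begin{equation*} V(X) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty x^2e^{-x^2/2}\,dx\text{.} \end{equation*}
To compute the antiderivative, use Integration by Parts, with \(u=x\) and \(\ds dv=xe^{-x^2/2}\,dx\text{.}\) This gives
\begin{equation*} \int x^2e^{-x^2/2}\,dx = -xe^{-x^2/2} + \int e^{-x^2/2}\,dx\text{.} \end{equation*}
We cannot compute the new integral, but we know its value when the limits are \(-\infty\) to \(\infty\text{,}\) from our discussion of the standard normal distribution in Example 4.34.
\begin{equation*} V(X) = \frac{1}{\sqrt{2\pi}}\left(-xe^{-x^2/2}\right)\bigg\vert_{-\infty}^\infty + \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{-x^2/2}\,dx = 0 + \frac{1}{\sqrt{2\pi}}\sqrt{2\pi} = 1\text{.} \end{equation*}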
Therefore, the standard deviation of the standard normal distribution is \(\sigma = \sqrt{1} = 1\text{.}\)
Example 4.45. Standard Deviation of the Uniform Distribution.
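Calculate the standard deviation of the uniform distribution, where the probability density function is
\begin{equation*} f(x) = \begin{cases} \dfrac{1}{b-a} \amp \text{if } a \leq x \leq b\text{,}\\ 0 \amp \text{otherwise,} \end{cases} \end{equation*}
where \(a \lt b\text{.}\)
The mean of the uniform distribution is found in Example 4.38 to be
\begin{equation*} \mu = \frac{a+b}{2}\text{.} \end{equation*}
We now calculate the variance:
\begin{equation*} \begin{split} V(X) \amp = \int_a^b \left(x-\frac{a+b}{2}\right)^2\frac{1}{b-a}\,dx = \frac{1}{3(b-a)}\left(x-\frac{a+b}{2}\right)^3\bigg\vert_a^b\\ \amp = \frac{1}{3(b-a)}\left[\left(\frac{b-a}{2}\right)^3 - \left(\frac{a-b}{2}\right)^3\right] = \frac{(b-a)^2}{12}\text{.} \end{split} \end{equation*}
Hence, the standard deviation of the uniform distribution over the interval \([a,b]\) is
\begin{equation*} \sigma(X) = \sqrt{\frac{(b-a)^2}{12}} = \frac{b-a}{2\sqrt{3}}\text{.} \end{equation*}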
Example 4.46. Standard Deviation of the Exponential Distribution.
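Calculate the standard deviation of the exponential distribution, where the probability density function is
\begin{equation*} f(x) = \begin{cases} ce^{-cx} \amp \text{if } x \geq 0\text{,}\\ 0 \amp \text{if } x \lt 0\text{,} \end{cases} \end{equation*}
for \(c > 0\text{.}\)
Recall from Example 4.39 that \(\mu = \frac{1}{c}\text{.}\) So the variance is given by
\begin{equation*} V(X) = \int_0^\infty \left(x-\frac{1}{c}\right)^2ce^{-cx}\,dx\text{,} \end{equation*}
which we can calculate using Integration by Parts twice:
\begin{equation*} \begin{split} \int \left(x-\frac{1}{c}\right)^2ce^{-cx}\,dx \amp = -\left(x-\frac{1}{c}\right)^2e^{-cx} + \int 2\left(x-\frac{1}{c}\right)e^{-cx}\,dx\\ \amp = -\left(x-\frac{1}{c}\right)^2e^{-cx} - \frac{2}{c}\left(x-\frac{1}{c}\right)e^{-cx} - \frac{2}{c^2}e^{-cx} + C\text{.} \end{split} \end{equation*}
Therefore,
\begin{equation*} V(X) = \lim_{t\to\infty}\left[-\left(x-\frac{1}{c}\right)^2e^{-cx} - \frac{2}{c}\left(x-\frac{1}{c}\right)e^{-cx} - \frac{2}{c^2}e^{-cx}\right]_0^t = 0 - \left(-\frac{1}{c^2} + \frac{2}{c^2} - \frac{2}{c^2}\right) = \frac{1}{c^2}\text{.} \end{equation*}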
Hence, the standard deviation of the exponential distribution is \(\sigma(X) = \sqrt{\dfrac{1}{c^2}} = \dfrac{1}{c}\text{.}\)
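A numerical check of the mean and the standard deviation of the exponential distribution is sketched below. It assumes the density \(ce^{-cx}\) for \(x \geq 0\text{,}\) picks the illustrative rate \(c=2\text{,}\) and truncates the improper integrals at \(40/c\text{,}\) beyond which the integrand is negligible. The helper names are hypothetical.

```python
import math


def simpson(f, a, b, n=20_000):
    """Composite Simpson's rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def exponential_moments(c, cutoff=40.0):
    """Approximate the mean and standard deviation for the density c*exp(-c*x)."""
    upper = cutoff / c  # truncation point for the improper integrals

    def pdf(x):
        return c * math.exp(-c * x)

    mu = simpson(lambda x: x * pdf(x), 0.0, upper)
    var = simpson(lambda x: (x - mu) ** 2 * pdf(x), 0.0, upper)
    return mu, math.sqrt(var)


mu, sigma = exponential_moments(2.0)
print(mu, sigma)  # both should be close to 1/c = 0.5
```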
Subsubsection 4.4.1.5 Normal Distribution
One of the most prominent distributions in probability is the so-called normal distribution or bell-shaped distribution; the special case where \(\sigma=1\) and \(\mu=0\) was discussed in Example 4.37. Many important data sets, such as exam grades or annual precipitation on the West Coast of BC, can be modelled by a normal distribution. The following is a list of the characteristics of such a distribution:
Characteristics of a Normal Distribution Function.
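Let \(X\) be a normal random variable. Then its probability density function is
\begin{equation*} f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\text{,} \end{equation*}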
where \(\mu\) is the mean and \(\sigma\) is the standard deviation. The associated normal distribution has the following characteristics.
Its graph is a bell-shaped curve, and hence is called a bell curve (see sample graphs below).
The total area under the curve is 1.
The data are symmetrically distributed in the graph around its mean.
The data are concentrated around the mean.
The further a value is from the mean, the less probable it is to observe that value.
About 68.27% of the values are within one standard deviation of the mean. About 95.45% of the values are within two standard deviations of the mean. About 99.73% of the values — almost all of them — are within three standard deviations of the mean.
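The three percentages in the last item can be reproduced with the error function from Python's standard library. The sketch relies on the standard identity \(P(|X-\mu| \leq k\sigma) = \operatorname{erf}(k/\sqrt{2})\) for a normal random variable; the function name is illustrative.

```python
import math


def within_k_sigma(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * within_k_sigma(k), 2))
# prints 68.27, 95.45 and 99.73 for k = 1, 2, 3
```

The result does not depend on \(\mu\) or \(\sigma\text{,}\) which is why the same three percentages apply to every normal distribution.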
The standard normal distribution is the simplest case of a normal distribution, namely when \(\mu=0\) and \(\sigma=1\text{.}\) The standard normal distribution allows us to compare distributions of data.
Characteristics of the Standard Normal Distribution Function.
Let \(X\) be a standard normal random variable. Then its probability density function is
\begin{equation*} f(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}\text{.} \end{equation*}
The associated standard normal distribution has the following characteristics.
The same characteristics as a Normal Distribution Function.
The mean is zero: \(\mu = 0\text{.}\)
The standard deviation is one: \(\sigma = 1\text{.}\)
Because normal distributions play such a vital role in statistics, the areas under the standard normal curve for any value of the normal random variable have been extensively computed and tabulated for easy reference. We will not concern ourselves with such tables. Instead, here is a simple example showing how these ideas can be useful.
Example 4.47. Memory Chips.
Suppose it is known that, in the long run, 1 out of every 100 computer memory chips produced by a certain manufacturing plant is defective when the manufacturing process is running correctly. Suppose 1000 chips are selected at random and 15 of them are defective. This is more than the expected number 10, but is it so many that we should suspect that something has gone wrong in the manufacturing process?
We are interested in the probability that various numbers of defective chips arise; the probability distribution is discrete: there can only be a whole number of defective chips. But (under reasonable assumptions) the distribution is very close to a normal distribution, namely this one:
\begin{equation*} f(x) = \frac{1}{\sqrt{2\pi}\sqrt{1000(0.01)(0.99)}}\exp\left(\frac{-(x-10)^2}{2(1000)(0.01)(0.99)}\right) \end{equation*}
(recall that \(\ds \exp(x)=e^x\)).
Now how do we measure how unlikely it is that under normal circumstances we would see 15 defective chips? We can't compute the probability of exactly 15 defective chips, as this would be \(\ds\int_{15}^{15} f(x)\,dx = 0\text{.}\) We could compute \(\ds\int_{14.5}^{15.5} f(x)\,dx \approx 0.036\text{;}\) this means there is only a \(3.6\)% chance that the number of defective chips is 15. (We cannot compute these integrals exactly; computer software has been used to approximate the integral values in this discussion.) But this is misleading: \(\ds\int_{9.5}^{10.5} f(x)\,dx \approx 0.126\text{,}\) which is larger, certainly, but still small, even for the “most likely” outcome. The most useful question, in most circumstances, is this: how likely is it that the number of defective chips is “far from” the mean? For example, how likely, or unlikely, is it that the number of defective chips is different by 5 or more from the expected value of 10? This is the probability that the number of defective chips is less than 5 or larger than 15, namely
\begin{equation*} \int_{-\infty}^{5} f(x)\,dx + \int_{15}^{\infty} f(x)\,dx \approx 0.11\text{.} \end{equation*}
So there is an \(11\)% chance that this happens: not large, but not tiny. Hence 15 defective chips do not appear to be cause for alarm: about one time in nine we would expect to see the number of defective chips 5 or more away from the expected 10.
What if the observed number of defective chips was 20? Here we compute
\begin{equation*} \int_{-\infty}^{0} f(x)\,dx + \int_{20}^{\infty} f(x)\,dx \approx 0.0015\text{.} \end{equation*}
So there is only a \(0.15\)% chance that the number of defective chips is more than 10 away from the mean; this would typically be interpreted as too suspicious to ignore—it shouldn't happen if the process is running normally.
The big question, of course, is what level of improbability should trigger concern? It depends to some degree on the application, and in particular on the consequences of getting it wrong in one direction or the other. If we're wrong, do we lose a little money? A lot of money? Do people die? In general, the standard choices are 5% and 1%. So what we should do is find the number of defective chips that has only, let us say, a 1% chance of occurring under normal circumstances, and use that as the relevant number. In other words, we want to know when
\begin{equation*} \int_{-\infty}^{10-r} f(x)\,dx + \int_{10+r}^{\infty} f(x)\,dx \lt 0.01\text{.} \end{equation*}
A bit of trial and error shows that with \(r=8\) the value is about \(0.011\text{,}\) and with \(r=9\) it is about \(0.004\text{,}\) so if the number of defective chips is 19 or more, or 1 or fewer, we should look for problems. If the number is high, we worry that the manufacturing process has a problem, or conceivably that the process that tests for defective chips is not working correctly and is flagging good chips as defective. If the number is too low, we suspect that the testing procedure is broken, and is not detecting defective chips.
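The numbers quoted in this example can be reproduced without tables. The sketch below assumes that the approximating normal distribution has mean \(10\) and variance \(1000(0.01)(0.99) = 9.9\text{,}\) the usual normal approximation to a binomial count, and evaluates the normal cumulative distribution function through the error function. The function names are illustrative.

```python
import math

MEAN = 10.0                    # expected number of defective chips: 1000 * 0.01
VARIANCE = 1000 * 0.01 * 0.99  # assumed binomial variance n * p * (1 - p) = 9.9


def normal_cdf(x, mean=MEAN, variance=VARIANCE):
    """P(X <= x) for a normal random variable, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * variance)))


def two_sided_tail(r):
    """Probability of being at least r away from the mean, on either side."""
    return normal_cdf(MEAN - r) + (1.0 - normal_cdf(MEAN + r))


def smallest_alarm_distance(level=0.01):
    """Smallest whole-number distance r whose two-sided tail falls below level."""
    r = 1
    while two_sided_tail(r) >= level:
        r += 1
    return r


print(round(normal_cdf(15.5) - normal_cdf(14.5), 3))  # chance of about 15 defects
print(round(two_sided_tail(5), 2))                    # 5 or more away from 10
print(round(two_sided_tail(10), 4))                   # 10 or more away from 10
print(smallest_alarm_distance(0.01))                  # threshold at the 1% level
```

Changing `level` to `0.05` gives the threshold for the other standard choice mentioned above.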
Subsection 4.4.2 Two Random Variables
In Section 4.4.1, we have learned that the probability density function characterizes the distribution of a continuous random variable. Often, we are interested in several random variables that are related to each other, such as the cost of ski tickets and number of skiers at some local mountain resort. Using multivariable calculus, we can generalize the ideas from Section 4.4.1 to two or more continuous random variables. In these notes, we will restrict ourselves to two continuous random variables \(X\) and \(Y\text{.}\)
Subsubsection 4.4.2.1 Joint Probability Density and Joint Cumulative Distribution
A pair of continuous random variables is characterized by a so-called joint probability density function.
Definition 4.48. Joint Probability Density Function.
Let \(f\) be an integrable function. Then \(f\) is the joint probability density function of a pair of continuous random variables \(X\) and \(Y\) if \(f\) satisfies the following two properties:
\(f(x,y) \geq 0\) for all \(x\) and for all \(y\text{.}\)
\(\ds{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}f(x,y)\,dx\,dy = 1}\text{.}\)
Note:
Recall Fubini's Theorem 4.17, which says that the order of integration does not matter as long as the function which is integrated is continuous over the region of integration. Throughout this section, we simply write \(dxdy\) rather than pointing out every time that we can choose between \(dxdy\) and \(dydx\text{.}\)
At times, the joint probability density function of a pair of random variables \(X\) and \(Y\) is written as \(f_{XY}(x,y)\text{.}\)

The marginal probability density functions are given by
\begin{equation*} f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x,y)\,dy \text{ and } f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x,y)\,dx\text{.} \end{equation*} 
The pair of continuous random variables \(X\) and \(Y\) is independent if and only if the joint probability density function of \(X\) and \(Y\) factors into the product of their marginal probability density functions:
\begin{equation*} f_{XY}(x,y) = f_X(x)f_Y(y)\text{.} \end{equation*}
Example 4.49. Verifying a Joint Probability Density Function.
Verify that
is a joint probability density function on \([0,1]\times[0,3]\) for a pair of continuous random variables \(X\) and \(Y\text{.}\)
First, we observe that \(f\) is nonnegative on \([0,1]\times[0,3]\text{.}\)
Second, we evaluate
Hence, \(f\) is a joint probability density function on \([0,1]\times[0,3]\) for \(X\) and \(Y\text{.}\)
Example 4.50. Finding a Joint Probability Density Function.
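Find the constant \(c\) so that
\begin{equation*} f(x,y) = \begin{cases} cxy^2 \amp \text{if } 0 \leq y \leq x \leq 1\text{,}\\ 0 \amp \text{otherwise} \end{cases} \end{equation*}
is a joint probability density function for the pair of continuous random variables \(X\) and \(Y\text{.}\)
We begin by graphing the region \(0\leq y \leq x \leq 1\) in the \(x\)-\(y\)-plane:
Therefore we have that \(y \leq x \leq 1\) with \(0 \leq y \leq 1\) or \(0 \leq y \leq x\) with \(0\leq x\leq 1\text{.}\)
To find the constant \(c\text{,}\) we solve
\begin{equation*} \int_0^1\int_0^x cxy^2\,dy\,dx = 1\text{.} \end{equation*}
Evaluating the double integral, we find
\begin{equation*} \int_0^1\int_0^x cxy^2\,dy\,dx = c\int_0^1 x\left[\frac{y^3}{3}\right]_0^x\,dx = c\int_0^1 \frac{x^4}{3}\,dx = \frac{c}{15}\text{.} \end{equation*}
Hence, \(c=15\) and so
\begin{equation*} f(x,y) = \begin{cases} 15xy^2 \amp \text{if } 0 \leq y \leq x \leq 1\text{,}\\ 0 \amp \text{otherwise} \end{cases} \end{equation*}
is a joint probability density function for \(X\) and \(Y\text{.}\)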
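The value of the normalizing constant can be confirmed numerically. The sketch below assumes the unnormalized density is \(xy^2\) on the triangle \(0 \leq y \leq x \leq 1\) (the form consistent with \(c=15\)) and evaluates the double integral as an iterated integral, with the midpoint rule used for both the inner and the outer integral. The helper names and the grid size are illustrative choices.

```python
def midpoint_rule(f, a, b, n=400):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def integrate_over_triangle(g, n=400):
    """Iterated integral of g(x, y) over 0 <= y <= x <= 1 (inner in y, outer in x)."""
    def inner(x):
        # For fixed x, the variable y runs from 0 up to x.
        return midpoint_rule(lambda y: g(x, y), 0.0, x, n)

    return midpoint_rule(inner, 0.0, 1.0, n)


unnormalized_mass = integrate_over_triangle(lambda x, y: x * y**2)
c = 1.0 / unnormalized_mass
print(c)  # should be close to 15
```

Because the inner limits depend on \(x\text{,}\) the triangular region is handled by the limits of integration rather than by testing each grid point.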
Example 4.51. Independent or Not.
Let \(X\) and \(Y\) be a pair of continuous random variables with joint density function
Are \(X\) and \(Y\) independent?
First, we compute
and
Since
we have that \(X\) and \(Y\) are not independent.
As expected, the probability that the pair of continuous random variables \(X\) and \(Y\) lies in a region \(R\) is obtained by integrating its joint probability density function over \(R\text{,}\) which provides us with a joint cumulative distribution function.
Definition 4.52. Joint Cumulative Distribution Function.
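Suppose \(f\) is the joint probability density function of a pair of random variables \(X\) and \(Y\text{.}\) Then the joint cumulative distribution function is
\begin{equation*} F(x,y) = P(X \leq x, Y \leq y) = \int_{-\infty}^{y}\int_{-\infty}^{x} f(s,t)\,ds\,dt\text{.} \end{equation*}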
Note:

Similar to the single random variable case, we have
\begin{equation*} \frac{\partial^2 F(x,y)}{\partial x\partial y} = f(x,y) \end{equation*}for a pair of random variables \(X\) and \(Y\text{.}\) The derivative here is a mixed partial second derivative, which you should have encountered in any differential calculus course.
At times, the joint cumulative distribution function of a pair of random variables \(X\) and \(Y\) is written as \(F_{XY}(x,y)\text{.}\)
\(F(-\infty,-\infty)=0\) and \(F(\infty,\infty)=1\text{.}\)
\(F(x,y)\) is a nondecreasing function in both \(x\) and \(y\text{.}\)

If the pair of random variables \(X\) and \(Y\) is independent, then
\begin{equation*} F_{XY}(x,y) = F_{X}(x)F_{Y}(y)\text{.} \end{equation*}
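The product rule for independent random variables is easy to express in code. The following sketch uses two independent uniform random variables on \([0,1]\) as an illustrative case. The rectangle formula in `rectangle_probability` is the standard inclusion-exclusion identity for joint cumulative distribution functions and is not derived in this section; all function names are hypothetical.

```python
def uniform_cdf(t):
    """Cumulative distribution function of a uniform variable on [0, 1]."""
    return min(max(t, 0.0), 1.0)


def joint_cdf(x, y):
    """F_XY(x, y) = F_X(x) * F_Y(y) for independent X and Y."""
    return uniform_cdf(x) * uniform_cdf(y)


def rectangle_probability(a, b, c, d):
    """P(a <= X <= b, c <= Y <= d) obtained from the joint CDF."""
    return joint_cdf(b, d) - joint_cdf(a, d) - joint_cdf(b, c) + joint_cdf(a, c)


print(joint_cdf(0.5, 0.5))                         # inside the unit square
print(joint_cdf(2.0, 0.3))                         # x beyond 1: only y matters
print(rectangle_probability(0.0, 0.5, 0.0, 0.5))   # lower-left quarter
```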
Example 4.53. Finding a Cumulative Distribution Function.
Let \(X\) and \(Y\) be two independent uniform random variables on \([0,1]\text{.}\) Find their joint cumulative distribution function \(F_{XY}(x,y)\text{.}\)
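Since \(X\) and \(Y\) are uniform on \([0,1]\text{,}\) we have that
\begin{equation*} F_X(x) = \begin{cases} 0 \amp \text{if } x \lt 0\text{,}\\ x \amp \text{if } 0 \leq x \leq 1\text{,}\\ 1 \amp \text{if } x \gt 1\text{,} \end{cases} \qquad F_Y(y) = \begin{cases} 0 \amp \text{if } y \lt 0\text{,}\\ y \amp \text{if } 0 \leq y \leq 1\text{,}\\ 1 \amp \text{if } y \gt 1\text{.} \end{cases} \end{equation*}
Since \(X\) and \(Y\) are independent, we obtain
\begin{equation*} F_{XY}(x,y) = F_X(x)F_Y(y) = \begin{cases} 0 \amp \text{if } x \lt 0 \text{ or } y \lt 0\text{,}\\ xy \amp \text{if } 0 \leq x \leq 1 \text{ and } 0 \leq y \leq 1\text{,}\\ x \amp \text{if } 0 \leq x \leq 1 \text{ and } y \gt 1\text{,}\\ y \amp \text{if } x \gt 1 \text{ and } 0 \leq y \leq 1\text{,}\\ 1 \amp \text{if } x \gt 1 \text{ and } y \gt 1\text{.} \end{cases} \end{equation*}
The graph below shows the values of the joint cumulative distribution function \(F_{XY}(x,y)\) in the \(x\)-\(y\)-plane: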
Example 4.54. Finding a Cumulative Distribution Function.
Let \(X\) and \(Y\) be a pair of continuous random variables with joint density function
Find the joint cumulative distribution function \(F(x,y)\text{.}\)
We first observe that \(F(x,y)=0\) for \(x\lt 0\) or \(y\lt 0\text{,}\) and \(F(x,y)=1\) for \(x \geq 1\) and \(y \geq 1\text{.}\)
We integrate the joint density function to find the cumulative distribution function for \(x>0\) and \(y>0\text{:}\)
When \(0 \leq x \leq 1\) and \(0 \leq y \leq 1\text{,}\) then
When \(0\leq x\leq 1\) and \(y\geq 1\text{,}\) we use that \(F(x,y)\) is continuous, then
When \(x \geq 1\) and \(0 \leq y \leq 1\text{,}\) we use that \(F(x,y)\) is continuous, then
Hence,
Definition 4.55. Probability — Two Random Variables.
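Suppose \(f\) is the joint probability density function of a pair of random variables \(X\) and \(Y\text{.}\) Then the probability that \(X\) and \(Y\) take values in the region \(R=[a,b]\times[c,d]\subseteq \R^2\) is
\begin{equation*} P(a \leq X \leq b, c \leq Y \leq d) = \int_c^d\int_a^b f(x,y)\,dx\,dy\text{.} \end{equation*}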
Example 4.56. Calculating a Probability.
Given the probability density function
of a pair of continuous random variables \(X\) and \(Y\text{,}\) calculate the following probabilities:
\(P(X=1/2,Y=1/2)\)
\(P(X \leq 1/2, Y \leq 1/2)\)

The probability is computed as
\begin{equation*} \begin{split} P\left(X = 1/2, Y=1/2\right) \amp= \int_{1/2}^{1/2}\int_{1/2}^{1/2}\,dx\,dy\\ \amp = \int_{1/2}^{1/2} x \big\vert_{1/2}^{1/2}\,dy\\ \amp = \int_{1/2}^{1/2} 0 \, dy = 0\text{.} \end{split} \end{equation*} 
The probability is computed as
\begin{equation*} \begin{split} P\left(X \leq 1/2, Y\leq 1/2\right) \amp = P\left((X,Y) \in \left(-\infty,1/2\right] \times \left(-\infty,1/2\right]\right) \\ \amp = \int_{-\infty}^{1/2}\int_{-\infty}^{1/2} f(x,y)\,dx\,dy\\ \amp = \int_{0}^{1/2}\int_{0}^{1/2} \,dx\,dy \\ \amp = \int_0^{1/2} x \big\vert_0^{1/2}\,dy\\ \amp = \int_0^{1/2}\frac{1}{2}\,dy = \frac{1}{2} y \big\vert_0^{1/2} = \frac{1}{4}. \end{split} \end{equation*}
Subsubsection 4.4.2.2 Expected Value, Variance and Covariance
Analogous to the single random variable case, we can compute the expected value, variance and standard deviation for a pair of continuous random variables.
Definition 4.57. Expected Values for a Pair of Continuous Random Variables.
Suppose \(X\) and \(Y\) are a pair of continuous random variables with probability density function \(f(x,y)\text{.}\)
Then the expected value of \(X\) is
and the expected value of \(Y\) is
provided the integrals converge.
Note: Equivalently, we can use the marginal probability density functions to compute the expected values of \(X\) and \(Y\) respectively:
Definition 4.58. Variance for a Pair of Continuous Random Variables.
Suppose \(X\) and \(Y\) are a pair of continuous random variables with probability density function \(f(x,y)\text{.}\)
Then the variance of \(X\) is
and the variance of \(Y\) is
provided the integrals converge.
Note: Since the calculation of the variance is based on the mean, we can write
Definition 4.59. Standard Deviation for a Pair of Continuous Random Variables.
Suppose \(X\) and \(Y\) are a pair of continuous random variables with probability density function \(f(x,y)\text{.}\) Then the standard deviation of \(X\) and the standard deviation of \(Y\) are
respectively, where \(V(X)\) is the variance of \(X\) and \(V(Y)\) is the variance of \(Y\text{.}\)
Example 4.60. Calculating Expected Values and Variance.
Let \(X\) and \(Y\) be a pair of continuous random variables with probability density function
What is the expected value of \(X\text{?}\)
What is the expected value of \(Y\text{?}\)
Compute \(V(X)\) and \(V(Y)\text{.}\)
Recall from Example 4.50 that the region \(0\leq y \leq x \leq 1\) in the \(xy\)-plane is graphed as follows:
Therefore we have that \(y \leq x \leq 1\) with \(0 \leq y \leq 1\) or \(0 \leq y \leq x\) with \(0\leq x\leq 1\text{.}\) Recall that the order of integration does not matter for continuous \(f\text{.}\) In our work below, we simply want to show both orders of integration.

The expected value of \(X\) is
\begin{equation*} \begin{split} E(X) \amp = \int_0^1 \int_y^1 x \left(15xy^2\right) \,dx\,dy = \int_0^1 \left[5x^3y^2\right]_y^1 \,dy \\ \amp = 5\int_0^1 \left(y^2-y^5\right)\,dy = 5\left[\frac{y^3}{3}-\frac{y^6}{6}\right]_0^1 = \frac{5}{6}. \end{split} \end{equation*}
The expected value of \(Y\) is
\begin{equation*} \begin{split} E(Y) \amp = \int_0^1 \int_0^x y\left(15xy^2\right)\,dy\,dx = \int_0^1 \left[\frac{15xy^4}{4}\right]_0^x\,dx \\ \amp = \int_0^1 \frac{15x^5}{4}\,dx = \frac{5x^6}{8}\bigg\vert_0^1 = \frac{5}{8}. \end{split} \end{equation*} 
Using \(V(X)=E(X^2)-E(X)^2\text{,}\) we obtain
\begin{equation*} \begin{split} E(X^2) \amp = \int_0^1\int_y^1 x^2 \left(15xy^2\right)\,dx\,dy = \int_0^1 \left[\frac{15}{4}x^4y^2\right]_y^1\,dy \\ \amp = \frac{15}{4}\int_0^1 \left(y^2-y^6\right)\,dy = \frac{15}{4}\left[\frac{y^3}{3}-\frac{y^7}{7}\right]_0^1 = \frac{15}{4}\cdot \frac{4}{21} = \frac{5}{7}. \end{split} \end{equation*} Using the result from (a), we have
\begin{equation*} V(X) = E(X^2)-E(X)^2 = \frac{5}{7}-\left(\frac{5}{6}\right)^2 = \frac{5}{252}\text{.} \end{equation*} Similarly, we compute
\begin{equation*} \begin{split} E(Y^2) \amp = \int_0^1\int_0^x y^2 \left(15xy^2\right)\,dy\,dx = \int_0^1 \left[3xy^5\right]_0^x\,dx \\ \amp = \int_0^1 3x^6\,dx = \frac{3x^7}{7}\bigg\vert_0^1 = \frac{3}{7}, \end{split} \end{equation*} and using the result from (b), we have
\begin{equation*} V(Y) = E(Y^2)-E(Y)^2 = \frac{3}{7} - \left(\frac{5}{8}\right)^2 = \frac{17}{448}\text{.} \end{equation*}
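The four results of this example can be reproduced with exact rational arithmetic. The sketch below relies on a closed form for the moments of the density \(15xy^2\) on \(0\leq y\leq x\leq 1\text{,}\) derived in the docstring; the function name `triangle_moment` is an illustrative choice.

```python
from fractions import Fraction


def triangle_moment(p, q):
    """Exact E[X^p Y^q] for the density 15*x*y**2 on 0 <= y <= x <= 1.

    The integrand is 15 * x**(p+1) * y**(q+2).  Integrating y from 0 to x
    gives 15 * x**(p+q+4) / (q+3); integrating x from 0 to 1 then gives
    15 / ((q+3) * (p+q+5)).
    """
    return Fraction(15, (q + 3) * (p + q + 5))


e_x = triangle_moment(1, 0)
e_y = triangle_moment(0, 1)
var_x = triangle_moment(2, 0) - e_x**2  # V(X) = E(X^2) - E(X)^2
var_y = triangle_moment(0, 2) - e_y**2  # V(Y) = E(Y^2) - E(Y)^2

print(e_x, e_y, var_x, var_y)  # prints: 5/6 5/8 5/252 17/448
```

As a sanity check, `triangle_moment(0, 0)` returns 1, confirming that the density integrates to 1 over the triangle.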
Aside from the expected value, variance and standard deviation, we may also be interested in how two continuous random variables are related. To measure such a relationship, we define the covariance.
Definition 4.61. Covariance for a Pair of Continuous Random Variables.
Suppose \(X\) and \(Y\) are a pair of continuous random variables with probability density function \(f(x,y)\text{.}\) Then the covariance of \(X\) and \(Y\) is
provided the integrals converge.
Note:

The sign of the covariance of two random variables \(X\) and \(Y\) tells us the direction of the linear relationship between them:
If the covariance is positive, we say \(X\) and \(Y\) are positively correlated. This means that large values of \(X\) tend to happen with large values of \(Y\text{,}\) and similarly small values of \(X\) tend to happen with small values of \(Y\text{.}\)
If the covariance is negative, we say \(X\) and \(Y\) are negatively correlated. This means that small values of \(X\) tend to happen with large values of \(Y\text{,}\) and vice versa.

Since the calculation of the covariance is based on the mean, we can write
\begin{equation*} Cov(X,Y) = E(XY)-E(X)E(Y)\text{.} \end{equation*} \(Cov(X,X) = V(X)\text{.}\)
If the random variables \(X\) and \(Y\) are independent, then \(Cov(X,Y)=0\text{.}\)
Example 4.62. Calculating Covariance.
Let \(X\) and \(Y\) be a pair of continuous random variables with probability density function
Compute the covariance of \(X\) and \(Y\text{,}\) and interpret your result.
We begin by computing
Using the results from Example 4.60, we find the covariance of \(X\) and \(Y\text{:}\)
The covariance is slightly positive. Hence, large values of \(X\) tend to occur more often with large values of \(Y\text{.}\)
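A numerical estimate of the covariance can serve as a cross-check. This is a sketch only: it assumes the density is the one from Example 4.60, namely \(15xy^2\) on \(0\leq y\leq x\leq 1\text{,}\) and it uses a midpoint rule whose helper name `triangle_integral` is hypothetical.

```python
def triangle_integral(g, n=200):
    """Midpoint-rule estimate of the integral of g over 0 <= y <= x <= 1."""
    total = 0.0
    hx = 1.0 / n
    for i in range(n):
        x = (i + 0.5) * hx
        hy = x / n  # inner step: y runs from 0 to x
        inner = 0.0
        for j in range(n):
            y = (j + 0.5) * hy
            inner += g(x, y)
        total += inner * hy * hx
    return total


def density(x, y):
    # Assumed joint density (taken from Example 4.60).
    return 15.0 * x * y**2


e_x = triangle_integral(lambda x, y: x * density(x, y))
e_y = triangle_integral(lambda x, y: y * density(x, y))
e_xy = triangle_integral(lambda x, y: x * y * density(x, y))

cov = e_xy - e_x * e_y  # Cov(X, Y) = E(XY) - E(X) E(Y)
print(cov)
```

Under this assumption the estimate is a small positive number, in line with the interpretation given above.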
Exercises for Section 4.4.
Exercise 4.4.1.
Verify that \(f\) is a probability density function on the given interval.

\(f(x) = \dfrac{10}{3x^2}, x \in [2,5]\)
Solution: Since \(f(x) > 0\) for all \(x \in[2,5]\text{,}\) \(f\) satisfies the positivity condition. We now calculate the area under the curve of \(f\text{:}\) \begin{equation*} \int_2^5 \frac{10}{3x^2}\,dx = \frac{10}{3} \left[-\frac{1}{x}\right]_2^5 = \frac{10}{3}\cdot\frac{3}{10} = 1. \end{equation*} Thus, \(f\) is a probability density function on \([2,5]\text{.}\)
\(f(x) = 6\left(\sqrt{x}-x\right), x \in [0,1]\)
Solution: Since \(\sqrt{x} \geq x\) on \([0,1]\text{,}\) we have that \(\sqrt{x} - x \geq 0\) and so \(f\) satisfies the positivity condition. We now calculate the area under the curve of \(f\text{:}\)
\begin{equation*} \int_0^1 6(\sqrt{x}-x)\,dx = 6\left[\frac{2}{3}x^{3/2}-\frac{x^2}{2}\right]_0^1 = 6\left[\frac{2}{3}-\frac{1}{2}\right]=1 \end{equation*} Thus, \(f\) is a probability density function on \([0,1]\text{.}\)

\(f(x,y) = \dfrac{1}{3}, (x,y)\in[1,2]\times[3,6]\)
Solution: Clearly, \(f\) satisfies the positivity condition. We now integrate: \begin{equation*} \int_1^2 \int_3^6 \frac{1}{3} \,dy\,dx = \frac{1}{3} \int_1^2 3\,dx = \frac{6-3}{3} = 1. \end{equation*} Hence, \(f\) is a probability density function on \([1,2]\times[3,6]\text{.}\)
\(f(x,y) = \dfrac{1}{4} xy, (x,y) \in[0,1]\times[0,4]\)
Solution: We see that \(f(x,y) \geq 0\) for all \((x,y) \in [0,1]\times[0,4]\text{.}\) Next, we calculate
\begin{equation*} \int_0^1\int_0^4 \frac{1}{4}xy\,dy\,dx = \int_0^1 \frac{1}{8}xy^2\bigg\vert_0^4\,dx = \int_0^1 2x\,dx = 1\text{.} \end{equation*} Hence, \(f(x,y)\) is a probability density function on \([0,1]\times[0,4]\text{.}\)
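The two conditions checked by hand in this exercise (nonnegativity and total area 1) can also be tested numerically for the one-variable densities. The sketch below is a heuristic, not a proof: it samples the function at finitely many points and approximates the integral with a midpoint rule. The names `midpoint_integral` and `looks_like_density` are illustrative.

```python
import math


def midpoint_integral(f, a, b, n=20000):
    """Midpoint-rule estimate of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def looks_like_density(f, a, b, samples=1000, tol=1e-4):
    """Heuristic check: f >= 0 at sampled points and total area close to 1."""
    nonnegative = all(
        f(a + (b - a) * k / samples) >= -1e-12 for k in range(samples + 1)
    )
    area = midpoint_integral(f, a, b)
    return nonnegative and abs(area - 1.0) < tol


ok_first = looks_like_density(lambda x: 10.0 / (3.0 * x**2), 2.0, 5.0)
ok_second = looks_like_density(lambda x: 6.0 * (math.sqrt(x) - x), 0.0, 1.0)
not_density = looks_like_density(lambda x: x, 0.0, 1.0)  # area is only 1/2

print(ok_first, ok_second, not_density)
```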
Exercise 4.4.2.
Construct a probability density function \(f\) from the given function \(g\text{.}\)

\(g(x)=3-x, x\in[0,3]\)
Answer: \(f(x)=\frac{2}{9}(3-x), x\in[0,3]\)
Solution: We integrate \(g(x)\) over \([0,3]\text{:}\) \begin{equation*} \int_0^3 (3-x)\,dx = \left[3x-\frac{x^2}{2}\right]_0^3 = \frac{9}{2}. \end{equation*} So let \(f(x) = \frac{2}{9} (3-x)\text{.}\) Then we notice that \(f(x) \geq 0\) on \([0,3]\text{.}\) Hence, \(f\) is a probability density function on \([0,3]\text{.}\)
\(g(x)=\dfrac{1}{x^5}, x \in[1,\infty)\)
Answer: \(f(x)= \frac{4}{x^5}, x\in[1,\infty)\)
Solution: We compute
\begin{equation*} \begin{split} \int_1^{\infty} g(x)\,dx \amp = \int_1^{\infty} \frac{1}{x^5}\,dx \\ \amp = \lim_{a\to\infty} \int_1^{a} \frac{1}{x^5}\,dx \\ \amp = \lim_{a\to\infty} \left.-\frac{1}{4x^4} \right\vert_1^a\\ \amp = \left(0+\frac{1}{4}\right) = \frac{1}{4} \end{split} \end{equation*} Therefore, we take \(f(x)= \dfrac{4}{x^5}\text{.}\) Clearly, \(f(x) \geq 0\) on \([1,\infty)\text{,}\) and so \(f(x) = \dfrac{4}{x^5}\) is a probability density function on \([1,\infty)\text{.}\)

\(g(x,y) = xy^2, (x,y)\in [0,1]\times[0,1]\)
Answer: \(f(x,y)=6xy^2, (x,y)\in[0,1]\times[0,1]\)
Solution: We integrate \(g(x,y)\) over \([0,1]\times[0,1]\text{:}\)
\begin{equation*} \int_0^1\int_0^1 xy^2\,dx\,dy = \int_0^1 \frac{1}{2}y^2\,dy = \frac{1}{6}y^3\bigg\vert_0^1 = \frac{1}{6} \end{equation*} Therefore, we take \(f(x,y) = 6xy^2\text{,}\) and we notice that \(f(x,y) \geq 0\) on the given region. Thus, \(f(x,y)= 6xy^2\) is a probability density function on \([0,1]\times[0,1]\text{.}\)

\(g(x,y) = x^2e^{-y}, (x,y)\in [1,2]\times[1,\infty)\)
Answer: \(f(x,y)=\frac{3e}{7}x^2e^{-y}, (x,y)\in [1,2]\times[1,\infty)\)
Solution: We first integrate \(g(x,y)\) over the given region: \begin{equation*} \begin{split} \int_1^{\infty} \int_1^2 x^2 e^{-y} \,dx\,dy \amp= \frac{7}{3} \int_1^{\infty} e^{-y}\,dy\\ \amp= \frac{7}{3} \left[e^{-1}-\lim_{y\to\infty} e^{-y}\right]\\ \amp= \frac{7}{3e}.\end{split} \end{equation*} Therefore, we divide by this value and let \(f(x,y) = \frac{3e}{7} x^2 e^{-y} = \frac{3}{7} x^2 e^{-(y-1)}\) (notice that \(f(x,y) \geq 0\)). Then \(f\) is a probability density function on the given region.
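For a function on an unbounded interval such as \(g(x)=1/x^5\) on \([1,\infty)\text{,}\) the normalizing constant can be estimated numerically after a change of variables. The sketch below uses the substitution \(x=1/u\text{,}\) which turns \(\int_a^\infty g(x)\,dx\) into \(\int_0^{1/a} g(1/u)\,u^{-2}\,du\text{;}\) this substitution and the name `improper_integral` are our own illustrative choices, not a method taken from the text.

```python
def improper_integral(g, a, n=20000):
    """Estimate the integral of g over [a, infinity), with a > 0.

    Uses x = 1/u, so dx = -du/u**2 and the integral becomes the integral
    of g(1/u) / u**2 over (0, 1/a].  The midpoint rule never evaluates
    the integrand at u = 0.
    """
    upper = 1.0 / a
    h = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += g(1.0 / u) / (u * u)
    return total * h


def g(x):
    return 1.0 / x**5


area = improper_integral(g, 1.0)
k = 1.0 / area  # normalizing constant, so that f = k * g integrates to 1

print(area, k)
```

The estimate of `k` should be close to 4, matching the density \(4/x^5\) found by hand.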
Exercise 4.4.3.
Calculate the following cumulative distributions.

\(f(x)=\dfrac{1}{2}e^{-x/2}, x\in[0,\infty)\)
\(P(x=1)\)
\(P(3\leq x \leq 6)\)
\(P(x \leq 50)\)
\(P(x \geq 6)\)
Solution: i. 0, ii. 0.173, iii. \(\approx 1\text{,}\) iv. 0.0498
Given the probability density function \(f(x)=\dfrac{1}{2}e^{-x/2}\) for \(x \in [0,\infty)\text{,}\) we calculate the following probabilities:
\(P(x=1) = 0\) since \(f\) is a continuous probability distribution.
\(\begin{aligned}P(3\leq x \leq 6) \amp = \int_3^6 f(x)\,dx = \int_3^6 \frac{1}{2}e^{-x/2}\,dx\\ \amp = \frac{1}{2}\left(-2 e^{-x/2} \big\vert_3^6\right)\\\amp = e^{-3/2}-e^{-3} \approx 0.173 \end{aligned}\)
\(\begin{aligned}P(x \leq 50) \amp = \int_{-\infty}^{50} f(x)\,dx \\ \amp= \int_0^{50} \frac{1}{2}e^{-x/2}\,dx\\ \amp = -e^{-x/2}\big\vert_0^{50} = 1-\frac{1}{e^{25}} \approx 1 \end{aligned}\)
\(\begin{aligned}P(x \geq 6) \amp = \int_6^{\infty} f(x)\,dx\\ \amp= \int_6^{\infty} \frac{1}{2}e^{-x/2}\,dx\\ \amp= \lim_{a \to \infty} \int_6^a \frac{1}{2}e^{-x/2}\,dx \\ \amp = \lim_{a\to\infty} \left(-e^{-x/2}\right)\big\vert_6^a = e^{-3}-\lim_{a\to\infty}e^{-a/2} = e^{-3} \approx 0.0498 \end{aligned}\)

\(f(x)=\dfrac{3}{14}\sqrt{x}, x \in[1,4]\)
\(P(2 \leq x \leq 4)\)
\(P(1\leq x \leq 3)\)
\(P(x \leq 2)\)

\(P(x \geq 2)\)
Answer: i. 0.739, ii. 0.599, iii. 0.261, iv. 0.739
Solution: Given the probability density function \(f(x)=\dfrac{3}{14}\sqrt{x}\) for \(x \in [1,4]\text{,}\) we calculate the following probabilities: \begin{equation*} \begin{aligned} P(2 \leq x \leq 4) = \int_2^4 f(x)\,dx = \frac{3}{14}\int_2^4 \sqrt{x}\,dx \amp= \frac{3}{14} \cdot \frac{2}{3} x^{3/2}\big\vert_2^4 \\ \amp= \frac{3}{14}\cdot\frac{2}{3}\left(4^{3/2} - 2^{3/2}\right) \approx 0.73880. \end{aligned} \end{equation*}
 \begin{equation*} \begin{aligned} P(1 \leq x \leq 3) = \int_1^3 f(x)\,dx = \frac{3}{14}\int_1^3 \sqrt{x}\,dx \amp= \frac{3}{14} \cdot \frac{2}{3} x^{3/2}\big\vert_1^3 \\ \amp= \frac{3}{14}\cdot\frac{2}{3}\left(3^{3/2} - 1\right) \approx 0.59945.\end{aligned} \end{equation*}
 \begin{equation*} \begin{aligned} P(x \leq 2) = \int_1^2 f(x)\,dx = \frac{3}{14}\int_1^2 \sqrt{x}\,dx \amp= \frac{3}{14} \cdot \frac{2}{3} x^{3/2}\big\vert_1^2 \\ \amp= \frac{3}{14}\cdot\frac{2}{3}\left(2^{3/2} - 1\right) \approx 0.26120.\end{aligned} \end{equation*}
 \begin{equation*} \begin{aligned}[t] P(x \geq 2) = \int_2^4 f(x)\,dx = \frac{3}{14}\int_2^4 \sqrt{x}\,dx \amp= \frac{3}{14} \cdot \frac{2}{3} x^{3/2}\big\vert_2^4 \\ \amp= \frac{3}{14}\cdot\frac{2}{3}\left(4^{3/2} - 2^{3/2}\right) \approx 0.73880.\end{aligned} \end{equation*}

\(f(x,y)=\dfrac{1}{3}xy, (x,y)\in[0,2]\times[1,2]\)
\(P(0\leq x\leq 1, 1\leq y \leq 2)\)
\(P(1 \leq x \leq 2, y = 1)\)
Solution: i. 0.25, ii. 0
Given the joint probability density function \(f(x,y)=\dfrac{1}{3}xy\) on \([0,2]\times[1,2]\text{,}\) we calculate the following probabilities:
\(\begin{aligned}P(0\leq x\leq 1, 1 \leq y \leq 2) \amp = \int_0^1 \int_1^2 \frac{1}{3}xy\,dy\,dx \\ \amp = \int_0^1 \left.\frac{1}{6} xy^2\right\vert_1^2\,dx \\ \amp = \int_0^1 \frac{1}{2} x \,dx \\ \amp = \left.\frac{1}{4}x^2\right\vert_0^1 = \frac{1}{4} \end{aligned}\)
\(\begin{aligned}P(1 \leq x \leq 2, y = 1) \amp = \int_1^2 \int_1^1 \frac{1}{3}xy\,dy \,dx \\ \amp = \int_1^2 0 \,dx = 0 \end{aligned}\)

\(f(x,y)=\dfrac{1}{16}(2-x)y, (x,y)\in[0,2]\times[0,4]\)
\(P(0 \leq x \leq 1, 0 \leq y \leq 1)\)
\(P(x \geq 1, y \geq 3)\)
Solution: i. 0.0469, ii. 0.109
Given the joint probability density function \(f(x,y)=\dfrac{1}{16}(2-x)y\) on \([0,2]\times[0,4]\text{,}\) we calculate the following probabilities: \begin{equation*} \begin{aligned} P(0\leq x\leq 1, 0 \leq y \leq 1) \amp= \int_0^1 \int_0^1 \dfrac{1}{16}(2-x)y \,dy\,dx \\ \amp= \frac{3}{64} \approx 0.046875. \end{aligned} \end{equation*}
 \begin{equation*} \begin{aligned} P(x \geq 1, y \geq 3) \amp= \int_1^2 \int_3^4 \dfrac{1}{16}(2-x)y \,dy\,dx\\ \amp= \frac{7}{64} \approx 0.109375. \end{aligned} \end{equation*}
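For a one-variable density with a known antiderivative, every interval probability in this exercise reduces to a difference of cumulative distribution values. As a sketch, take the density \(f(x)=\frac{3}{14}\sqrt{x}\) on \([1,4]\) from this exercise; integrating from 1 to \(x\) gives \(F(x)=\left(x^{3/2}-1\right)/7\text{.}\) The function names below are illustrative.

```python
def cdf(x):
    """F(x) = P(X <= x) for the density f(x) = (3/14) * sqrt(x) on [1, 4]."""
    if x <= 1.0:
        return 0.0
    if x >= 4.0:
        return 1.0
    return (x**1.5 - 1.0) / 7.0


def interval_probability(a, b):
    """P(a <= X <= b) as a difference of CDF values."""
    return cdf(b) - cdf(a)


p_2_to_4 = interval_probability(2.0, 4.0)  # P(2 <= X <= 4)
p_1_to_3 = interval_probability(1.0, 3.0)  # P(1 <= X <= 3)
p_at_most_2 = cdf(2.0)                     # P(X <= 2)
p_at_least_2 = 1.0 - cdf(2.0)              # P(X >= 2)

print(p_2_to_4, p_1_to_3, p_at_most_2, p_at_least_2)
```

Note that \(P(X\leq 2)\) and \(P(X\geq 2)\) add up to 1, since a single point carries no probability.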
Exercise 4.4.4.
Let
Show that \(f\) is a probability density function, and that the distribution has no mean.
Solution: We first notice that \(f(x) \geq 0\) for all \(x\text{.}\) We also have that
Therefore, \(f(x)\) is a probability density function. The mean of this distribution is given by
Since the integral does not converge, the distribution has no mean.
Exercise 4.4.5.
Let
Show that \(\ds \int_{-\infty }^\infty f(x)\,dx = 1\text{.}\) Is \(f\) a probability density function? Justify your answer.
Solution: We show that
However, \(f\) is not a probability density function since it does not satisfy the nonnegativity requirement. In particular, \(f(x) \lt 0 \text{ for } x \in [-1,0)\text{.}\)
Exercise 4.4.6.
A sawmill wants to assess the performance of a debarker. They determine that the length of time between machine failures is exponentially distributed with probability density function
where \(t\) is measured in hours.

Determine the probability that the debarker breaks down between \(t=200\) and \(t=600\) hours.
Answer: 0.318
Solution: We compute: \begin{equation*} P(200 \leq t \leq 600) = \int_{200}^{600} 0.005 e^{-0.005t}\,dt = -e^{-0.005t}\big\vert_{200}^{600} = e^{-1}-e^{-3} \approx 0.318. \end{equation*}
Determine the probability that the machine breaks down after \(t=1000\) hours.
Answer: 0.0067
Solution: The probability that the machine breaks down after 1000 hours is \begin{equation*} P(t \geq 1000) = \lim_{u\to\infty} \int_{1000}^u 0.005e^{-0.005t}\,dt = \lim_{u\to\infty} \left[-e^{-0.005t}\right]_{1000}^u = e^{-5} \approx 0.0067. \end{equation*}
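Both answers follow from the survival function of the exponential distribution, \(P(T\geq t)=e^{-\lambda t}\text{.}\) The sketch below takes the rate \(\lambda=0.005\) per hour from the integrand used in the solution; the names `RATE` and `survival` are illustrative.

```python
import math

RATE = 0.005  # failures per hour, read off the integrand in the solution


def survival(t):
    """P(T >= t) = exp(-RATE * t) for t >= 0, and 1 for negative t."""
    return math.exp(-RATE * t) if t > 0.0 else 1.0


# P(200 <= T <= 600) = P(T >= 200) - P(T >= 600)
p_between = survival(200.0) - survival(600.0)

# P(T >= 1000)
p_after = survival(1000.0)

print(round(p_between, 3), round(p_after, 4))
```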
Exercise 4.4.7.
For each of the given probability density functions \(f(x)\text{,}\) determine (i) the mean, (ii) the variance, and (iii) the standard deviation.

\(f(x)=\dfrac{3}{215}x^2, \ x \in[1,6]\)
Answer: (i) 4.52 (ii) 1.29 (iii) 1.14
Solution: We find the mean \(\mu\text{,}\) variance \(V\) and standard deviation \(\sigma\) of the probability density function \begin{equation*} f(x)=\dfrac{3}{215} x^2 \text{ for } x\in [1,6]. \end{equation*} \begin{equation*} \begin{aligned} \mu \amp= \int_{-\infty}^{\infty} xf(x)\,dx = \int_1^6 \frac{3}{215} x^3 \,dx\\ \amp= \frac{777}{172} \approx 4.52.\end{aligned} \end{equation*}
 \begin{equation*} \begin{aligned} V \amp= \int_{-\infty}^{\infty} x^2f(x)\,dx - \mu^2 = \int_1^6 \frac{3}{215}x^4\,dx - \left(\frac{777}{172}\right)^2\\ \amp= \frac{933}{43} - \frac{603729}{29584} = \frac{38175}{29584} \approx 1.29.\end{aligned} \end{equation*}
\(\sigma = \sqrt{V} \approx 1.14\)

\(f(x)=\dfrac{1}{36}(x-1)(7-x), \ x \in[1,7]\)
Answer: (i) 4 (ii) 1.8 (iii) 1.34
Solution: We find the mean \(\mu\text{,}\) variance \(V\) and standard deviation \(\sigma\) of the probability density function \begin{equation*} f(x)=\dfrac{1}{36} (x-1)(7-x) \text{ for } x\in [1,7]. \end{equation*} \begin{equation*} \begin{aligned} \mu \amp= \int_{-\infty}^{\infty} xf(x)\,dx \\ \amp= \int_1^7 \frac{1}{36} x(x-1)(7-x) \,dx = 4. \end{aligned} \end{equation*}
 \begin{equation*} \begin{aligned} V \amp= \int_{-\infty}^{\infty} x^2f(x)\,dx - \mu^2 = \int_1^7 \frac{1}{36} x^2(x-1)(7-x) \,dx - 4^2 \\ \amp= \frac{89}{5} - 16 = \frac{9}{5} = 1.8.\end{aligned} \end{equation*}
\(\sigma = \sqrt{V} \approx 1.34\)

\(f(x)=\dfrac{24}{x^4}, \ x \in[2,\infty)\)
Answer: i. 3, ii. 3, iii. \(\sqrt{3}\)
Solution: We find the mean \(\mu\text{,}\) variance \(V\) and standard deviation \(\sigma\) of the probability density function
\begin{equation*} f(x)=\dfrac{24}{x^4} \text{ for } x\in [2,\infty)\text{.} \end{equation*} \(\begin{aligned}\mu \amp = \int_{-\infty}^{\infty}xf(x)\,dx = 24\int_2^{\infty}\frac{1}{x^3}\,dx\\ \amp = -\frac{24}{2} \lim_{a\to\infty} \frac{1}{x^2} \bigg\vert_2^a = 12 \left(\frac{1}{4} - \lim_{a\to\infty}\frac{1}{a^2}\right) = \frac{12}{4} = 3 \end{aligned}\)

We first calculate
\begin{equation*} \begin{split} \int_{-\infty}^{\infty} x^2f(x)\,dx \amp= 24 \int_2^{\infty} \frac{1}{x^2}\,dx \\ \amp= -24 \lim_{a\to\infty}\frac{1}{x}\bigg\vert_2^a \\ \amp= 24\left(\frac{1}{2} - \lim_{a\to\infty}\frac{1}{a}\right) = \frac{24}{2} = 12\end{split}\text{.} \end{equation*} Therefore, using our computed value \(\mu\text{,}\) the variance of the distribution is
\begin{equation*} V(x) = \int_{-\infty}^{\infty}x^2f(x)\,dx - \mu^2 = 12 - 9 = 3\text{.} \end{equation*} \(\sigma = \sqrt{V(x)} = \sqrt{3}\)

\(f(x)=\dfrac{1}{5}e^{-x/5}, \ x \in [0,\infty)\)
Solution: We find the mean \(\mu\text{,}\) variance \(V\) and standard deviation \(\sigma\) of the probability density function \begin{equation*} f(x)=\dfrac{1}{5} e^{-x/5} \text{ for } x\in [0,\infty). \end{equation*} \begin{equation*} \begin{aligned} \mu \amp= \int_{-\infty}^{\infty} xf(x)\,dx = \int_0^{\infty} \dfrac{1}{5} xe^{-x/5}\,dx \\ \amp= \lim_{a\to\infty} \left[-e^{-x/5}(x+5)\right]_0^a = 5\end{aligned} \end{equation*}
 \begin{equation*} \begin{aligned} V \amp= \int_{-\infty}^{\infty} x^2f(x)\,dx - \mu^2 = \int_0^{\infty} \dfrac{1}{5} x^2e^{-x/5}\,dx - 5^2\\ \amp= 50 - 25 = 25. \end{aligned} \end{equation*}
\(\sigma = \sqrt{V} = 5\)
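For densities on a bounded interval, the mean, variance and standard deviation can be cross-checked numerically with \(V = E(X^2)-\mu^2\text{.}\) The following is a minimal sketch using composite Simpson's rule, applied to the two polynomial densities of this exercise; the helper names `simpson` and `summary` are our own.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def summary(f, a, b):
    """Return (mean, variance, standard deviation) of a density f on [a, b]."""
    mean = simpson(lambda x: x * f(x), a, b)
    second_moment = simpson(lambda x: x * x * f(x), a, b)
    variance = second_moment - mean**2  # V = E(X^2) - mu^2
    return mean, variance, math.sqrt(variance)


mean_a, var_a, sd_a = summary(lambda x: 3.0 * x**2 / 215.0, 1.0, 6.0)
mean_b, var_b, sd_b = summary(lambda x: (x - 1.0) * (7.0 - x) / 36.0, 1.0, 7.0)

print(mean_a, var_a, sd_a)
print(mean_b, var_b, sd_b)
```

The key step is subtracting the square of the mean, not the mean itself, from the second moment.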
Exercise 4.4.8.
Suppose the probability density function which describes the number of leaves on a certain plant is given by

Determine the probability that a plant has exactly one leaf.
Answer: 0.2
Solution: Let \(X\) be the discrete random variable which indicates the number of leaves on a plant. The probability that a plant has exactly one leaf is
\begin{equation*} P(X=1) = f_1 = \frac{1}{20}(5-1) =\frac{1}{5} = 0.2\text{.} \end{equation*}
Determine the probability that a plant has fewer than 3 leaves.
Answer: 0.5
Solution: Let \(X\) be the discrete random variable which indicates the number of leaves on a plant. The probability that a plant has fewer than 3 leaves is
\begin{equation*} P(X=0)+P(X=1)+P(X=2) = f_0+f_1+f_2 = \frac{1}{20} \left(0+4+6\right)=\frac{1}{2} = 0.5\text{.} \end{equation*}
Exercise 4.4.9.
A city planner wishes to improve traffic on a busy bridge. He determines that the length of time it takes a car to cross the bridge is a continuous random variable with probability density function
where \(t\) is measured in minutes. How long is a randomly chosen car expected to take to cross the bridge?
Answer: 2 minutes
Exercise 4.4.10.
The amount of chicken (in kg) demanded weekly at a popular restaurant is a continuous random variable with probability distribution function
What is the expected weekly demand for chicken?
Exercise 4.4.11.
Let \(X\) and \(Y\) be a pair of continuous random variables with the given joint probability density function. Are \(X\) and \(Y\) independent?

\(f_{XY}(x,y) = \begin{cases}6xy \amp 0 \leq x \leq 1, 0 \leq y \leq \sqrt{x}\\ 0\amp \text{ all other \(x\) and \(y\) } . \end{cases}\)
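Answer: Not independent.
Solution: We compute the marginal probability density functions:
\begin{equation*} f_X(x) = \int_0^{\sqrt{x}} 6xy \,dy = 3xy^2 \big\vert_{y=0}^{\sqrt{x}} = 3x^2\text{,} \quad \text{for } 0 \leq x \leq 1\text{,} \end{equation*} and, since \(0 \leq y \leq \sqrt{x}\) means \(y^2 \leq x \leq 1\) for a fixed \(y\text{,}\)
\begin{equation*} f_Y(y) = \int_{y^2}^1 6xy \,dx = 3yx^2 \big\vert_{x=y^2}^{1} = 3y\left(1-y^4\right)\text{,} \quad \text{for } 0 \leq y \leq 1\text{.} \end{equation*} Since
\begin{equation*} f_X(x)\cdot f_Y(y) = (3x^2)\bigl(3y(1-y^4)\bigr) = 9x^2y\left(1-y^4\right) \neq f(x,y)=6xy\text{,} \end{equation*} we find that \(X\) and \(Y\) are not independent.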

\(f_{XY}(x,y) = \begin{cases}6e^{-(2x+3y)} \amp x,y \geq 0 \\ 0 \amp \text{ all other \(x\) and \(y\) } . \end{cases}\)
Answer: Independent.
Solution: We compute the marginal probability density functions: \begin{equation*} f_X(x) = \int_0^{\infty} 6e^{-(2x+3y)}\,dy = -2 \lim_{a\to\infty} e^{-(2x+3y)} \big\vert_{y=0}^{a} = 2e^{-2x}, \end{equation*} and \begin{equation*} f_Y(y) = \int_0^{\infty} 6e^{-(2x+3y)}\,dx = -3 \lim_{a\to\infty} e^{-(2x+3y)} \big\vert_{x=0}^{a} = 3e^{-3y}. \end{equation*} Since \begin{equation*} f_X(x)\cdot f_Y(y) = \bigl( 2e^{-2x}\bigr)\bigl( 3e^{-3y}\bigr) = 6e^{-(2x+3y)} = f(x,y), \end{equation*} for \(x, y \geq 0\) we find that \(X\) and \(Y\) are independent.
\(f_{XY}(x,y) = \begin{cases}10x^2y \amp 0 \leq y \leq x \leq 1 \\ 0 \amp \text{ all other \(x\) and \(y\) } . \end{cases}\)
Answer: Not independent.
Solution: Note that the region of integration is the triangle with vertices at \((0,0)\text{,}\) \((1,1)\) and \((1,0)\text{.}\) We compute the marginal probability density functions: \begin{equation*} f_X(x) = \int_0^x 10x^2y\,dy = 5x^4, \quad \text{for } 0 \leq x \leq 1, \end{equation*} and \begin{equation*} f_Y(y) =\int_y^1 10x^2y\,dx = \frac{10}{3} (y-y^4), \quad \text{for } 0 \leq y\leq 1. \end{equation*} Since \begin{equation*} f_X(x)\cdot f_Y(y) = \bigl( 5x^4 \bigr)\bigl(\frac{10}{3} (y-y^4) \bigr) \neq f(x,y) = 10x^2 y \end{equation*} on the triangular domain, we find that \(X\) and \(Y\) are not independent.
\(f_{XY}(x,y) = \begin{cases}2 \amp 0 \leq x\leq y \leq 1 \\ 0 \amp \text{ all other \(x\) and \(y\) } . \end{cases}\)
Answer: Not independent.
Solution: Note that the region of integration is the triangle with vertices at \((0,0)\text{,}\) \((1,1)\) and \((0,1)\text{.}\) We compute the marginal probability density functions: \begin{equation*} f_X(x) = \int_x^1 2\,dy = 2(1-x) \quad \text{ for } 0 \leq x \leq 1, \end{equation*} and \begin{equation*} f_Y(y) = \int_0^y 2\,dx = 2y \quad \text{ for } 0 \leq y \leq 1. \end{equation*} Since \begin{equation*} f_X(x)\cdot f_Y(y) = \bigl(2(1-x) \bigr)\bigl(2y \bigr) \neq f(x,y)=2 \end{equation*} on the domain in question, we find that \(X\) and \(Y\) are not independent.
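The factorization test used throughout this exercise, \(f(x,y)=f_X(x)\,f_Y(y)\text{,}\) can be spot-checked at sample points. The sketch below hard-codes the marginals found by hand for two of the densities above; the helper name `factorizes` and the choice of grid are illustrative. Sampling can refute a factorization with a single counterexample, but agreement on a grid is only supporting evidence, not a proof of independence.

```python
import math


def factorizes(f, fx, fy, points, tol=1e-9):
    """True if f(x, y) equals fx(x) * fy(y) at every sample point."""
    return all(abs(f(x, y) - fx(x) * fy(y)) <= tol for x, y in points)


# Density 2 on the triangle 0 <= x <= y <= 1, with its marginals.
def tri_joint(x, y):
    return 2.0 if 0.0 <= x <= y <= 1.0 else 0.0


def tri_fx(x):
    return 2.0 * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0


def tri_fy(y):
    return 2.0 * y if 0.0 <= y <= 1.0 else 0.0


# Density 6*exp(-(2x + 3y)) on x, y >= 0, with its marginals.
def exp_joint(x, y):
    return 6.0 * math.exp(-(2.0 * x + 3.0 * y)) if x >= 0 and y >= 0 else 0.0


def exp_fx(x):
    return 2.0 * math.exp(-2.0 * x) if x >= 0 else 0.0


def exp_fy(y):
    return 3.0 * math.exp(-3.0 * y) if y >= 0 else 0.0


grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]

triangle_ok = factorizes(tri_joint, tri_fx, tri_fy, grid)
exponential_ok = factorizes(exp_joint, exp_fx, exp_fy, grid)

print(triangle_ok, exponential_ok)
```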
Exercise 4.4.12.
Given the following probability density function \(f(x,y)\) for a pair of continuous random variables \(X\) and \(Y\text{,}\) compute (i) the expected values \(E(X)\) and \(E(Y)\text{,}\) (ii) the variance \(V(X)\) and \(V(Y)\text{,}\) and (iii) the covariance \(Cov(X,Y)\text{.}\)

\(f_{XY}(x,y) = \begin{cases}6xy \amp 0 \leq x \leq 1, 0 \leq y \leq \sqrt{x}\\ 0\amp \text{ all other \(x\) and \(y\) } . \end{cases}\)
Answer: i. 3/4, 4/7; ii. 3/80, 19/392; iii. 1/63
Solution: We consider the joint probability density function
\begin{equation*} f_{XY}(x,y) = \begin{cases}6xy \amp 0 \leq x \leq 1, 0 \leq y \leq \sqrt{x} \\ 0 \amp \text{ all other \(x\) and \(y\). } \end{cases} \end{equation*}
The expected value of \(X\) is
\begin{equation*} E(X) = \int_0^1 \int_0^{\sqrt{x}} 6x^2y\,dy\,dx = \int_0^1 3x^3 \,dx = \frac{3}{4}x^4 \bigg\vert_0^1 = \frac{3}{4}\text{,} \end{equation*} and the expected value of \(Y\) is
\begin{equation*} E(Y) = \int_0^1\int_0^{\sqrt{x}} 6xy^2 \,dy\,dx = \int_0^1 2x^{5/2}\,dx = \frac{4}{7}x^{7/2}\bigg\vert_0^1 = \frac{4}{7}\text{.} \end{equation*} 
First, we compute
\begin{equation*} \int_0^1\int_0^{\sqrt{x}} 6x^3y\,dy\,dx = \int_0^1 3x^4\,dx = \frac{3}{5}x^5\bigg\vert_0^1 = \frac{3}{5}\text{,} \end{equation*}and
\begin{equation*} \int_0^1\int_0^{\sqrt{x}} 6xy^3\,dy\,dx = \int_0^1 \frac{3}{2}x^3 \,dx = \frac{3}{8}x^4\bigg\vert_0^1 = \frac{3}{8}\text{.} \end{equation*}Using the calculated values for \(E(X)\) and \(E(Y)\text{,}\) we find the variance of \(X\) to be
\begin{equation*} V(X) = \int_0^1\int_0^{\sqrt{x}} 6x^3y\,dy\,dx - E(X)^2 = \frac{3}{5} - \frac{9}{16} = \frac{3}{80}\text{,} \end{equation*} and the variance of \(Y\) to be
\begin{equation*} V(Y)=\int_0^1\int_0^{\sqrt{x}} 6xy^3\,dy\,dx-E(Y)^2 = \frac{3}{8}-\frac{16}{49} = \frac{19}{392}\text{.} \end{equation*}
First calculate
\begin{equation*} \int_0^1\int_0^{\sqrt{x}} 6x^2y^2\,dy\,dx = \int_0^1 2x^{7/2}\,dx = \frac{4}{9}x^{9/2}\bigg\vert_0^1 = \frac{4}{9}\text{.} \end{equation*}Then, again using our computed values for \(E(X)\) and \(E(Y)\text{,}\) the covariance of \(X\) and \(Y\) is
\begin{equation*} Cov(X,Y) = \int_0^1\int_0^{\sqrt{x}} 6x^2y^2\,dy\,dx - E(X)E(Y)=\frac{4}{9}-\frac{3}{4}\cdot\frac{4}{7} =\frac{1}{63} \approx 0.016\text{.} \end{equation*} Thus, \(X\) and \(Y\) are slightly positively correlated.

\(f_{XY}(x,y) = \begin{cases}6e^{-(2x+3y)} \amp x,y \geq 0 \\ 0 \amp \text{ all other \(x\) and \(y\) } . \end{cases}\)
\(f_{XY}(x,y) = \begin{cases}10x^2y \amp 0 \leq y \leq x \leq 1 \\ 0 \amp \text{ all other \(x\) and \(y\) } . \end{cases}\)
\(f_{XY}(x,y) = \begin{cases}2 \amp 0 \leq x\leq y \leq 1 \\ 0 \amp \text{ all other \(x\) and \(y\) } . \end{cases}\)