Finding the standard deviation of a probability distribution.

Here is the question:




The time, to the nearest whole minute, that a city bus takes to go from one end of its route to the other has the probability distribution shown.
As sometimes happens with probabilities computed as empirical relative frequencies, the probabilities in the table add up to a value other than $1.00$ because of round-off error.
$$
\begin{array}{c|cccccc}
x & 42 & 43 & 44 & 45 & 46 & 47 \\ \hline
P(x) & 0.10 & 0.23 & 0.34 & 0.25 & 0.05 & 0.02
\end{array}
$$
a. Find the average time the bus takes to drive the length of its route.

b. Find the standard deviation of the length of time the bus takes to drive the length of its route.







I did the first part and got $E(X)=43.54$, which is the correct answer. However, for the second part, I used the formula $\sigma = \sqrt{\left(\sum x^2 P(x)\right)-E(X)^2}$ and got approximately $4.517$. The answer is $1.204$. Where did I go wrong?
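For reference, both computations are easy to reproduce (a quick Python check of the arithmetic; the variable names are mine, not from the question):

```python
# Values from the table; note the probabilities sum to 0.99, not 1.00.
x = [42, 43, 44, 45, 46, 47]
p = [0.10, 0.23, 0.34, 0.25, 0.05, 0.02]

mu = sum(pi * xi for pi, xi in zip(p, x))            # 43.54, the mean
sd_definition = sum(pi * (xi - mu) ** 2
                    for pi, xi in zip(p, x)) ** 0.5  # about 1.2046
sd_shortcut = (sum(pi * xi ** 2 for pi, xi in zip(p, x))
               - mu ** 2) ** 0.5                     # about 4.5176
```

The two standard deviations disagree badly, which is exactly the discrepancy asked about.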







      asked Aug 19 at 22:06 by numericalorange, edited Aug 23 at 16:49 by Jendrik Stelzner




















          2 Answers

















          You are dealing with a slight inaccuracy due to rounding.
          By definition, the mean is $\mu = \sum_{i=1}^{6} p_i x_i$ and the variance is $\sigma^2 = \sum_{i=1}^{6} p_i(x_i - \mu)^2.$
          By a formula derived from the definition,
          $$\sigma^2 = E(X^2) - \mu^2 = \sum_{i=1}^{6} p_i x_i^2 - \mu^2.$$
          However, this formula is very sensitive to round-off error.



          In R, the mean can be computed as follows:



          p = c(.1,.23,.34,.25,.05,.02); x = 42:47; mu = sum(p*x); mu
          [1] 43.54


          This agrees with what you found.



          According to the definition, the variance and standard deviation are



          sum(p*(x - mu)^2)
          [1] 1.451084
          sg = sqrt(sum(p*(x - mu)^2)); sg
          [1] 1.204609


          But the shortcut formula (which exaggerates the errors) gives the standard deviation as



          sqrt(sum(p*x^2) - mu^2)
          [1] 4.517566


          I don't know what you are supposed to show as the solution to this problem.
          However, to make sense of it, I think the logical course of action is to
          adjust the probabilities so that they add to 1:



          sum(p)
          [1] 0.99
          p1 = p/sum(p); p1; sum(p1)
          [1] 0.10101010 0.23232323 0.34343434 0.25252525 0.05050505 0.02020202 # adj probs
          [1] 1 # sum to 1


          Then use adjusted probabilities from the start to get the true mean and
          standard deviation (where both the definition and formula agree):



          mu1 = sum(p1*x); mu1; sqrt(sum(p1*(x - mu1)^2)); sqrt(sum(p1*x^2) - mu1^2)
          [1] 43.9798
          [1] 1.127971
          [1] 1.127971





          answered Aug 19 at 23:08, edited Aug 19 at 23:23, by BruceET (accepted)










































            There is a nasty trick lurking in the remark "As sometimes happens...".



            The variance is indeed given by



            $$V(X)=\sum_i p_i(x_i-\mu)^2=\sum_i p_i x_i^2-\mu^2$$



            with $\mu=\sum_i p_i x_i$, and the standard deviation is the square root of the variance. But this equality only holds if $\sum_i p_i=1$.



            So what happened? Redo the computation with the last probability $0.03$ instead of $0.02$, so that the probabilities sum to $1$. Both formulas then yield a variance equal to $1.3499$.



            Now redo the computation with the last probability $0.02$: the first formula yields a variance of $1.451084$, while the other yields $20.4084$. What happens is that the weights do not sum to $1$.
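These two recomputations can be verified numerically; here is a small Python sketch (the function names are mine, not from the answer):

```python
def var_definition(x, p):
    # Variance from the definition, with mu the weighted mean.
    mu = sum(pi * xi for pi, xi in zip(p, x))
    return sum(pi * (xi - mu) ** 2 for pi, xi in zip(p, x))

def var_shortcut(x, p):
    # Shortcut E(X^2) - mu^2; only equivalent when the weights sum to 1.
    mu = sum(pi * xi for pi, xi in zip(p, x))
    return sum(pi * xi ** 2 for pi, xi in zip(p, x)) - mu ** 2

x = [42, 43, 44, 45, 46, 47]
p_fixed = [0.10, 0.23, 0.34, 0.25, 0.05, 0.03]  # sums to 1.00
p_table = [0.10, 0.23, 0.34, 0.25, 0.05, 0.02]  # sums to 0.99

# With p_fixed both formulas agree (about 1.3499);
# with p_table they diverge (about 1.451084 vs 20.4084).
print(var_definition(x, p_fixed), var_shortcut(x, p_fixed))
print(var_definition(x, p_table), var_shortcut(x, p_table))
```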



            Notice that the first formula yields a standard deviation of $\sqrt{1.451084}\simeq 1.20460948$.



            What would be best? I suggest this: treat the $p_i$ as "general" weights (that is, not summing to $1$, since they don't anyway) and compute the mean and variance accordingly. Equivalently, reweight by dividing each $p_i$ by their sum. The standard deviation is then $1.127971255$.



            Note: even if your teacher expects you to blindly use the first formula, the correct approximation is $1.205$, not $1.204$. But since there is a bias in the mean (too low by roughly $0.01\times 44$, considering the missing "mass" of $0.01$ lies somewhere between $42$ and $47$), and thus also in the final result, I would not recommend this.



            Another note: the exercise showed you that the first formula is more immune to numerical errors (the standard deviation it returns is closer to any sensible value you might consider). You should always use this formula, and not the other one.






            answered Aug 19 at 22:49, edited Aug 19 at 23:13, by Jean-Claude Arbaut






















            • Reasonable answer (+1). However, I take issue with your last sentence: the alternate formula is fine (sometimes even necessary) for theoretical discussions, and it is also OK for applications where the probabilities are not subject to round-off error.
              – BruceET
              Aug 19 at 23:28










            • @BruceET There are known problems with this formula when the mean is much larger than the variance (it was once a criticism of MS Excel that it could return wrong results, and even a negative variance, due to using the second formula). There are ways to compute the second formula with higher precision, and it is indeed sometimes necessary; however, I do think it's better to use the first whenever possible.
              – Jean-Claude Arbaut
              Aug 20 at 5:58










            • I guess you are talking only about computation, not about theoretical uses. For the sample variance, $S^2 = \frac{1}{n-1}\sum_i(X_i-\bar X)^2$ is often more computationally stable than $\frac{1}{n-1}\left[\sum_i X_i^2 - n\bar X^2\right]$, but some software claims a linear combination of the two is optimal.
              – BruceET
              Aug 20 at 6:43











            • @BruceET Yes, that applies only to numerical computations.
              – Jean-Claude Arbaut
              Aug 20 at 7:07
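The Excel-style failure mentioned in this comment thread is easy to demonstrate: when the mean is far larger than the spread, the shortcut formula subtracts two nearly equal huge numbers and cancels catastrophically in double precision. A Python sketch with artificial data, chosen only to exaggerate the effect:

```python
# Four observations with mean 1e9 + 0.5 and true population variance 0.25.
xs = [1e9, 1e9 + 1, 1e9, 1e9 + 1]
m = sum(xs) / len(xs)

# Definition: subtract the mean first, then square. Stays accurate.
var_centered = sum((v - m) ** 2 for v in xs) / len(xs)

# Shortcut: mean of squares minus square of mean. The two terms agree
# to about 18 digits, beyond double precision, so the result collapses.
var_uncentered = sum(v * v for v in xs) / len(xs) - m * m

print(var_centered, var_uncentered)  # the shortcut is badly wrong here
```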










            Your Answer




            StackExchange.ifUsing("editor", function ()
            return StackExchange.using("mathjaxEditing", function ()
            StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
            StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
            );
            );
            , "mathjax-editing");

            StackExchange.ready(function()
            var channelOptions =
            tags: "".split(" "),
            id: "69"
            ;
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function()
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled)
            StackExchange.using("snippets", function()
            createEditor();
            );

            else
            createEditor();

            );

            function createEditor()
            StackExchange.prepareEditor(
            heartbeatType: 'answer',
            convertImagesToLinks: true,
            noModals: false,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            noCode: true, onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            );



            );













             

            draft saved


            draft discarded


















            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2888188%2ffinding-the-standard-deviation-of-a-probability-distribution%23new-answer', 'question_page');

            );

            Post as a guest






























            2 Answers
            2






            active

            oldest

            votes








            2 Answers
            2






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            4
            down vote



            accepted










            You are dealing with a slight inaccuracy due to rounding.
            By definition, the mean is $mu = sum_i=1^5 p_ix_i$ and the variance is $sigma^2 = sum_i=1^5p_i(x_i - mu)^2.$
            By a formula, derived from the definition,
            $$sigma^2 = E(X^2) - mu^2 = sum_i=2^5p_ix_i^2 - mu^2.$$
            However, the formula is very sensitive to round-off error.



            In R, the mean can be computed as follows:



            p = c(.1,.23,.34,.25,.05,.02); x = 42:47; mu = sum(p*x); mu
            [1] 43.54


            This agrees with what you found.



            According to the definition, the variance and standard deviation are



            sum(p*(x - mu)^2)
            [1] 1.451084
            sg = sqrt(sum(p*(x - mu)^2)); sg
            [1] 1.204609


            But the formula (exaggerating the errors) gives the standard deviation as



            sqrt(sum(p*x^2) - mu^2)
            [1] 4.517566


            I don't know what you are supposed to show as the solution to this problem.
            However, to make sense of it, I think the logical course of action is to
            adjust the probabilities so that they add to 1:



            sum(p)
            [1] 0.99
            p1 = p/sum(p); p1; sum(p1)
            [1] 0.10101010 0.23232323 0.34343434 0.25252525 0.05050505 0.02020202 # adj probs
            [1] 1 # sum to 1


            Then use adjusted probabilities from the start to get the true mean and
            standard deviation (where both the definition and formula agree):



            mu1 = sum(p1*x); mu1; sqrt(sum(p1*(x - mu1)^2)); sqrt(sum(p1*x^2) - mu1^2)
            [1] 43.9798
            [1] 1.127971
            [1] 1.127971





            share|cite|improve this answer


























              up vote
              4
              down vote



              accepted










              You are dealing with a slight inaccuracy due to rounding.
              By definition, the mean is $mu = sum_i=1^5 p_ix_i$ and the variance is $sigma^2 = sum_i=1^5p_i(x_i - mu)^2.$
              By a formula, derived from the definition,
              $$sigma^2 = E(X^2) - mu^2 = sum_i=2^5p_ix_i^2 - mu^2.$$
              However, the formula is very sensitive to round-off error.



              In R, the mean can be computed as follows:



              p = c(.1,.23,.34,.25,.05,.02); x = 42:47; mu = sum(p*x); mu
              [1] 43.54


              This agrees with what you found.



              According to the definition, the variance and standard deviation are



              sum(p*(x - mu)^2)
              [1] 1.451084
              sg = sqrt(sum(p*(x - mu)^2)); sg
              [1] 1.204609


              But the formula (exaggerating the errors) gives the standard deviation as



              sqrt(sum(p*x^2) - mu^2)
              [1] 4.517566


              I don't know what you are supposed to show as the solution to this problem.
              However, to make sense of it, I think the logical course of action is to
              adjust the probabilities so that they add to 1:



              sum(p)
              [1] 0.99
              p1 = p/sum(p); p1; sum(p1)
              [1] 0.10101010 0.23232323 0.34343434 0.25252525 0.05050505 0.02020202 # adj probs
              [1] 1 # sum to 1


              Then use adjusted probabilities from the start to get the true mean and
              standard deviation (where both the definition and formula agree):



              mu1 = sum(p1*x); mu1; sqrt(sum(p1*(x - mu1)^2)); sqrt(sum(p1*x^2) - mu1^2)
              [1] 43.9798
              [1] 1.127971
              [1] 1.127971





              share|cite|improve this answer
























                up vote
                4
                down vote



                accepted







                up vote
                4
                down vote



                accepted






                You are dealing with a slight inaccuracy due to rounding.
                By definition, the mean is $mu = sum_i=1^5 p_ix_i$ and the variance is $sigma^2 = sum_i=1^5p_i(x_i - mu)^2.$
                By a formula, derived from the definition,
                $$sigma^2 = E(X^2) - mu^2 = sum_i=2^5p_ix_i^2 - mu^2.$$
                However, the formula is very sensitive to round-off error.



                In R, the mean can be computed as follows:



                p = c(.1,.23,.34,.25,.05,.02); x = 42:47; mu = sum(p*x); mu
                [1] 43.54


                This agrees with what you found.



                According to the definition, the variance and standard deviation are



                sum(p*(x - mu)^2)
                [1] 1.451084
                sg = sqrt(sum(p*(x - mu)^2)); sg
                [1] 1.204609


                But the formula (exaggerating the errors) gives the standard deviation as



                sqrt(sum(p*x^2) - mu^2)
                [1] 4.517566


                I don't know what you are supposed to show as the solution to this problem.
                However, to make sense of it, I think the logical course of action is to
                adjust the probabilities so that they add to 1:



                sum(p)
                [1] 0.99
                p1 = p/sum(p); p1; sum(p1)
                [1] 0.10101010 0.23232323 0.34343434 0.25252525 0.05050505 0.02020202 # adj probs
                [1] 1 # sum to 1


                Then use adjusted probabilities from the start to get the true mean and
                standard deviation (where both the definition and formula agree):



                mu1 = sum(p1*x); mu1; sqrt(sum(p1*(x - mu1)^2)); sqrt(sum(p1*x^2) - mu1^2)
                [1] 43.9798
                [1] 1.127971
                [1] 1.127971





                share|cite|improve this answer














                You are dealing with a slight inaccuracy due to rounding.
                By definition, the mean is $mu = sum_i=1^5 p_ix_i$ and the variance is $sigma^2 = sum_i=1^5p_i(x_i - mu)^2.$
                By a formula, derived from the definition,
                $$sigma^2 = E(X^2) - mu^2 = sum_i=2^5p_ix_i^2 - mu^2.$$
                However, the formula is very sensitive to round-off error.



                In R, the mean can be computed as follows:



                p = c(.1,.23,.34,.25,.05,.02); x = 42:47; mu = sum(p*x); mu
                [1] 43.54


                This agrees with what you found.



                According to the definition, the variance and standard deviation are



                sum(p*(x - mu)^2)
                [1] 1.451084
                sg = sqrt(sum(p*(x - mu)^2)); sg
                [1] 1.204609


                But the formula (exaggerating the errors) gives the standard deviation as



                sqrt(sum(p*x^2) - mu^2)
                [1] 4.517566


                I don't know what you are supposed to show as the solution to this problem.
                However, to make sense of it, I think the logical course of action is to
                adjust the probabilities so that they add to 1:



                sum(p)
                [1] 0.99
                p1 = p/sum(p); p1; sum(p1)
                [1] 0.10101010 0.23232323 0.34343434 0.25252525 0.05050505 0.02020202 # adj probs
                [1] 1 # sum to 1


                Then use adjusted probabilities from the start to get the true mean and
                standard deviation (where both the definition and formula agree):



                mu1 = sum(p1*x); mu1; sqrt(sum(p1*(x - mu1)^2)); sqrt(sum(p1*x^2) - mu1^2)
                [1] 43.9798
                [1] 1.127971
                [1] 1.127971






                share|cite|improve this answer














                share|cite|improve this answer



                share|cite|improve this answer








                edited Aug 19 at 23:23

























                answered Aug 19 at 23:08









                BruceET

                33.7k71440




                33.7k71440




















                    up vote
                    4
                    down vote













                    There is a nasty trick, lying in the remark "As sometimes happens...".



                    The variance is indeed given by



                    $$V(X)=sum_i p_i(x_i-mu)^2=sum_i p_ix_i^2-mu^2$$



                    With $mu=sum_i p_ix_i$. And the standard deviation is the square root of the variance. But this equality only holds if $sum_i p_i=1$.



                    So what happened? Do again the computation with the last probability being $0.03$ instead of $0.02$, to make the probabilities sum to $1$. Both formulas yield a variance equal to $1.3499$.



                    Redo the computation with last probability $0.02$: the first formula yields a variance $1.451084$, the other formula yields the value $20.4084$. What happens is the weights do not sum to $1$.



                    Notice that the first formula yields a standard deviation $sqrt1.451084simeq1.20460948$.



                    What would be best? I suggest this: consider the $p$ as "general" weights (that is, not summing to $1$, since they don't anyway) and compute the mean and variance accordingly. Equivalently, reweight by dividing the $p_i$ by the sum. The standard deviation is then $1.127971255$.



                    Note: even if your teacher is expecting you to use blindly the first formula, the correct approximation is $1.205$, not $1.204$. But since there is a bias in the mean (too low by roughly $0.01times44$, considering the missing "mass" $0.01$ is somewhere between $42$ and $47$), thus also in the final result, I would not recommend this.



                    Another note: the exercise showed you that the first formula is more immune to numerical errors (the standard deviation returned is closer to any sensible value you might consider). You should always use this formula, and not the other one.






                    share|cite|improve this answer






















                    • Reasonable answer (+1). However, I take issue with your last sentence, the alternate formula is fine (sometimes even necessary) for theoretical discussions, and it is also OK for applications where the probabilities are not subject to roundoff error.
                      – BruceET
                      Aug 19 at 23:28










                    • @BruceET There are known problems with this formula, when the mean is much larger than the variance (it was once a criticism against MS Excel, that it could return wrong results, and even negative variance, due to uing the second formula). There are ways to compute the second formula with higher precision, and it is indeed somteimes necessary, however I do think it's better to always use the first whenever possible.
                      – Jean-Claude Arbaut
                      Aug 20 at 5:58










                    • I guess you are talking only about computation, not about theoretical uses. For the sample variance $S^2 = frac1n-1sum_i(X_i-bar X)^2$ is often more computationally stable than $frac1n-1[sum_i X_i^2 -nbar X^2],$ but some software claims a linear combination of the two is optimal.
                      – BruceET
                      Aug 20 at 6:43











                    • @BruceET Yes, that applies only to numerical computations.
                      – Jean-Claude Arbaut
                      Aug 20 at 7:07














                    up vote
                    4
                    down vote













                    There is a nasty trick, lying in the remark "As sometimes happens...".



                    The variance is indeed given by



                    $$V(X)=sum_i p_i(x_i-mu)^2=sum_i p_ix_i^2-mu^2$$



                    With $mu=sum_i p_ix_i$. And the standard deviation is the square root of the variance. But this equality only holds if $sum_i p_i=1$.



                    So what happened? Do again the computation with the last probability being $0.03$ instead of $0.02$, to make the probabilities sum to $1$. Both formulas yield a variance equal to $1.3499$.



                    Redo the computation with last probability $0.02$: the first formula yields a variance $1.451084$, the other formula yields the value $20.4084$. What happens is the weights do not sum to $1$.



                    Notice that the first formula yields a standard deviation $sqrt1.451084simeq1.20460948$.



                    What would be best? I suggest this: consider the $p$ as "general" weights (that is, not summing to $1$, since they don't anyway) and compute the mean and variance accordingly. Equivalently, reweight by dividing the $p_i$ by the sum. The standard deviation is then $1.127971255$.



                    Note: even if your teacher is expecting you to use blindly the first formula, the correct approximation is $1.205$, not $1.204$. But since there is a bias in the mean (too low by roughly $0.01times44$, considering the missing "mass" $0.01$ is somewhere between $42$ and $47$), thus also in the final result, I would not recommend this.



                    Another note: the exercise showed you that the first formula is more immune to numerical errors (the standard deviation returned is closer to any sensible value you might consider). You should always use this formula, and not the other one.






                    share|cite|improve this answer






















                    • Reasonable answer (+1). However, I take issue with your last sentence, the alternate formula is fine (sometimes even necessary) for theoretical discussions, and it is also OK for applications where the probabilities are not subject to roundoff error.
                      – BruceET
                      Aug 19 at 23:28










                    • @BruceET There are known problems with this formula, when the mean is much larger than the variance (it was once a criticism against MS Excel, that it could return wrong results, and even negative variance, due to uing the second formula). There are ways to compute the second formula with higher precision, and it is indeed somteimes necessary, however I do think it's better to always use the first whenever possible.
                      – Jean-Claude Arbaut
                      Aug 20 at 5:58










                    • I guess you are talking only about computation, not about theoretical uses. For the sample variance $S^2 = frac1n-1sum_i(X_i-bar X)^2$ is often more computationally stable than $frac1n-1[sum_i X_i^2 -nbar X^2],$ but some software claims a linear combination of the two is optimal.
                      – BruceET
                      Aug 20 at 6:43











                    • @BruceET Yes, that applies only to numerical computations.
                      – Jean-Claude Arbaut
                      Aug 20 at 7:07












                    up vote
                    4
                    down vote










                    up vote
                    4
                    down vote









                    There is a nasty trick, lying in the remark "As sometimes happens...".



                    The variance is indeed given by



                    $$V(X)=sum_i p_i(x_i-mu)^2=sum_i p_ix_i^2-mu^2$$



                    With $mu=sum_i p_ix_i$. And the standard deviation is the square root of the variance. But this equality only holds if $sum_i p_i=1$.



                    So what happened? Do again the computation with the last probability being $0.03$ instead of $0.02$, to make the probabilities sum to $1$. Both formulas yield a variance equal to $1.3499$.



                    Redo the computation with last probability $0.02$: the first formula yields a variance $1.451084$, the other formula yields the value $20.4084$. What happens is the weights do not sum to $1$.



                    Notice that the first formula yields a standard deviation $sqrt1.451084simeq1.20460948$.



                    What would be best? I suggest this: consider the $p$ as "general" weights (that is, not summing to $1$, since they don't anyway) and compute the mean and variance accordingly. Equivalently, reweight by dividing the $p_i$ by the sum. The standard deviation is then $1.127971255$.



                    Note: even if your teacher is expecting you to use blindly the first formula, the correct approximation is $1.205$, not $1.204$. But since there is a bias in the mean (too low by roughly $0.01times44$, considering the missing "mass" $0.01$ is somewhere between $42$ and $47$), thus also in the final result, I would not recommend this.



                    Another note: the exercise showed you that the first formula is more immune to numerical errors (the standard deviation returned is closer to any sensible value you might consider). You should always use this formula, and not the other one.






                    share|cite|improve this answer














                    There is a nasty trick, lying in the remark "As sometimes happens...".



                    The variance is indeed given by



                    $$V(X)=sum_i p_i(x_i-mu)^2=sum_i p_ix_i^2-mu^2$$



                    With $mu=sum_i p_ix_i$. And the standard deviation is the square root of the variance. But this equality only holds if $sum_i p_i=1$.



                    So what happened? Do again the computation with the last probability being $0.03$ instead of $0.02$, to make the probabilities sum to $1$. Both formulas yield a variance equal to $1.3499$.



                    Redo the computation with last probability $0.02$: the first formula yields a variance $1.451084$, the other formula yields the value $20.4084$. What happens is the weights do not sum to $1$.



                    Notice that the first formula yields a standard deviation $sqrt1.451084simeq1.20460948$.



                    What would be best? I suggest this: consider the $p$ as "general" weights (that is, not summing to $1$, since they don't anyway) and compute the mean and variance accordingly. Equivalently, reweight by dividing the $p_i$ by the sum. The standard deviation is then $1.127971255$.



                    Note: even if your teacher is expecting you to use blindly the first formula, the correct approximation is $1.205$, not $1.204$. But since there is a bias in the mean (too low by roughly $0.01times44$, considering the missing "mass" $0.01$ is somewhere between $42$ and $47$), thus also in the final result, I would not recommend this.



                    Another note: the exercise showed you that the first formula is more immune to numerical errors (the standard deviation returned is closer to any sensible value you might consider). You should always use this formula, and not the other one.







                    share|cite|improve this answer














                    share|cite|improve this answer



                    share|cite|improve this answer








                    edited Aug 19 at 23:13

























                    answered Aug 19 at 22:49









                    Jean-Claude Arbaut

                    14.3k63261




                    14.3k63261











                    • Reasonable answer (+1). However, I take issue with your last sentence, the alternate formula is fine (sometimes even necessary) for theoretical discussions, and it is also OK for applications where the probabilities are not subject to roundoff error.
                      – BruceET
                      Aug 19 at 23:28










                    • @BruceET There are known problems with this formula when the mean is much larger than the variance (it was once a criticism of MS Excel that it could return wrong results, even a negative variance, due to using the second formula). There are ways to compute the second formula with higher precision, and it is indeed sometimes necessary; however, I do think it's better to use the first one whenever possible.
                      – Jean-Claude Arbaut
                      Aug 20 at 5:58










                    • I guess you are talking only about computation, not about theoretical uses. For the sample variance, $S^2 = \frac{1}{n-1}\sum_i(X_i-\bar X)^2$ is often more computationally stable than $\frac{1}{n-1}\left[\sum_i X_i^2 - n\bar X^2\right]$, but some software claims a linear combination of the two is optimal.
                      – BruceET
                      Aug 20 at 6:43











                    • @BruceET Yes, that applies only to numerical computations.
                      – Jean-Claude Arbaut
                      Aug 20 at 7:07
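
                    The cancellation problem described in these comments is easy to reproduce. A small Python sketch (the data values are made up for illustration): when the mean is huge relative to the spread, the shortcut formula subtracts two nearly equal large numbers and loses all significant digits.

```python
# Hypothetical data: huge mean, tiny spread (values chosen for illustration)
data = [1e9 + d for d in (0.0, 1.0, 2.0)]
n = len(data)
mean = sum(data) / n

# Two-pass formula: subtract the mean first, then square (numerically stable)
var_stable = sum((v - mean) ** 2 for v in data) / (n - 1)

# Shortcut formula: sum of squares minus n * mean^2 (catastrophic cancellation)
var_naive = (sum(v * v for v in data) - n * mean * mean) / (n - 1)

print(var_stable)  # 1.0
print(var_naive)   # 0.0 -- all significant digits lost in the subtraction
```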
















                     
