Finding the standard deviation of a probability distribution.
Here is the question:
The time, to the nearest whole minute, that a city bus takes to go from one end of its route to the other has the probability distribution shown.
As sometimes happens with probabilities computed as empirical relative frequencies, the probabilities in the table add up to a value other than $1.00$ because of round-off error.
$$
\begin{array}{c|cccccc}
x & 42 & 43 & 44 & 45 & 46 & 47 \\
\hline
P(x) & 0.10 & 0.23 & 0.34 & 0.25 & 0.05 & 0.02
\end{array}
$$
a. Find the average time the bus takes to drive the length of its route.
b. Find the standard deviation of the length of time the bus takes to drive the length of its route.
(Original image here.)
I did the first part and got $E(X)=43.54$, which is the correct answer. However, for the second part, I use the formula $\sigma = \sqrt{\left(\sum x^2 P(x)\right) - E(X)^2}$ and get approximately $4.517$. The answer is $1.204$. Where did I go wrong?
probability statistics probability-distributions
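For reference, both computations from the question can be reproduced in a few lines of Python (a sketch; the values are taken from the table above):

```python
# Reproduce the two standard-deviation computations from the question.
from math import sqrt

x = [42, 43, 44, 45, 46, 47]
p = [0.10, 0.23, 0.34, 0.25, 0.05, 0.02]

# Mean: E(X) = sum of x*P(x)
mean = sum(pi * xi for pi, xi in zip(p, x))

# Shortcut formula as used in the question: sqrt(E(X^2) - E(X)^2)
shortcut = sqrt(sum(pi * xi**2 for pi, xi in zip(p, x)) - mean**2)

# Definitional formula: sqrt(sum of P(x)*(x - mean)^2)
definition = sqrt(sum(pi * (xi - mean)**2 for pi, xi in zip(p, x)))

print(mean, shortcut, definition)  # ~43.54, ~4.5176, ~1.2046
```

The two formulas disagree here, which is exactly the puzzle the answers below address.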
edited Aug 23 at 16:49 (Jendrik Stelzner)
asked Aug 19 at 22:06 (numericalorange)
2 Answers

Accepted answer:
You are dealing with a slight inaccuracy due to rounding.
By definition, the mean is $\mu = \sum_{i=1}^{6} p_i x_i$ and the variance is $\sigma^2 = \sum_{i=1}^{6} p_i (x_i - \mu)^2.$
By a formula derived from the definition,
$$\sigma^2 = E(X^2) - \mu^2 = \sum_{i=1}^{6} p_i x_i^2 - \mu^2.$$
However, the formula is very sensitive to round-off error.
In R, the mean can be computed as follows:
p = c(.1,.23,.34,.25,.05,.02); x = 42:47; mu = sum(p*x); mu
[1] 43.54
This agrees with what you found.
According to the definition, the variance and standard deviation are
sum(p*(x - mu)^2)
[1] 1.451084
sg = sqrt(sum(p*(x - mu)^2)); sg
[1] 1.204609
But the formula (which exaggerates the errors) gives the standard deviation as
sqrt(sum(p*x^2) - mu^2)
[1] 4.517566
I don't know what you are supposed to show as the solution to this problem.
However, to make sense of it, I think the logical course of action is to
adjust the probabilities so that they add to 1:
sum(p)
[1] 0.99
p1 = p/sum(p); p1; sum(p1)
[1] 0.10101010 0.23232323 0.34343434 0.25252525 0.05050505 0.02020202 # adj probs
[1] 1 # sum to 1
Then use adjusted probabilities from the start to get the true mean and
standard deviation (where both the definition and formula agree):
mu1 = sum(p1*x); mu1; sqrt(sum(p1*(x - mu1)^2)); sqrt(sum(p1*x^2) - mu1^2)
[1] 43.9798
[1] 1.127971
[1] 1.127971
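The same adjustment can be written in Python for readers who don't use R (a sketch; the names `p1` and `mu1` mirror the R session above):

```python
# Renormalize the probabilities so they sum to 1, then compute the mean
# and standard deviation; both formulas now agree.
from math import sqrt

x = [42, 43, 44, 45, 46, 47]
p = [0.10, 0.23, 0.34, 0.25, 0.05, 0.02]

total = sum(p)                       # 0.99, not 1.00
p1 = [pi / total for pi in p]        # adjusted probabilities

mu1 = sum(pi * xi for pi, xi in zip(p1, x))
sd_def = sqrt(sum(pi * (xi - mu1)**2 for pi, xi in zip(p1, x)))
sd_formula = sqrt(sum(pi * xi**2 for pi, xi in zip(p1, x)) - mu1**2)

print(mu1, sd_def, sd_formula)  # ~43.9798, ~1.127971, ~1.127971
```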
answered Aug 19 at 23:08 by BruceET (edited Aug 19 at 23:23)
There is a nasty trick lurking in the remark "As sometimes happens...".
The variance is indeed given by
$$V(X)=\sum_i p_i(x_i-\mu)^2=\sum_i p_i x_i^2-\mu^2,$$
with $\mu=\sum_i p_i x_i$, and the standard deviation is the square root of the variance. But this equality only holds if $\sum_i p_i=1$.
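To see where the condition $\sum_i p_i = 1$ enters, expand the square:
$$
\sum_i p_i(x_i-\mu)^2
= \sum_i p_i x_i^2 - 2\mu\sum_i p_i x_i + \mu^2\sum_i p_i
= \sum_i p_i x_i^2 - 2\mu^2 + \mu^2\sum_i p_i,
$$
which collapses to $\sum_i p_i x_i^2 - \mu^2$ only when $\sum_i p_i = 1$.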
So what happened? Redo the computation with the last probability $0.03$ instead of $0.02$, so that the probabilities sum to $1$: both formulas then yield a variance equal to $1.3499$.
Redo it with the last probability $0.02$: the first formula yields a variance of $1.451084$, while the other formula yields $20.4084$. The discrepancy arises because the weights do not sum to $1$.
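Both checks are easy to run numerically (a sketch in Python; the values come from the table in the question):

```python
# Compare the two variance formulas with both versions of the
# last probability (0.03 makes the weights sum to 1; 0.02 does not).
x = [42, 43, 44, 45, 46, 47]

def variances(p):
    mu = sum(pi * xi for pi, xi in zip(p, x))
    by_def = sum(pi * (xi - mu)**2 for pi, xi in zip(p, x))
    by_formula = sum(pi * xi**2 for pi, xi in zip(p, x)) - mu**2
    return by_def, by_formula

print(variances([0.10, 0.23, 0.34, 0.25, 0.05, 0.03]))  # ~(1.3499, 1.3499)
print(variances([0.10, 0.23, 0.34, 0.25, 0.05, 0.02]))  # ~(1.451084, 20.4084)
```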
Notice that the first formula yields a standard deviation $\sqrt{1.451084}\simeq 1.20460948$.
What would be best? I suggest this: treat the $p_i$ as "general" weights (that is, not summing to $1$, since they don't anyway) and compute the mean and variance accordingly. Equivalently, reweight by dividing each $p_i$ by the sum. The standard deviation is then $1.127971255$.
Note: even if your teacher expects you to apply the first formula blindly, the correct approximation is $1.205$, not $1.204$. But since there is a bias in the mean (too low by roughly $0.01\times 44$, since the missing "mass" of $0.01$ lies somewhere between $42$ and $47$), and thus also in the final result, I would not recommend this.
Another note: the exercise showed you that the first formula is more robust to numerical errors (the standard deviation it returns is closer to any sensible value you might consider). You should always use this formula, and not the other one.
Reasonable answer (+1). However, I take issue with your last sentence: the alternate formula is fine (sometimes even necessary) for theoretical discussions, and it is also OK for applications where the probabilities are not subject to round-off error.
– BruceET
Aug 19 at 23:28
@BruceET There are known problems with this formula when the mean is much larger than the variance (it was once a criticism of MS Excel that it could return wrong results, and even negative variance, due to using the second formula). There are ways to compute the second formula with higher precision, and it is indeed sometimes necessary; however, I do think it's better to always use the first whenever possible.
– Jean-Claude Arbaut
Aug 20 at 5:58
I guess you are talking only about computation, not about theoretical uses. For the sample variance, $S^2 = \frac{1}{n-1}\sum_i(X_i-\bar X)^2$ is often more computationally stable than $\frac{1}{n-1}\left[\sum_i X_i^2 - n\bar X^2\right]$, but some software claims a linear combination of the two is optimal.
– BruceET
Aug 20 at 6:43
@BruceET Yes, that applies only to numerical computations.
– Jean-Claude Arbaut
Aug 20 at 7:07
answered Aug 19 at 22:49 by Jean-Claude Arbaut (edited Aug 19 at 23:13)