Formal definition of the qqline used in a Q-Q plot

I'm doing some distribution fitting work and I'm looking at Q-Q plots and how they can be used visually to interpret goodness of fit.

My data is heavy-tailed so I am looking at Weibull, log-normal, Pareto and log-logistic distributions initially.

For a Weibull distribution, I understand how the points on the Q-Q plot are constructed (using the quantiles of observed data vs. the quantiles of an estimated Weibull distribution). The piece I am not clear on is how the line used in Q-Q plots is calculated/constructed.

The R documentation for the qqplot() function provides the following description:

qqnorm is a generic function the default method of which produces a normal QQ plot of the values in y. qqline adds a line to a “theoretical”, by default normal, quantile-quantile plot which passes through the probs quantiles, by default the first and third quartiles.

Another post on Cross Validated seems to indicate that the line is essentially a line constructed from the parameters of the theoretical (estimated) distribution. Is this a true statement and correct interpretation?

If a link to a formal definition could be provided I'd very much appreciate it.

edited Aug 19 at 10:16

Peter Mortensen

asked Aug 18 at 19:31

Jonathan Dunne


2 Answers

accepted

Sort of "both": the line depends both on the observed quantiles (which define the y-axis of the QQ plot) and on the expected/theoretical/reference quantiles (which define the x-axis). The documentation (which you quote) should always be taken as the canonical reference:

‘qqline’ adds a line to a “theoretical”, by default normal, quantile-quantile plot which passes through the ‘probs’ quantiles, by default the first and third quartiles.

If in doubt, USTL ("Use the Source, Luke"), which can be found here. Here's a slightly abridged and commented version:

 ## quantiles (0.25 and 0.75 by default) of the data;
 ## y, probs, qtype and distribution are arguments of qqline()
 y <- quantile(y, probs, names=FALSE, type=qtype, na.rm = TRUE)
 ## quantiles of reference/theoretical distribution
 x <- distribution(probs)
 ## ...
 slope <- diff(y)/diff(x) ## observed slope between quantiles
 int <- y[1L]-slope*x[1L] ## intercept
 abline(int, slope, ...) ## draw the line
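Written out as a formula, the abridged code above appears to amount to the following. The symbols F^{-1} (quantile function of the reference distribution, qnorm by default), \hat{Q}_y (sample quantile function of the data), a and b are notation introduced here for illustration; they do not come from the R documentation.

```latex
\begin{aligned}
x_i &= F^{-1}(p_i), \qquad y_i = \hat{Q}_y(p_i), \qquad i = 1, 2,
      \quad (p_1, p_2) = (0.25,\ 0.75) \text{ by default},\\
b &= \frac{y_2 - y_1}{x_2 - x_1}, \qquad a = y_1 - b\,x_1,\\
\ell(x) &= a + b\,x .
\end{aligned}
```

So the line uses the data only through two sample quantiles, and the reference distribution only through the two matching theoretical quantiles.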

For what it's worth, I believe that this approach (line connecting central quantiles) is used because it fulfills the following criteria for exploratory/diagnostic approaches:

quick (e.g. no need to run a linear regression, just find the quantiles and draw a straight line)

robust (it only depends on the behavior of the central part of the distribution, won't be thrown off by weird tails)
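The robustness point can be checked numerically. The following is a minimal sketch on simulated data, not something taken from the qqline source: quartile_slope and ls_slope are hypothetical helper names, and the least-squares fit is included only as a contrast. One extreme value is planted in the upper tail and the two slopes are compared before and after.

```r
set.seed(1)
n <- 200
y <- rnorm(n)

# Same sample, but with the largest value replaced by an extreme outlier
y_out <- y
y_out[which.max(y_out)] <- 50

# Slope of the line through the first and third quartiles (as qqline does)
quartile_slope <- function(y, probs = c(0.25, 0.75)) {
  diff(quantile(y, probs, names = FALSE)) / diff(qnorm(probs))
}

# Slope of a least-squares fit to the Q-Q points, for comparison only
ls_slope <- function(y) {
  x <- qnorm(ppoints(length(y)))  # theoretical quantiles, as used by qqnorm
  unname(coef(lm(sort(y) ~ x))[2])
}

# Quartile-based slope: identical with and without the outlier
c(clean = quartile_slope(y), outlier = quartile_slope(y_out))

# Least-squares slope: pulled upwards by the single extreme point
c(clean = ls_slope(y), outlier = ls_slope(y_out))
```

Because only the largest observation is changed, the order statistics around the quartiles are untouched, so the quartile-based slope does not move at all, while the regression slope shifts noticeably.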

edited Aug 18 at 21:09

answered Aug 18 at 21:01

Ben Bolker


I think it simply draws the straight line through the points (x1, y1) and (x2, y2) for given probabilities (p1, p2).

(x1, x2) are the quantiles of the theoretical distribution; (y1, y2) are the corresponding quantiles of the data. The qqline function has simple code under the hood. Here is a simple example in R:

# sample data
set.seed(2)
y <- rt(100, df = 5)

# get the values
probs <- c(0.25, 0.75)
x1 <- qnorm(probs[1])
x2 <- qnorm(probs[2])
y1 <- quantile(y, probs[1])
y2 <- quantile(y, probs[2])

# plot
qqnorm(y)
segments(x1, y1, x2, y2, col = "red", lwd = 2)
qqline(y, lty = 2)
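# y = x reference line. qqline converges to this as the sample grows only if
# the data really follow the reference (standard normal) distribution;
# for this t-distributed sample the limiting slope is slightly above 1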
abline(0,1)
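The same two-point construction carries over to non-normal reference distributions such as the Weibull mentioned in the question. The following is a minimal sketch, assuming a Weibull reference with made-up parameters (shape = 1.5, scale = 2) and simulated data. qq_line_coefs is a hypothetical helper written for this illustration, not a base R function; in practice the shape and scale would be estimates obtained from the data.

```r
# Hypothetical helper (not part of base R): intercept and slope of the line
# through the chosen sample and reference quantiles.
qq_line_coefs <- function(y, distribution = qnorm,
                          probs = c(0.25, 0.75), qtype = 7) {
  yq <- quantile(y, probs, names = FALSE, type = qtype, na.rm = TRUE)
  xq <- distribution(probs)
  slope <- diff(yq) / diff(xq)
  c(intercept = yq[1L] - slope * xq[1L], slope = slope)
}

# Assumed parameters, chosen only for illustration
shape <- 1.5
scale <- 2
set.seed(42)
y <- rweibull(5000, shape = shape, scale = scale)

# Weibull quantile function with the parameters fixed
q_weib <- function(p) qweibull(p, shape = shape, scale = scale)

coefs <- qq_line_coefs(y, distribution = q_weib)
coefs

# Plot only in an interactive session
if (interactive()) {
  qqplot(q_weib(ppoints(length(y))), y,
         xlab = "Theoretical Weibull quantiles", ylab = "Sample quantiles")
  qqline(y, distribution = q_weib, lty = 2)  # the line as drawn by R
  abline(coefs[["intercept"]], coefs[["slope"]], col = "red")
}
```

Since the simulated data here really do come from the reference distribution, the slope should be close to 1 and the intercept close to 0. With real data, points that drift away from the line in the tails would suggest that the chosen distribution describes the tails poorly.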

answered Aug 18 at 21:04

Jonny Phelps
