Formal definition of the qqline used in a Q-Q plot

I'm doing some distribution fitting work and I'm looking at Q-Q plots and how they can be used visually to interpret goodness of fit.



My data is heavy-tailed so I am looking at Weibull, log-normal, Pareto and log-logistic distributions initially.



For a Weibull distribution, I understand how the points on the Q-Q plot are constructed (using the quantiles of observed data vs. the quantiles of an estimated Weibull distribution). The piece I am not clear on is how the line used in Q-Q plots is calculated/constructed.



The R documentation for the qqplot() function provides the following description:




qqnorm is a generic function the default method of which produces a normal QQ plot of the values in y. qqline adds a line to a “theoretical”, by default normal, quantile-quantile plot which passes through the probs quantiles, by default the first and third quartiles.




Another post on Cross Validated seems to indicate that the line is essentially a line constructed from the parameters of the theoretical (estimated) distribution. Is this a true statement and correct interpretation?



If a link to a formal definition could be provided I'd very much appreciate it.















asked Aug 18 at 19:31 by Jonathan Dunne; edited Aug 19 at 10:16 by Peter Mortensen (score 5)




















2 Answers


Sort of "both": the line depends on both the observed quantiles (which define the y-axis of the Q-Q plot) and the expected/theoretical/reference quantiles (which define the x-axis). The documentation (which you quote) should always be taken as the canonical reference:

‘qqline’ adds a line to a “theoretical”, by default normal, quantile-quantile plot which passes through the ‘probs’ quantiles, by default the first and third quartiles.

If in doubt, UTSL ("Use the Source, Luke"). Here's a slightly abridged and commented version of the qqline source:

    ## quantiles (0.25 and 0.75 by default) of the data
    y <- quantile(y, probs, names = FALSE, type = qtype, na.rm = TRUE)
    ## the same quantiles of the reference/theoretical distribution
    x <- distribution(probs)
    ## ...
    slope <- diff(y) / diff(x)     ## observed slope between the two quantile points
    int <- y[1L] - slope * x[1L]   ## intercept
    abline(int, slope, ...)        ## draw the line

For what it's worth, I believe that this approach (a line connecting central quantiles) is used because it fulfills the following criteria for exploratory/diagnostic approaches:

• quick (e.g. no need to run a linear regression, just find the quantiles and draw a straight line)

• robust (it only depends on the behavior of the central part of the distribution, so it won't be thrown off by weird tails)
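
To make the construction concrete, here is a minimal runnable sketch of the same calculation wrapped in a function. The name `qq_line_coef` is hypothetical (it is not part of base R); it only returns the intercept and slope of the line through the two quantile points instead of drawing it, and it leaves out the other options that `qqline` offers.

```r
# Hypothetical helper (not part of base R): intercept and slope of the line
# through the two (reference quantile, sample quantile) points.
qq_line_coef <- function(y, distribution = qnorm,
                         probs = c(0.25, 0.75), qtype = 7) {
  stopifnot(length(probs) == 2, is.function(distribution))
  yq <- quantile(y, probs, names = FALSE, type = qtype, na.rm = TRUE)  # sample quantiles
  xq <- distribution(probs)                                            # reference quantiles
  slope <- diff(yq) / diff(xq)
  intercept <- yq[1L] - slope * xq[1L]
  c(intercept = intercept, slope = slope)
}

# Simulated data, purely for illustration
set.seed(1)
y <- rnorm(200, mean = 10, sd = 2)
coefs <- qq_line_coef(y)

qqnorm(y)
abline(coefs[["intercept"]], coefs[["slope"]], col = "red", lwd = 2)
qqline(y, lty = 2)  # should lie on top of the red line
```

With a normal reference distribution the slope is the sample interquartile range divided by the interquartile range of the standard normal, and the intercept is the midpoint of the two sample quartiles, so the line amounts to a rough, robust estimate of scale and location taken from the data rather than from separately fitted parameters.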






answered Aug 18 at 21:01 by Ben Bolker; edited Aug 18 at 21:09 (accepted answer, score 6)






















I think it simply draws the straight line through the two points (x1, y1) and (x2, y2) for the given probabilities (p1, p2).

(x1, x2) are the quantiles of the theoretical distribution; (y1, y2) are the corresponding quantiles of the data. The function qqline has simple code under the hood. Here is a simple example in R:

    # sample data
    set.seed(2)
    y <- rt(100, df = 5)

    # get the values
    probs <- c(0.25, 0.75)
    x1 <- qnorm(probs[1])
    x2 <- qnorm(probs[2])
    y1 <- quantile(y, probs[1])
    y2 <- quantile(y, probs[2])

    # plot
    qqnorm(y)
    segments(x1, y1, x2, y2, col = "red", lwd = 2)
    qqline(y, lty = 2)
    # y = x reference line. With more samples, qqline converges to this only if
    # the data really follow the reference distribution; here the data are t with
    # 5 df, so the slope tends to qt(0.75, 5) / qnorm(0.75), about 1.08, not 1
    abline(0, 1)
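
The same idea carries over to the Weibull case from the question. The sketch below is only illustrative: the data are simulated, and `shape_hat` and `scale_hat` are assumed values standing in for whatever a fit (for example with `MASS::fitdistr`) would return. It passes a Weibull quantile function to `qqline` through its `distribution` argument and also computes the same line by hand.

```r
# Illustrative sketch: Q-Q plot against a Weibull reference distribution.
# shape_hat and scale_hat are assumed values, not estimates from real data.
set.seed(42)
shape_hat <- 1.5
scale_hat <- 2.0
y <- rweibull(200, shape = shape_hat, scale = scale_hat)

# Quantile function of the reference (fitted) distribution
q_ref <- function(p) qweibull(p, shape = shape_hat, scale = scale_hat)

# Points: reference quantiles at the plotting positions vs. sorted data
theo <- q_ref(ppoints(length(y)))
plot(theo, sort(y),
     xlab = "Theoretical Weibull quantiles", ylab = "Sample quantiles")

# Line through the first and third quartiles, as drawn by qqline()
qqline(y, distribution = q_ref, col = "red")

# The same line computed by hand
probs <- c(0.25, 0.75)
yq <- quantile(y, probs, names = FALSE)
xq <- q_ref(probs)
slope <- diff(yq) / diff(xq)
intercept <- yq[1] - slope * xq[1]

# y = x: where the points would fall if the reference distribution were exact
abline(0, 1, lty = 2)
```

When the reference parameters describe the data well, the red quartile line should sit close to the dashed y = x line; a clear gap between the two suggests the assumed shape or scale is off.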






answered Aug 18 at 21:04 by Jonny Phelps (score 4)



























                               
