GLMM for count data using square root link in lme4

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
3
down vote

favorite












I have data from a field survey. The objective of the study is to relate number of seedling (respond variable, count data), landform (exploratory variable, categorical variable with 3 levels) and percent canopy coverage (exploratory variable, quantitative).
In each habitat, I have data from five 25x25 meter plots. Within each plot I used three 2x2 meter subplots nested within the bigger plot, and number of seedlings were count from these subplots. Total number of observations is 60; 20 plots x 3 subplots.
Only one kind of landform contained seedlings. Other two landforms contained no seedlings, so they are empty plots. the data looks like this:



data.frame': 60 obs. of 5 variables:
$ plot : Factor w/ 20 levels "HD2","LC1","LC2",..: 16 16 16 17 17 17 12 12 12 9 ...
$
subplot : Factor w/ 60 levels "HD2.1","HD2.2",..: 46 47 48 49 50 51 34 35 36 25 ...
$ av.canopy : num 92.2 92.2 92.2 92.3 92.3 ...
$
landform : Factor w/ 3 levels "abandoned","ridge",..: 2 2 2 2 2 2 2 2 2 2 ...
$ seedling.2016: int 6 7 5 2 5 4 4 6 4 0 ...


The problem is when I compared number of seedlings between landforms using this code:



seedling <- glmer(seedling.2016 ~ landform +(1|plot), family = poisson)


The result does not make sense for me-there were no any significant different between landforms event there is only one landform (ridge) that has seedlings, while other had no seedlings at al. It is also note that SEs are enormous.
The result looks like this:



Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: poisson ( log )
Formula: seedling.2016 ~ landform + (1 | plot)
Data: streblus.subplots

AIC BIC logLik deviance df.resid
294.9 303.3 -143.5 286.9 56

Scaled residuals:
Min 1Q Median 3Q Max
-6.3619 -0.0001 -0.0001 0.0000 8.7305

Random effects:
Groups Name Variance Std.Dev.
plot (Intercept) 2.637 1.624
Number of obs: 60, groups: plot, 20

Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -20.412 1461.267 -0.014 0.989
landformridge 22.250 1461.265 0.015 0.988
landformtemp 1.066 390.540 0.003 0.998


When I changed link function to square root as this code:



Seedling2 <- glmer(seedling.2016 ~ landform + (1|plot), family = poisson(link = sqrt))
Fixed effects:
#Estimate Std. Error z value Pr(>|z|)
#(Intercept) -1.220e-05 5.296e-01 0.000 1
#landformridge 3.250e+00 7.429e-01 4.376 1.21e-05 ***
# landformtemp 1.018e-05 7.795e-01 0.000 1


Now number of seedlings in ridge is significantly higher than the other, and it makes more sense to me.



My question is: Is it valid in term of statistics to use square root link with Poisson distribution, there are any better methods available with better ground of statistics?










share|cite|improve this question





























    up vote
    3
    down vote

    favorite












    I have data from a field survey. The objective of the study is to relate number of seedling (respond variable, count data), landform (exploratory variable, categorical variable with 3 levels) and percent canopy coverage (exploratory variable, quantitative).
    In each habitat, I have data from five 25x25 meter plots. Within each plot I used three 2x2 meter subplots nested within the bigger plot, and number of seedlings were count from these subplots. Total number of observations is 60; 20 plots x 3 subplots.
    Only one kind of landform contained seedlings. Other two landforms contained no seedlings, so they are empty plots. the data looks like this:



    data.frame': 60 obs. of 5 variables:
    $ plot : Factor w/ 20 levels "HD2","LC1","LC2",..: 16 16 16 17 17 17 12 12 12 9 ...
    $
    subplot : Factor w/ 60 levels "HD2.1","HD2.2",..: 46 47 48 49 50 51 34 35 36 25 ...
    $ av.canopy : num 92.2 92.2 92.2 92.3 92.3 ...
    $
    landform : Factor w/ 3 levels "abandoned","ridge",..: 2 2 2 2 2 2 2 2 2 2 ...
    $ seedling.2016: int 6 7 5 2 5 4 4 6 4 0 ...


    The problem is when I compared number of seedlings between landforms using this code:



    seedling <- glmer(seedling.2016 ~ landform +(1|plot), family = poisson)


    The result does not make sense for me-there were no any significant different between landforms event there is only one landform (ridge) that has seedlings, while other had no seedlings at al. It is also note that SEs are enormous.
    The result looks like this:



    Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
    Family: poisson ( log )
    Formula: seedling.2016 ~ landform + (1 | plot)
    Data: streblus.subplots

    AIC BIC logLik deviance df.resid
    294.9 303.3 -143.5 286.9 56

    Scaled residuals:
    Min 1Q Median 3Q Max
    -6.3619 -0.0001 -0.0001 0.0000 8.7305

    Random effects:
    Groups Name Variance Std.Dev.
    plot (Intercept) 2.637 1.624
    Number of obs: 60, groups: plot, 20

    Fixed effects:
    Estimate Std. Error z value Pr(>|z|)
    (Intercept) -20.412 1461.267 -0.014 0.989
    landformridge 22.250 1461.265 0.015 0.988
    landformtemp 1.066 390.540 0.003 0.998


    When I changed link function to square root as this code:



    Seedling2 <- glmer(seedling.2016 ~ landform + (1|plot), family = poisson(link = sqrt))
    Fixed effects:
    #Estimate Std. Error z value Pr(>|z|)
    #(Intercept) -1.220e-05 5.296e-01 0.000 1
    #landformridge 3.250e+00 7.429e-01 4.376 1.21e-05 ***
    # landformtemp 1.018e-05 7.795e-01 0.000 1


    Now number of seedlings in ridge is significantly higher than the other, and it makes more sense to me.



    My question is: Is it valid in term of statistics to use square root link with Poisson distribution, there are any better methods available with better ground of statistics?










    share|cite|improve this question

























      up vote
      3
      down vote

      favorite









      up vote
      3
      down vote

      favorite











      I have data from a field survey. The objective of the study is to relate number of seedling (respond variable, count data), landform (exploratory variable, categorical variable with 3 levels) and percent canopy coverage (exploratory variable, quantitative).
      In each habitat, I have data from five 25x25 meter plots. Within each plot I used three 2x2 meter subplots nested within the bigger plot, and number of seedlings were count from these subplots. Total number of observations is 60; 20 plots x 3 subplots.
      Only one kind of landform contained seedlings. Other two landforms contained no seedlings, so they are empty plots. the data looks like this:



      data.frame': 60 obs. of 5 variables:
      $ plot : Factor w/ 20 levels "HD2","LC1","LC2",..: 16 16 16 17 17 17 12 12 12 9 ...
      $
      subplot : Factor w/ 60 levels "HD2.1","HD2.2",..: 46 47 48 49 50 51 34 35 36 25 ...
      $ av.canopy : num 92.2 92.2 92.2 92.3 92.3 ...
      $
      landform : Factor w/ 3 levels "abandoned","ridge",..: 2 2 2 2 2 2 2 2 2 2 ...
      $ seedling.2016: int 6 7 5 2 5 4 4 6 4 0 ...


      The problem is when I compared number of seedlings between landforms using this code:



      seedling <- glmer(seedling.2016 ~ landform +(1|plot), family = poisson)


      The result does not make sense for me-there were no any significant different between landforms event there is only one landform (ridge) that has seedlings, while other had no seedlings at al. It is also note that SEs are enormous.
      The result looks like this:



      Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
      Family: poisson ( log )
      Formula: seedling.2016 ~ landform + (1 | plot)
      Data: streblus.subplots

      AIC BIC logLik deviance df.resid
      294.9 303.3 -143.5 286.9 56

      Scaled residuals:
      Min 1Q Median 3Q Max
      -6.3619 -0.0001 -0.0001 0.0000 8.7305

      Random effects:
      Groups Name Variance Std.Dev.
      plot (Intercept) 2.637 1.624
      Number of obs: 60, groups: plot, 20

      Fixed effects:
      Estimate Std. Error z value Pr(>|z|)
      (Intercept) -20.412 1461.267 -0.014 0.989
      landformridge 22.250 1461.265 0.015 0.988
      landformtemp 1.066 390.540 0.003 0.998


      When I changed link function to square root as this code:



      Seedling2 <- glmer(seedling.2016 ~ landform + (1|plot), family = poisson(link = sqrt))
      Fixed effects:
      #Estimate Std. Error z value Pr(>|z|)
      #(Intercept) -1.220e-05 5.296e-01 0.000 1
      #landformridge 3.250e+00 7.429e-01 4.376 1.21e-05 ***
      # landformtemp 1.018e-05 7.795e-01 0.000 1


      Now number of seedlings in ridge is significantly higher than the other, and it makes more sense to me.



      My question is: Is it valid in term of statistics to use square root link with Poisson distribution, there are any better methods available with better ground of statistics?










      share|cite|improve this question















      I have data from a field survey. The objective of the study is to relate number of seedling (respond variable, count data), landform (exploratory variable, categorical variable with 3 levels) and percent canopy coverage (exploratory variable, quantitative).
      In each habitat, I have data from five 25x25 meter plots. Within each plot I used three 2x2 meter subplots nested within the bigger plot, and number of seedlings were count from these subplots. Total number of observations is 60; 20 plots x 3 subplots.
      Only one kind of landform contained seedlings. Other two landforms contained no seedlings, so they are empty plots. the data looks like this:



      data.frame': 60 obs. of 5 variables:
      $ plot : Factor w/ 20 levels "HD2","LC1","LC2",..: 16 16 16 17 17 17 12 12 12 9 ...
      $
      subplot : Factor w/ 60 levels "HD2.1","HD2.2",..: 46 47 48 49 50 51 34 35 36 25 ...
      $ av.canopy : num 92.2 92.2 92.2 92.3 92.3 ...
      $
      landform : Factor w/ 3 levels "abandoned","ridge",..: 2 2 2 2 2 2 2 2 2 2 ...
      $ seedling.2016: int 6 7 5 2 5 4 4 6 4 0 ...


      The problem is when I compared number of seedlings between landforms using this code:



      seedling <- glmer(seedling.2016 ~ landform +(1|plot), family = poisson)


      The result does not make sense for me-there were no any significant different between landforms event there is only one landform (ridge) that has seedlings, while other had no seedlings at al. It is also note that SEs are enormous.
      The result looks like this:



      Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
      Family: poisson ( log )
      Formula: seedling.2016 ~ landform + (1 | plot)
      Data: streblus.subplots

      AIC BIC logLik deviance df.resid
      294.9 303.3 -143.5 286.9 56

      Scaled residuals:
      Min 1Q Median 3Q Max
      -6.3619 -0.0001 -0.0001 0.0000 8.7305

      Random effects:
      Groups Name Variance Std.Dev.
      plot (Intercept) 2.637 1.624
      Number of obs: 60, groups: plot, 20

      Fixed effects:
      Estimate Std. Error z value Pr(>|z|)
      (Intercept) -20.412 1461.267 -0.014 0.989
      landformridge 22.250 1461.265 0.015 0.988
      landformtemp 1.066 390.540 0.003 0.998


      When I changed link function to square root as this code:



      Seedling2 <- glmer(seedling.2016 ~ landform + (1|plot), family = poisson(link = sqrt))
      Fixed effects:
      #Estimate Std. Error z value Pr(>|z|)
      #(Intercept) -1.220e-05 5.296e-01 0.000 1
      #landformridge 3.250e+00 7.429e-01 4.376 1.21e-05 ***
      # landformtemp 1.018e-05 7.795e-01 0.000 1


      Now number of seedlings in ridge is significantly higher than the other, and it makes more sense to me.



      My question is: Is it valid in term of statistics to use square root link with Poisson distribution, there are any better methods available with better ground of statistics?







      lme4-nlme count-data glmm separation link-function






      share|cite|improve this question















      share|cite|improve this question













      share|cite|improve this question




      share|cite|improve this question








      edited 37 mins ago









      kjetil b halvorsen

      26.7k977195




      26.7k977195










      asked 2 hours ago









      Ponlawat

      163




      163




















          2 Answers
          2






          active

          oldest

          votes

















          up vote
          2
          down vote













          It looks very much like you have a case of complete separation:




          • there is only one landform (ridge) that has seedlings, while other had no seedlings at al



          • large estimates ($|hat beta|>10$), and ridiculously large standard error estimates.

          Basically what's happening is that the baseline level ("abandoned") has an expected number of counts equal to zero for all plots, so the intercept $beta_0$ - which is the expected log(counts) for the baseline level - should be estimated as $-infty$ ... which messes up the Wald estimation of the uncertainty (the approximate, fast method that summary() uses).



          You can read more about complete separation elsewhere; it is more typically discussed in the context of logistic regression (in part because logistic regression is more widely used than count regression ...)



          Solutions:



          • your square-root-link solution is reasonable (in this case the intercept is expected $sqrttextrmcounts$ in the baseline level, which is zero rather than $-infty$); it will change the assumed distribution of the random effects slightly (i.e., Normal on the square-root rather than on the log scale), but that wouldn't worry me too much. If you had continuous covariates or interactions in the model, it would also change the interpretation of the fixed effects.

          • you could use some kind of penalization (most conveniently in a Bayesian framework), as described in my answer to the linked question (and here, search for "complete separation") to keep the parameters reasonable.





          share|cite|improve this answer




















          • Ben, this does not affect your excellent answer, but should the final model (whatever it is) include a random effect for subplot as well?
            – Isabella Ghement
            39 mins ago

















          up vote
          0
          down vote













          Indeed this seems to be a separation issue. To account for these cases, in my GLMMadaptive package you can include a penalty for the fixed-effects coefficients in the form of a Students-t density (i.e., for large enough df equivalent to ridge regression). For a worked example, have a look at the last section of this vignette.






          share|cite|improve this answer




















            Your Answer




            StackExchange.ifUsing("editor", function ()
            return StackExchange.using("mathjaxEditing", function ()
            StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
            StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
            );
            );
            , "mathjax-editing");

            StackExchange.ready(function()
            var channelOptions =
            tags: "".split(" "),
            id: "65"
            ;
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function()
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled)
            StackExchange.using("snippets", function()
            createEditor();
            );

            else
            createEditor();

            );

            function createEditor()
            StackExchange.prepareEditor(
            heartbeatType: 'answer',
            convertImagesToLinks: false,
            noModals: false,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            );



            );













             

            draft saved


            draft discarded


















            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f373887%2fglmm-for-count-data-using-square-root-link-in-lme4%23new-answer', 'question_page');

            );

            Post as a guest






























            2 Answers
            2






            active

            oldest

            votes








            2 Answers
            2






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            2
            down vote













            It looks very much like you have a case of complete separation:




            • there is only one landform (ridge) that has seedlings, while other had no seedlings at al



            • large estimates ($|hat beta|>10$), and ridiculously large standard error estimates.

            Basically what's happening is that the baseline level ("abandoned") has an expected number of counts equal to zero for all plots, so the intercept $beta_0$ - which is the expected log(counts) for the baseline level - should be estimated as $-infty$ ... which messes up the Wald estimation of the uncertainty (the approximate, fast method that summary() uses).



            You can read more about complete separation elsewhere; it is more typically discussed in the context of logistic regression (in part because logistic regression is more widely used than count regression ...)



            Solutions:



            • your square-root-link solution is reasonable (in this case the intercept is expected $sqrttextrmcounts$ in the baseline level, which is zero rather than $-infty$); it will change the assumed distribution of the random effects slightly (i.e., Normal on the square-root rather than on the log scale), but that wouldn't worry me too much. If you had continuous covariates or interactions in the model, it would also change the interpretation of the fixed effects.

            • you could use some kind of penalization (most conveniently in a Bayesian framework), as described in my answer to the linked question (and here, search for "complete separation") to keep the parameters reasonable.





            share|cite|improve this answer




















            • Ben, this does not affect your excellent answer, but should the final model (whatever it is) include a random effect for subplot as well?
              – Isabella Ghement
              39 mins ago














            up vote
            2
            down vote













            It looks very much like you have a case of complete separation:




            • there is only one landform (ridge) that has seedlings, while other had no seedlings at al



            • large estimates ($|hat beta|>10$), and ridiculously large standard error estimates.

            Basically what's happening is that the baseline level ("abandoned") has an expected number of counts equal to zero for all plots, so the intercept $beta_0$ - which is the expected log(counts) for the baseline level - should be estimated as $-infty$ ... which messes up the Wald estimation of the uncertainty (the approximate, fast method that summary() uses).



            You can read more about complete separation elsewhere; it is more typically discussed in the context of logistic regression (in part because logistic regression is more widely used than count regression ...)



            Solutions:



            • your square-root-link solution is reasonable (in this case the intercept is expected $sqrttextrmcounts$ in the baseline level, which is zero rather than $-infty$); it will change the assumed distribution of the random effects slightly (i.e., Normal on the square-root rather than on the log scale), but that wouldn't worry me too much. If you had continuous covariates or interactions in the model, it would also change the interpretation of the fixed effects.

            • you could use some kind of penalization (most conveniently in a Bayesian framework), as described in my answer to the linked question (and here, search for "complete separation") to keep the parameters reasonable.





            share|cite|improve this answer




















            • Ben, this does not affect your excellent answer, but should the final model (whatever it is) include a random effect for subplot as well?
              – Isabella Ghement
              39 mins ago












            up vote
            2
            down vote










            up vote
            2
            down vote









            It looks very much like you have a case of complete separation:




            • there is only one landform (ridge) that has seedlings, while other had no seedlings at al



            • large estimates ($|hat beta|>10$), and ridiculously large standard error estimates.

            Basically what's happening is that the baseline level ("abandoned") has an expected number of counts equal to zero for all plots, so the intercept $beta_0$ - which is the expected log(counts) for the baseline level - should be estimated as $-infty$ ... which messes up the Wald estimation of the uncertainty (the approximate, fast method that summary() uses).



            You can read more about complete separation elsewhere; it is more typically discussed in the context of logistic regression (in part because logistic regression is more widely used than count regression ...)



            Solutions:



            • your square-root-link solution is reasonable (in this case the intercept is expected $sqrttextrmcounts$ in the baseline level, which is zero rather than $-infty$); it will change the assumed distribution of the random effects slightly (i.e., Normal on the square-root rather than on the log scale), but that wouldn't worry me too much. If you had continuous covariates or interactions in the model, it would also change the interpretation of the fixed effects.

            • you could use some kind of penalization (most conveniently in a Bayesian framework), as described in my answer to the linked question (and here, search for "complete separation") to keep the parameters reasonable.





            share|cite|improve this answer












            It looks very much like you have a case of complete separation:




            • there is only one landform (ridge) that has seedlings, while other had no seedlings at al



            • large estimates ($|hat beta|>10$), and ridiculously large standard error estimates.

            Basically what's happening is that the baseline level ("abandoned") has an expected number of counts equal to zero for all plots, so the intercept $beta_0$ - which is the expected log(counts) for the baseline level - should be estimated as $-infty$ ... which messes up the Wald estimation of the uncertainty (the approximate, fast method that summary() uses).



            You can read more about complete separation elsewhere; it is more typically discussed in the context of logistic regression (in part because logistic regression is more widely used than count regression ...)



            Solutions:



            • your square-root-link solution is reasonable (in this case the intercept is expected $sqrttextrmcounts$ in the baseline level, which is zero rather than $-infty$); it will change the assumed distribution of the random effects slightly (i.e., Normal on the square-root rather than on the log scale), but that wouldn't worry me too much. If you had continuous covariates or interactions in the model, it would also change the interpretation of the fixed effects.

            • you could use some kind of penalization (most conveniently in a Bayesian framework), as described in my answer to the linked question (and here, search for "complete separation") to keep the parameters reasonable.






            share|cite|improve this answer












            share|cite|improve this answer



            share|cite|improve this answer










            answered 1 hour ago









            Ben Bolker

            21.1k15684




            21.1k15684











            • Ben, this does not affect your excellent answer, but should the final model (whatever it is) include a random effect for subplot as well?
              – Isabella Ghement
              39 mins ago
















            • Ben, this does not affect your excellent answer, but should the final model (whatever it is) include a random effect for subplot as well?
              – Isabella Ghement
              39 mins ago















            Ben, this does not affect your excellent answer, but should the final model (whatever it is) include a random effect for subplot as well?
            – Isabella Ghement
            39 mins ago




            Ben, this does not affect your excellent answer, but should the final model (whatever it is) include a random effect for subplot as well?
            – Isabella Ghement
            39 mins ago












            up vote
            0
            down vote













            Indeed this seems to be a separation issue. To account for these cases, in my GLMMadaptive package you can include a penalty for the fixed-effects coefficients in the form of a Students-t density (i.e., for large enough df equivalent to ridge regression). For a worked example, have a look at the last section of this vignette.






            share|cite|improve this answer
























              up vote
              0
              down vote













              Indeed this seems to be a separation issue. To account for these cases, in my GLMMadaptive package you can include a penalty for the fixed-effects coefficients in the form of a Students-t density (i.e., for large enough df equivalent to ridge regression). For a worked example, have a look at the last section of this vignette.






              share|cite|improve this answer






















                up vote
                0
                down vote










                up vote
                0
                down vote









                Indeed this seems to be a separation issue. To account for these cases, in my GLMMadaptive package you can include a penalty for the fixed-effects coefficients in the form of a Students-t density (i.e., for large enough df equivalent to ridge regression). For a worked example, have a look at the last section of this vignette.






                share|cite|improve this answer












                Indeed this seems to be a separation issue. To account for these cases, in my GLMMadaptive package you can include a penalty for the fixed-effects coefficients in the form of a Students-t density (i.e., for large enough df equivalent to ridge regression). For a worked example, have a look at the last section of this vignette.







                share|cite|improve this answer












                share|cite|improve this answer



                share|cite|improve this answer










                answered 23 mins ago









                Dimitris Rizopoulos

                2,630211




                2,630211



























                     

                    draft saved


                    draft discarded















































                     


                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function ()
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f373887%2fglmm-for-count-data-using-square-root-link-in-lme4%23new-answer', 'question_page');

                    );

                    Post as a guest













































































                    Comments

                    Popular posts from this blog

                    Long meetings (6-7 hours a day): Being “babysat” by supervisor

                    Is the Concept of Multiple Fantasy Races Scientifically Flawed? [closed]

                    Confectionery