Basic: Why are Slope, Intercept in Regression considered Random Variables?

Sorry if this is too basic.



In an OLS regression given by

$$y = ax + b,$$

$b$ is the intercept and $a$ the slope.

Then $a, b$ are not numbers but random variables.

I find this confusing, since I start with data points $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$, with $x_i \neq x_j$ when $i \neq j$. Then we find the line of best fit, which gives us actual numbers $a, b$, so these seem to be constants, not random variables.

Is the reason these are called random variables that I am selecting just one of many possible values $y_j$ for a given value $x_j$?

Tags: regression, random-variable
  • Estimators of slope and intercept are random variables, because they're functions of the responses, which are random variables.
    – Glen_b♦, 4 hours ago

  • @Glen_b: So the point is that for each data set $(x_i, y_i)$ for the same variables $X, Y$ (of the same size), I would get different values for the slope and the intercept?
    – gary, 4 hours ago

  • New samples would indeed lead to different estimates (because, even assuming fixed $x$'s, you'd have different realizations of each of the $n$ corresponding $y$'s).
    – Glen_b♦, 3 hours ago

  • Thanks, Glen_b. Should I delete the question, or do you want to answer it? Or should I?
    – gary, 3 hours ago

  • I wasn't sure whether that's what you were seeking (which is why I commented, figuring you'd clarify the question if you needed something else). I am happy to post it as an answer (or you can, if you prefer).
    – Glen_b♦, 3 hours ago
1 Answer (accepted)
Estimators of slope and intercept are random variables because they're functions of the responses, which are random variables.
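
To make that first sentence concrete (a standard identity, spelled out here; it is not written out elsewhere in this thread): the least-squares slope estimator is an explicit linear function of the responses,

$$\hat{a} = \frac{\sum_i (x_i - \bar{x})(Y_i - \bar{Y})}{\sum_i (x_i - \bar{x})^2} = \sum_i c_i\, Y_i, \qquad c_i = \frac{x_i - \bar{x}}{\sum_j (x_j - \bar{x})^2},$$

and likewise $\hat{b} = \bar{Y} - \hat{a}\,\bar{x}$. With the $x$'s treated as fixed, all the randomness in $\hat{a}$ and $\hat{b}$ comes from the $Y_i$.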



New samples would lead to different estimates (because, even assuming fixed $x$'s, you'd have different realizations of each of the $n$ corresponding $Y$'s).



If we set the situation up to make the variables and their realizations a little more distinct, the picture may become clearer. Taking the $x$'s as fixed (for simplicity of exposition), you have $$Y_i = a x_i + b + \epsilon_i,$$ where $\epsilon_i$ is the error term. You draw a sample with that set of $x$'s and observe a corresponding set of $y$'s, corresponding to a particular realization of the $\epsilon_i$. Let's call that set of observed $y$ values $\mathbf{y}^{(1)}$. We repeat our sampling procedure at the same set of $x$-values and obtain a new set of responses, $\mathbf{y}^{(2)}$, and we keep going, up to $\mathbf{y}^{(k)}$, say. Each realization will have its own fitted slope and intercept, so realization $j$ has slope $a^{(j)}$ and intercept $b^{(j)}$, which will be functions of $\mathbf{y}^{(j)}$.
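
If it helps, here is a minimal simulation sketch of this exact repeat-sampling experiment, in Python with NumPy; the true values $a = 2$, $b = 1$, the noise level, and the fixed design are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true parameters, chosen only for illustration.
a_true, b_true, sigma = 2.0, 1.0, 0.5
x = np.linspace(0.0, 10.0, 20)   # fixed x's, as in the answer
k = 1000                         # number of repeated samples

slopes, intercepts = [], []
for _ in range(k):
    eps = rng.normal(0.0, sigma, size=x.size)  # new realization of the errors
    y = a_true * x + b_true + eps              # new realization of the responses
    a_hat, b_hat = np.polyfit(x, y, deg=1)     # OLS fit: returns slope, intercept
    slopes.append(a_hat)
    intercepts.append(b_hat)

# Each refit gives different numbers: the estimators have a sampling distribution.
print(np.mean(slopes), np.std(slopes))
print(np.mean(intercepts), np.std(intercepts))
```

Every pass through the loop yields different numbers $a^{(j)}, b^{(j)}$, and the printed standard deviations are Monte Carlo estimates of exactly the sampling variability that makes the estimators random variables.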





