Ordinary Least Squares Regression
I am quite surprised that a variant of linear regression has been proposed for a challenge, whereas an estimation via ordinary least squares regression has not, despite the fact that this is arguably the most widely used method in applied economics, biology, psychology, and social sciences!
For details, check out the Wikipedia page on OLS. To keep things concise, suppose one has a model

Y = b0 + b1*X1 + ... + bk*Xk + U,

where all right-hand side X variables (the regressors) are linearly independent and are assumed to be exogenous (or at least uncorrelated) with the model error U. Then you solve the problem

min over b0, ..., bk of sum_{i=1..n} (Y[i] - b0 - b1*X1[i] - ... - bk*Xk[i])^2

on a sample of size n, given observations (Y[i], X1[i], ..., Xk[i]), i = 1, ..., n.
The OLS solution to this problem, as a vector, is

beta_hat = (X'X)^(-1) X'Y,

where Y is the first column of the input matrix and X is the design matrix made of a column of ones followed by the remaining columns. This solution can be obtained via many numerical methods (matrix inversion, QR decomposition, Cholesky decomposition etc.), so pick your favourite!
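For illustration only (a rough, un-golfed sketch, assuming the input arrives as a numeric matrix m whose first column is Y), the matrix-algebra route looks like this in R:

ols <- function(m) {
  Y <- m[, 1]                           # response: first column of the input
  X <- cbind(1, m)[, -2, drop = FALSE]  # design matrix: a column of ones, then the regressors
  solve(crossprod(X), crossprod(X, Y))  # solve the normal equations (X'X) b = X'Y
}
ols(cbind(c(4,5,6), c(1,2,3)))          # returns 3 and 1, matching the first test case below

In practice a QR-based solve is numerically safer, but the normal equations keep the sketch short.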
Of course, econometricians prefer slightly different notation, but let’s just ignore them. — Do you know what econometricians use as a contraceptive? — Their personalities! (Don’t worry, I can say that, I am an econometrician...)
None other than Gauss himself is watching you from the skies, so do not disappoint one of the greatest mathematicians of all time and write the shortest code possible.
Task
Given the observations in a matrix form as shown above, estimate the coefficients of the linear regression model via OLS.
Input
A matrix of values. The first column is always Y[1], ..., Y[n], the second column is X1[1], ..., X1[n], the next one is X2[1], ..., X2[n], etc.
NB. In statistics, regressing on a constant is widely used. This means that the model is Y = b0 + U, and the OLS estimate of b0 is the sample average of Y (with a design matrix consisting of a single column of ones, (X'X)^(-1) X'Y reduces to the mean of Y). In this case, the input is just a matrix with one column, Y.
You can safely assume that the variables are not exactly collinear, so the matrices above are invertible. In case your language cannot invert matrices with condition numbers larger than a certain threshold, state it explicitly, and provide a unique return value (one that cannot be confused with output in case of success) denoting that the system seems to be computationally singular (S or any other unambiguous characters).
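In base R, for example, one non-golfed way to signal this (assuming X and Y are built as in the sketch above; the helper name is just for illustration) is to let solve() raise its computationally-singular error and catch it:

safe_solve <- function(X, Y)
  tryCatch(solve(crossprod(X), crossprod(X, Y)),  # errors if X'X is computationally singular
           error = function(e) "S")               # return the agreed marker instead of coefficients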
Output
OLS estimates of beta_0, ..., beta_k from a linear regression model, in an unambiguous format, or an indication that your language could not solve the system.
Challenge rules
- I/O formats are flexible. A matrix can be several lines of space-delimited numbers separated by newlines, or an array of row vectors, or an array of column vectors etc.
- This is code-golf, so shortest answer in bytes wins.
- Built-in functions are allowed as long as they are tweaked to produce a solution given the input matrix as an argument. That is, the two-byte answer lm in R is not a valid answer because it expects a different kind of input.
- Standard rules apply for your answer, so you are allowed to use STDIN/STDOUT, functions/methods with the proper parameters and return type, or full programs.
- Default loopholes are forbidden.
Test cases
[[4,5,6],[1,2,3]] → output: [3,1]
[[5.5,4.1,10.5,7.7,6.6,7.2],[1.9,0.4,5.6,3.3,3.8,1.7],[4.2,2.2,3.2,3.2,2.5,6.6]] → output: [2.1171050,1.1351122,0.4539268]
[[1,-2,3,-4],[1,2,3,4],[1,2,3,4.000001]] → output: [-1.3219977,6657598.3906250,-6657597.3945312] or S (any code for a computationally singular system)
[[1,2,3,4,5]] → output: 3
Bonus points (to your karma, not to your byte count) if your code can solve very ill-conditioned (quasi-multicollinear) problems (e.g. if you throw in an extra zero in the decimal part of X2[4] in Test Case 3) with high precision.
For fun: you can implement estimation of standard errors (White’s sandwich form or the simplified homoskedastic version) or any other kind of post-estimation diagnostics to amuse yourself if your language has built-in tools for that (I am looking at you, R, Python and Julia users). In other words, if your language allows you to do cool stuff with regression objects in a few bytes, you can show it!
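To give a flavour of what that might look like without built-ins, here is a rough R sketch of White’s (HC0) sandwich standard errors (names and structure are mine, not a reference implementation; with built-ins one would reach for something like sandwich::vcovHC instead):

white_se <- function(m) {
  Y <- m[, 1]
  X <- cbind(1, m)[, -2, drop = FALSE]        # intercept column plus regressors
  b <- solve(crossprod(X), crossprod(X, Y))   # OLS coefficients
  u <- as.vector(Y - X %*% b)                 # residuals
  bread <- solve(crossprod(X))                # (X'X)^(-1)
  meat  <- crossprod(X * u^2, X)              # X' diag(u^2) X
  sqrt(diag(bread %*% meat %*% bread))        # HC0 standard errors of the coefficients
}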
code-golf matrix statistics
What about built-ins: does it have to be original solver code, or is using a built-in OK?
– Kirill L.
1 hour ago
@KirillL. Added this rule: Built-in functions are allowed as long as they are tweaked to produce a solution given the input matrix as an argument. That is, the two-byte answer lm in R is not a valid answer because it expects a different kind of input.
– Andreï Kostyrka
1 hour ago
Well, I guessed that :) Still a bit unsure about I/O though. Is this valid? It takes input as a data frame, and returns nothing (just debug info) in the singular case.
– Kirill L.
1 hour ago
@KirillL. Yes, of course! Post it! (Optionally: and watch your clever answer being beaten by someone’s Pyth or MATL string of seemingly random characters.) Just specify that the input format is this: as.data.frame(matrix(c(Y,X1,...),n)), where n indicates the number of observations.
– Andreï Kostyrka
1 hour ago
1 Answer
R, 34 bytes
function(x)try(lm(V1~.,x,si=F)$co)
Try it online!
Notes:
- Takes input as a data frame (which is what lm expects).
- lm would in principle work even without a formula for input in the indicated format, but the formula is needed for the constant-only regression case.
- si=F, short for singular.ok=FALSE, throws an error in the singular case, which is then caught by try, eventually returning nothing and printing the error as debug info. Otherwise, the regression would actually return some output, just not the expected one.
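For reference, a quick way to exercise the function (the data-frame construction just follows the input format suggested in the comments above):

f <- function(x)try(lm(V1~.,x,si=F)$co)
f(as.data.frame(matrix(c(4,5,6,1,2,3),3)))                 # (Intercept) 3, V2 1 -> [3,1]
f(as.data.frame(matrix(c(1,2,3,4,5),5)))                   # constant-only case: 3
f(as.data.frame(matrix(c(1,-2,3,-4,1,2,3,4,1,2,3,4),4)))   # exactly collinear regressors: error caught by try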