Can we remove features that have zero-correlation with the target/label?

So I drew a pairplot/heatmap from the feature correlations of a dataset and noticed a set of features that bear zero correlation both with:

  • every other feature, and

  • the target/label.

A reference code snippet in Python is below:



import seaborn as sns

corr = df.corr()
sns.heatmap(corr)  # visually see how each feature correlates with the others (incl. the target)


  1. Can I drop these features to improve the accuracy of my classification problem?

  2. Can I drop these features to improve the accuracy of my classification problem, if it is explicitly given that these features are derived features?
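For illustration, a minimal sketch of how such near-zero-correlation columns could be flagged programmatically, assuming the label lives in a column named 'target' (a placeholder) and using a small tolerance rather than exact zero:

# df is the same DataFrame as in the snippet above
corr = df.corr()
near_zero = corr['target'].drop('target').abs() < 1e-3    # 1e-3 is an arbitrary tolerance, not a recommendation
candidate_features = near_zero[near_zero].index.tolist()   # features essentially uncorrelated with the target
print(candidate_features)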









Tags: classification, scikit-learn, pandas, seaborn






asked by karthiks




















          3 Answers

















          Can I drop these features to improve the accuracy of my classification problem?




If you are using a simple linear classifier, such as logistic regression, then yes. That is because your plots give you a direct visualisation of how the model could make use of the data.



As soon as you start to use a non-linear classifier that can combine features inside the learning model, it is not so straightforward. Your plots cannot exclude a complex relationship that such a model might be able to exploit. Generally, the only way to proceed is to train and test the model (using some form of cross-validation) with and without the feature.
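As a rough sketch of that with-and-without comparison using scikit-learn (the estimator, the column name 'suspect_col', and the scoring metric are placeholder choices, not prescribed by the answer):

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# X is the feature DataFrame, y the labels; 'suspect_col' is the zero-correlation feature under test
clf = RandomForestClassifier(n_estimators=200, random_state=0)
score_with = cross_val_score(clf, X, y, cv=5, scoring='accuracy').mean()
score_without = cross_val_score(clf, X.drop(columns=['suspect_col']), y, cv=5, scoring='accuracy').mean()
print(f"with feature: {score_with:.3f}  without feature: {score_without:.3f}")

If the two scores are essentially identical (within the variation across folds), the feature is a reasonable candidate for removal.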



A plot might visually show a strong non-linear relationship with zero linear correlation. For example, a complete bell curve of feature versus target would have close to zero linear correlation, yet suggest that something interesting is going on that would be useful in a predictive model. If you see plots like this, you can either try to turn them into linear relationships with some feature engineering, or treat it as evidence that you should use a non-linear model.
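A small synthetic illustration of that effect, assuming only numpy and pandas: a feature that determines the target through a symmetric relationship shows roughly zero linear correlation, while a simple engineered feature (its square) exposes the relationship:

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = (np.abs(x) > 1).astype(int)   # target depends only on the magnitude of x (symmetric relationship)

demo = pd.DataFrame({'x': x, 'x_squared': x ** 2, 'y': y})
print(demo.corr()['y'])           # corr(x, y) is near zero, while corr(x_squared, y) is strongly positive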






answered by Neil Slater (accepted)




















          • That clears the air. Thanks. I've also added a follow-up question. Do you mind answering it as well? Thanks in advance.
            – karthiks
            8 mins ago

















These uncorrelated features might still be important for the target in combination with other non-target features, so it might not be a good idea to remove them, especially if your model is a complex one.



It might be a good idea to remove one of a pair of non-target features that are highly correlated with each other, because they are likely redundant.
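As a sketch of that pairwise pruning (the 0.95 threshold is arbitrary, and df is assumed to contain only the non-target features here):

import numpy as np

corr = df.corr().abs()
# Keep only the upper triangle so each pair of features is considered once
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.95).any()]
df_reduced = df.drop(columns=to_drop)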



Still, it might be better to use dimensionality reduction techniques like PCA: PCA maximises variance without removing a feature entirely, instead folding it into the principal components.
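A minimal PCA sketch along those lines (standardising first, since PCA is variance-based; retaining 95% of the variance is a placeholder choice):

from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# X holds the numeric non-target features
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=0.95)    # keep enough components to explain ~95% of the variance
X_pca = pca.fit_transform(X_scaled)
print(X_pca.shape, pca.explained_variance_ratio_.sum())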



For ordinal or binary features, correlation won't tell you much. So the best way to test whether a feature is important when it is not correlated with the target is to directly compare the performance of a model with and without the feature. Still, different features may have different importance for different algorithms.






answered by DmytroSytro



























If I understand you correctly, you are asking whether you can remove features having zero correlation either:

1. with other features, or

2. with the label you want to predict.

Those are two different cases:



1. We usually recommend removing features that are correlated with each other (it stabilises the model). If they are zero-correlated, you cannot conclude anything here; it is by training your model that you will see whether the feature is worth keeping.



Don't drop them.





2. If a feature is strongly correlated with your label, it means a linear function (or model) should be able to predict the label well. Even if it is not correlated, that does not tell you that a non-linear model wouldn't perform well using this feature.



Don't drop this one either!



            I hope I answered your question.






answered by Atani



























• Modified the question for clarity: I meant a set of features bearing zero correlation with all other features, including the target/label. Hope that clarifies.
              – karthiks
              13 mins ago










            • Thank you for your clarification. Edited my answer accordingly.
              – Atani
              2 mins ago









