How to extend the 'summary' function to include sd, kurtosis and skew?
6 votes
R's summary function works really well on a data frame, giving, for example:
> summary(fred)
sum.count count sum value
Min. : 1.000 Min. : 1.0 Min. : 1 Min. : 0.00
1st Qu.: 1.000 1st Qu.: 6.0 1st Qu.: 7 1st Qu.:35.82
Median : 1.067 Median : 9.0 Median : 10 Median :42.17
Mean : 1.238 Mean : 497.1 Mean : 6120 Mean :43.44
3rd Qu.: 1.200 3rd Qu.: 35.0 3rd Qu.: 40 3rd Qu.:51.31
Max. :40.687 Max. :64425.0 Max. :2621278 Max. :75.95
What I'd like to do is modify the function so it also gives, after 'Mean', an entry for the standard deviation, the kurtosis and the skew.
What's the best way to do this? I've researched this a bit, and adding a function with a method doesn't work for me:
> summary.class <- function(x)
return(sd(x))
The above is just ignored. I suppose that I need to understand how to define all classes to return.
r std summary skew kurtosis
asked 5 hours ago by Peter Brooks; edited 2 hours ago by Tung
summary.data.frame <- function(...) { tt <- base::summary.data.frame(...); <code to modify tt>; return(tt) }
– Ben Bolker, 4 hours ago
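On why `summary.class` is ignored: S3 dispatch builds the method name from the object's actual class, and a data frame has class "data.frame", not "class". Below is a minimal sketch of adding a method without masking `summary.data.frame`. It assumes a toy stand-in for `fred` and a hypothetical extra class named `extended`; neither comes from the original post.

```r
# Toy stand-in for the question's data frame (values are made up).
fred <- data.frame(count = c(1, 6, 9, 35), value = c(0, 35.8, 42.2, 51.3))
class(fred)  # "data.frame", so summary() looks for summary.data.frame

# Assumption: tag the object with an extra class ("extended" is hypothetical)
# so that summary() dispatches to summary.extended first.
class(fred) <- c("extended", class(fred))

summary.extended <- function(object, ...) {
  base_tab <- NextMethod()  # falls through to the usual summary.data.frame
  num <- vapply(object, is.numeric, logical(1))
  sds <- vapply(object[num], sd, numeric(1), na.rm = TRUE)
  list(table = base_tab, sd = sds)
}

res <- summary(fred)
res$table  # the standard summary table
res$sd     # standard deviation per numeric column
```

Returning a list keeps the standard table untouched. Splicing extra rows into the table itself, as the comment suggests, is also possible but requires matching its character formatting.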
Possible duplicate of "R extended summary numerical values including kurtosis, skew, etc?"
– Tung, 2 hours ago
2 Answers
Accepted answer, 7 votes
How about using an existing solution from the psych package?
my.dat <- cbind(norm = rnorm(100), pois = rpois(n = 100, 10))
library(psych)
describe(my.dat)
# vars n mean sd median trimmed mad min max range skew kurtosis se
# norm 1 100 -0.02 0.98 -0.09 -0.06 0.86 -3.25 2.81 6.06 0.13 0.74 0.10
# pois 2 100 9.91 3.30 10.00 9.95 4.45 3.00 17.00 14.00 -0.07 -0.75 0.33
answered 5 hours ago by storaged
Answer, 1 vote
Another choice is the Desc function from the DescTools package, which produces both summary statistics and a plot:
library(DescTools)
Desc(iris3, plotit = TRUE)
#> -------------------------------------------------------------------------
#> iris3 (numeric)
#>
#> length n NAs unique 0s mean meanCI
#> 600 600 0 74 0 3.46 3.31
#> 100.0% 0.0% 0.0% 3.62
#>
#> .05 .10 .25 median .75 .90 .95
#> 0.20 1.10 1.70 3.20 5.10 6.20 6.70
#>
#> range sd vcoef mad IQR skew kurt
#> 7.80 1.98 0.57 2.52 3.40 0.13 -1.05
#>
#> lowest : 0.1 (5), 0.2 (29), 0.3 (7), 0.4 (7), 0.5
#> highest: 7.3, 7.4, 7.6, 7.7 (4), 7.9
The skim function from the skimr package is also a good option:
library(skimr)
skim(iris)
Skim summary statistics
n obs: 150
n variables: 5
-- Variable type:factor --------------------------------------------------------
variable missing complete n n_unique
Species 0 150 150 3
top_counts ordered
set: 50, ver: 50, vir: 50, NA: 0 FALSE
-- Variable type:numeric -------------------------------------------------------
variable missing complete n mean sd p0 p25 p50
Petal.Length 0 150 150 3.76 1.77 1 1.6 4.35
Petal.Width 0 150 150 1.2 0.76 0.1 0.3 1.3
Sepal.Length 0 150 150 5.84 0.83 4.3 5.1 5.8
Sepal.Width 0 150 150 3.06 0.44 2 2.8 3
p75 p100
5.1 6.9
1.8 2.5
6.4 7.9
3.3 4.4
answered 1 hour ago by Tung; edited 1 hour ago
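If adding a package is not an option, the same statistics can be computed in base R. The following is only a sketch: `skew_moment`, `kurt_excess` and `summary_ext` are hypothetical helper names, and the skewness and kurtosis use the simple moment formulas with no small-sample correction, so the values can differ slightly from what psych or DescTools report.

```r
# Moment-based skewness: third central moment over variance^(3/2).
skew_moment <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Moment-based excess kurtosis: fourth central moment over variance^2, minus 3.
kurt_excess <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

# Hypothetical helper: one column of statistics per numeric variable,
# with SD, skew and kurtosis placed right after the mean.
summary_ext <- function(df) {
  num <- df[vapply(df, is.numeric, logical(1))]
  stats <- function(x) {
    q <- quantile(x, c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE, names = FALSE)
    c("Min." = q[1], "1st Qu." = q[2], "Median" = q[3],
      "Mean" = mean(x, na.rm = TRUE),
      "SD" = sd(x, na.rm = TRUE),
      "Skew" = skew_moment(x),
      "Kurtosis" = kurt_excess(x),
      "3rd Qu." = q[4], "Max." = q[5])
  }
  sapply(num, stats)
}

summary_ext(iris)
```

The result is a plain numeric matrix, so it can be rounded, transposed or exported like any other matrix. Non-numeric columns such as `Species` are silently dropped.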