Calculating mean for columns, ignoring non-numerical values #777

Open
borysj opened this issue Jun 13, 2023 · 1 comment
borysj commented Jun 13, 2023

Hello,

It seems that csvstat classifies a column as text if there is even a single non-numerical value among otherwise numerical entries. Unfortunately, it is then not possible to calculate the mean of such a column by skipping the non-numerical exception(s).

It would be great if csvstat could be forced to calculate the mean over the numerical entries only, and perhaps print the number of non-numerical exceptions.
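
For illustration, the requested behaviour could look roughly like the sketch below. It is not csvstat's implementation and uses only the Python standard library; the function name `mean_ignoring_non_numeric`, the sample data, and the rule "numerical means `float()` accepts it" are assumptions made for this example. Note that under this rule empty cells count as exceptions, while strings such as `nan` or `inf` count as numbers.

```python
import csv
import io
import statistics


def mean_ignoring_non_numeric(csv_text, column):
    """Return (mean, skipped) for one column, skipping cells float() rejects.

    Hypothetical helper for illustration; not part of csvkit or agate.
    """
    values = []
    skipped = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        cell = (row[column] or "").strip()
        try:
            values.append(float(cell))
        except ValueError:
            # Non-numerical exception (including an empty cell): count it, skip it.
            skipped += 1
    mean = statistics.fmean(values) if values else None
    return mean, skipped


# Made-up sample: one non-numerical exception ("n/a") among numbers.
sample = "id,score\n1,10\n2,n/a\n3,20\n4,30\n"
mean, skipped = mean_ignoring_non_numeric(sample, "score")
print(mean, skipped)
```

Returning the skipped count alongside the mean covers the second half of the request, so the caller can see how many entries were left out.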

borysj changed the title from "csvsort: Calculating mean for columns that have non-numerical exceptions" to "csvstat: Calculating mean for columns that have non-numerical exceptions" on Jun 13, 2023
jpmckinney (Member) commented

In the meantime, could you potentially csvgrep the numerical entries, then pipe to csvstat?
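
The filter-then-aggregate idea behind this workaround can be sketched in Python as follows. This only mirrors the filtering step (roughly what `csvgrep -c <column> -r <regex>` would do before piping to `csvstat`); the regex, the function name `keep_numeric_rows`, and the sample data are assumptions for illustration, and the pattern deliberately matches only plain integers and decimals.

```python
import csv
import io
import re

# Assumed pattern: optional minus sign, digits, optional decimal part.
NUMERIC = re.compile(r"^-?\d+(\.\d+)?$")


def keep_numeric_rows(csv_text, column):
    """Return CSV text containing only rows whose `column` cell looks numeric.

    Hypothetical helper for illustration; not part of csvkit.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=reader.fieldnames,
        lineterminator="\n",
        extrasaction="ignore",
    )
    writer.writeheader()
    for row in reader:
        if NUMERIC.match(row[column] or ""):
            writer.writerow(row)
    return out.getvalue()


# Made-up sample: the "n/a" row is dropped before any statistics are computed.
filtered = keep_numeric_rows("id,score\n1,10\n2,n/a\n3,20.5\n", "score")
print(filtered)
```

One caveat of this approach: it drops whole rows, so statistics for any other column computed from the filtered output will also exclude those rows.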

jpmckinney transferred this issue from wireservice/csvkit on Oct 17, 2023
jpmckinney changed the title from "csvstat: Calculating mean for columns that have non-numerical exceptions" to "Calculating mean for columns that have non-numerical values" on Oct 17, 2023
jpmckinney changed the title from "Calculating mean for columns that have non-numerical values" to "Calculating mean for columns, ignoring non-numerical values" on Apr 27, 2024