Compare files with different columns count #39

Open
tmtben opened this issue Feb 5, 2020 · 5 comments

@tmtben
tmtben commented Feb 5, 2020

Hello,

Thanks for this great tool!

Here is my use case:
• my base-csv contains whole rows with many columns.
• my delta-csv contains a list of primary keys (one column only).
I would like to get a diff that compares only the primary keys.

I got this error with csvdiff version 1.3.0:

# csvdiff base.csv pk.csv
csvdiff: command failed - base-file and delta-file columns count do not match

Best regards,
Ben

@aswinkarthik
Owner

While validating both files' configuration, there is a check for the column count here

How about

csvdiff base.csv pk.csv --columns 0

This means only one column is compared, and it will also check that the index of that column is present in both CSV files.

@tmtben
Author

tmtben commented Feb 23, 2020

Using version 1.4 with "columns" flag:

# csvdiff base.csv pk.csv --columns 0
csvdiff: command failed - base-file and delta-file columns count do not match

@aswinkarthik
Owner

aswinkarthik commented Feb 23, 2020

I was thinking of introducing a feature where, if the columns flag is specified, we don't need to check whether the column counts match. All that matters is whether the specified column is present in both CSVs.

That would satisfy your requirement, I believe.
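
The proposed behaviour could be sketched roughly as below. This is only an illustration, not csvdiff's actual code: `validateColumns` and its parameters are hypothetical names, and the error text for the second case is invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// validateColumns is a hypothetical helper, not part of csvdiff.
// With no explicit column selection it keeps the strict check that both
// files have the same number of columns. With a selection it only requires
// every selected index to exist in both files.
func validateColumns(baseCols, deltaCols int, selected []int) error {
	if len(selected) == 0 {
		if baseCols != deltaCols {
			return errors.New("base-file and delta-file columns count do not match")
		}
		return nil
	}
	for _, idx := range selected {
		if idx < 0 || idx >= baseCols || idx >= deltaCols {
			return fmt.Errorf("column %d is not present in both files (base has %d columns, delta has %d)",
				idx, baseCols, deltaCols)
		}
	}
	return nil
}

func main() {
	// A 5-column base file against a 1-column key file, comparing column 0 only.
	fmt.Println(validateColumns(5, 1, []int{0}))
	// The same files without a column selection still fail the strict check.
	fmt.Println(validateColumns(5, 1, nil))
}
```

With this shape, the use case from the original report (a wide base file and a single-column key file) passes validation as long as column 0 is selected.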

@pascalbe-dev

Any update on this?

This would be really nice.
Furthermore, it would be nice if we could check which columns have been added (and with which values).

@tmtben
Author

tmtben commented Dec 24, 2020

Maybe just replace the following line

if baseRecordCount != deltaRecordCount {

with this one

if len(valueColumnPositions) == 0 && baseRecordCount != deltaRecordCount {

Now, I can use "--columns 0" to compare only the first column of two CSV files having different headers.
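
For readers who want to see what a key-only comparison amounts to, here is a minimal standalone sketch using Go's standard `encoding/csv` package. It does not reproduce csvdiff's internals; `columnValues`, `missingFrom`, and the sample data are all made up for illustration.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// columnValues parses CSV text and returns the value at index col of every row.
// Hypothetical helper for illustration only.
func columnValues(data string, col int) ([]string, error) {
	records, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	values := make([]string, 0, len(records))
	for i, rec := range records {
		if col < 0 || col >= len(rec) {
			return nil, fmt.Errorf("row %d has no column %d", i, col)
		}
		values = append(values, rec[col])
	}
	return values, nil
}

// missingFrom returns the entries of want that do not occur in have,
// preserving the order of want.
func missingFrom(want, have []string) []string {
	seen := make(map[string]struct{}, len(have))
	for _, v := range have {
		seen[v] = struct{}{}
	}
	var out []string
	for _, v := range want {
		if _, ok := seen[v]; !ok {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	base := "1,alice,paris\n2,bob,rome\n3,carol,oslo\n" // full rows
	delta := "1\n3\n4\n"                                // primary keys only

	baseKeys, _ := columnValues(base, 0)
	deltaKeys, _ := columnValues(delta, 0)

	fmt.Println(missingFrom(baseKeys, deltaKeys)) // keys only in base: [2]
	fmt.Println(missingFrom(deltaKeys, baseKeys)) // keys only in delta: [4]
}
```

Because only the selected column is read from each row, the two inputs can have different widths, which is the situation described in this issue.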

3 participants