Aoriseth/Python-Dataset-Helpers

Python-Dataset-Helpers

A collection of scripts that do common tasks when working with large datasets.

sampleFile.py

Samples every Nth line of a file.
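As a rough illustration of the idea, and not the script's actual code, every Nth line can be picked out with `itertools.islice`. The function names `sample_every_nth` and `sample_file` are hypothetical, and starting at the first line is an assumption; the real script may start counting elsewhere.

```python
from itertools import islice


def sample_every_nth(lines, n):
    """Return an iterator over every n-th line, starting with the first one.

    Assumption: sampling starts at the first line; the real script may differ.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return islice(lines, 0, None, n)


def sample_file(src_path, dst_path, n):
    """Stream src_path line by line and write every n-th line to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(sample_every_nth(src, n))
```

Because the file is read lazily, memory use stays flat regardless of file size.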

listToSqlStatements.py

Pastes specified text before and after each line in a list. Adds a commit statement every 100 lines.
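A minimal sketch of this behaviour, assuming the commit statement is the literal text `COMMIT;` and that it follows every 100th generated line. The function name `to_sql_statements` and its parameters are hypothetical and not taken from the script.

```python
def to_sql_statements(lines, prefix, suffix, commit_every=100, commit_text="COMMIT;"):
    """Wrap each line in prefix/suffix and add a commit after every N lines.

    Assumption: the commit text and its exact placement are illustrative only.
    """
    statements = []
    for count, line in enumerate(lines, start=1):
        value = line.rstrip("\n")
        statements.append(f"{prefix}{value}{suffix}")
        if count % commit_every == 0:
            statements.append(commit_text)
    return statements
```

For example, `to_sql_statements(ids, "DELETE FROM t WHERE id = ", ";")` would turn a list of IDs into delete statements. The values are pasted verbatim with no escaping, so this only suits trusted input.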

removeListFromList.py

Removes from one file every line that is listed in another file.
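One way this could be done is sketched below, assuming lines are compared exactly apart from the trailing newline. The name `remove_listed_lines` is hypothetical; the real script may match lines differently, for example by ignoring case or whitespace.

```python
def remove_listed_lines(lines, to_remove):
    """Return the lines that do not appear in to_remove.

    Assumption: comparison is exact, ignoring only the trailing newline.
    """
    # A set gives constant-time lookups, which matters for large files.
    blocked = {line.rstrip("\n") for line in to_remove}
    return [line for line in lines if line.rstrip("\n") not in blocked]
```

Only the removal list needs to fit in memory as a set; the main file could equally be streamed line by line.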

combineFilesInFolder.py

Combines all files with a given extension into a single file. Optionally keeps only the header of the first file (useful for CSV files).
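The combining step might look roughly like the following sketch. It assumes the header is exactly the first line of each file and that files are merged in sorted name order. The function name `combine_files` and the `single_header` flag are hypothetical, not the script's actual interface.

```python
from pathlib import Path


def combine_files(folder, extension, output_path, single_header=False):
    """Concatenate every file in folder ending with extension into output_path.

    Assumptions: the header is the first line of each file, and files are
    combined in sorted name order. Returns the list of source paths used.
    """
    output = Path(output_path).resolve()
    # Skip the output file itself in case it lives in the same folder.
    sources = sorted(
        p for p in Path(folder).glob(f"*{extension}") if p.resolve() != output
    )
    with open(output, "w", encoding="utf-8") as dst:
        for index, path in enumerate(sources):
            with open(path, encoding="utf-8") as src:
                for line_no, line in enumerate(src):
                    # Drop the header line of every file after the first.
                    if single_header and index > 0 and line_no == 0:
                        continue
                    # Make sure a missing final newline does not glue rows together.
                    dst.write(line if line.endswith("\n") else line + "\n")
    return sources
```

Streaming line by line keeps memory use low even when the individual files are large.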

semicolonToCommaSeperated.py

Replaces all commas with a dot '.'. Replaces all semicolons with a comma ','.
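The order of the two replacements matters: commas have to become dots before semicolons become commas, otherwise the new separators would be turned into dots as well. A minimal sketch, using the hypothetical function name `semicolon_to_comma`:

```python
def semicolon_to_comma(text):
    """Turn decimal commas into dots, then semicolon separators into commas.

    Assumption: this is a plain text replacement, so commas inside quoted
    text fields are converted too.
    """
    # Commas first: doing it the other way round would also convert the
    # freshly created comma separators into dots.
    return text.replace(",", ".").replace(";", ",")
```

For instance, the line `1,5;2,25;abc` becomes `1.5,2.25,abc`.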

removeDouble.py

Takes a file containing a list of words. Removes all instances of words that occur more than once.
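The description can be read in two ways, so the sketch below shows both. `drop_repeated_words` follows the literal reading, where a word that occurs more than once disappears entirely. `keep_first_occurrence` shows the other common interpretation, plain de-duplication. Both names are hypothetical, and which behaviour the script actually implements is not confirmed here.

```python
from collections import Counter


def drop_repeated_words(words):
    """Literal reading: keep only the words that occur exactly once.

    Expects a list (it is traversed twice).
    """
    counts = Counter(words)
    return [word for word in words if counts[word] == 1]


def keep_first_occurrence(words):
    """Alternative reading: keep one copy of each word, in original order."""
    # dict preserves insertion order, so this de-duplicates stably.
    return list(dict.fromkeys(words))
```

With the input `["a", "b", "a", "c"]`, the first function returns `["b", "c"]` and the second returns `["a", "b", "c"]`.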
