Skip to content

Latest commit

 

History

History
22 lines (19 loc) · 905 Bytes

README.md

File metadata and controls

22 lines (19 loc) · 905 Bytes

MPXL2CSV

A Python Multiprocessing library to convert Excel(xlsx, xls) files to csv. Python based libraries are notoriously slow to process large Excel files. This library utilizes Python multiprocessing and openpyxl to process multiple Excel files in parallel so as to reduce the total time taken to convert them.

Installing

pip install mpxl2csv

Sample Usage:

Important Note: __main__ guard is needed for Python's multiprocessing code to work. So it is advisable to wrap the code in a function as shown below and call it under the __main__ guard.

import os 
from mpxl2csv import Excel2Csv

def main():
    mp = Excel2Csv(num_processes=3, delimiter="|")
    xl_base_path = os.path.join(os.getcwd(), "sample_input")
    csv_base_path = os.path.join(os.getcwd(), "sample_output")
    mp.convert(xl_base_path, csv_base_path) 

if __name__ == "__main__":
    main()