Asbjoedt/CLISC

CLISC

Command Line Interface Spreadsheet Count, Convert, Compare & Archive

A Windows console application made in C#. It is a prototype project for digital archiving of spreadsheets.

🏳️‍🌈 General

  • Batch convert spreadsheets in a directory to .xlsx
  • Include or exclude subdirectories recursively
  • Output results in a new directory with logs in .csv

Count

Count number of spreadsheets in directory by file format.

  • Accepted file extensions: .gsheet, .fods, .numbers, .ods, .ots, .xla, .xlam, .xls, .xlsb, .xlsm, .xlsx, .xlt, .xltm, .xltx
  • .xlsx of Transitional and Strict conformance can be counted separately
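The counting step can be illustrated with a short Python sketch. This is not CLISC's own C# code; the function names `count_spreadsheets` and `xlsx_conformance` are hypothetical. The conformance check assumes the workbook part sits at the conventional `xl/workbook.xml` location and relies on Strict workbooks using the `purl.oclc.org` SpreadsheetML namespace.

```python
import zipfile
from collections import Counter
from pathlib import Path

# Extensions the README lists as accepted spreadsheet formats.
SPREADSHEET_EXTENSIONS = {
    ".gsheet", ".fods", ".numbers", ".ods", ".ots", ".xla", ".xlam",
    ".xls", ".xlsb", ".xlsm", ".xlsx", ".xlt", ".xltm", ".xltx",
}

# Main SpreadsheetML namespace used by Strict conformance workbooks.
STRICT_NAMESPACE = b"http://purl.oclc.org/ooxml/spreadsheetml/main"


def count_spreadsheets(input_dir, recurse=False):
    """Return a Counter mapping lowercase extension to number of files."""
    root = Path(input_dir)
    candidates = root.rglob("*") if recurse else root.glob("*")
    counts = Counter()
    for path in candidates:
        extension = path.suffix.lower()
        if path.is_file() and extension in SPREADSHEET_EXTENSIONS:
            counts[extension] += 1
    return counts


def xlsx_conformance(path):
    """Return 'strict' or 'transitional' for an .xlsx package."""
    # Assumption: the workbook part lives at the conventional xl/workbook.xml.
    with zipfile.ZipFile(path) as package:
        workbook_xml = package.read("xl/workbook.xml")
    return "strict" if STRICT_NAMESPACE in workbook_xml else "transitional"
```

Calling `count_spreadsheets(r"c:\folder", recurse=True)` would include subdirectories, mirroring the `--recurse` option.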

🪄 Convert

Convert any spreadsheet [1][2] to .xlsx (Transitional conformance).

  • Office Open XML (Excel) with extensions: .xlsb, .xlsm, .xltm, .xltx, and .xlsx with Strict conformance
  • Legacy Microsoft Excel with extensions: .xls, .xlt
  • OpenDocument with extensions: .fods, .ods, .ots
  • Apple Numbers with extension: .numbers
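As a rough illustration of how an OpenDocument file could be converted outside CLISC, the Python sketch below shells out to LibreOffice in headless mode. It is not CLISC's implementation. The helper names are hypothetical, the path to the `soffice` executable must be supplied by the caller, and reading the 150MB limit from footnote 2 as 150 × 1024 × 1024 bytes is an assumption.

```python
import subprocess
from pathlib import Path

# Footnote 2 states a 150MB limit; treating MB as 1024 * 1024 bytes is an assumption.
MAX_CONVERT_BYTES = 150 * 1024 * 1024


def within_size_limit(size_in_bytes):
    """Return True if a file of this size is small enough to convert."""
    return size_in_bytes <= MAX_CONVERT_BYTES


def build_convert_command(soffice, source, output_dir):
    """Build the LibreOffice headless command that converts one file to .xlsx."""
    return [
        str(soffice), "--headless",
        "--convert-to", "xlsx",
        "--outdir", str(output_dir),
        str(source),
    ]


def convert_to_xlsx(soffice, source, output_dir):
    """Convert one spreadsheet and return the expected output path."""
    source = Path(source)
    if not within_size_limit(source.stat().st_size):
        raise ValueError(f"{source} exceeds the conversion size limit")
    # check=True raises CalledProcessError if LibreOffice exits with an error.
    subprocess.run(build_convert_command(soffice, source, output_dir), check=True)
    return Path(output_dir) / (source.stem + ".xlsx")
```

Files over the limit raise an error here so that they can be set aside for manual conversion, as footnote 2 recommends.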

🔍 Compare

Compare original and converted spreadsheets and log the differences [3].

  • Cell values
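CLISC relies on Beyond Compare 4 for this step. The Python sketch below only illustrates the idea of a cell-value comparison and is not how CLISC does it. It assumes each sheet has already been loaded into a dictionary keyed by cell reference; the function name `compare_cell_values` is hypothetical.

```python
def compare_cell_values(original, converted):
    """Return (cell, original_value, converted_value) for every differing cell.

    Both arguments map a cell reference such as "A1" to its value.
    A cell missing from one side is reported with None on that side.
    """
    differences = []
    for cell in sorted(set(original) | set(converted)):
        before = original.get(cell)
        after = converted.get(cell)
        if before != after:
            differences.append((cell, before, after))
    return differences


original = {"A1": 1, "B1": "x"}
converted = {"A1": 1, "B1": "y", "C1": 3}
print(compare_cell_values(original, converted))
# prints [('B1', 'x', 'y'), ('C1', None, 3)]
```

As footnote 3 notes, a comparison of this kind says nothing about formatting, embedded objects or charts.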

🗄️ Archive

The program can convert, package and describe spreadsheets to meet a data quality level that will enable you to open them many years from now.

  • Convert any spreadsheet [1][2] to both .xlsx (Strict conformance) and .ods
  • Package spreadsheets and metadata in a new archive directory
  • Output each conversion in its own subdirectory, numbered sequentially (n+1)
  • Rename all conversions to 1.xlsx and 1.ods
  • Include copies of the original spreadsheets, including password-protected or otherwise unreadable files
  • Validate each spreadsheet against its file format standard (Office Open XML or OpenDocument)
  • Remove cell formula references to other spreadsheets but keep cell values
  • Remove data connections but keep cell values
  • Remove RealTimeData (RTD) functions but keep cell values
  • Remove external object references but copy objects to new subfolder
  • Convert embedded images to .tif
  • Extract all embedded objects to new subfolder
  • Warn if no cell values or objects detected
  • Inform if metadata detected
  • Inform if hyperlinks detected
  • The full compliance option (--fullcompliance) involves:
    • Remove printer settings
    • Remove absolute path to local directory
    • Make first sheet active
    • Extract metadata but keep in spreadsheet
    • Extract hyperlinks but keep in spreadsheet
  • Calculate file checksums
  • Zip the archive directory
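The packaging steps at the end of the list (numbered subdirectories, copies of the originals, checksums, zipping) can be sketched in Python as follows. This is a simplified illustration, not CLISC's code. The function names are hypothetical, and SHA-256 is an assumption because the README does not say which checksum algorithm is used.

```python
import hashlib
import shutil
from pathlib import Path


def file_checksum(path, algorithm="sha256"):
    """Hash a file in chunks so large spreadsheets are not read into memory at once."""
    digest = hashlib.new(algorithm)  # Assumption: the README does not name the algorithm.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_archive(archive_dir, originals):
    """Copy originals into subdirectories 1, 2, 3, ..., checksum them, zip the result."""
    archive_dir = Path(archive_dir)
    checksums = {}
    for number, original in enumerate(originals, start=1):
        target_dir = archive_dir / str(number)
        target_dir.mkdir(parents=True, exist_ok=True)
        copy = target_dir / Path(original).name
        shutil.copy2(original, copy)
        checksums[copy.relative_to(archive_dir).as_posix()] = file_checksum(copy)
    # Writes "<archive_dir>.zip" next to the archive directory.
    zip_path = shutil.make_archive(str(archive_dir), "zip", root_dir=str(archive_dir))
    return checksums, zip_path
```

The sketch stores checksums keyed by path relative to the archive directory, so they remain valid after the zip is moved or unpacked elsewhere.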

Dependencies

⚠️ Beyond Compare 4

  • Required if you want to use the compare function
  • Install the program in its default directory, or create an environment variable "BeyondCompare" containing the path to your installation

⚠️ LibreOffice

  • Required if you want to convert OpenDocument spreadsheets and/or use the archiving method
  • Install the program in its default directory, or create an environment variable "LibreOffice" containing the path to your installation

⚠️ Microsoft Excel

  • Required if you want to convert legacy Excel spreadsheets and/or use the archiving method, which converts .xlsx conformance from Transitional to Strict

⚠️ ODF Validator 0.11.0

  • Required if you want to validate .ods spreadsheets
  • Install the program in "C:\Program Files\ODF Validator" and name it "odfvalidator-0.11.0-jar-with-dependencies.jar", or create an environment variable "ODFValidator" containing the path to your installation
  • ODF Validator requires the latest version of the Java Development Kit to be installed
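The "default directory or environment variable" rule can be pictured with a small Python sketch, shown here for ODF Validator because its default location is stated above. This is not CLISC's code. The function names are hypothetical, and the sketch assumes the "ODFValidator" variable holds the full path to the .jar file, which the README does not specify.

```python
import os
from pathlib import PureWindowsPath

# Default location and file name as given in the README.
ODF_VALIDATOR_DEFAULT = str(
    PureWindowsPath(r"C:\Program Files\ODF Validator")
    / "odfvalidator-0.11.0-jar-with-dependencies.jar"
)


def resolve_odf_validator(environ=None):
    """Return the validator path, preferring the ODFValidator environment variable."""
    environ = os.environ if environ is None else environ
    # Assumption: the variable points at the .jar itself, not at its folder.
    override = environ.get("ODFValidator")
    return override if override else ODF_VALIDATOR_DEFAULT


def build_validate_command(ods_file, environ=None):
    """Build a 'java -jar <validator> <file>' command for one .ods file."""
    return ["java", "-jar", resolve_odf_validator(environ), str(ods_file)]
```

Passing `environ` explicitly keeps the lookup easy to try out without changing real environment variables.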

How to use

Download the executable version here. No installation is needed. In your terminal, change directory to the folder containing CLISC.exe, then run the program with:

.\CLISC.exe [your_arguments]

Create your arguments from the following list:

Function to use (required, pick one of the four)

--function Count
--function CountConvert
--function CountConvertCompare
--function CountConvertCompareArchive

Input directory (required)

--inputdir "[path to input directory]"

Output directory (required)

--outputdir "[path to output directory]"

Include subdirectories from input directory (optional, by default false)

--recurse

Change data to meet all archival requirements (optional, by default false)

--fullcompliance

Example of full usage

.\CLISC.exe --function CountConvertCompareArchive --inputdir "c:\folder" --outputdir "c:\folder" --recurse --fullcompliance

or shorter

.\CLISC.exe -f CountConvertCompareArchive -i "c:\folder" -o "c:\folder" -r -c
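If you want to call CLISC from a script, the argument list can be assembled programmatically. The Python sketch below uses only the long-form options documented above. The helper name `build_clisc_args` is hypothetical, and the default executable path assumes the script runs from the folder containing CLISC.exe.

```python
import subprocess

# The four function names documented in the README.
FUNCTIONS = (
    "Count",
    "CountConvert",
    "CountConvertCompare",
    "CountConvertCompareArchive",
)


def build_clisc_args(function, input_dir, output_dir,
                     recurse=False, full_compliance=False,
                     exe=r".\CLISC.exe"):
    """Return the argument list for one CLISC run."""
    if function not in FUNCTIONS:
        raise ValueError(f"Unknown function: {function}")
    args = [exe, "--function", function,
            "--inputdir", input_dir,
            "--outputdir", output_dir]
    if recurse:
        args.append("--recurse")
    if full_compliance:
        args.append("--fullcompliance")
    return args


def run_clisc(*args, **kwargs):
    """Run CLISC and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_clisc_args(*args, **kwargs), check=True)
```

Passing the arguments as a list rather than a single string avoids quoting problems with directory paths that contain spaces.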

If you want to test the application, a sample dataset is provided here.

Packages and software

The following packages and software are used under license in CLISC. Read more.

Footnotes

  1. See the definition of accepted spreadsheet file formats.

  2. The program currently has a conversion file size limit of 150 MB to prevent excessive performance bottlenecks. Larger spreadsheets should be converted manually.

  3. The program cannot currently compare cell formatting, embedded objects, charts and other advanced spreadsheet features.