An opinionated implementation of the 'cut' *nix utility to slice rows and columns of data.

harshasrisri/slice

Slice

I wrote this to scratch an itch: the cut utility felt a tad unintuitive. In the process, I got to address another pet peeve of mine: there is no easy tool for filtering rows by row number.

Not aiming to be performant (yet), although performance is only slightly worse than cut. Not aiming for feature parity with cut (yet) either, just a close approximation with a different set of defaults.

Default behavior

  • Doesn't have a bytes/chars mode of operation
  • Omits lines not containing a delimiter
  • Uses a space as both the input delimiter (IFS) and the output separator (OFS)
  • Trailing separators in input are not included in output
  • Treats consecutive separators as one (not configurable at the moment)
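
The defaults above can be modelled in a few lines of Python. This is only an illustrative sketch, not the tool's actual implementation: the function names `split_fields` and `process_line` are hypothetical, and dropping leading empty fields is an assumption (the list above only mentions trailing and consecutive separators).

```python
def split_fields(line, delimiter=" "):
    """Split a line into fields, treating runs of the delimiter as one.

    Empty strings produced by repeated or trailing delimiters are dropped.
    Assumption: leading delimiters are dropped the same way.
    """
    return [field for field in line.split(delimiter) if field]


def process_line(line, delimiter=" ", separator=" ", non_delimited=False):
    """Return the re-joined line, or None when the line is omitted."""
    line = line.rstrip("\n")
    if delimiter not in line:
        # Default: lines without a delimiter are omitted;
        # non_delimited=True mirrors the -n flag and keeps them.
        return line if non_delimited else None
    return separator.join(split_fields(line, delimiter))
```

For example, `process_line("a||b|", delimiter="|", separator=",")` gives `"a,b"`, while `process_line("nodelim")` gives `None` because the line contains no delimiter.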

Usage

Some examples are included.

$ slice --help
An opinionated implementation of the 'cut' *nix utility to slice rows and columns of data.

Usage: slice [OPTIONS] --fields <FIELDS> [FILES]...

Arguments:
  [FILES]...  Files to process [default: -]

Options:
  -f, --fields <FIELDS>        Fields to be extracted. See FIELD SPECIFICATION
  -r, --rows <ROWS>            Rows to be extracted. All, by default. See FIELD SPECIFICATION
  -d, --delimiter <DELIMITER>  Delimiter to be used to split fields [default: " "]
  -s, --separator <SEPARATOR>  Separator to use to print results [default: " "]
  -n, --non-delimited          Include lines that don't contain a delimiter
  -c, --complement             Complement field spec. Print all fields but those specified with -f
  -h, --help                   Print help information
  -V, --version                Print version information


FIELD SPECIFICATION:
    The required fileds to be extracted can be specified or combined like below:
        3           => Extract column 3
        4-7         => Extract fields 4,5,6,7
        -5          => Extract all fields upto and including 5, i.e 1,2,3,4,5
        6-          => Extract all fields from and including 6, ie. 6,7,8,...
        2,4,6       => Extract only fields 2, 4 and 6
        -2,5-7,9-   => Extract fields 1,2,5,6,7,9,...

Performance

Performance measured using hyperfine:

hyperfine --export-markdown hyf.md -Nw 5 -r 1000 \
    'slice inputs/pipe.txt -d "|" -f -3,12-15,6-9,22,30- -s ","' \
    'cut -d "|" -f -3,12-15,6-9,22,30- inputs/pipe.txt'

cut is still faster than slice: in the results below, slice took about 1.16 times as long as cut on average. Someday, I hope to make slice more performant.

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `cut -d "\|" -f -3,12-15,6-9,22,30- --output-delimiter "," inputs/pipe.txt` | 3.7 ± 0.7 | 3.1 | 6.7 | 1.00 |
| `slice inputs/pipe.txt -d "\|" -f -3,12-15,6-9,22,30- -s ","` | 4.2 ± 0.3 | 3.9 | 7.6 | 1.16 ± 0.24 |
