Remove h5cpp in favor of just HDF5 C API. #314

Open: brettviren wants to merge 6 commits into base: master

@brettviren brettviren commented Jun 21, 2024

This is to address #192

To start, we limit the scope:

  • Remove h5cpp
  • Keep only HDF5FrameTap
  • Validate a file

So far the first two are done, but HDF5 crashes, probably because the C API is not being used properly. Still, this PR has enough of a scaffold to start improving things.

@brettviren

This looks to be working now. I checked that ADC and SP input from npz comes out as HDF5 that can be plotted and looks "normal". It would be very good to check with an old job to see if anything beyond these broke. Otherwise, I am ready for merge on my side.

I have mostly tried to keep the configuration interface consistent, but there are some changes:

  1. Ignore the "anode" setting. It's a needless requirement and was being used for a suspicious reason (mucking with channel IDs).
  2. Ignore the "high_throughput" option. Not sure what this is yet. If it is important and supported by the HDF5 C API, I'll look into it.
  3. Fix spelling of "gzip" option.

Use of ignored params results in logged warnings.
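
As a rough illustration of the behavior described above (accept the old parameters but warn that they do nothing), here is a minimal Python sketch. It is not the code in this PR, which is C++. The function name `configure` and the logger name `hio` are hypothetical; only the parameter names "anode", "high_throughput" and "gzip" come from the list above.

```python
import logging

# Hypothetical logger name, chosen only for this sketch.
log = logging.getLogger("hio")

# Parameter names from the list above: still accepted, but without effect.
IGNORED = ("anode", "high_throughput")


def configure(cfg):
    """Return the effective settings and warn about ignored parameters."""
    for name in IGNORED:
        if name in cfg:
            log.warning('parameter "%s" is ignored', name)
    # A default of 0 (no compression) follows the PR description.
    return {"gzip": int(cfg.get("gzip", 0))}


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    print(configure({"anode": "some-anode", "gzip": 1}))
```

Warning instead of failing keeps old job configurations running while making it visible that a setting no longer has any effect.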

Here is a summary of performance for ADC (digitize=true) and SP (digitize=false) output as a function of gzip level and square chunk size.

These results are from an i7-4770K:

| time (s) | size | tier-gz-chunk |
|----------+------+---------------|
|     0.29 | 30M  | adc-0-32.hdf  |
|     0.27 | 30M  | adc-0-64.hdf  |
|     0.27 | 30M  | adc-0-128.hdf |
|     0.27 | 30M  | adc-0-256.hdf |
|     0.27 | 30M  | adc-0-512.hdf |
|----------+------+---------------|
|     0.62 | 19M  | adc-1-32.hdf  |
|     0.46 | 18M  | adc-1-64.hdf  |
|     0.41 | 18M  | adc-1-128.hdf |
|     0.41 | 17M  | adc-1-256.hdf |
|     0.40 | 17M  | adc-1-512.hdf |
|----------+------+---------------|
|     0.84 | 13M  | adc-2-32.hdf  |
|     0.81 | 12M  | adc-2-64.hdf  |
|     0.54 | 11M  | adc-2-128.hdf |
|     0.52 | 11M  | adc-2-256.hdf |
|     0.52 | 11M  | adc-2-512.hdf |
|----------+------+---------------|
|     0.65 | 13M  | adc-3-32.hdf  |
|     0.57 | 11M  | adc-3-64.hdf  |
|     0.56 | 11M  | adc-3-128.hdf |
|     0.56 | 9.9M | adc-3-256.hdf |
|     0.56 | 9.9M | adc-3-512.hdf |
|----------+------+---------------|
|     0.90 | 13M  | adc-4-32.hdf  |
|     0.73 | 11M  | adc-4-64.hdf  |
|     0.73 | 9.7M | adc-4-128.hdf |
|     0.76 | 9.4M | adc-4-256.hdf |
|     0.74 | 9.3M | adc-4-512.hdf |
|----------+------+---------------|
|     0.96 | 13M  | adc-5-32.hdf  |
|     0.80 | 11M  | adc-5-64.hdf  |
|     0.83 | 9.7M | adc-5-128.hdf |
|     0.89 | 9.4M | adc-5-256.hdf |
|     0.90 | 9.3M | adc-5-512.hdf |
|----------+------+---------------|
|     1.23 | 13M  | adc-9-32.hdf  |
|     1.28 | 11M  | adc-9-64.hdf  |
|     2.20 | 10M  | adc-9-128.hdf |
|     3.31 | 9.5M | adc-9-256.hdf |
|     3.61 | 9.3M | adc-9-512.hdf |
|----------+------+---------------|
|----------+------+---------------|
|     0.29 | 59M  | sig-0-32.hdf  |
|     0.29 | 59M  | sig-0-64.hdf  |
|     0.29 | 59M  | sig-0-128.hdf |
|     0.29 | 59M  | sig-0-256.hdf |
|     0.29 | 59M  | sig-0-512.hdf |
|----------+------+---------------|
|     0.47 | 2.7M | sig-1-32.hdf  |
|     0.33 | 2.1M | sig-1-64.hdf  |
|     0.31 | 1.9M | sig-1-128.hdf |
|     0.30 | 1.9M | sig-1-256.hdf |
|     0.30 | 1.9M | sig-1-512.hdf |
|----------+------+---------------|
|     0.53 | 2.1M | sig-2-32.hdf  |
|     0.37 | 1.3M | sig-2-64.hdf  |
|     0.33 | 1.1M | sig-2-128.hdf |
|     0.31 | 1.1M | sig-2-256.hdf |
|     0.31 | 1.1M | sig-2-512.hdf |
|----------+------+---------------|
|     0.60 | 2.1M | sig-3-32.hdf  |
|     0.44 | 1.3M | sig-3-64.hdf  |
|     0.40 | 1.1M | sig-3-128.hdf |
|     0.39 | 1.1M | sig-3-256.hdf |
|     0.39 | 1.1M | sig-3-512.hdf |
|----------+------+---------------|
|     0.62 | 2.1M | sig-4-32.hdf  |
|     0.44 | 1.3M | sig-4-64.hdf  |
|     0.40 | 1.1M | sig-4-128.hdf |
|     0.39 | 1.1M | sig-4-256.hdf |
|     0.39 | 1.1M | sig-4-512.hdf |
|----------+------+---------------|
|     0.63 | 2.1M | sig-5-32.hdf  |
|     0.46 | 1.3M | sig-5-64.hdf  |
|     0.41 | 1.1M | sig-5-128.hdf |
|     0.40 | 1.1M | sig-5-256.hdf |
|     0.40 | 1.1M | sig-5-512.hdf |
|----------+------+---------------|
|     0.81 | 2.1M | sig-9-32.hdf  |
|     0.72 | 1.3M | sig-9-64.hdf  |
|     0.67 | 1.1M | sig-9-128.hdf |
|     0.64 | 1.1M | sig-9-256.hdf |
|     0.64 | 1.1M | sig-9-512.hdf |
|----------+------+---------------|

To remake, run something like:

$ cp /path/to/some/adc-frame-format-file.npz adc.npz
$ cp /path/to/some/sig-frame-format-file.npz sig.npz
$ bash hio/test/frame-npz-to-hio.sh
$ grep 'Timer: WireCell::Hio::HDF5FrameTap' adc-*.log sig-*.log 
$ ls -sh sig-*.hdf adc-*.hdf

In general:

  • Stronger compression is slower but gives smaller files - duh.
  • Likewise for larger chunks, though the effect is weaker.
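
Both trends follow from how HDF5's gzip filter works: each chunk is deflated on its own, so small chunks give the compressor less context and add per-chunk overhead. The sketch below mimics that with Python's standard `zlib` module rather than HDF5 itself. The payload is made up (not a Wire-Cell frame), and chunk sizes are given in bytes as a simplification of the square element chunks used in the table.

```python
import zlib


def compress_in_chunks(data, chunk_bytes, level):
    """Deflate each fixed-size chunk independently and return the pieces."""
    return [
        zlib.compress(data[i:i + chunk_bytes], level)
        for i in range(0, len(data), chunk_bytes)
    ]


def total_size(chunks):
    """Total stored bytes across all compressed chunks."""
    return sum(len(c) for c in chunks)


# Made-up payload: long runs of zeros with a short repeating pulse, 64 KiB.
DATA = (bytes(1008) + bytes(range(16))) * 64

if __name__ == "__main__":
    for level in (0, 1, 9):
        for chunk_bytes in (1024, 4096, 16384):
            chunks = compress_in_chunks(DATA, chunk_bytes, level)
            print(f"level={level} chunk={chunk_bytes:5d} bytes={total_size(chunks)}")
```

Running it prints the stored size for each combination, which can be compared against the pattern in the table. The absolute numbers will differ from the HDF5 results.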

The default is no compression, but perhaps a gzip level of 1 or 2 with the current chunk size of 256 would make a good compromise.
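
To put numbers on that compromise, the chunk-256 rows of the table can be reduced to ratios relative to gzip level 0. The Python sketch below does that arithmetic. The rows are copied from the table above; the helper names are made up for this sketch.

```python
# Rows copied from the table above for chunk size 256: (time in s, size, file).
ROWS = [
    (0.27, "30M", "adc-0-256.hdf"),
    (0.41, "17M", "adc-1-256.hdf"),
    (0.52, "11M", "adc-2-256.hdf"),
    (0.29, "59M", "sig-0-256.hdf"),
    (0.30, "1.9M", "sig-1-256.hdf"),
    (0.31, "1.1M", "sig-2-256.hdf"),
]

UNITS = {"K": 2**10, "M": 2**20, "G": 2**30}


def parse_size(text):
    """Convert an `ls -sh` style size such as '9.9M' to bytes."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def parse_name(name):
    """Split 'tier-gz-chunk.hdf' into (tier, gzip level, chunk size)."""
    tier, gz, chunk = name.rsplit(".", 1)[0].split("-")
    return tier, int(gz), int(chunk)


def relative_to_uncompressed(rows):
    """Map (tier, gzip, chunk) to (size reduction, time increase) vs gzip 0."""
    parsed = {parse_name(n): (t, parse_size(s)) for t, s, n in rows}
    result = {}
    for (tier, gz, chunk), (t, size) in parsed.items():
        t0, size0 = parsed[(tier, 0, chunk)]
        result[(tier, gz, chunk)] = (size0 / size, t / t0)
    return result


if __name__ == "__main__":
    for key, (shrink, slowdown) in relative_to_uncompressed(ROWS).items():
        print(key, f"{shrink:.1f}x smaller", f"{slowdown:.2f}x time")
```

For example, at gzip level 1 the SP file shrinks from 59M to 1.9M (roughly a factor of 31) while the time goes from 0.29 s to 0.30 s, and the ADC file shrinks from 30M to 17M while the time goes from 0.27 s to 0.41 s.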

@HaiwangYu HaiwangYu self-assigned this Aug 27, 2024