Add benchmark for reading spike_times from units table #13

Open
oruebel opened this issue Jan 27, 2024 · 0 comments

oruebel commented Jan 27, 2024

Issue hdmf-dev/hdmf-zarr#141 discusses differences in performance when reading from an indexed column in a DynamicTable, i.e., a ragged array defined via a VectorData and a VectorIndex. The column can be read in two ways:

  1. spike_times = nwbfile_read.units.spike_times[:], which reads all values from the VectorData but does not create a ragged array
  2. spike_times = nwbfile_read.units['spike_times'][:], which reads data using the VectorIndex (which in turn reads from the VectorData to create a ragged array)

It would be useful to benchmark the difference in performance and to see whether reading ragged arrays can be optimized. The difference is likely exacerbated for remote reads, because VectorIndex likely issues many read requests to VectorData to read the individual segments, rather than first loading the data into memory and then segmenting the array.
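
To make the suspected cost concrete, here is a minimal, library-free sketch of the two access patterns. It does not use hdmf or pynwb: `CountingStore`, `read_per_row`, and `read_then_segment` are hypothetical names, and the per-row pattern is only an assumption about how VectorIndex might behave, not a confirmed description of its implementation. The index holds cumulative end offsets, one per row.

```python
from typing import List, Sequence


class CountingStore:
    """Hypothetical stand-in for a (possibly remote) dataset that counts slice reads."""

    def __init__(self, values: Sequence[float]) -> None:
        self._values = list(values)
        self.requests = 0

    def __getitem__(self, key: slice) -> List[float]:
        self.requests += 1  # each slice models one read request
        return self._values[key]


def read_per_row(data: CountingStore, index: Sequence[int]) -> List[List[float]]:
    """Assumed slow path: one read request per row of the ragged array."""
    rows, start = [], 0
    for end in index:
        rows.append(data[start:end])
        start = end
    return rows


def read_then_segment(data: CountingStore, index: Sequence[int]) -> List[List[float]]:
    """Possible optimization: one bulk read, then segmentation in memory."""
    flat = data[:]
    rows, start = [], 0
    for end in index:
        rows.append(flat[start:end])
        start = end
    return rows


# Made-up example: three units with 2, 3, and 1 spikes.
spike_times = [0.1, 0.4, 0.2, 0.5, 0.9, 0.3]
spike_times_index = [2, 5, 6]  # cumulative end offsets

per_row_store = CountingStore(spike_times)
bulk_store = CountingStore(spike_times)

per_row = read_per_row(per_row_store, spike_times_index)
bulk = read_then_segment(bulk_store, spike_times_index)

assert per_row == bulk  # both strategies yield the same ragged array
print(per_row_store.requests, bulk_store.requests)
```

In this sketch the number of requests grows with the number of rows in the per-row case but stays at one for the bulk read, which is the kind of difference a benchmark against local and remote files could quantify.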
