Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing unordered_maps may give big performance improvements #483

Closed
halflearned opened this issue Aug 16, 2019 · 2 comments
Closed

Removing unordered_maps may give big performance improvements #483

halflearned opened this issue Aug 16, 2019 · 2 comments
Labels
performance Issue relates to the speed, memory usage, or scaling aspects of the package.

Comments

@halflearned
Copy link
Member

halflearned commented Aug 16, 2019

We noted in our meeting today that our splitting rules seem to be making use of unordered_map. As we've seen before (#359), it seems that we should avoid c++ hash tables at all costs.

What I have now is far from a full-blown perf analysis, but check this out.

  1. On this branch, I run a regression_forest on test/forest/resources/regression_data.csv (n=500, p=10). This is the output of CLion's profiler.

Screen Shot 2019-08-15 at 11 08 47 PM

The total runtime was 3.43 seconds. (Note this branch contains the same code as the current development version of grf, except I added a command-line interface for profiling).

  1. On this branch, all I did was to search and replace instances of std::unordered_map<size_t, double> with std::vector<double> in folders splitting and relabeling, and then make sure that the vectors are property initialized.

This time, the total runtime was 2.26 seconds, which is 65.8% of the version with unordered_map.


With a larger dataset (n=15000, p=10), the runtime is 61s with grf development version, and about 37s using std::vector.

@halflearned halflearned added the performance Issue relates to the speed, memory usage, or scaling aspects of the package. label Aug 16, 2019
jtibshirani added a commit that referenced this issue Sep 1, 2019
This PR performs some light optimizations discovered through profiling.

First, it improves memory access during the relabeling procedure:
- Use `vector` instead of `unordered_map` for relabeled responses.
- Re-use a single vector to hold relabeled responses.

It also updates the vector allocation and access strategy for code on the hot
path:
- When populating vectors, use `resize` and `operator[]` instead of `reserve`
and `push_back`.
- Prefer `operator[]` instead of `at` for vector element access.

Addresses #483.
@erikcs
Copy link
Member

erikcs commented Oct 30, 2019

On this and #359 : did you want to switch out all unordered_maps?

Aside for future reference: easy way to set up profiling from R:

On Mac you can get the above output without starting from the C++ side. Build the package from source as usual, debug symbols are included by default (the -g in $ R CMD config CXXFLAGS).

a) Start R or RStudio.
b) Open Xcode->Xcode->Open Developer Tools->Instruments:
select profile. Next to the record button attach to the R process. Press record
c) In R run the commands you want to profile. For example training a large causal_forest.
e) Instruments: stop the recording and you get the profiling output:

Screen Shot 2019-10-30 at 11 39 56

To get source annotations (Mac specific dSYM file is not automatically generated) run $ dsymutil /Library/Frameworks/R.framework/Versions/3.6/Resources/library/grf/libs/grf.so -o grf.dSYM and instruments should find it:

Screen Shot 2019-10-30 at 14 50 44

Another aside: stepping through grf C++ from R with lldb: https://gist.github.com/erikcs/5e1f0994f8c4d274926e50a2ba1901de

@erikcs
Copy link
Member

erikcs commented Oct 31, 2019

This was already fixed in #514 but the issue remained open. Closing now

@erikcs erikcs closed this as completed Oct 31, 2019
davidahirshberg pushed a commit to davidahirshberg/grf that referenced this issue Dec 6, 2019
This PR performs some light optimizations discovered through profiling.

First, it improves memory access during the relabeling procedure:
- Use `vector` instead of `unordered_map` for relabeled responses.
- Re-use a single vector to hold relabeled responses.

It also updates the vector allocation and access strategy for code on the hot
path:
- When populating vectors, use `resize` and `operator[]` instead of `reserve`
and `push_back`.
- Prefer `operator[]` instead of `at` for vector element access.

Addresses grf-labs#483.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Issue relates to the speed, memory usage, or scaling aspects of the package.
Projects
None yet
Development

No branches or pull requests

2 participants