Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(NODE-5906): optimize toArray to use batches #4171

Merged
merged 12 commits into from
Aug 7, 2024

Conversation

aditi-khare-mongoDB
Copy link
Contributor

@aditi-khare-mongoDB aditi-khare-mongoDB commented Jul 9, 2024

Description

Performance optimization on AbstractCursor.toArray

What is changing?

Rather than awaiting each document, toArray now only awaits a new batch.

Is there new documentation needed for these changes?

No.

What is the motivation for this change?

Performance improvement. See performance data matrix here.

Release Highlight (Remove if this ends up being only a code clarity fix)

Optimized cursor.toArray()

Prior to this change, toArray() simply used the cursor's async iterator API, which parses BSON documents lazily (see more here). toArray(), however, eagerly fetches the entire set of results, pushing each document into the returned array. As such, toArray does not have the same benefits from lazy parsing as other parts of the cursor API.

With this change, when toArray() accumulates documents, it empties the current batch of documents into the array before calling the async iterator again, which means each iteration will fetch the next batch rather than wrap each document in a promise. This allows the cursor.toArray() to avoid the required delays associated with async/await execution, and allows for a performance improvement of up to 5% on average! 🎉

Note: This performance optimization does not apply if a transform has provided to cursor.map() before toArray is called.

Double check the following

  • Ran npm run check:lint script
  • Self-review completed using the steps outlined here
  • PR title follows the correct format: type(NODE-xxxx)[!]: description
    • Example: feat(NODE-1234)!: rewriting everything in coffeescript
  • Changes are covered by tests
  • New TODOs have a related JIRA ticket

@aditi-khare-mongoDB aditi-khare-mongoDB force-pushed the NODE-5906/optimize-toArray-to-batches branch from 5c87be0 to f888cd3 Compare July 26, 2024 20:07
@aditi-khare-mongoDB aditi-khare-mongoDB force-pushed the NODE-5906/optimize-toArray-to-batches branch from 712f24c to 98d4471 Compare July 26, 2024 20:53
@aditi-khare-mongoDB aditi-khare-mongoDB marked this pull request as ready for review July 30, 2024 14:49
@nbbeeken nbbeeken self-assigned this Jul 30, 2024
@nbbeeken nbbeeken added the Primary Review In Review with primary reviewer, not yet ready for team's eyes label Jul 30, 2024
Copy link
Contributor

@nbbeeken nbbeeken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you get the chance can you fill out the release highlights?

And (I guess on slack?) can you share the numbers you're seeing, in BSON and previous driver performance improvements I have often included a conservative estimation of improvements downstream projects can expect to see from the perf change.

src/cursor/abstract_cursor.ts Outdated Show resolved Hide resolved
src/cursor/abstract_cursor.ts Outdated Show resolved Hide resolved
src/cursor/abstract_cursor.ts Outdated Show resolved Hide resolved
@aditi-khare-mongoDB aditi-khare-mongoDB force-pushed the NODE-5906/optimize-toArray-to-batches branch 2 times, most recently from 9d43fb9 to 62d1acc Compare July 31, 2024 20:08
@nbbeeken nbbeeken added Team Review Needs review from team and removed Primary Review In Review with primary reviewer, not yet ready for team's eyes labels Aug 6, 2024
@nbbeeken nbbeeken merged commit 5565d50 into main Aug 7, 2024
26 of 30 checks passed
@nbbeeken nbbeeken deleted the NODE-5906/optimize-toArray-to-batches branch August 7, 2024 18:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team Review Needs review from team
Projects
None yet
3 participants