Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rmeta: avoid embedding StabilityLevel::Unstable reason multiple times into .rmeta\.rlib files #98770

Merged
merged 1 commit into from
Jul 25, 2022

Conversation

klensy
Copy link
Contributor

@klensy klensy commented Jul 1, 2022

Avoids bloating size of some rmeta\rlib files by not placing default string for StabilityLevel::Unstable reason multiple times, affects only stdlib\rustc artifacts. For stdlib cuts about 3% (diff of total size for patched\unpatched *.rmeta files of stage1-std) of file size, depending on crates.

fixes #88180

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jul 1, 2022
@rust-highfive
Copy link
Collaborator

r? @lcnr

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 1, 2022
@klensy
Copy link
Contributor Author

klensy commented Jul 1, 2022

Or maybe encode with something like this, but values deduped?

/// A list of lazily-decoded values, with the added capability of random access.
///
/// Random-access table (i.e. offering constant-time `get`/`set`), similar to
/// `LazyArray<T>`, but without requiring encoding or decoding all the values
/// eagerly and in-order.
struct LazyTable<I, T> {
position: NonZeroUsize,
encoded_size: usize,
_marker: PhantomData<fn(I) -> T>,
}

@cjgillot
Copy link
Contributor

cjgillot commented Jul 2, 2022

There is precedent for such encoding, for instance type_shorthands or ExpnIndex.
Do you have an estimate of the amount of duplicated strings?

@klensy
Copy link
Contributor Author

klensy commented Jul 2, 2022

There is precedent for such encoding, for instance type_shorthands or ExpnIndex. Do you have an estimate of the amount of duplicated strings?

from #88180 when issue was created:

found 172 times in libaddr2line.rmeta
found 911 times in libhashbrown.rmeta
found 9590 times in libobject.rmeta

@klensy
Copy link
Contributor Author

klensy commented Jul 2, 2022

Checked now, for stage0 compiler artifact libobject.rmeta: 11237 times, for example.

@cjgillot
Copy link
Contributor

cjgillot commented Jul 2, 2022

Why not the following:

  • encode the integer value of Symbol;
  • encode somewhere else a FxHashMap<u32, String> of only actually encoded Symbols.

This would maximize the deduplication while being self-contained.
If we don't care about being self-contained, we can skip encoding the strings for static Symbols (ie. those listed in rustc_span::symbol).

What do you think?

@klensy
Copy link
Contributor Author

klensy commented Jul 2, 2022

impl<S: Encoder> Encodable<S> for Symbol {
fn encode(&self, s: &mut S) {
s.emit_str(self.as_str());
}
}
impl<D: Decoder> Decodable<D> for Symbol {
#[inline]
fn decode(d: &mut D) -> Symbol {
Symbol::intern(&d.read_str())
}
}

Something like replacing s.emit_str(self.as_str()); with with s.emit_u32(self.as_u32()); + saving in hash table, and in decode reading from that hash table? Only left to find the place where to save that hash table.

@cjgillot
Copy link
Contributor

cjgillot commented Jul 2, 2022

To generate those hash-tables, you can use specialization like we do for impl {De,En}codable for Span. Then, the hash-tables can be stored as any other in CrateRoot. We may want to only decode it once.

@cjgillot cjgillot self-assigned this Jul 3, 2022
@lcnr
Copy link
Contributor

lcnr commented Jul 4, 2022

r? @cjgillot

@cjgillot
Copy link
Contributor

@klensy do you still want to merge this, or should it be closed in favour of #98851?

@cjgillot cjgillot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 11, 2022
@bors
Copy link
Contributor

bors commented Jul 14, 2022

☔ The latest upstream changes (presumably #98975) made this pull request unmergeable. Please resolve the merge conflicts.

@klensy
Copy link
Contributor Author

klensy commented Jul 19, 2022

I think this should be merged (and #98851 tested after that), as current default error message looks more like lint warn\error (that can possibly be translatable in future) and shouldn't be embed into compiled artifacts, will rebase.

@klensy
Copy link
Contributor Author

klensy commented Jul 24, 2022

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 24, 2022
@cjgillot
Copy link
Contributor

@bors r+ rollup=never
^ may have perf implications

@bors
Copy link
Contributor

bors commented Jul 24, 2022

📌 Commit b38c948 has been approved by cjgillot

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 24, 2022
@bors
Copy link
Contributor

bors commented Jul 25, 2022

⌛ Testing commit b38c948 with merge 7f93d4a...

@bors
Copy link
Contributor

bors commented Jul 25, 2022

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing 7f93d4a to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 25, 2022
@bors bors merged commit 7f93d4a into rust-lang:master Jul 25, 2022
@rustbot rustbot added this to the 1.64.0 milestone Jul 25, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (7f93d4a): comparison url.

Instruction count

  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: 😿 relevant regression found
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
0.3% 0.3% 1
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
N/A N/A 0
All 😿🎉 (primary) N/A N/A 0

Max RSS (memory usage)

Results
  • Primary benchmarks: 🎉 relevant improvements found
  • Secondary benchmarks: mixed results
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
2.4% 2.4% 1
Improvements 🎉
(primary)
-1.3% -1.6% 6
Improvements 🎉
(secondary)
-2.0% -3.8% 7
All 😿🎉 (primary) -1.3% -1.6% 6

Cycles

Results
  • Primary benchmarks: 😿 relevant regression found
  • Secondary benchmarks: 😿 relevant regressions found
mean1 max count2
Regressions 😿
(primary)
2.2% 2.2% 1
Regressions 😿
(secondary)
2.9% 3.6% 5
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
N/A N/A 0
All 😿🎉 (primary) 2.2% 2.2% 1

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

Footnotes

  1. the arithmetic mean of the percent change 2 3

  2. number of relevant changes 2 3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

rustc_attr::Stability inlined a lot of dublicates in rustc dependency crates rmeta's, increasing it's size
7 participants