Collation fetching fairness #4880
base: master
if let Some((_, mut lowest_score)) = lowest_score {
    for claim in claims {
        if let Some((_, collations)) = lowest_score.iter_mut().find(|(id, _)| *id == claim)
        {
            match collations.pop_front() {
                Some(collation) => return Some(collation),
                None => {
                    unreachable!("Collation can't be empty!")
                },
            }
        }
    }
    unreachable!("All entries in waiting_queue should also be in claim queue")
} else {
    None
}
Looking again at this I am a bit uneasy about the unreachable!s here. I'll try to refactor this to be more reliable.
c7f24aa to 0f28aa8
@@ -266,9 +264,6 @@ impl PeerData {
    let candidates =
        state.advertisements.entry(on_relay_parent).or_default();

    if candidates.len() > max_candidate_depth {
This error leads to reporting the peer with COST_UNEXPECTED_MESSAGE. I think we should relax it to just ignoring the advertisement.
Pros:
- with the new logic submitting more elements than scheduled is not such a major offence
- old collators won't get punished for not respecting the claim queue
Cons:
- we don't punish spammy collators
///
/// If prospective parachains mode is not enabled then we fall back to synchronous backing. In
/// this case there is a limit of 1 collation per relay parent.
pub(super) fn is_collations_limit_reached(
I'm open to better name suggestions.
The CI pipeline was cancelled due to the failure of one of the required jobs.
Related to #1797
When fetching collations on the validator side of the collator protocol we need to ensure that each parachain gets a fair share of core time depending on its assignments in the claim queue. This means that the number of collations fetched per parachain should ideally be equal to (but definitely not bigger than) the number of claims for the particular parachain in the claim queue.
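The fairness condition above can be sketched as a per-parachain fetch limit derived from the claim queue. This is a minimal illustration, not the code from this PR: `ParaId` is simplified to a `u32`, the claim queue is modelled as a plain slice for a single core, and `fetch_limits` and `can_fetch` are hypothetical helper names.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a parachain id (assumption: simplified to an integer).
type ParaId = u32;

/// Count how many claims each parachain has in the claim queue of one core.
/// The count is the maximum number of collations we are willing to fetch for it.
fn fetch_limits(claim_queue: &[ParaId]) -> HashMap<ParaId, usize> {
    let mut limits = HashMap::new();
    for para in claim_queue {
        *limits.entry(*para).or_insert(0) += 1;
    }
    limits
}

/// A fetch is allowed only while the parachain has fewer fetches than claims.
/// Parachains that are absent from the claim queue have a limit of zero.
fn can_fetch(
    limits: &HashMap<ParaId, usize>,
    fetched: &HashMap<ParaId, usize>,
    para: ParaId,
) -> bool {
    let limit = limits.get(&para).copied().unwrap_or(0);
    let done = fetched.get(&para).copied().unwrap_or(0);
    done < limit
}
```

With a claim queue of `[1, 2, 1]`, parachain 1 may have at most two fetches and parachain 2 at most one, regardless of how many advertisements either of them sends.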
The current implementation doesn't guarantee such fairness. For each relay parent there is a waiting_queue (PerRelayParent -> Collations -> waiting_queue) which holds any unfetched collations advertised to the validator. The collations are fetched on a first in, first out basis, which means that if two parachains share a core and one of them is more aggressive it might starve the other. How? At each relay parent up to max_candidate_depth candidates are accepted (enforced in fn is_seconded_limit_reached), so if one of the parachains is quick enough to fill the queue with its advertisements the validator will never fetch anything from the rest of the parachains even though they are scheduled. This doesn't mean that the aggressive parachain will occupy all the core time (the runtime prevents that), but it will prevent the other parachains sharing the same core from having collations backed.

The solution I am proposing is to limit fetches and advertisements based on the state of the claim queue, extending the checks in is_seconded_limit_reached with an additional check. At each relay parent the claim queue for the core assigned to the validator is fetched. For each parachain a fetch limit is calculated (equal to the number of entries in the claim queue). Advertisements are not fetched for a parachain which has exceeded its claims in the claim queue. This solves the problem of aggressive parachains advertising too many collations.

The second part is in the collation fetching logic. Instead of popping the first entry from the waiting_queue the validator calculates a score for each entry there. The score is (performed collation fetches for parachain A at relay parent X) / (number of entries in claim queue for parachain A at relay parent X). The score will be lower for parachains which have fewer fetches than expected and 0 for parachains which have no fetches at all. This should provide an ordering based on the urgency of each fetch. If two parachains end up with the same score then the one earlier in the claim queue is preferred.

TODOs: