new validator missing all p/v points #4914
Comments
Two minutes after posting this issue the validator started producing blocks.
Can you provide some timelines:
One possible root cause is #4324: validators entering the active set have to precompile the PVFs for all parachains, so they might fall behind on backing work. The validator should catch up and recover by itself in 5-10 minutes. Restarting won't help, because all PVFs are recompiled when you restart your validator.
@alexggh sure,
8976
First time troubleshooting various things after like 2 hours, when I set the
Mostly syncing logs, except about once a minute this one:
And these after restarting the validator various times:
12D3KooWHRd8ANXyUvkRNJiXsSUzB2pSaE7QvVATdsTNRCjRvdiP
These two errors are red herrings; they are kind of expected on restarts, so they can be ignored.
Bear in mind that propagating node addresses and IDs in the network is not instantaneous (because of the DHT), so that might explain what you noticed. Did your node change its reported IP/ports between restarts? This is what I'm seeing now for it:
Also, it would help a lot if you could provide the full logs.
But the first time it joined the active set, the validator had an uptime of 7 days. After 4-6 hours I assigned a new public IP to the host, thinking that the IP might have been banned for some reason. The port stayed the same. Logs for the last 24h are attached.
Ok, I think I have a theory of what happened here:
Because of #4324, your validator is slow at backing, because it first needs to precompile the PVFs. One thing to note here is that you are in group
Overall, I think this was a bit unfortunate. Work is in progress on fixing #4324, so that the initial cause that triggered you to restart the validator and try various fixes won't happen anymore. Until then, I would always recommend waiting around 20 minutes and comparing the relative performance with your peers before concluding something is off with the validator and restarting it.
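The "wait, then compare with your peers" advice can be sketched as a small check. This is only an illustration: the function name, the point values, and the 0.5 ratio are hypothetical and not taken from any Polkadot tooling; only the roughly 20 minute warm-up comes from the recommendation above.

```python
from statistics import median


def looks_unhealthy(minutes_active, my_points, peer_points,
                    warmup_minutes=20, ratio=0.5):
    """Return True only if the warm-up has passed AND we lag well behind peers.

    warmup_minutes: how long to wait after entering the active set
                    (about 20 minutes, per the advice in this thread).
    ratio:          hypothetical threshold; below this fraction of the
                    peers' median points we consider the node suspicious.
    """
    if minutes_active < warmup_minutes:
        return False  # still warming up (e.g. compiling PVFs); keep waiting
    if not peer_points:
        return False  # nothing to compare against
    typical = median(peer_points)
    if typical == 0:
        return False  # peers score nothing either; probably not a local problem
    return my_points < ratio * typical


if __name__ == "__main__":
    # Hypothetical numbers: zero points after 5 minutes is not yet alarming.
    print(looks_unhealthy(5, 0, [100, 120, 110]))
    # Zero points after 30 minutes while peers earn points is worth a look.
    print(looks_unhealthy(30, 0, [100, 120, 110]))
```

The point of comparing against the median of the peers, rather than an absolute number, is that a whole group can have a slow session for reasons unrelated to one node.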
@alexggh thanks.
This issue is with polkadot 1.13.0-d5160c1d567
We have a new Polkadot validator that joined the active set for the first time today.
This one is configured just like all the others, including the Kusama ones, which don't have any issues and always perform as they should.
So far it is missing all p/v points, but relay chain blocks get produced.
Incoming p2p connection works, we have constantly new incoming connections.
The validator has the same hardware specs as the other ones in the active set that work fine.
Node identity key is the same from when the session keys were generated, which was done when the validator was fully in sync.
I tried restoring the db from backup for both paritydb and rocksdb, which didn't fix it. Then I tried both with warp sync, and still got no points. When I added --no-private-ip, it produced blocks for the remaining time of that epoch, but when the new epoch started it began missing all blocks again.
Occasionally I have the following error log:
This is the systemd file which works for all the other validators:
The validator has a public IP assigned. I also additionally specified the public IP with --public-addr and --listen-addr, which didn't help either.
The only difference with this validator is that its stash account joined the active set for the first time ever.
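For readers unsure what values those two flags expect: as far as I know, both take libp2p multiaddresses rather than bare IPs. Below is a minimal sketch of building them. The helper names are made up for illustration, and the IP and port in the usage example are placeholders, not the values used on this validator.

```python
import ipaddress


def p2p_multiaddr(ip, port):
    """Build a TCP multiaddr string such as /ip4/<ip>/tcp/<port>."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on a malformed IP
    proto = "ip4" if addr.version == 4 else "ip6"
    return f"/{proto}/{addr}/tcp/{port}"


def network_args(public_ip, port):
    """Return CLI arguments announcing public_ip while listening on all interfaces."""
    version = ipaddress.ip_address(public_ip).version
    wildcard = "0.0.0.0" if version == 4 else "::"
    return [
        "--public-addr", p2p_multiaddr(public_ip, port),
        "--listen-addr", p2p_multiaddr(wildcard, port),
    ]


if __name__ == "__main__":
    # Placeholder IP and port, for illustration only.
    print(" ".join(network_args("203.0.113.10", 30333)))
```

Validating the IP before building the argument catches typos early, which is easier than spotting a malformed address in the node's startup logs.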