Abnormally high beacon response latency #6658

Open
ank-everstake opened this issue Oct 18, 2024 · 0 comments

Describe the bug
A Nimbus beacon node that accepts incoming p2p connections shows drastically higher response latency between the validator clients and the beacon node itself, at times exceeding 3s. These delays resulted in missed blocks, which can significantly impact validator performance.

We tried limiting the beacon node's roles on the VC to the attestation-related ones only, but to no avail.
Reducing the number of keys attesting/proposing through that beacon node did not have any impact either.
Even with a small number of validator keys (500) interacting with the beacon node, we observed the same response delays.

Nimbus BNs that did not allow incoming p2p peers did not have this issue, even with more than 5k keys.

During testing, we closed the TCP/UDP p2p ports using iptables to drop incoming peer connections entirely, and latency decreased at once.
This is only a temporary fix, as we believe that leaving the beacon node's p2p port closed may hurt peering efficiency in the long run.

Versions:

Nimbus BN: v24.9.0
Nimbus VC: v24.9.0

Beacon server specs:

OS: Ubuntu 24.04
CPU: AMD EPYC 9254 24-Core Processor
RAM: 128GB
DISKS: NVMe SAMSUNG MZQL2960HCJR-00A07

Relevant flags (same as default):

    --max-peers=160
    --hard-max-peers=240
    --tcp-port=9000
    --udp-port=9000

To Reproduce
You may be impacted by this issue if UPnP is working (you are not using remapped ports or a "behind NAT" setup), the node uses the default p2p port 9000, and that port is open in the firewall.

Screenshots
[Three screenshots attached to the original issue]

The attached screenshots illustrate the correlation between the drop in incoming connections to zero and improved validator<>beacon response times.

10.0.0.2 is the misbehaving Nimbus BN (bad nimbus)
10.0.0.6 is the Nimbus BN without any incoming peers (good nimbus)
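
For anyone who wants to check the same correlation without the dashboards, the number of connected inbound peers can be read from the standard Beacon API `GET /eth/v1/node/peers` endpoint. The following is a minimal sketch, not something we ran for this report: it assumes the REST API is enabled and reachable at `http://127.0.0.1:5052` (an assumed address), and the helper names are hypothetical.

```python
import json
import urllib.request
from collections import Counter

# Assumed REST address; it is not taken from the report above.
BEACON_URL = "http://127.0.0.1:5052"


def count_connected_by_direction(peers_response):
    """Count peers in state "connected", grouped by their direction field."""
    counts = Counter()
    for peer in peers_response.get("data", []):
        if peer.get("state") == "connected":
            counts[peer.get("direction", "unknown")] += 1
    return counts


def fetch_peers(base_url=BEACON_URL, timeout=10.0):
    """Fetch the peer list from the Beacon API peers endpoint as a dict."""
    url = f"{base_url}/eth/v1/node/peers"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `count_connected_by_direction(fetch_peers())` against each node should give an `inbound` count of zero on a BN whose p2p ports are closed, which can then be compared with the response times observed on the VC side.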

Additional context
Compared to Teku, Lighthouse, and Prysm, Nimbus has the highest delays, so we think that API performance needs to be improved quite a bit.
Network latency is under 2ms between all connected BNs.
In retrospect, the issue was observed with all versions we have used, starting from v24.4.0.
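
A rough way to compare response times across beacon nodes is to time the same Beacon API request against each of them. The sketch below is only an illustration under assumptions: the function names are hypothetical, the REST port has to be adjusted per client, and `/eth/v1/node/syncing` is a lightweight standard endpoint chosen for convenience, not necessarily one of the calls that was slow for our VCs.

```python
import statistics
import time
import urllib.request


def time_request(url, timeout=10.0):
    """Return the wall-clock seconds taken by one GET request to url."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()  # include the time to receive the full body
    return time.perf_counter() - start


def summarize(samples):
    """Summarize latency samples (in seconds) as min / median / max."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def probe(base_url, path="/eth/v1/node/syncing", n=20):
    """Time n sequential requests against a single beacon node."""
    return summarize([time_request(base_url + path) for _ in range(n)])
```

For example, `probe("http://10.0.0.2:5052")` and `probe("http://10.0.0.6:5052")` could be run side by side, where port 5052 is an assumption about the REST configuration. Running the probe from the VC host keeps the measurement close to what the validator client experiences.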
