Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup IP race condition on EC2Architect #606

Merged
merged 16 commits into from
Nov 16, 2021
Merged

Cleanup IP race condition on EC2Architect #606

merged 16 commits into from
Nov 16, 2021

Conversation

JackUrb
Copy link
Contributor

@JackUrb JackUrb commented Nov 16, 2021

Summary

At the moment, the EC2Architect could run into three failure scenarios.

  1. You may have already allocated all 5 elastic IPs associated with your account by having many jobs running
  2. You may try to connect to a server before the association is complete, leading to the wrong IP being returned before an attempted push.
  3. The attempt to connect to the server could fail due to having used that same IP for a different server before, thus clashing with an entry in known_hosts
    This PR addresses all three in one sweep.

Implementation

We now only associate an IP to an instance when trying to push content to that instance, and then free that IP afterwards. This way in reserving the IP we know exactly what IP is going to be used, we can clear it from known hosts, and can stay under the 5 IP limit.

I also create the try_server_push to consolidate some logic used in both server pushes to one place.

Testing

Launched a test task on sandbox with the new setup, ensured that the server loaded properly and reserved IPs were not present after launch.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 16, 2021
@codecov-commenter
Copy link

Codecov Report

Merging #606 (4877639) into main (bc3526f) will decrease coverage by 0.05%.
The diff coverage is 11.76%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #606      +/-   ##
==========================================
- Coverage   63.34%   63.28%   -0.06%     
==========================================
  Files          86       86              
  Lines        8208     8221      +13     
==========================================
+ Hits         5199     5203       +4     
- Misses       3009     3018       +9     
Impacted Files Coverage Δ
...histo/abstractions/architects/ec2/ec2_architect.py 35.76% <0.00%> (+0.51%) ⬆️
...ephisto/abstractions/architects/ec2/ec2_helpers.py 17.18% <12.00%> (+0.46%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bc3526f...4877639. Read the comment docs.

@JackUrb JackUrb merged commit e70e460 into main Nov 16, 2021
@JackUrb JackUrb deleted the ec2-architect branch November 16, 2021 21:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants