Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[network] Make NetworkId & NetworkContext copyable #9203

Merged
merged 3 commits into from Sep 20, 2021

Conversation

gregnazario
Copy link
Contributor

@gregnazario gregnazario commented Sep 16, 2021

Motivation

I made a mistake when I first created NetworkId, making it have to be cloned everywhere because of the contained string. This made it more unwieldy and a bunch of extra work around using the NetworkId.

Now, we can copy and make everything much easier, no longer copying around the string "vfn". Additionally, cleans up some code and we are able to get rid of the Arc<NetworkContext> to just NetworkContext.

Migration Plan

  1. Deploy all validators and VFNs with this version (serializing to old format).
  2. Change code to output serialization to new format
  3. Remove old format compatibility (or even move to using the Invalid slot instead of the current Vfn slot for readability purposes

Have you read the Contributing Guidelines on pull requests?

Yes

Test Plan

Take a look at my backwards compatible tests for serialization. They cause some pain around having an unused value in the enum, but everything works outside of that!

Related PRs

Related to #9191 and thinking about how we can standardize identifiers across networks. This PR definitely is blocked on that one (due to the ton of merge conflicts that are going to happen

@gregnazario gregnazario added the network Validator and full node p2p network label Sep 16, 2021
@gregnazario gregnazario self-assigned this Sep 16, 2021
@bors-libra bors-libra added this to In Review in bors Sep 16, 2021
config/src/network_id.rs Outdated Show resolved Hide resolved
@gregnazario gregnazario marked this pull request as ready for review September 16, 2021 23:57
@gregnazario gregnazario requested a review from a team as a code owner September 16, 2021 23:57
config/src/network_id.rs Outdated Show resolved Hide resolved
config/src/network_id.rs Outdated Show resolved Hide resolved
@gregnazario gregnazario force-pushed the network-id branch 2 times, most recently from 4881303 to ba6106b Compare September 17, 2021 16:19
@gregnazario
Copy link
Contributor Author

/canary

bors-libra pushed a commit that referenced this pull request Sep 17, 2021
@bors-libra bors-libra moved this from In Review to Canary in bors Sep 17, 2021
@bors-libra bors-libra moved this from Canary to In Review in bors Sep 17, 2021
@github-actions
Copy link

Cluster Test Result

Test runner setup time spent 264 secs
Compatibility test results for land_235bbd9b ==> land_7c9e2f0c (PR)
1. All instances running land_235bbd9b, generating some traffic on network
2. First full node land_235bbd9b ==> land_7c9e2f0c, to validate new full node to old validator node traffic
3. First Validator node land_235bbd9b ==> land_7c9e2f0c, to validate storage compatibility
4. First batch validators (14) land_235bbd9b ==> land_7c9e2f0c, to test consensus and traffic between old full nodes and new validator node
5. First batch full nodes (14) land_235bbd9b ==> land_7c9e2f0c
6. Second batch validators (15) land_235bbd9b ==> land_7c9e2f0c, to upgrade rest of the validators
7. Second batch of full nodes (15) land_235bbd9b ==> land_7c9e2f0c, to finish the network upgrade, time spent 703 secs
all up : 1162 TPS, 3905 ms latency, 4400 ms p99 latency, no expired txns, time spent 250 secs
Logs: http://kibana.ct-1-k8s-testnet.aws.hlw3truzy4ls.com/app/kibana#/discover?_g=(time:(from:'2021-09-17T16:40:56Z',to:'2021-09-17T17:04:03Z'))
Dashboard: http://grafana.ct-1-k8s-testnet.aws.hlw3truzy4ls.com/d/performance/performance?from=1631896856000&to=1631898243000
Validator 1 logs: http://kibana.ct-1-k8s-testnet.aws.hlw3truzy4ls.com/app/kibana#/discover?_g=(time:(from:'2021-09-17T16:40:56Z',to:'2021-09-17T17:04:03Z'))&_a=(columns:!(log),query:(language:kuery,query:'kubernetes.pod_name:"val-1"'),sort:!(!('@timestamp',desc)))

Repro cmd:

./scripts/cti --tag land_235bbd9b --cluster-test-tag land_7c9e2f0c -E BATCH_SIZE=15 -E UPDATE_TO_TAG=land_7c9e2f0c --report report.json --suite land_blocking_compat

🎉 Land-blocking cluster test passed! 👌

@gregnazario gregnazario changed the title [network] Make NetworkId copyable [network] Make NetworkId & NetworkContext copyable Sep 17, 2021
bmwill
bmwill previously approved these changes Sep 20, 2021
Copy link
Contributor

@bmwill bmwill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Next cool thing to do (eventually, probably not anytime soon since we'd need another logging stack change) would be to using tracing's Span concept to open a span with the network context data in it so all logging events inherently have this data instead of needing to pipe it through everywhere.

Comment on lines +159 to +163
self == &NetworkId::Vfn
}

pub fn is_validator_network(&self) -> bool {
matches!(self, NetworkId::Validator)
self == &NetworkId::Validator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit (probably not worth fixing) Using a match statement or matches! macro may be a bit more idiomatic

@gregnazario
Copy link
Contributor Author

/land

@bors-libra bors-libra moved this from In Review to Queued in bors Sep 20, 2021
I removed the "private" network and replaced it with "vfn" so that we
don't have to clone NetworkId everywhere.  Given we have no other
private networks, this should be fine for now.  It did have to add an
invalid type for backwards compatibility in conversion
@bors-libra bors-libra moved this from Queued to Testing in bors Sep 20, 2021
@github-actions
Copy link

Cluster Test Result

Test runner setup time spent 284 secs
Compatibility test results for land_454d3041 ==> land_3b5e28b7 (PR)
1. All instances running land_454d3041, generating some traffic on network
2. First full node land_454d3041 ==> land_3b5e28b7, to validate new full node to old validator node traffic
3. First Validator node land_454d3041 ==> land_3b5e28b7, to validate storage compatibility
4. First batch validators (14) land_454d3041 ==> land_3b5e28b7, to test consensus and traffic between old full nodes and new validator node
5. First batch full nodes (14) land_454d3041 ==> land_3b5e28b7
6. Second batch validators (15) land_454d3041 ==> land_3b5e28b7, to upgrade rest of the validators
7. Second batch of full nodes (15) land_454d3041 ==> land_3b5e28b7, to finish the network upgrade, time spent 654 secs
all up : 1171 TPS, 3879 ms latency, 4400 ms p99 latency, no expired txns, time spent 250 secs
Logs: http://kibana.ct-0-k8s-testnet.aws.hlw3truzy4ls.com/app/kibana#/discover?_g=(time:(from:'2021-09-20T17:56:25Z',to:'2021-09-20T18:18:51Z'))
Dashboard: http://grafana.ct-0-k8s-testnet.aws.hlw3truzy4ls.com/d/performance/performance?from=1632160585000&to=1632161931000
Validator 1 logs: http://kibana.ct-0-k8s-testnet.aws.hlw3truzy4ls.com/app/kibana#/discover?_g=(time:(from:'2021-09-20T17:56:25Z',to:'2021-09-20T18:18:51Z'))&_a=(columns:!(log),query:(language:kuery,query:'kubernetes.pod_name:"val-1"'),sort:!(!('@timestamp',desc)))

Repro cmd:

./scripts/cti --tag land_454d3041 --cluster-test-tag land_3b5e28b7 -E BATCH_SIZE=15 -E UPDATE_TO_TAG=land_3b5e28b7 --report report.json --suite land_blocking_compat

🎉 Land-blocking cluster test passed! 👌

@bors-libra bors-libra removed this from Testing in bors Sep 20, 2021
@bors-libra bors-libra merged commit 3b5e28b into diem:main Sep 20, 2021
@bors-libra bors-libra temporarily deployed to Sccache September 20, 2021 18:19 Inactive
@bors-libra bors-libra temporarily deployed to Docker September 20, 2021 18:19 Inactive
@bors-libra bors-libra temporarily deployed to Sccache September 20, 2021 18:19 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed network Validator and full node p2p network
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants