Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[diemdb] add API get_transaction_by_hash #9206

Merged
merged 1 commit into from Oct 5, 2021

Conversation

lightmark
Copy link

Motivation

Store the hash->version to db and add the API to get the transaction by hash.

Have you read the Contributing Guidelines on pull requests?

Y

Test Plan

current coverage.

@lightmark lightmark added the storage Issues related to storage label Sep 17, 2021
@lightmark lightmark requested a review from a team as a code owner September 17, 2021 00:55
@bors-libra bors-libra added this to In Review in bors Sep 17, 2021
.get_transaction_by_hash(txn_to_commit.transaction().hash(), ledger_version, true)
.unwrap()
.unwrap();
txn_with_proof
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably assert_eq!(txn_with_proof.transaction().hash(), txn_to_commit.transaction().hash()) as well, which is the user intended way to use it? @xli

) -> Result<Option<TransactionWithProof>> {
let version = match self
.transaction_store
.get_transaction_version_by_hash(&hash, ledger_version)?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use Option::map()?

@msmouse msmouse requested a review from a team September 17, 2021 17:58
@@ -190,6 +190,16 @@ pub trait DbReader: Send + Sync {
fetch_events: bool,
) -> Result<TransactionListWithProof>;

/// See [`DiemDB::get_transaction_by_hash`].
///
/// [`DiemDB::get_transactions`]: ../diemdb/struct.DiemDB.html#method.get_transaction_by_hash
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to note it somewhere that we can't prove absence.

hash: HashValue,
ledger_version: Version,
fetch_events: bool,
) -> Result<Option<TransactionWithProof>>;
Copy link
Contributor

@aching aching Sep 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning None might not give enough detail to the user.

  1. Is there is a range of ledger versions can't we prove absence? - Does this node only have the latest version?
  2. Should we let the user know the version range that was searched if it can't be found?
  3. Should we also let the user search a particular version range?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. l think I can remove the ledger_version as it is best-effort. We cannot prove absence for any version.
  2. Even we let user know, it doesn't help because there is no proof and the reason can be multifold.
  3. If we cannot prove, why do we need version range?

Copy link

@msmouse msmouse Sep 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. return only "if txn.version < ledger_version" would be the most we want to do :P or remove the parameter.
  2. I don't think the complexity is worth it. We just put in disclaimers to say we don't guarantee absence and can't prove it. In reality everything after the API is supported will be searched and returned, but again we can't prove it.
  3. no

Copy link
Contributor

@aching aching Sep 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume a reasonable use case is something like:
I want to find my txn so do you have it? Server says no.
Do I look elsewhere? If the server searched versions 0 current, then I don't need to (if I trust this server).
Otherwise if the server only had current_minus_one to current, I should absolutely ask another server who can tell me 0 to currrent_minus_one.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use case is I just submitted a transaction and I want to see if it's included yet, if I didn't misunderstand.
We always have the proper APIs (get_by_addr_and_seq(), get_by_ver()) to get historical transactions.

I'd rather always guarantee (0, ledger_version) is searched by backfilling the data, than complicating the API for ever.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we return OK(None) is NOT_FOUND, other query errors are Err(_). But the reasons of NOT_FOUND can be various. In our case, the index may somehow get lost or the ledger version is too new.
the ledger version there is to keep a consistent view of api, to make the client know not_found returned by querying the server's local range but not beyond server's local version range.

Copy link
Contributor

@xli xli Sep 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think index data lost probably can be ignored if we don't have a way to distinguish it from not exist.

We should assume index (0, ledger_version) will be eventually fulfilled. (should we have an admin api to rebuild index in background for node?)

Is there a plan to prune this index for old txn data? I assumed not, once we build the index, we will keep it, right? if that's the case, I think return Ok(None) is fine to me for the case not found.
Service layer will return current latest ledger version in response body when txn is not found.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For admin api, that's a possible way to go. But maybe not hi-pri for now.
I think the pruning will be consistent with the txn, if txn got pruned, there is no sense to keep the index. If this case, how do you run the admin api?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Synced offline with our "customer" @xli, and the team agreed on moving forward with the simple interface.

@aching My thinking is, if pruning is in place, with this simple interface, we have two options: 1. maintain a strong guarantee that the index searched covers all transactions that's on the server, in that case, the JsonAPI layer will have way to know the range. or 2. as we claim today, we don't make any guarantee that the index is complete (in reality today though, our index covers all transactions created after the new version of software is put in, and if we do a backfill, that equals the full range of existing transactions.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All sounds good to me, one ask for the case that no guarantee the index is complete: could we create a document for how to make the index complete (when we have the way to backfill the index), including configurations required for keep index after built?

msmouse
msmouse previously approved these changes Sep 30, 2021
xli
xli previously approved these changes Oct 1, 2021
@lightmark
Copy link
Author

@bors-libra land

@bors-libra bors-libra moved this from In Review to Queued in bors Oct 1, 2021
@bors-libra bors-libra moved this from Queued to Testing in bors Oct 1, 2021
bors-libra pushed a commit that referenced this pull request Oct 1, 2021
@bors-libra
Copy link
Contributor

💔 Test Failed - ci-test

@bors-libra bors-libra moved this from Testing to In Review in bors Oct 1, 2021
@xli
Copy link
Contributor

xli commented Oct 4, 2021

/land

@bors-libra bors-libra moved this from In Review to Queued in bors Oct 4, 2021
@bors-libra bors-libra moved this from Queued to Testing in bors Oct 4, 2021
bors-libra pushed a commit that referenced this pull request Oct 4, 2021
@bors-libra
Copy link
Contributor

💔 Test Failed - ci-test

@bors-libra bors-libra moved this from Testing to In Review in bors Oct 4, 2021
@lightmark lightmark dismissed stale reviews from xli and msmouse via 672e11e October 5, 2021 00:32
@lightmark lightmark requested a review from a team October 5, 2021 00:32
xli
xli previously approved these changes Oct 5, 2021
CapCap
CapCap previously approved these changes Oct 5, 2021
@lightmark
Copy link
Author

@bors-libra land

@bors-libra bors-libra moved this from In Review to Queued in bors Oct 5, 2021
@bors-libra bors-libra moved this from Queued to Testing in bors Oct 5, 2021
@github-actions
Copy link

github-actions bot commented Oct 5, 2021

Forge Test Result

Compatibility test results for land_f5ba0e6e ==> land_82a75c34 (PR)
1. Downgrade all validators to older version: land_f5ba0e6e
2. Upgrading first Validator to new version: land_82a75c34
3. Upgrading rest of first batch to new version: land_82a75c34
4. upgrading second batch to new version: land_82a75c34
5. check swarm health
Compatibility test for land_f5ba0e6e ==> land_82a75c34 passed
performance::performance-benchmark : 948 TPS, 4742 ms latency, 6150 ms p99 latency,no expired txns
Logs: https://diem-development.kb.us-west1.gcp.cloud.es.io:9243/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-15m,to:now))&_a=(columns:!(),filters:!(),index:'2da2a6e0-f70a-11eb-ad53-b979a6c9b83a',interval:auto,query:(language:kuery,query:''),sort:!(!('@timestamp',desc)))
Dashboard: http://mon.forge-0.aws.hlw3truzy4ls.com/d/overview/overview?from=1633420273976&to=1633421518722&orgId=1

Repro cmd:

./scripts/fgi/run --tag land_82a75c34 --base-image-tag land_f5ba0e6e --suite land_blocking_compat --report forge_report.json

🎉 Land-blocking forge test passed! 👌

@github-actions
Copy link

github-actions bot commented Oct 5, 2021

Cluster Test Result

Test runner setup time spent 283 secs
Compatibility test results for land_f5ba0e6e ==> land_82a75c34 (PR)
1. All instances running land_f5ba0e6e, generating some traffic on network
2. First full node land_f5ba0e6e ==> land_82a75c34, to validate new full node to old validator node traffic
3. First Validator node land_f5ba0e6e ==> land_82a75c34, to validate storage compatibility
4. First batch validators (14) land_f5ba0e6e ==> land_82a75c34, to test consensus and traffic between old full nodes and new validator node
5. First batch full nodes (14) land_f5ba0e6e ==> land_82a75c34
6. Second batch validators (15) land_f5ba0e6e ==> land_82a75c34, to upgrade rest of the validators
7. Second batch of full nodes (15) land_f5ba0e6e ==> land_82a75c34, to finish the network upgrade, time spent 689 secs
all up : 1169 TPS, 3882 ms latency, 4400 ms p99 latency, no expired txns, time spent 250 secs
Logs: http://kibana.ct-0-k8s-testnet.aws.hlw3truzy4ls.com/app/kibana#/discover?_g=(time:(from:'2021-10-05T07:50:55Z',to:'2021-10-05T08:13:08Z'))
Dashboard: http://grafana.ct-0-k8s-testnet.aws.hlw3truzy4ls.com/d/performance/performance?from=1633420255000&to=1633421588000
Validator 1 logs: http://kibana.ct-0-k8s-testnet.aws.hlw3truzy4ls.com/app/kibana#/discover?_g=(time:(from:'2021-10-05T07:50:55Z',to:'2021-10-05T08:13:08Z'))&_a=(columns:!(log),query:(language:kuery,query:'kubernetes.pod_name:"val-1"'),sort:!(!('@timestamp',desc)))

Repro cmd:

./scripts/cti --tag land_f5ba0e6e --cluster-test-tag land_82a75c34 -E BATCH_SIZE=15 -E UPDATE_TO_TAG=land_82a75c34 --report report.json --suite land_blocking_compat

🎉 Land-blocking cluster test passed! 👌

@bors-libra bors-libra dismissed stale reviews from CapCap and xli via 82a75c3 October 5, 2021 08:13
@bors-libra bors-libra removed this from Testing in bors Oct 5, 2021
@bors-libra bors-libra closed this in 82a75c3 Oct 5, 2021
@bors-libra bors-libra merged commit 82a75c3 into diem:main Oct 5, 2021
@bors-libra bors-libra temporarily deployed to Sccache October 5, 2021 08:13 Inactive
@bors-libra bors-libra temporarily deployed to Docker October 5, 2021 08:14 Inactive
@bors-libra bors-libra temporarily deployed to Sccache October 5, 2021 08:14 Inactive
@bors-libra bors-libra temporarily deployed to Sccache October 5, 2021 14:37 Inactive
@bors-libra bors-libra temporarily deployed to Docker October 5, 2021 14:37 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed storage Issues related to storage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants