Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[compiler-v2] Finishing refactoring of abilities + type checker overhaul #12675

Merged
merged 3 commits into from Mar 29, 2024

Conversation

wrwg
Copy link
Contributor

@wrwg wrwg commented Mar 25, 2024

Description

This concludes moving all ability checking into the type inference algorithm by representing checks as type constraints. It also introduces some new constraints to deal with some other checks which have done previously ad-hoc (whether types are allowed to be references, tuples, phantom).

The PR turned out to be a larger refactoring of the checker as expected. On the good side, a lot of redundant ad-hoc code could be removed and replaced by more generic code.

Closes #12437 and closes #12674

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Other (specify) compiler

How Has This Been Tested?

New and existing tests

Key Areas to Review

Baseline files

Copy link

trunk-io bot commented Mar 25, 2024

⏱️ 39h 8m total CI duration on this PR
Job Cumulative Duration Recent Runs
rust-unit-tests 4h 51m 🟩🟩🟩🟩🟩 (+8 more)
rust-unit-coverage 4h 31m 🟩
rust-smoke-tests 4h 16m 🟩🟩🟩🟥🟩 (+3 more)
windows-build 4h 9m 🟩🟩🟩🟩🟩 (+8 more)
rust-smoke-coverage 3h 45m 🟩
execution-performance / single-node-performance 3h 16m 🟩🟩🟩🟩🟩 (+4 more)
rust-move-unit-coverage 3h 11m 🟩🟩🟩🟩 (+7 more)
forge-compat-test / forge 1h 44m 🟩🟩🟩🟩🟩 (+2 more)
forge-e2e-test / forge 1h 38m 🟩🟩🟩🟩🟩 (+2 more)
rust-images / rust-all 1h 25m 🟩🟩🟩🟩 (+4 more)
rust-lints 1h 18m 🟩🟥🟩🟩🟩 (+8 more)
run-tests-main-branch 1h 🟩🟩🟩🟩🟩 (+9 more)
cli-e2e-tests / run-cli-tests 55m 🟥🟥🟥🟥🟩 (+2 more)
check 51m 🟩🟩🟩🟩 (+9 more)
rust-move-tests 33m 🟩🟩🟩🟩🟩 (+8 more)
check-dynamic-deps 30m 🟩🟩🟩🟩🟩 (+9 more)
general-lints 27m 🟩🟩🟩🟩🟩 (+8 more)
indexer-grpc-e2e-tests / test-indexer-grpc-docker-compose 17m 🟩🟩🟩🟩🟩 (+2 more)
file_change_determinator 8m 🟩🟩🟩🟩🟩 (+9 more)
node-api-compatibility-tests / node-api-compatibility-tests 6m 🟩🟩🟩🟩🟩 (+2 more)
semgrep/ci 6m 🟩🟩🟩🟩🟩 (+9 more)
file_change_determinator 3m 🟩🟩🟩🟩🟩 (+9 more)
file_change_determinator 2m 🟩🟩🟩🟩🟩 (+6 more)
file_change_determinator 2m 🟩🟩🟩🟩 (+4 more)
execution-performance / file_change_determinator 2m 🟩🟩🟩🟩🟩 (+4 more)
permission-check 46s 🟩🟩🟩🟩🟩 (+9 more)
permission-check 43s 🟩🟩🟩🟩🟩 (+9 more)
permission-check 39s 🟩🟩🟩🟩🟩 (+8 more)
permission-check 33s 🟩🟩🟩🟩🟩 (+8 more)
permission-check 30s 🟩🟩🟩🟩🟩 (+4 more)
determine-docker-build-metadata 17s 🟩🟩🟩🟩🟩 (+4 more)
upload-to-codecov 17s 🟩

🚨 4 jobs on the last run were significantly faster/slower than expected

Job Duration vs 7d avg Delta
forge-compat-test / forge 17m 14m +25%
rust-lints 6m 7m -21%
rust-images / rust-all 28s 16m -97%
rust-move-tests 3s 8m -99%

settingsfeedbackdocs ⋅ learn more about trunk.io

Copy link

codecov bot commented Mar 25, 2024

Codecov Report

Attention: Patch coverage is 83.24176% with 183 lines in your changes are missing coverage. Please review.

Project coverage is 62.6%. Comparing base (36855e7) to head (783eb10).
Report is 5 commits behind head on main.

Files Patch % Lines
third_party/move/move-model/src/ty.rs 78.3% 140 Missing ⚠️
...d_party/move/move-model/src/builder/exp_builder.rs 90.6% 34 Missing ⚠️
third_party/move/move-model/src/builder/mod.rs 18.1% 9 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main   #12675       +/-   ##
===========================================
- Coverage    69.8%    62.6%     -7.2%     
===========================================
  Files        2293      819     -1474     
  Lines      436088   183454   -252634     
===========================================
- Hits       304431   114978   -189453     
+ Misses     131657    68476    -63181     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@wrwg wrwg marked this pull request as draft March 26, 2024 03:27
Base automatically changed from wg-receiver to main March 26, 2024 15:50
@wrwg wrwg marked this pull request as ready for review March 26, 2024 18:30
Copy link
Contributor

@vineethk vineethk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only minor comments. Looks great!

SomeReceiverFunction(Symbol, Option<Vec<Type>>, Vec<(Loc, Type)>, Type),
/// The type variable must be instantiated with a type which has the given ability
HasAbility(Ability),
SomeReceiverFunction(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appreciate the notes added.

),
/// The type must not be reference because it is used as the type of some field or
/// as a type argument.
NoReference,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Using NotReference, NotTuple, etc., may be a bit clearer. But because there is a clear comment, it is up to you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this would be clearer? I thought NoReference is the right English here, compared to NotReference which is not a valid word.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constraint is that it is "not a reference" => NotReference.

Both NoReference and NotReference are not valid words by themselves.

Anyway, I have marked this as a nit because the comment makes the constraint clear to me.

@@ -309,13 +619,29 @@ impl Constraint {
/// Represents an error resulting from type unification.
#[derive(Debug)]
pub enum TypeUnificationError {
/// The two types mismatch: `TypeMatch(actual, expected)`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// The two types mismatch: `TypeMatch(actual, expected)`
/// The two types mismatch: `TypeMismatch(actual, expected)`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

/// Lifts a type mismatch error to a pair of context type
/// Lifts a type unification error from the critical pair to the given context type.
/// A critical pair in type unification is the sub-term in which two type terms disagree,
/// e.g for `S<t> != S<t'>`, `(t, t`)` is the critical pair.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// e.g for `S<t> != S<t'>`, `(t, t`)` is the critical pair.
/// e.g for `S<t> != S<t'>`, `(t, t')` is the critical pair.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

};
// Providing instantiation context for number constraints is
// confusing for users. Those constraints are used in
// operators like `*` and so on but not visible to the use.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// operators like `*` and so on but not visible to the use.
// operators like `*` and so on but not visible to the user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -1,48 +1,48 @@

Diagnostics:
error: operand of cast must be a number
error: expected `u8|u16|u32|u64|u128|u256|num` but found a value of type `S`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: looks like we use integer in some error message contexts (eg, in binary_gt_invalid.exp), and u8|....|num here. Maybe we could use integer here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Done

Copy link
Contributor

@brmataptos brmataptos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I haven't finished readng ty.rs.

third_party/move/move-model/src/ty.rs Outdated Show resolved Hide resolved
@@ -374,34 +387,18 @@ impl<'env, 'translator, 'module_translator> ExpTranslator<'env, 'translator, 'mo
&mut self,
node_id: NodeId,
ty: &Type,
reported_types: &mut BTreeSet<Type>,
reported_vars: &mut BTreeSet<u32>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be useful to state in the comment above that

/// Free type variables found are added to `reported_vars`.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@brmataptos brmataptos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still reading test outputs. Here's one, though.

/// given a struct with declared abilities.
pub fn for_field(struct_abilities: AbilitySet, field_ty: &Type) -> Vec<Constraint> {
let mut result = vec![Constraint::NoPhantom, Constraint::NoTuple];
// Only if the field is not a type parameter, requires abilities. That reflects
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need NoReference(true)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fields can be references.

Copy link
Contributor

@brmataptos brmataptos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good but we have a number of test regressions, so if you don't fix them here file an issue to make sure we don't miss them.

Notably:

  • we don't seem to type-check structs until they are used
  • maybe also inline functions
  • we don't seem to check for reference fields of structs (Zekun noted in the code)
  • we could do better with types like u8|u16|..|num and u16|u32|...|num
  • when we infer types for variables it would be a bonus if we could show the reason when we complain about an assignment of a different type
  • other items I may have forgotten.
    Fix field reference types and I think we can record the test regressions and move on.

I checked all the tests. I hope you will forgive any mistakes I made, since Github makes it really hard to review all my comments before sending them.

TypeParameter(_) => true,
Tuple(ts) => ts.iter().any(|t| t.depends_from_type_parameter()),
Vector(t) => t.depends_from_type_parameter(),
Struct(_, _, ts) => ts.iter().any(|t| t.depends_from_type_parameter()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think you could use

Tuple(ts) | Struct(_, _, ts) | ResourceDomain(_, _, Some(ts) => ts.iter().any(|t| t.depends_from_type_parameter()),
Vector(t) | Reference(_, t) | TypeDomain(t) => t.depends_from_type_parameter(),
Fun(t, r) => t.depends_from_type_parameter() || r.depends_from_type_parameter()

to be a little more concise and easier to see how each is handled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -0,0 +1,112 @@

Diagnostics:
error: type `R` is missing required ability `drop`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: before the move, this file had errors at lines 11, 12, 14, 19, 21, 24, 26, 28, 33, 35, 38, 40, 43, 45. You may want to compare with the original test output.

Do we perhaps not check struct field constraints until the struct is used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this bug is now fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still missing errors, compared to main:

v1 had:   11, 12, 13, 14, 18, 19, 20, 21, 23, 24, 25, 26, 32, 33, 34, 35, 37, 38, 39, 40, 42, 43, 44, 45
v2 had:   11, 12, 13, 14, 19, 21, 24, 26, 28,                 33, 35,         38,     40,     43,     45
now have: 11, 12, 13, 14,                                 32, 33, 35,         38, 39, 40, 42, 43, 44, 45

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, there was another bug which lead to that types of parameters were not checked. Fixed now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I see you're still missing some errors that V1 had, but these seem to all be redundant (same type instantiation as a previous error), so I'll assume that's why we don't have them.

@wrwg wrwg requested review from fEst1ck and brmataptos March 28, 2024 21:44
@wrwg
Copy link
Contributor Author

wrwg commented Mar 28, 2024

Thanks for the reviews, all issues addressed.

@wrwg wrwg added the CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR label Mar 28, 2024
Copy link
Contributor

@brmataptos brmataptos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still some errors missing.

@@ -0,0 +1,112 @@

Diagnostics:
error: type `R` is missing required ability `drop`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still missing errors, compared to main:

v1 had:   11, 12, 13, 14, 18, 19, 20, 21, 23, 24, 25, 26, 32, 33, 34, 35, 37, 38, 39, 40, 42, 43, 44, 45
v2 had:   11, 12, 13, 14, 19, 21, 24, 26, 28,                 33, 35,         38,     40,     43,     45
now have: 11, 12, 13, 14,                                 32, 33, 35,         38, 39, 40, 42, 43, 44, 45

Copy link
Contributor

@brmataptos brmataptos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more I think.

@@ -0,0 +1,40 @@

Diagnostics:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't see it. Perhaps you forgot to push?

┌─ tests/checking/typing/conditional_global_operations.move:15:20
15 │ move_to(s, Box<R> { f: R {} });
│ ^^^^^^^^^^^^^^^^^^

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errors are now all there, but they used to complain about ability 'key' and now they complain about ability 'store'. Perhaps we could mention multiple abilities if multiple are required and missing?

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor Author

@wrwg wrwg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review, comments addressed. I also spend some time to generally get error positions better. Now type parameter constraint errors are actually reported at the type argument, and constraints for inferred types at the function/struct which introduces the type parameters.

There seem to be still some errors missing compared to v1 but this stems from removing duplicate errors. If in a function a particular type has been identified to not satisfy e.g. an ability constraint, this will be only reported once in this function. v2 reports it multiple times.

@@ -0,0 +1,40 @@

Diagnostics:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AH, now I see what you mean. Those drop errors on code level are later handled on bytecode level in AbilityProcessor. There is a test in ability-check/assign.rs.

┌─ tests/checking/typing/conditional_global_operations.move:15:20
15 │ move_to(s, Box<R> { f: R {} });
│ ^^^^^^^^^^^^^^^^^^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error before was imprecise. For Box to have key, R needs to have store. I improved the error positions a bit (this is actually a requirement on the inferred type of move_to), as well as add now the constraints from type parameters to the notes, so you can see it was key which was originally requested.

error: type `R` is missing required ability `store` (type was inferred)
   ┌─ tests/checking/typing/conditional_global_operations.move:15:9
   │
15 │         move_to(s, Box<R> { f: R {} });
   │         ^^^^^^^
   │
   = required by instantiating type parameter `T` of struct `Box`
   = required by instantiating type parameter `T:key` of function `move_to`

@@ -0,0 +1,112 @@

Diagnostics:
error: type `R` is missing required ability `drop`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, there was another bug which lead to that types of parameters were not checked. Fixed now.

@wrwg wrwg requested a review from brmataptos March 29, 2024 18:17

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

@brmataptos brmataptos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So close. I think there's a case missing involving phantom type parameters.

@@ -0,0 +1,112 @@

Diagnostics:
error: type `R` is missing required ability `drop`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I see you're still missing some errors that V1 had, but these seem to all be redundant (same type instantiation as a previous error), so I'll assume that's why we don't have them.

11 │ fun cds<T: copy + drop + store>() {}
│ - declaration of type parameter `T`
·
29 │ cds<vector<Cup<u8>>>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

V1 had errors on 31(c), 32(c), 33(k), 34(k), 35(k), 36(k), 37(cs), 38(c), 39(cds). These are phantom parameters, maybe that's the issue?

(c=copy, k=key, d=drop, s=store)

@@ -0,0 +1,114 @@

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing errors for lines 29+, these relate to phantom parameters.

@@ -0,0 +1,114 @@

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 29 on previously (v1) have errors involving phantom type parameters that are missing now.

@wrwg
Copy link
Contributor Author

wrwg commented Mar 29, 2024

So close. I think there's a case missing involving phantom type parameters.

The reason for the remaining missing errors is duplicate removal. I extended one of those tests to demonstrate this.

Basically, once we reported say that signer is missing drop in one function context, we don't do again, regardless of context. That's why the phantom errors are not reported, because they go through all types again.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@wrwg wrwg enabled auto-merge (squash) March 29, 2024 21:40

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

✅ Forge suite realistic_env_max_load success on 783eb109ab13d156bb67455e487336ca76f25caa

two traffics test: inner traffic : committed: 7932 txn/s, latency: 4954 ms, (p50: 4800 ms, p90: 5700 ms, p99: 10500 ms), latency samples: 3419080
two traffics test : committed: 100 txn/s, latency: 1907 ms, (p50: 1800 ms, p90: 2100 ms, p99: 6500 ms), latency samples: 1840
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.226, avg: 0.204", "QsPosToProposal: max: 0.278, avg: 0.257", "ConsensusProposalToOrdered: max: 0.474, avg: 0.427", "ConsensusOrderedToCommit: max: 0.329, avg: 0.312", "ConsensusProposalToCommit: max: 0.752, avg: 0.739"]
Max round gap was 1 [limit 4] at version 1586204. Max no progress secs was 4.644276 [limit 15] at version 1586204.
Test Ok

Copy link
Contributor

✅ Forge suite compat success on aptos-node-v1.9.5 ==> 783eb109ab13d156bb67455e487336ca76f25caa

Compatibility test results for aptos-node-v1.9.5 ==> 783eb109ab13d156bb67455e487336ca76f25caa (PR)
1. Check liveness of validators at old version: aptos-node-v1.9.5
compatibility::simple-validator-upgrade::liveness-check : committed: 5290 txn/s, latency: 5639 ms, (p50: 4800 ms, p90: 9600 ms, p99: 13900 ms), latency samples: 232780
2. Upgrading first Validator to new version: 783eb109ab13d156bb67455e487336ca76f25caa
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 854 txn/s, latency: 32350 ms, (p50: 38800 ms, p90: 47600 ms, p99: 49000 ms), latency samples: 57220
3. Upgrading rest of first batch to new version: 783eb109ab13d156bb67455e487336ca76f25caa
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 18 txn/s, submitted: 144 txn/s, expired: 125 txn/s, latency: 2551 ms, (p50: 2600 ms, p90: 3100 ms, p99: 3100 ms), latency samples: 5767
4. upgrading second batch to new version: 783eb109ab13d156bb67455e487336ca76f25caa
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 2313 txn/s, latency: 12815 ms, (p50: 13500 ms, p90: 19900 ms, p99: 20500 ms), latency samples: 108740
5. check swarm health
Compatibility test for aptos-node-v1.9.5 ==> 783eb109ab13d156bb67455e487336ca76f25caa passed
Test Ok

@wrwg wrwg merged commit 7eda6b5 into main Mar 29, 2024
104 of 106 checks passed
@wrwg wrwg deleted the wrwg/inst-ability branch March 29, 2024 22:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR
Projects
None yet
4 participants