Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reworked implicit borrowing to be more relaxed #396

Merged
merged 1 commit into from Dec 21, 2020

Conversation

vallentin
Copy link
Collaborator

@vallentin vallentin commented Dec 7, 2020

Fixes #376 (can't compare &str with str, while import/include remains the same by design)

This PR changes SetChain into MapChain, where the value represents an optional variable which the local variable "points to". This improves the generated Rust code, as existing variables are no longer continuously generated for every macro call. This means that variables are no longer continuously borrowed, e.g. making it possible to s == "foo" in a macro.

No test cases were modified, and this is now a non-breaking change.


Old Explanation:

To be able to reuse variables in nested macro calls, I ended up reworking SetChain into MapChain. Such that the each local can have an optional variable (or expression) which it points to. Then I added resolve_var to be able to follow and resolve a chain if variables, e.g. "param" -> "a" -> "b" -> "self.c".

This avoid that previously when let (..) = (..) would be defined in the local scope, it would either (a) cause values to be moved if non-copyable or (b) force all values to be referenced. This would cause every level of nested macro call, to increase the amount of indirection (i.e. 3 depth macro call == &&&T). This would makeb it harder to use .clone(), but now it does not exceed &T so using .clone() is always possible.

Given that HashSet<T> internally uses a HashMap<T, ()>, then a SetChain can still be created using, e.g. type SetChain<'a, T> = MapChain<'a, T, ()>. Which implies that they are still identical. I have not changed "SetChain"'s functionality, I've only added get and get_skip as well as resolve_var which I implemented specifically for MapChain<'_, &str, Option<String>>.

Additionally, now empty let () = (); would be generated more frequently, since variables are forwarded. So these are also culled from the output.

@vallentin
Copy link
Collaborator Author

vallentin commented Dec 7, 2020

I've sent you a message on Twitter the other day, in case you haven't noticed.

@djc
Copy link
Owner

djc commented Dec 13, 2020

I think I like this direction but in order to be able to review/understand it well I feel like I need more guidance, and perhaps I'd also want to aim for splitting up the initial commit here into smaller changes. Let's maybe start by leaving new tests out of this PR for now so I can focus on reviewing the logic changes and the changes to existing tests. Also, please squash intermediate commits like Reverted TemplateLoop and Updated indent that changes things from earlier commits into their earlier commits (one option here is the the git-absorb git extension).

First, how do you see this affecting backwards compatibility for existing templates?

Second, what is the goal of the change from SetChain to MapChain? What additional information is it tracking, and how?

@vallentin
Copy link
Collaborator Author

to review/understand it well I feel like I need more guidance

The actual change to the generator isn't that crazy. It's overall just:

  1. Removing a bunch of "&"
  2. The places where it does "&", do so if !is_copyable()
  3. Reuse Expr::Var when possible, instead of emitting a let ... for every macro call

But tell me how I can help or feel free to comment on all the lines you feel necessary, and I'll comment. (Maybe wait until I've rebased.)


Second, what is the goal of the change from SetChain to MapChain? What additional information is it tracking, and how?

In short, consider the following template:

{% macro bar(y) %}
    {{ y }}
    {% if y == "bar" %}
    {% endif %}
{% endmacro %}

{% macro foo(x) %}
    {{ x }}
{% endmacro %}

{% call foo(self.item) %}

Then before it would essentially generate:

(Which would result in a compile error for the if)

let x = &self.item;
let y = &x;
write(&y);
if &y == "bar" {}

Whereas now it essentially generates:

write(&self.item);
if self.item == "bar" {}

All in all, now you're able to do y == "bar" in macro bar, as self.item is propagated all the way, and whether it is borrowed, is now decided when it is used.

So in short, to be able to know which variable a local variable points to, that information is stored in the value part of the HashMap. To reuse the same example x points to self.item, y points to x. So when y is finally emitted, then it is resolved back to self.item.

So since that information needed to be part of every variable in every scope, then I thought the simplest way would be to change HashSet<T> to HashMap<K, V>. Functionally MapChain<T, ()> is the equivalent to SetChain<T>.


Backwards compatibility for existing templates?

Technically, it's a breaking change.

However, only 4 of the existing test cases were affected, 1 of which I introduced 11 days ago (#393). The affected test cases, are functions that had a borrowed literal as a parameter (e.g. &u8). Technically, if you did {{ self::foo(self.i) }} then it would require .clone() as anything accessed through self is still borrowed.

So realistically, I don't think it will be an issue, and if someone did encounter issues, then adding .clone() or inversely .as_ref() is enough to fix those few cases.

Though, maybe the previous non-breaking changes should be released as a patch (v0.10.6), and then after we can bump the minor version (v0.11.0) and include this change.


I'd also want to aim for splitting up the initial commit here into smaller changes

Maybe it will be possible to split the write_loop change. But I'm unsure how much else of the generator changes can be split.

squash intermediate commits like Reverted TemplateLoop and Updated indent that changes things from earlier commits into their earlier commits

👍


Let's maybe start by leaving new tests out of this PR for now so I can focus on reviewing the logic changes and the changes to existing tests

Sure, I wanted to rewrite those into the existing test cases, in a more approachable manner anyways. As it is indeed a bit much, I just wanted to test every case I could think of, to ensure they wouldn't break later.

So, do I just drop the commit for the new test cases? (I'll keep a local copy, to rework into the other test cases for later then)

@djc
Copy link
Owner

djc commented Dec 14, 2020

Ok, that all sounds good. Definitely split off the new test cases for a later PR. In fact, if it's not too hard to extract from the other changes, I'd prefer to focus this PR only on the change around the MapChain, that is (if I understand correctly) to "resolve" Askama-local variables at (template) compile time, and then move the rest of the implicit borrowing changes out to yet another PR.

@vallentin vallentin force-pushed the implicit-borrow branch 2 times, most recently from 07b056f to 5202327 Compare December 15, 2020 05:38
@vallentin
Copy link
Collaborator Author

vallentin commented Dec 15, 2020

(if I understand correctly) to "resolve" Askama-local variables at (template) compile time

Yes, that is correct.


I've updated the PR. Now it only includes the SetChain to MapChain change. Which also means it now is a non-breaking change.

In short:

  • No tests have been modified, nor affected
  • The filters remain as they were prior to this PR (i.e. they still take literals by reference)
  • Calling is back to using a references (which is why it's a non-breaking change now)

I've updated the PR description to reflect the changes.

askama_shared/src/generator.rs Outdated Show resolved Hide resolved
askama_shared/src/generator.rs Outdated Show resolved Hide resolved
askama_shared/src/generator.rs Outdated Show resolved Hide resolved
askama_shared/src/generator.rs Outdated Show resolved Hide resolved
fn insert(&mut self, val: T) {
self.scopes.last_mut().unwrap().insert(val);
fn insert(&mut self, key: K, val: V) {
let old_val = self.scopes.last_mut().unwrap().insert(key, val);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment documenting why the asserted invariant should hold?

Copy link
Collaborator Author

@vallentin vallentin Dec 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, we could also remove the assert because Rust itself will error when attempting to compile the generated template. Basically, the cases are e.g. {% macro f(a, a) %} and {% let (a, a) = ... %} which both results in a compile error along the lines of "identifier a used more than once" .

I'll remove it and add a note to insert. Because if we want to add "identifier {:?} used more than once" to the assert, then insert requires where K: Debug + Clone/Copy.

askama_shared/src/generator.rs Outdated Show resolved Hide resolved
askama_shared/src/generator.rs Outdated Show resolved Hide resolved
askama_shared/src/generator.rs Outdated Show resolved Hide resolved
@vallentin
Copy link
Collaborator Author

vallentin commented Dec 19, 2020

  • Refactored into using struct LocalMeta.
  • Removed get_skip and resolve_var.
    • In write_call the expression is resolved up front. So since it's resolved then get() always get's the actual expression, so no need to resolve_var and traverse backwards anymore.

When this is merged, I'll do "let shadow" and then after that "copy literals".

askama_shared/src/generator.rs Outdated Show resolved Hide resolved
askama_shared/src/generator.rs Outdated Show resolved Hide resolved
askama_shared/src/generator.rs Outdated Show resolved Hide resolved
@vallentin
Copy link
Collaborator Author

I tried changing the resolve methods to return Cow. However, it would just introduce an additional clone for Expr::Attr. Further changing LocalMeta to use Cow made the rest more cumbersome.

@djc djc merged commit c7697cb into djc:main Dec 21, 2020
@djc
Copy link
Owner

djc commented Dec 21, 2020

Looks good! In the future, please consider the following in terms of ordering:

  • Items that depend on another item should appear before the method they depend on
  • &mut self methods generally should be defined before &self methods

In the new LocalMeta methods, you did it the other way around.

@vallentin
Copy link
Collaborator Author

Hmm, LocalMeta only has a single &self method. Are you talking about the insert and insert_with_default I added to MapChain? 😅

Items that depend on another item should appear before the method they depend on

So in relation to that, you referring to swapping insert and insert_with_default around?

&mut self methods generally should be defined before &self methods

I assume, you're just mentioning this so I know for the future. Because a quick glance, and I don't see anywhere that I switched it around. To be fair, I didn't really think about that, because my personal preference for ordering, is different than that. 😅

However, I'll try to remember that for the future.

@vallentin vallentin deleted the implicit-borrow branch December 21, 2020 21:34
@djc
Copy link
Owner

djc commented Dec 22, 2020

Yeah, sorry, the MapChain methods (apparently I keep mixing these up!).

So in relation to that, you referring to swapping insert and insert_with_default around?

Yup.

I assume, you're just mentioning this so I know for the future. Because a quick glance, and I don't see anywhere that I switched it around. To be fair, I didn't really think about that, because my personal preference for ordering, is different than that. 😅

What is your preference for ordering, and why?

@vallentin
Copy link
Collaborator Author

Yup.

I'll try to remember this in the future. But otherwise, just tell me to reorder if I forget.

What is your preference for ordering, and why?

Actually, "preference" might not be the correct word. Now that I'm thinking about it, it's probably more correct to say "habit" and "what I'm used too".

Generally, I order stuff by visibility, i.e. pub, pub(crate), pub(super) and lastly private items. After that, I order methods in relation to their "kind", i.e. I usually do:

  • Static methods
  • Constructors, i.e. static methods returning some form of Self or Result<Self, _>
    • e.g. new(), with_...()
  • Builder methods
  • Methods that consume self, e.g. I have a Worker with a join(self) method for some multithreading stuff
  • &mut self methods excluding setters
  • Pairwise getter and setter for fields
  • Other utility methods, like is_empty, len, etc
  • Lastly private methods

I'd actually write use declarations in the opposite order (in e.g. generator.rs), i.e. std, then dependencies, then crate::.

However, ultimately I always attempt to follow the style of the repo I'm making a PR for. :)

@djc
Copy link
Owner

djc commented Dec 22, 2020

That order is actually pretty close to what I'm aiming for, but I do like to order functions (and items in general) such that depending items come before depended upon items. Usually the former are more complex and need more changes and are more likely to be called on by "external" callers.

@vallentin
Copy link
Collaborator Author

such that depending items come before depended upon items

Of course, and honestly that actually makes a lot of sense in terms of readability. Say insert was sufficiently complex, you might not "notice" the subsequent small insert_with_default. But if it comes beforehand, then it stands out more. Though this is of course from the perspective of reading through the code.

My only annoyance would be that in the documentation, I'd prefer insert to come before insert_with_default. However, rustdoc would follow the order of the items in the code. 😅

@clouds56
Copy link

I'm not sure if this is the right place, but I cannot figure out how to make this run

{% for i in v.iter().skip(offset).take(size.clone()) %}
{% endfor %}

it would generate

for (i, _loop_item) in ::askama::helpers::TemplateLoop::new((&self.v.iter().skip(&self.offset).take(&{
    self.size.clone()}
)).into_iter())

I think it should not borrow when the expression is an function call

@djc
Copy link
Owner

djc commented Dec 29, 2020

Have you tried with latest main?

@vallentin
Copy link
Collaborator Author

Currently v.iter().skip(offset).take(size.clone()) is not supported, neither the latest release nor the latest commit on main. The issue in particular is .skip(offset), currently all fields and variables are implicitly referenced and cannot be deferenced.

This is related to "copy literals" (issue #404), and a possible solution and workaround should be available soon.


If you use the latest commit on main, then there is a rather simple workaround, by using a method on your template.

impl SomeTemplate {
    fn iter_skip_take(&self) -> impl Iterator<Item = &...> {
        self.v.iter().skip(self.offset).take(self.size)
    }
}
{% for i in self.iter_skip_take()  %}
{% endfor %}
[dependencies]
askama = { git = "https://github.com/djc/askama", rev = "b880799" }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow macro blocks in included templates
3 participants