Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dyno: Initial resolution of zip() expressions and parallel iterators #24915

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

dlongnecke-cray
Copy link
Contributor

@dlongnecke-cray dlongnecke-cray commented Apr 23, 2024

This PR introduces resolution of zip() expressions and resolution of parallel iterators for forall and [] loops.

For zip() expressions that appear as the iterand of a serial loop, the strategy is to resolve only serial iterators for each actual in the zip().

For zip() expressions that appear as the iterand of a forall loop, the strategy is to:

  • For the first actual (known as the leader), attempt to resolve a suitable leader iterator or error
  • For the leader and all remaining actuals (known as followers), attempt to resolve a suitable follower iterator or error

For [] loops, the strategy is similar, except serial iterators may be used as a substitute for the leader and followers IFF the leader iterator could not be resolved for the leader. If the leader iterator could be resolved, but e.g., its return type could not, then serial iterators will not be considered as fallbacks.

For iterands that are not zip() expressions, the "standalone" parallel iterator is preferred for parallel loops before attempting to resolve a leader/follower combo. As with zip(), forall loops will emit an error if no form of parallel iterator could be resolved. All other loops will fall back to serial iterators.

Thanks to @vasslitvinov for walking me through the semantics of zip() and parallel iterator resolution.

FUTURE WORK

  • Iterator forwarding for iterators in the internal modules (e.g., tag=tag, followThis=followThis)
  • Expand tests, handle error cases, attempt to resolve leader/follower/standalone iterators for standard/internal types
  • Consider adding an enum to TypedFnSignature expressing its iterKind
  • Set array types before resolving iterators instead of backpatching

Reviewed by @DanilaFe, @mppf. Thanks!

@dlongnecke-cray
Copy link
Contributor Author

I'll fix the CI check failures tomorrow, have to head out for the night.

Copy link
Contributor

@DanilaFe DanilaFe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm halfway through, a few comments, mostly minor changes within individual lines.

frontend/include/chpl/types/EnumType.h Outdated Show resolved Hide resolved
frontend/lib/resolution/Resolver.cpp Outdated Show resolved Hide resolved
frontend/lib/resolution/Resolver.cpp Outdated Show resolved Hide resolved
frontend/lib/resolution/Resolver.cpp Outdated Show resolved Hide resolved
frontend/lib/resolution/Resolver.cpp Outdated Show resolved Hide resolved
frontend/lib/resolution/Resolver.cpp Outdated Show resolved Hide resolved
frontend/lib/resolution/Resolver.cpp Outdated Show resolved Hide resolved
frontend/lib/resolution/Resolver.cpp Outdated Show resolved Hide resolved
frontend/lib/resolution/Resolver.cpp Outdated Show resolved Hide resolved
@dlongnecke-cray
Copy link
Contributor Author

Adding isSerialIterator isLeaderIterator isFollowerIterator and isStandaloneIterator as per your suggestion.

@dlongnecke-cray
Copy link
Contributor Author

My PR has exposed a different bug in testLibrary.testHelloWorld that is causing the test to fail (so CI checks fail).

While here, also comment out the 'testHelloWorld' dyno test.
Resolving parallel iterators has introduced some bugs that cause this
test to fail. Re-add it when we think the dyno resolver is ready.

Signed-off-by: David Longnecker <[email protected]>
Copy link
Contributor

@DanilaFe DanilaFe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a straggler comment I apparently never submitted.

@@ -819,6 +820,33 @@ void TypedFnSignature::stringify(std::ostream& ss,
ss << ")";
}

bool TypedFnSignature::
isIterWithIterKind(Context* context, const std::string& iterKindStr) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest using UniqueString for this after all because USTR comparisons are O(1) -- as opposed to O(n) for std::strings. They also require less allocations.

Copy link
Contributor

@DanilaFe DanilaFe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went over it again since I've not been ooking at it for a while. This looks very very slick. 👍

if (it != m->end()) return it->second;
}
}
return QualifiedType(QualifiedType::VAR, UnknownType::get(rv.context));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For UnknownType, I think you need to set the kind to QualifiedYype::UNKNOWN.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this would be better supported by a query. Seems like a fair amount of unnecessary work to redo & there will only be a handful of inputs and outputs to it.

Comment on lines 4205 to 4206
// Resolve iterators, stopping immediately when we get a valid yield type.
auto ret = [&]() -> IterDetails {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you doing this to make early return possible? I suggest extracting this into a helper and not defining an immediately-invoked lambda.

Comment on lines 4294 to 4296
bool needStandalone = iterKindStr == "standalone";
bool needLeader = iterKindStr == "leader";
bool needFollower = iterKindStr == "follower";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use USTR("standalone") etc. here to make the comparison O(1) instead of O(n).

Comment on lines +4313 to +4316
((fn->isParallelStandaloneIterator(context) && needStandalone) ||
(fn->isParallelLeaderIterator(context) && needLeader) ||
(fn->isParallelFollowerIterator(context) && needFollower) ||
(fn->isSerialIterator(context) && needSerial));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can this be possible? I don't think we allow being able to write calls with user-provided tags.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed this - it is possible and is called iterator "forwarding" and is done in the internal modules. It should be an error in user code, though.

cur.enterScope(loop);
if (!ret.isUnknownOrErroneous()) {
rv.handleResolvedCall(iterandRE, astForErr, ci, c,
{ { AssociatedAction::ITERATE, iterand->id() } });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suspect the associated action needs to be different for turning foo() into foo(tag=follower)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't worry about that right now; we aren't consuming the associated actions yet and will know more when we do. It might be sufficient because the resolved function should be saved here in the associated action.

if (!isIterator()) return false;

auto ik = types::EnumType::getIterKindType(context);
if (auto m = types::EnumType::getParamConstantsMapOrNull(context, ik)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider early return if this is nullptr to avoid nesting code.

#define ADVANCE_PRESERVING_SEARCH_PATHS_(ctx__) \
do { \
ctx__->advanceToNextRevision(false); \
setupModuleSearchPaths(ctx__, false, false, {}, {}); \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking this doesn't preserve them as much as it resets them to empty. A macro or method for this that truly preserves search paths would be nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll go ahead and rename the macro to something like "advance preserving standard modules".

Copy link
Member

@mppf mppf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, this looks good. I'm happy to see Resolver::enter(const IndexableLoop* loop) broken up into helper functions.

I'd like you to address my feedback comments before merging; some of these are asking you to restructure some of the code, but I am not expecting these changes to be very hard since the computation will be the same.

Can you take a minute to try to add the early check for it being an array type? I am not expecting that is very hard. If it is hard, listing it as Future Work in the PR message would be good. I think we should try to do this sooner rather than later as I expect it'll be a compilation performance hit if we don't.

@@ -949,6 +955,9 @@ class TypedFnSignature {
const TypedFnSignature* parentFn,
Bitmap formalsInstantiated);

bool
isIterWithIterKind(Context* context, UniqueString iterKindStr) const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we introduce an enum in the C++ code for this? The use of UniqueStrings is OK with me as well; but an enum would allow additional checking at compile-time (e.g. avoiding mis-spelling; and in some cases we can write switch statements and have the compiler check we covered all cases).

in 'et' to each constant represented as a param value.
If there are multiple enum constants with the same name (which
means the AST is semantically incorrect), then only the first
constant is added to the map. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docs string should say in what situation it would return nullptr.

auto k = UniqueString::get(context, elem->name().str());
auto it = ret.find(k);
if (it != ret.end()) continue;
ret.emplace_hint(it, std::move(k), std::move(qt));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's clearer to use insert instead of find/emplace_hint. Also we use this insert pattern in many other places in dyno, so using it here would make this code easier to follow for anybody looking at lots of dyno code.

insert can return a pair, where the second element of that pair indicates if insertion took place. If the map already had the element, insert won't insert anything, and so that second element will be false.


auto it = m->find(iterKindStr);
if (it != m->end()) {
bool isFollowerIterKind = iterKindStr == "follower";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iterKindStr == USTR("follower");

@@ -843,6 +844,33 @@ void TypedFnSignature::stringify(std::ostream& ss,
ss << ")";
}

bool TypedFnSignature::
isIterWithIterKind(Context* context, UniqueString iterKindStr) const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be unnecessarily slow in the context of isSerialIterator where it is called multiple times with different kind strings.

Let's instead make a function to return the iterator kind for iterators. Then the TypedFnSignature would be doing things like iterKind(context) == USTR("leader").

Also, IMO it is worthwhile to make some kind of representation of the iterator kind part of TypedFnSignature itself. That would prevent the work of repeatedly computing this information. I don't view this as a strict requirement, but if it appeals to you, let's go ahead with it. (If you do this, please add an enum for the compiler's representation of iterator kinds). Of course, an alternative way would be use a query to compute the iterKind for a const TypedFnSignature*.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am torn about whether or not to add an enum or just compute it. I'm not sure the performance cost of computing it repeatedly will be that high or significant. I agree this particular code can be improved.

Context* context = rv.context;

if (mask == IterDetails::NONE || rv.scopeResolveOnly) {
iterand->traverse(rv);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably deserves a comment here about why we need to traverse the iterand in this case.

MSC.only().fn()->untyped()->kind() == uast::Function::Kind::ITER;
// Resolve the iterand but suppress errors for now. We'll reissue them
// next, possibly suppressing a "NoMatchingCandidates" for the iterand if
// our injected call is successful.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why suppress errors here? That seems odd to me. I would think, if we have for i in abc() or forall j in def() then an error in resolving abc() or def() would be fatal. Is the issue that, def might have a standalone iterator but not a serial iterator? Would that be better handled by resolving interior parts of a call (e.g. forall j in a(b(c())) we resolve b(c()) but delay resolving a() for now.

It seems to me that we could do this by simply traversing the children of iterand here before we proceed. The runAndTrackErrors stuff is cool but I think it has performance implications.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just needed to resolve the iterand's type fully before I could make any decisions about it. It could be an iterator call. It could be as you say, a type with a standalone iterator but not a serial, in which case we need to inject the arguments and try again.

This is one way to do it, but I agree about the potential negative performance impact. Another option could be to add some sort of bool doNotEmitCallErrors flag to the Resolver. I tried to get that to work for a bit but I wasn't happy with it.

I will add a future work item about exploring an alternative to runAndTrackErrors.

bool wasIterandTypeResolved = !iterandRE.type().isUnknownOrErroneous();
bool wasIterResolved = fn && fn->isIterator();
bool wasMatchingIterResolved = wasIterResolved &&
((fn->isParallelStandaloneIterator(context) && needStandalone) ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this code would be more efficiently written if it computes the iterator tag and then does comparisons including needStandalone etc.

auto iteratorTag = fn->iteratorTag(context);
if ((iteratorTag == USTR("standalone") && needStandalone) || ...

(of course it would look different if you add that enum).

cur.enterScope(loop);
if (!ret.isUnknownOrErroneous()) {
rv.handleResolvedCall(iterandRE, astForErr, ci, c,
{ { AssociatedAction::ITERATE, iterand->id() } });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't worry about that right now; we aren't consuming the associated actions yet and will know more when we do. It might be sufficient because the resolved function should be saved here in the associated action.


// TODO: What would it take to make backpatching of array types happen
// _before_ / without resolving an iterator? Currently we rely on iterator
// resolution to resolve the iterand for us.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that we only needed to detect if elements yielded by the loop expression are types rather than values? Also it checks something about there being a domain involved. Seems like it'd be easy enough to filter these earlier. Am I wrong about something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change the stale TODO.

Signed-off-by: David Longnecker <[email protected]>
@dlongnecke-cray
Copy link
Contributor Author

Hello @mppf @DanilaFe. I believe I've responded to all of your feedback. I've added some TODOS to the future work part of the PR message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants