Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How does loops work? How does Gigahorse translate loops? #137

Open
lahiri-phdworks opened this issue May 23, 2024 · 3 comments
Open

How does loops work? How does Gigahorse translate loops? #137

lahiri-phdworks opened this issue May 23, 2024 · 3 comments

Comments

@lahiri-phdworks
Copy link

lahiri-phdworks commented May 23, 2024

I am not able to understand the translation of a loop function to the IR generated by Gigahorse.
I have a solidity function that looks like the example shown below.

Example:

pragma solidity >=0.4.2 <=0.7.6;

contract Loop {
    function loop_example (uint a, uint b) public view returns (uint) {
       require(a > 0, "parameter a passed to this function be greater than 0");
       uint c = 0;
       while (a > 0) {
           a = a - 1;
           c = c + b
       }
       return c;
    }
}
  • How does Gigahorse translate this function?
  • Is a, the loop variable, stored in the register or in memory using MSTORE/MLOAD constructs?

Any clarification in this direction will be really helpful.

@sifislag
Copy link
Collaborator

Hi, have you been able to analyze this program using gigahorse? You can find our loop related logic in clientlib/loops.dl and clientlib/loops_semantics.dl.

Our main relation is this one, denoting that a basic block s is part of a loop identified via loophead:

.decl BlockInStructuredLoop(s:Block, loophead:Block)

a originating as a stack variable in the analyzed EVM program, will be placed on a register Variable in our IR.

Hope I was able to help, let me know if you have more questions.

@lahiri-phdworks
Copy link
Author

Thanks, I was going over those files and the one on dominance as well.

The doubt I have is that in case of loops, in the loop header block, there will be two sets of variables that come in (i.e. two stacks of registers I guess), one from the previous block and other via a back edge.

How does the tool merge them (definitions)? I can see PHI instructions in the IR, but is it guaranteed that the number of variables in both the stacks will remain the same?

lahiri-phdworks added a commit to Research-Tools-PAVT/gigahorse-toolchain that referenced this issue May 23, 2024
@sifislag
Copy link
Collaborator

sifislag commented Jun 9, 2024

Hi, I'm not exactly sure what the question is but I think I got it. File clientlib/loop_semantics.dl defines several related concepts we use to model loops.

/**
  Simple loop induction variable. Identified by the two variables.
*/
.type SimpleInductionVariable = [beforeLoopVar: Variable, inLoopVar: Variable]

.decl WellFormedLoopInductionVariable(loop: Block, inductionPhiVar: Variable, simpleIndVar: SimpleInductionVariable)
.decl InductionVariableIncreasesByConst(loop:Block, simpleIndVar:SimpleInductionVariable, const:number)
.decl InductionVariableDecreasesByConst(loop:Block, simpleIndVar:SimpleInductionVariable, const:number)
.decl InductionVariableStartsAtConst(loop:Block, simpleIndVar:SimpleInductionVariable, const:Value)
.decl InductionVariableLowerBoundVar(loop:Block, indVar:SimpleInductionVariable, bound:Variable)
.decl InductionVariableUpperBoundVar(loop:Block, indVar:SimpleInductionVariable, bound:Variable)

Essentially because of PHIs in our IR are introduced at every block that accesses them (instead of placing them at dominance frontier), so we identify well formed induction variables by the beforeLoop and inLoop variables. We have several high-level relations that use this concept.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants