Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple parser from PNode with DOD approach #423

Open
5 tasks
saem opened this issue Aug 31, 2022 · 3 comments
Open
5 tasks

Decouple parser from PNode with DOD approach #423

saem opened this issue Aug 31, 2022 · 3 comments
Assignees
Labels
refactor Implementation refactor

Comments

@saem
Copy link
Collaborator

saem commented Aug 31, 2022

Upgraded parser (and possibly lexer/tokenizer) that conforms to a more data oriented design approach as taken in the VM and new Backend (in progress). The semantic analysis layer won't be able to take advantage of it yet, but other tools like filters and pretty might.

The approach at the out set:

  • decouple the parser from PNode
  • improve parser data types to have a more data oriented approach and get it working
  • look at the DOD approach and see what rough edges/misfeatures need fixing, replacement, or removal; do some of those

NB: like any plan revise and improve the approach as we learn more. At the end we should have a faster parser which is nicer to memory, with more tests/validation against the grammar, fewer oddities in the grammar, clearer node kinds, decoupling from semantic analysis, and a path for tools like source filters, formatters, etc, easily making effective use of the new parser.

Deets

Decouple the parser from PNode

The parser is coupled to PNode right now which takes a sea of nodes on the heap approach. First step make a ParsedNode (or whatever) that's basically a very slim PNode, add a proc to convert to PNode, and get everything working as before.

Improve parser data types

Start revising the parser produced data types with a more DOD style. Start with a ParsedFragment that is a bit more than a sequence of parsed nodes (minimal object) and some metadata. Internally it should be a DOD layout and be produced by parseAll, parseTopLevelStmt, etc. A fragment could be some arbitrary string of AST or a while file/module, statement, etc. The idea is this is the parser's production for a single parse action. If a whole file vs fragment distinction proves to be useful then do that.

First cut data type will be easy and then getting the parser code to use it will be where it gets interesting focus on getting it working and keeping the data model as clean as possible. Mark all the weirdness so we know what to come back to next.

Smooth out rough edges

The above will get it working, now we clean things up so we're learning from the design smells showing up in the data model.

Final remarks on approach

This doesn't have to be a big bang. Quick wins regardless of this work should be PR'd separately ahead of time. If the first part of the approach doesn't hurt performance too much then that can come in on its own. If the second is clean enough then ditto. Other intermediate milestones might be discovered and get pulled in early.


todo steps added by @haxscramper, collected from the discussion in #425

  • Clean up the parser implementation - this PR. It introduces another slowdown of the parsing stage, but not too drastic, and immediately enables decoupling.
  • Throw away nimpretty hacks - `nimpretty` #113
  • Transition lexer to use DOD approach - I'm thinking of a global FileIndex -> seq[Token] mapping that is generated by the lexer. Each token includes a line, a column and a tag information. And an extent range.
    • ParsedNode includes (FileIndex, TokenIndex) pair that is used to get the token back from the main storage
    • No tokens are discarded from the tokenizer - Non-documentation comments, obviously included. I'm not exactly sure about the INDENT/DEDENT/SAME-INDENT token set, but I think we can also retain them.
  • Add reduced parsed node set - based on the TNodeKind, but without sem-related data - while ParsedNode and PNode are structurally similar, the latter one has a lot more states.
  • Change the ParsedNode to use a DOD structure as well
@saem saem added the refactor Implementation refactor label Aug 31, 2022
@saem
Copy link
Collaborator Author

saem commented Aug 31, 2022

@haxscramper took a bit but to a first crack at it. Wrote it quickly on my phone during lunch so hopefully it's not too bad.

@haxscramper haxscramper self-assigned this Aug 31, 2022
@haxscramper
Copy link
Collaborator

I think "Decouple the parser from PNode", "Improve parser data types" and follow-ups can be implemented as three separate PRs - the first one is rather easy, so I can start working on it right away, and then we can look closer at the DOD rewrite.

haxscramper added a commit to haxscramper/nimskull that referenced this issue Sep 2, 2022
The main change - introduce a `ParsedNode` type which replaces `PNode`
in the parser. This change allows for further work on decoupling `sem`
from other parts of the compiler, making it easier to implement
improvements in a way that would not rip through the whole codebase and
test suite. Right now introduced type closely mimics the `PNode`
counterpart, but this is just a temporary measure for the transition
period. This commit is a part of multi-step series - full list can be
seen in the related issue nim-works#423

- Add missing documentation for changes in the earlier commit, add more
  how-tos to the debugging section (I haven't coded in a while, so was
  especially important to write down explanations for anything I had
  trouble with)
  nim-works@602367b
- Cleanup the `passes.nim` implementation a bit - despite common (at
  least seemingly shared by many of the previous author of the codebase)
  misconception longer variable names actually *do* increase
  readability. Also infamous recommendations for the "structured
  programming" also do not really mesh with proliferation of `break`
  statements in the code.
-
haxscramper added a commit to haxscramper/nimskull that referenced this issue Sep 2, 2022
The main change - introduce a `ParsedNode` type which replaces `PNode`
in the parser. This change allows for further work on decoupling `sem`
from other parts of the compiler, making it easier to implement
improvements in a way that would not rip through the whole codebase and
test suite. Right now introduced type closely mimics the `PNode`
counterpart, but this is just a temporary measure for the transition
period. This commit is a part of multi-step series - full list can be
seen in the related issue nim-works#423

- Add missing documentation for changes in the earlier commit, add more
  how-tos to the debugging section (I haven't coded in a while, so was
  especially important to write down explanations for anything I had
  trouble with)
  nim-works@602367b
- Cleanup the `passes.nim` implementation a bit - despite common (at
  least seemingly shared by many of the previous authors of the
  codebase) misconception longer variable names actually *do* increase
  readability. Also infamous recommendations for the "structured
  programming" also do not really mesh with proliferation of `break`
  statements in the code.
-
haxscramper added a commit to haxscramper/nimskull that referenced this issue Sep 3, 2022
The main change - introduce a `ParsedNode` type which replaces `PNode`
in the parser. This change allows for further work on decoupling `sem`
from other parts of the compiler, making it easier to implement
improvements in a way that would not rip through the whole codebase and
test suite. Right now introduced type closely mimics the `PNode`
counterpart, but this is just a temporary measure for the transition
period. This commit is a part of multi-step series - full list can be
seen in the related issue nim-works#423

- Add missing documentation for changes in the earlier commit, add more
  how-tos to the debugging section (I haven't coded in a while, so was
  especially important to write down explanations for anything I had
  trouble with)
  nim-works@602367b
- Cleanup the `passes.nim` implementation a bit - despite common (at
  least seemingly shared by many of the previous authors of the
  codebase) misconception longer variable names actually *do* increase
  readability. Also infamous recommendations for the "structured
  programming" also do not really mesh with proliferation of `break`
  statements in the code.

  Add todo/bug comment for the main processing loop bug related to the
  phase ordering in `compiler/sem/passes.nim:234`
haxscramper added a commit to haxscramper/nimskull that referenced this issue Sep 3, 2022
The main change - introduce a `ParsedNode` type which replaces `PNode`
in the parser. This change allows for further work on decoupling `sem`
from other parts of the compiler, making it easier to implement
improvements in a way that would not rip through the whole codebase and
test suite. Right now introduced type closely mimics the `PNode`
counterpart, but this is just a temporary measure for the transition
period. This commit is a part of multi-step series - full list can be
seen in the related issue nim-works#423

- Add missing documentation for changes in the earlier commit, add more
  how-tos to the debugging section (I haven't coded in a while, so was
  especially important to write down explanations for anything I had
  trouble with)
  nim-works@602367b
- Cleanup the `passes.nim` implementation a bit - despite common (at
  least seemingly shared by many of the previous authors of the
  codebase) misconception longer variable names actually *do* increase
  readability. Also infamous recommendations for the "structured
  programming" also do not really mesh with proliferation of `break`
  statements in the code.

  Add todo/bug comment for the main processing loop bug related to the
  phase ordering in `compiler/sem/passes.nim:234`
- Implement `astrepr.nim` support for the `ParsedNode` - `debug` and
  `treeRepr` procedures.

Further work

- compiler/ast/parser.nim:744 :: introduce two tokens in order to handle
  custom literals. There is no real need to mash together everything in
  a single chunk of text that would have to be split apart down the
  line.
haxscramper added a commit to haxscramper/nimskull that referenced this issue Sep 3, 2022
The main change - introduce a `ParsedNode` type which replaces `PNode`
in the parser. This change allows for further work on decoupling `sem`
from other parts of the compiler, making it easier to implement
improvements in a way that would not rip through the whole codebase and
test suite. Right now introduced type closely mimics the `PNode`
counterpart, but this is just a temporary measure for the transition
period. This commit is a part of multi-step series - full list can be
seen in the related issue nim-works#423

- Add missing documentation for changes in the earlier commit, add more
  how-tos to the debugging section (I haven't coded in a while, so was
  especially important to write down explanations for anything I had
  trouble with)
  nim-works@602367b
- Cleanup the `passes.nim` implementation a bit - despite common (at
  least seemingly shared by many of the previous authors of the
  codebase) misconception longer variable names actually *do* increase
  readability. Also infamous recommendations for the "structured
  programming" also do not really mesh with proliferation of `break`
  statements in the code.

  Add todo/bug comment for the main processing loop bug related to the
  phase ordering in `compiler/sem/passes.nim:234`
- Implement `astrepr.nim` support for the `ParsedNode` - `debug` and
  `treeRepr` procedures.

Further work

- compiler/ast/parser.nim:744 :: introduce two tokens in order to handle
  custom literals. There is no real need to mash together everything in
  a single chunk of text that would have to be split apart down the
  line.
haxscramper added a commit to haxscramper/nimskull that referenced this issue Sep 3, 2022
The main change - introduce a `ParsedNode` type which replaces `PNode`
in the parser. This change allows for further work on decoupling `sem`
from other parts of the compiler, making it easier to implement
improvements in a way that would not rip through the whole codebase and
test suite. Right now introduced type closely mimics the `PNode`
counterpart, but this is just a temporary measure for the transition
period. This commit is a part of multi-step series - full list can be
seen in the related issue nim-works#423

- Add missing documentation for changes in the earlier commit, add more
  how-tos to the debugging section (I haven't coded in a while, so was
  especially important to write down explanations for anything I had
  trouble with)
  nim-works@602367b
- Cleanup the `passes.nim` implementation a bit - despite common (at
  least seemingly shared by many of the previous authors of the
  codebase) misconception longer variable names actually *do* increase
  readability. Also infamous recommendations for the "structured
  programming" also do not really mesh with proliferation of `break`
  statements in the code.

  Add todo/bug comment for the main processing loop bug related to the
  phase ordering in `compiler/sem/passes.nim:234`
- Implement `astrepr.nim` support for the `ParsedNode` - `debug` and
  `treeRepr` procedures.
- Allow skipping repeated symbol in the `(open|closed)SymChoice` node
  kinds.

Further work

- compiler/ast/parser.nim:744 :: introduce two tokens in order to handle
  custom literals. There is no real need to mash together everything in
  a single chunk of text that would have to be split apart down the
  line.
haxscramper added a commit to haxscramper/nimskull that referenced this issue Sep 3, 2022
The main change - introduce a `ParsedNode` type which replaces `PNode`
in the parser. This change allows for further work on decoupling `sem`
from other parts of the compiler, making it easier to implement
improvements in a way that would not rip through the whole codebase and
test suite. Right now introduced type closely mimics the `PNode`
counterpart, but this is just a temporary measure for the transition
period. This commit is a part of multi-step series - full list can be
seen in the related issue nim-works#423

- Add missing documentation for changes in the earlier commit, add more
  how-tos to the debugging section (I haven't coded in a while, so was
  especially important to write down explanations for anything I had
  trouble with)
  nim-works@602367b
- Cleanup the `passes.nim` implementation a bit - despite common (at
  least seemingly shared by many of the previous authors of the
  codebase) misconception longer variable names actually *do* increase
  readability. Also infamous recommendations for the "structured
  programming" also do not really mesh with proliferation of `break`
  statements in the code.

  Add todo/bug comment for the main processing loop bug related to the
  phase ordering in `compiler/sem/passes.nim:234`
- Implement `astrepr.nim` support for the `ParsedNode` - `debug` and
  `treeRepr` procedures.
- Allow skipping repeated symbol in the `(open|closed)SymChoice` node
  kinds.
- compiler/front/options.nim:693 :: Unconditionally output debugging
  traces if they are requested, regardless of the surrounding hooks and
  filters. Introduce the `bypassWriteHookForTrace` flag in the debugging
  hack controller which makes it possible to bypass the `writeln` hook.

Further work

- compiler/ast/parser.nim:744 :: introduce two tokens in order to handle
  custom literals. There is no real need to mash together everything in
  a single chunk of text that would have to be split apart down the
  line.
haxscramper added a commit to haxscramper/nimskull that referenced this issue Sep 3, 2022
The main change - introduce a `ParsedNode` type which replaces `PNode`
in the parser. This change allows for further work on decoupling `sem`
from other parts of the compiler, making it easier to implement
improvements in a way that would not rip through the whole codebase and
test suite. Right now introduced type closely mimics the `PNode`
counterpart, but this is just a temporary measure for the transition
period. This commit is a part of multi-step series - full list can be
seen in the related issue nim-works#423

- Add missing documentation for changes in the earlier commit, add more
  how-tos to the debugging section (I haven't coded in a while, so was
  especially important to write down explanations for anything I had
  trouble with)
  nim-works@602367b
- Cleanup the `passes.nim` implementation a bit - despite common (at
  least seemingly shared by many of the previous authors of the
  codebase) misconception longer variable names actually *do* increase
  readability. Also infamous recommendations for the "structured
  programming" also do not really mesh with proliferation of `break`
  statements in the code.

  Add todo/bug comment for the main processing loop bug related to the
  phase ordering in `compiler/sem/passes.nim:234`
- Implement `astrepr.nim` support for the `ParsedNode` - `debug` and
  `treeRepr` procedures.
- Allow skipping repeated symbol in the `(open|closed)SymChoice` node
  kinds in the `astrepr`
- Restructure imports of the `astepr` and move it closer to the
  'primitive' modules - type definitions and trivial data queries. The
  most important change is removal of the `ast.nim` and `renderer.nim`
  imports, which opens these modules for debugging as well.
- Consider possibility of a nil `owner` in the symbol owner chain
  representation calculations in `astrepr`
- compiler/front/options.nim:693 :: Unconditionally output debugging
  traces if they are requested, regardless of the surrounding hooks and
  filters. Introduce the `bypassWriteHookForTrace` flag in the debugging
  hack controller which makes it possible to bypass the `writeln` hook.

Further work

- compiler/ast/parser.nim:744 :: introduce two tokens in order to handle
  custom literals. There is no real need to mash together everything in
  a single chunk of text that would have to be split apart down the
  line.
haxscramper added a commit to haxscramper/nimskull that referenced this issue Sep 3, 2022
The main change - introduce a `ParsedNode` type which replaces `PNode`
in the parser. This change allows for further work on decoupling `sem`
from other parts of the compiler, making it easier to implement
improvements in a way that would not rip through the whole codebase and
test suite. Right now introduced type closely mimics the `PNode`
counterpart, but this is just a temporary measure for the transition
period. This commit is a part of multi-step series - full list can be
seen in the related issue nim-works#423

- Add missing documentation for changes in the earlier commit, add more
  how-tos to the debugging section (I haven't coded in a while, so was
  especially important to write down explanations for anything I had
  trouble with)
  nim-works@602367b
- Cleanup the `passes.nim` implementation a bit - despite common (at
  least seemingly shared by many of the previous authors of the
  codebase) misconception longer variable names actually *do* increase
  readability. Also infamous recommendations for the "structured
  programming" also do not really mesh with proliferation of `break`
  statements in the code.

  Add todo/bug comment for the main processing loop bug related to the
  phase ordering in `compiler/sem/passes.nim:234`
- Implement `astrepr.nim` support for the `ParsedNode` - `debug` and
  `treeRepr` procedures.
- Allow skipping repeated symbol in the `(open|closed)SymChoice` node
  kinds in the `astrepr`
- Restructure imports of the `astepr` and move it closer to the
  'primitive' modules - type definitions and trivial data queries. The
  most important change is removal of the `ast.nim` and `renderer.nim`
  imports, which opens these modules for debugging as well.
- Consider possibility of a nil `owner` in the symbol owner chain
  representation calculations in `astrepr`
- Semantic tracer debug output file rotation now uses location of the
  first `.define(` call as a file name base instead of integer-based
  ones. Added basic logging information about created files - now a
  developer can see what is going on and what gets written.

  For example, running with `--define=nimCompilerDebugTraceDir=/tmp` and
  seveal `define(...)` sections produces the following output:

  ```
  comparisons.nim(269, 8): opening /tmp/comparisons_nim_0 trace
  comparisons.nim(274, 7): closing trace, wrote 44 records
  comparisons.nim(276, 8): opening /tmp/comparisons_nim_1 trace
  comparisons.nim(285, 7): closing trace, wrote 329 records
  ```

- compiler/front/options.nim:693 :: Unconditionally output debugging
  traces if they are requested, regardless of the surrounding hooks and
  filters. Introduce the `bypassWriteHookForTrace` flag in the debugging
  hack controller which makes it possible to bypass the `writeln` hook.

Further work

- compiler/ast/parser.nim:744 :: introduce two tokens in order to handle
  custom literals. There is no real need to mash together everything in
  a single chunk of text that would have to be split apart down the
  line.
haxscramper added a commit to haxscramper/nimskull that referenced this issue Sep 3, 2022
The main change - introduce a `ParsedNode` type which replaces `PNode`
in the parser. This change allows for further work on decoupling `sem`
from other parts of the compiler, making it easier to implement
improvements in a way that would not rip through the whole codebase and
test suite. Right now introduced type closely mimics the `PNode`
counterpart, but this is just a temporary measure for the transition
period. This commit is a part of multi-step series - full list can be
seen in the related issue nim-works#423

* Documentation changes

- Add missing documentation for changes in the earlier commit, add more
  how-tos to the debugging section (I haven't coded in a while, so was
  especially important to write down explanations for anything I had
  trouble with)
  nim-works@602367b

* Tangentially related refactoring work

- Cleanup the `passes.nim` implementation a bit - despite common (at
  least seemingly shared by many of the previous authors of the
  codebase) misconception longer variable names actually *do* increase
  readability. Also infamous recommendations for the "structured
  programming" also do not really mesh with proliferation of `break`
  statements in the code.

  Add todo/bug comment for the main processing loop bug related to the
  phase ordering in `compiler/sem/passes.nim:234`

* Debugging tools improvements

- Implement `astrepr.nim` support for the `ParsedNode` and `PIdent` -
  `debug` and `treeRepr` procedures.
- Allow skipping repeated symbol in the `(open|closed)SymChoice` node
  kinds in the `astrepr`
- Restructure imports of the `astepr` and move it closer to the
  'primitive' modules - type definitions and trivial data queries. The
  most important change is removal of the `ast.nim` and `renderer.nim`
  imports, which opens these modules for debugging as well.
- Consider possibility of a nil `owner` in the symbol owner chain
  representation calculations in `astrepr`
- Semantic tracer debug output file rotation now uses location of the
  first `.define(` call as a file name base instead of integer-based
  ones. Added basic logging information about created files - now a
  developer can see what is going on and what gets written.

  For example, running with `--define=nimCompilerDebugTraceDir=/tmp` and
  seveal `define(...)` sections produces the following output:

  ```
  comparisons.nim(269, 8): opening /tmp/comparisons_nim_0 trace
  comparisons.nim(274, 7): closing trace, wrote 44 records
  comparisons.nim(276, 8): opening /tmp/comparisons_nim_1 trace
  comparisons.nim(285, 7): closing trace, wrote 329 records
  ```
- Simplify implementation of the `reportInst` handling in the debug
  utils tracer - now each toplevel tracer template must submit the
  location by itself - this solution avoids unintuitive and fragile
  `instLoc(-5)` call which might break with more templates introduced.
  Also updated documentation on the `reportInst` and `reportFrom` in the
  reports file.

- compiler/front/options.nim:693 :: Unconditionally output debugging
  traces if they are requested, regardless of the surrounding hooks and
  filters. Introduce the `bypassWriteHookForTrace` flag in the debugging
  hack controller which makes it possible to bypass the `writeln` hook.

* Further work

- compiler/ast/parser.nim:744 :: introduce two tokens in order to handle
  custom literals. There is no real need to mash together everything in
  a single chunk of text that would have to be split apart down the
  line.
haxscramper added a commit to haxscramper/nimskull that referenced this issue Sep 3, 2022
The main change - introduce a `ParsedNode` type which replaces `PNode`
in the parser. This change allows for further work on decoupling `sem`
from other parts of the compiler, making it easier to implement
improvements in a way that would not rip through the whole codebase and
test suite. Right now introduced type closely mimics the `PNode`
counterpart, but this is just a temporary measure for the transition
period. This commit is a part of multi-step series - full list can be
seen in the related issue nim-works#423

* Documentation changes

- Add missing documentation for changes in the earlier commit, add more
  how-tos to the debugging section (I haven't coded in a while, so was
  especially important to write down explanations for anything I had
  trouble with)
  nim-works@602367b

* Tangentially related refactoring work

- Cleanup the `passes.nim` implementation a bit - despite common (at
  least seemingly shared by many of the previous authors of the
  codebase) misconception longer variable names actually *do* increase
  readability. Also infamous recommendations for the "structured
  programming" also do not really mesh with proliferation of `break`
  statements in the code.

  Add todo/bug comment for the main processing loop bug related to the
  phase ordering in `compiler/sem/passes.nim:234`

* Debugging tools improvements

- Implement `astrepr.nim` support for the `ParsedNode` and `PIdent` -
  `debug` and `treeRepr` procedures.
- Allow skipping repeated symbol in the `(open|closed)SymChoice` node
  kinds in the `astrepr`
- Restructure imports of the `astepr` and move it closer to the
  'primitive' modules - type definitions and trivial data queries. The
  most important change is removal of the `ast.nim` and `renderer.nim`
  imports, which opens these modules for debugging as well.
- Consider possibility of a nil `owner` in the symbol owner chain
  representation calculations in `astrepr`
- Semantic tracer debug output file rotation now uses location of the
  first `.define(` call as a file name base instead of integer-based
  ones. Added basic logging information about created files - now a
  developer can see what is going on and what gets written.

  For example, running with `--define=nimCompilerDebugTraceDir=/tmp` and
  seveal `define(...)` sections produces the following output:

  ```
  comparisons.nim(269, 8): opening /tmp/comparisons_nim_0 trace
  comparisons.nim(274, 7): closing trace, wrote 44 records
  comparisons.nim(276, 8): opening /tmp/comparisons_nim_1 trace
  comparisons.nim(285, 7): closing trace, wrote 329 records
  ```
- Simplify implementation of the `reportInst` handling in the debug
  utils tracer - now each toplevel tracer template must submit the
  location by itself - this solution avoids unintuitive and fragile
  `instLoc(-5)` call which might break with more templates introduced.
  Also updated documentation on the `reportInst` and `reportFrom` in the
  reports file.

- compiler/front/options.nim:693 :: Unconditionally output debugging
  traces if they are requested, regardless of the surrounding hooks and
  filters. Introduce the `bypassWriteHookForTrace` flag in the debugging
  hack controller which makes it possible to bypass the `writeln` hook.

* Further work

- compiler/ast/parser.nim:744 :: introduce two tokens in order to handle
  custom literals. There is no real need to mash together everything in
  a single chunk of text that would have to be split apart down the
  line.
haxscramper added a commit to haxscramper/nimskull that referenced this issue Sep 3, 2022
The main change - introduce a `ParsedNode` type which replaces `PNode`
in the parser. This change allows for further work on decoupling `sem`
from other parts of the compiler, making it easier to implement
improvements in a way that would not rip through the whole codebase and
test suite. Right now introduced type closely mimics the `PNode`
counterpart, but this is just a temporary measure for the transition
period. This commit is a part of multi-step series - full list can be
seen in the related issue nim-works#423

* Documentation changes

- Add missing documentation for changes in the earlier commit, add more
  how-tos to the debugging section (I haven't coded in a while, so was
  especially important to write down explanations for anything I had
  trouble with)
  nim-works@602367b

* Tangentially related refactoring work

- Cleanup the `passes.nim` implementation a bit - despite common (at
  least seemingly shared by many of the previous authors of the
  codebase) misconception longer variable names actually *do* increase
  readability. Also infamous recommendations for the "structured
  programming" also do not really mesh with proliferation of `break`
  statements in the code.

  Add todo/bug comment for the main processing loop bug related to the
  phase ordering in `compiler/sem/passes.nim:234`

* Debugging tools improvements

- Implement `astrepr.nim` support for the `ParsedNode` and `PIdent` -
  `debug` and `treeRepr` procedures.
- Allow skipping repeated symbol in the `(open|closed)SymChoice` node
  kinds in the `astrepr`
- Restructure imports of the `astepr` and move it closer to the
  'primitive' modules - type definitions and trivial data queries. The
  most important change is removal of the `ast.nim` and `renderer.nim`
  imports, which opens these modules for debugging as well.
- Consider possibility of a nil `owner` in the symbol owner chain
  representation calculations in `astrepr`
- Semantic tracer debug output file rotation now uses location of the
  first `.define(` call as a file name base instead of integer-based
  ones. Added basic logging information about created files - now a
  developer can see what is going on and what gets written.

  For example, running with `--define=nimCompilerDebugTraceDir=/tmp` and
  seveal `define(...)` sections produces the following output:

  ```
  comparisons.nim(269, 8): opening /tmp/comparisons_nim_0 trace
  comparisons.nim(274, 7): closing trace, wrote 44 records
  comparisons.nim(276, 8): opening /tmp/comparisons_nim_1 trace
  comparisons.nim(285, 7): closing trace, wrote 329 records
  ```
- Simplify implementation of the `reportInst` handling in the debug
  utils tracer - now each toplevel tracer template must submit the
  location by itself - this solution avoids unintuitive and fragile
  `instLoc(-5)` call which might break with more templates introduced.
  Also updated documentation on the `reportInst` and `reportFrom` in the
  reports file.

- compiler/front/options.nim:693 :: Unconditionally output debugging
  traces if they are requested, regardless of the surrounding hooks and
  filters. Introduce the `bypassWriteHookForTrace` flag in the debugging
  hack controller which makes it possible to bypass the `writeln` hook.

* Further work

- compiler/ast/parser.nim:744 :: introduce two tokens in order to handle
  custom literals. There is no real need to mash together everything in
  a single chunk of text that would have to be split apart down the
  line.
bors bot added a commit that referenced this issue Sep 4, 2022
425: decouple parser from PNode r=saem a=haxscramper

wip implementation of the #423

Co-authored-by: haxscramper <[email protected]>
@haxscramper
Copy link
Collaborator

haxscramper commented Sep 4, 2022

todo items from the PR

  • Add node kind for different literal bases - binary, decimal, octal, hexadecimal. This will be a x4 increase in the number of parsed literal nodes, but will normalize the data and allow ParsedNode to be a simple [tag, left, right] tuple of enum/integer
  • proc transitionSonsKind*(n: ParsedNode, kind: TNodeKind) = should check for transition validity, ideally parser should not have the need for midway transitions - adding a couple more nodes that describe different states and doing translation to PNode later is a preferred way.
  • https://github.com/nim-works/nimskull/blob/devel/compiler/ast/ast_query.nim#L114
  • isBlockArg*: bool new node kind for this distinction
  • Overhaul and document different enums that are defined in the ast_query.nim file

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor Implementation refactor
Development

No branches or pull requests

2 participants