DS (Data Structure) AST (Abstract Syntax Tree)
Tip
This knowledge is useful whenever you need to process any kind of text.
-
TypeScript implementation of: Building a Parser from scratch
-
TypeScript implementation of: XML Parser - basic support to the XML language
A lexer transforms a sequence of characters into a sequence of tokens.
A compiler's lexer is a crucial component of the compilation process. It breaks the source code into small, meaningful units called tokens; the exact text matched for each token is its lexeme. The tokens are then fed to the parser, which constructs the abstract syntax tree (AST) of the program.
enum TokenType {
  TOKEN_TYPE = 'TOKEN_TYPE',
  // ...
}
type TTokenSpec = [RegExp, TokenType];
/**
 * The line and column values are useful for debugging and for
 * error messages that point to where the error occurs.
 *
 * The start and end values are useful for UI implementations: they
 * make it possible to select the given text by its start and end cursor positions.
 *
 * An example of using the start and end cursor positions to highlight text
 * can be seen at: https://astexplorer.net/
 */
interface TokenLocation {
  line: number;
  column: number;
  start: number;
  end: number;
}
interface Token {
  type: TokenType;
  lexeme: string;
  /**
   * This information is useful for debugging and for the UI implementation of a code editor
   */
  location?: TokenLocation;
}
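To make the role of these types concrete, here is a minimal sketch of a regex-driven tokenizer that fills in `TokenLocation` as it goes. The concrete token types, the `TOKEN_SPECS` table, and the `tokenize` function are assumptions made for illustration (the original enum is elided), so treat it as one possible shape rather than the implementation used by the linked projects.

```typescript
// Hypothetical token set for a tiny arithmetic language (not from the original notes).
enum TokenType {
  NUMBER = 'NUMBER',
  IDENTIFIER = 'IDENTIFIER',
  OPERATOR = 'OPERATOR',
  PAREN = 'PAREN',
}

type TTokenSpec = [RegExp, TokenType];

interface TokenLocation {
  line: number;
  column: number;
  start: number;
  end: number;
}

interface Token {
  type: TokenType;
  lexeme: string;
  location?: TokenLocation;
}

// Every pattern is anchored with ^ so it can only match at the current cursor.
const TOKEN_SPECS: TTokenSpec[] = [
  [/^\d+(\.\d+)?/, TokenType.NUMBER],
  [/^[A-Za-z_]\w*/, TokenType.IDENTIFIER],
  [/^[+\-*\/=]/, TokenType.OPERATOR],
  [/^[()]/, TokenType.PAREN],
];

function tokenize(source: string): Token[] {
  const tokens: Token[] = [];
  let cursor = 0; // absolute offset into the source
  let line = 1; // 1-based line number
  let column = 1; // 1-based column number

  // Moves the cursor past `text`, keeping line and column in sync.
  const advance = (text: string): void => {
    for (let i = 0; i < text.length; i++) {
      if (text[i] === '\n') {
        line += 1;
        column = 1;
      } else {
        column += 1;
      }
    }
    cursor += text.length;
  };

  while (cursor < source.length) {
    const rest = source.slice(cursor);

    // Whitespace separates tokens but produces none.
    const whitespace = /^\s+/.exec(rest);
    if (whitespace) {
      advance(whitespace[0]);
      continue;
    }

    let matched = false;
    for (const [pattern, type] of TOKEN_SPECS) {
      const match = pattern.exec(rest);
      if (!match) continue;
      const lexeme = match[0];
      tokens.push({
        type,
        lexeme,
        location: { line, column, start: cursor, end: cursor + lexeme.length },
      });
      advance(lexeme);
      matched = true;
      break;
    }

    if (!matched) {
      throw new SyntaxError(`Unexpected character "${rest[0]}" at ${line}:${column}`);
    }
  }
  return tokens;
}
```

Calling `tokenize('x = 42')` yields three tokens (IDENTIFIER, OPERATOR, NUMBER); the number token carries `start: 4` and `end: 6`, which is the kind of range an editor could use to highlight it.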
-
Tokenization: The lexer converts the source code into tokens, which are the smallest syntactic units of the language. These tokens can be identifiers, keywords, literals, operators, or other special characters.
-
Regular Expressions: The lexer uses regular expressions to define the patterns for identifying these tokens. This approach allows for efficient and flexible token recognition.
-
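State Transition Table: The lexer can be implemented as a finite-state machine whose moves are stored in a state transition table, where each pair of current state and input character selects the next state. Lexer generators typically emit either this table-driven form or a directly coded form that jumps to follow-up states via goto statements; the directly coded form can be faster than the table-driven one, and generated lexers can be faster than most hand-coded ones.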
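The table-driven idea from the last point can be sketched as a small finite-state machine. Everything named here (`CharClass`, `State`, `TRANSITIONS`, `ACCEPTING`, `scan`) is hypothetical and covers only integers, decimals, and identifiers; a generated lexer would build a much larger table from the regular expressions.

```typescript
// Hypothetical character classes and states for a tiny scanner.
enum CharClass { Digit, Letter, Dot, Other }
enum State { Start, Int, IntDot, Frac, Ident }
const DEAD = -1;

type TokenKind = 'NUMBER' | 'IDENTIFIER';

// TRANSITIONS[state][charClass] is the next state, or DEAD when no move exists.
const TRANSITIONS: number[][] = [
  /* Start  */ [State.Int, State.Ident, DEAD, DEAD],
  /* Int    */ [State.Int, DEAD, State.IntDot, DEAD],
  /* IntDot */ [State.Frac, DEAD, DEAD, DEAD],
  /* Frac   */ [State.Frac, DEAD, DEAD, DEAD],
  /* Ident  */ [State.Ident, State.Ident, DEAD, DEAD],
];

// Indexed by state: the token kind if the state is accepting, otherwise null.
const ACCEPTING: (TokenKind | null)[] = [null, 'NUMBER', null, 'NUMBER', 'IDENTIFIER'];

function classify(ch: string): CharClass {
  if (ch >= '0' && ch <= '9') return CharClass.Digit;
  if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || ch === '_') return CharClass.Letter;
  if (ch === '.') return CharClass.Dot;
  return CharClass.Other;
}

interface ScannedToken {
  kind: TokenKind;
  lexeme: string;
}

function scan(source: string): ScannedToken[] {
  const tokens: ScannedToken[] = [];
  let pos = 0;
  while (pos < source.length) {
    if (/\s/.test(source[pos])) {
      pos += 1; // whitespace is skipped outside the state machine
      continue;
    }
    let state: number = State.Start;
    let i = pos;
    let lastAcceptEnd = -1;
    let lastKind: TokenKind | null = null;
    // Follow the table as far as possible, remembering the last accepting
    // state so the longest valid token wins.
    while (i < source.length) {
      const next = TRANSITIONS[state][classify(source[i])];
      if (next === DEAD) break;
      state = next;
      i += 1;
      const kind = ACCEPTING[state];
      if (kind !== null) {
        lastAcceptEnd = i;
        lastKind = kind;
      }
    }
    if (lastKind === null) {
      throw new SyntaxError(`Unexpected character "${source[pos]}" at offset ${pos}`);
    }
    tokens.push({ kind: lastKind, lexeme: source.slice(pos, lastAcceptEnd) });
    pos = lastAcceptEnd;
  }
  return tokens;
}
```

With this table, `scan('x1 3.14')` returns an IDENTIFIER `x1` followed by a NUMBER `3.14`. An input such as `12.` raises a `SyntaxError`: the scanner falls back to the last accepting state (`12`) and then finds no transition for the dangling dot.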
A parser is a software component that takes input data (typically text) and builds a data structure, often a parse tree or abstract syntax tree (AST), giving a structural representation of the input while checking for correct syntax. It is a crucial part of the compilation process, particularly in compiler design.
- Recursive Descent Parser | GeeksforGeeks (2023/06/09)
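As an illustration of the recursive descent technique referenced above, the following sketch parses arithmetic expressions into a small tree. The grammar, the `Expr` node shape, and the `Parser` class are assumptions chosen for brevity, and the built-in lexer is deliberately naive (it ignores characters it does not recognize), so this is a starting point rather than a complete parser.

```typescript
// Assumed grammar (one method per rule):
//   Expression -> Term (('+' | '-') Term)*
//   Term       -> Factor (('*' | '/') Factor)*
//   Factor     -> NUMBER | '(' Expression ')'
type Expr =
  | { type: 'NumericLiteral'; value: number }
  | { type: 'BinaryExpression'; operator: string; left: Expr; right: Expr };

class Parser {
  private tokens: string[];
  private pos = 0;

  constructor(source: string) {
    // Naive lexer: numbers, the four operators, and parentheses only.
    this.tokens = source.match(/\d+(\.\d+)?|[-+*\/()]/g) || [];
  }

  parse(): Expr {
    const expr = this.expression();
    if (this.pos < this.tokens.length) {
      throw new SyntaxError(`Unexpected token "${this.tokens[this.pos]}"`);
    }
    return expr;
  }

  private peek(): string | undefined {
    return this.tokens[this.pos];
  }

  private expression(): Expr {
    let left = this.term();
    // The loop makes + and - left-associative.
    while (this.peek() === '+' || this.peek() === '-') {
      const operator = this.tokens[this.pos++];
      const right = this.term();
      left = { type: 'BinaryExpression', operator, left, right };
    }
    return left;
  }

  private term(): Expr {
    let left = this.factor();
    // * and / bind tighter because Term sits below Expression.
    while (this.peek() === '*' || this.peek() === '/') {
      const operator = this.tokens[this.pos++];
      const right = this.factor();
      left = { type: 'BinaryExpression', operator, left, right };
    }
    return left;
  }

  private factor(): Expr {
    const token = this.tokens[this.pos++];
    if (token === undefined) throw new SyntaxError('Unexpected end of input');
    if (token === '(') {
      const inner = this.expression();
      if (this.tokens[this.pos++] !== ')') throw new SyntaxError('Expected ")"');
      return inner; // parentheses leave no node behind, only tree shape
    }
    if (/^\d/.test(token)) return { type: 'NumericLiteral', value: Number(token) };
    throw new SyntaxError(`Unexpected token "${token}"`);
  }
}
```

`new Parser('1 + 2 * 3').parse()` returns a `BinaryExpression` with operator `+` whose right child is the multiplication, which shows how operator precedence falls out of the rule nesting.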
An Abstract Syntax Tree (AST) is a data structure used in computer science to represent the structure of a program or code snippet. It is a tree representation of the source code that abstracts away syntactic details such as punctuation, delimiters, and grouping parentheses, while keeping the structure that carries meaning. The AST is designed to preserve essential information such as variable types, the location of each declaration, the order of executable statements, the left and right components of binary operations, and identifiers and their assigned values.
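A short sketch may help show what "tree-like representation" means in practice. The node names (`NumericLiteral`, `Identifier`, `BinaryExpression`) and the `evaluate` function below are assumptions for illustration and are not tied to any particular parser; the tree is written by hand for the source text `(1 + 2) * x`.

```typescript
interface NumericLiteral {
  type: 'NumericLiteral';
  value: number;
}

interface Identifier {
  type: 'Identifier';
  name: string;
}

interface BinaryExpression {
  type: 'BinaryExpression';
  operator: '+' | '-' | '*' | '/';
  left: Expression;
  right: Expression;
}

type Expression = NumericLiteral | Identifier | BinaryExpression;

// Hand-built AST for `(1 + 2) * x`. The parentheses do not appear as nodes:
// the grouping is encoded by the shape of the tree.
const ast: Expression = {
  type: 'BinaryExpression',
  operator: '*',
  left: {
    type: 'BinaryExpression',
    operator: '+',
    left: { type: 'NumericLiteral', value: 1 },
    right: { type: 'NumericLiteral', value: 2 },
  },
  right: { type: 'Identifier', name: 'x' },
};

// Walks the tree bottom-up; `env` supplies values for identifiers.
function evaluate(node: Expression, env: Record<string, number>): number {
  if (node.type === 'NumericLiteral') return node.value;
  if (node.type === 'Identifier') {
    const value = env[node.name];
    if (value === undefined) throw new ReferenceError(`${node.name} is not defined`);
    return value;
  }
  const left = evaluate(node.left, env);
  const right = evaluate(node.right, env);
  switch (node.operator) {
    case '+': return left + right;
    case '-': return left - right;
    case '*': return left * right;
    default: return left / right;
  }
}
```

With `x` bound to 3, `evaluate(ast, { x: 3 })` returns 9. The same traversal pattern underlies interpreters, linters, and code transformers: each tool visits the nodes and does something different at each one.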
-
[YouTube Playlist] Compiler Design - Quick Concepts | Neso Academy
-
[YouTube Playlist] Compiler Design - Chapter 1 - Introduction to Compiler Design | Neso Academy
-
[YouTube Playlist] Compiler Design - Chapter 2 - Syntax Analysis | Neso Academy
-
[YouTube Playlist] Compiler Design - Chapter 3 - Top-Down Parsers | Neso Academy
-
A Guide To Parsing: Algorithms And Terminology | Gabriele Tomassetti (2023/07/26)
-
Compilers Series' Articles | by Paul Lefebvre - DEV Community
-
Compilers 101 - Overview and Lexer (2018/01/19)
-
Compilers 102 - Parser (2018/01/22)
-
Extended Backus–Naur Form diagram | PlantUML.com - EBNF is a code that expresses the syntax of a formal language. An EBNF consists of terminal symbols and non-terminal production rules.
-
ebnf-convert - Grammar Converter
-
[GitHub] matthijsgroen/ebnf2railroad - 📔 Create beautiful documentation for EBNF
-
[GitHub] kaigouthro/ebnf_live_graphviz - [python] streamlit w3c ebnf visualzer, json output, markdown visualizer, and live graphviz hierarchy
-
EBNF (Syntax diagrams / Railroad diagrams / Grammar diagrams) #4252 | mermaid-js / mermaid - GitHub
-
Railroad, Syntax diagrams, EBNF | Wiki at mermaid-js / mermaid - GitHub
-
What is a Lexer ? known also as Tokenizer or Scanner - Lexical Analysis | DataCadamia
-
Lexical Analysis - (Token|Lexical unit|Lexeme|Symbol|Word) | DataCadamia
-
Parser / Compiler - (Abstract) Syntax Tree (AST) | DataCadamia
-
Abstract Syntax Tree (AST) - Explained in Plain English | DEV Community (2024/06/11) - As a developer, the source code that you write is all so concise and elegant.
-
[GitHub] cowchimp/awesome-ast - A curated list of awesome AST resources
-
BNF Notation: Dive Deeper Into Python's Grammar | Real Python
-
[YouTube] LLVM in 100 Seconds | Fireship (2022/05/23)
-
Writing Your Own Lexer With Simple Steps | Serhii Chornenkyi (2023/11/24)
-
A simple recursive descent parser | DEV Community (2023/10/09)
-
- [GitHub] tlaceby/guide-to-interpreters-series - Contains source-code for viewers following along with my Beginners Guide To Building Interpreters series on my Youtube Channel.
-
Let's Build A Simple Interpreter | Ruslan's Blog
-
Part 7: Abstract Syntax Trees | Ruslan's Blog (2015/12/15) - python and rust implementations
-
Part 13: Semantic Analysis | Ruslan's Blog (2017/04/27)
-
[YouTube Playlist] Building a Compiler in JS | benwatkins10xd
- [GitHub] benwatkins10xd/js-compile - Compiler in vanilla javascript from scratch
-
[YouTube] abstract syntax tree's are gonna be IMPORTANT in 2024 | Chris Hay (2023/12/28)
-
-
- [GitHub] antlr/antlr4-lab - A client/server for trying out and learning about ANTLR
-
ANTLR4 grammar syntax support | Visual Studio Marketplace
- [GitHub] mike-lischke/vscode-antlr4 - ANTLR4 language support for Visual Studio Code
-
-
[GitHub] antlr/antlr4 - ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
-
[GitHub] antlr/grammars-v4 - Grammars written for ANTLR v4; expectation that the grammars are free of actions.