joshuacrotts/littlec-compiler

LittleC Compiler

LittleC is a small, C-like language written for my compilers course. Some segments of the code, including the interpreter (yes, a freaking interpreter!), were written by Dr. Steve Tate.

The source code is first split into lexical tokens and parsed into an abstract syntax tree. The tree is then converted to intermediate code in the form of a modified three-address-code schema. Finally, we compile to MIPS, targeting the QtSpim "standard", since the MARS simulator and QtSpim each support a few things the other does not, such as certain pseudo-operations.
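
As a rough illustration of the intermediate-code stage, the sketch below lowers a tiny expression tree to generic three-address code in plain C. It is not taken from this project: the Node and Emitter types, the lower function, and the t0, t1, ... temporary names are all hypothetical, and LittleC's actual schema is a modified variant that may look different.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical AST node (not LittleC's real data structure):
   a leaf holds an operand, an inner node holds a binary operator. */
typedef struct Node {
    char op;                  /* 0 for a leaf, otherwise '+', '-', '*', '/' */
    const char *leaf;         /* operand text when op == 0 */
    const struct Node *lhs;
    const struct Node *rhs;
} Node;

/* Collects the emitted instructions, one per line. */
typedef struct {
    char text[512];
    size_t len;
    int next_temp;            /* counter for fresh temporaries t0, t1, ... */
} Emitter;

/* Lowers the subtree rooted at n. Appends one instruction per operator to e
   and writes the name that holds the subtree's value into out. */
void lower(const Node *n, Emitter *e, char *out, size_t out_size) {
    char a[32];
    char b[32];
    size_t room;
    int written;

    if (n->op == 0) {
        /* A leaf needs no instruction; its own text is the result. */
        snprintf(out, out_size, "%s", n->leaf);
        return;
    }

    /* Operands are lowered first, so their temporaries exist before use. */
    lower(n->lhs, e, a, sizeof a);
    lower(n->rhs, e, b, sizeof b);

    snprintf(out, out_size, "t%d", e->next_temp++);
    room = sizeof e->text - e->len;
    written = snprintf(e->text + e->len, room, "%s = %s %c %s\n", out, a, n->op, b);
    if (written > 0 && (size_t)written < room) {
        e->len += (size_t)written;
    }
}
```

For the tree representing a + b * 2, lower emits t0 = b * 2 followed by t1 = a + t0, and reports t1 as the name holding the result.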

Warning: The interpreter and compiler can produce different results for the same program. Most of the time, the results are identical. However, edge cases do exist, and not everything has been accounted for. From what I have tested, you can generally trust the results of the MIPS output, but the interpreter handles more cases (e.g., the quicksort implementation).

Running the Compiler

First, download the provided .JAR file. Run it as normal, and type littlec to view a list of possible commands. You may either type the code into a .lc file and provide it as an argument to one of the available subcommands, or you may feed in code via standard input.

Using the Compiler

There are a few differences between C and LittleC, which may be apparent as soon as you start to write a program and get stuck, asking yourself "who the hell made this?!" The following is a list of differences. Note that this does not include omissions, only things that are present in both languages to a degree but are modified in LittleC.

  1. The main function no longer accepts command-line/terminal arguments, and does not return a value. Thus, the main method's signature is void main().
  2. All variables must be declared above any expressions. So, if you want to declare variables later in the program, you need to either declare them outright at the start, or use a block, as you would in C with braces {}.
  3. for loops and other such statements (if, while) do not allow variable declarations unless they are inside a new block as described above (this means that, including the statement body, there must be a new block declared). This is similar to C89.
  4. The extern and static keywords are present, but serve no meaningful purpose, for now at least.
  5. Array sizes must be computed at compile time, meaning that all arrays must be declared with an integer literal size.
  6. When passing an array to a function, use the signature char/int[] s instead of char/int s[], meaning the brackets come before the variable name.
  7. Initializing a local array with a string literal is not directly supported, although the compiler technically allows it. To initialize a local string, use the built-in function strinit(char[] dest, char[] src) (coming soon!).
  8. A variable's initializer must be a literal value, not an expression (this rules out function calls and everything else that is not a literal).
  9. You cannot give a variable a negative value in its declaration, as a negated literal is an expression. To use a negative value, declare the variable first, then assign the negative value to it.
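
To make rules 2, 3, 8, and 9 concrete, here is a small function written in plain C but restricted to the style those rules describe. It is only a sketch: it has not been run through the LittleC compiler, the function name scaled_offset is made up for illustration, and it assumes LittleC accepts int-returning functions and C-style comments.

```c
/* Hypothetical example: valid C, written in the declaration-first
   style that the LittleC rules above call for. */
int scaled_offset(int count) {
    int total = 0;   /* rule 8: the initializer is a plain literal */
    int start = 0;   /* rule 9: a negative initializer would be an expression */
    int i;           /* rule 3: the counter is declared up front, not in the for header */

    start = -3;      /* rule 9: the negative value is assigned after the declaration */
    for (i = 0; i < count; i++) {
        total = total + start + i;
    }

    {                /* rule 2: a new block permits further declarations */
        int doubled = 0;
        doubled = total * 2;
        return doubled;
    }
}
```

The same body would be rejected under these rules if doubled were declared directly after the loop, or if start were declared as int start = -3.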

Features

Most standard programming language concepts are present. These include:

  • Functions (including prototypes)
  • Integer and character variables (with 0x and 0b literal prefixes)
  • Arrays (Strings are char arrays as in C)
  • Recursion
  • Conditionals (Short-Circuiting)
  • Loops
  • Array "size of" (#) operator
  • Power (**) operator
  • Bitwise XOR, OR, AND, Negation, and Shifting

Planned Features

The following is a list of planned features:

  • Bit rotation, absolute value operator, logical implication and biconditional
  • Single & Double Floating-Point Precision Variables
  • Standard API
  • Terminal argument support
  • Support for compiling down to "MARS MIPS"
  • Random numbers (only for MARS)
  • ...?

LittleC Example

char gStr[100] = "Hi there!";
int gInt = 2;

void strcpy(char[] dst, char[] src) {
    int i;
    for (i=0; i<#src && i<#dst; i++) {
        dst[i] = src[i];
        if (dst[i] == '\0')
            break;
    }
}

void main() {
    char lArray[100];
    int i;
    int j;
    int k;

    j = 5;
    strcpy(lArray, gStr);
    prints(lArray);
    printc('\n');

    for (i = 0; i < j; i++) {
        k = ++k | (0b11 ** i) << (j - i + 1);
        k = i ^ j & 0xffff;
        printd(k);
        printc('\n');
    }

    j = 0xffffff >> 16 & 0xff;
    i = j | ((2 ** 4) ^ (i + j - (j * 24 % 2)));
    printd(j);
    prints("\n");
    printd(i);
}

The above code outputs:

Hi there!
5
4
7
6
1
255
511
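
The arithmetic behind that output can be cross-checked in plain C. This is a sketch under two assumptions that this README does not state outright: that LittleC's bitwise and shift operators group the way C's do, and that ** is integer exponentiation (modeled here by a hypothetical ipow helper). The other function names are also made up for illustration. The first assignment to k in the loop is overwritten by the second before anything is printed, so only i ^ j & 0xffff is modeled.

```c
#include <stdio.h>

/* Stand-in for LittleC's ** operator (assumed to be integer exponentiation). */
int ipow(int base, int exp) {
    int result = 1;
    while (exp-- > 0) {
        result *= base;
    }
    return result;
}

/* Value printed on each loop iteration. In C, & binds tighter than ^,
   so i ^ j & 0xffff groups as i ^ (j & 0xffff). */
int loop_value(int i, int j) {
    return i ^ (j & 0xffff);
}

/* j = 0xffffff >> 16 & 0xff: the shift is applied before the mask. */
int final_j(void) {
    return (0xffffff >> 16) & 0xff;
}

/* i = j | ((2 ** 4) ^ (i + j - (j * 24 % 2))), with i left at 5 by the loop. */
int final_i(int i, int j) {
    return j | (ipow(2, 4) ^ (i + j - (j * 24 % 2)));
}
```

With j = 5, loop_value gives 5, 4, 7, 6, 1 for i = 0 through 4. After the loop, final_j() is 255 and final_i(5, 255) is 255 | (16 ^ 260) = 255 | 276 = 511, matching the listing.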

Dependencies

This project uses Maven and was developed using Eclipse, though it works with any IDE so long as an ANTLR plugin is available.

Reporting Bugs

See the Issues tab.

Version History

The master branch encompasses all changes. The development branches have in-progress additions and updates that are not yet ready for the master branch. There will most likely be other branches present in the future, each named with the "development" prefix, a hyphen (-), and a suffix denoting its purpose.
