vm: store data as raw contiguous bytes #242
Conversation
Thank you and welcome. Your PR description alone is great; I'm reading the code now, but I wanted to acknowledge how pleased I am with the approach you're advancing. Thanks. (Goes to read code)
Answered a few items
After having read through the initial sketch and revisiting the PR: The part about having the VM be separate might not be immediately necessary, but promoting vmgen to a more first-class backend would help a lot. Here is a possible trajectory; it can be modified of course, and is mostly a thought experiment to illustrate a bunch of things at once:
For the code, what's the reason for using `ptr` instead of a `ref` for the stack? Documenting the opcodes along with adding checks for pre- and postconditions would be awesome. In debug mode, checking the conditions is also super useful.
Very welcome! And thanks a ton for your time and effort
Thank you for the comments and reviews everyone, it's much appreciated :).
The idea is to allow the usage of … In general, I try to use … I just pushed the next batch of changes. Some general notes for reviewers regarding my code comments:
Details/Notes

There are multiple areas where I still haven't made a final decision on how to tackle them. Namely:
Object variants are also going to be tricky and, with the current approach, will probably require … Regarding …:

```nim
# Not possible right now
var f = newNimNode
discard f(nnkEmpty, nil)
```

Regardless of how I go about representing …
My previous assessment about the usage of … I'd probably go for storing the frames in a … A different approach would be to just recurse into …

I don't think the last downside would be an issue in practice though.
(haven't caught up on changes, just responding to comments first) The approach outlined with respect to a stack and an index as a reference into it, by way of a context, is how I thought this would go ultimately. I have a sinking feeling the approach to clarifying things is through vmgen. Effectively, there needs to be a core nimskull, which is enough to either directly cover a language feature or do so via desugaring. Then each bit of core should have a clear translation into byte code; let's assume no optimizations for now. Without coverage of that we don't know which opcodes we're missing or are incorrectly specified, the semantics to support, etc. PNode is not one thing, and that's the issue:
I might have missed something, but clarifying the particular usage classes and associated semantics would be another way to allow incremental progress. Perhaps an easy bit of progress is having an explicit …
Update on the current status

I'll try to summarize the rough design I currently converged on, together with the reason(s) behind some decisions:

Data representation
Theoretically, arrays of objects and objects could be merged into one thing (they are using the same data representation right now), but I decided to keep the distinction for now, as objects require an immutable tree layout, while an array of objects does not.

VM resources

The VM uses the following kinds of resources:
This is mostly the same as previously, but with a few notable differences:
More on registers

The difference between value and reference registers is opaquely handled when they're used as source operands. What this means is that instructions sourcing from registers don't check for the register kind, but just do a single pointer dereference. This is made possible by the following:
Registers, globals and constants are required to have a fixed location in memory for this to work. A downside of this approach is that you always need to do a pointer dereference when reading from a register, which probably makes it harder for the compiler to reason about the code. An alternative would have been to … I chose this design as a way to have value semantics for VM data while having a) most of the VM code emitted by …

```
LdGlobal a, ...  # Store reference to global 1 in 'a'. 'a' is a reference register
LdObj b, a, ...  # Load reference to a field of the object in 'a'. 'b' is a reference register
Echo b
NewStr b         # Assign a fresh new string to register b. 'b' is a value register now
Echo b

# Maybe a bit counter-intuitive, but this still works. But maybe it shouldn't
LdNull a, ..     # 'a' is a value register with object value
LdObj b, a, ...  # Load field reference
WrObj b, ..., c  # Copy value from register c into a field of a.x - this modifies the value in register 'a'
```

Without reference registers, each load instruction (…)

Closing thoughts

I'm still not happy with a few things:
Flat storage of arbitrary data types is what I'd go for further down the line.

Error reporting

Errors in the VM can be categorized into:
I use "internal error" to refer to both static VM code issues and dynamic state errors from here on. Right now I'm using …
Further questions on direction
Regarding testing

Ideally I'd like to do exhaustive testing of … A simple yet extremely powerful performance and correctness test would be:
That is, to evaluate/run the compiler in the VM and build the compiler with it :). If that produces a working compiler in reasonable time, I'd say that the VM is definitely ready to run code in the wild :). A requirement for that would probably be full pointer, FFI and …
@saem Ah, sorry, I only just saw your message after I hit "Comment" :). I'm a bit exhausted right now (took a long time for the write up) so I'm going to respond to your message some time later |
compiler/vm/vmconv.nim (Outdated)

```nim
  vmobjects
]

# TODO: correctly and fully supporting every type with `fromLit`/`toLit` seems
```
The VM needs to support core types, which need to be a minimal and performant set to represent directly or through composition all concrete types.
`toLit`/`fromLit` are not needed nor used by the VM but are a marshalling/unmarshalling convenience utility used by `VmCallback`s:
```nim
let atom = toLit(10)
# Instead of:
# let value = VmObjectAtom(kind: akInt, intVal: 10)

let i = fromLit(atom)
# Instead of:
# let i = atom.i
```
While both examples are more or less trivial for ints, you'd have to write more manual code for complex objects.
Before this PR they also only supported a few select cases with an "add as needed" comment
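Such a utility typically grows one `when T is` branch per representable type, which is where the manual work comes from. A hedged, minimal sketch of that shape (the `AtomKind`/`VmObjectAtom` declarations below are illustrative stand-ins, not the PR's actual definitions):

```nim
type
  AtomKind = enum akInt, akFloat, akString   # illustrative subset
  VmObjectAtom = object                      # minimal stand-in for the VM atom
    case kind: AtomKind
    of akInt: intVal: BiggestInt
    of akFloat: floatVal: BiggestFloat
    of akString: strVal: string

proc toLit[T](x: T): VmObjectAtom =
  ## One branch per supported type; unsupported types fail at compile time.
  when T is SomeInteger:
    VmObjectAtom(kind: akInt, intVal: BiggestInt(x))
  elif T is SomeFloat:
    VmObjectAtom(kind: akFloat, floatVal: BiggestFloat(x))
  elif T is string:
    VmObjectAtom(kind: akString, strVal: x)
  else:
    {.error: "add as needed".}   # mirrors the old "add as needed" comment
```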
compiler/vm/vm.nim (Outdated)

```nim
of akString:
  str.add r.atomVal.strVal
of akInt:
  # TODO: original code didn't check if int is in range 0..255. Should we do that now?
```
I tend towards no.
There is a lot going on. First, the byte code gen might have made a mistake. Second, if this was just memory/CPU we'd have loaded the correct size into the register; as in, we shouldn't have loaded 64 bits, but the 8 that are the char, then stored that. Locations are dynamic and all that.
```nim
    regs[rc].intVal.int, regs[rd].intVal.int)
of opcParseFloat:
  decodeBC(rkInt)
  # TODO: this op has really unusual semantics. Turn it into a callback?
```
Not sure yet
compiler/vm/vm.nim (Outdated)

```nim
let gp = sym.ast[genericParamsPos]
for i in 0..<gp.len:
  let idx = sym.typ.len + i
  if idx < n.len:
    tos.slots[idx] = setupMacroParam(n[idx], gp[i].sym.typ)
  else:
    # TODO: are the surrounding check and this block meant to be in the for-loop?
```
Pretty sure they are; if everything works we never go into the else. If we do, we clean up (eval failed, hence the dec) and then report an error.
But if `gp.len > 1` and `sym.typ.len == n.len`, the else branch is taken multiple times, always reporting the same error and also decrementing `evalMacroCounter` each time.

From what I can tell, `evalMacroCounter` is incremented only once at the start of evaluating a macro and then decremented at the end. If the else branch is taken even once, `evalMacroCall` would return with an `evalMacroCounter` less than the one it was called with, which doesn't make sense, so at least the `dec(g.config.evalMacroCounter)` is wrong.
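A hedged sketch (not the PR's actual code; `reportArityError` is a hypothetical stand-in) of the fix this observation implies: clean up and report exactly once, then stop, instead of repeating the error and decrement on every remaining iteration:

```nim
let gp = sym.ast[genericParamsPos]
for i in 0..<gp.len:
  let idx = sym.typ.len + i
  if idx >= n.len:
    # Missing argument: undo the single counter increment once, report, stop.
    dec(g.config.evalMacroCounter)
    reportArityError()   # hypothetical; reports the error a single time
    return
  tos.slots[idx] = setupMacroParam(n[idx], gp[i].sym.typ)
```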
@zerbina you're reading my mind on a lot of things. 😅 I'll see if I can get a moment to respond to your latest comment. There is a lot there and I want to reinforce a number of things you point out to make sure they don't get lost; I want to avoid only focusing on improvement areas and not celebrating success. For now, especially your concluding note about being able to run the compiler in the VM: yes, that's absolutely the long term direction. Besides the yo dawg and inception jokes, which, don't get me wrong, are a killer feature, it's a big step towards meta-circular interpretation/execution. This has big impact on language optimizations, language development, interactive environments, and more.
@saem Thank you for your reviews/comments/ideas. I hope my approach to developing this isn't making it too hard to review things in the current unfinished state. When I started this PR, I did a cursory read of the code in … Due to the somewhat inherent complexity of the task and me quickly trying/dropping new ideas in lots of different places, I introduced lots of code that quickly gets stale. I also wrote/write "TODO"s and "XXX"s everywhere as reminders, so that I don't need to divert my attention but also don't forget about ideas and ostensibly strange/buggy code. My plan is to resolve all "XXX"/"TODO" comments and also remove the stale leftover code I produced, before I mark this PR as non-draft. Also, I have two days' worth of changes (which implement the reference register architecture along with some other things) that I should've pushed prior to my large post, which probably renders some of your review comments stale. Sorry for that :P
Here are some of my initial comments skimming through the structures. I will try to look at more in the coming days.
compiler/vm/vmdef.nim (Outdated)

```nim
# XXX: is it okay to use this for unsigned integer too?
intVal*: BiggestInt
```
It's probably better to just use unsigned here and do the `int` checks when the instruction calls for it.
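A sketch of that suggestion (all names hypothetical): keep the register's bits unsigned and only apply a signed interpretation where an instruction asks for it:

```nim
type Reg = object
  rawVal: uint64   # raw bits; no signedness baked into the register

proc asUInt(r: Reg): uint64 =
  r.rawVal

proc asInt(r: Reg): int64 =
  ## Signed view, taken only when the instruction calls for int semantics.
  cast[int64](r.rawVal)   # bit-preserving reinterpretation
```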
compiler/vm/vmdef.nim (Outdated)

```nim
# TODO: what about type hooks (=copy, =sink, =destroy)? Original vm implementation seems to
# have completely ignored these
```
I think we should have vmgen emit the locations where the hooks would be run.
What do you mean by location?
`=copy` and `=sink` can be automatically emitted by the compiler (it already does that for native code backends), so I don't think we have to worry too much.

What I'm not sure of is whether the phase that gets this stuff ticking (`injectdestructors`) is used to transform the code for vmgen.
As far as I can tell, the code is simply transformed with `transf.transformBody` by `vmgen` prior to code gen right now. `injectdestructors` is not used.

Problems would arise with `ref` and `seq` types.

`ref` handling is fully done by the execution engine right now, so `vmgen` has no clue as to where and when their lifetime ends. `liftdestructors` seems to be responsible for the injection of ARC/ORC reference handling and maybe we can leverage this for `vmgen` too. There are a whole lot of codegen procs that'd need to be handled then though (`nimIncRef`, `nimDecRefLast`, `nimIncRefCyclic`, etc). I'm not sure if this is the right approach.

All `seq` related functionality is handled via dedicated instructions, meaning that the workings of seq handling are completely opaque to `vmgen`. A few ideas:

- Make `seq`s a non-magic type (very likely a complex endeavor and one I'd not attempt as part of this PR, but could be interesting in the future)
- Leave the seq opcodes intact but remove dedicated atom support for them from the execution engine. Use `seqs_v2` for both the storage implementation and logic. Opaquely transform seq instructions to a VM procedure call, that is, run `vmgen` over `setLen`, `shrink`, etc. and execute them directly as guest code
- Same as the above, but remove seq opcodes and let `vmgen` emit calls to `seqs_v2` procs directly.
- Nim type information needs to be carried around anyway, so VM `seq`s know their elements' type. Whenever elements are to be copied/moved/destroyed, check for the existence of a corresponding lifetime-hook and run it as guest code for the required elements. Something similar to this could also be employed for `ref` type destructors

Removing dedicated `seq` support from the VM would make it less complex, but usage of `seq`s would also be slower due to each operation running as guest code.

The `=trace` hook can probably only be realistically supported if we make use of `injectdestructors`/`liftdestructors` somehow.
compiler/vm/vmdef.nim (Outdated)

```nim
of akOrdinalArray:
  rawArrayVal*: seq[byte]
  elemSize*: int8 # 1, 2, 4 or 8
  oaKind*: OrdinalArrayKind # TODO: encode `oaKind` and `elemTyp` into a single int8
of akArray:
  arrayElements*: seq[VmObjectAtom]
  elemTyp*: PType
```
I think it might be better to merge arrays, objects and strings into one unified representation via a `seq[byte]`. Nim objects and friends can't have their type mutated at runtime, so there shouldn't be a need to store them in the array itself.

An idea that I have is to have a constant array designed to store type info in `TCtx`, then change your `kind: AtomKind` into something like `kind: TypeId`, which will be the index to that array describing the type of the atom, with the positions corresponding to `AtomKind` reserved.
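A rough sketch of the `TypeId` idea (all declarations below are hypothetical, for illustration only):

```nim
type
  TypeId = distinct uint32     # index into the constant type-info array
  VmTypeInfo = object          # hypothetical per-type descriptor
    sizeInBytes: int
  TCtx = object
    types: seq[VmTypeInfo]     # filled once at setup; positions matching
                               # the old AtomKind values stay reserved
  VmAtom = object
    kind: TypeId               # replaces `kind: AtomKind`
    data: seq[byte]            # unified raw representation

proc typeInfo(c: TCtx, a: VmAtom): VmTypeInfo =
  ## Resolve an atom's type by indexing into the context's type table.
  c.types[uint32(a.kind).int]
```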
That sounds really similar to what I'm attempting right now with the flat object idea I mentioned. Using @saem's proposed terminology (`location`, `handle`):

- A `seq[byte]` stores a sequence of `location`s
- A `location` is either an atom (int, float, seq, etc.) or a "container"
- A "container" is either homogeneous (`array`) or heterogeneous (`object`) and contains either atoms or other "containers"
- A `handle` is a reference to a `location`

To interpret the `location` referenced by a `handle` you then need the corresponding type info (`TypeInfo`), which is stored alongside the handle as a handle itself (`typeHandle`). The `TypeInfo` stores whether a `location` is an atom or a "container" and also the size in bytes of the `location`.

Additionally:

- For homogeneous "containers", `TypeInfo` stores:
  - element count (`array` size is part of the type info)
  - element stride (number of bytes between two elements)
  - element type (handle to a `TypeInfo`)
- For heterogeneous "containers", `TypeInfo` stores a sequence of field information where each stores:
  - field offset (number of bytes from the start of the "container")
  - field type (handle to a `TypeInfo`)
- For `seq` and `set` atoms, `TypeInfo` stores the respective element types.
- For `ptr` and `ref` atoms, `TypeInfo` stores the respective target types
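A hedged Nim sketch of the scheme above (type and field names are illustrative, not the prototype's actual declarations):

```nim
type
  TypeInfoHandle = distinct uint32   # a handle is an index-like reference

  TypeInfoKind = enum
    tiAtom, tiHomogeneous, tiHeterogeneous

  FieldInfo = object
    offset: int                # bytes from the start of the "container"
    typ: TypeInfoHandle        # field type

  TypeInfo = object
    sizeInBytes: int           # size of a location of this type
    case kind: TypeInfoKind
    of tiAtom:
      discard                  # seq/set/ptr/ref atoms would additionally
                               # carry their element/target type here
    of tiHomogeneous:          # e.g. array
      elemCount: int           # array size is part of the type info
      stride: int              # bytes between two consecutive elements
      elemTyp: TypeInfoHandle
    of tiHeterogeneous:        # e.g. object
      fields: seq[FieldInfo]

  Handle = object
    loc: int                   # offset of the location in the backing seq[byte]
    typeHandle: TypeInfoHandle # type info stored alongside the handle
```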
I'm almost done with a first prototype in my local repository
I rewrote this comment deleting tons of content over and over again, so I think I'm missing some things here. This is the basis for my thinking, a quote from the manual:
code specifies a computation that acts on a memory consisting of components called locations. A variable is basically a name for a location. Each variable and location is of a certain type. The variable's type is called static type, the location's type is called dynamic type. If the static type is not the same as the dynamic type, it is a super-type or subtype of the dynamic type.
So the above sounds pretty close. For the purposes of the VM, we need to decide the level of operation we support:

- once we're in pointer land, then the memory model needs to behave in a flat fashion, a la C; this is very baked in, can't change it, yet
  - so if we're in pointer land then arrays, seq, and objects need to have their parts stored contiguously
  - so `seq[byte]` works at least for that
  - we lose "free" gc; but the complexity isn't worth it

From the quote you'll notice that there is the dynamic memory model (locations) and the static memory model (handles). For the purposes of the VM: if we decide that the static/handle memory model is the floor of abstraction, then we're saying that location is a higher-level concept in VM land. This is easier to implement, but I also think we won't be able to run a bunch of key code if we go this route.
I unfortunately don't have a particularly conclusive remark right now, so I'm going to go look at the VM again and remind myself of how it uses `PNode`s. I suspect giving the VM its own explicit "heap" (a `seq[PNode]`) and having it work exclusively on that (outside of copying data in and copying results out for compile-time evaluation) is a good first step. The problem remains of AST manipulations for meta-programming, but that's a completely different use of `PNode` IMO.
compiler/vm/vmdef.nim (Outdated)

```nim
of akPtr:
  ptrTyp*: PType
  # XXX should we keep a distinction between pointer to global and pointer to inside register?
  ptrVal*: ptr VmObjectAtom
```
Do you think it is possible to eliminate pointers in the VM altogether? Allowing user input to manipulate global memory that is shared between the VM and the compiler is not a good idea IMO, but not having these would also complicate certain parts of the VM.
As in inside the VM implementation, or inside guest code? Writing the VM without pointers is possible, but, as you mentioned, it would make some parts more complicated and maybe also a bit slower, I'd say.

Next design/architecture iteration

Note: The current commit only reflects small parts of this. I adopted @saem's proposed terminology (…). A value is either made of a single …

Memory is divided into typed and untyped. Values whose type is known at their creation time (globals, constants, stack variables and objects created via …) …

Untyped …

An … The existence of … Registers either contain: …

If a register owns a handle, it will free the memory it references once it transitions to a different register state. For addresses and handles, registers also keep track of whether they reference typed or untyped memory (locations).

Implementation details

Int and float registers are dynamically typed, meaning that they're interpreted based on the instruction that works with them (e.g. …). Accessing (both reading/writing) a managed atom in untyped memory via a …

Addresses

An … Otherwise it's marked as potentially invalid. The following Nim pseudo-code describes the handling of potentially invalid addresses:

```nim
# opcLdDeref
var a = ... # The register containing the address
var b = ... # The register to hold the handle
if not a.address.isValidated:
  if not isValid(a.address): error()
  a.isValidated = true
b = asHandle(a)

# opcWrDeref
var a = ... # The register containing the address
var b = ... # The register containing the value
if not a.address.isValidated:
  if not isValid(a.address): error()
  a.isValidated = true
copyTo(a.address, b)
```

When a typed memory region is freed, all registers containing weak … When an untyped memory region is freed, all addresses referencing the freed region are invalidated. Registers containing weak … This implicit resetting should probably be made explicit by requiring vmgen to conservatively emit register reset instructions on registers that contain weak handles.

Closing

This overall design/architecture tries to be as permissive as possible while still being efficient and safe. Some parts might be a bit too restrictive, however. If I didn't forget/omit anything, this architecture should allow for pointer arithmetic, … If we go with this design, I'd write tests to ensure that the safety measures all work.
Some further thoughts and some rationale

Typed/untyped memory

The idea behind the typed/untyped distinction is/was to allow reading/writing managed atoms to …:

```nim
var a: array[sizeof(seq[int]), byte]
cast[ptr seq[int]](addr a)[].newSeq(10)
```

The question remains of course, if things like this should even be allowed in the first place. At least with ARC/ORC and the C/C++ backends, this would work. In general, I'm trying to get the VM to, more or less, the same level as the C/C++ backends, feature-wise. If the byte array example above should be supported, we could, instead of treating …, … Supporting …

On aliasing

Allowing the aliasing of … If aliasing of …
Sorry for the slow follow-up, I've had some busy weeks recently. The PR is getting rather big, which makes it hard for us to follow through with all the changes that you are planning, not to mention the integration risks of such a large change. You are basically working too fast for us to keep up 😄. From what I can tell reading this PR, this is what's being done:
I'm sure there are more that I haven't covered in this list. My recommendation is to plan out changes in subsets that we can review and integrate now, so that we can start getting your work merged. An ordering that I think could work well would be:
Other stuff can be weaved in these subsets as you see fit, but I'd recommend not adding any features or planning too much ahead until we are done with refactoring. The main goal right now is to get PNode away from the VM; we can build bigger changes on top once we get the foundation in. This ordering is only a suggestion though, please do it however you think is better, as I only have a limited grasp on how you are tackling this at the moment. Also, for discussions on the design I'd recommend we use the discussion forum, as it's hard to follow up on discussions in a PR due to GitHub hiding previous conversations, and discussing more than one topic becomes hard to track here. Again, thank you for the contribution, the work you are doing here is amazing :)
Pushed my local changes. Rebase didn't happen yet, so the two VM related commits from devel are not part of this yet. I also updated the PR description to better reflect the current state of things and to include some pointers for reviewers. The changes include the move to flat object representation of data, so a lot of code changed. Bootstrapping works (with both …). To make future progress on this easier and to remove some hacks that make things more complicated/error-prone, I'm going to do a simple overhaul (no IR yet) of … Another thing that I'm going to split off and finish prior to this PR is improved error handling (this issue). As an aside, the second and third iteration of bootstrapping take a few seconds longer now (from around 17 to around 25 seconds on my machine) and maximum memory consumption went from around 800 MB to around 1470 MB. I looked into it a bit, and it seems like … Time and memory consumption with ORC stayed the same (…)
Just added some minor comments, trading a much deeper look at vmtypes. I think I had a bit of an epiphany, just need to think through and write it down.
compiler/vm/vm.nim (Outdated)

```nim
of opcWrObj:
  # TODO: we probably need something like opcWrObjMov for directly moving a value into an object.
  #       This wasn't necessary for the old vm due to its abuse of PNode, but values now have
  #       copy semantics by default
```
Yay!
compiler/vm/vm.nim (Outdated)

```nim
if not (isLessEqual(l, a) and
        isLessEqual(a, h)):
  # TODO: are range errors `Exception`s or `Defect`s in normal Nim? If they
```
Defects are almost never the answer.
So instead of aborting execution here, a guest catchable exception should be raised? The current behaviour (aborting) is inherited from the old code and is also what happens for the other back-ends
For now we can preserve the behaviour, I think it depends on context.
It's very rare that one has a true defect, in the sense that it's irrecoverable; so unless the intention is to never use this opcode as a predicate, defect is fine for now.
```nim
if bNode.kind == nkAccQuoted:
  bNode = bNode[0]

func asCString(a: TFullReg): cstring =
```
Doesn't need to be done now, but this feels like callback ("syscall") territory.
compiler/vm/vm_enums.nim (Outdated)

```
@@ -32,6 +41,8 @@ type
  opcLdObj,     # a = b.c
  opcLdObjAddr, # a = addr(b.c)
  opcWrObj,     # a.b = c
  # TODO: needs a better name
```
How about `opcWrObjFld`?
I should have put a newline there; the to-do is targeted at `opcLdRef`.

`opcLdRef` loads a handle to the location which is referenced/targeted by the `ref` value in register 'b' into register 'a'
So it's creating a new reference/handle to the same value/location.
I'm thinking out loud below:
From a very low level generic register machine perspective we're:
- copying a value from B to A
- updating/storing a value to track the new ref
From the computational model perspective we're creating a new handle to a location and tracking the new handle for various reasons, one of which is knowing if a location is orphaned or not.
At present many of the operations, at least in terms of vocabulary, use generic VM terms. That terminology hasn't been updated and it might not need to be. At least at this juncture I'm not sure which to use, because `opcCloneHdl` feels like the most natural thing but it's at odds with the other stuff.
Helpful line of inquiry?
Hmm, I'm not entirely sure if `opcCloneHdl` would be correct/fit.

What `opcLdRef` currently does (in Nim code) is:

```nim
# opcLdRef x, r
let r: ref int = ...
var x: lent int = r[]
```
So no copying/cloning takes place and no location is actually modified
Thinking out loud:
A handle is synonymous with naming a chunk of memory (location) and assuming its schema (type). Making a handle for a location and shoving it into a register is equivalent to putting an address in a register.
I reread the operation and it seems like that's what's happening, but I'm on the phone and couldn't trace through all the relevant code.
I noticed a potential difference in our conceptual models, I consider a location a register... I'm less certain this is a good idea. Perhaps that's why I'm perceiving move/copy but you're not as I don't think you consider the register a location. Moreover, I think the perspective I was taking is not entirely correct. Although, it does have partial utility in that we can describe the fact that pointer mutations don't have to be visible to registers because in my model registers hold values.
I'm still not sure what to call this.
compiler/vm/vmconv.nim (Outdated)

```nim
  vmobjects
]

# XXX: correctly and fully supporting every type with `fromLit`/`toLit` seems
```
I think this is what would potentially become easier with the field generalization we chatted about.
I don't see how? The complexity/work is due to having to write a whole lot of `when T is xxx`/`elif` blocks in order to support all types representable by the VM (e.g. `array`, `seq`, `int`, `int32`, `int8`, etc.).

Generalized fields wouldn't help there, no?
In that I imagine the number of types that need to be supported by the VM would decrease. Potentially we could collapse array and seq. Set might disappear too.
Although I didn't get a chance to check the latest list of types and it might already be shorter. In which case you're absolutely correct.
Sorry for taking so long to review, I was busy in the last few weeks.
This is the first pass of the review, where I focus on VM static memory.
compiler/vm/vmdef.nim
Outdated
```nim
elementStride*: int ## The amount of bytes between the start of two consecutive
                    ## elements. Can be larger than the size of an element alone
```
With the total size stored in `sizeInBytes` and `elementCount`, you should be able to compute this value without actually storing it, right?
Yeah, it's just there to avoid having to compute the stride from the element's `sizeInBytes` and `alignment` whenever accessing an `array`.
I'd prefer computing over storing if it's trivial.
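For reference, the stride in question is just the element's size rounded up to its alignment, with the alignment stored as a power-of-two exponent (matching the `1 shl alignment` encoding), so computing it on the fly is cheap. A sketch in Python; the function name is made up:

```python
def element_stride(size_in_bytes: int, alignment_exp: int) -> int:
    """Distance between the starts of two consecutive array elements:
    the element size rounded up to its alignment. `alignment_exp` is the
    log2-encoded alignment, i.e. the real alignment is 1 << alignment_exp."""
    align = 1 << alignment_exp
    return (size_in_bytes + align - 1) & ~(align - 1)

# e.g. a 5-byte element with 4-byte alignment occupies 8 bytes per slot
```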
compiler/vm/vmdef.nim
Outdated
```nim
BranchListEntryKind* = enum
  blekStart
  blekBranch
  blekEnd

BranchListEntry* = object
  case kind*: BranchListEntryKind
  of blekStart:
    field*: FieldPosition
    numItems*: uint32
    numBranches*: uint16
  of blekBranch:
    fieldRange*: Slice[FieldPosition]
  of blekEnd:
    parent*: uint32
    nextField*: FieldPosition
```
I believe this is your object variant implementation? Can you briefly explain how it works?
Yes. Admittedly, it's a bit complex.
Essentially, the `branches` field in `VmType` stores a linear list of `BranchListEntry`s. The entries describe the tree a variant object might form (via nested `case`s) in a way that allows for traversal in a mostly forward fashion, requiring no recursion or look-aside buffers (implemented in the `variantFields` inline iterator). Nodes (discriminators, denoted by `blekStart` entries) are stored in a depth-first manner.
```nim
elementType*: PVmType

sizeInBytes*: uint
alignment*: uint8 ## 1 shl `alignment` == real alignment value
```
I'm not sure why you are storing the alignment. If all offsets and sizes are pre-computed, then this value would be rather useless during execution, no?
Removing the `alignment` field is possible, but you'd have to recompute the alignment for fields of object type every time (which would likely be acceptable), and `elementStride`/`seqElemStride` couldn't be removed then.
Of course, `sizeInBytes` could simply include the padding required to make the size a multiple of the alignment, but I'm a bit reluctant to do that, and the allocator probably also needs to be aware of types' alignment.
I'd say to keep unneeded data out until you really need it. Right now the only time the size without padding is ever needed is when the user puts `{.packed.}` on the structure; thus, you should factor the alignment in.
compiler/vm/vmdef.nim
Outdated
```nim
of ckTemplate:
  templateSym*: PSym
```
I don't think you have to handle templates; `sem` expands all of them at codegen and you can't store them in a variable either, but I might be wrong.
I'm also a bit unsure of why this needs to exist. I haven't looked into it further yet, but the bottom else case of the `opcIndCall` instruction uses `evalTemplate` (the branch is hit, so it's not dead code) and the `PSym`s of procs used there are of kind `skTemplate` (which I derived the naming from).
compiler/vm/vmdef.nim
Outdated
```nim
lut*: Table[ItemId, PVmType]

types*: seq[PVmType]
```
Please document these two fields
Will do. Documentation is for the most part still missing. I'm planning to do that once everything is in place, since a lot of smaller things might still change and I don't want to document things that end up getting thrown away (within the scope of this PR).
compiler/vm/vmdef.nim
Outdated
```nim
# XXX: constants need to have fixed location in memory. `ref` is used to accomplish that until I
#      decide on a more efficient memory management strategy
```
How about holding them in `TGlobals`?
compiler/vm/vmdef.nim
Outdated
```nim
# decide on a more efficient memory management strategy
constants*: seq[ref TypedVmValue] # constant data
typeInfoCache*: TypeInfoCache ## Manages the types used the VM
types*: seq[tuple[nType: PType, layoutDesc: PVmType]] # some instructions reference types (e.g. 'except')
```
What does this field do, exactly? Isn't `typeInfoCache` supposed to be your one-stop-shop for all types?
The `typeInfoCache` manages mappings between `PType` -> `PVmType` and also stores all created `PVmType`s.
The `types` field is a lookup list for instructions that require type information (e.g. `opcNew`).
Uh... how is it any different from `typeInfoCache` if it's also a lookup list?
`types` provides `range[0..regBxMax] -> (PType, PVmType)` mappings while `typeInfoCache` provides `PType -> PVmType` mappings.
Maybe it's easier if you describe how you think it should be done and I then tell you if it would work or, if not, why it wouldn't?
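A small Python sketch of that split, with hypothetical names: one structure de-duplicates source types into unique VM types, the other hands instructions a dense operand index (which must fit the instruction's Bx operand).

```python
# De-duplicating cache, analogous to `typeInfoCache`: source type -> VM type.
type_cache = {}
# Dense operand list, analogous to `types`: operand index -> (source, VM type).
type_list = []

def intern(key):
    """Return the unique VM type for `key`, creating it on first use."""
    if key not in type_cache:
        type_cache[key] = {"id": len(type_cache)}
    return type_cache[key]

def operand_index(key):
    """Register a type for use as an instruction operand and return its index.
    The same type may appear at several indices; the VM type stays unique."""
    type_list.append((key, intern(key)))
    return len(type_list) - 1
```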
Oh, so that's why. I was wondering why you have two storages for basically the same stuff.
I guess we can use only one storage once you figure out the way to get rid of `PVmType` in favor of a central index-based storage.
Should I do the `ref`-based to index-based type management change as part of this PR? It's not too hard, but would require another larger-scale refactoring.
compiler/vm/vmdef.nim
Outdated
```nim
static:
  doAssert sizeof(VmObjectAtom.atom) == sizeof(Atom)
```
Put `atom: Atom` in there if you really need it to be like this.
That's what I did in the beginning, but it caused issues due to `Atom` being a `{.union.}` that contains GC'ed types.
`VmObjectAtom` is queued for removal anyway and `Atom` will likely follow.
Here's the second pass.
More reviews coming soon™
compiler/vm/vm.nim
Outdated
```nim
func ensureAtomKind(n: var TFullReg, typ: PVmType, mm: var VmMemoryManager) {.inline.} =
  ## If necessary, transition register to a value register of kind `k`

  assert typ.kind notin {akInt, akFloat}
```
Preconditions should be enforced as an internal VM error. Assertions are turned off by default in release compiler builds, so they might not be protecting your code.
compiler/vm/vm.nim
Outdated
```nim
# XXX: I don't think exception unmarshaling/marhshaling is really needed. And even if it is,
#      use `unmarshalGeneric` instead
func unmarshalException(e: TypedHandle, nimType: PType): PNode =
```
A small tip: if you are planning to replace something, attach `{.deprecated.}` to it so you get all the call sites reported, which should simplify your work.
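The same pattern in Python terms, for anyone unfamiliar with it (both helper names here are hypothetical stand-ins, not the compiler's actual procs):

```python
import warnings

def unmarshal_generic(node):
    # stand-in for the replacement helper
    return node

def unmarshal_exception(node):
    """Old helper kept only so remaining call sites get flagged,
    analogous to attaching {.deprecated.} in Nim."""
    warnings.warn("use unmarshal_generic instead", DeprecationWarning,
                  stacklevel=2)
    return unmarshal_generic(node)
```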
compiler/vm/vm.nim
Outdated
```nim
of rkInt: x = TFullReg(kind: rkInt, intVal: y.intVal)
of rkFloat: x = TFullReg(kind: rkFloat, floatVal: y.floatVal)
of rkAddress: x = TFullReg(kind: rkAddress, addrVal: y.addrVal, addrTyp: y.addrTyp, isValid: y.isValid)
```
These can be condensed down to:

```nim
x = y
```

as objects are copied by default. Specialize only when you need to adjust the resulting value, like what you do with `rkHandle`/`rkValue`, for example.
In the code you're looking at, no. Copying is unavailable for `TFullReg` due to the copy hook being disabled with `{.error.}`. This is mostly meant as a way to force code authors to do the copy manually, in order to ensure that it's really what is wanted and to prevent accidental copies which could result in bugs.
Unfortunately, this triggered a code-gen issue in the csources compiler which resulted in a very-hard-to-track-down memory corruption issue (`x` is not reset prior to moving into it), so I'm going to remove the restriction (otherwise, I'd have to disable the `=sink` hook, which would then require manual field assignment), which will make `x = y` possible.
```nim
  assert h.typ.kind == akInt
  deref(h).intVal = x.intVal
of rkFloat:
  assert h.typ.kind == akFloat
```
I'd recommend avoiding assertions where possible.
Instead, put absolute trust in the codegen, and debug troublesome places where things act weird.
Assertions in the VM are unfortunately useless, as you have no trace of the instruction stream that leads to the erroneous state.
Also, accessing the wrong discriminator field is checked (in debug builds), so there is much less value in manually checking too.
I rapidly changed large amounts of code multiple times, and assertions helped me make sure that preconditions were not violated after changes (which could also hint at obsolete code) and to catch bugs early. I'm also using them as a tool for documentation.
In the error-handling overhaul that I'm planning to do in a separate PR prior to merging this one, I'd replace assertions in the VM with something like `vmAssert`, which, in debug builds, would throw an error that would be caught by a `try/except` around `rawExecute`, which would then print a guest-code stack trace and either do a `globalReport` or turn the error into an `nkError`.
In the code you commented on, the assertion is meant for the expression on the left, where no variant field access happens.
If okay, I'd keep the assertions till I finish this PR, but would remove/replace them prior to the merge, if wanted.
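A rough Python model of what such a `vmAssert` could look like (all names here are hypothetical; the real thing would gate on a compile-time debug switch and print a guest-level stack trace):

```python
DEBUG = True  # stand-in for a compile-time debug switch

class VmAssertError(Exception):
    """Raised on a violated VM precondition; catchable by the dispatcher."""

def vm_assert(cond, msg=""):
    # debug-only check that raises a catchable error instead of aborting
    if DEBUG and not cond:
        raise VmAssertError(msg)

def raw_execute(instructions):
    """Dispatch-loop stand-in: converts assertion failures into reports
    (here a string; in the compiler, a globalReport or nkError)."""
    try:
        for instr in instructions:
            instr()
    except VmAssertError as e:
        return f"vm error: {e}"
    return "ok"
```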
If you find them useful for debugging, please keep them. I'd prefer a design where we have as few assertions as humanly possible, but that can wait until things have cleared up.
Partial read, will look again over the weekend.
If tests in the todo are added and all tests are passing sooner than that, I'm good to merge. Can fix in post.
Okay, ORC successfully bootstraps again. There were two compiler bugs that I had to work around: one code-gen issue (already working on fixing it) and one that's seemingly related to
48b3c83 to 259fa34 (Compare)
Bors d+
✌️ zerbina can now approve this pull request. To approve and merge a pull request, simply reply with
@alaviss @haxscramper Do you still want to review this PR?
The tests for the access validation checks are added. They're manually written and currently don't cover all possible interactions, but this can always be changed later on.
I say it's merge time 🥳
I know not everyone might have had a chance to do a "final" review. The fact of the matter is lots of improvements came out of this and were already merged. 🎉 After the fact we'll run into bugs, and that's ok, but there were bugs before this change, and any change, even after more review, will have bugs. Thankfully it's software; we can change it, and from my reading it seems easier to maintain now than before.
Bors r+
242: vm: store data as raw contiguous bytes r=saem a=zerbina ## Summary This replaces the `PNode`-based data representation in the VM with storing the data as raw contiguous bytes, similar to how it works on the c/cpp targets. `VmType` encodes the information necessary to interpret the raw bytes. ## Details - since `PType`s store a lot of information irrelevant for the VM, they are first translated into a format (`VmType`) that's faster and easier to traverse/query. The mapped-to-types are de-duplicated, meaning that if two VM types have the same ID (currently the same instance), they're also the same type, making type equality checks very fast. This respects structural-vs-nominal types, but currently not `distinct` and named-vs-unnamed tuples. - location type information is stored separately from the actual data - `const` values are still stored using `PNode` representation, which requires a translation step whenever data crosses the compiler/VM border - it is enforced that locations are only accessed via handles of compatible type, reporting an error on violation - all memory access is now also ensured to not go outside of VM-owned memory, with an error being reported on violation ### Known changes in observable behaviour - `repr` output in compile-time/NimScript contexts changed a bit when compiling with `--mm:refc`. The output is more closer to that with `--mm:arc`. - when passing non-`NimNode` values to non-`static` parameters with `genAst`/`getAst`, the invoked macro always receives it as typed AST now (even if the parameter is of `untyped` type) Co-authored-by: zerbina <[email protected]>
Build failed:
Okay, the failing test should work now. The issue was with assigning a temporary to a

bors r+
242: vm: store data as raw contiguous bytes r=zerbina a=zerbina ## Summary This replaces the `PNode`-based data representation in the VM with storing the data as raw contiguous bytes, similar to how it works on the c/cpp targets. `VmType` encodes the information necessary to interpret the raw bytes. ## Details - since `PType`s store a lot of information irrelevant for the VM, they are first translated into a format (`VmType`) that's faster and easier to traverse/query. The mapped-to-types are de-duplicated, meaning that if two VM types have the same ID (currently the same instance), they're also the same type, making type equality checks very fast. This respects structural-vs-nominal types, but currently not `distinct` and named-vs-unnamed tuples. - location type information is stored separately from the actual data - `const` values are still stored using `PNode` representation, which requires a translation step whenever data crosses the compiler/VM border - it is enforced that locations are only accessed via handles of compatible type, reporting an error on violation - all memory access is now also ensured to not go outside of VM-owned memory, with an error being reported on violation ### Known changes in observable behaviour - `repr` output in compile-time/NimScript contexts changed a bit when compiling with `--mm:refc`. The output is more closer to that with `--mm:arc`. - when passing non-`NimNode` values to non-`static` parameters with `genAst`/`getAst`, the invoked macro always receives it as typed AST now (even if the parameter is of `untyped` type) Co-authored-by: zerbina <[email protected]>
Canceled.
bors r+
*Summary* This replaces the PNode-based data representation in the VM with storing the data as raw contiguous bytes, similar to how it works on the c/cpp targets. `VmType` encodes the information necessary to interpret the raw bytes. *Details* - since `PType`s store a lot of information irrelevant for the VM, they are first translated into a format (`VmType`) that's faster and easier to traverse/query. The mapped-to-types are deduplicated, meaning that if two VM types have the same ID (currently the same instance), they're also the same type, making type equality checks very fast. This respects structural-vs-nominal types, but currently not `distinct` and named-vs-unnamed tuples. - location type information is stored separately from the actual data - `const` values are still stored using `PNode` representation, which requires a translation step whenever data crosses the compiler/VM border - it is enforced that locations are only accessed via handles of compatible type, reporting an error on violation - all memory access is now also ensured to not go outside of VM-owned memory, with an error being reported on violation * Known changes in observable behaviour * - `repr` output in compile-time/NimScript contexts changed a bit when compiling with `--mm:refc`. The output is closer to that with `--mm:arc`. - when passing non-`NimNode` values to non-`static` parameters with `genAst`/`getAst`, the invoked macro always receives it as typed AST now (even if the parameter is of `untyped` type)
Build succeeded:
Summary ======= Remove the obsolete `fixupTypeAfterEval` procedure and all its usages. Details ======= `fixupTypeAfterEval` was originally required for ensuring the value AST returned by the VM is properly typed, but with the introduction of `vmcompilerserdes` (nim-works#242), this became obsolete. Remaining usages ---------------- Normal procedures cannot return `untyped`, `typed`, or `typeDesc` values, meaning that the `fixupTypeAfterEval` in `evalAtCompileTime` was always a no-op. For `static` expressions, `typeDesc` values are possible, but since the evaluated result coming out of `evalStaticExpr` is always a `nkType` AST, for which `semExpr` is a no-op, `fixupTypeAfterEval` was also a no-op there.
## Summary Remove the obsolete `fixupTypeAfterEval` procedure and all its usages. ## Details `fixupTypeAfterEval` was originally required for ensuring the value AST returned by the VM is properly typed, but with the introduction of `vmcompilerserdes` (#242), this became obsolete. ### Removed usages Normal procedures cannot return `untyped`, `typed`, or `typeDesc` values, meaning that the `fixupTypeAfterEval` in `evalAtCompileTime` was always a no-op. For `static` expressions, `typeDesc` values are possible, but since the evaluated result coming out of `evalStaticExpr` for those is always a `nkType` AST, for which `semExpr` is a no-op, `fixupTypeAfterEval` was also a no-op there.
## Summary

This replaces the `PNode`-based data representation in the VM with storing the data as raw contiguous bytes, similar to how it works on the c/cpp targets. `VmType` encodes the information necessary to interpret the raw bytes.

## Details

- since `PType`s store a lot of information irrelevant for the VM, they are first translated into a format (`VmType`) that's faster and easier to traverse/query. The mapped-to-types are de-duplicated, meaning that if two VM types have the same ID (currently the same instance), they're also the same type, making type equality checks very fast. This respects structural-vs-nominal types, but currently not `distinct` and named-vs-unnamed tuples.
- location type information is stored separately from the actual data
- `const` values are still stored using `PNode` representation, which requires a translation step whenever data crosses the compiler/VM border
- it is enforced that locations are only accessed via handles of compatible type, reporting an error on violation
- all memory access is now also ensured to not go outside of VM-owned memory, with an error being reported on violation

### Known changes in observable behaviour

- `repr` output in compile-time/NimScript contexts changed a bit when compiling with `--mm:refc`. The output is closer to that with `--mm:arc`.
- when passing non-`NimNode` values to non-`static` parameters with `genAst`/`getAst`, the invoked macro always receives it as typed AST now (even if the parameter is of `untyped` type)

The initial goal was for this PR to be the VM refactoring PR, but it shifted to being about unblocking both the DOD AST and a `vmgen` rewrite. I'm not happy with the overall quality, but I think it has to suffice for the first step. The biggest issues:

- `vm.nim` is too complex
- `vmgen.nim` was in a bad state before, but it's even worse now

Lots of parts/components are in a transitional state and not finished yet. I do have a clear plan on how to further proceed from here. In the bigger VM refactoring picture, this PR marks the first big step, with further steps being: `vmgen`; rewrite `vmgen` in general. I'm going to move to a more incremental development approach (smaller, more frequent PRs) for the rest of the VM refactoring from this PR on.

The biggest part of the test suite runs without changes; some tests had to be adjusted due to slight behavioral changes in some areas (e.g. `repr`, `getAst`). Some CPS tests also succeed, but they're running a bit slower now.

Please don't merge this yet. I'd like to first compile a test suite covering all VM/const related issues reported in the mainline repository as a separate PR, in order to see if and what issues are fixed with this PR. (Edit: It makes more sense to do this once the rest of the VM related work is done.)

Notes for reviewers:

### To-do

- Get rid of register pointers (`rkRegAddr`)
- Remove `vmgen` related hacks where possible (postponed to after this PR)
- `PNode` to vm data conversions
- `ref` support
- `ref` types
- Get FFI support back in shape
- Finish/clean up remaining loose ends (postponed)
- `set` implementation
- Better VM allocator implementation (postponed)
- Improve how small values (int, float, ptr, etc.) are allocated for registers (postponed)
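The two access-validation guarantees from the summary (a handle may only touch memory the VM owns, and only through a type matching the location's) can be pictured with a small Python model. The tuple layouts and the strict type-equality check are invented for illustration; the real checks work on `VmType` handles and allow compatible, not just identical, types.

```python
def check_access(handle, allocations):
    """handle = (start, size, type); allocations = [(start, size, type), ...].
    An access is valid only if it lies fully inside one VM-owned allocation
    and the handle's type matches the location's type."""
    start, size, typ = handle
    for a_start, a_size, a_typ in allocations:
        # containment check: the whole access stays inside this allocation
        if a_start <= start and start + size <= a_start + a_size:
            # location found; now the types must agree
            return typ == a_typ
    return False  # outside VM-owned memory
```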