Udon VM (Short Primer)
Udon is VRChat’s sandboxed virtual machine for world content.
Udon operates by moving around values that for practical purposes are of the C# object type, and running ‘externs’ that manipulate these values or do things with them.
The ‘for practical purposes’ note here is important: the values are actually kept in StrongBox (as evidenced by serialized Udon data), presumably to avoid creating an excessive number of boxes.
This system ultimately appears designed to emulate a memory that is a C# object array; however, all ‘heap’ indexes are static.
It also has a conditional jump, an indirect jump, and an integer stack solely consisting of hardcoded integers; these are indexes into the array of objects.
It is extremely minimalist at its core, with only 9 opcodes of which 2 are NOPs and 1 is generally unlikely to be used. The design essentially provides control flow primitives for Udon Graph, with a few extras thrown in.
Various other decisions support an Udon Graph-first theory of Udon development:
- Udon ‘variables’ are essentially object fields. There’s no such thing as functions or local variables.
- Very little is dynamic. The JUMP_INDIRECT opcode is as dynamic as it gets.
- Someone clearly wanted Udon Assembly to be half-decent, but then stopped at implementing a proper range of heap constants.
- There’s a mysterious missing opcode next to the stack operations, but also next to the conditional jump.
  - If we assume this is suggesting something, the sensible options seem to be DUP, PICK, or JUMP_IF_TRUE.
An Udon program consists of the following key things:
- Heap: .NET values available to the program on start
- Bytecode: List of integers representing actual program code
- Symbols: Exposes fields and event handlers to the outside
- Sync metadata: Used for networking sync
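The four parts listed above can be pictured as a plain data structure. The following is only a sketch in Python, not Udon's actual serialized format: the class name, the field names, and the split of symbols into fields and entry points are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class UdonProgramSketch:
    """Hypothetical container mirroring the four parts of an Udon program."""
    heap: list[Any]                 # initial values, one per static heap index
    bytecode: list[int]             # flat list of opcode and parameter integers
    # Symbols, split here (by assumption) into fields and event handlers.
    symbols: dict[str, int] = field(default_factory=dict)       # field name -> heap index
    entry_points: dict[str, int] = field(default_factory=dict)  # event name -> bytecode position
    synced: set[str] = field(default_factory=set)               # fields marked for network sync


# Illustrative program: two heap slots and a single NOP.
program = UdonProgramSketch(
    heap=[False, "hello"],
    bytecode=[0],
    symbols={"flag": 0, "greeting": 1},
    entry_points={"_start": 0},
    synced={"flag"},
)

# The outside world reaches a field by name, which resolves to a heap index.
greeting = program.heap[program.symbols["greeting"]]
```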
Execution Flow
The Udon interpreter’s runtime state consists of the stack, ‘heap’, and the program counter.
The Udon interpreter is started with an ‘event’. The interpreter is reentrant (events may run during other events) due to save/restore code in UdonBehaviour – this is to say, if not for that code, it wouldn’t be reentrant. However, the design of Udon discourages reentrant and recursive code anyway.
The Udon interpreter loops until an exception happens or it runs out of instructions (because of a jump beyond program bounds).
JUMP, 0xFFFFFFFC is the usual idiom to deliberately stop the interpreter (because of the end of the program). Any label placed at the end of the program will theoretically do instead. Whether the difference between the two has any important side effects is uncertain.
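The stopping rule can be sketched with a toy loop that understands only NOP and JUMP. This is an illustration, not the real interpreter: it assumes bytecode positions are byte offsets into a list of 4-byte words (consistent with 0xFFFFFFFC being a multiple of 4, but still an assumption), and the function and variable names are made up.

```python
NOP, JUMP = 0, 5
HALT_ADDRESS = 0xFFFFFFFC


def run(words, pc=0):
    """Run until the program counter leaves the bytecode; return visited positions."""
    trace = []
    while 0 <= pc // 4 < len(words):
        trace.append(pc)
        op = words[pc // 4]
        if op == NOP:
            pc += 4                      # no parameter: one word long
        elif op == JUMP:
            pc = words[pc // 4 + 1]      # parameter is the new position
        else:
            raise ValueError(f"opcode {op} is outside this sketch")
    return trace


# Usual idiom: jump far beyond the program. The NOP at position 12 never runs.
idiom_trace = run([NOP, JUMP, HALT_ADDRESS, NOP])

# Alternative: jump to a label at the very end (position 16 of a 16-byte program).
label_trace = run([NOP, JUMP, 16, NOP])
```

In this sketch both variants visit the same positions and then stop, which is all the loop condition cares about.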
Opcodes
0: NOP
This opcode does nothing. There is almost no reason to use this, except as padding to avoid symbol aliasing errors.
1: PUSH parameter
This opcode pushes an integer to the stack. Udon Assembly may give the impression that a value is being pushed; this is not the case. What is pushed is the heap address of that value.
Unless you are very dedicated to size-optimizing your Udon programs (even at the expense of runtime speed in some cases), or trying to obfuscate, there is never any reason to use this in a conditional fashion. Simply push everything immediately before EXTERN, COPY or JUMP_IF_FALSE.
2: POP
Pops an integer from the stack and discards it.
This instruction only really makes sense if you’re doing something very weird with the VM.
3: (unassigned)
Fun fact: Ancient private versions of my documentation said something very incorrect or at least misleading about this ‘instruction’. I blame my setup not being the most reliable thing ever when it comes to Worlds tests.
4: JUMP_IF_FALSE parameter
Pops a heap index from the stack and reads a System.Boolean from it.
If this value is false, jumps to the parameter as a bytecode position. Otherwise, continues to the next instruction.
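A minimal sketch of this behaviour in Python, to underline that the stack carries a heap index and the Boolean itself lives on the heap. The function name and the assumption that the instruction occupies 8 bytes (opcode plus parameter) are illustrative.

```python
def jump_if_false(stack, heap, pc, target):
    """Return the next bytecode position for a JUMP_IF_FALSE located at pc."""
    index = stack.pop()              # a heap index, not a value
    condition = heap[index]
    if not isinstance(condition, bool):
        raise TypeError("JUMP_IF_FALSE expects a Boolean on the heap")
    # Assumed instruction size: 4-byte opcode + 4-byte parameter.
    return pc + 8 if condition else target


heap = [True, False]

# PUSH 1 then JUMP_IF_FALSE 40: heap[1] is False, so the jump is taken.
taken = jump_if_false([1], heap, pc=0, target=40)

# PUSH 0 then JUMP_IF_FALSE 40: heap[0] is True, so execution falls through.
fallthrough = jump_if_false([0], heap, pc=0, target=40)
```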
5: JUMP parameter
Jumps to the bytecode position given by the parameter.
6: EXTERN parameter
Contrary to how this might look in assembly, the heap index given as a parameter is both read and written.
Firstly, it is read.
If it is a string, then this is the extern name, and needs to be resolved into a delegate, which is then written to the heap index. If it’s already a delegate, this is fine.
The delegate has some metadata about how many arguments it has, so that many heap indexes are popped. These are given to the extern in push order (first pushed = first argument). Note that these are still heap indexes, not values.
The delegate can read or write to any heap index given. In the case of out/ref (Ref suffix on the type in both cases), it does write, and it doesn’t bother reading for out.
The actual implementation of externs is that they are autogenerated wrappers around existing C# functions.
Two important wrinkles appear, in regards to the this argument and the return value (including getters). this is an additional parameter at the start, and the return value is an additional parameter at the end, the heap index of which is then written to.
7: ANNOTATION parameter
This opcode allows for a value to be inserted into the opcode stream with no side-effects whatsoever.
This may be useful for inline debug information or obfuscation; in particular, there is no rule that you can’t perform “misaligned” execution.
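To illustrate what "misaligned" execution means, here is a small linear decoder that reads the same words from two different starting positions. It is a sketch under assumptions: positions are taken to be byte offsets into 4-byte words, only word-granular misalignment is modelled, and the set of opcodes that carry a parameter is taken from the opcode list in this document.

```python
NAMES = {0: "NOP", 1: "PUSH", 2: "POP", 4: "JUMP_IF_FALSE", 5: "JUMP",
         6: "EXTERN", 7: "ANNOTATION", 8: "JUMP_INDIRECT", 9: "COPY"}
WITH_PARAMETER = {1, 4, 5, 6, 7, 8}


def decode(words, start):
    """Decode instructions linearly from byte position `start` (no jumps followed)."""
    decoded = []
    index = start // 4
    while index < len(words):
        op = words[index]
        if op in WITH_PARAMETER:
            decoded.append((index * 4, NAMES[op], words[index + 1]))
            index += 2
        else:
            decoded.append((index * 4, NAMES[op], None))
            index += 1
    return decoded


words = [7, 1, 2, 0]

# Entering at 0: the ANNOTATION swallows the 1 as its parameter.
aligned = decode(words, 0)

# Entering at 4: the annotation's parameter is itself read as PUSH.
misaligned = decode(words, 4)
```

Entering at position 0 yields ANNOTATION 1, POP, NOP; entering at position 4 yields PUSH 2, NOP from the very same words.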
8: JUMP_INDIRECT parameter
Gets a heap index from the parameter and reads a System.UInt32 from it.
Interprets this as a bytecode position and jumps to it.
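As a minimal sketch (function name and range check are assumptions), the jump target is whatever unsigned 32-bit integer currently sits in the named heap slot, so one and the same instruction can lead to different places over time.

```python
def jump_indirect(heap, param):
    """Return the bytecode position stored at heap index `param`."""
    target = heap[param]
    # Python has no UInt32; approximate it with a range-checked int.
    if isinstance(target, bool) or not isinstance(target, int):
        raise TypeError("JUMP_INDIRECT expects an unsigned 32-bit integer")
    if not 0 <= target <= 0xFFFFFFFF:
        raise TypeError("value does not fit in 32 bits")
    return target


heap = [24]
first = jump_indirect(heap, 0)

# Something else (a COPY, say) overwrites the slot; the same
# instruction now jumps elsewhere.
heap[0] = 48
second = jump_indirect(heap, 0)
```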
9: COPY
Pops two heap indexes. The value from the second heap index popped (aka the first heap index given) is copied to the first heap index popped (aka the second heap index given).
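The pop order is easy to get backwards, so here is a short sketch of it in Python. The function name is illustrative; the behaviour follows the description above: push the source first, then the destination.

```python
def copy(stack, heap):
    """COPY: pops destination, then source; copies heap[source] to heap[destination]."""
    destination = stack.pop()    # first popped = second pushed
    source = stack.pop()         # second popped = first pushed
    heap[destination] = heap[source]


heap = ["source value", "old value"]
stack = [0, 1]                   # PUSH 0 (source), PUSH 1 (destination)
copy(stack, heap)
```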