Compiled UIX v4

Compiled UIX (often shortened to "UIB" for "UI Binary") is one of two formats used by Iris 4 for defining user interfaces. UIB is generated by passing one or more UIX XML files to the UIX compiler, which pre-processes the UI definition into custom bytecode alongside a set of data tables. Although UIB bytecode instructions are much higher level than machine code, the interpreter is conceptually similar to a CPU emulator: it runs relatively simple instructions in a fetch, decode, execute loop.

Note that the format used by Iris 4 differs significantly from Iris 3's UIB, which can be identified by the file magic "UIX2008".

Nearly all of the markup-related code, including compiled UIX support, is in the Microsoft.Iris.Markup namespace. Most of the functions responsible for loading compiled UIX files are in Microsoft.Iris.Markup.CompiledMarkupLoader, with some key functionality delegated to other classes. Most notably, the bytecode interpreter is implemented in Microsoft.Iris.Markup.Interpreter.

Data format

Being a binary data format, UIB is capable of storing several primitive data types.

All integers are stored in little endian, as is standard for Windows.

Offsets

All offsets are stored as unsigned 32-bit integers (UInt32), relative to the start of the file unless otherwise specified.

Offset ranges are typically specified with an inclusive start offset and exclusive end offset: \([\mathrm{Start}, \mathrm{End})\).

Strings

Strings are stored length-prefixed with a UInt16 preamble, followed by encoded characters. If the preamble is 0xFFFF, the string is null. Otherwise, the most significant bit indicates whether the characters are UTF-8 encoded, and the remaining 15 bits are the number of characters (not bytes) in the string.

| Length value (binary) | Meaning |
|---|---|
| 1111 1111 1111 1111 | Null string |
| 1xxx xxxx xxxx xxxx | UTF-8 character encoding |
| 0xxx xxxx xxxx xxxx | UTF-16 character encoding |
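
The preamble rules above can be sketched as a small reader. This is only an illustration, not the Iris implementation: `read_string` and its helper are hypothetical names, and the sketch assumes that a "character" is one code point for UTF-8 strings and one 16-bit code unit for UTF-16 strings, which the format description does not spell out.

```python
import struct

NULL_PREAMBLE = 0xFFFF  # preamble value that marks a null string
UTF8_FLAG = 0x8000      # most significant bit: characters are UTF-8


def _utf8_sequence_length(lead_byte: int) -> int:
    """Number of bytes in the UTF-8 sequence that starts with lead_byte."""
    if lead_byte < 0x80:
        return 1
    if lead_byte >> 5 == 0b110:
        return 2
    if lead_byte >> 4 == 0b1110:
        return 3
    if lead_byte >> 3 == 0b11110:
        return 4
    raise ValueError(f"invalid UTF-8 lead byte 0x{lead_byte:02X}")


def read_string(buf: bytes, pos: int = 0):
    """Decode one length-prefixed string; return (value, next position)."""
    (preamble,) = struct.unpack_from("<H", buf, pos)  # little-endian UInt16
    pos += 2
    if preamble == NULL_PREAMBLE:
        return None, pos
    char_count = preamble & 0x7FFF  # low 15 bits: characters, not bytes
    if preamble & UTF8_FLAG:
        # Assumption: walk the sequences to find where char_count
        # code points end, since the count is not a byte length.
        end = pos
        for _ in range(char_count):
            end += _utf8_sequence_length(buf[end])
        return buf[pos:end].decode("utf-8"), end
    # Assumption: each UTF-16 character is one 2-byte code unit.
    end = pos + 2 * char_count
    return buf[pos:end].decode("utf-16-le"), end
```

Returning the next position alongside the value lets a caller read consecutive fields without tracking string byte lengths separately.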

String references

Most strings are not stored inline, but instead as Int32 indexes into the Strings section of the Binary Data Table. This allows for common strings to be deduplicated, which can reduce file size.

String arrays

String arrays are lists of string references. This means that they are effectively Int32 arrays, where each item is an index into the Strings list in the Binary Data Table.

Integer arrays

Arrays of 32-bit integers are also stored prefixed with their length; as with strings, a 'negative length' is interpreted as a null array. Otherwise, the integers are stored one after another. Both signed and unsigned integers (Int32 and UInt32) can be stored in this format, but the Iris library always reads the values as unsigned, so callers must cast each value to Int32 when signed integers are expected.

Booleans

Boolean values are stored as a single byte, where 0 represents false and 1 represents true.

Enums

Enums are usually stored as 32-bit integers. Naturally, the meaning of a particular integer value depends on which enum it is intended to be.

MarkupType

| Name | Value |
|---|---|
| None | 0x00000000 |
| UI | 0x01000000 |
| Class | 0x02000000 |
| Effect | 0x03000000 |
| DataType | 0x04000000 |
| DataQuery | 0x05000000 |

File structure

A custom binary format is used to store all compiler output. This format is divided into several sections, and can even be split across multiple files using shared Data Tables.

The first four bytes are always 0x5549421A, which spell out "UIB␚" in ASCII. The next four bytes contain some representation of the UIB version, although the exact format is unknown. All known Iris 4 releases, including 4.0 and 4.8 Beta, use 1012 (0x3F4).
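
As a rough sketch, the header check might look like the following. `read_header` is a hypothetical helper, and because the exact version format is unknown, the sketch simply assumes the four version bytes form a little-endian UInt32.

```python
import struct

# The four magic bytes: "UIB" followed by the ASCII SUB control character.
UIB_MAGIC = bytes([0x55, 0x49, 0x42, 0x1A])


def read_header(buf: bytes) -> int:
    """Validate the file magic and return the version field."""
    if buf[:4] != UIB_MAGIC:
        raise ValueError("not an Iris 4 UIB file")
    # Assumption: the version is a little-endian UInt32 at offset 4.
    (version,) = struct.unpack_from("<I", buf, 4)
    return version
```

Under that assumption, a file from a known Iris 4 release would yield 1012 here.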

Table of Contents

The Table of Contents begins at offset 0x0008, with two offsets specifying the start and end of the Object Section. Locations 0x0010 and 0x0014 contain the start and end of the Line Number Table, respectively.

The last item stored in the Table of Contents is a reference to the Binary Data Table. This is composed of two fields, of which only one can be set at a time. The value at offset 0x0018 is the start of a string. If the string is not null, then it is used as the resource URI to load a shared Data Table from. If it is null, its two-byte preamble (0xFFFF) is the whole string, and the UInt32 that follows at location 0x001A is the offset to the Data Table embedded within the current file.
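
The layout described above could be read along these lines. This is a sketch under the stated offsets only: `TableOfContents` and `read_toc` are hypothetical names, and when a shared Data Table URI is present the sketch just reports where its string starts rather than decoding it.

```python
import struct
from typing import NamedTuple, Optional, Tuple


class TableOfContents(NamedTuple):
    object_section: Tuple[int, int]       # [start, end) file offsets
    line_number_table: Tuple[int, int]    # [start, end) file offsets
    shared_table_uri_offset: Optional[int]     # set when a URI string is present
    embedded_data_table_offset: Optional[int]  # set when the URI string is null


def read_toc(buf: bytes) -> TableOfContents:
    """Read the Table of Contents that starts at file offset 0x08."""
    obj_start, obj_end, ln_start, ln_end = struct.unpack_from("<4I", buf, 0x08)
    (preamble,) = struct.unpack_from("<H", buf, 0x18)
    if preamble == 0xFFFF:
        # Null URI: the Data Table is embedded; its offset follows at 0x1A.
        (embedded,) = struct.unpack_from("<I", buf, 0x1A)
        return TableOfContents((obj_start, obj_end), (ln_start, ln_end), None, embedded)
    # Non-null URI: the string at 0x18 names a shared Data Table resource.
    return TableOfContents((obj_start, obj_end), (ln_start, ln_end), 0x18, None)
```

Keeping the two Data Table fields as separate optional values mirrors the rule that only one of them is ever set.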

Dependencies

The Dependencies section is a list of UIX files to include, encoded as an unsigned 16-bit count followed by a series of entries. Each include is composed of a one-byte Boolean flag that stores whether the referenced file is UIX XML, and a string reference to the dependency's compiler name. This name is almost always the URI the file was loaded from.

As an example, a Dependencies section with two includes might be stored as shown below. Note that all offsets are relative to the first byte of the dependency count, and that values are shown as raw bytes in file order.

| Start offset | Value | Meaning |
|---|---|---|
| 0x00 | 0x0200 | The list contains 2 includes |
| 0x02 | 0x00 | dependencies[0] is compiled UIB |
| 0x03 | 0x05000000 | The URI of the 1st dependency is the 6th string in the Data Table |
| 0x07 | 0x01 | dependencies[1] is UIX XML |
| 0x08 | 0x02000000 | The URI of the 2nd dependency is the 3rd string in the Data Table |
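
A reader for this section might be sketched as follows. It follows the prose description (a UInt16 count, then a one-byte flag and an Int32 string reference per entry); `read_dependencies` is a hypothetical name, and the field widths should be treated as assumptions until checked against real files.

```python
import struct


def read_dependencies(buf: bytes, pos: int = 0):
    """Return ([(is_uix_xml, name_string_index), ...], next position)."""
    (count,) = struct.unpack_from("<H", buf, pos)  # assumed UInt16 count
    pos += 2
    dependencies = []
    for _ in range(count):
        is_uix_xml = buf[pos] != 0  # one-byte Boolean flag
        # Int32 index into the Strings list of the Binary Data Table.
        (name_index,) = struct.unpack_from("<i", buf, pos + 1)
        dependencies.append((is_uix_xml, name_index))
        pos += 5
    return dependencies, pos
```

The name indexes still have to be resolved against the Strings table before the dependency URIs can be loaded.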

Type Export Declarations

The Type Export Declarations are composed of two tables: the Export Table and Alias Table.

The Export Table is a length-prefixed (UInt16) list of exports, where each export is a type defined with a reference to the local name and the markup type.

The Alias Table allows a UIB file to export imported types under a different name. Each entry is exactly 10 (0x0A) bytes long and is composed of the desired alias, the dependency to load it from, and the name of the target type. Both the alias and the target type name are stored as string references. The dependency is always referred to by index, either into the Type Import Table of the shared Binary Data Table if one is specified, or into the file's own dependencies.

| Offset into entry | Meaning |
|---|---|
| 0x00 | String reference to the desired Alias |
| 0x04 | UInt16 index into the Dependencies list |
| 0x06 | String reference to the target type name |
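
Because every entry has a fixed 10-byte layout, it maps naturally onto a packed struct. The sketch below uses hypothetical names (`ALIAS_ENTRY`, `read_alias_entry`) and assumes the string references are the Int32 indexes described under String references.

```python
import struct

# Int32 alias reference, UInt16 dependency index, Int32 target name reference.
# The "<" prefix means little-endian with no padding, giving 4 + 2 + 4 = 10 bytes.
ALIAS_ENTRY = struct.Struct("<iHi")


def read_alias_entry(buf: bytes, pos: int = 0):
    """Return (alias_string_index, dependency_index, target_name_string_index)."""
    return ALIAS_ENTRY.unpack_from(buf, pos)
```

Entry `n` of the table would then sit `n * ALIAS_ENTRY.size` bytes after the first entry.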

Binary Data Table

The Binary Data Table consists of several subtables, each containing a different type of constant data. These subtables are stored in the following order:
1. Strings
2. ???

Strings table

The Strings table is effectively a list of strings, though unlike the primitive string[], it is actually stored as char[][].

The first four bytes of the Strings table contain the length of the list as a 32-bit integer. Although this value cannot be negative, UIX.dll ultimately uses this as an Int32 to allocate memory, so theoretically a maximum of 0x7FFFFFFF or 2,147,483,647 strings can be stored in a single UIB file.

Following the string count is a series of offsets relative to the first entry in the offset table (the byte immediately after the string count bytes). This is used similarly to a jump table, where the first chunk of the table is an array of fixed-size offsets into the second chunk. When a UIB file is being read from fixed memory, this allows Iris to jump directly to the requested string using its index, without having to read the entire table or every string before it.

As an example, a string table with three entries might be stored as shown below. Note that all offsets are relative to the first entry in the jump table, that values are shown as raw bytes in file order, and that each string begins with the two-byte (UInt16) length preamble described under Strings.

| Start offset | Value | Meaning |
|---|---|---|
| -0x04 | 0x03000000 | The table contains 3 strings |
| 0x00 | 0x0C000000 | strings[0] is located at offset 0x0C |
| 0x04 | 0x1E000000 | strings[1] is located at offset 0x1E |
| 0x08 | 0x25000000 | strings[2] is located at offset 0x25 |
| 0x0C | 0x0800 | strings[0] is 8 UTF-16 characters |
| 0x0E | "Γεια σας" | strings[0] character data |
| 0x1E | 0x0580 | strings[1] is 5 UTF-8 characters |
| 0x20 | "Howdy" | strings[1] character data |
| 0x25 | 0x0880 | strings[2] is 8 UTF-8 characters |
| 0x27 | "MOREtext" | strings[2] character data |
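
The jump-table lookup can be sketched as below. `locate_string` is a hypothetical helper that takes a buffer beginning at the string count and returns the position of the requested string's length preamble within that buffer; decoding the string itself is left to a separate reader. The 4-byte width of each jump-table entry is taken from the example layout and is otherwise an assumption.

```python
import struct


def locate_string(table: bytes, index: int) -> int:
    """Return the position, within table, of the preamble of strings[index]."""
    (count,) = struct.unpack_from("<i", table, 0)  # string count as Int32
    if not 0 <= index < count:
        raise IndexError(f"string index {index} out of range (count {count})")
    jump_base = 4  # the jump table starts right after the count
    # Each jump-table entry is assumed to be a 4-byte offset from jump_base.
    (relative,) = struct.unpack_from("<I", table, jump_base + 4 * index)
    return jump_base + relative
```

Only the count and one jump-table entry are read, so the cost of a lookup does not depend on how many strings precede the requested one.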

Constants Table

Type Import Table

Source Markup Import Tables

Line Number Table

Object Section

Export Table

Load passes

[Work in progress]

Compiled UIX is loaded in three main passes, listed in order of execution below. "Depersist" usually refers to reading and processing encoded information, such as type exports.

  1. Declare types
    1. Depersist Table of Contents
    2. Depersist Binary Data Table
    3. Depersist Dependencies
    4. Depersist Type Export Declarations
  2. Populate public model
    1. Depersist Type Import Table
    2. Depersist Type Export Definition
  3. Full
    1. Depersist Data Mappings Table
    2. Depersist Constants Table
    3. Depersist Line Number Table
    4. Depersist Object Section

TODO

Iris has two separate type systems: the runtime types, which are standard .NET types, and the markup type schemas, which are themselves .NET types but are used to wrap runtime types for use by the UIX compiler and interpreter. Mapping from a schema to its runtime type and constructing instances from strings is easy enough, because the original UIX tooling has to do both tasks. Doing the reverse (finding the schema for a given runtime type, or encoding a runtime object into a string that can be parsed later) is much more difficult.