Freitag, Februar 23, 2018

TableGen #3: Bits

This is the third part of a series; see the first part for a table of contents.

One of the main backend uses of TableGen is describing target machine instructions, and that includes describing the binary encoding of instructions and their constituents parts. This requires a certain level of bit twiddling, and TableGen supports this with explicit bit (single bit) and bits (fixed-length sequence of bits) types:
class Enc<bits<7> op> {
  bits<10> Encoding;

  let Encoding{9-7} = 5;
  let Encoding{6-0} = op;
}

def InstA : Enc<0x35>;
def InstB : Enc<0x08>;
... will produce records:
def InstA {     // Enc
  bits<10> Encoding = { 1, 0, 1, 0, 1, 1, 0, 1, 0, 1 };
  string NAME = ?;
}
def InstB {     // Enc
  bits<10> Encoding = { 1, 0, 1, 0, 0, 0, 1, 0, 0, 0 };
  string NAME = ?;
}
So you can quite easily slice and dice bit sequences with curly braces, as long as the indices themselves are constants.

But the real killer feature is that so-called unset initializers, represented by a question mark, aren't fully resolved in bit sequences:
class Enc<bits<3> opcode> {
  bits<8> Encoding;
  bits<3> Operand;

  let Encoding{0} = opcode{2};
  let Encoding{3-1} = Operand;
  let Encoding{5-4} = opcode{1-0};
  let Encoding{7-6} = { 1, 0 };
}

def InstA : Enc<5>;
... produces a  record:
def InstA {     // Enc
  bits<8> Encoding = { 1, 0, 0, 1, Operand{2}, Operand{1}, Operand{0}, 1 };
  bits<3> Operand = { ?, ?, ? };
  string NAME = ?;
}
So instead of going ahead and saying, hey, Operand{2} is ?, let's resolve that and plug it into Encoding, TableGen instead keeps the fact that bit 3 of Encoding refers to Operand{2} as part of its data structures.

Together with some additional data, this allows a backend of TableGen to automatically generate code for instruction encoding and decoding (i.e., disassembling). For example, it will create the source for a giant C++ method with signature
uint64_t getBinaryCodeForInstr(const MCInst &MI, /* ... */) const;
which contains a giant constant array with all the fixed bits of each instruction followed by a giant switch statement with cases of the form:
case AMDGPU::S_CMP_EQ_I32:
case AMDGPU::S_CMP_EQ_U32:
case AMDGPU::S_CMP_EQ_U64:
// more cases...
case AMDGPU::S_SET_GPR_IDX_ON: {
  // op: src0
  op = getMachineOpValue(MI, MI.getOperand(0), Fixups, STI);
  Value |= op & UINT64_C(255);
  // op: src1
  op = getMachineOpValue(MI, MI.getOperand(1), Fixups, STI);
  Value |= (op & UINT64_C(255)) << 8;
  break;
}
The bitmasks and shift values are all derived from the structure of unset bits as in the example above, and some additional data (the operand DAGs) are used to identify the operand index corresponding to TableGen variables like Operand based on their name. For example, here are the relevant parts of the S_CMP_EQ_I32 record generated by the AMDGPU backend's TableGen files:
 def S_CMP_EQ_I32 {      // Instruction (+ other superclasses)
  field bits<32> Inst = { 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, src1{7}, src1{6}, src1{5}, src1{4}, src1{3}, src1{2}, src1{1}, src1{0}, src0{7}, src0{6}, src0{5}, src0{4}, src0{3}, src0{2}, src0{1}, src0{0} };
  dag OutOperandList = (outs);
  dag InOperandList = (ins SSrc_b32:$src0, SSrc_b32:$src1);
  bits
<8> src0 = { ?, ?, ?, ?, ?, ?, ?, ? };
  bits
<8> src1 = { ?, ?, ?, ?, ?, ?, ?, ? };
  // many more variables...
}
Note how Inst, which describes the 32-bit encoding as a whole, refers to the TableGen variables src0 and src1. The operand indices used in the calls to MI.getOperand() above are derived from the InOperandList, which contains nodes with the corresponding names. (SSrc_b32 is the name of a record that subclasses RegisterOperand and describes the acceptable operands, such as registers and inline constants.)

Hopefully this helped you appreciate just how convenient TableGen can be. Not resolving the ? in bit sequences is an odd little exception to an otherwise fairly regular language, but the resulting expressive power is clearly worth it. It's something to keep in mind when we discuss how variable references are resolved.

Mittwoch, Februar 21, 2018

TableGen #2: Functional Programming

This is the second part of a series; see the first part for a table of contents.

When the basic pattern of having classes with variables that are filled in via template arguments or let-statements reaches the limits of its expressiveness, it can become useful to calculate values on the fly. TableGen provides string concatenation out of the box with the paste operator ('#'), and there are built-in functions which can be easily recognized since they start with an exclamation mark, such as !add, !srl, !eq, and !listconcat. There is even an !if-builtin and a somewhat broken and limited !foreach.

There is no way of defining new functions, but there is a pattern that can be used to make up for it: classes with ret-values:
class extractBit<int val, int bitnum> {
  bit ret = !and(!srl(val, bitnum), 1);
}

class Foo<int val> {
  bit bitFour = extractBit<val, 4>.ret;
}

def Foo1 : Foo<5>;
def Foo2 : Foo<17>;
This doesn't actually work in LLVM trunk right now because of the deficiencies around anonymous record instantiations that I mentioned in the first part of the series, but after a lot of refactoring and cleanups, I got it to work reliably. It turns out to be an extremely useful tool.

In case you're wondering, this does not support recursion and it's probably better that way. It's possible that TableGen is already accidentally Turing complete, but giving it that power on purpose seems unnecessary and might lead to abuse.

Without recursion, a number of builtin functions are required. There has been a !foreach for a long time, and it is a very odd duck:
def Defs {
  int num;
}

class Example<list<int> nums> {
  list<int> doubled = !foreach(Defs.num, nums, !add(Defs.num, Defs.num));
}

def MyNums : Example<[4, 1, 9, -3]>;
In many ways it does what you'd expect, except that having to define a dummy record with a dummy variable in this way is clearly odd and fragile. Until very recently it did not actually support everything you'd think even then, and even with the recent fixes there are plenty of bugs. Clearly, this is how !foreach should look instead:
class Example<list<int> nums> {
  list<int> doubled =
      !foreach(x, nums, !add(x, x));
}

def MyNums : Example<[4, 1, 9, -3]>;
... and that's what I've implemented.

This ends up being a breaking change (the only one in the whole series, hopefully), but !foreach isn't actually used in upstream LLVM proper anyway, and external projects can easily adapt.

A new feature that I have found very helpful is a fold-left operation:
class Enumeration<list<string> items> {
  list<string> ret = !foldl([], items, lhs, item,
      !listconcat(lhs, [!size(lhs) # ": " # item]));
}

def MyList : Enumeration<["foo", "bar", "baz"]>;
This produces the following record:
def MyList {    // Enumeration
  list<string> ret = ["0: foo", "1: bar", "2: baz"];
  string NAME = ?;
}
Needless to say, it was necessary to refactor the TableGen tool very deeply to enable this kind of feature, but I am quite happy with how it ended up.

The title of this entry is "Functional Programming", and in a sense I lied. Functions are not first-class values in TableGen even with my changes, so one of the core features of functional programming is missing. But that's okay: most of what you'd expect to have and actually need is now available in a consistent manner, even if it's still clunkier than in a "real" programming language. And again: making functions first-class would immediately make TableGen Turing complete. Do we really want that?

Montag, Februar 19, 2018

TableGen #1: What has TableGen ever done for us?

This is the first entry in an on-going series. Here's a list of all entries:
  1. What has TableGen ever done for us?
  2. Functional Programming
  3. Bits
  4. Resolving variables
  5. DAGs
  6. to be continued 
Also: here is a talk (slides + video) I gave in the FOSDEM 2019 LLVM devroom on TableGen.

Anybody who has ever done serious backend work in LLVM has probably developed a love-hate relationship with TableGen. At its best it can be an extremely useful tool that saves a lot of manual work. At its worst, it will drive you mad with bizarre crashes, indecipherable error messages, and generally inscrutable failures to understand what you want from it.

TableGen is an internal tool of the LLVM compiler framework. It implements a domain-specific language that is used to describe many different kinds of structures. These descriptions are translated to read-only data tables that are used by LLVM during compilation.

For example, all of LLVM's intrinsics are described in TableGen files. Additionally, each backend describes its target machine's instructions, register file(s), and more in TableGen files.

The unit of description is the record. At its core, a record is a dictionary of key-value pairs. Additionally, records are typed by their superclass(es), and each record can have a name. So for example, the target machine descriptions typically contain one record for each supported instruction. The name of this record is the name of the enum value which is used to refer to the instruction. A specialized backend in the TableGen tool collects all records that subclass the Instruction class and generates instruction information tables that is used by the C++ code in the backend and the shared codegen infrastructure.

The main point of the TableGen DSL is to provide an ostensibly convenient way to generate a large set of records in a structured fashion that exploits regularities in the target machine architecture. To get an idea of the scope, the X86 backend description contains ~47k records generated by ~62k lines of TableGen. The AMDGPU backend description contains ~39k records generated by ~24k lines of TableGen.

To get an idea of what TableGen looks like, consider this simple example:
def Plain {
  int x = 5;
}

class Room<string name> {
  string Name = name;
  string WallColor = "white";
}

def lobby : Room<"Lobby">;

multiclass Floor<int num, string color> {
  let WallColor = color in {
    def _left : Room<num # "_left">;
    def _right : Room<num # "_right">;
  }
}

defm first_floor : Floor<1, "yellow">;
defm second_floor : Floor
<2, "gray">;
This example defines 6 records in total. If you have an LLVM build around, just run the above through llvm-tblgen to see them for yourself. The first one has name Plain and contains a single value named x of value 5. The other 5 records have Room as a superclass and contain different values for Name and WallColor.

The first of those is the record of name lobby, whose Name value is "Lobby" (note the difference in capitalization) and whose WallColor is "white".

Then there are four records with the names first_floor_left, first_floor_right, second_floor_left, and second_floor_right. Each of those has Room as a superclass, but not Floor. Floor is a multiclass, and multiclasses are not classes (go figure!). Instead, they are simply collections of record prototypes. In this case, Floor has two record prototypes, _left and _right. They are instantiated by each of the defm directives. Note how even though def and defm look quite similar, they are conceptually different: one instantiates the prototypes in a multiclass (or several multiclasses), the other creates a record that may or may not have one or more superclasses.

The Name value of first_floor_left is "1_left" and its WallColor is "yellow", overriding the default. This demonstrates the late-binding nature of TableGen, which is quite useful for modeling exceptions to an otherwise regular structure:
class Foo {
  string salutation = "Hi";
  string message = salutation#", world!";
}

def : Foo {
  let
salutation = "Hello";
}
The message of the anonymous record defined by the def-statement is "Hello, world!".

There is much more to TableGen. For example, a particularly surprising but extremely useful feature are the bit sets that are used to describe instruction encodings. But that's for another time.

For now, let me leave you with just one of the many ridiculous inconsistencies in TableGen:
class Tag<int num> {
  int Number = num;
}

class Test<int num> {
  int Number1 = Tag<5>.Number;
  int Number2 = Tag<num>.Number;
  Tag Tag1 = Tag<5>;
  Tag Tag2 = Tag<num>;
}

def : Test<5>;
What are the values in the anonymous record? It turns out that Number1 and Number2 are both 5, but Tag1 and Tag2 refer to different records. Tag1 refers to an anonymous record with superclass Tag and Number equal to 5, while Tag2 also refers to an anonymous record, but with the Number equal to an unresolved variable reference.

This clearly doesn't make sense at all and is the kind of thing that sometimes makes you want to just throw it all out of the window and build your own DSL with blackjack and Python hooks. The problem with that kind of approach is that even if the new thing looks nicer initially, it'd probably end up in a similarly messy state after another five years.

So when I ran into several problems like the above recently, I decided to take a deep dive into the internals of TableGen with the hope of just fixing a lot of the mess without reinventing the wheel. Over the next weeks, I plan to write a couple of focused entries on what I've learned and changed, starting with how a simple form of functional programming should be possible in TableGen.