I published the URL to this blog on atari age. The posts are in reverse chronological order but the best place to start is the beginning.
tms9900-gcc
Thursday, 29 February 2024
Wednesday, 28 February 2024
Dev, Release, Install, Test, Debug
Previous: Missing Extends Up: Intro
There are a few scripts in the repo for installing gcc and binutils as well as scripts to create new patches and debian packages.
I work exclusively on Linux (debian bookworm currently) and have a script to create a .deb package which can be used for debian, ubuntu and others. The .deb package installs bins, libs, etc, in /opt/tms9900-gcc by default. Many users though are on Windows or MacOS and I am relying on others to build packaged releases for those OSs since I can't.
Development
The src files are contained in the "dev" directory. Only modifications to gcc and binutils are stored in the repo. This means only storing about 50 files as opposed to over 60,000. The dev directory contains two copies of any files that have been modified. The original files for gcc are in gcc-4.4.0-orig and the modified files are in gcc-4.4.0. Similarly for binutils. New files (including "gcc/config/tms9900/tms9900.md") have a corresponding empty file in the orig directory. This allows diff to create a patch that includes all changes and all additions to the stock gcc-4.4.0 distribution.
Files to be edited are mostly in "dev/gcc-4.4.0/gcc/config/tms9900". This dir contains the md file as well as a few C and H files. It also contains some library asm functions which get included in libgcc.a
To rebuild the compiler in the dev hierarchy, from the "dev/gcc-4.4.0/build" directory, you can use the following commands to build and install gcc and libgcc:
make all-gcc |
This assumes of course this directory has previously been configured for builds, from the build directory, using:
../configure --prefix <target-dir> --target=tms9900 --enable-languages=c,c++ |
Release
The script mkpatches.sh will generate new patch files. It takes a version number as a parameter. It will update the gcc DATESTAMP and REVISION files with the current date and the supplied version number. The patch revision can be queried in code by accessing the __TMS9900_PATCH_MAJOR__ and __TMS9900_PATCH_MINOR__ macros which return integer values.
Install
The primary install tool is a script called install.sh. It downloads gcc-4.4.0 and binutils-2.19.1, applies the patches to them, builds the compiler and toolchain, and installs to the specified directory. This script will create a directory called build in the pwd and do all its work in there. It marks progress by creating progress files such as ".gcc_patched" and ".gcc_built". To do a clean build from updated patches, the easiest thing to do is "rm -rf build" or at the very least remove the progress files.
Test
Debug
#!/bin/bash INCLUDE=-I../libti99i CFLAGS="$2 $3 $4 $5 $6 -Wall -da -std=c99 -fomit-frame-pointer $INCLUDE" ../dev/gcc-4.4.0/build/gcc/cc1 $CFLAGS < $1
|
; addhi3-20 : (insn 20 4 21 <stdin>:6 (set (reg/f:HI 10 r10) ; (plus:HI (reg/f:HI 10 r10) ; (const_int -2 [0xfffffffffffffffe]))) 63 {addhi3} (nil)) dect r10 ; movhi-21 : (insn 21 20 22 <stdin>:6 (set (mem:HI (reg/f:HI 10 r10) [0 S2 A16]) ; (reg:HI 11 r11)) 12 {tms9900_movhi} (expr_list:REG_DEAD (reg:HI 11 r11) ; (nil))) mov r11, *r10
|
Dummy md
Thursday, 1 February 2024
Missing extend and truncate insns
Various fixes have been tried including adding a separate subreg pass to the compiler to detect when subregs have been inserted. I've disabled this pass as I don't think modifying the compiler is the way to go and in any case it didn't seem to be working as intended anyway.
There didn't seem to me to be a pattern to the dropped extends until I noticed it often happened when doing binary "and" operation. So while this C code:
char c; int i; int z; void f() { z = (c + i) & 0x1f; }
|
def f f movb @c, r1 movb r1, r2 sra r2, 8 mov @i, r1 a r2, r1 andi r1, >1F mov r1, @z b *r11 |
def f f movb @c, r1 a @i, r1 andi r1, >1F mov r1, @z b *r11 |
(insn 6 3 7 2 <stdin>:7 (set (reg:HI 22 [ c+-1 ]) (sign_extend:HI (mem/c/i:QI (symbol_ref:HI ("c") <var_decl 0x7f7e81e82000 c>) [0 c+0 S1 A8]))) 17 {extendqihi2} (nil)) (insn 7 6 8 2 <stdin>:7 (set (reg:HI 23) (plus:HI (reg:HI 22 [ c+-1 ]) (mem/c/i:HI (symbol_ref:HI ("i") <var_decl 0x7f7e81e820a0 i>) [2 i+0 S2 A16]))) 63 {addhi3} (expr_list:REG_DEAD (reg:HI 22 [ c+-1 ]) (nil)))
|
(insn 7 6 8 2 <stdin>:7 (set (reg:HI 23) (plus:HI (mem/c/i:HI (symbol_ref:HI ("i") <var_decl 0x7f7e81e820a0 i>) [2 i+0 S2 A16]) (subreg:HI (mem/c/i:QI (symbol_ref:HI ("c") <var_decl 0x7f7e81e82000 c>) [0 c+0 S1 A8]) 0))) 63 {addhi3} (nil))
|
Failed to match this instruction: (set (reg:HI 23) (plus:HI (sign_extend:HI (mem/c/i:QI (symbol_ref:HI ("c") <var_decl 0x7f7e81e82000 c>) [0 c+0 S1 A8])) (mem/c/i:HI (symbol_ref:HI ("i") <var_decl 0x7f7e81e820a0 i>) [2 i+0 S2 A16])))
|
(insn 7 17 8 2 <stdin>:7 (set (reg:HI 1 r1 [23]) (plus:HI (reg:HI 1 r1 [+-1 ]) (mem/c/i:HI (symbol_ref:HI ("i") <var_decl 0x7f7e81e820a0 i>) [2 i+0 S2 A16]))) 63 {addhi3} (nil))
|
int f (char c, unsigned int len) { return c+len; }
|
def f f ; extendqihi2-8 : (insn 8 5 9 <stdin>:2 (set (reg:HI 1 r1 [orig:26 c+-1 ] [26]) ; (sign_extend:HI (reg:QI 1 r1 [ c ]))) 17 {extendqihi2} (nil)) sra r1, 8 ; addhi3-14 : (insn 14 9 20 <stdin>:4 (set (reg/i:HI 1 r1) ; (plus:HI (reg:HI 1 r1 [orig:26 c+-1 ] [26]) ; (reg:HI 2 r2 [ len ]))) 63 {addhi3} (expr_list:REG_DEAD (reg:HI 2 r2 [ len ]) ; (nil)))
|
(ORIGINAL_REGNO (operand) == REGNO (operand)) |
Monday, 29 January 2024
Forcing memory constants
I realised at some point that it was more efficient in many cases just to store a literal value in code memory and refer to it. So comparing a byte value went from:
li r1,>400 |
to this:
LC0 |
So 8 bytes are reduced to 5 bytes and 2 instructions to 1 instruction. And if LC0 gets referenced again, the savings are even more. It also fixed some issues with greater than or less than that had arisen when comparing a byte to a constant loaded into a reg.
This was done using the define_insn_and_split pattern as follows:
(define_insn_and_split "cmpqi" [(set (cc0)
(compare (match_operand:QI 0 "nonimmediate_operand" "=g")
(match_operand:QI 1 "general_operand" "g")))]
"" "cb %0, %1" "CONST_INT_P (operands[1])" [(set (cc0) (compare (match_dup 0) (match_dup 1)))] { tms9900_debug_operands ("split_cmpqi", NULL_RTX, operands, 2); int val = INTVAL (operands[1]) & 0xff; operands[1] = force_const_mem (QImode, GEN_INT (val)); } [(set_attr "length" "6")] )
|
Thursday, 21 December 2023
Instruction patterns and compiler passes
Previous: Machine Descriptions Next: Forcing Memory Constants Up: Intro
There are several ways to create an instruction pattern, define_insn, define_expand, define_split, etc. Which one you choose can depend on which compiler pass you expect them to be used in.
RTL pattern matching
An important realisation I made early on is that in all but the final stages of compilation, gcc is using the RTL to drive aspects like as register allocations and optimisations. The compiler knows nothing about the opcodes - it doesn’t even see them - they are emitted directly to the output file. So while it feels like the opcode output is the most important thing to focus on, I think getting the instruction and RTL patterns right is actually more likely to get the compiler to output efficient code. Knowing when to use each method of defining a pattern took me a while to understand.
In the TMS9900 back-end I have used these patterns:
- define_insn
- define_expand
- define_insn_and_split
- define_peephole2
define_insn
The standard instruction pattern is defined using define_insn. This is generally straightforward but does have limitations. When there are many alternatives and options the output code can become convoluted. It can also become very repetitive when slight different insns have to share similar methods (signed and unsigned mul/div for example).
The named insns are used initially to build the RTL block list. Later passes match patterns and will use named patterns and unnamed patterns.
define_expand
define_insn_and_split
define_peephole2
(define_peephole2 |
[(set (match_operand:QI 0 "register_operand" "") |
(match_operand:QI 1 "memory_operand" "")) |
(set (match_operand:QI 2 "memory_operand" "") |
(match_dup 0))] |
"peep2_reg_dead_p(2, operands[0])" |
[(set (match_operand:QI 2 "memory_operand" "") |
(match_operand:QI 1 "memory_operand" ""))] |
{ |
tms9900_debug_operands ("peep-movqi-mem-mem", NULL_RTX, operands, 3); |
} |
) |
Problems during compiler passes
- expand - The expand pass expands all the define_expand patterns.
- combine - The combine pass looks for insns that can be merged together.
- reload - The reload pass reduces the register usage and spills values to the stack when it can't allocate a register
- ira - The ira (integrated register allocator) actually does the register allocation.
- peephole2 - The peephole pass looks for sequences of instructions that match a pattern.
- final - emits the actual assembly
Expand and insn lists
As the compiler does its thing, it generates lists of insns for each function. Because it doesn't actually emit any code until a function has been completely parsed (as far as I can tell) we know a lot about the function even before we have emitted any code. For example, we know if it is a leaf (doesn't call any other functions) and don't need to save R11 in the function prologue.
Insns are numbered using a UID. In the dumps for each pass, each insn is listed with its UID and the UID of the previous and next insn. This is useful for finding where insns were combined or eliminated.
If we compiled this code fragment:
void f (void) { int x = 3; } |
gcc would create a list of blocks (with only one block in this case). In the file gccdump.128r.expand we would see:
(insn 6 5 11 3 <stdin>:3 (set (mem/c/i:HI (reg/f:HI 17 virtual-stack-vars) [0 x+0 S2 A16]) (mem/u/c/i:HI (symbol_ref/u:HI ("*LC0") [flags 0x2]) [0 S2 A16])) -1 (nil)) |
This does look confusing initially, but essentially it says it is an insn, numbered 6, previous is 5, next is 11, input is stdin line 3 and it is doing a "set" of a HI reg to a symbol. It has allocated a pseudo-reg 17 for now to hold the result. Any reg higher than the highest physical reg (r15 in our case) is a pseudo-reg. This will be allocated physical regs later in the ira and reload passes.
The compiler will look for a named insn called "movhi" (move a halfword) to do this operation. Our implementation of movhi is a define_expand which emits a CLR or a SETO if it can or otherwise emits a tms9900_movhi insn. The expand also replaces an immediate with a const if required.
In the next pass (gccdump.133r.vregs) we can see a reference to tms9900_movhi and an allocation on the stack frame (R9 = frame pointer):
(insn 6 5 11 3 <stdin>:6 (set (mem/c/i:HI (reg/f:HI 9 r9) [0 x+0 S2 A16]) (mem/u/c/i:HI (symbol_ref/u:HI ("*LC0") [flags 0x2]) [0 S2 A16])) 12 {tms9900_movhi} (nil)) |
(insn 6 3 14 2 <stdin>:6 (set (mem/c/i:HI (reg/f:HI 10 r10) [0 x+0 S2 A16]) (mem/u/c/i:HI (symbol_ref/u:HI ("*LC0") [flags 0x2]) [0 S2 A16])) 12 {tms9900_movhi} (nil))
|
Combiner
; This handles reverse-order not-and combinations (define_insn "*not_andhi" [(set (match_operand:HI 0 "nonimmediate_operand" "=rR>,rR>,Q,Q") (and:HI (not:HI (match_operand:HI 1 "nonimmediate_operand" "rR>,Q,rR>,Q")) (match_operand:HI 2 "nonimmediate_operand" "0,0,0,0")))] "" { tms9900_debug_operands ("*not_andhi", insn, operands, 3); return "szc %1, %0"; } [(set_attr "length" "2,4,4,6")])
|
int x;
int y; void f() { x = x&~y; }
|
def f
f szc @y, @x b *r11
|
Reload / IRA
Peephole
Final
Tuesday, 19 December 2023
TMS9900 Machine Description
Previous: Coding in C Next: Instruction Patterns Up: Intro
Insns
The machine description file gcc/config/tms9900.md is the primary definition of the TMS9900 backend. Machine descriptions for gcc in general are documented here. They use the .md extension (an unfortunate clash with markdown) and essentially list a series of instructions with patterns (called "insns") that emit assembly opcodes to achieve a result. The machine description is parsed by the compiler build system and a number of header and source code files are emitted which are then built in to the compiler. In many places in the md file it is possible to include C code. Names of named insns are used to generate a function called gen_<insn-name> so names must follow the same rules as C function naming (can't include symbols such as "-" or "$" for example). The insn patterns are used by the compiler to match the RTL generated by the compiler front-end.
Insn names
Insns can be named or unnamed. Named insns are either standard names (like "movqi") or else should have a unique name that won't clash with another backend, typically by prepending the back end name (e.g. "tms9900_movqi"). Unnamed insns can simply have an empty name ("") or a name starting with an asterisk ("*my_insn"), in which case the name is emitted in debug output but otherwise treated as empty. In early compilation, gcc will look for named insns to match patterns in the RTL. For example, it will look for "movqi" if it wants to move a byte or "addhi3" if it wants to add 16-bit values. In later stages, it will try to match patterns which is where unnamed insns come in.
NOTE: only named insns can be referred to by define_expand, define_split, etc.
Library calls
When the compiler can't find a named insn to perform an operation it will often instead emit a library call so floating point methods are defined in libgcc. This can't be done for fundamental insns like "movqi" but can be done for example to add two 32-bit values. If gcc can't find an insn called "addsi3" it will emit "bl __addsi3" instead.
Sometimes, the compiler will try to generate its own code for missing insns. For example this code:
long x; |
int y; |
long z; |
void f(void) |
{ |
z = x>>y; |
} |
Operands
Each insn takes several operands, typically 2 or 3. Operand values have a mode which indicates their size. The size of standard int (SI) in gcc is 32-bits. A half-word (HI) is 16 bits and a quarter-word (QI) is 8 bits. Other types like double-word (DI) are not used in the TMS9900 backend. There are also float and double (SF and DF) modes but there are no insns defined for these since the TMS9900 does not have any floating point hardware capability. One unusual mode is the condition code (CC) which is set by compare operations. This is implicitly "clobbered" by insns in gcc-4.4.0. The TMS9900 does an implicit compare to zero on most instructions.
The operand modes and count are usually appended to the end of the insn so "extendqihi2" means perform an extend operation where the src is QI (8 bits), the dest is HI (16 bits) and there are two operands.
Constraints and lengths
The type of the operand is defined using a single letter called a constraint. Constraints can be grouped together and comma-separated into alternatives. The values use in the TMS9900 backend are:
- r = a register
- R = a register indirect reference to a memory location (e.g. MOV *R4 ....)
- > = like R, but with a post increment (e.g. MOV *R4+....)
- Q = a label address of a memory location (e.g. MOV @LABEL ...)
- i = an immediate value (e.g. LI R1,>1234)
- n = a constant number, excludes labels and other constants
- 0 = not really a constraint. It just means this operand must be the same as operand[0] for this alternative. It could also be 1, 2... to copy constraints for other operands.
For r, R, > and Q, the opcode is usually the same, as the addressing mode doesn't change the opcode length, but the length of the instruction varies if using Q, so it makes sense to break these out in separate constraint "alternatives". Each alternative can have a slightly different implementation and/or can have a different length.
In addition, some values can use shorter methods for set, add, etc. Loading a zero into a register can be achieved with "CLR Rx" instead of "LI Rx,0"; loading -1 can be done with "SETO rx"; adding 1 to a reg can be done with "INC Rx" instead of "AI Rx,1" and so on. Each of these opcodes is 2 bytes shorter than an immediate. To make these easier to detect as alternatives, these constraints are also defined:
- L = +2 or -2; used to emit INCT or DECT
- M = -1; use to emit SETO or DEC
- N = +1; used to emit INC
- O = 0; used to emit CLR
NOTE: It is important with these constraints that they appear in the alternatives list before a broader constraint like 'i'. For example, the addhi3 operand2 constraints are:
(match_operand:HI 2 "general_operand" "rR>,Q,LMN,LMN,i,rR>,Q, rR>,Q") |
Because 'i' will match any immediate value so must be after the 'LMN' constraints which are more specific.
Constraints can also be preceded by modifiers. The equals ("=") modifier means this value is set. This is always operand[0] in the TMS9900 backend. The percent ("%") modifier means operands are commutative, i.e. can be swapped because A+B is the same as B+A, etc. Sometimes this makes the compiler's life easier.
Having commutative operands is actually particularly important for the TMS9900 since arithmetic operators only take two operands. So there is no opcode to do for example x=a+b. Instead, we need to do x=a;x+=b. For this reason, generally, operand[1] uses the "0" constraint to say it must match operand[0].
Predicates
When declaring an insn, each operand is given a predicate that gcc uses to match operand types. The predicates used in the TMS9900 backend are:
- general_operand - anything will match
- register_operand - only a register will match
- nonimmediate_operand - anything except an immediate (constant) value will match
- const_int_operand - only a constant integer will match
- memory_operand - only a memory reference (R or Q, see below) will match
(set (match_operand:HI 0 "register_operand" "=r") ... ) |
(set (match_operand:HI 0 "general_operand" "=rR,Q") ... ) |
Extending and truncating
Changing an operand to a larger width is called an extend. Reducing it a shorter width is called a truncate. There is only way to truncate an operand but extending can be zero extend or sign extend. We can extend from QI to HI, HI to SI or QI to SI and truncate from SI to HI, SI to QI or HI to QI. This is gives us 9 unique insn patterns we need to implement!
Truncating or extend from SI to HI and vice versa is quite straightforward. It's just a matter of discarding a word or adding a zero word or an inverted word (>FFFF) if sign extending a negative value.
Generally, we can't truncate or extend a memory reference, only a register. There is usually no use case for this anyway as a memory location never changes size. But a value in a register could be loaded as a QI and then accessed as a HI given that gcc thinks low-order byte values are accessible in a wider word.
Going from SI to QI and vice versa needs to be done in two steps, SI to HI and then SI to QI, so these are candidates for define_expand. More on define_expand later. To be safe, I use a scratch reg for the intermediate value. For example:
(define_expand "zero_extendqisi2" |
[(set (match_operand:SI 0 "register_operand" "") |
(zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "")))] |
"" |
{ |
tms9900_debug_operands ("zero_extendqisi2", NULL_RTX, operands, 2); |
rtx tmp = gen_reg_rtx (HImode); |
emit_insn (gen_zero_extendqihi2 (tmp, operands[1])); |
emit_insn (gen_zero_extendhisi2 (operands[0], tmp)); |
DONE; |
} |
) |
NOTE: define_expand C code branches must end in either a DONE or a FAIL, or the insn pattern will remain in the list. These are macros that perform a goto in the generated code. Omitting these won't cause any error to be emitted but will cause really weird behaviour in the output that will take forever to track down. Been there, done that.
The trickier conversions are QI to HI and vice versa. For HI to QI truncation we don't care about the low-order byte and we can emit "SWPB" to just change the order of the bytes. To extend QI to HI we do care and must fill the high-order byte with zeros or a sign extension. Sign extension is done using an arithmetic shift "SRA Rx,8"and zero extension with a logical shift "SRL Rx,8".
Lengths, jumps and branches
Each "define_insn" and (not "define_expand" or "define_split") has an array of length values at the end, like this:
[(set_attr "length" "4,2,4,4,6")] |
These lengths are the number of bytes emitted by each of the alternatives identified by the constraints. The number of length attributes must exactly match the number of alternatives (e.g. 5 alternatives must have 5 lengths). These lengths are used to determine the distance to a label for a jump instruction. In TMS9900 we can jump backwards by 256 bytes or forward by 254 bytes. If a longer jump is needed we need to emit a branch ("B") instruction.
In some cases we can't determine the exact length of an insn because it may vary depending on the circumstances at the time. For example, we may have to emit an additional SWPB to correct a dropped truncate. In these cases I've made the length to be the worst case length. This does mean we will occasionally emit a branch where a jump would have been sufficient but this should be rare.
The function output_branch() in gcc/config/tms9900/tms9900.c determines what opcodes to output for all conditional jumps and will replace a jump with an inverted jump over a branch if distance requires it.
Tailcalls
If the optimising compiler sees that the end sequence of two functions are identical, it will emit insns for only one of the functions and issue a branch instruction in the other and use a common epilogue and return. This is called a tailcall. It is identified in the md by checking the macro SIBLING_CALL_P(insn).
Defining insns
The main way to define an insn is by using "define_insn". To define the insn "movhi", which moves one word to another, we include this in the md file:
(define_insn "movhi" |
[(set (match_operand:HI 0 "nonimmediate_operand" "=rR>,rR>,Q, Q,r") |
(match_operand:HI 1 "general_operand" "rR>, Q, rR>,Q,i"))] |
"" |
{ |
if (which_alternative == 4) |
{ |
return("li %0, %1"); |
} |
else |
{ |
return("mov %1, %0"); |
} |
} |
[(set_attr "length" "2,4,4,6,4")]) |
Here we have defined the name (movhi), the operation (set) and described the two operands. The first (destination) operand is a "non_immediate" operand, i.e. any type except an immediate. The second (source) is a "general_operand" which can be anything. There are 5 alternatives defined. The Q constraint adds 2 bytes to the length of the opcode but r, R and > are the same length, so each combination of rR> and Q are broken out as separate alternatives. "MOV r,r" is 2 bytes but "MOV @label1,@label2" would be 6 bytes. The final constraint is for immediates. Since TMS9900 does not allow immediate values to be used with MOV we instead emit load immediate (LI) for alternative number 4. This operation is 4 bytes long.
The empty string after the pattern is the condition. This insn will only be used if the condition is met. This is usually empty.
NOTE: The entire insn is surrounded by a pair of brackets (). The pattern and the set_attr are each surrounded by a pair of square brackets []. Each part of the pattern is surrounded by brackets as well (). The nested brackets gets confusing. My most common cause of build failures is a missing or misplaced closing bracket.
Emitting code
There are many ways to emit code in the md file. The example above uses C code as indicated by the opening brace ("{"). If instead of a brace, there is a double quote, the text is output directly. For an insn like abs which has only one operand, could have done this:
(define_insn "abshi2" |
Or we could also list instructions for each alternative like this (although there is no point as the same instruction is used for both alternatives):
(define_insn "abshi2" |
Or within the C code, we could just do { return "abs %0"; }
Or within C, we could call you can use { output_asm_insn ("abs %0", operands); return ""; }
Or if you only have a short piece of C code you could add a double quote followed by an asterisk and then a C statement with any double quotes escaped like this: "* return \"abs %0\""
I think there are just too many alternatives here tbh. I think mixing and matching methods of emitting output is going to be error prone. I personally like consistency so have tried to always use C code to emit opcodes. I also try and avoid mixing output_asm_insn("abs %0") and return "abs %0" in the same insn.
NOTE: since the MD file is compiled to C code, omitting a return will cause random output. Been there done that too.
Thursday, 14 December 2023
Coding machine descriptions in C
rtx objects
The compiled RTL is represented internally by an object called "rtx". This object is used for lots of things, insns, operands, constants, etc. There are several macros defined to get information about an rtx object.
Testing whether an rtx is something is done using macros that have _P appended. So to test if an rtx is an insn, you can call INSN_P(rtx); to test if it is a reg you can call REG_P(rtx); to test for numbers use CONST_INT_P(rtx), and so on. For the general case, GET_CODE(rtx) will return an enum with the type of an rtx. GET_RTX_NAME(code) will return a printable string containing the code name.
NOTE: I've found the code name to reliably match the enum so when debugging it is much easier to find an insn with a name of "plus" than it is to wade through the enum definition file trying to figure out which one is number 573. [1]
If it is a reg or memory location, it will have a mode. In C code, the modes are defined by an enum as QImode, HImode, SImode, etc. To get the mode of an rtx, the macro GET_MODE(rtx) can be used. To get a printable name, the macro GET_MODE_NAME(mode) can be used.
Registers
Registers have defines for convenience such as HARD_R0_REGNUM or FRAME_POINTER_REGNUM (aka HARD_R9_REGNUM). The stack pointer is also available through a global rtx called stack_pointer_rtx.
NOTE: be sure to wrap register values with gen_rtx_REG() before passing to any functions that expect an rtx parameter. I'm sure there will be a warning, but it will get lost in the big sea of warnings and then gcc will just crash.
Emitting opcodes
C code can emit opcodes. This is done in functions tms9900_expand_prologue() for example. The emit_insn() function will emit an insn rtx into the output list. Each named insn in the md file will have a corresponding function with a "gen_" prefix so you can call gen_movhi() and so on to emit these insns. The functions take an array of rtx objects as parameters for operands.
Operands
Operands are passed to insns as an array of rtx objects. So operand[0] is the first rtx and so on. I did once have a concern that we shouldn't modify rtx objects passed to insns but that seems to be a valid thing to and many other backends do it all the time. This is useful if we want to replace an operand with something else or swap them around, etc.
NOTE: be careful not to access operands with a higher array index than your insn defines. As with any C code there is no bounds checking on the array. If an insn takes 3 operands and you try to access operand[3] the compiler will probably (make that hopefully) seg fault.
[1] I just made that up, I've no idea what the code is for PLUS.
I published the URL to this blog on atari age. The posts are in reverse chronological order but the best place to start is the beginning .
-
Previous: Intro Next: TMS9900 Tools and libraries The tools in binutils seem very mature. I have not found any issues with ...
-
Next : Scope My first computer was a TI99/4A in 1982. I had a mini-memory module with a whopping 4KiB of RAM that came with a line-by-line ...
-
Previous: Scope Next: Coding in C Up: Intro Background The TMS9900 has some unique features among proces...