The TMS9900 can do some operations directly with constants, but not that many. There is LI to load a reg, CI to compare, AI to add and so on. But there are many things it can't do, such as load a constant directly to a memory location, compare a memory value to a constant, or do anything at all with byte constants.
I realised at some point that it was more efficient in many cases just to store a literal value in code memory and refer to it. So comparing a byte value went from:
li r1,>400
to this:
LC0
So 8 bytes are reduced to 5 and 2 instructions to 1. If LC0 is referenced again, the savings are even greater. It also fixed some issues with greater-than and less-than comparisons that had arisen when comparing a byte to a constant loaded into a register.
This was done using the define_insn_and_split pattern as follows:
(define_insn_and_split "cmpqi"
  [(set (cc0)
        (compare (match_operand:QI 0 "nonimmediate_operand" "=g")
                 (match_operand:QI 1 "general_operand" "g")))]
  ""
  "cb %0, %1"
  "CONST_INT_P (operands[1])"
  [(set (cc0)
        (compare (match_dup 0)
                 (match_dup 1)))]
  {
    tms9900_debug_operands ("split_cmpqi", NULL_RTX, operands, 2);
    /* Pass the value through unmasked: masking with 0xff triggers the
       combine.c assert described in the second note below.  */
    int val = INTVAL (operands[1]);
    operands[1] = force_const_mem (QImode, GEN_INT (val));
  }
  [(set_attr "length" "6")]
)
In gcc13, this could become a define_insn_and_rewrite to avoid duplicating the pattern, but define_insn_and_split works well in gcc4. The insn part just emits CB in all cases. The split part has the condition CONST_INT_P (operands[1]); when that holds, it replaces operands[1] with the result of force_const_mem, which emits the constant into memory. This effectively converts the "i" constraint into a "Q" constraint.
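The reason a repeated reference to LC0 costs nothing extra is that equal constants share one pool entry. The following C fragment is only a toy model of that idea, not GCC's actual constant pool code; the names byte_pool and pool_intern are made up for illustration, and the fixed pool size is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_MAX 16  /* arbitrary limit for this sketch */

/* Toy literal pool: slot 0 plays the role of LC0, slot 1 of LC1, and so on. */
struct byte_pool {
    signed char values[POOL_MAX];
    size_t count;
};

/* Return the slot holding `value`, adding a new slot only when the
   value is not in the pool yet. */
size_t pool_intern(struct byte_pool *pool, signed char value)
{
    for (size_t i = 0; i < pool->count; i++) {
        if (pool->values[i] == value)
            return i;               /* reuse the existing literal */
    }
    assert(pool->count < POOL_MAX); /* the sketch does not grow the pool */
    pool->values[pool->count] = value;
    return pool->count++;
}
```

Interning the byte 4 twice returns the same slot both times, so only the first comparison against 4 pays for the extra byte of storage.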
NOTE: for some reason that I haven't yet figured out, adding proper constraints causes the compiler error "unable to generate reloads" to be emitted. This happens even if all constraint types are included. Since I know the immediate case has been rewritten, I just replaced the constraints with "g" and this worked. There must be something I'm missing between "g" and "rR>Qi". It does mean the length is too long for some alternatives but that's only a very minor inefficiency.
NOTE: I found that it is important to call GEN_INT(val) without any mask. Adding a mask with 0xff caused a compiler assert in combine.c:do_SUBST() when it compares the value against trunc_int_for_mode(). Just removing the mask solved this issue.
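A likely explanation for the assert is that GCC expects a CONST_INT to be stored sign-extended for its mode, which is what trunc_int_for_mode() produces, so a byte such as -1 masked to 255 no longer matches. The C sketch below only models that check for 8-bit values; sign_extend_byte and is_canonical_byte are hypothetical helpers written for this illustration, not GCC functions.

```c
#include <assert.h>

/* Hypothetical helper: keep the low 8 bits of `value` and sign-extend
   them, mimicking what truncation to an 8-bit mode would give. */
long sign_extend_byte(long value)
{
    long low = value & 0xff;
    return (low & 0x80) ? low - 0x100 : low;
}

/* Returns 1 when `value` is already in sign-extended 8-bit form,
   which is the property the failing assert is assumed to check. */
int is_canonical_byte(long value)
{
    return value == sign_extend_byte(value);
}
```

Under this model, -1 passes the check, while -1 & 0xff (that is, 255) fails it. Small positive values such as 4 pass either way, which would explain why the mask only causes trouble for bytes of 0x80 and above.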