Monday, 29 January 2024

Forcing memory constants

The TMS9900 can do some operations directly with constants, but not that many.  There is LI to load a reg, CI to compare, AI to add and so on.  But there are many things it can't do, such as load a constant directly to a memory location, compare a memory value to a constant, or do anything at all with byte constants.

I realised at some point that it was more efficient in many cases just to store a literal value in code memory and refer to it.  So comparing a byte value went from:

        li       r1,>400
        cb       @label,r1

to this:

LC0
        byte 4
        ...
        cb       @label, @LC0

So 8 bytes are reduced to 7 (a 6-byte CB plus the 1-byte constant) and 2 instructions to 1 instruction.  And if LC0 gets referenced again, the savings are even greater, since each further compare costs 6 bytes instead of 8.  It also fixed some issues with greater-than or less-than comparisons that had arisen when comparing a byte to a constant loaded into a reg.
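To make the arithmetic concrete, here is a rough sketch in C.  It assumes the usual TMS9900 encodings (LI is two words, CB with one symbolic operand is two words, CB with two symbolic operands is three words, which matches a "length" of 6) and ignores any padding the assembler may add after the byte constant.  The function names are made up for illustration.  It also shows why the register version loads >400 rather than 4: byte instructions use the high byte of the register.

```c
/* Byte instructions such as CB use the high byte of a 16-bit register,
   so a byte constant loaded with LI has to sit in bits 15..8. */
unsigned li_immediate_for_byte(unsigned char value)
{
    return ((unsigned)value << 8) & 0xffffu;
}

/* Register form: LI (4 bytes) + CB @label,reg (4 bytes) per compare.
   These sizes are assumptions based on the standard encodings. */
unsigned bytes_register_form(unsigned compares)
{
    return compares * (4u + 4u);
}

/* Memory form: CB @label,@const (6 bytes) per compare, plus a single
   1-byte constant shared by all of them.  Assumes at least one compare. */
unsigned bytes_memory_form(unsigned compares)
{
    return compares * 6u + 1u;
}
```

With these assumptions one compare costs 8 bytes in the register form and 7 in the memory form, and three compares against the same constant cost 24 versus 19.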

This was done using the define_insn_and_split pattern as follows:

(define_insn_and_split "cmpqi"
  [(set (cc0)
        (compare (match_operand:QI 0 "nonimmediate_operand" "=g")
                 (match_operand:QI 1 "general_operand"      "g")))]
  ""
  "cb   %0, %1"
  "CONST_INT_P (operands[1])"
  [(set (cc0)
        (compare (match_dup 0)
                 (match_dup 1)))]
  {
    tms9900_debug_operands ("split_cmpqi", NULL_RTX, operands, 2);
    /* No 0xff mask here: see the note on GEN_INT below.  */
    int val = INTVAL (operands[1]);
    operands[1] = force_const_mem (QImode, GEN_INT (val));
  }
  [(set_attr "length" "6")]
)

In gcc13, this can become a define_insn_and_rewrite to avoid duplicating the pattern, but this works well in gcc4.  The insn part just emits CB in all cases.  The split part has a condition of CONST_INT_P (operands[1]), in which case it replaces operands[1] with the result of force_const_mem, which places the constant in memory and returns a reference to it.  This effectively converts the "i" constraint into a "Q" constraint.

NOTE: for some reason that I haven't yet figured out, adding proper constraints causes the compiler error "unable to generate reloads" to be emitted.  This happens even if all constraint types are included.  Since I know the immediate case has been rewritten, I just replaced the constraints with "g" and this worked.  There must be something I'm missing between "g" and "rR>Qi".  It does mean the length is too long for some alternatives but that's only a very minor inefficiency.

NOTE: I found that it is important to call GEN_INT(val) without any mask.  Adding a mask with 0xff caused a compiler assert in combine.c:do_SUBST() when it compares the value against trunc_int_for_mode().  Just removing the mask solved this issue.
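My understanding of why the mask trips the assert: GCC expects a CONST_INT to be stored sign-extended for its mode, and trunc_int_for_mode() produces exactly that form.  The sketch below is not GCC code.  sign_extend_qi is a hypothetical stand-in for trunc_int_for_mode (value, QImode), there only to show which values disagree after masking.

```c
/* Hypothetical stand-in for trunc_int_for_mode (value, QImode):
   keep the low 8 bits and sign-extend them. */
long sign_extend_qi(long value)
{
    long low = value & 0xff;
    return (low & 0x80) ? low - 0x100 : low;
}

/* The masked form that the original split code produced. */
long mask_qi(long value)
{
    return value & 0xff;
}

/* Nonzero when the masked value is already in sign-extended form,
   i.e. when a check against the truncated value would pass. */
int mask_is_canonical(long value)
{
    return mask_qi(value) == sign_extend_qi(mask_qi(value));
}
```

Under this model, constants from 0 to 127 come through the mask unchanged, but anything with the top bit set (for example -1, which masks to 255) no longer matches its sign-extended form.  That would explain why only some byte constants hit the assert.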





I published the URL to this blog on AtariAge.  The posts are in reverse chronological order but the best place to start is the beginning.