32-Bit Multiplication with a 64-Bit Result
Coded in Thumb (ARM) | July 15, 2011
I worked on a 32-bit microcontroller for a summer intern project. Due to the limitations of
the ARM Thumb code, multiplying 32-bit numbers results in a loss of precision if the answer
can only be represented as a 64-bit number. The multiplying instruction, MULS, simply
truncates the top 32 bits of the answer. To solve this, I figured out a way to get a
64-bit number resulting from multiplying two 32-bit inputs, using 32-bit registers to represent the result.
Introduction
I first tried working this out on paper before getting it to work in Thumb. Say we want to multiply two numbers A and B. Both A and B will each take up one 32-bit register. Since A and B are represented in binary, they can be split into a top half and bottom half, like so.
A
* B
= Result
--------------- ---------------
-----------
| A_top | A_bot | *
| B_top | B_bot | =
| Top | Bot |
--------------- ---------------
-----------
This can then be taken as a product of two sums.
(A_top<<16 + A_bot) * (B_top<<16 + B_bot)
= (A_top<<16)(B_top<<16) + (A_top<<16)(B_bot)
+ (B_top<<16)(A_bot) + A_bot*B_bot
= (A_top*B_top)<<32 + (A_top*B_bot + B_top*A_bot)<<16 +
A_bot*B_bot
Top can be set as (A_top*B_top), since (A_top*B_top) << 32
moves it over by one register. Similarly, Bot can be set as (A_bot*B_bot).
LSRS R7, R6, #16 ; top half of Mid
ADD R1, R7 ; add to Top
The hard part is the middle 32 bits, (A_top*B_bot + B_top*A_bot) << 16, since
half of it falls in Top and half falls in Bot. This can be dealt
with by first adding the top half of (A_top*B_bot + B_top*A_bot) << 16 (which we
will call Mid from now on) to Top. To get this top half, shift
Mid right by 16 places.
Next, add the bottom half of Mid to Bot. To get this bottom half, shift
Mid left by 16 places. This will result with the bottom half of Mid followed by
16 zeros, which is what we want.
Checking for Overflow
The problem with only using 32-bit registers is that it is possible that overflow will occur. The overflow amount needs to be added to the right registers to get the correct result.
LSLS R7, R6, #16 ; bottom half of Mid
ADDS R0, R7 ; add to Bot
ADC R1, R7 ; add the carry to Top
The first part to check is the result of adding the bottom half of Mid to
Bot. Add 1 (the carry) to Top if there is an overflow. If
(A_top*B_bot + B_top*A_bot) also causes an overflow, it also has to be added to
Top.
ADDS R6, R7, R6 ; A_top*B_bot + B_top*A_bot stored in R6
ADC R1, R7 ; Overflow added to Top
This code does not deal with negative numbers.
Negation of 64-Bit Numbers with 32-Bit Registers has some information on this.