Disassemblies courtesy of: https://defuse.ca/online-x86-assembler.html
DAA Instruction
Recently, I was REing some old 8088[1] firmware, as one does, and noticed a mysterious sequence of instructions that I've seen a few times over the years:
and al,0fh
add al,90h
daa
adc al,40h
daa
I know what this instruction sequence does because a programmer from the 80s named Chuck Guzis told me. That was in 2014.[2] However, it wasn't until a few days ago that I finally figured out how and why this instruction sequence works. And since I find string formatting fascinating, I figured it might be fun to write about this instruction sequence.
As spoiled by the title, the above sequence converts a 4-bit nibble, which can take on the values `0h` to `Fh`, into the 8-bit ASCII byte for the equivalent hex digit: `0` to `9` map to `30h` to `39h`, and `A` to `F` map to `41h` to `46h`. I can understand why this trick was popular back when every byte mattered: the sequence is only 8 bytes long, has no branches, and touches only `AL` and `FLAGS`.
Even if you don't know 8086 or x86 assembly, two of these lines can seamlessly translate to a higher-level language like Rust, if we pretend that register `AL` is a variable, say `my_al`:
`and al,0fh` is equivalent to the `BitAndAssign` trait:
let mut my_al: u8 = ...; // Pretend we got a number from somewhere.
my_al &= 0xF;
`add al,90h` is equivalent to the `Add` trait on `u8`:
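A rough sketch of those two lines together, assuming `my_al` stands in for `AL` and ignoring `FLAGS` entirely (the function name `first_two_steps` is made up for illustration):

```rust
/// Hypothetical helper: the first two instructions of the sequence,
/// with `my_al` standing in for the AL register. FLAGS are not modeled.
fn first_two_steps(mut my_al: u8) -> u8 {
    my_al &= 0x0F; // and al,0fh: keep only the low nibble (0x00..=0x0F)
    my_al += 0x90; // add al,90h: at most 0x0F + 0x90 = 0x9F, so no overflow
    my_al
}
```

Since the masked value is at most `0xF`, the sum is at most `0x9F`, so the `+=` cannot overflow a `u8`.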
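`adc` (add with carry) works like `add`, except that it also adds in the carry flag left by the previous instruction.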
However, what does `daa` do? According to Mr. Guzis, similar instruction sequences have been around since the Intel 4004 days in the early 70s. `DAA` is an instruction for doing decimal arithmetic, and it has indeed existed since the Intel 4004.
I don't think decimal arithmetic is necessarily niche these days, but dedicated instructions for manipulating decimal numbers certainly are.
Many people who read my blog probably already know a lot or all of the information in this section. I am including it for completeness or as a refresher. If you're not interested, skip to the next section.
What Intel refers to as "packed decimal" is also called Packed Binary Coded Decimal. The "Packed" part is important historically, but I'll be calling it BCD for the rest of this post; the "Packed" is implied.
Binary Coded Decimal represents decimal numbers using 4 bits for each decimal
digit. 4 bits are required because there are 10 unique digits that need to
be represented, and 4 bits is the smallest number of bits that can
represent at least 10 unique values. Since 4 bits can represent 2^4 = 16
values, the remaining 6 bit patterns don't have an interpretation/are undefined.[3]
Just like in hexadecimal/binary, we can create BCD values larger than 9 by concatenating digits.
A later table lists all 16 4-bit binary values and how they're interpreted in hex, decimal, and BCD. The table below compares a few additions done in plain hex and in BCD:
BCD Addition | Hex | BCD | BCD Carry Out |
---|---|---|---|
9 + 1 | 0xa | 10 | Yes |
9 + 9 | 0x12 | 18 | Yes |
10 + 8 | 0x18 | 18 | Yes |
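The BCD column above can be reproduced with ordinary binary addition followed by a decimal correction. The sketch below is one way to write that in Rust; `bcd_add` is a hypothetical helper, not something from a library, and it assumes both inputs are valid packed BCD bytes (each nibble in 0 to 9). The boolean it returns is the carry out of the whole two-digit byte (as in 99 + 1), which is not necessarily what the table's last column tracks.

```rust
/// Hypothetical helper: add two packed BCD bytes (two decimal digits each).
/// Returns the packed BCD sum and whether the sum carried out of the byte.
fn bcd_add(a: u8, b: u8) -> (u8, bool) {
    // Plain binary addition first, remembering the carry out of bit 7.
    let (mut sum, carry) = a.overflowing_add(b);
    // Did the low nibbles carry into the high nibble (the "auxiliary carry")?
    let aux_carry = (a & 0x0F) + (b & 0x0F) > 0x0F;
    // Decide on the high-digit fix using the unadjusted sum.
    let fix_high = sum > 0x99 || carry;

    // Low digit: skip the six unused bit patterns if we landed in or past them.
    if (sum & 0x0F) > 9 || aux_carry {
        sum = sum.wrapping_add(0x06);
    }
    // High digit: same idea, one nibble up.
    if fix_high {
        sum = sum.wrapping_add(0x60);
    }
    (sum, fix_high)
}
```

For example, `bcd_add(0x09, 0x01)` gives `0x10`, which reads as decimal 10 when each nibble is taken as a digit.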
Just like with 4-bit hexadecimal and BCD values, the meaning of an 8-bit value differs depending on whether you interpret the bits as ASCII or hexadecimal. For instance, while the bit pattern `0b00100000` represents the hexadecimal number `0x20` or decimal number `32`, in ASCII it represents a literal space (` `).
Any software or hardware such as a font ROM that expects to operate on ASCII values will map `0x20` to a space, not to the number 32.
Making things more complicated, a single hex digit requires only 4 bits, whereas each ASCII value occupies an 8-bit byte.
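As a small illustration of one byte having two readings, here is a sketch in Rust (the helper name `as_number_and_ascii` is made up):

```rust
/// Hypothetical helper: view the same byte as a number and as an ASCII character.
fn as_number_and_ascii(byte: u8) -> (u32, char) {
    (u32::from(byte), char::from(byte))
}
```

Calling it with `0b0010_0000` returns the number 32 alongside the space character; the bits are identical, only the interpretation differs.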
In hexadecimal representation, we have 16 unique digits: `0`, `1`, `2`, `3`, `4`, `5`, `6`, `7`, `8`, `9`, `A`, `B`, `C`, `D`, `E`, and `F`.
Binary | Hex | Decimal | BCD | ASCII Binary | ASCII Hex |
---|---|---|---|---|---|
0b0000 | 0x0 | 0 | 0 | 0b00110000 | 0x30 |
0b0001 | 0x1 | 1 | 1 | 0b00110001 | 0x31 |
0b0010 | 0x2 | 2 | 2 | 0b00110010 | 0x32 |
0b0011 | 0x3 | 3 | 3 | 0b00110011 | 0x33 |
0b0100 | 0x4 | 4 | 4 | 0b00110100 | 0x34 |
0b0101 | 0x5 | 5 | 5 | 0b00110101 | 0x35 |
0b0110 | 0x6 | 6 | 6 | 0b00110110 | 0x36 |
0b0111 | 0x7 | 7 | 7 | 0b00110111 | 0x37 |
0b1000 | 0x8 | 8 | 8 | 0b00111000 | 0x38 |
0b1001 | 0x9 | 9 | 9 | 0b00111001 | 0x39 |
0b1010 | 0xA | 10 | Undefined | 0b01000001 | 0x41 |
0b1011 | 0xB | 11 | Undefined | 0b01000010 | 0x42 |
0b1100 | 0xC | 12 | Undefined | 0b01000011 | 0x43 |
0b1101 | 0xD | 13 | Undefined | 0b01000100 | 0x44 |
0b1110 | 0xE | 14 | Undefined | 0b01000101 | 0x45 |
0b1111 | 0xF | 15 | Undefined | 0b01000110 | 0x46 |
0b00010000 | 0x10 | 16 | 10 | Undefined | Undefined |
By looking at the table in the previous section, the gist of hexadecimal to ASCII conversion is to map the 4-bit hexadecimal values `0x0` to `0xF` to the 8-bit hexadecimal values `0x30` through `0x39`, and `0x41` through `0x46`, respectively.
If I were writing my own hex to ASCII routine, even in assembly language, I would write something that looks like the following in Rust:
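fn hex_to_ascii(hex: u8) -> u8 {
    // Keep only the low nibble so the input is always in 0..=15.
    let in_range = hex & 0xf;
    if in_range < 10 {
        // 0..=9 map to ASCII '0'..='9' (0x30..=0x39).
        in_range + 0x30
    } else {
        // 10..=15 map to ASCII 'A'..='F' (0x41..=0x46).
        in_range + (0x41 - 10)
    }
}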
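As a sanity check on the function above, it can be compared against a plain lookup table. This is a different technique (indexing a 16-byte string instead of branching), and `hex_to_ascii_table` is a name I made up for the sketch; `hex_to_ascii` is repeated so the block stands alone.

```rust
/// The branching version from the post, repeated so this sketch is self-contained.
fn hex_to_ascii(hex: u8) -> u8 {
    let in_range = hex & 0xf;
    if in_range < 10 {
        in_range + 0x30
    } else {
        in_range + (0x41 - 10)
    }
}

/// Hypothetical alternative: index into a table of the 16 ASCII hex digits.
fn hex_to_ascii_table(hex: u8) -> u8 {
    const DIGITS: &[u8; 16] = b"0123456789ABCDEF";
    // Masking keeps the index in 0..=15, so this can never go out of bounds.
    DIGITS[usize::from(hex & 0xF)]
}
```

The two functions should agree for every possible `u8` input, since both ignore the upper nibble.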
As of this writing, the generated assembly from `rustc` has some similarities to the mystery snippet: neither `rustc`'s code nor the mystery snippet has branches (thanks to `cmov` in the former). That said, `rustc`'s version is much faster :).
40 80 e7 0f and dil,0xf
8d 47 30 lea eax,[rdi+0x30]
8d 4f 37 lea ecx,[rdi+0x37]
40 80 ff 0a cmp dil,0xa
0f b6 d0 movzx edx,al
0f b6 c1 movzx eax,cl
0f 42 c2 cmovb eax,edx
c3 ret
Contrast with our mystery sequence, reproduced here:
24 0f and al,0xf
04 90 add al,0x90
27 daa
14 40 adc al,0x40
27 daa
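To see what each instruction contributes, it can help to step through the sequence in software. Below is a rough Rust sketch that models only `AL`, the carry flag, and the auxiliary carry flag. The `Cpu` struct and its method names are made up for illustration, and `daa` is written from the pseudocode in Intel's manual as I understand it; `AF` after `and` is officially undefined, and clearing it here is an assumption that does not affect the result.

```rust
/// Minimal model of the state the sequence touches (names are hypothetical).
#[derive(Debug, Default, Clone, Copy)]
struct Cpu {
    al: u8,
    cf: bool, // carry flag
    af: bool, // auxiliary carry flag (carry out of bit 3)
}

impl Cpu {
    /// and al,imm8: clears CF. AF is undefined on real hardware; cleared here.
    fn and(&mut self, imm: u8) {
        self.al &= imm;
        self.cf = false;
        self.af = false;
    }

    /// adc al,imm8: add the immediate plus the incoming carry flag.
    fn adc(&mut self, imm: u8) {
        let carry_in = u8::from(self.cf);
        let wide = u16::from(self.al) + u16::from(imm) + u16::from(carry_in);
        self.af = (self.al & 0x0F) + (imm & 0x0F) + carry_in > 0x0F;
        self.cf = wide > 0xFF;
        self.al = wide as u8; // keep the low 8 bits
    }

    /// add al,imm8: the same as adc with the incoming carry ignored.
    fn add(&mut self, imm: u8) {
        self.cf = false;
        self.adc(imm);
    }

    /// daa: fix up AL after adding two packed BCD bytes.
    fn daa(&mut self) {
        let (old_al, old_cf) = (self.al, self.cf);
        if (self.al & 0x0F) > 9 || self.af {
            self.al = self.al.wrapping_add(0x06);
            self.af = true;
        } else {
            self.af = false;
        }
        if old_al > 0x99 || old_cf {
            self.al = self.al.wrapping_add(0x60);
            self.cf = true;
        } else {
            self.cf = false;
        }
    }
}

/// Run the five-instruction sequence on the model and return AL.
fn nibble_to_ascii(value: u8) -> u8 {
    let mut cpu = Cpu { al: value, ..Cpu::default() };
    cpu.and(0x0F);
    cpu.add(0x90);
    cpu.daa();
    cpu.adc(0x40);
    cpu.daa();
    cpu.al
}
```

Stepping through by hand: for inputs 0 to 9, the first `daa` leaves `AL` (90h to 99h) and the carry flag alone, `adc al,40h` yields D0h to D9h, and the second `daa` adds 60h, wrapping around to 30h to 39h. For inputs Ah to Fh, the first `daa` adjusts both digits, wrapping to 00h to 05h and setting the carry flag, so `adc al,40h` effectively adds 41h and the second `daa` has nothing left to fix.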
DAA As A Conditional Add
DAA Instruction
Even though this is one of my shorter posts, it's still a large number of words spent discussing an obsolete trick for formatting numbers on an obsolete CPU. Today, you would never write new code using `DAA`, even in 32-bit code (the instruction is not valid at all in 64-bit mode), and I'm not even sure compilers would emit it when optimizing for space; the instruction is niche and not optimized.
However, I do find how the 8086-optimized version works interesting, and it is a creative solution to programming within strict memory limits. Constraints breed creativity.
I don't see myself writing a BCD library anytime soon, but as part of my RE efforts and just poking around on vintage computers, I hope I can use this blog post as a refresher on how `DAA` works when I need it :).
1 The 8086 and 8088 CPUs have the same instruction set, but different pinouts. They are equivalent for the scope of this blog post, and I will refer to either of them depending on context.
2 Somewhere on the [Vintage Computer Federation Forums](https://forum.vcfed.org/index.php) you can find his post to me, but I've long since lost the link.
3 There are other more efficient methods to encode BCD, such as using 10 bits to encode three decimal digits (1000 values), with 24 bit patterns left over.
Last Updated: 2022-05-15