Disassemblies courtesy of: https://defuse.ca/online-x86-assembler.html

Hex To ASCII Conversion Using The 8086's DAA Instruction

Recently, I was REing some old 8088[1] firmware, as one does, and noticed a mysterious sequence of instructions that I've seen a few times over the years:

and al,0fh
add al,90h
daa
adc al,40h
daa

I know what this instruction sequence does because a programmer from the 80s named Chuck Guzis told me. That was in 2014.[2] However, it wasn't until a few days ago that I finally figured out how and why this instruction sequence works. And since I find string formatting fascinating, I figured it might be fun to write about this instruction sequence.

What Does That Instruction Sequence Do?

As spoiled by the title, the above sequence converts a 4-bit nibble, which can take on the values 0h to Fh as a hex digit, into the equivalent 8-bit byte in ASCII: 0 to 9 map to 30h to 39h (ASCII '0' to '9'), and Ah to Fh map to 41h to 46h (ASCII 'A' to 'F'). I can understand why this trick was popular back when every byte mattered: the whole sequence assembles to just 8 bytes.

Even if you don't know 8086 or x86 assembly, two of these lines can seamlessly translate to a higher-level language like Rust, if we pretend that register AL is a variable, say my_al:
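A minimal sketch of that translation, assuming the two lines in question are the and and the add (the other three depend on CPU flags). The function name first_two_lines is made up for illustration, and wrapping_add stands in for the 8-bit register silently wrapping around:

```rust
/// Models `and al,0fh` followed by `add al,90h`, treating AL as a plain u8.
/// `first_two_lines` is a hypothetical name, not something from the firmware.
fn first_two_lines(mut my_al: u8) -> u8 {
    my_al &= 0x0f; // and al,0fh: keep only the low nibble
    my_al = my_al.wrapping_add(0x90); // add al,90h: 8-bit add, wraps like the register
    my_al
}
```

For an input of 3Ah this yields 9Ah: the high nibble is masked away and 90h is added to what remains.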

However, what does daa do? According to Mr. Guzis, similar instruction sequences have been around since the Intel 4004 days in the early 70s.

DAA is an instruction for doing decimal arithmetic, and it has indeed existed since the Intel 4004. I don't think decimal arithmetic is necessarily niche these days, but dedicated instructions for manipulating decimal numbers certainly are.

Quick (Packed) BCD and ASCII Primers

Many people who read my blog probably already know a lot or all of the information in this section. I am including it for completeness or as a refresher. If you're not interested, skip to the next section.

BCD

What Intel refers to as "packed decimal" is also called Packed Binary Coded Decimal. The "Packed" part is important historically, but I'll be calling it BCD for the rest of this post; the "Packed" is implied.

Binary Coded Decimal represents decimal numbers using 4 bits for each decimal digit. 4 bits are required because there are 10 unique digits that need to be represented, and 4 bits is the smallest number of bits that can represent at least 10 unique values. Since 4 bits can represent 2^4 = 16 values, the remaining 6 bit patterns have no interpretation and are undefined.[3]

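Just like in hexadecimal/binary, we can create BCD values larger than 9 by concatenating multiple 4-bit digits. For example, the byte 0b00010000 is 0x10 in hex and 16 in decimal, but 10 in BCD.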
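As a rough sketch of what concatenating digits means for a single byte, the two helpers below pack and unpack a two-digit decimal number. The names to_packed_bcd and from_packed_bcd are hypothetical, and returning None for out-of-range input is just one possible way to handle the undefined bit patterns:

```rust
/// Packs a decimal value 0..=99 into one byte, tens digit in the high nibble.
/// Hypothetical helper; returns None when the value needs more than two digits.
fn to_packed_bcd(n: u8) -> Option<u8> {
    if n > 99 {
        return None;
    }
    Some(((n / 10) << 4) | (n % 10))
}

/// Unpacks a BCD byte back into its decimal value.
/// Returns None if either nibble is one of the six undefined patterns (0xA..=0xF).
fn from_packed_bcd(bcd: u8) -> Option<u8> {
    let (hi, lo) = (bcd >> 4, bcd & 0x0f);
    if hi > 9 || lo > 9 {
        return None;
    }
    Some(hi * 10 + lo)
}
```

Here to_packed_bcd(10) gives Some(0x10), while from_packed_bcd(0x1A) gives None because Ah is not a valid BCD digit.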

The below table lists all 16 4-bit binary values and how they're interpreted in hex, decimal, and BCD:

| BCD Addition | Hex | BCD | BCD Carry Out |
|---|---|---|---|
| 9 + 1 | 0xa | 10 | Yes |
| 9 + 9 | 0x12 | 18 | Yes |
| 10 + 8 | 0x18 | 18 | Yes |
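The Hex and BCD columns above can be reproduced with a short sketch. It assumes both inputs are valid packed BCD bytes and adds them digit by digit; bcd_add is a hypothetical helper rather than how any CPU does it, and a carry out of the tens digit is simply dropped:

```rust
/// Adds two packed BCD bytes (every nibble assumed to be 0..=9).
/// Returns (plain binary sum, BCD sum). Hypothetical helper for illustration.
fn bcd_add(a: u8, b: u8) -> (u8, u8) {
    // What an ordinary binary ADD would leave in the register.
    let raw = a.wrapping_add(b);

    // Add the ones digits and the tens digits separately.
    let mut lo = (a & 0x0f) + (b & 0x0f);
    let mut hi = (a >> 4) + (b >> 4);

    // A ones-digit sum above 9 carries into the tens digit.
    if lo > 9 {
        lo -= 10;
        hi += 1;
    }

    // This sketch keeps only two digits, so a carry out of the tens digit is dropped.
    hi %= 10;

    (raw, (hi << 4) | lo)
}
```

For 9 + 9 the binary sum is 0x12, but the BCD sum is 0x18: the two differ by 6, the number of unused bit patterns in each nibble.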

Hex and ASCII Representations

Just like with 4-bit hexadecimal and BCD values, the meaning of an 8-bit value differs depending on whether you interpret the bits as ASCII or hexadecimal. For instance, while the bit pattern 0b00100000 represents the hexadecimal number 0x20 or decimal number 32, in ASCII it represents a literal space ( ).

Any software or hardware that expects to operate on ASCII values, such as a font ROM, will map each 8-bit pattern to the character it encodes rather than to its numeric value.

To complicate matters, a single hex digit requires only 4 bits, while each ASCII character is stored in 8 bits.

In hexadecimal representation, we have 16 unique digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F.

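| Binary | Hex | Decimal | BCD | ASCII Binary | ASCII Hex |
|---|---|---|---|---|---|
| 0b0000 | 0x0 | 0 | 0 | 0b00110000 | 0x30 |
| 0b0001 | 0x1 | 1 | 1 | 0b00110001 | 0x31 |
| 0b0010 | 0x2 | 2 | 2 | 0b00110010 | 0x32 |
| 0b0011 | 0x3 | 3 | 3 | 0b00110011 | 0x33 |
| 0b0100 | 0x4 | 4 | 4 | 0b00110100 | 0x34 |
| 0b0101 | 0x5 | 5 | 5 | 0b00110101 | 0x35 |
| 0b0110 | 0x6 | 6 | 6 | 0b00110110 | 0x36 |
| 0b0111 | 0x7 | 7 | 7 | 0b00110111 | 0x37 |
| 0b1000 | 0x8 | 8 | 8 | 0b00111000 | 0x38 |
| 0b1001 | 0x9 | 9 | 9 | 0b00111001 | 0x39 |
| 0b1010 | 0xA | 10 | Undefined | 0b01000001 | 0x41 |
| 0b1011 | 0xB | 11 | Undefined | 0b01000010 | 0x42 |
| 0b1100 | 0xC | 12 | Undefined | 0b01000011 | 0x43 |
| 0b1101 | 0xD | 13 | Undefined | 0b01000100 | 0x44 |
| 0b1110 | 0xE | 14 | Undefined | 0b01000101 | 0x45 |
| 0b1111 | 0xF | 15 | Undefined | 0b01000110 | 0x46 |
| 0b00010000 | 0x10 | 16 | 10 | Undefined | Undefined |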

Basic Hex To ASCII Conversion

As the table in the previous section shows, the gist of hexadecimal to ASCII conversion is to map the 4-bit hexadecimal values 0x0 to 0xF to the 8-bit hexadecimal values 0x30 through 0x39, and 0x41 through 0x46, respectively. If I were writing my own hex to ASCII routine, even in assembly language, I would write something that looks like the following in Rust:

fn hex_to_ascii(hex: u8) -> u8 {
    let in_range = hex & 0xf;

    if in_range < 10 {
        in_range + 0x30
    } else {
        in_range + (0x41 - 10)
    }
}
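Since one hex digit covers only 4 bits, printing a full byte takes two calls, one per nibble. A small usage sketch, with byte_to_hex_ascii as a hypothetical wrapper and the routine above repeated so the snippet compiles on its own:

```rust
/// Same mapping as the routine above, repeated so this sketch is self-contained.
fn hex_to_ascii(hex: u8) -> u8 {
    let in_range = hex & 0xf;
    if in_range < 10 {
        in_range + 0x30
    } else {
        in_range + (0x41 - 10)
    }
}

/// Hypothetical wrapper: formats one byte as two ASCII hex digits, high nibble first.
fn byte_to_hex_ascii(byte: u8) -> [u8; 2] {
    // hex_to_ascii masks its argument, so the low nibble needs no extra `& 0xf` here.
    [hex_to_ascii(byte >> 4), hex_to_ascii(byte)]
}
```

With this, byte_to_hex_ascii(0x3F) produces the two bytes 0x33 and 0x46, which display as "3F".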

As of this writing, the generated assembly from rustc has some similarities to the mystery snippet:

40 80 e7 0f             and    dil,0xf
8d 47 30                lea    eax,[rdi+0x30]
8d 4f 37                lea    ecx,[rdi+0x37]
40 80 ff 0a             cmp    dil,0xa
0f b6 d0                movzx  edx,al
0f b6 c1                movzx  eax,cl
0f 42 c2                cmovb  eax,edx
c3                      ret

Contrast with our mystery sequence, reproduced here:

24 0f                   and    al,0xf
04 90                   add    al,0x90
27                      daa
14 40                   adc    al,0x40
27                      daa

Hex To ASCII Conversion Using DAA As A Conditional Add

Deriving The DAA Instruction

Explaining Our Mystery Sequence
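One way to check the sequence is to model it in Rust. The sketch below is only a model under stated assumptions: the Cpu struct, add_with_carry, and hex_to_ascii_daa are names invented here, only AL and the carry and auxiliary carry flags are tracked, and daa follows the pseudocode documented in Intel's instruction set reference, which may not match every corner case of real 8086/8088 silicon.

```rust
/// Just enough 8086 state for the five-instruction sequence (hypothetical model).
struct Cpu {
    al: u8,
    cf: bool, // carry flag
    af: bool, // auxiliary carry flag (carry out of bit 3)
}

impl Cpu {
    /// Models ADD (carry_in = false) and ADC (carry_in = current CF) on AL.
    fn add_with_carry(&mut self, imm: u8, carry_in: bool) {
        let c = u8::from(carry_in);
        let sum = u16::from(self.al) + u16::from(imm) + u16::from(c);
        self.af = (self.al & 0x0f) + (imm & 0x0f) + c > 0x0f;
        self.cf = sum > 0xff;
        self.al = sum as u8; // keep the low 8 bits, like the register
    }

    /// Models DAA as two conditional adds, per Intel's documented pseudocode.
    fn daa(&mut self) {
        let old_al = self.al;
        let old_cf = self.cf;
        self.cf = false;
        // Low digit out of range (or half-carry happened): add 6.
        if (self.al & 0x0f) > 9 || self.af {
            let (adjusted, carry) = self.al.overflowing_add(6);
            self.al = adjusted;
            self.cf = old_cf || carry;
            self.af = true;
        } else {
            self.af = false;
        }
        // High digit out of range (or carry happened): add 60h and set CF.
        if old_al > 0x99 || old_cf {
            self.al = self.al.wrapping_add(0x60);
            self.cf = true;
        } else {
            self.cf = false;
        }
    }
}

/// Runs the mystery sequence on the model and returns the final AL.
fn hex_to_ascii_daa(nibble: u8) -> u8 {
    // and al,0fh (AND also clears CF; AF is treated as clear here)
    let mut cpu = Cpu { al: nibble & 0x0f, cf: false, af: false };
    cpu.add_with_carry(0x90, false); // add al,90h
    cpu.daa(); // daa
    let carry = cpu.cf;
    cpu.add_with_carry(0x40, carry); // adc al,40h
    cpu.daa(); // daa
    cpu.al
}
```

Tracing two inputs through this model: for 3h, AL goes 93h, 93h (first daa changes nothing), D3h, and finally 33h once the second daa adds 60h. For Ah, AL goes 9Ah, then 00h with the carry set after the first daa, then 41h after the adc picks up that carry, and the second daa leaves it alone. In this model the first daa acts as the conditional: it sets the carry only for Ah through Fh, and that carry supplies the extra 1 that moves the result from the digits to the letters.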

Conclusions

Even though this is one of my shorter posts, it's still a large number of words spent discussing an obsolete trick for formatting numbers on an obsolete CPU. Today, you would never write 32-bit code using DAA, and I doubt compilers would emit it even when optimizing for space; the instruction is niche and not optimized.

However, I do find the 8086-optimized version interesting in how it works, and a creative solution for programming within strict memory limits. Constraints give rise to creativity.

I don't see myself writing a BCD library anytime soon, but as part of my RE efforts and just poking around on vintage computers, I hope I can use this blog post as a refresher on how DAA works when I need it :).

Footnotes

1 The 8086 and 8088 CPUs have the same instruction set, but different pinouts. They are equivalent for the scope of this blog post, and I will refer to either of them depending on context.

2 Somewhere on the [Vintage Computer Federation Forums](https://forum.vcfed.org/index.php) you can find his post to me, but I've long since lost the link.

3 There are other, more efficient methods to encode BCD, such as using 10 bits to encode three decimal digits (1000 values), with 24 bit patterns left over.

Last Updated: 2022-05-15