Assembly Macro for Position Independent Strings
As part of an assembly project written for NASM, I needed my code to be position independent but still able to access string constants. Position independent instructions can build strings by storing immediate values to memory, but this is tedious to do by hand and I needed a lot of string constants. Instead of doing this by hand, I automated the string building with a macro.
I started by writing an example of my desired macro syntax and macro output.
    ;; MACRO INPUT EXAMPLE:
    ;; I want to put the right operand, a string constant, into the left operand, a memory destination.
    inject ds:eax, `abc\x00`
    ;; MACRO OUTPUT EXAMPLE:
    ;; The string should be unrolled into a series of mov immediate instructions.
    mov byte [ds:eax + 0], 'a'
    mov byte [ds:eax + 1], 'b'
    mov byte [ds:eax + 2], 'c'
    mov byte [ds:eax + 3], `\0`
If my macro works, it'll translate the input example into the output example.
My macro takes two arguments, the memory destination and the string, and interpolates each byte of the string into a mov instruction. I build multiple instructions with a %rep preprocessor loop and tear apart the string using %substr.
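    %macro inject 2
        ; %1 = memory destination, %2 = string constant
        ; The length is stored in str_len rather than times, because a
        ; preprocessor symbol named times would shadow NASM's TIMES prefix.
        %strlen str_len %2
        %assign index 1
        %rep str_len
            ; %substr positions start at 1; one index extracts one character
            %substr in_byte %2 index
            mov byte [%1 + (index - 1)], in_byte
            %assign index index + 1
        %endrep
    %endmacro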
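To see the two string directives on their own, here is a minimal sketch that uses them outside of any macro. The symbol names demo_len and demo_chr are made up for this illustration and are not part of my project.

```nasm
; Minimal sketch of the preprocessor string directives used by the macro.
; demo_len and demo_chr are hypothetical names chosen for this example.

%strlen demo_len `abc\x00`      ; demo_len = 4, the NUL byte is counted
%substr demo_chr `abc\x00` 2    ; demo_chr = 'b', positions start at 1

db demo_len                     ; emits the byte 0x04
db demo_chr                     ; emits the byte 0x62
```

Running this through nasm -E is a quick way to check what each symbol was defined as before wiring the directives into a loop.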
I can test my macro by running only the preprocessor with the -E flag. The preprocessor expansion looks about right.
    $ nasm -E string-inject.asm
    %line 6+1 string-inject.asm
    mov byte [ds:eax + (1 - 1)], 'a'
    %line 6+0 string-inject.asm
    mov byte [ds:eax + (2 - 1)], 'b'
    mov byte [ds:eax + (3 - 1)], 'c'
    mov byte [ds:eax + (4 - 1)], `\0`
The arithmetic in the destination address looks a little weird. Memory offsets start at 0 but %substr indexing starts at 1, so the macro subtracts 1 from the index to form each offset. The assembler evaluates that constant expression, but to be sure the macro is working, I can look at the assembled output.
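If the leftover arithmetic in the expansion is distracting, one possible variation is to count a zero-based offset and add 1 only where %substr needs it. This is a sketch rather than the macro I actually use; the names inject0, inj_len, inj_off, and inj_byte are hypothetical.

```nasm
; Sketch of a zero-based variant. All names here are hypothetical.
; %1 = memory destination, %2 = string constant
%macro inject0 2
    %strlen inj_len %2                      ; number of bytes to store
    %assign inj_off 0                       ; zero-based memory offset
    %rep inj_len
        %substr inj_byte %2 (inj_off + 1)   ; %substr positions start at 1
        mov byte [%1 + inj_off], inj_byte
        %assign inj_off inj_off + 1
    %endrep
%endmacro

inject0 ds:eax, `abc\x00`
```

With this version the preprocessed output should show plain offsets such as [ds:eax + 0] instead of [ds:eax + (1 - 1)]. The assembled bytes should be the same either way, since the assembler folds the constant expression.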
Below, I assemble the example code and then disassemble it with radare2.
    $ nasm -f bin string-inject.asm
    $ r2 string-inject
     -- Thanks for using radare2!
    [0x00000000]> x/6i
    0x00000000   3e67c60061      mov byte ds:[eax], 0x61
    0x00000005   3e67c6400162    mov byte ds:[eax + 1], 0x62
    0x0000000b   3e67c6400263    mov byte ds:[eax + 2], 0x63
    0x00000011   3e67c6400300    mov byte ds:[eax + 3], 0
    0x00000017   ff              invalid
    0x00000018   ff              invalid
    [0x00000000]>
The output is as expected. The bytes 0x61, 0x62, and 0x63 are the ASCII codes for 'a', 'b', and 'c', and the final instruction stores the 0 byte that terminates my null-terminated string. The macro works and I can now use strings in position independent code.
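As a usage illustration, here is a minimal sketch of how the macro might sit inside a position independent routine that builds its string in scratch space on the stack. It assumes 32-bit code and a flat memory model where ds and ss address the same memory. The label build_string, the 8 byte buffer size, and the inj_ symbol names are hypothetical.

```nasm
; Sketch only: assumes 32-bit mode and a flat memory model (ds == ss).
bits 32

; %1 = memory destination, %2 = string constant
%macro inject 2
    %strlen inj_len %2
    %assign inj_idx 1
    %rep inj_len
        %substr inj_byte %2 inj_idx         ; one character per iteration
        mov byte [%1 + (inj_idx - 1)], inj_byte
        %assign inj_idx inj_idx + 1
    %endrep
%endmacro

build_string:                   ; hypothetical routine name
    sub esp, 8                  ; reserve 8 bytes of scratch space
    mov eax, esp                ; eax = address of the scratch buffer
    inject ds:eax, `abc\x00`    ; store 'a', 'b', 'c', 0 into the buffer
    ; eax now points at the null-terminated string "abc"
    add esp, 8                  ; release the scratch space
    ret
```

Nothing in the routine refers to an absolute address, so it should behave the same wherever it is loaded. The string bytes travel inside the instruction stream as immediates.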