From Assembly Language To Executable

This page is a brief look at the production of an executable image from an assembly language program. It is Linux-specific; technically it is ELF-specific. We're also assuming the NASM assembler, but the ideas are really universal.

The Assembly Language File

Here's an example program for our study. What do you think it will do?

        global  _start
        section .text
_start:
        mov     rax, 1
        mov     rdi, 1
        mov     rsi, message
        mov     rdx, 13
        syscall
        mov     eax, 60
        xor     rdi, rdi
        syscall
message:
        db      "Hello, World", 10

The Listing File

You can produce a listing file with

nasm -f elf64 -l hello.lst hello.asm

The resulting hello.lst looks like this:

     1                                          global  _start
     2                                          section .text
     3                                  _start:
     4 00000000 B801000000                      mov     rax, 1
     5 00000005 BF01000000                      mov     rdi, 1
     6 0000000A 48BE-                           mov     rsi, message
     7 0000000C [2500000000000000] 
     8 00000014 BA0D000000                      mov     rdx, 13
     9 00000019 0F05                            syscall
    10 0000001B B83C000000                      mov     eax, 60
    11 00000020 4831FF                          xor     rdi, rdi
    12 00000023 0F05                            syscall
    13                                  message:
    14 00000025 48656C6C6F2C20576F-             db      "Hello, World", 10
    15 0000002E 726C640A           
    16                                  
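The offsets in the second column of the listing are simply running totals of the encoded lengths. Here is a minimal Python sketch that recomputes them. The byte strings are copied from the listing; the names encodings and layout are made up for this illustration, and the bracketed operand is written with the section-relative value 0x25 that the listing shows.

```python
# Encoded bytes for each source line, copied from the listing.
# The operand of `mov rsi, message` is relocatable; the listing shows the
# section-relative value 0x25 in brackets, and that value is used here.
encodings = [
    ("mov rax, 1",       "B801000000"),
    ("mov rdi, 1",       "BF01000000"),
    ("mov rsi, message", "48BE2500000000000000"),
    ("mov rdx, 13",      "BA0D000000"),
    ("syscall",          "0F05"),
    ("mov eax, 60",      "B83C000000"),
    ("xor rdi, rdi",     "4831FF"),
    ("syscall",          "0F05"),
    ("db message bytes", "48656C6C6F2C20576F726C640A"),
]

def layout(entries):
    """Return a list of (offset, size, text) rows and the total size."""
    rows, offset = [], 0
    for text, hex_bytes in entries:
        size = len(bytes.fromhex(hex_bytes))
        rows.append((offset, size, text))
        offset += size
    return rows, offset

rows, total = layout(encodings)
for offset, size, text in rows:
    print(f"{offset:08X}  {size:2d} bytes  {text}")
print(f"total size of .text: {total:#x}")
```

Note that B801000000 is really the encoding of mov eax, 1: the assembler picked the shorter 32-bit form, which zero-extends into rax and so has the same effect.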

The Object File

The nasm command above also produces the object file hello.o, which is 752 bytes in size. Dumping it with

hexdump -C hello.o

shows:

00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  01 00 3e 00 01 00 00 00  00 00 00 00 00 00 00 00  |..>.............|
00000020  00 00 00 00 00 00 00 00  40 00 00 00 00 00 00 00  |........@.......|
00000030  00 00 00 00 40 00 00 00  00 00 40 00 06 00 02 00  |....@.....@.....|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000080  01 00 00 00 01 00 00 00  06 00 00 00 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  c0 01 00 00 00 00 00 00  |................|
000000a0  32 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |2...............|
000000b0  10 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000c0  07 00 00 00 03 00 00 00  00 00 00 00 00 00 00 00  |................|
000000d0  00 00 00 00 00 00 00 00  00 02 00 00 00 00 00 00  |................|
000000e0  2c 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |,...............|
000000f0  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000100  11 00 00 00 02 00 00 00  00 00 00 00 00 00 00 00  |................|
00000110  00 00 00 00 00 00 00 00  30 02 00 00 00 00 00 00  |........0.......|
00000120  78 00 00 00 00 00 00 00  04 00 00 00 04 00 00 00  |x...............|
00000130  04 00 00 00 00 00 00 00  18 00 00 00 00 00 00 00  |................|
00000140  19 00 00 00 03 00 00 00  00 00 00 00 00 00 00 00  |................|
00000150  00 00 00 00 00 00 00 00  b0 02 00 00 00 00 00 00  |................|
00000160  1a 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000170  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000180  21 00 00 00 04 00 00 00  00 00 00 00 00 00 00 00  |!...............|
00000190  00 00 00 00 00 00 00 00  d0 02 00 00 00 00 00 00  |................|
000001a0  18 00 00 00 00 00 00 00  03 00 00 00 01 00 00 00  |................|
000001b0  04 00 00 00 00 00 00 00  18 00 00 00 00 00 00 00  |................|
000001c0  b8 01 00 00 00 bf 01 00  00 00 48 be 00 00 00 00  |..........H.....|
000001d0  00 00 00 00 ba 0d 00 00  00 0f 05 b8 3c 00 00 00  |............<...|
000001e0  48 31 ff 0f 05 48 65 6c  6c 6f 2c 20 57 6f 72 6c  |H1...Hello, Worl|
000001f0  64 0a 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |d...............|
00000200  00 2e 74 65 78 74 00 2e  73 68 73 74 72 74 61 62  |..text..shstrtab|
00000210  00 2e 73 79 6d 74 61 62  00 2e 73 74 72 74 61 62  |..symtab..strtab|
00000220  00 2e 72 65 6c 61 2e 74  65 78 74 00 00 00 00 00  |..rela.text.....|
00000230  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000240  00 00 00 00 00 00 00 00  01 00 00 00 04 00 f1 ff  |................|
00000250  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000260  00 00 00 00 03 00 01 00  00 00 00 00 00 00 00 00  |................|
00000270  00 00 00 00 00 00 00 00  12 00 00 00 00 00 01 00  |................|
00000280  25 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |%...............|
00000290  0b 00 00 00 10 00 01 00  00 00 00 00 00 00 00 00  |................|
000002a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002b0  00 68 65 6c 6c 6f 2e 61  73 6d 00 5f 73 74 61 72  |.hello.asm._star|
000002c0  74 00 6d 65 73 73 61 67  65 00 00 00 00 00 00 00  |t.message.......|
000002d0  0c 00 00 00 00 00 00 00  01 00 00 00 02 00 00 00  |................|
000002e0  25 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |%...............|
000002f0

It is a great idea to pick up a copy of the ELF specification and use it to figure out what each byte in this file means. Once you pay your dues and study the file format with a hand analysis, you can use the objdump utility to get information about the file.
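As a starting point for that hand analysis, here is a minimal Python sketch that decodes the first 64 bytes of hello.o as an ELF64 file header. The bytes are copied from the dump above; the variable names are just for illustration, and the field order follows the Elf64_Ehdr structure in the ELF specification.

```python
import struct

# First 64 bytes of hello.o, copied from the hexdump.
header = bytes.fromhex(
    "7f454c46020101000000000000000000"
    "01003e00010000000000000000000000"
    "00000000000000004000000000000000"
    "00000000400000000000400006000200"
)

# Elf64_Ehdr field names, in file order (all values are little-endian).
FIELDS = (
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff "
    "e_flags e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx"
).split()

ehdr = dict(zip(FIELDS, struct.unpack("<16sHHIQQQIHHHHHH", header)))

for name in FIELDS[1:]:
    print(f"{name:12s} {ehdr[name]:#x}")

# The section header table ends where the first section's bytes begin.
sh_table_end = ehdr["e_shoff"] + ehdr["e_shnum"] * ehdr["e_shentsize"]
print(f"section headers end at {sh_table_end:#x}")
```

Here e_type 1 means a relocatable file (ET_REL) and e_machine 0x3e means x86-64. There are six section headers of 64 bytes each starting at offset 0x40, so they end at 0x1c0, which is exactly where the machine code begins in the dump.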

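Note a couple of things here. The machine code of the .text section starts at offset 0x1c0, right after the section header table. The 8-byte operand of mov rsi, message (at offset 0x1cc) is all zeros, because the assembler does not yet know where message will end up in memory. Instead, the .rela.text section at offset 0x2d0 holds a relocation entry telling the linker to fill in the final address of .text plus 0x25.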
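The relocation record for the mov rsi, message instruction can be decoded by hand. The following Python sketch unpacks the 24 bytes at offset 0x2d0 of the dump as a single Elf64_Rela entry; the variable names are illustrative, and the symbol/type split of r_info follows the ELF64 convention (upper 32 bits are the symbol index, lower 32 bits the relocation type).

```python
import struct

# The 24 bytes at file offset 0x2d0 of hello.o (the whole .rela.text section).
rela = bytes.fromhex(
    "0c00000000000000"   # r_offset: where in .text to patch
    "0100000002000000"   # r_info:   symbol index and relocation type
    "2500000000000000"   # r_addend: constant to add to the symbol value
)

r_offset, r_info, r_addend = struct.unpack("<QQq", rela)
sym_index = r_info >> 32           # upper half of r_info
reloc_type = r_info & 0xFFFFFFFF   # lower half of r_info

print(f"patch .text at offset {r_offset:#x}")
print(f"symbol index {sym_index}, relocation type {reloc_type}")
print(f"addend {r_addend:#x}")
```

Relocation type 1 on x86-64 is R_X86_64_64, a plain 64-bit absolute address. Symbol 2 in this file's symbol table is the section symbol for .text, so the entry asks for "address of .text plus 0x25", and 0x25 is the offset of message shown in the listing.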

The Executable File

Object files are not run directly since in general they will need to be linked to other object files to form complete programs. (If this were not the case, you could never have pre-compiled libraries sitting on your system and would thus have to build everything from source constantly.) We can "link" the file hello.o with

ld hello.o

and get the file a.out, which is 778 bytes in size:

00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  02 00 3e 00 01 00 00 00  80 00 40 00 00 00 00 00  |..>.......@.....|
00000020  40 00 00 00 00 00 00 00  d8 00 00 00 00 00 00 00  |@...............|
00000030  00 00 00 00 40 00 38 00  01 00 40 00 05 00 02 00  |....@.8...@.....|
00000040  01 00 00 00 05 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  00 00 40 00 00 00 00 00  00 00 40 00 00 00 00 00  |..@.......@.....|
00000060  b2 00 00 00 00 00 00 00  b2 00 00 00 00 00 00 00  |................|
00000070  00 00 20 00 00 00 00 00  00 00 00 00 00 00 00 00  |.. .............|
00000080  b8 01 00 00 00 bf 01 00  00 00 48 be a5 00 40 00  |..........H...@.|
00000090  00 00 00 00 ba 0d 00 00  00 0f 05 b8 3c 00 00 00  |............<...|
000000a0  48 31 ff 0f 05 48 65 6c  6c 6f 2c 20 57 6f 72 6c  |H1...Hello, Worl|
000000b0  64 0a 00 2e 73 79 6d 74  61 62 00 2e 73 74 72 74  |d...symtab..strt|
000000c0  61 62 00 2e 73 68 73 74  72 74 61 62 00 2e 74 65  |ab..shstrtab..te|
000000d0  78 74 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |xt..............|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000100  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000110  00 00 00 00 00 00 00 00  1b 00 00 00 01 00 00 00  |................|
00000120  06 00 00 00 00 00 00 00  80 00 40 00 00 00 00 00  |..........@.....|
00000130  80 00 00 00 00 00 00 00  32 00 00 00 00 00 00 00  |........2.......|
00000140  00 00 00 00 00 00 00 00  10 00 00 00 00 00 00 00  |................|
00000150  00 00 00 00 00 00 00 00  11 00 00 00 03 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000170  b2 00 00 00 00 00 00 00  21 00 00 00 00 00 00 00  |........!.......|
00000180  00 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  01 00 00 00 02 00 00 00  |................|
000001a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001b0  18 02 00 00 00 00 00 00  c0 00 00 00 00 00 00 00  |................|
000001c0  04 00 00 00 04 00 00 00  08 00 00 00 00 00 00 00  |................|
000001d0  18 00 00 00 00 00 00 00  09 00 00 00 03 00 00 00  |................|
000001e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001f0  d8 02 00 00 00 00 00 00  32 00 00 00 00 00 00 00  |........2.......|
00000200  00 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00  |................|
00000210  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000220  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000230  00 00 00 00 03 00 01 00  80 00 40 00 00 00 00 00  |..........@.....|
00000240  00 00 00 00 00 00 00 00  01 00 00 00 04 00 f1 ff  |................|
00000250  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000260  0b 00 00 00 00 00 01 00  a5 00 40 00 00 00 00 00  |..........@.....|
00000270  00 00 00 00 00 00 00 00  13 00 00 00 10 00 01 00  |................|
00000280  80 00 40 00 00 00 00 00  00 00 00 00 00 00 00 00  |..@.............|
00000290  1a 00 00 00 10 00 f1 ff  b2 00 60 00 00 00 00 00  |..........`.....|
000002a0  00 00 00 00 00 00 00 00  26 00 00 00 10 00 f1 ff  |........&.......|
000002b0  b2 00 60 00 00 00 00 00  00 00 00 00 00 00 00 00  |..`.............|
000002c0  2d 00 00 00 10 00 f1 ff  b8 00 60 00 00 00 00 00  |-.........`.....|
000002d0  00 00 00 00 00 00 00 00  00 68 65 6c 6c 6f 2e 61  |.........hello.a|
000002e0  73 6d 00 6d 65 73 73 61  67 65 00 5f 73 74 61 72  |sm.message._star|
000002f0  74 00 5f 5f 62 73 73 5f  73 74 61 72 74 00 5f 65  |t.__bss_start._e|
00000300  64 61 74 61 00 5f 65 6e  64 00                    |data._end.|
0000030a

The executable is also in ELF format, so study it if you get the chance. Locate the start of the code at offset 0x0080, and notice that the operand of mov rsi now holds the actual address of message, 0x4000a5, rather than the zeros found in the object file.
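To see where that address comes from, here is a rough Python sketch of the one relocation the linker had to apply. It is only an illustration under stated assumptions, not how ld is implemented: the function apply_abs64 is a made-up name, the byte strings are copied from the two dumps, and 0x400080 is the address of .text recorded in the executable.

```python
# .text as stored in hello.o (file offsets 0x1c0 to 0x1f1).
obj_text = bytes.fromhex(
    "b801000000" "bf01000000" "48be0000000000000000"
    "ba0d000000" "0f05" "b83c000000" "4831ff" "0f05"
    "48656c6c6f2c20576f726c640a"
)

# The same 0x32 bytes as stored in a.out (file offsets 0x80 to 0xb1).
exe_text = bytes.fromhex(
    "b801000000" "bf01000000" "48bea500400000000000"
    "ba0d000000" "0f05" "b83c000000" "4831ff" "0f05"
    "48656c6c6f2c20576f726c640a"
)

def apply_abs64(section, r_offset, sym_value, addend):
    """Patch a 64-bit absolute relocation: store sym_value + addend."""
    patched = bytearray(section)
    value = sym_value + addend
    patched[r_offset:r_offset + 8] = value.to_bytes(8, "little")
    return bytes(patched)

TEXT_ADDR = 0x400080   # address of .text in a.out
linked = apply_abs64(obj_text, r_offset=0x0C, sym_value=TEXT_ADDR, addend=0x25)

print(f"patched operand: {int.from_bytes(linked[0x0C:0x14], 'little'):#x}")
print("matches a.out:", linked == exe_text)
```

The only difference between the two copies of the code is those eight operand bytes; everything else was copied through unchanged.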

The Program in Memory

Now there's one final step. The executable file has to be loaded into memory before it can execute. The loading is done by the operating system. When I ran the program, the code for _start got loaded at address 0x0000000000400080, and this is what I saw in memory when asking gdb to disassemble:

$ gdb a.out
GNU gdb (Ubuntu/Linaro 7.3-0ubuntu2) 7.3-2011.08
Copyright (C) 2011 Free Software Foundation, Inc.

(gdb) set disassembly-flavor intel
(gdb) disassemble _start
Dump of assembler code for function _start:
   0x0000000000400080 <+0>:   mov    eax,0x1
   0x0000000000400085 <+5>:   mov    edi,0x1
   0x000000000040008a <+10>:  movabs rsi,0x4000a5
   0x0000000000400094 <+20>:  mov    edx,0xd
   0x0000000000400099 <+25>:  syscall 
   0x000000000040009b <+27>:  mov    eax,0x3c
   0x00000000004000a0 <+32>:  xor    rdi,rdi
   0x00000000004000a3 <+35>:  syscall 
End of assembler dump.

To see the data:

(gdb) x /13xb message
0x4000a5 <message>:	0x48	0x65	0x6c	0x6c	0x6f	0x2c	0x20	0x57
0x4000ad <message+8>:	0x6f	0x72	0x6c	0x64	0x0a
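The addresses gdb reports follow from the single program header in a.out, which tells the loader which part of the file to map and where. The Python sketch below decodes the 56 bytes at file offset 0x40 of the executable dump as an Elf64_Phdr and uses it to translate file offsets into memory addresses; file_offset_to_vaddr is a hypothetical helper written for this page.

```python
import struct

# The one program header of a.out: 56 bytes at file offset 0x40.
phdr = bytes.fromhex(
    "01000000" "05000000"   # p_type, p_flags
    "0000000000000000"      # p_offset
    "0000400000000000"      # p_vaddr
    "0000400000000000"      # p_paddr
    "b200000000000000"      # p_filesz
    "b200000000000000"      # p_memsz
    "0000200000000000"      # p_align
)

(p_type, p_flags, p_offset, p_vaddr,
 p_paddr, p_filesz, p_memsz, p_align) = struct.unpack("<IIQQQQQQ", phdr)

def file_offset_to_vaddr(offset):
    """Map a file offset inside the segment to its address in memory."""
    if not (p_offset <= offset < p_offset + p_filesz):
        raise ValueError("offset is outside the loadable segment")
    return p_vaddr + (offset - p_offset)

print(f"segment type {p_type}, flags {p_flags:#x}, align {p_align:#x}")
print(f"_start  (file offset 0x80) -> {file_offset_to_vaddr(0x80):#x}")
print(f"message (file offset 0xa5) -> {file_offset_to_vaddr(0xa5):#x}")
```

Type 1 is a loadable segment (PT_LOAD) and flags 5 mean readable and executable. The first 0xb2 bytes of the file are mapped at 0x400000, so the code at file offset 0x80 shows up at 0x400080 and the message bytes at file offset 0xa5 show up at 0x4000a5, the value loaded into rsi.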

There's more in memory, but that should give you an idea. The next thing to try is a more complicated program with several sections.