x86 Architecture Overview

IA-32 is the instruction set architecture (ISA) of Intel’s most successful line of 32-bit processors, and Intel 64 is its extension to 64-bit processors. (Intel 64 was actually designed by AMD, which called it x86-64.) These notes summarize a few items of interest about these two ISAs. They are in no way a substitute for reading Intel’s manuals.

IA-32 and x86-64

The two massively popular architectures IA-32 and x86-64 are so common that they are described in a single set of manuals.

The following notes briefly summarize the latter architecture only.

x86-64 Architecture Diagram

The basic architecture of the x86-64 is described in Volume 1 of the Software Developer’s Manual (SDM). The following diagram is taken directly from Chapter 3 of that volume:

[Diagram: x86-64-arch.png]

Registers

Application programmers generally use only the general purpose registers, the floating point registers, and the XMM and YMM registers.

General Purpose Registers

These are 64 bits wide and used for integer arithmetic and logic, and to hold both data and pointers to memory. The registers are called R0...R15. Also:

RIP and RFLAGS

RIP is the instruction pointer and RFLAGS is the flags register.

Segment Registers

These are CS, DS, SS, ES, FS, and GS. I haven’t used them in 64-bit programming.

XMM Registers

These are 128 bits wide. They are named XMM0...XMM15. Use them for floating-point and integer arithmetic. You can do operations on 128-bit integers, but you can also take advantage of their ability to do operations in parallel:
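As a rough illustration of what "operations in parallel" means, the following Python sketch treats a 128-bit value as four independent 32-bit lanes and adds two such values lane by lane, which is the general behavior of a packed integer add such as PADDD. The helper names (pack, unpack, packed_add) are hypothetical and only simulate the idea; they are not how the hardware is programmed.

```python
LANE_BITS = 32
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def pack(lanes):
    """Pack four 32-bit values into one 128-bit integer (lane 0 in the low bits)."""
    value = 0
    for i, lane in enumerate(lanes):
        value |= (lane & LANE_MASK) << (i * LANE_BITS)
    return value


def unpack(value):
    """Split a 128-bit integer back into its four 32-bit lanes."""
    return [(value >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def packed_add(a, b):
    """Add corresponding lanes independently; each lane wraps around on overflow."""
    return pack([(x + y) & LANE_MASK for x, y in zip(unpack(a), unpack(b))])


a = pack([1, 2, 3, 0xFFFFFFFF])
b = pack([10, 20, 30, 1])
print(unpack(packed_add(a, b)))  # [11, 22, 33, 0]
```

Note that the carry out of the last lane is discarded rather than propagated into a neighboring lane; that independence is what distinguishes a packed add from a single 128-bit add.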

YMM Registers

These are 256 bits wide. They are named YMM0...YMM15. Use them for floating-point arithmetic. You can do:

and some other crazy things.

FPU Registers

There are eight registers used for computing with 80-bit floating point values. The registers don’t have names because they are used in a stack-like fashion.

Other Registers

Application programmers can remain oblivious of the rest of the registers:

Instruction Set

See the SDM Volume 1, Chapter 5 for a nice overview of all of the processor instructions and Volume 2 for complete information.

The following table shows most of the available instructions, using the instruction names as specified in the Intel syntax. Not every processor supports every instruction, of course.

The vertical bar means OR, the square brackets mean OPTIONAL, and parentheses are used for grouping. For example, SH(L|R)[D] stands for the four instructions SHL, SHLD, SHR, and SHRD.
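One way to check your reading of this notation is to expand a pattern mechanically into the mnemonics it stands for. The following Python sketch does that with a small recursive-descent parser. It is only an illustration of the notation described above, not something taken from Intel's documentation, and the function name expand is hypothetical.

```python
def expand(pattern):
    """Return every mnemonic described by one pattern, in order of appearance."""
    text = pattern.replace(" ", "")
    pos = 0

    def alternatives():
        # alternatives := sequence ('|' sequence)*
        nonlocal pos
        results = sequence()
        while pos < len(text) and text[pos] == "|":
            pos += 1
            results += sequence()
        return results

    def sequence():
        # sequence := (letter | '[' alternatives ']' | '(' alternatives ')')*
        nonlocal pos
        results = [""]
        while pos < len(text) and text[pos] not in "|])":
            ch = text[pos]
            pos += 1
            if ch in "[(":
                choices = alternatives()
                pos += 1  # skip the matching ']' or ')'
                if ch == "[":
                    choices = [""] + choices  # optional part may be absent
            else:
                choices = [ch]
            results = [r + c for r in results for c in choices]
        return results

    return alternatives()


print(expand("SH(L|R)[D]"))   # ['SHL', 'SHLD', 'SHR', 'SHRD']
print(expand("BT[S|R|C]"))    # ['BT', 'BTS', 'BTR', 'BTC']
print(len(expand("J[N]((L|G|A|B)[E]|E|Z|S|C|O|P)")))  # 28
```

The sketch assumes well-formed patterns (balanced brackets, no stray characters) and does no error checking.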

INTEGER FPU SSE SSE2
MOV
CMOV[N]((L|G|A|B)[E]|E|Z|S|C|O|P)
XCHG
BSWAP
XADD
CMPXCHG[8B]
PUSH[A[D]] | POP[A[D]]
IN | OUT
CBW | CWDE | CWD | CDQ
MOVSX | MOVZX

ADD | ADC
SUB | SBB
[I]MUL
[I]DIV
INC | DEC
NEG
CMP

DAA | DAS
AAA | AAS | AAM | AAD

AND | OR | XOR | NOT

SH(L|R)[D]
SA(L|R)
RO(L|R)
RC(L|R)

BT[S|R|C]
BS(F|R)
SET[N]((L|G|A|B)[E]|E|Z|S|C|O|P)
TEST

JMP
J[N]((L|G|A|B)[E]|E|Z|S|C|O|P)
J[E]CXZ
LOOP[[N](Z|E)]
CALL | RET
INT[O] | IRET
ENTER | LEAVE
BOUND

MOVS[B|W|D]
CMPS[B|W|D]
SCAS[B|W|D]
LODS[B|W|D]
STOS[B|W|D]
INS[B|W|D]
OUTS[B|W|D]
REP[[N](Z|E)]

STC | CLC | CMC
STD | CLD
STI | CLI
LAHF | SAHF
PUSHF[D] | POPF[D]

LDS | LES | LFS | LGS | LSS

LEA
NOP
UD2
XLAT[B]
CPUID

FPU:
F[I]LD
F[I]ST[P]
FBLD
FBSTP
FXCH
FCMOV[N](E|B|BE|U)

FADD[P]
FIADD
FSUB[R][P]
FISUB[R]
FMUL[P]
FIMUL
FDIV[R][P]
FIDIV[R]
FPREM[1]
FABS
FCHS
FRNDINT
FSCALE
FSQRT
FXTRACT

F[U]COM[P][P]
FICOM[P]
F[U]COMI[P]
FTST
FXAM

FSIN
FCOS
FSINCOS
FPTAN
FPATAN
F2XM1
FYL2X
FYL2XP1

FLD1
FLDZ
FLDPI
FLDL2E
FLDLN2
FLDL2T
FLDLG2

FINCSTP
FDECSTP
FFREE
F[N]INIT
F[N]CLEX
F[N]STCW
FLDCW
F[N]STENV
FLDENV
F[N]SAVE
FRSTOR
F[N]STSW
FWAIT | WAIT
FNOP

FXSAVE
FXRSTOR

SSE:
MOV(A|U)PS
MOV(H|HL|L|LH)PS
MOVSS
MOVMSKPS

ADD(P|S)S
SUB(P|S)S
MUL(P|S)S
DIV(P|S)S
RCP(P|S)S
SQRT(P|S)S
RSQRT(P|S)S
MAX(P|S)S
MIN(P|S)S

CMP(P|S)S
[U]COMISS

ANDPS
ANDNPS
ORPS
XORPS

SHUFPS
UNPCK(H|L)PS

CVTPI2PS
CVT[T]PS2PI
CVTSI2SS
CVT[T]SS2SI

PAVG(B|W)
PEXTRW
PINSRW
P(MIN|MAX)(UB|SW)
PMOVMSKB
PMULHUW
PSADBW
PSHUFW

LDMXCSR
STMXCSR

MASKMOVQ
MOVNT(Q|PS)
PREFETCHT(0|1|2)
PREFETCHNTA
SFENCE

SSE2:
MOV(A|U)PD
MOV(H|L)PD
MOVSD
MOVMSKPD

ADD(P|S)D
SUB(P|S)D
MUL(P|S)D
DIV(P|S)D
SQRT(P|S)D
MAX(P|S)D
MIN(P|S)D

CMP(P|S)D
[U]COMISD

ANDPD
ANDNPD
ORPD
XORPD

SHUFPD
UNPCK(H|L)PD

CVT(PI|DQ)2PD
CVT[T]PD2(PI|DQ)
CVTSI2SD
CVT[T]SD2SI
CVTPS2PD
CVTPD2PS
CVTDQ2PS
CVT[T]PS2DQ
CVTSS2SD
CVTSD2SS

MOVDQ(A|U)
MOVQ2DQ
MOVDQ2Q
PUNPCK(H|L)QDQ
PADDQ
PSUBQ
PMULUDQ
PSHUF(LW|HW|D)
PS(L|R)LDQ

MASKMOVDQU
MOVNT(PD|DQ|I)
CLFLUSH
LFENCE
MFENCE
PAUSE
SYSTEM MMX SSE3 SSE4
LGDT | SGDT
LLDT | SLDT
LTR | STR
LIDT | SIDT
LMSW | SMSW
CLTS
ARPL
LAR
LSL
VERR | VERW
INVD | WBINVD
INVLPG
LOCK
HLT
RSM
RDMSR | WRMSR
RDPMC
RDTSC
SYSENTER
SYSEXIT

MMX:
MOVD
MOVQ

PACKSS(WB|DW)
PACKUSWB
PUNPCK(H|L)(BW|WD|DQ)

PADD(B|W|D)
PADD(S|US)(B|W)
PSUB(B|W|D)
PSUB(S|US)(B|W)
PMUL(H|L)W
PMADDWD
PCMP(EQ|GT)(B|W|D)

PAND
PANDN
POR
PXOR

PS(L|R)L(W|D|Q)
PSRA(W|D)

EMMS

SSE3:
FISTTP

LDDQU

ADDSUBP(S|D)

HADDP(S|D)
HSUBP(S|D)

MOVS(H|L)DUP
MOVDDUP

MONITOR
MWAIT

SSE4:
PMUL(LD|DQ)
DPP(D|S)

MOVNTDQA

BLEND[V](PD|PS)
PBLEND(VB|W)

PMIN(UW|UD|SB|SD)
PMAX(UW|UD|SB|SD)

ROUND(P|S)(S|D)

EXTRACTPS
INSERTPS
PINSR(B|D|Q)
PEXTR(B|W|D|Q)
PMOV(S|Z)X(BW|BD|WD|BQ|WQ|DQ)

MPSADBW

PHMINPOSUW

PTEST

PCMPEQQ
PACKUSDW

PCMP(E|I)STR(I|M)
PCMPGTQ

CRC32
POPCNT
64-BIT MODE VIRTUAL MACHINE SSSE3 AESNI
CDQE
CMPSQ
CMPXCHG16B
LODSQ
MOVSQ
MOVZX
STOSQ
SWAPGS
SYSCALL
SYSRET

VIRTUAL MACHINE:
VMPTRLD
VMPTRST
VMCLEAR
VMREAD
VMWRITE
VMCALL
VMLAUNCH
VMRESUME
VMXOFF
VMXON
INVEPT
INVVPID

SSSE3:
PHADD(W|SW|D)
PHSUB(W|SW|D)
PABS(B|W|D)
PMADDUBSW
PMULHRSW
PSHUFB
PSIGN(B|W|D)
PALIGNR

AESNI:
AESDEC[LAST]
AESENC[LAST]
AESIMC
AESKEYGENASSIST
PCLMULQDQ

Addressing Memory

In protected mode, applications can choose a flat or segmented memory model (see the SDM Volume 1, Chapter 3 for details); in real mode only a 16-bit segmented model is available. Most programmers will only use protected mode and a flat-memory model, so that’s all we’ll discuss here.

A memory reference has four parts and is often written as

[SELECTOR : BASE + INDEX * SCALE + OFFSET]

The selector is one of the six segment registers; the base is one of the eight general purpose registers; the index is any of the general purpose registers except ESP; the scale is 1, 2, 4, or 8; and the offset is any 32-bit number. (Example: [fs:ecx+esi*8+93221].) The minimal reference consists of only a base register or only an offset; a scale can only appear if there is an index present.

Sometimes the memory reference is written like this:

selector:offset(base,index,scale)
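The address arithmetic behind a reference of the form BASE + INDEX * SCALE + OFFSET can be sketched in a few lines of Python. This is a simplified model under stated assumptions: it ignores the selector (in a flat memory model the segment base is normally zero), the register contents are made-up values, and the function name effective_address is hypothetical.

```python
def effective_address(base=0, index=0, scale=1, offset=0, bits=32):
    """Compute base + index * scale + offset, wrapped to the address width."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + offset) % (1 << bits)


# Made-up register contents for the example [ecx+esi*8+93221].
ecx, esi = 0x1000, 0x20
print(hex(effective_address(base=ecx, index=esi, scale=8, offset=93221)))  # 0x17d25

# The sum wraps around at the address width instead of growing past it.
print(hex(effective_address(base=0xFFFFFFFF, offset=1)))  # 0x0
```

The restriction of the scale to 1, 2, 4, or 8 is what makes this form convenient for indexing arrays of bytes, words, doublewords, and quadwords.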

Data Types

The data types are

Type name         Number of bits   Bit indices
Byte              8                7..0
Word              16               15..0
Doubleword        32               31..0
Quadword          64               63..0
Doublequadword    128              127..0

Little Endianness

The IA-32 is little endian, meaning the least significant bytes come first in memory. For example:

    0    12  
    1    31       byte @ 9 = 1F
    2    CB       word @ B = FE06
    3    74       word @ 6 = 230B
    4    67       word @ 1 = CB31
    5    45       dword @ A = 7AFE0636
    6    0B       qword @ 6 = 7AFE06361FA4230B
    7    23       word @ 2 = 74CB
    8    A4       qword @ 3 = 361FA4230B456774
    9    1F       dword @ 9 = FE06361F
    A    36  
    B    06  
    C    FE  
    D    7A  
    E    12  

Note that if you draw memory with the lowest bytes at the bottom, then it is easier to read these values!
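The values in the table above can be reproduced with a short Python sketch that stores the same fifteen bytes and reads multi-byte values in little-endian order. The helper name read is hypothetical; int.from_bytes is a standard Python method.

```python
# The fifteen bytes at addresses 0x0 through 0xE, in address order.
memory = bytes.fromhex("12 31 CB 74 67 45 0B 23 A4 1F 36 06 FE 7A 12")


def read(address, size):
    """Read `size` bytes starting at `address` as a little-endian unsigned integer."""
    return int.from_bytes(memory[address:address + size], byteorder="little")


print(f"{read(0x9, 1):02X}")   # 1F
print(f"{read(0xB, 2):04X}")   # FE06
print(f"{read(0xA, 4):08X}")   # 7AFE0636
print(f"{read(0x6, 8):016X}")  # 7AFE06361FA4230B
```

In each case the byte at the lowest address ends up as the least significant byte of the value, which is why the hex digits appear in the reverse of memory order.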

Flags Register

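Many instructions cause the flags register to be updated. For example, if you execute an add instruction and the sum is too big to fit into the destination register, the Carry flag is set (when the operands are viewed as unsigned) or the Overflow flag is set (when they are viewed as signed).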

    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
    1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
   +---------------------------------------------------------------+
   | | | | | | | | | | |I|V|V|A|V|R| |N| I |O|D|I|T|S|Z| |A| |P| |C|
   | | | | | | | | | | |D|I|I|C|M|F| |T| P |F|F|F|F|F|F| |F| |F| |F|
   | | | | | | | | | | | |P|F| | | | | | L | | | | | | | | | | | | |
   +---------------------------------------------------------------+

The flags are described in Section 3.4.3 of Volume 1 of the SDM. To determine how each instruction affects the flags, see Appendix A of Volume 1 of the SDM.
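An addition can fail to fit in two different senses: as an unsigned sum (reported by the Carry flag) or as a signed sum (reported by the Overflow flag). The following Python sketch simulates an add on a small register and computes four of the status flags. It is a simplified model under stated assumptions: it uses 8-bit operands by default to keep the numbers readable, it models only CF, ZF, SF, and OF, and the names add_with_flags and pack_flags are hypothetical.

```python
# Bit positions taken from the register diagram above.
FLAG_BITS = {"CF": 0, "ZF": 6, "SF": 7, "OF": 11}


def add_with_flags(a, b, bits=8):
    """Add two `bits`-wide values and report the result plus CF, ZF, SF, OF."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    total = a + b
    result = total & mask
    flags = {
        "CF": total > mask,          # unsigned sum did not fit
        "ZF": result == 0,           # result is zero
        "SF": bool(result & sign),   # top bit of the result
        # Signed overflow: operands share a sign that the result does not.
        "OF": bool(~(a ^ b) & (a ^ result) & sign),
    }
    return result, flags


def pack_flags(flags):
    """Place the modeled flags at their bit positions in the flags register."""
    return sum(1 << FLAG_BITS[name] for name, is_set in flags.items() if is_set)


print(add_with_flags(0x7F, 0x01))  # 127 + 1: signed overflow, so OF is set, CF is clear
print(add_with_flags(0xFF, 0x01))  # 255 + 1: unsigned carry, so CF and ZF are set, OF is clear
```

A real processor also updates PF and AF on an add, so the value returned by pack_flags covers only the four flags modeled here.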

The Software Developer’s Manual

The Software Developer’s Manual (SDM) contains vast amounts of important information and is required reading for all assembly language programmers and compiler backend writers. The manual is split into several volumes, all of which are available from Intel. Highlights from Volumes 1 and 2: