x86 Architecture Overview

IA-32 is the instruction set architecture (ISA) of Intel’s most successful line of 32-bit processors, and the Intel 64 ISA is its extension to 64-bit processors. (Intel 64 was actually invented by AMD, who called it x86-64.) These notes summarize a few items of interest about these two ISAs. They in no way serve as a substitute for reading Intel’s manuals.

Processor Modes

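A processor implementing the IA-32 architecture can execute in one of three modes: protected mode, real-address mode, or system management mode (SMM).

A processor implementing the Intel 64 architecture has the three above modes plus IA-32e mode, which has two submodes: compatibility mode and 64-bit mode.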

Register Set

Application programmers only care about the following registers (those in purple only exist in 64-bit processors):

[Figure: x86-64.png, diagram of the application-visible registers]

Application programmers can remain oblivious of the rest of the registers:

Instruction Set

See the Intel 64/IA-32 SDM Volume 1, Chapter 5 for a nice overview of all of the processor instructions and Volume 2 for complete information.

The following table shows most of the available instructions, using the instruction names as specified in the Intel syntax. Not every processor supports every instruction, of course.

The vertical bar means OR, the square brackets mean OPTIONAL, and parentheses are used for grouping. For example, SH(L|R)[D] stands for the four instructions SHL, SHR, SHLD, and SHRD.

INTEGER

    MOV
    CMOV[N]((L|G|A|B)[E]|E|Z|S|C|O|P)
    XCHG
    BSWAP
    XADD
    CMPXCHG[8B]
    PUSH[A[D]] | POP[A[D]]
    IN | OUT
    CBW | CWDE | CWD | CDQ
    MOVSX | MOVZX

    ADD | ADC
    SUB | SBB
    [I]MUL
    [I]DIV
    INC | DEC
    NEG
    CMP

    DAA | DAS
    AAA | AAS | AAM | AAD

    AND | OR | XOR | NOT

    SH(L|R)[D]
    SA(L|R)
    RO(L|R)
    RC(L|R)

    BT[S|R|C]
    BS(F|R)
    SET[N]((L|G|A|B)[E]|E|Z|S|C|O|P)
    TEST

    JMP
    J[N]((L|G|A|B)[E]|E|Z|S|C|O|P)
    J[E]CXZ
    LOOP[[N](Z|E)]
    CALL | RET
    INT[O] | IRET
    ENTER | LEAVE
    BOUND

    MOVS[B|W|D]
    CMPS[B|W|D]
    SCAS[B|W|D]
    LODS[B|W|D]
    STOS[B|W|D]
    INS[B|W|D]
    OUTS[B|W|D]
    REP[[N](Z|E)]

    STC | CLC | CMC
    STD | CLD
    STI | CLI
    LAHF | SAHF
    PUSHF[D] | POPF[D]

    LDS | LES | LFS | LGS | LSS

    LEA
    NOP
    UD2
    XLAT[B]
    CPUID

FPU

    F[I]LD
    F[I]ST[P]
    FBLD
    FBSTP
    FXCH
    FCMOV[N](E|B|BE|U)

    FADD[P]
    FIADD
    FSUB[R][P]
    FISUB[R]
    FMUL[P]
    FIMUL
    FDIV[R][P]
    FIDIV[R]
    FPREM[1]
    FABS
    FCHS
    FRNDINT
    FSCALE
    FSQRT
    FXTRACT

    F[U]COM[P][P]
    FICOM[P]
    F[U]COMI[P]
    FTST
    FXAM

    FSIN
    FCOS
    FSINCOS
    FPTAN
    FPATAN
    F2XM1
    FYL2X
    FYL2XP1

    FLD1
    FLDZ
    FLDPI
    FLDL2E
    FLDLN2
    FLDL2T
    FLDLG2

    FINCSTP
    FDECSTP
    FFREE
    F[N]INIT
    F[N]CLEX
    F[N]STCW
    FLDCW
    F[N]STENV
    FLDENV
    F[N]SAVE
    FRSTOR
    F[N]STSW
    FWAIT | WAIT
    FNOP

    FXSAVE
    FXRSTOR

SSE

    MOV(A|U)PS
    MOV(H|HL|L|LH)PS
    MOVSS
    MOVMSKPS

    ADD(P|S)S
    SUB(P|S)S
    MUL(P|S)S
    DIV(P|S)S
    RCP(P|S)S
    SQRT(P|S)S
    RSQRT(P|S)S
    MAX(P|S)S
    MIN(P|S)S

    CMP(P|S)S
    [U]COMISS

    ANDPS
    ANDNPS
    ORPS
    XORPS

    SHUFPS
    UNPCK(H|L)PS

    CVTPI2PS
    CVT[T]PS2PI
    CVTSI2SS
    CVT[T]SS2SI

    PAVG(B|W)
    PEXTRW
    PINSRW
    P(MIN|MAX)(UB|SW)
    PMOVMSKB
    PMULHUW
    PSADBW
    PSHUFW

    LDMXCSR
    STMXCSR

    MASKMOVQ
    MOVNT(Q|PS)
    PREFETCHT(0|1|2)
    PREFETCHNTA
    SFENCE

SSE2

    MOV(A|U)PD
    MOV(H|L)PD
    MOVSD
    MOVMSKPD

    ADD(P|S)D
    SUB(P|S)D
    MUL(P|S)D
    DIV(P|S)D
    SQRT(P|S)D
    MAX(P|S)D
    MIN(P|S)D

    CMP(P|S)D
    [U]COMISD

    ANDPD
    ANDNPD
    ORPD
    XORPD

    SHUFPD
    UNPCK(H|L)PD

    CVT(PI|DQ)2PD
    CVT[T]PD2(PI|DQ)
    CVTSI2SD
    CVT[T]SD2SI
    CVTPS2PD
    CVTPD2PS
    CVTDQ2PS
    CVT[T]PS2DQ
    CVTSS2SD
    CVTSD2SS

    MOVDQ(A|U)
    MOVQ2DQ
    MOVDQ2Q
    PUNPCK(H|L)QDQ
    PADDQ
    PSUBQ
    PMULUDQ
    PSHUF(LW|HW|D)
    PS(L|R)LDQ

    MASKMOVDQU
    MOVNT(PD|DQ|I)
    CLFLUSH
    LFENCE
    MFENCE
    PAUSE

SYSTEM

    LGDT | SGDT
    LLDT | SLDT
    LTR | STR
    LIDT | SIDT
    LMSW | SMSW
    CLTS
    ARPL
    LAR
    LSL
    VERR | VERW
    INVD | WBINVD
    INVLPG
    LOCK
    HLT
    RSM
    RDMSR | WRMSR
    RDPMC
    RDTSC
    SYSENTER
    SYSEXIT

MMX

    MOVD
    MOVQ

    PACKSS(WB|DW)
    PACKUSWB
    PUNPCK(H|L)(BW|WD|DQ)

    PADD(B|W|D)
    PADD(S|US)(B|W)
    PSUB(B|W|D)
    PSUB(S|US)(B|W)
    PMUL(H|L)W
    PMADDWD
    PCMP(EQ|GT)(B|W|D)

    PAND
    PANDN
    POR
    PXOR

    PS(L|R)L(W|D|Q)
    PSRA(W|D)

    EMMS

SSE3

    FISTTP

    LDDQU

    ADDSUBP(S|D)

    HADDP(S|D)
    HSUBP(S|D)

    MOVS(H|L)DUP
    MOVDDUP

    MONITOR
    MWAIT

SSE4

    PMUL(LD|DQ)
    DPP(D|S)

    MOVNTDQA

    BLEND[V](PD|PS)
    PBLEND(VB|W)

    PMIN(UW|UD|SB|SD)
    PMAX(UW|UD|SB|SD)

    ROUND(P|S)(S|D)

    EXTRACTPS
    INSERTPS
    PINSR(B|D|Q)
    PEXTR(B|W|D|Q)
    PMOV(S|Z)X(BW|BD|WD|BQ|WQ|DQ)

    MPSADBW

    PHMINPOSUW

    PTEST

    PCMPEQQ
    PACKUSDW

    PCMP(E|I)STR(I|M)
    PCMPGTQ

    CRC32
    POPCNT

64-BIT MODE

    CDQE
    CMPSQ
    CMPXCHG16B
    LODSQ
    MOVSQ
    MOVZX
    STOSQ
    SWAPGS
    SYSCALL
    SYSRET

VIRTUAL MACHINE

    VMPTRLD
    VMPTRST
    VMCLEAR
    VMREAD
    VMWRITE
    VMCALL
    VMLAUNCH
    VMRESUME
    VMXOFF
    VMXON
    INVEPT
    INVVPID

SSSE3

    PHADD(W|SW|D)
    PHSUB(W|SW|D)
    PABS(B|W|D)
    PMADDUBSW
    PMULHRSW
    PSHUFB
    PSIGN(B|W|D)
    PALIGNR

AESNI

    AESDEC[LAST]
    AESENC[LAST]
    AESIMC
    AESKEYGENASSIST
    PCLMULQDQ
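
The bar, bracket, and parenthesis notation used in the table can be expanded mechanically into the individual mnemonics. The following is a minimal Python sketch of such an expansion; the function name expand is purely illustrative and is not part of any Intel tool or of these notes, and the sketch assumes the pattern is well formed.

```python
def expand(pattern):
    """Return every mnemonic matched by a pattern written in the table's notation."""
    text = pattern.replace(" ", "")  # spaces around a top-level '|' carry no meaning
    pos = 0

    def alternatives():
        # alt := seq ('|' seq)*
        nonlocal pos
        results = sequence()
        while pos < len(text) and text[pos] == "|":
            pos += 1
            results += sequence()
        return results

    def sequence():
        # seq := (letter | '[' alt ']' | '(' alt ')')*
        nonlocal pos
        results = [""]
        while pos < len(text) and text[pos] not in "|])":
            ch = text[pos]
            pos += 1
            if ch in "[(":
                choices = alternatives()
                pos += 1  # skip the matching ']' or ')'
                if ch == "[":
                    choices = [""] + choices  # optional part may be absent
            else:
                choices = [ch]
            # Append every choice to every prefix built so far.
            results = [r + c for r in results for c in choices]
        return results

    return alternatives()


print(expand("SH(L|R)[D]"))   # ['SHL', 'SHLD', 'SHR', 'SHRD']
print(expand("ADD | ADC"))    # ['ADD', 'ADC']
```

Applied to J[N]((L|G|A|B)[E]|E|Z|S|C|O|P), the same function yields 28 conditional jump mnemonics, from JL through JNP.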

Addressing Memory

In protected mode, applications can choose a flat or segmented memory model (see the SDM Volume 1, Chapter 3 for details); in real mode only a 16-bit segmented model is available. Most programmers will only use protected mode and a flat-memory model, so that’s all we’ll discuss here.

A memory reference has four parts and is often written as

[SELECTOR : BASE + INDEX * SCALE + OFFSET]

The selector is one of the six segment registers; the base is one of the eight general purpose registers; the index is any of the general purpose registers except ESP; the scale is 1, 2, 4, or 8; and the offset is any 32-bit number. (Example: [fs:ecx+esi*8+93221].) The minimal reference consists of only a base register or only an offset; a scale can only appear if there is an index present.

Sometimes the memory reference is written like this:

selector:offset(base,index,scale)
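
The address arithmetic behind BASE + INDEX * SCALE + OFFSET can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name effective_address is made up for this example, registers are passed in as plain integers, and the segment base chosen by the selector is not modeled at all.

```python
def effective_address(base=0, index=0, scale=1, offset=0):
    """Compute BASE + INDEX * SCALE + OFFSET, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    # 32-bit address arithmetic wraps around modulo 2**32.
    return (base + index * scale + offset) & 0xFFFFFFFF


# [ecx+esi*8+93221] with the assumed values ecx = 0x1000 and esi = 3
print(effective_address(base=0x1000, index=3, scale=8, offset=93221))  # 97341
```

Omitting arguments gives the shorter forms: effective_address(base=0x1000) models a base-only reference and effective_address(offset=93221) an offset-only one.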

Data Types

The data types are:

    Type name         Number of bits    Bit indices
    Byte              8                 7..0
    Word              16                15..0
    Doubleword        32                31..0
    Quadword          64                63..0
    Doublequadword    128               127..0

Little Endianness

The IA-32 is little endian, meaning the least significant bytes come first in memory. For example:

    0    12  
    1    31       byte @ 9 = 1F
    2    CB       word @ B = FE06
    3    74       word @ 6 = 230B
    4    67       word @ 1 = CB31
    5    45       dword @ A = 7AFE0636
    6    0B       qword @ 6 = 7AFE06361FA4230B
    7    23       word @ 2 = 74CB
    8    A4       qword @ 3 = 361FA4230B456774
    9    1F       dword @ 9 = FE06361F
    A    36  
    B    06  
    C    FE  
    D    7A  
    E    12  

Note that if you draw memory with the lowest addresses at the bottom, then it is easier to read these values!
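
The values in the memory dump above can be checked with a short Python sketch. The helper name read is an assumption made for this example; int.from_bytes with "little" does the actual byte reordering.

```python
# The fifteen bytes at addresses 0 through E from the dump above.
memory = bytes.fromhex("12 31 CB 74 67 45 0B 23 A4 1F 36 06 FE 7A 12")


def read(address, size):
    """Read `size` bytes starting at `address` as a little-endian unsigned integer."""
    return int.from_bytes(memory[address:address + size], "little")


print(f"{read(0x9, 1):02X}")    # 1F
print(f"{read(0xB, 2):04X}")    # FE06
print(f"{read(0xA, 4):08X}")    # 7AFE0636
print(f"{read(0x6, 8):016X}")   # 7AFE06361FA4230B
```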

Flags Register

Many instructions cause the flags register to be updated. For example, if you execute an add instruction and the signed sum is too big to fit into the destination register, the Overflow flag (OF) is set; if the unsigned sum does not fit, the Carry flag (CF) is set.

    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
    1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
   +---------------------------------------------------------------+
   | | | | | | | | | | |I|V|V|A|V|R| |N| I |O|D|I|T|S|Z| |A| |P| |C|
   | | | | | | | | | | |D|I|I|C|M|F| |T| P |F|F|F|F|F|F| |F| |F| |F|
   | | | | | | | | | | | |P|F| | | | | | L | | | | | | | | | | | | |
   +---------------------------------------------------------------+

The flags are described in Section 3.4.3 of Volume 1 of the SDM. To determine how each instruction affects the flags, see Appendix A of Volume 1 of the SDM.
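
As a rough illustration of how an addition updates the status flags, the sketch below models a 32-bit ADD in Python. The function name add32 and the constant names are illustrative choices for this example, not definitions from the SDM; the bit positions follow the diagram above.

```python
# Status flag masks, using the bit positions shown in the diagram.
CF, PF, AF, ZF, SF, OF = 1 << 0, 1 << 2, 1 << 4, 1 << 6, 1 << 7, 1 << 11


def add32(a, b):
    """Add two 32-bit values and return (result, flags) the way ADD would set them."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    total = a + b
    result = total & 0xFFFFFFFF
    flags = 0
    if total > 0xFFFFFFFF:
        flags |= CF  # unsigned sum did not fit in 32 bits
    if bin(result & 0xFF).count("1") % 2 == 0:
        flags |= PF  # low byte has an even number of 1 bits
    if (a ^ b ^ result) & 0x10:
        flags |= AF  # carry out of bit 3
    if result == 0:
        flags |= ZF
    if result & 0x80000000:
        flags |= SF  # sign bit of the result
    if ~(a ^ b) & (a ^ result) & 0x80000000:
        flags |= OF  # operands share a sign that the result lacks
    return result, flags


result, flags = add32(0x7FFFFFFF, 1)
print(bool(flags & OF), bool(flags & CF))   # True False
result, flags = add32(0xFFFFFFFF, 1)
print(bool(flags & OF), bool(flags & CF))   # False True
```

The two calls show the difference between the flags: the first sum overflows as a signed value but not as an unsigned one, and the second does the opposite.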

Exceptions

Sometimes an exception occurs while an instruction is executing. There are three types of exceptions: faults, traps, and aborts.

When exceptions occur, the processor will start executing code in an exception handler associated with that exception. The predefined exceptions are:

GENERAL EXCEPTIONS

    Vector  Mnemonic  Name                      Type           Source
    0       #DE       Divide Error              fault          DIV or IDIV instruction
    1       #DB       Debug                     fault or trap
    ...
    3       #BP       Breakpoint                trap           INT 3 instruction
    4       #OF       Overflow                  trap           INTO instruction executed when overflow flag in EFLAGS is set
    5       #BR       Bound Range Exceeded      fault          BOUND instruction
    6       #UD       Undefined Opcode          fault          UD2 instruction, or attempt to execute an opcode that doesn’t correspond to any instruction
    7       #NM       Device Not Available      fault          FPU instruction or WAIT instruction on a processor without an FPU that is not linked to an FPU coprocessor
    8       #DF       Double Fault              abort          Exception occurred during an exception handler
    10      #TS       Invalid TSS               fault          Task switch or implicit TSS access
    11      #NP       Not Present               fault          Loading segment registers or accessing system segments
    12      #SS       Stack Segment Fault       fault          Stack operations and SS register loads
    13      #GP       General Protection Fault  fault          Any memory reference and other protection checks
    14      #PF       Page Fault                fault          Any memory reference
    16      #MF       FPU Math Fault            fault          Any FPU instruction
                                                                 #IS - FPU stack overflow
                                                                 #IA - Invalid arithmetic operation
                                                                 #Z - Divide by zero
                                                                 #D - Source operand is a denormal number
                                                                 #O - Overflow in result
                                                                 #U - Underflow in result
                                                                 #P - Inexact result
    17      #AC       Alignment Check           fault          Any data reference in memory
    18      #MC       Machine Check             abort          Internal error or bus error
    19      #XF       SIMD Math Fault           fault          Any SIMD instruction
                                                                 #I - Invalid arithmetic operation or source operand
                                                                 #Z - Divide by zero
                                                                 #D - Source operand is a denormal number
                                                                 #O - Overflow in result
                                                                 #U - Underflow in result
                                                                 #P - Inexact result

The Software Developer’s Manual

The Software Developer’s Manual (SDM) contains vast amounts of important information and is required reading for all assembly language programmers. The manual is split into several volumes, all of which are available from Intel’s website. Highlights from Volumes 1 and 2: