sandpile.org -- x86 architecture -- opcode encodings

x86 architecture
opcode encodings

upper case letters for operand types

DREX	MVEX EVEX	VEX		mod R/M
dst	kkk	imm.ssss_s'	v'_vvvv	reg	r/m	r \| m

						Mem	mod != 11
			B	G	E	R	GPR
				P	Q	(PR) N (PR)	MMX
VD		L	H	V	W	(VR) U (VR)	vector
			vT	rT		mT	tile
	{K}		vK	rK		mK	mask
				rB		mB	bounds

				Sreg		ST(i)	x87
Address	Offset			Creg
X = [DS:rSI]	Y = [ES:rDI]			Dreg		Flags
Imm	Jmp			Treg		only Z is left

lower case letters for operand sizes

1	2	4	8	16	32	64

byte	word	dword	qword	oword	yword	zword

				x = oword or yword			[o,y]
					upper = yword or zword		[y,z]
				normal = oword or yword or zword			[o,y,z]
			half = qword or oword or yword				[o.q,o,y]
		fourth = dword or qword or oword					o.[d,q,o]
	eighth = word or dword or qword						o.[w,d,q]

only c, g, m, i, j, k, l, and r, s are left	var-size = word or dword or qword
	zero-ext = word or dword or dword
		y = dword or qword			p = w : [z\|v\|y]	a = z : z	t = {eh}
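
The integer size codes can be read as a lookup keyed by the effective operand size. The following is only a sketch in Python. It assumes that "var-size" is the code v, that "zero-ext" is the code z, and that y stays a dword unless the operand size is 64-bit. The names `SIZE_BYTES` and `operand_bytes` are illustrative and not part of the tables.

```python
# Size in bytes per integer size code, keyed by effective operand size.
# Assumptions: "var-size" is code v, "zero-ext" is code z, and y is a
# dword for 16-bit and 32-bit operand sizes.
SIZE_BYTES = {
    "v": {16: 2, 32: 4, 64: 8},  # word, dword, qword
    "z": {16: 2, 32: 4, 64: 4},  # word, dword, dword
    "y": {16: 4, 32: 4, 64: 8},  # dword, dword, qword
}


def operand_bytes(code: str, operand_size: int) -> int:
    """Return the operand width in bytes for a size code and operand size."""
    return SIZE_BYTES[code][operand_size]
```

For instance, `operand_bytes("z", 64)` yields 4, which is why an Iz immediate stays 32 bits wide under a 64-bit operand size.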

The overlap between the integer z and y and the vector y and z could be eliminated. This can be achieved by renaming {v,z,y} to {i,j,k} or to {int,short,long} or to {i,r,s}. However, each of these combinations is -- in its own way -- challenging to memorize. And introducing lower and upper on the integer side would conflict with the vector side. That said, this cleanup does not seem to be worthwhile.

On the vector side, oword could be renamed to xword, to better match yword and zword. This would then require that xword be renamed, e.g. to lower, which would match upper. However, the lower-case letter l is hard to visually distinguish from the upper-case letter I. Perhaps c could be used instead of l, providing a mental reference to XY Chromosomes? That said, this cleanup does not seem to be worthwhile.

colors and annotations

#004080 MAIN	#808080 reserved	#B0B0B0 internal	#FFFFFF regular	default #808080	<== primary vs secondary ==>	#F0F0F0 blanks	basic table layout
	#008080 reserved	#80A0A0 internal	#B0D0D0 regular	x86-64 #008080		#E0E0E0 spacers
	#0080C0 reserved	#80A0C0 internal	#B0D0F0 regular	APX #0080C0		#E0C0C0 invalid in PM64	opcode table cells
	#804040 reserved	#AA8888 internal	#D0B0B0 regular	special #804040		#E0E0C0 defaults to O64

special	AVX512 predates APX (Therefore X and v' are used instead of B' and X' to encode the 4th bits of the R/M and INDEX fields for vector registers.)
inverted	bits are stored inverted

...^#mod	mod can be 0...3
...^#reg	reg can be 0...7
...^#r/m	r/m can be 0...7
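
The three ranges above follow from the 2+3+3 bit split of the mod R/M byte. A minimal Python sketch of that split (the function name `split_modrm` is an assumption made for illustration):

```python
def split_modrm(byte: int) -> tuple[int, int, int]:
    """Split a mod R/M byte into (mod, reg, r/m).

    mod is bits 7..6 (0...3), reg is bits 5..3 (0...7),
    r/m is bits 2..0 (0...7).
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("mod R/M must be a single byte")
    return byte >> 6, (byte >> 3) & 7, byte & 7
```

With this split, 84h gives mod=2, reg=0, r/m=4, and 8Ch gives mod=2, reg=1, r/m=4.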

...^v	only VEX-encoded form exists

...^I64	invalid in PM64
...^D64	defaults to O64 in PM64; 66h results in O16 (implicit RSP references)
...^Df64	defaults to O64 in PM64; 66h results in O16 in AMD64 but is ignored in EM64T (near branches)
...^Dv64	defaults to O64 in PM64; 66h results in #UD due to VEX encoding (near branches with masks)
...^F64	defaults to O64 in PM64; 66h is ignored in AMD64 and EM64T (GDTR/IDTR and CRx/DRx/TRx)
...^F64	defaults to O64 in PM64; 66h is ignored due to REX2.W1 (PUSHP/POPP)
...^F64	defaults to O64 in PM64; 66h is impossible due to EVEX.pp=00b (PUSH2(P)/POP2(P))

...+n	one register explicitly specified, and another n consecutive registers implied

...	reg op must be distinct from other reg op(s) (VSIB, AMX DP/CPLX/MUL, A512FP16 ADDC/MULC dst)

...scc	EVEX.v'aaa = source condition code (CCMPscc and CTESTscc) (cc -- with True and False instead of Parity and No Parity)
...,dfv	EVEX.vvvv = default flag value OSZC (CCMPscc and CTESTscc) (set {OF,SF,ZF,CF} to OSZC, as well as PF=C and AF=0)
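
The dfv rule can be sketched as a small expansion function. This Python example assumes the 4-bit value has already been extracted from EVEX.vvvv and that its bits read O, S, Z, C from most to least significant. The function name `dfv_flags` is hypothetical.

```python
def dfv_flags(oszc: int) -> dict[str, int]:
    """Expand a 4-bit default flag value into the six status flags.

    Assumed bit order, MSB to LSB: O, S, Z, C.
    PF copies C and AF is cleared, as stated in the annotation above.
    """
    if not 0 <= oszc <= 0xF:
        raise ValueError("dfv is a 4-bit value")
    o = (oszc >> 3) & 1
    s = (oszc >> 2) & 1
    z = (oszc >> 1) & 1
    c = oszc & 1
    return {"OF": o, "SF": s, "ZF": z, "CF": c, "PF": c, "AF": 0}
```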

{NF} ...	EVEX.NF controls status flag update suppression { ANDN, BLSR, BLSMSK, BLSI, BZHI, BEXTR }, { CFCMOVcc (Bv,)Ev,Gv }, and { INC/DEC, NEG, ADD/SUB, MUL/DIV/IMUL/IDIV, AND/OR/XOR, SAL/SAR/SHL/SHR, ROL/ROR, SHLD/SHRD, LZCNT/TZCNT/POPCNT }
{ND} ...	EVEX.ND controls NDD new data destination register Bq (with new zero-upper behavior instead of old preserve-upper behavior) { INC/DEC, NOT/NEG, ADD/SUB/ADC/SBB, AND/OR/XOR, SAL/SAR/SHL/SHR, RCL/RCR, ROL/ROR, SHLD/SHRD, ADCX/ADOX, (CF)CMOVcc, IMUL @ AFh }
{ZU} ...	EVEX.ND controls existing destination register to widen it (with new zero-upper behavior instead of old preserve-upper behavior) { SETcc Eb with mod=11b (i.e. SET Rb becomes SET Rq, whereas SET Mb remains SET Mb), IMUL @ 69h/6Bh (i.e. IMUL Gv,Ev,Ib/Iz becomes IMUL Gq,Ev,Ib/Iz) }

instruction encodings

Fundamentally, x86 operands are little-endian and take the form {transform} ( op {hint} ) {mask}, arranged in dst,src[,...] order.

L1OM = {sss,ccccc} ( op {nt} ) {kkk}

K1OM = {sss} ( op {eh} ) {kkk}

1 byte opcodes and 2 byte opcodes

instruction prefix(es) ^#1	opcode byte(s)	mod R/M byte (16-bit, 32-bit)	SIB byte	displacement	immediate ^{#2, #3}
SEG, REP, LOCK, 66h, 67h	0Fh	xxh	byte/word/dword

3 byte opcodes, including DREX

instruction prefix(es)	opcode bytes	mod R/M byte (16-bit, 32-bit)	SIB byte	displacement	imm
SEG, REP, LOCK, 66h, 67h	0Fh	38h	xxh	byte/word/dword
3Ah	byte
7Ah	xxh
7Bh	byte

VEX2 (C5h), VEX3 (C4h), and XOP (8Fh)

instruction prefix(es) ^#4	opcode byte	mod R/M byte (16-bit, 32-bit)	SIB byte	displacement	imm
SEG, 67h	C5h	VEX	xxh	byte/word/dword	byte
SEG, 67h	C4h	VEX1	VEX2	xxh
SEG, 67h	8Fh	XOP1	XOP2	xxh	byte/word/dword	byte	byte/word/dword ^#1

L1OM misc (62h) and L1OM vector (D6h)

instruction prefix(es)	opcode byte	mod R/M byte (16-bit, 32-bit)	displacement and immediate
SEG, REP, LOCK, 66h, 67h	62h	XBRW and oooo	word	byte	SIB byte	byte/word/dword
SEG, REP, LOCK, 66h, 67h	D6h	NT, sss, and R'R B'B	v or c and kkk	xxh	byte/word/dword	word

MVEX (62h) and EVEX (62h)

instruction prefix(es) ^#4	opcode byte	mod R/M byte (16-bit, 32-bit)	SIB byte	displacement	imm
SEG, 67h	62h	MVEX1	MVEX2	MVEX3	xxh	byte/word/dword	byte
EVEX1	EVEX2	EVEX3

DREX

instruction prefix(es) ^#5	opcode bytes	mod R/M byte (16-bit, 32-bit)	SIB byte	DREX byte	displacement	imm
SEG, REP, LOCK, 66h, 67h	0Fh	24h	xxh	byte/word/dword
25h	byte

notes

descriptions

#1	In some cases it is possible to encode valid instructions that exceed the traditional 15-byte length limit. For example:

  ; 16-bit mode
  F2 F0 36 66 67 81 84 24 disp32 imm32 = xacquire lock add [ss:esp*1+disp32],imm32
  F3 F0 36 66 67 81 84 24 disp32 imm32 = xrelease lock add [ss:esp*1+disp32],imm32

  ; 16-bit mode
  36 67 8F EA 78 12 84 24 disp32 imm32 = lwpins eax,[ss:esp*1+disp32],imm32
  36 67 8F EA 78 12 8C 24 disp32 imm32 = lwpval eax,[ss:esp*1+disp32],imm32
  36 67 8F EA 78 10 84 24 disp32 imm32 = bextr eax,[ss:esp*1+disp32],imm32

  ; 64-bit mode
  64 67 8F EA F8 12 84 18 disp32 imm32 = lwpins rax,[fs:eax+ebx+disp32],imm32
  64 67 8F EA F8 12 8C 18 disp32 imm32 = lwpval rax,[fs:eax+ebx+disp32],imm32
  64 67 8F EA F8 10 84 18 disp32 imm32 = bextr rax,[fs:eax+ebx+disp32],imm32

It is up to the user to avoid these cases (and the resulting #GP exception).
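
A quick way to spot such cases is to count the bytes of a listing before emitting it. The Python sketch below assumes listings in the notation used above: hex bytes plus symbolic fields such as disp32 and imm32. The names `FIELD_BYTES`, `encoding_length`, and `exceeds_limit` are illustrative only.

```python
MAX_INSN_LEN = 15  # traditional x86 instruction length limit, in bytes

# Byte widths of the symbolic fields used in the listings above.
FIELD_BYTES = {
    "disp8": 1, "disp16": 2, "disp32": 4,
    "imm8": 1, "imm16": 2, "imm32": 4,
}


def encoding_length(listing: str) -> int:
    """Count the bytes in a listing such as '36 67 8F EA ... disp32 imm32'."""
    total = 0
    for token in listing.split():
        if token in FIELD_BYTES:
            total += FIELD_BYTES[token]
            continue
        if len(token) != 2:
            raise ValueError(f"not a hex byte or known field: {token!r}")
        int(token, 16)  # raises ValueError if the token is not hexadecimal
        total += 1
    return total


def exceeds_limit(listing: str) -> bool:
    """Return True if the encoding is longer than the 15-byte limit."""
    return encoding_length(listing) > MAX_INSN_LEN
```

Each of the listings above counts 8 explicit bytes plus a 4-byte displacement and a 4-byte immediate, i.e. 16 bytes, one more than the limit.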

#2	Most 3DNow! instructions use the immediate byte as a third opcode byte.

#3	Some SSE/SSE2 instructions use the immediate byte as a condition code.

#4	The use of a REPE, REPNE, 66h, or REX prefix will result in a #UD exception.

#5	The use of a REX prefix will result in a #UD exception. The DREX byte is used instead.

byte encodings

type

mod R/M
and SIB

mod

reg

r/m

scale

index

base

REX

0100b = 4xh

the REX prefix must immediately precede the opcode byte(s)

DREX

dst (VD)

OC0

OC0=0 for reg,r/m
OC0=1 for r/m,reg

outside PM64, DREX.[DRXB] are silently ignored if set to 1

L1OM (misc)

0110_0010b = 62h

oooo

L1OM (vector)

1101_0110b = D6h

NT

sss

vvvvv

ccccc

kkk

VEX2

1100_0101b = C5h

vvvv

mmmmm = 00001b is implied for VEX2

VEX3

1100_0100b = C4h

mmmmm

vvvv

XOP

1000_1111b = 8Fh

mmmmm

vvvv

MVEX

0110_0010b = 62h

mmmm

vvvv

sss

kkk

EVEX

0110_0010b = 62h

mmm

vvvv

aaa

REX2

1101_0101b = D5h

the REX2 prefix must immediately precede the opcode byte(s)

EVEX

0110_0010b = 62h

mmm

vvvv

aaa

EVEX MAP4

0110_0010b = 62h

1 0 0

vvvv

ND

NF

EVEX scc / dfv

0110_0010b = 62h

1 0 0

dfv OF

dfv SF

dfv ZF

dfv CF

scc C3

scc C2

scc C1

scc C0

fields

descriptions

inverted

bits are stored inverted

W

0 = default operand size
1 = 64-bit operand size

field	4	3	2 1 0	register type	register use

REG	R'	R	reg	GPR or vector	src or dst
VVV	v'	v	vvv	GPR or vector	src or dst
R/M	B'	B	r/m	GPR	src or dst
R/M	X	B	r/m	vector	src or dst

BASE	B'	B	r/m	GPR	memory reference
INDEX	X'	X	index	GPR	memory reference
INDEX	v'	X	index	vector	VSIB memory ref.

note	AVX512 predates APX.			Therefore X and v' are used instead of B' and X'.
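
The table reads as a recipe for assembling a 5-bit register number: column 4 supplies bit 4, column 3 supplies bit 3, and the 3-bit field supplies bits 2..0. A minimal Python sketch follows. The function name `reg_number` and the `inverted` switch are assumptions; the switch covers the case where the extension bits are stored inverted and must be flipped first.

```python
def reg_number(bit4: int, bit3: int, low3: int, inverted: bool = True) -> int:
    """Combine two extension bits and a 3-bit field into a register number.

    bit4 and bit3 are the raw stored bits (e.g. R' and R for the REG field).
    If inverted is True, both bits are flipped before use.
    """
    if bit4 not in (0, 1) or bit3 not in (0, 1):
        raise ValueError("extension bits must be 0 or 1")
    if not 0 <= low3 <= 7:
        raise ValueError("the low field is 3 bits wide")
    if inverted:
        bit4 ^= 1
        bit3 ^= 1
    return (bit4 << 4) | (bit3 << 3) | low3
```

For example, stored bits R'=0 and R=1 with reg=101b select register 21 when the bits are stored inverted, while the same call with `inverted=False` would select register 13.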

R' and R

mod R/M byte reg field extension

B

mod R/M byte r/m field extension
SIB byte base field extension
opcode byte reg field extension

X

SIB byte index field extension
also:
mod R/M byte r/m field extension (SIB-less MVEX/EVEX)

v'

VEX prefix vvvv field extension
also:
SIB byte index field extension (VSIB with MVEX/EVEX)

B'

mod R/M byte r/m field extension
SIB byte base field extension
opcode byte reg field extension

X'

SIB byte index field extension

mmmmm

00000b (1-byte map MAP0), 00001b (2-byte map MAP1), 00010b (0Fh,38h MAP2), 00011b (0Fh,3Ah MAP3)
00100b (MAP4 incl. groups), 00101b (2-byte map MAP5), 00110b (0Fh,38h MAP6), 00111b (0Fh,3Ah MAP7)

01000b (XOP 8 with imm), 01001b (XOP 9 without imm), 01010b (XOP A with imm)

pp

00b (none), 01b (66h), 10b (F3h), 11b (F2h)

LL or RC

00b (128-bit), 01b (256-bit), 10b (512-bit), 11b (reserved) or 00b (RN), 01b (RD), 10b (RU), 11b (RZ)

U and X'

first, the bit indicated 0=MVEX (aka K1OM) versus 1=EVEX (aka AVX512) (swizzling replaced by vector length) (eviction hint replaced by zeroing)
then, the bit indicated 1=512-bit (AVX512) versus 0=256-bit (AVX10.2) for {er} and {sae} variants (mod=11b && b=1) (leading bold !!)
now, the bit also serves (with inverted polarity) as X' together with X, to support the full 32 GPRs (mod<>11b) (and so MVEX is gone)
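
To tie the field values above to actual bytes, here is a hedged Python sketch that decodes the two payload bytes following C4h (VEX3) or 8Fh (XOP). The bit layout is an assumption stated in the code (first byte R X B mmmmm, second byte W vvvv L pp, with R, X, B, and vvvv stored inverted); it is the commonly documented layout, not something spelled out in the tables here. The names `Vex3Fields`, `PP_PREFIX`, and `decode_vex3` are illustrative.

```python
from typing import NamedTuple


class Vex3Fields(NamedTuple):
    r: int
    x: int
    b: int
    mmmmm: int
    w: int
    vvvv: int
    l: int
    pp: int


# pp value -> implied legacy prefix byte (None means no prefix)
PP_PREFIX = {0b00: None, 0b01: 0x66, 0b10: 0xF3, 0b11: 0xF2}


def decode_vex3(p1: int, p2: int) -> Vex3Fields:
    """Decode the two payload bytes after C4h (VEX3) or 8Fh (XOP).

    Assumed layout: p1 = R X B mmmmm, p2 = W vvvv L pp.
    R, X, B, and vvvv are assumed to be stored inverted; the returned
    values are already un-inverted.
    """
    return Vex3Fields(
        r=((p1 >> 7) & 1) ^ 1,
        x=((p1 >> 6) & 1) ^ 1,
        b=((p1 >> 5) & 1) ^ 1,
        mmmmm=p1 & 0x1F,
        w=(p2 >> 7) & 1,
        vvvv=((p2 >> 3) & 0xF) ^ 0xF,
        l=(p2 >> 2) & 1,
        pp=p2 & 3,
    )
```

Applied to the payload bytes EAh 78h from the lwpins listing in the notes, this yields mmmmm=01010b (XOP A with imm), W=0, vvvv=0, and pp=00b. The 64-bit variant EAh F8h differs only in W=1.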