PART IV INSTRUCTION SET

Chapter 17 80386 Instruction Set




Chapter 17  80386 Instruction Set

----------------------------------------------------------------------------

This chapter presents instructions for the 80386 in alphabetical order. For
each instruction, the forms are given for each operand combination,
including object code produced, operands required, execution time, and a
description. For each instruction, there is an operational description and a
summary of exceptions generated.


17.1 Operand-Size and Address-Size Attributes



17.1  Operand-Size and Address-Size Attributes

When executing an instruction, the 80386 can address memory using either 16
or 32-bit addresses. Consequently, each instruction that uses memory
addresses has associated with it an address-size attribute of either 16 or
32 bits. 16-bit addresses imply both the use of a 16-bit displacement in
the instruction and the generation of a 16-bit address offset (segment
relative address) as the result of the effective address calculation.
32-bit addresses imply the use of a 32-bit displacement and the generation
of a 32-bit address offset. Similarly, an instruction that accesses words
(16 bits) or doublewords (32 bits) has an operand-size attribute of either
16 or 32 bits.

The attributes are determined by a combination of defaults, instruction
prefixes, and (for programs executing in protected mode) size-specification
bits in segment descriptors.


17.1.1 Default Segment Attribute



17.1.1  Default Segment Attribute

For programs executed in protected mode, the D-bit in executable-segment
descriptors determines the default attribute for both address size and
operand size. These default attributes apply to the execution of all
instructions in the segment. A value of zero in the D-bit sets the default
address size and operand size to 16 bits; a value of one, to 32 bits.

Programs that execute in real mode or virtual-8086 mode have 16-bit
addresses and operands by default.


17.1.2 Operand-Size and Address-Size Instruction Prefixes



17.1.2  Operand-Size and Address-Size Instruction Prefixes

The internal encoding of an instruction can include two byte-long prefixes:
the address-size prefix, 67H, and the operand-size prefix, 66H. (A later
section, "Instruction Format," shows the position of the prefixes in an
instruction's encoding.) These prefixes override the default segment
attributes for the instruction that follows. Table 17-1 shows the effect of
each possible combination of defaults and overrides.


See Also: Tab.17-1 

17.1.3 Address-Size Attribute for Stack



17.1.3  Address-Size Attribute for Stack

Instructions that use the stack implicitly (for example: POP EAX also have
a stack address-size attribute of either 16 or 32 bits. Instructions with a
stack address-size attribute of 16 use the 16-bit SP stack pointer register;
instructions with a stack address-size attribute of 32 bits use the 32-bit
ESP register to form the address of the top of the stack.

The stack address-size attribute is controlled by the B-bit of the
data-segment descriptor in the SS register. A value of zero in the B-bit
selects a stack address-size attribute of 16; a value of one selects a stack
address-size attribute of 32.

See Also: Tab.17-1 

17.2 Instruction Format



17.2  Instruction Format

All instruction encodings are subsets of the general instruction format
shown in Figure 17-1. Instructions consist of optional instruction
prefixes, one or two primary opcode bytes, possibly an address specifier
consisting of the ModR/M byte and the SIB (Scale Index Base) byte, a
displacement, if required, and an immediate data field, if required.

Smaller encoding fields can be defined within the primary opcode or
opcodes. These fields define the direction of the operation, the size of the
displacements, the register encoding, or sign extension; encoding fields
vary depending on the class of operation.

Most instructions that can refer to an operand in memory have an addressing
form byte following the primary opcode byte(s). This byte, called the ModR/M
byte, specifies the address form to be used. Certain encodings of the ModR/M
byte indicate a second addressing byte, the SIB (Scale Index Base) byte,
which follows the ModR/M byte and is required to fully specify the
addressing form.

Addressing forms can include a displacement immediately following either
the ModR/M or SIB byte. If a displacement is present, it can be 8-, 16- or
32-bits.

If the instruction specifies an immediate operand, the immediate operand
always follows any displacement bytes. The immediate operand, if specified,
is always the last field of the instruction.

The following are the allowable instruction prefix codes:

   F3H    REP prefix (used only with string instructions)
   F3H    REPE/REPZ prefix (used only with string instructions
   F2H    REPNE/REPNZ prefix (used only with string instructions)
   F0H    LOCK prefix

The following are the segment override prefixes:

   2EH    CS segment override prefix
   36H    SS segment override prefix
   3EH    DS segment override prefix
   26H    ES segment override prefix
   64H    FS segment override prefix
   65H    GS segment override prefix
   66H    Operand-size override
   67H    Address-size override

See Also: Fig.17-1 


17.2.1 ModR/M and SIB Bytes



17.2.1  ModR/M and SIB Bytes

The ModR/M and SIB bytes follow the opcode byte(s) in many of the 80386
instructions. They contain the following information:

  *  The indexing type or register number to be used in the instruction
  *  The register to be used, or more information to select the instruction
  *  The base, index, and scale information

The ModR/M byte contains three fields of information:

  *  The mod field, which occupies the two most significant bits of the 
     byte, combines with the r/m field to form 32 possible values: eight
     registers and 24 indexing modes

  *  The reg field, which occupies the next three bits following the mod
     field, specifies either a register number or three more bits of opcode
     information. The meaning of the reg field is determined by the first
     (opcode) byte of the instruction.

  *  The r/m field, which occupies the three least significant bits of the
     byte, can specify a register as the location of an operand, or can form
     part of the addressing-mode encoding in combination with the field as
     described above

The based indexed and scaled indexed forms of 32-bit addressing require the
SIB byte. The presence of the SIB byte is indicated by certain encodings of
the ModR/M byte. The SIB byte then includes the following fields:

  *  The ss field, which occupies the two most significant bits of the
     byte, specifies the scale factor

  *  The index field, which occupies the next three bits following the ss
     field and specifies the register number of the index register

  *  The base field, which occupies the three least significant bits of the
     byte, specifies the register number of the base register

Figure 17-2 shows the formats of the ModR/M and SIB bytes.

The values and the corresponding addressing forms of the ModR/M and SIB
bytes are shown in Tables 17-2, 17-3, and 17-4. The 16-bit addressing
forms specified by the ModR/M byte are in Table 17-2. The 32-bit addressing
forms specified by ModR/M are in Table 17-3. Table 17-4 shows the 32-bit
addressing forms specified by the SIB byte

See Also: Tab.17-4 


17.2.2 How to Read the Instruction Set Pages



17.2.2  How to Read the Instruction Set Pages

The following is an example of the format used for each 80386 instruction
description in this chapter:

CMC -- Complement Carry Flag

Opcode   Instruction         Clocks      Description

F5        CMC                  2            Complement carry flag

The above table is followed by paragraphs labelled "Operation,"
"Description," "Flags Affected," "Protected Mode Exceptions," "Real
Address Mode Exceptions," and, optionally, "Notes." The following sections
explain the notational conventions and abbreviations used in these
paragraphs of the instruction descriptions.


17.2.2.1 Opcode



17.2.2.1  Opcode

The "Opcode" column gives the complete object code produced for each form
of the instruction. When possible, the codes are given as hexadecimal bytes,
in the same order in which they appear in memory. Definitions of entries
other than hexadecimal bytes are as follows:

/digit: (digit is between 0 and 7) indicates that the ModR/M byte of the
instruction uses only the r/m (register or memory) operand. The reg field
contains the digit that provides an extension to the instruction's opcode.

/r: indicates that the ModR/M byte of the instruction contains both a
register operand and an r/m operand.

cb, cw, cd, cp: a 1-byte (cb), 2-byte (cw), 4-byte (cd) or 6-byte (cp)
value following the opcode that is used to specify a code offset and
possibly a new value for the code segment register.

ib, iw, id: a 1-byte (ib), 2-byte (iw), or 4-byte (id) immediate operand to
the instruction that follows the opcode, ModR/M bytes or scale-indexing
bytes. The opcode determines if the operand is a signed value. All words and
doublewords are given with the low-order byte first.

+rb, +rw, +rd: a register code, from 0 through 7, added to the hexadecimal
byte given at the left of the plus sign to form a single opcode byte. The
codes are--

      rb         rw         rd
    AL = 0     AX = 0     EAX = 0
    CL = 1     CX = 1     ECX = 1
    DL = 2     DX = 2     EDX = 2
    BL = 3     BX = 3     EBX = 3
    AH = 4     SP = 4     ESP = 4
    CH = 5     BP = 5     EBP = 5
    DH = 6     SI = 6     ESI = 6
    BH = 7     DI = 7     EDI = 7


17.2.2.2 Instruction



17.2.2.2  Instruction

The "Instruction" column gives the syntax of the instruction statement as
it would appear in an ASM386 program. The following is a list of the symbols
used to represent operands in the instruction statements:

rel8: a relative address in the range from 128 bytes before the end of the
instruction to 127 bytes after the end of the instruction.

rel16, rel32: a relative address within the same code segment as the
instruction assembled. rel16 applies to instructions with an operand-size
attribute of 16 bits; rel32 applies to instructions with an operand-size
attribute of 32 bits.

ptr16:16, ptr16:32: a FAR pointer, typically in a code segment different
from that of the instruction. The notation 16:16 indicates that the value of
the pointer has two parts. The value to the right of the colon is a 16-bit
selector or value destined for the code segment register. The value to the
left corresponds to the offset within the destination segment. ptr16:16 is
used when the instruction's operand-size attribute is 16 bits; ptr16:32 is
used with the 32-bit attribute.

r8: one of the byte registers AL, CL, DL, BL, AH, CH, DH, or BH.

r16: one of the word registers AX, CX, DX, BX, SP, BP, SI, or DI.

r32: one of the doubleword registers EAX, ECX, EDX, EBX, ESP, EBP, ESI, or
EDI.

imm8: an immediate byte value. imm8 is a signed number between -128 and
+127 inclusive. For instructions in which imm8 is combined with a word or
doubleword operand, the immediate value is sign-extended to form a word or
doubleword. The upper byte of the word is filled with the topmost bit of the
immediate value.

imm16: an immediate word value used for instructions whose operand-size
attribute is 16 bits. This is a number between -32768 and +32767 inclusive.

imm32: an immediate doubleword value used for instructions whose
operand-size attribute is 32-bits. It allows the use of a number between
+2147483647 and -2147483648.

r/m8: a one-byte operand that is either the contents of a byte register
(AL, BL, CL, DL, AH, BH, CH, DH), or a byte from memory.

r/m16: a word register or memory operand used for instructions whose
operand-size attribute is 16 bits. The word registers are: AX, BX, CX, DX,
SP, BP, SI, DI. The contents of memory are found at the address provided by
the effective address computation.

r/m32: a doubleword register or memory operand used for instructions whose
operand-size attribute is 32-bits. The doubleword registers are: EAX, EBX,
ECX, EDX, ESP, EBP, ESI, EDI. The contents of memory are found at the
address provided by the effective address computation.

m8: a memory byte addressed by DS:SI or ES:DI (used only by string
instructions).

m16: a memory word addressed by DS:SI or ES:DI (used only by string
instructions).

m32: a memory doubleword addressed by DS:SI or ES:DI (used only by string
instructions).

m16:16, M16:32: a memory operand containing a far pointer composed of two
numbers. The number to the left of the colon corresponds to the pointer's
segment selector. The number to the right corresponds to its offset.

m16 & 32, m16 & 16, m32 & 32: a memory operand consisting of data item pairs
whose sizes are indicated on the left and the right side of the ampersand.
All memory addressing modes are allowed. m16 & 16 and m32 & 32 operands are
used by the BOUND instruction to provide an operand containing an upper and
lower bounds for array indices. m16 & 32 is used by LIDT and LGDT to
provide a word with which to load the limit field, and a doubleword with
which to load the base field of the corresponding Global and Interrupt
Descriptor Table Registers.

moffs8, moffs16, moffs32: (memory offset) a simple memory variable of type
BYTE, WORD, or DWORD used by some variants of the MOV instruction. The
actual address is given by a simple offset relative to the segment base. No
ModR/M byte is used in the instruction. The number shown with moffs
indicates its size, which is determined by the address-size attribute of the
instruction.

Sreg: a segment register. The segment register bit assignments are ES=0,
CS=1, SS=2, DS=3, FS=4, and GS=5.


17.2.2.3 Clocks



17.2.2.3  Clocks

The "Clocks" column gives the number of clock cycles the instruction takes
to execute. The clock count calculations makes the following assumptions:

  *  The instruction has been prefetched and decoded and is ready for
     execution.

  *  Bus cycles do not require wait states.

  *  There are no local bus HOLD requests delaying processor access to the
     bus.

  *  No exceptions are detected during instruction execution.

  *  Memory operands are aligned.

Clock counts for instructions that have an r/m (register or memory) operand
are separated by a slash. The count to the left is used for a register
operand; the count to the right is used for a memory operand.

The following symbols are used in the clock count specifications:

  *  n, which represents a number of repetitions.

  *  m, which represents the number of components in the next instruction
     executed, where the entire displacement (if any) counts as one
     component, the entire immediate data (if any) counts as one component,
     and every other byte of the instruction and prefix(es) each counts as
     one component.

  *  pm=, a clock count that applies when the instruction executes in
     Protected Mode. pm= is not given when the clock counts are the same for
     Protected and Real Address Modes.

When an exception occurs during the execution of an instruction and the
exception handler is in another task, the instruction execution time is
increased by the number of clocks to effect a task switch. This parameter
depends on several factors:

  *  The type of TSS used to represent the current task (386 TSS or 286
     TSS).

  *  The type of TSS used to represent the new task.

  *  Whether the current task is in V86 mode.

  *  Whether the new task is in V86 mode.

Table 17-5 summarizes the task switch times for exceptions.

See Also: Tab.17-5 

17.2.2.4 Description



17.2.2.4  Description

The "Description" column following the "Clocks" column briefly explains the
various forms of the instruction. The "Operation" and "Description" sections
contain more details of the instruction's operation.


17.2.2.5 Operation




17.2.2.5  Operation

The "Operation" section contains an algorithmic description of the
instruction which uses a notation similar to the Algol or Pascal language.
The algorithms are composed of the following elements:

Comments are enclosed within the symbol pairs "(*" and "*)".

Compound statements are enclosed between the keywords of the "if" statement
(IF, THEN, ELSE, FI) or of the "do" statement (DO, OD), or of the "case"
statement (CASE ... OF, ESAC).

A register name implies the contents of the register. A register name
enclosed in brackets implies the contents of the location whose address is
contained in that register. For example, ES:[DI] indicates the contents of
the location whose ES segment relative address is in register DI. [SI]
indicates the contents of the address contained in register SI relative to
SI's default segment (DS) or overridden segment.

Brackets also used for memory operands, where they mean that the contents
of the memory location is a segment-relative offset. For example, [SRC]
indicates that the contents of the source operand is a segment-relative
offset.

A = B; indicates that the value of B is assigned to A.

The symbols =, <>, >=, and <= are relational operators used to compare two
values, meaning equal, not equal, greater or equal, less or equal,
respectively. A relational expression such as A = B is TRUE if the value of
A is equal to B; otherwise it is FALSE.

The following identifiers are used in the algorithmic descriptions:

  *  OperandSize represents the operand-size attribute of the instruction,
     which is either 16 or 32 bits. AddressSize represents the address-size
     attribute, which is either 16 or 32 bits. For example,

   IF instruction = CMPSW
   THEN OperandSize = 16;
   ELSE
      IF instruction = CMPSD
      THEN OperandSize = 32;
      FI;
   FI;

indicates that the operand-size attribute depends on the form of the CMPS
instruction used. Refer to the explanation of address-size and operand-size
attributes at the beginning of this chapter for general guidelines on how
these attributes are determined.

  *  StackAddrSize represents the stack address-size attribute associated
     with the instruction, which has a value of 16 or 32 bits, as explained
     earlier in the chapter.

  *  SRC represents the source operand. When there are two operands, SRC is
     the one on the right.

  *  DEST represents the destination operand. When there are two operands,
     DEST is the one on the left.

  *  LeftSRC, RightSRC distinguishes between two operands when both are
     source operands.

  *  eSP represents either the SP register or the ESP register depending on
     the setting of the B-bit for the current stack segment.

The following functions are used in the algorithmic descriptions:

  *  Truncate to 16 bits(value) reduces the size of the value to fit in 16
     bits by discarding the uppermost bits as needed.

  *  Addr(operand) returns the effective address of the operand (the result
     of the effective address calculation prior to adding the segment base).

  *  ZeroExtend(value) returns a value zero-extended to the operand-size
     attribute of the instruction. For example, if OperandSize = 32,
     ZeroExtend of a byte value of -10 converts the byte from F6H to
     doubleword with hexadecimal value 000000F6H. If the value passed to
     ZeroExtend and the operand-size attribute are the same size,
     ZeroExtend returns the value unaltered.

  *  SignExtend(value) returns a value sign-extended to the operand-size
     attribute of the instruction. For example, if OperandSize = 32,
     SignExtend of a byte containing the value -10 converts the byte from
     F6H to a doubleword with hexadecimal value FFFFFFF6H. If the value
     passed to SignExtend and the operand-size attribute are the same size,
     SignExtend returns the value unaltered.

  *  Push(value) pushes a value onto the stack. The number of bytes pushed
     is determined by the operand-size attribute of the instruction. The
     action of Push is as follows:

   IF StackAddrSize = 16
   THEN
      IF OperandSize = 16
      THEN
         SP = SP - 2;
         SS:[SP] = value; (* 2 bytes assigned starting at
                             byte address in SP *)
      ELSE (* OperandSize = 32 *)
         SP = SP - 4;
         SS:[SP] = value; (* 4 bytes assigned starting at
                             byte address in SP *)
      FI;
   ELSE (* StackAddrSize = 32 *)
      IF OperandSize = 16
      THEN
         ESP = ESP - 2;
         SS:[ESP] = value; (* 2 bytes assigned starting at
                              byte address in ESP*)
      ELSE (* OperandSize = 32 *)
         ESP = ESP - 4;
         SS:[ESP] = value; (* 4 bytes assigned starting at
                              byte address in ESP*)
      FI;
   FI;

  *  Pop(value) removes the value from the top of the stack and returns it.
     The statement EAX = Pop( ); assigns to EAX the 32-bit value that Pop
     took from the top of the stack. Pop will return either a word or a
     doubleword depending on the operand-size attribute. The action of Pop
     is as follows:

   IF StackAddrSize = 16
   THEN
      IF OperandSize = 16
      THEN
         ret val = SS:[SP]; (* 2-byte value *)
         SP = SP + 2;
      ELSE (* OperandSize = 32 *)
         ret val = SS:[SP]; (* 4-byte value *)
         SP = SP + 4;
      FI;
   ELSE (* StackAddrSize = 32 *)
      IF OperandSize = 16
      THEN
         ret val = SS:[ESP]; (* 2 bytes value *)
         ESP = ESP + 2;
      ELSE (* OperandSize = 32 *)
         ret val = SS:[ESP]; (* 4 bytes value *)
         ESP = ESP + 4;
      FI;
   FI;
   RETURN(ret val); (*returns a word or doubleword*)

  *  Bit[BitBase, BitOffset] returns the address of a bit within a bit
     string, which is a sequence of bits in memory or a register. Bits are
     numbered from low-order to high-order within registers and within
     memory bytes. In memory, the two bytes of a word are stored with the
     low-order byte at the lower address.

   If the base operand is a register, the offset can be in the range 0..31.
   This offset addresses a bit within the indicated register. An example,
   "BIT[EAX, 21]," is illustrated in Figure 17-3.

   If BitBase is a memory address, BitOffset can range from -2 gigabits to 2
   gigabits. The addressed bit is numbered (Offset MOD 8) within the byte at
   address (BitBase + (BitOffset DIV 8)), where DIV is signed division with
   rounding towards negative infinity, and MOD returns a positive number.
   This is illustrated in Figure 17-4.

  *  I-O-Permission(I-O-Address, width) returns TRUE or FALSE depending on
   the I/O permission bitmap and other factors. This function is defined as
   follows:

   IF TSS type is 286 THEN RETURN FALSE; FI;
   Ptr = [TSS + 66]; (* fetch bitmap pointer *)
   BitStringAddr = SHR (I-O-Address, 3) + Ptr;
   MaskShift = I-O-Address AND 7;
   CASE width OF:
         BYTE: nBitMask = 1;
         WORD: nBitMask = 3;
         DWORD: nBitMask = 15;
   ESAC;
   mask = SHL (nBitMask, MaskShift);
   CheckString = [BitStringAddr] AND mask;
   IF CheckString = 0
   THEN RETURN (TRUE);
   ELSE RETURN (FALSE);
   FI;

  *  Switch-Tasks is the task switching function described in Chapter 7.

See Also: 7.5 


17.2.2.6 Description



17.2.2.6  Description

The "Description" section contains further explanation of the instruction's
operation.




17.2.2.7 Flags Affected



17.2.2.7  Flags Affected

The "Flags Affected" section lists the flags that are affected by the
instruction, as follows:

  *  If a flag is always cleared or always set by the instruction, the
     value is given (0 or 1) after the flag name. Arithmetic and logical
     instructions usually assign values to the status flags in the uniform
     manner described in Appendix C. Nonconventional assignments are
     described in the "Operation" section.

  *  The values of flags listed as "undefined" may be changed by the
     instruction in an indeterminate manner.

All flags not listed are unchanged by the instruction.


17.2.2.8 Protected Mode Exceptions



17.2.2.8  Protected Mode Exceptions

This section lists the exceptions that can occur when the instruction is
executed in 80386 Protected Mode. The exception names are a pound sign (#)
followed by two letters and an optional error code in parentheses. For
example, #GP(0) denotes a general protection exception with an error code of
0. Table 17-6 associates each two-letter name with the corresponding
interrupt number.

Chapter 9 describes the exceptions and the 80386 state upon entry to the
exception.

Application programmers should consult the documentation provided with
their operating systems to determine the actions taken when exceptions
occur.

See Also: 9.6.1 


17.2.2.9 Real Address Mode Exceptions



17.2.2.9  Real Address Mode Exceptions

Because less error checking is performed by the 80386 in Real Address Mode,
this mode has fewer exception conditions. Refer to Chapter 14 for further
information on these exceptions.


See Also: 14.6 

17.2.2.10 Virtual-8086 Mode Exceptions




17.2.2.10  Virtual-8086 Mode Exceptions

Virtual 8086 tasks provide the ability to simulate Virtual 8086 machines.
Virtual 8086 Mode exceptions are similar to those for the 8086 processor,
but there are some differences. Refer to Chapter 15 for details.



See Also: 15.3.2