Executing Environment and Mechanism

Process Synchronization and Interprocess Communication Practice

Application <----> OS <----> Hardware
              |          |
     * System calls      |
                    * CPU state
                    * Interrupt / Exception Mechanism

Terminology

Exceptional Control Flow (ECF) 異常控制流

Interrupt Descriptor Table (IDT) 中斷描述符表

OS vs. Hardware

  1. CPU
    • Register
  2. Interrupt

1. CPU

Top-Level View of Computer Organization

Register

  • User-visible registers: used by compilers of high-level languages to reduce the number of memory accesses
    • data register (general purpose register)
    • address register
      • index register
      • segment pointer
      • stack pointer
    • condition code register
      • overflow
      • sign
  • Control and status registers (accessible only in privileged mode)
    • PC Program Counter
    • IR Instruction Register
    • PSW Program Status Word
      • wiki
        • Interrupt masks
        • Privilege states
        • Condition code
        • Instruction address
      • PSW
        • a collection of data 8 bytes (or 64 bits) long
        • maintained by the operating system
        • keeps track of the current state of the system

Distinguishing user mode from kernel mode

x86 - Protection ring

  • Ring 0: kernel mode
  • Ring 3: user mode

Privilege rings for the x86 available in protected mode

Protection

Hardware support:

  • Execute a different set of instructions at each privilege level.
  • Separate the OS from user programs.

Status of CPU / Mode

Use PSW

Eflags register in x86
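To make the idea of a status word concrete, here is a minimal sketch that decodes a few EFLAGS fields from an integer. The bit positions follow the IA-32 EFLAGS layout; the function name `decode_eflags` and the sample value are illustrative assumptions, not part of the original notes.

```python
# Bit positions of some EFLAGS fields on x86 (IA-32).
EFLAGS_BITS = {
    "CF": 0,   # carry flag
    "ZF": 6,   # zero flag
    "SF": 7,   # sign flag
    "TF": 8,   # trap flag (single-step)
    "IF": 9,   # interrupt enable flag
    "OF": 11,  # overflow flag
}

def decode_eflags(value: int) -> dict:
    """Return the state of selected flags plus the 2-bit IOPL field."""
    flags = {name: bool((value >> bit) & 1) for name, bit in EFLAGS_BITS.items()}
    flags["IOPL"] = (value >> 12) & 0b11  # I/O privilege level, bits 12-13
    return flags

# Hypothetical sample value: bits 1, 2, 6 and 9 are set.
state = decode_eflags(0x0246)
print(state)
```

Only a subset of the flags is decoded here; the remaining bits are ignored to keep the sketch short.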

Privileged Instructions

Privileged instructions can be executed only by the OS, never by user programs.

  • Kernel Mode: running system program
  • User Mode: running user program

The trap instruction (supervisor call instruction) is a non-privileged instruction: user programs execute it to request a kernel service.

Example on x86: four privilege levels

  • R0: kernel state
  • R1
  • R2
  • R3: user state

In practice, most systems running on x86 use only the R0 and R3 privilege levels.

CPU Mode Transform

  • User Mode -> Kernel Mode
    • Only way: the interrupt/exception/trap mechanism
  • Kernel Mode -> User Mode
    • Set the PSW to user mode

e.g. int, trap, syscall, sysenter/sysexit are trap (supervisor call) instructions; the name may differ depending on how each system implements it.
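The mode rules can be sketched as a toy model: a privileged operation fails in user mode, and a trap is the only path into kernel mode. The class `ToyCPU`, its methods, and `PrivilegeError` are hypothetical names chosen for illustration; no real CPU interface is implied.

```python
KERNEL, USER = 0, 3  # ring numbers, following the R0/R3 convention

class PrivilegeError(Exception):
    """Raised when user-mode code attempts a privileged instruction."""

class ToyCPU:
    def __init__(self):
        self.mode = USER
        self.interrupts_enabled = True

    def disable_interrupts(self):
        # Privileged instruction: only legal in kernel mode.
        if self.mode != KERNEL:
            raise PrivilegeError("privileged instruction in user mode")
        self.interrupts_enabled = False

    def trap(self, handler):
        # Non-privileged trap instruction: the user -> kernel path.
        saved_mode = self.mode
        self.mode = KERNEL
        try:
            return handler(self)
        finally:
            self.mode = saved_mode  # kernel -> user: restore the mode bits

cpu = ToyCPU()
try:
    cpu.disable_interrupts()      # attempted directly from user mode
    blocked = False
except PrivilegeError:
    blocked = True

cpu.trap(lambda c: c.disable_interrupts())  # same operation via a trap
```

The direct attempt is rejected, while the same operation succeeds inside the trap handler, after which the CPU is back in user mode.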

2. Interrupt/Exception Mechanism

An OS is interrupt-driven (event-driven).

Origin of Interrupt and Exception:

  • Interrupt: supports parallel operation of the CPU and devices
  • Exception: a problem that arises while the CPU is executing an instruction

Concept

  • The CPU "reacts" to an "event"
  1. The CPU stops the running process
  2. Preserve the scene (PC, PSW)
  3. Execute the handler for the "event"
  4. When the handler finishes, return to the point of interruption
    1. For a system call, advance the PC past the trap instruction
    2. For other exceptions, do not advance the PC, so the interrupted instruction is executed again
  5. Continue executing
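The steps above can be sketched as a small simulation. The `Cpu` dataclass, the `handle_event` function, and the fixed instruction length of 1 are simplifying assumptions made for this example only.

```python
from dataclasses import dataclass, field

@dataclass
class Cpu:
    pc: int = 0
    psw: str = "user"
    log: list = field(default_factory=list)

def handle_event(cpu: Cpu, kind: str, instr_len: int = 1) -> None:
    """Save PC/PSW, run a handler, then resume at the right address."""
    saved_pc, saved_psw = cpu.pc, cpu.psw       # step 2: preserve the scene
    cpu.psw = "kernel"
    cpu.log.append(f"handling {kind} at pc={saved_pc}")  # step 3: handler
    # Step 4: a system call resumes after the trap instruction; other
    # exceptions (e.g. a page fault) re-execute the same instruction.
    cpu.pc = saved_pc + instr_len if kind == "syscall" else saved_pc
    cpu.psw = saved_psw                         # step 5: continue executing

cpu = Cpu(pc=100)
handle_event(cpu, "syscall")
pc_after_syscall = cpu.pc
handle_event(cpu, "page_fault")
pc_after_fault = cpu.pc
```

After the system call the PC has moved on by one instruction; after the page fault it is unchanged, so the faulting instruction runs again.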

Event

  • (External) Interrupt
    • I/O interrupt
    • Timer interrupt
    • Hardware failure
  • Exception (Internal Interrupt)
    • System call
    • Page fault
      • i.e. missing-page exception (the requested page is not in memory)
    • Protection exception
    • Breakpoint instruction
    • Other programming exception
      • e.g. overflow
                    Unexpected   Deliberate
Exceptions (sync)   fault        syscall trap
Interrupt (async)   interrupt    software interrupt

  • Interrupts: asynchronous interrupts generated by hardware.
  • Exceptions: synchronous interrupts generated by the processor.

interrupt/exception/trap/syscall

Class      Reason                  Async/Sync  Return behavior
Interrupt  I/O device, peripheral  Async       Always return to next instruction
Trap       Arrange intentionally   Sync        Return to next instruction
Fault      Recoverable error       Sync        Return to current instruction
Abort      Unrecoverable error     Sync        Don't return

Interrupt Response

Discover interrupt -> Receiving interrupt

Instruction Cycle with Interrupts

Transfer of Control with Multiple Interrupts

In the last step of the instruction cycle, the CPU scans the interrupt register to check whether an interrupt signal is pending.

If there is one, the interrupt hardware places the "interrupt code" in the corresponding field of the PSW. The CPU then uses it to select the interrupt vector and call the matching interrupt handler.
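A minimal sketch of an instruction cycle with an interrupt stage follows. The function `run`, the instruction names, and the interrupt code 14 are made up for illustration; a deque stands in for the interrupt register.

```python
from collections import deque

def run(program, interrupts_at):
    """Execute `program` (a list of instruction names). `interrupts_at` maps
    an instruction index to the interrupt code raised while it executes."""
    pending = deque()        # stands in for the interrupt register
    trace = []
    for pc, instr in enumerate(program):
        trace.append(f"exec {instr}")             # fetch + execute stages
        if pc in interrupts_at:                   # a device raises a signal
            pending.append(interrupts_at[pc])
        # Interrupt stage: checked only at the end of the instruction cycle.
        while pending:
            code = pending.popleft()
            trace.append(f"handler for interrupt {code}")
    return trace

trace = run(["load", "add", "store"], {1: 14})
```

The handler runs between `add` and `store`: the current instruction always completes before the interrupt is serviced.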

Interrupt Descriptor Table (IDT)

The location of the IDT (address and size) is kept in the CPU's IDTR register, which is loaded and stored with the LIDT and SIDT instructions.

This is similar to the GDT

IDTR Interrupt Descriptor Table Register

The processor has a special register (IDTR) that stores both the linear base address of the IDT and its limit (the table size in bytes minus one).
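As an illustration, the 6-byte value that LIDT loads in 32-bit mode (a 16-bit limit followed by a 32-bit base) can be packed and unpacked as below. The helper names and the base address are assumptions for this sketch; 8 bytes per entry applies to 32-bit protected-mode gate descriptors.

```python
import struct

def pack_idtr(base: int, num_entries: int) -> bytes:
    """Build the 6-byte LIDT operand: 16-bit limit, then 32-bit base."""
    limit = num_entries * 8 - 1        # each 32-bit gate descriptor is 8 bytes
    return struct.pack("<HI", limit, base)

def unpack_idtr(raw: bytes) -> tuple:
    """Return (base address, number of entries) from a 6-byte IDTR image."""
    limit, base = struct.unpack("<HI", raw)
    return base, (limit + 1) // 8

# Hypothetical table: 256 entries located at 0x00100000.
raw = pack_idtr(base=0x0010_0000, num_entries=256)
base, entries = unpack_idtr(raw)
```

With 256 entries the limit field is 2047 (0x7FF), since the limit is the offset of the last valid byte.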

stackoverflow IDTR w/ IDT

Interrupt Vector Table (IVT)

On the x86 architecture, the Interrupt Vector Table (IVT) is a table that specifies the addresses of all the 256 interrupt handlers used in real mode.

An interrupt vector is a unit of memory that stores the entry address of an interrupt handler together with the PSW to use for it.
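For x86 real mode specifically, each IVT entry is 4 bytes (a 16-bit offset followed by a 16-bit segment), and the handler address is segment × 16 + offset. The sketch below looks up one entry in a fake table; `ivt_lookup`, the vector 0x21, and the address F000:1234 are illustrative choices.

```python
import struct

def ivt_lookup(memory: bytes, vector: int) -> int:
    """Return the physical address of the real-mode handler for `vector`."""
    # Each IVT entry is 4 bytes: 16-bit offset, then 16-bit segment.
    offset, segment = struct.unpack_from("<HH", memory, vector * 4)
    return (segment << 4) + offset     # real-mode address = segment*16 + offset

# A fake 1 KiB IVT (256 entries) where vector 0x21 points at F000:1234.
ivt = bytearray(256 * 4)
struct.pack_into("<HH", ivt, 0x21 * 4, 0x1234, 0xF000)
addr = ivt_lookup(bytes(ivt), 0x21)
```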

Linux interrupt vector

  • 128 (0x80): for system call (programmable exception)

Interrupt Handler (Interrupt Service Routine)

The x86 architecture is an interrupt-driven system. External events trigger an interrupt: the normal control flow is interrupted and an Interrupt Service Routine (ISR) is called.

Procedure

  1. Save the relevant registers
    • PC
    • PSW
  2. Analyze the cause of the interrupt / exception
  3. Execute the corresponding function
  4. Restore state and return to the original program

Example of I/O Interrupt:

  • I/O operation ends normally
    • Wake up the process that is waiting for the result
  • I/O operation fails
    • Retry the failed operation
    • Once the maximum number of retries is reached, treat it as a hardware failure

Implementation of Timer Interrupt:

  • Required by the system
  • Software clock
  • CPU scheduling
    • Round Robin
  • Timed tasks
  • Real-time execution
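Round Robin scheduling is a typical consumer of the timer interrupt: each tick of the quantum preempts the running process. Below is a minimal sketch, assuming made-up process names and burst times, with the timer modeled simply as "at most `quantum` time units per turn".

```python
from collections import deque

def round_robin(burst_times: dict, quantum: int) -> list:
    """Return the order in which processes finish under round-robin."""
    ready = deque(burst_times.items())    # (name, remaining time) pairs
    finished = []
    while ready:
        name, remaining = ready.popleft()
        # The timer interrupt fires after at most `quantum` time units.
        remaining -= min(quantum, remaining)
        if remaining == 0:
            finished.append(name)
        else:
            ready.append((name, remaining))   # preempted: back of the queue
    return finished

order = round_robin({"A": 5, "B": 2, "C": 3}, quantum=2)
```

Context-switch cost and arrival times are ignored; the point is only that the timer bounds how long any process runs before the scheduler gets control again.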

Hardware Fault Interrupt

Program Interrupt

Interrupt in IA32

IA32 = Intel's 32-Bit computer architecture = x86 (comes from the Intel Processor model number "Intel 8086") (explained)

  • Interrupt
  • Exception
  • System Call

IA32 system structure


  • Advanced Programmable Interrupt Controller (APIC / PIC)
    • Translates a hardware interrupt signal into an interrupt vector and raises the interrupt on the CPU
  • Interrupt Vector Table (Real Mode)
    • Stores the addresses of the interrupt handlers
      • handler entry address = segment base address + offset
  • Interrupt Descriptor Table (Protected Mode)
    • Uses a data structure, the gate descriptor, to describe each interrupt vector

Gates in Interrupt Descriptor Table

  • Task Gate
  • Interrupt Gate
  • Trap Gate
  • Call Gate (also a gate descriptor, but it is placed in the GDT/LDT rather than the IDT)

stackoverflow IDT Gate Descriptors
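To show what a gate descriptor holds, here is a sketch that decodes an 8-byte, 32-bit protected-mode IDT entry: offset low word, segment selector, a reserved byte, a type/attribute byte, and offset high word. The function `decode_gate` and the sample values are hypothetical.

```python
import struct

GATE_TYPES = {
    0x5: "task gate",
    0xE: "interrupt gate (32-bit)",
    0xF: "trap gate (32-bit)",
}

def decode_gate(raw: bytes) -> dict:
    """Decode an 8-byte protected-mode IDT gate descriptor."""
    off_lo, selector, _reserved, attr, off_hi = struct.unpack("<HHBBH", raw)
    return {
        "offset": (off_hi << 16) | off_lo,   # handler offset in the segment
        "selector": selector,                # code segment selector
        "type": GATE_TYPES.get(attr & 0x0F, "other"),
        "dpl": (attr >> 5) & 0b11,           # descriptor privilege level
        "present": bool(attr & 0x80),
    }

# Sample: a present trap gate with DPL 3, selector 0x10, offset 0xC0101234.
raw = struct.pack("<HHBBH", 0x1234, 0x10, 0, 0xEF, 0xC010)
gate = decode_gate(raw)
```

The handler offset is split across the first and last words of the descriptor, which is why it has to be reassembled.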

Procedure of Interrupt in x86

  1. Get the interrupt vector (i)
  2. Use the IDTR to find the IDT, then get the interrupt descriptor (the ith entry in the table)
  3. Get the address of the GDT from the GDTR (the GDT contains entries that tell the CPU about memory segments)
  4. Use the segment selector in the interrupt descriptor to fetch the corresponding segment descriptor from the GDT
  5. From that segment descriptor, get the base address of the segment that holds the interrupt handler
    • handler entry address = segment base address + offset
  6. Check the privilege levels to make sure access to the segment is allowed
    • Make sure CPL (in the CS register) ≤ gate descriptor DPL
      • Prevents user applications from invoking special trap gates or interrupt gates
    • Make sure CPL (in the CS register) ≥ segment descriptor DPL
      • The interrupt handler must not be less privileged than the code it interrupts
  • CPL Current privilege level
  • RPL Requested privilege level (privilege level associated with a segment selector)
  • DPL Descriptor privilege level (privilege level of a segment)
    • It defines the minimum privilege level required to access the segment.

Privilege levels range from 0-3; lower numbers are more privileged.
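The lookup can be sketched with plain dictionaries standing in for the IDT and GDT. Two things are assumptions here: the table contents are invented, and the second check is modeled as "the handler's segment must be at least as privileged as the interrupted code", which is one reading of the privilege rule rather than a quotation of it.

```python
class GeneralProtectionFault(Exception):
    """Raised when a privilege check fails."""

def dispatch(vector, idt, gdt, cpl, software=True):
    """Resolve `vector` to a handler address."""
    gate = idt[vector]                        # index the IDT with the vector
    segment = gdt[gate["selector"] >> 3]      # selector index -> GDT entry
    # Privilege checks (a smaller number means more privileged).
    if software and cpl > gate["dpl"]:
        raise GeneralProtectionFault("gate not accessible from this CPL")
    if segment["dpl"] > cpl:
        raise GeneralProtectionFault("handler less privileged than caller")
    return segment["base"] + gate["offset"]   # entry = base + offset

gdt = {2: {"base": 0x0, "dpl": 0}}            # hypothetical kernel code segment
idt = {
    0x80: {"selector": 2 << 3, "offset": 0xC0100000, "dpl": 3},  # syscall gate
    14:   {"selector": 2 << 3, "offset": 0xC0200000, "dpl": 0},  # page fault
}

entry = dispatch(0x80, idt, gdt, cpl=3)       # user code may use vector 0x80
try:
    dispatch(14, idt, gdt, cpl=3)             # user code issuing "int 14"
    rejected = False
except GeneralProtectionFault:
    rejected = True
```

The gate DPL check is applied only to software-generated interrupts in this sketch, so a genuine page fault raised by hardware (`software=False`) still reaches its handler.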

3. Storage System

Storage Hierarchy

          Register <--> Cache <--> Memory <--> Disk
Speed     Fast ------------------------------> Slow
Capacity  Small ------------------------------> Big
Cost      High -------------------------------> Low

Locality of Storage Access

aka. principle of locality

Memory Block

  • Byte, Bit
  • Page Frame (physical page)
    • Block / Page size
      • 512B, 1KB, 4KB, ..., 256KB, 1MB, 4MB, 16MB

Cache

Cache (SRAM): high-speed cache memory

┌-----┐                                 ┌-------┐                          ┌--------┐
| CPU | <--- Byte or Word transfer ---> | Cache | <--- Block transfer ---> | Memory |
└-----┘                                 └-------┘                          └--------┘
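Block transfer pays off because of locality. The sketch below simulates a tiny direct-mapped cache and compares two access patterns; the cache geometry (4 lines of 16 bytes) and the function name `hit_rate` are arbitrary assumptions for illustration.

```python
def hit_rate(addresses, num_lines=4, block_size=16):
    """Simulate a direct-mapped cache and return the fraction of hits."""
    lines = [None] * num_lines            # each line remembers one block number
    hits = 0
    total = 0
    for addr in addresses:
        total += 1
        block = addr // block_size        # memory block holding this byte
        index = block % num_lines         # the single line it may occupy
        if lines[index] == block:
            hits += 1
        else:
            lines[index] = block          # miss: transfer the whole block
    return hits / total

sequential = hit_rate(range(64))              # byte-by-byte: spatial locality
strided = hit_rate(range(0, 64 * 16, 16))     # one access per block
```

Sequential access misses once per block and then hits on the remaining bytes of that block, while the strided pattern touches each block once and never hits.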

4. I/O Access

  • Program Control
  • Interrupt Trigger
  • Direct Memory Access (DMA)

Blocking I/O and Non-blocking I/O

Two I/O methods: (a) synchronous and (b) asynchronous

Program-based I/O

Polling aka. Programmed I/O

The CPU has to keep checking the I/O status, so it wastes a lot of time polling.

Interrupt-based I/O

Solves the problem of polling by freeing the CPU from checking the device status.

I/O proceeds in parallel with the execution of other instructions.

The I/O unit sends an interrupt when the device is ready for the next transfer.

DMA-based I/O

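Interrupt-based I/O is still not efficient enough for large transfers, because the CPU has to service every interrupt.

Use a separate unit, the DMA controller, to move data between the device and memory without involving the CPU.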
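A rough way to compare the three methods is to count how often the CPU has to act during a transfer. The numbers below are a toy model: `polls_per_word` is an arbitrary figure, not a measurement, and `cpu_interventions` is a hypothetical helper.

```python
def cpu_interventions(words: int, method: str, polls_per_word: int = 10) -> int:
    """Count how often the CPU must act to transfer `words` words."""
    if method == "polling":
        # Busy-wait on the status register, then copy each word itself.
        return words * (polls_per_word + 1)
    if method == "interrupt":
        # One interrupt (and one CPU copy) per word; no busy-waiting.
        return words
    if method == "dma":
        # Program the DMA controller once, take one completion interrupt.
        return 2
    raise ValueError(f"unknown method: {method}")

costs = {m: cpu_interventions(512, m) for m in ("polling", "interrupt", "dma")}
```

In this model the polling and interrupt costs grow with the transfer size, while the DMA cost stays constant.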

5. Timer

TCON (Timer Control register)

A timer is necessary in the following scenarios

  • Detecting infinite loops
  • Round Robin algorithm in interactive systems
  • Delay and timeout control in real-time systems
  • Running an external event for a set duration
  • ...

Clock

  • absolute clock: records the current time (keeps advancing even when the machine is shut down)
  • relative clock: implemented with a clock register
    • decrement the clock every time unit; when the value becomes negative, trigger an action
  • hardware clock
  • software clock
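The relative clock can be sketched as a countdown register that fires when it goes negative and then reloads. The class `RelativeClock` and its callback interface are assumptions made for this example.

```python
class RelativeClock:
    """Countdown register: decremented every tick, fires when negative."""

    def __init__(self, interval: int, on_expire):
        self.interval = interval
        self.value = interval
        self.on_expire = on_expire

    def tick(self) -> None:
        self.value -= 1                   # "clock--" once per time unit
        if self.value < 0:
            self.on_expire()              # e.g. raise a timer interrupt
            self.value = self.interval    # reload for the next period

fired = []
clock = RelativeClock(interval=2, on_expire=lambda: fired.append("timer"))
for _ in range(6):
    clock.tick()
```

With an interval of 2 the callback fires on every third tick, because the value must pass through 1 and 0 before it becomes negative.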

System Call

A system call is a way for programs to interact with the operating system.

System calls provide the services of the operating system to user programs via an Application Program Interface (API) (library).

A trap switches the CPU from user mode to kernel mode.

syscall

Kernel Function

Example: printf

printf() --> write() (syscall)
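The same layering exists in Python: `print` is a buffered library routine, while `os.write` is a thin wrapper over the write system call. The sketch below writes to a pipe so the bytes can be read back and checked; the helper name `emit` is hypothetical.

```python
import os

def emit(fd: int, text: str) -> int:
    """Write `text` to file descriptor `fd` through the write() wrapper."""
    return os.write(fd, text.encode())    # returns the number of bytes written

# Use a pipe so the result can be read back and inspected.
read_end, write_end = os.pipe()
written = emit(write_end, "hello\n")
os.close(write_end)
received = os.read(read_end, 100)
os.close(read_end)
```

Calling `emit(1, "hello\n")` instead would send the same bytes to standard output.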

System Call Design

  • Interrupt/Exception Mechanism
    • implement the services
  • Trap Instruction/Privilege Instruction
    • switch between user state and kernel state
  • System call number and parameter
    • number each syscall
  • System call table
    • stores a pointer to the service handler of each syscall

In Linux, each system call is assigned a unique syscall number that is used to reference a specific system call.
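A system call table is essentially an array of handlers indexed by syscall number. The following is a minimal sketch with stub handlers; the numbers 1 and 4 match exit and write on 32-bit x86 Linux, while the handler bodies and `do_syscall` are invented for illustration.

```python
import errno

def sys_exit(status):
    return f"exit({status})"              # stub: a real exit never returns

def sys_write(fd, data):
    return len(data)                      # stub: pretend every byte was written

# Syscall number -> handler.
SYSCALL_TABLE = {1: sys_exit, 4: sys_write}

def do_syscall(number, *args):
    """Look up the handler by number and call it with the given arguments."""
    handler = SYSCALL_TABLE.get(number)
    if handler is None:
        return -errno.ENOSYS              # unknown syscall number
    return handler(*args)

result = do_syscall(4, 1, b"hi")
unknown = do_syscall(999)
```

Returning a negative error code for an unknown number mirrors the common kernel convention; the dictionary stands in for what would be a fixed-size array of function pointers.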

Passing Parameter

Passing parameter from user program to kernel

  • Trap instruction with argument
    • limited number of parameters
  • General purpose register
    • can be accessed by both user and system
    • limited number of registers
      • this is not a problem on 64-bit systems
    • e.g. Nachos (MIPS) (r2 register)
  • Special-purpose stack/heap area in memory

Returning from a program also ends in a syscall: exit (No. 1)

Ctrl + C

Soft interrupt

send a signal --> .... -> ....

System Call Procedure

When the CPU executes a special trap instruction

  1. Interrupt/Exception mechanism: the hardware protects the state
    • Look up the IVT
    • Transfer control to the syscall entry function
  2. Invoke the entry function: preserve state
    • Save the parameters on the kernel stack
    • Transfer control to the syscall handler
    • e.g. sysenter
  3. Execute the system call handler
  4. Restore state and return to the user program

System Call in Linux (based on x86)

include/linux/syscalls.h

System call number

arch/sh/include/uapi/asm/unistd_64.h

#define __NR_restart_syscall	  0
#define __NR_exit		  1
#define __NR_fork		  2
#define __NR_read		  3
#define __NR_write		  4
#define __NR_open		  5
#define __NR_close		  6
#define __NR_waitpid		  7
#define __NR_creat		  8
#define __NR_link		  9
#define __NR_unlink		 10
#define __NR_execve		 11
#define __NR_chdir		 12
#define __NR_time		 13
...

In Linux, all system calls use the same single entry point: int 0x80

#define __NR_init_module	128
  1. Change privilege => change stack
    • From user stack to kernel stack
    • The CPU loads a new stack pointer (SS:ESP), which points to the kernel stack, from the TSS (Task State Segment)
  2. Push EFLAGS onto the stack and clear TF (Trap Flag); IF remains unchanged
  3. Find the gate descriptor in the IDT using vector 0x80, and load its segment selector into the CS (Code Segment) register
  4. Compute "base address in the segment descriptor" + "offset in the trap gate descriptor" to locate the entry address of the system call handler
  • Privilege check
    • code can only access data of the same or lower privilege
  • System call number and arguments/parameters
    • EAX: holds the system call number, and the return value once the system call has been handled
      • e.g. Nachos (MIPS) r2 register
    • EBX, ECX, EDX, ESI, EDI: hold the arguments
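The register convention can be sketched with a dictionary standing in for the register file. The helpers `load_registers` and `kernel_entry`, the buffer address 0x8000, and the stub handler are all hypothetical.

```python
ARG_REGISTERS = ("ebx", "ecx", "edx", "esi", "edi")

def load_registers(number: int, *args: int) -> dict:
    """User side: place the syscall number and arguments in registers."""
    if len(args) > len(ARG_REGISTERS):
        raise ValueError("too many arguments to pass in registers")
    regs = {"eax": number}
    regs.update(zip(ARG_REGISTERS, args))
    return regs

def kernel_entry(regs: dict, table: dict) -> None:
    """Kernel side: read the registers, run the handler, return via EAX."""
    args = [regs[r] for r in ARG_REGISTERS if r in regs]
    regs["eax"] = table[regs["eax"]](*args)

# write(fd=1, buf=0x8000, count=5) with a stub handler that returns count.
regs = load_registers(4, 1, 0x8000, 5)
kernel_entry(regs, {4: lambda fd, buf, count: count})
```

EAX serves double duty: it carries the syscall number on entry and the return value on exit, which is why the kernel side overwrites it.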

OS low-level procedure after interrupt

  1. (Hardware) Push stack
    • PC, etc.
  2. (Hardware) Insert new PC from interrupt vector
  3. (Assembly) Preserve value of registers
  4. (Assembly) Set new stack and heap
  5. (C language) Execute interrupt handler
  6. (CPU Scheduler) Decide next process
  7. (C language) Return to Assembly
  8. (Assembly) Start running new process

Mechanism and Separation Principle

kernel structure


Resources

Article

Book

Operating System Concepts 9ed.

  • Ch13 I/O Systems
    • Ch13.2 I/O Hardware
      • Ch13.2.1 Polling
      • Ch13.2.2 Interrupts
      • Ch13.2.3 Direct Memory Access
    • Ch13.3 Application I/O Interface
      • Ch13.3.4 Blocking and Non-blocking I/O
  • Notes