UNIX® Systems for Modern Architectures

Symmetric Multiprocessing and Caching for Kernel Programmers

Curt Schimmel

Publisher: Addison-Wesley, 1994, 396 pages

ISBN: 0-201-63338-8

Keywords: Programming, Operating Systems

Last modified: May 14, 2021, 5:50 p.m.

This book represents a significant new milestone in UNIX kernel internals books. Symmetric multiprocessing and cache memory systems are important cost-effective technologies for improving performance in today's state-of-the-art systems.

Written for the UNIX kernel developer, this book provides a complete yet comprehensible explanation of the operation of caches and symmetric multiprocessors, how they work together, and the issues operating systems must address in order to run on the machines that incorporate them.

After a review of UNIX kernel internals, Curt Schimmel launches into a detailed description of cache memory systems, including several kinds of virtual and physical caches, as well as a chapter on efficient cache management. For each type of cache, the book covers the impact on the software and the operating system changes necessary for these systems. The next section details the operation of the tightly-coupled, shared memory, symmetric multiprocessor. It examines the problems these multiprocessors present to the operating system, such as race conditions, deadlocks, and the ordering of memory operations, and looks at how the UNIX kernel can be adapted to run on such systems. Finally, the book looks at the interaction between cache memory systems and multiprocessors and the new problems that this interaction presents to the kernel. Techniques for solving these problems are then explained.

Numerous examples representing CISC and RISC processors, such as the Intel 80486 and Pentium, the Motorola 68040 and 88000, as well as the MIPS and SPARC processors, illustrate the concepts presented. To reinforce the concepts, each chapter contains a set of exercises with answers to selected exercises included in the back.

    • Notational Conventions
      • Constants
      • Word Size
      • Byte Order
      • Bit Order
      • Bit Ranges
      • Memory Size Units
  • Introduction
    1. Review of UNIX Kernel Internals
      1. Introduction
      2. Processes, Programs, and Threads
      3. The Process Address Space
        1. Address Space Mapping
      4. Context Switch
      5. Memory and Process Management System Calls
        1. The Fork System Call
        2. The Exec System Call
        3. The Exit System Call
        4. The Sbrk and Brk System Calls
        5. Shared Memory
        6. I/O Operations
        7. Mapped Files
      6. Summary
      7. Exercises
      8. Further Reading
    2. Part I: CACHE MEMORY SYSTEMS
      1. Introduction to Cache Memory Systems
        1. Memory Hierarchies
        2. Cache Fundamentals
          1. How a Cache Is Accessed
          2. Virtual or Physical Addresses?
          3. Searching the Cache
          4. Replacement Policies
          5. Write Policies
        3. Direct Mapped Caches
          1. Hashing Algorithms for Direct Mapped Caches
          2. Direct Mapped Cache Example
          3. Miss Processing and replacement Policy for Direct Mapped Caches
          4. Summary of Direct Mapped Caching
        4. Two-Way Set Associative Caches
          1. Summary of Two-Way Set Associative Caches
        5. n-Way Set Associative Caches
        6. Fully Associative Caches
        7. Summary of n-Way Set Associative Caches
        8. Cache Flushing
        9. Uncached Operation
        10. Separate Instruction and Data Caches
        11. Cache Performance
        12. How Cache Architectures Differ
        13. Exercises
        14. Further Reading
      2. Virtual Caches
        1. Virtual Cache Operation
        2. Problems with Virtual Caches
          1. Ambiguities
          2. Aliases
        3. Managing a Virtual Cache
          1. Context Switch
          2. Fork
          3. Exec
          4. Exit
          5. Brk and Sbrk
          6. Shared Memory and Mapped Files
          7. I/O
          8. User-Kernel Data Ambiguities
        4. Summary
        5. Exercises
        6. Further Reading
      3. Virtual Caches with Keys
        1. The Operation of a Virtual Cache with Keys
        2. Managing a Virtual Cache with Keys
          1. Context Switch
          2. Fork
          3. Exec
          4. Exit
          5. Brk and Sbrk
          6. Shared Mamory and Mapped Files
          7. I/O
          8. User-Kernel Data Ambiguities
        3. Virtual Cache Usage in MMUs
        4. Summary
        5. Exercises
        6. Further Reading
      4. Virtual Caches with Physical Address Tags
        1. The Organization of a Virtual Cache with Physical Tags
        2. Managing a Virtual Cache with Physical Tags
          1. Context Switch
          2. Fork
          3. Exec
          4. Exit
          5. Brk and Sbrk
          6. Shared Memory and Mapped Files
          7. I/O
          8. User-Kernel Data Ambiguities
        3. Summary
        4. Exercises
        5. Further Reading
      5. Physical Caches
        1. The Organization of a Physical Cache
        2. Managing a Physical Cache
          1. Context Switch
          2. Fork
          3. Exec, Exit, Brk, and Sbrk
          4. Shared Memory and Mapped Files
          5. User-Kernel Data Ambiguities
          6. I/O and Bus Watching
        3. Multilevel Caches
          1. Primary Virtual Cache with Secondary Physical Cache
          2. Primary Virtual Cache with Physical Tags and a Secondary Physical Cache
        4. Primary Virtual Cache with Secondary Physical Cache
        5. Summary
        6. Exercises
        7. Further Reading
      6. Efficient Cache Management Techniques
        1. Introduction
        2. Address Space Layout
          1. Virtually Indexed Caches
          2. Dynamic Address Binding
          3. Physically Indexed Caches
        3. Cache Size Bounded Flushing
        4. Delayed Cache Invalidations
          1. Virtual Caches with Keys
          2. Physically Tagged Caches without Bus Watching
        5. Cache-Aligning Data Structures
        6. Summary
        7. Exercises
        8. Further Reading
    3. Part II: MULTIPROCESSOR SYSTEMS
      1. Introduction to Multiprocessor Systems
        1. Introduction
          1. MP Operating Systems
        2. The Tightly Coupled, Shared Memory, Symmetric Multiprocessor
        3. The MP Memory Model
          1. The Sequential Memory Model
          2. Atomic Reads and Writes
          3. Atomic Read-Modify-Write Operations
        4. Mutual Exclusion
        5. Review of Mutual Exclusion on Uniprocessor UNIX Systems
          1. Short-Term Mutual Exclusion
          2. Mutual Exclusion with Interrupt Handlers
          3. Long-Term Mutual Exclusion
        6. Problems Using UP Mutual Exclusion Policies on MPs
        7. Summary
        8. Exercises
        9. Further Reading
      2. Master-Slave Kernels
        1. Introduction
        2. Spin Locks
        3. Deadlocks
        4. Master-Slave Kernel Implementation
          1. Run Queue Implementation
          2. Process Selection for Slaves
          3. Process Selection for the Master
          4. Clock Interrupt Handling
        5. Performance Considerations
          1. Master-Slave Improvements
        6. Summary
        7. Exercises
        8. Further Reading
      3. Spin-Locked Kernels
        1. Introduction
        2. Giant Locking
        3. Multithreading Cases Requiring No Locks
        4. Coarse-Grained Locking
        5. Fine-Grained Locking
          1. Short-term Mutual Exclusion
          2. Long-Term Mutual Exclusion
          3. Mutual Exclusion with Interrupt Handlers
          4. Lock Granularity
          5. Performance
          6. Kernel Preemption
        6. Effects of Sleep and Wakeup on Multiprocessors
        7. Summary
        8. Exercises
        9. Further Reading
      4. Semaphored Kernels
        1. Introduction
          1. Mutual Exclusion with Semaphores
          2. Synchronization with Semaphores
          3. Resource Allocation with Semaphores
        2. Deadlocks
        3. Implementing Semaphores
        4. Coarse-Grained Semaphore Implementations
        5. Multithreading with Semaphores
          1. Long-Term Mutual Exclusion
          2. Short-term Mutual Exclsion
          3. Synchronization
        6. Performance Considerations
          1. Measuring Lock Contention
          2. Convoys
          3. Multireader Locks
        7. Summary
        8. Exercises
        9. Further Reading
      5. Other MP Primitives
        1. Introduction
        2. Monitor
        3. Eventcounts and Sequencers
        4. The MP Primitives of SVR4.2 MP
          1. Spin Locks
          2. Sleep Locks
          3. Synchronization Variables
          4. Multireader Locks
        5. Comparison of MP Synchronization Primitives
        6. Summary
        7. Exercises
        8. Further Reading
      6. Other Memory Models
        1. Introduction
        2. Dekker's Algorithm
        3. Other Memory Models
        4. Total Store Ordering
        5. Partial Store Ordering
        6. The Store Buffer as Part of the Memory Hierarchy
        7. Summary
        8. Exercises
        9. Further Reading
    4. Part III: MULTIPROCESSOR SYSTEMS WITH CACHES
      1. Introduction to MP Cache Consistency
        1. Introduction
        2. The Cache Consistency Problem
        3. Software Cache Consistency
          1. Uncached Shared Data
          2. Selective Cache Flushing
          3. Handling Other Memory Models
        4. Summary
        5. Exercise
        6. Further Reading
      2. Hardware Cache Consistency
        1. Introduction
        2. Write-Invalidate Protocols
          1. The Write-Through Invalidate Protocol
          2. The Write-Once Protocol
          3. MESI Protocols
        3. Write-Update Protocols
          1. The Firefly Protocol
          2. The MIPS R4000 Update Protocol
        4. Consistency of Read-Modify-Write Operations
        5. Hardware Consistency for Multilevel Caches
        6. Other Main Memory Architectures
          1. The Cross-bar Interconnect
          2. Directory-based Hardware Cache Consistency
        7. Effects on the Software
        8. Hardware Consistency for Nonsequential Memory Models
        9. Performance Considerations for Software
          1. Cache-Aligning Data Structures
          2. Reducing Cache Line Contention When Acquiring a Spin Lock
          3. Matching Consistency Protocols to Data Usage
        10. Summary
        11. Exercises
        12. Further Reading
  1. Architecture Summary
  2. Answers to Selected Exercises

    Reviews

    UNIX® Systems for Modern Architectures

    Reviewed by Roland Buresund

    Outstanding ********* (9 out of 10)

    Last modified: May 21, 2007, 2:51 a.m.

    Memory management and locking issues galore. Only for the real geeks out there, but for them, it is an excellent book.

    Comments

    There are currently no comments

    New Comment

    required

    required (not published)

    optional

    required

    captcha

    required