Fall 2005. Distributed/Multiprocessor Operating Systems.
ONLINE VERSION
Class synopsis: just a brief outline of topics covered in class.
No guarantees of accuracy or completeness.
Class 1:
- Introduction to the class and course contents
- Distributed Operating Systems
- Multiprocessor architecture (UMA)
- Concurrency in multiprocessor operating systems
Class 2:
- Operating Systems Kernels, system calls and interrupts
- Reentrancy in kernels - uniprocessor and multiprocessor
- Making kernels reentrant - preemptible and non-preemptible kernels
- Need atomicity to get atomicity - disabling interrupts and test-and-set
- Making semaphores from atomic constructs
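A minimal sketch of the "semaphores from atomic constructs" idea, in Python. Python has no test-and-set instruction, so the TestAndSet class below is an assumption: a non-blocking Lock.acquire stands in for the hardware primitive. The names TestAndSet, SpinSemaphore and run_demo are hypothetical and not from the class.

```python
import threading
import time


class TestAndSet:
    """Emulated test-and-set flag (assumption: a non-blocking
    Lock.acquire plays the role of the atomic hardware instruction)."""

    def __init__(self):
        self._flag = threading.Lock()

    def test_and_set(self):
        # Atomically set the flag and return its OLD value.
        return not self._flag.acquire(blocking=False)

    def clear(self):
        self._flag.release()


class SpinSemaphore:
    """Counting semaphore whose counter is protected by test-and-set."""

    def __init__(self, value):
        self.value = value
        self.guard = TestAndSet()

    def P(self):
        while True:
            while self.guard.test_and_set():
                time.sleep(0)          # spin until the guard is free
            if self.value > 0:
                self.value -= 1        # got the semaphore
                self.guard.clear()
                return
            self.guard.clear()         # not available: let others run V
            time.sleep(0)

    def V(self):
        while self.guard.test_and_set():
            time.sleep(0)
        self.value += 1
        self.guard.clear()


def run_demo(n_threads=4, rounds=250):
    sem = SpinSemaphore(1)
    shared = {"count": 0}

    def body():
        for _ in range(rounds):
            sem.P()
            shared["count"] += 1       # critical section
            sem.V()

    threads = [threading.Thread(target=body) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["count"]
```

The guard is held only long enough to inspect or change the counter, so the spinning is short even when the semaphore itself is unavailable for a long time.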
Class 3:
- Atomicity revisited
- Mutex Semaphores and Spin Semaphores
- Benefits of each approach, user level usage of mutex/spin semaphores
- "Need atomicity to get Atomicity?"
- Peterson/Dekker solutions
- Processor Classifications: SISD, SIMD, MISD, MIMD
- Multiprocessor classifications: UMA, NUMA, NORMA
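For the question "Need atomicity to get atomicity?", Peterson's two-process solution uses only ordinary reads and writes. The sketch below is illustrative: it assumes sequentially consistent memory, which standard CPython threads provide in practice but real multiprocessors generally do not without memory fences. The function name run_peterson is hypothetical.

```python
import threading
import time


def run_peterson(rounds=200):
    flag = [False, False]   # flag[i] is True when process i wants to enter
    turn = [0]              # index of the process that must wait on a tie
    counter = [0]           # shared data protected by the protocol

    def process(me):
        other = 1 - me
        for _ in range(rounds):
            # Entry section
            flag[me] = True
            turn[0] = other
            while flag[other] and turn[0] == other:
                time.sleep(0)          # busy-wait, yielding the CPU
            # Critical section: a deliberately racy read-modify-write
            tmp = counter[0]
            time.sleep(0)              # widen the race window on purpose
            counter[0] = tmp + 1
            # Exit section
            flag[me] = False

    threads = [threading.Thread(target=process, args=(i,)) for i in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```

If the entry and exit sections are removed, the yield inside the critical section makes lost updates likely, which is a quick way to see what the protocol buys.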
Class 4:
- Lecture by Austin Godber on NFS/AFS/SMB and Kerberos
Class 5:
- Processors and Processes -- and Programs and Processes
- Processes and Threads
- Sharing of variables (global/local/stack/heap)
- Parallel Processing
- "fork/join" programming
- structured parallel programming (the "parbegin-parend" strategy)
- Operating Systems for the UMA machine (scheduling/reentrancy etc.)
Class 6:
- What does the Butterfly/Hypercube look like?
- Programming a Multiprocessor - programs, processes and processors
- Fork-join programming and parbegin-parend programming
- Barrier Synchronization
- Multiprocessor Scheduling - co-scheduling, affinity scheduling, smart
scheduling
- NORMA machines and NORMA Programming
- Splitting up programs for parallel execution
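Fork-join programming and barrier synchronization can be sketched together. The example below is a hedged illustration using Python threads in place of processors; parallel_normalize and its parameters are hypothetical names. Each worker computes a partial sum (phase 1), waits at a barrier, then uses the global sum (phase 2).

```python
import threading


def parallel_normalize(data, n_workers=4):
    # Split the program's data so each worker owns one chunk.
    chunks = [data[i::n_workers] for i in range(n_workers)]
    partial = [0] * n_workers
    result = [None] * n_workers
    barrier = threading.Barrier(n_workers)

    def worker(i):
        partial[i] = sum(chunks[i])      # phase 1: local work
        barrier.wait()                   # no one continues until all sums exist
        total = sum(partial)             # phase 2: safe to read every partial sum
        result[i] = [x / total for x in chunks[i]]

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for t in threads:                    # "fork"
        t.start()
    for t in threads:                    # "join"
        t.join()
    return result
```

Without the barrier, a fast worker could read partial sums that slower workers have not written yet.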
Class 7: Sept 17
- Programming the NORMA - data movement, granularity
- Software packages PVM/MPI/Linda
- Programming using PVM
- Master-worker method
- MPI and scatter gather
- OpenMP
- Linda
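The master-worker method can be sketched without PVM or MPI. The example below is an assumption-laden stand-in: Python threads and queues play the role of tasks and message channels, and the names master, worker and grain are hypothetical. The master scatters chunks of work, workers send back partial results, and the master gathers them.

```python
import queue
import threading


def worker(tasks, results):
    while True:
        item = tasks.get()
        if item is None:                 # sentinel: no more work
            break
        idx, chunk = item
        results.put((idx, sum(x * x for x in chunk)))


def master(data, n_workers=3, grain=4):
    tasks, results = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for w in workers:
        w.start()

    # Scatter: the grain size sets how much data moves per message.
    chunks = [data[i:i + grain] for i in range(0, len(data), grain)]
    for idx, chunk in enumerate(chunks):
        tasks.put((idx, chunk))
    for _ in workers:
        tasks.put(None)

    # Gather: exactly one reply is expected per chunk.
    partial = dict(results.get() for _ in chunks)
    for w in workers:
        w.join()
    return sum(partial.values())
```

A smaller grain balances load better but sends more messages, which is the granularity trade-off listed above.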
Class 8:
- Introduction to Distributed Systems
- Autonomy and collaboration
- Load balancing, efficiency, reliability
- Usage: distributed applications - info sharing, resource sharing, reliability, flexibility
- Issues - TRANSPARENCY (access/location/replication/failure)
- Distributable System Models (message, RPC, DSM)
Class 9:
- History of Distributed Operating Systems
- OS facilities -- timesharing systems to PCs (integration started disappearing)
- Bringing tighter coupling back to networked machines - network file
systems (NFS)
- Sun-NFS and how it works
- Mounting file systems, remote mounts, configuring dataless workstations
(all data files visible at all workstations)
- Translating local and remote file names to "i-numbers"
- Read client and nfsd interactions
Class 10:
- Read and write in NFS
- Stateless design of file service
- Differences between stateless and stateful designs - performance vs. failure
tolerance
- Caching (buffering)
- Scalability issues
- Fully distributed file systems
- The Andrew file system and its design
- Coherence issues
- Locking - advisory vs. mandatory
Class 11:
- Distributed Systems and message passing
- Simple message passing system - design (messages, ports)
- Send and receive - blocking and non-blocking
- timeouts are bad
- Writing simple message passing programs
Class 12:
Class 13:
- Client Server programming
- Server port naming
- Name service
- Server scalability - use multiple threads or processes
- Server state -- why server code cannot be moved to clients
Class 14:
- Server scalability
- Pool of processes or threads
- Dynamic processes or threads
- Multithreaded servers (issues with server state)
- Threads and Processes
Class 15:
- Pre-emptible and non-pre-emptible threads
- Programming with non-preemptible threads
- User level threads
- Writing a user level threads package (Startthread, P, V and Schedule)
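A user-level, non-preemptible threads package can be sketched with Python generators. This is only an illustration, not the package written in class: each thread is a generator that gives up the CPU at a yield, and start_thread, P, V and schedule are modelled on the operations named above. All other names are hypothetical.

```python
from collections import deque

ready = deque()                  # run queue of runnable threads


class Semaphore:
    def __init__(self, value):
        self.value = value
        self.waiting = deque()   # threads blocked on this semaphore


def start_thread(gen):
    ready.append(gen)


def P(sem):
    return ("P", sem)            # request handed to the scheduler via yield


def V(sem):
    return ("V", sem)


def schedule():
    # Threads run until they yield; there is no preemption.
    while ready:
        thread = ready.popleft()
        try:
            request = next(thread)
        except StopIteration:
            continue             # thread finished
        if request is None:      # plain yield: just give up the CPU
            ready.append(thread)
            continue
        op, sem = request
        if op == "P" and sem.value > 0:
            sem.value -= 1
            ready.append(thread)
        elif op == "P":
            sem.waiting.append(thread)       # block the caller
        else:                                # op == "V"
            if sem.waiting:
                ready.append(sem.waiting.popleft())
            else:
                sem.value += 1
            ready.append(thread)


def demo():
    log, items, full = [], [], Semaphore(0)

    def producer():
        for i in range(3):
            items.append(i)
            log.append(f"produced {i}")
            yield V(full)

    def consumer():
        for _ in range(3):
            yield P(full)
            log.append(f"consumed {items.pop(0)}")

    start_thread(consumer())
    start_thread(producer())
    schedule()
    return log
```

Because a thread only loses the CPU at a yield, the code between yields needs no locking, which is the main attraction of programming with non-preemptible threads.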
Class 16:
- Discussion of project - semaphores, shared memory
- Implementing Ports on a single machine using bounded buffer
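A port on a single machine can be built as a bounded buffer guarded by three semaphores. The sketch below is a hedged illustration of that idea, not the project solution; the Port class, its capacity parameter and run_port_demo are hypothetical names.

```python
import threading


class Port:
    """Message port backed by a circular bounded buffer."""

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.head = 0                                # next slot to read
        self.tail = 0                                # next slot to write
        self.mutex = threading.Semaphore(1)          # protects head/tail/buf
        self.empty = threading.Semaphore(capacity)   # counts free slots
        self.full = threading.Semaphore(0)           # counts queued messages

    def send(self, msg):
        self.empty.acquire()             # P(empty): block while the port is full
        self.mutex.acquire()
        self.buf[self.tail] = msg
        self.tail = (self.tail + 1) % len(self.buf)
        self.mutex.release()
        self.full.release()              # V(full): wake a waiting receiver

    def receive(self):
        self.full.acquire()              # P(full): block while the port is empty
        self.mutex.acquire()
        msg = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        self.mutex.release()
        self.empty.release()             # V(empty): a slot is free again
        return msg


def run_port_demo(n_messages=10, capacity=2):
    port = Port(capacity)
    sender = threading.Thread(
        target=lambda: [port.send(i) for i in range(n_messages)])
    sender.start()
    received = [port.receive() for _ in range(n_messages)]
    sender.join()
    return received
```

Both send and receive are blocking here; a non-blocking variant would pass blocking=False to the first acquire and report failure instead of waiting.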
Class 17:
Class 18:
- Implementing Distributed Ports (multi machine)
- TCP/IP ports
- Global Ports - naming, location and send/receive handling
- Network Message Service
Class 19:
- Use of cookies in single threaded stateless servers
- Lock management in distributed systems (exclusive locks)
Class 20:
- Lock management continued -- upgrading and deadlocks
Class 21:
- Unlocking the locks, starvation prevention
- Introduction to RPC
Class 22:
- RPC, IDL and stubs
- How RPC works
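How the stubs fit together can be sketched in a few lines. This is an illustrative toy, not a real RPC system: there is no IDL compiler, JSON stands in for the marshalling format, and the "network" is a direct function call. ClientStub, server_stub and PROCEDURES are hypothetical names.

```python
import json


# --- Server side ---------------------------------------------------------
def add(a, b):
    return a + b


PROCEDURES = {"add": add}        # procedures the server exports


def server_stub(request_bytes):
    """Unmarshal the request, call the real procedure, marshal the reply."""
    request = json.loads(request_bytes.decode())
    try:
        result = PROCEDURES[request["proc"]](*request["args"])
        reply = {"ok": True, "result": result}
    except Exception as exc:
        reply = {"ok": False, "error": str(exc)}
    return json.dumps(reply).encode()


# --- Client side ---------------------------------------------------------
class ClientStub:
    """Makes a remote procedure look like a local method call."""

    def __init__(self, transport):
        # transport: any callable taking request bytes and returning reply
        # bytes (an assumption standing in for the network).
        self.transport = transport

    def __getattr__(self, proc):
        def call(*args):
            request = json.dumps({"proc": proc, "args": list(args)}).encode()
            reply = json.loads(self.transport(request).decode())
            if not reply["ok"]:
                raise RuntimeError(reply["error"])
            return reply["result"]
        return call
```

Usage: ClientStub(server_stub).add(2, 3) returns 5, and the caller never sees the marshalling. In a real system the stubs would be generated from the IDL description rather than written by hand.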
Class 23:
- Distributed Shared Memory (coherence, semantics, page fault handling,
basic algorithm)
Class 24:
- Distributed Shared Memory -- implementation
- Drawbacks -- page shuttling, false sharing
- Positives -- locking improves performance, automatic data transfer and
caching
- Better solution -- Release Consistent DSM.
Class 25:
- The muddy children Problem
- Hierarchies of knowledge
- Common knowledge, attainment
- Consensus
- Coordinated Attack
- Two phase commit
- A paper by Moses et al.
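The message flow of two-phase commit can be sketched as follows. This is a failure-free, in-process illustration only: there are no timeouts, logs or crash recovery, and Participant and two_phase_commit are hypothetical names.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote. A "yes" voter must stay ready to go either way.
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: the coordinator collects a vote from every participant.
    votes = [p.prepare() for p in participants]
    decision = all(votes)
    # Phase 2: the coordinator announces the same decision to everyone.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.abort()
    return "commit" if decision else "abort"
```

A single "no" vote forces a global abort, so every participant ends in the same state.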
Class 26:
- Time in Distributed Systems and Event ordering
- Lamport Clocks
- Distributed Mutual Exclusion
- Lamport's Algorithm
- Paper by Lamport
- Maekawa's Algorithm
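The Lamport clock rules are short enough to sketch directly. The class below is a minimal illustration; LamportClock and its method names are hypothetical.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        # Rule 1: increment the clock before each local event.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the returned timestamp travels with the message.
        return self.tick()

    def receive(self, timestamp):
        # Rule 2: jump past the sender's timestamp, then count the receive.
        self.time = max(self.time, timestamp) + 1
        return self.time


p, q = LamportClock(), LamportClock()
p.tick()                 # local event at p, clock 1
ts = p.send()            # p sends with timestamp 2
q.tick()                 # local event at q, clock 1
q_time = q.receive(ts)   # q's clock becomes max(1, 2) + 1 = 3
```

The receive event always gets a larger timestamp than the matching send, which is what makes the clocks consistent with the happened-before ordering.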
Class 27:
- Distributed Snapshots, the algorithm
- Examples of a simple 2-process message passing system and its snapshots
- Paper by Chandy and Lamport
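A two-process snapshot can be traced with a small single-threaded simulation. This is a hedged sketch of the marker rules for the special case of two processes joined by one FIFO channel in each direction; Process, MARKER and snapshot_demo are hypothetical names, and the "state" is just an account balance.

```python
from collections import deque

MARKER = "MARKER"


class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.inbox = deque()          # FIFO channel from the peer
        self.peer = None
        self.recorded_state = None    # local state saved in the snapshot
        self.channel_state = []       # messages recorded as in flight
        self.recording = False        # True while the incoming channel is recorded

    def send(self, amount):
        self.balance -= amount
        self.peer.inbox.append(amount)

    def _record_and_send_marker(self):
        self.recorded_state = self.balance
        self.peer.inbox.append(MARKER)

    def start_snapshot(self):
        # Initiator: record own state, send a marker, record the incoming channel.
        self._record_and_send_marker()
        self.recording = True

    def deliver_one(self):
        msg = self.inbox.popleft()
        if msg == MARKER:
            if self.recorded_state is None:
                # First marker seen: record state; this channel is recorded empty.
                self._record_and_send_marker()
            self.recording = False    # a marker ends the channel recording
        else:
            self.balance += msg
            if self.recording:
                self.channel_state.append(msg)


def snapshot_demo():
    p, q = Process("p", 100), Process("q", 50)
    p.peer, q.peer = q, p
    p.send(10)
    q.send(5)
    p.start_snapshot()
    q.send(20)                # sent before q has seen the marker
    q.deliver_one()           # q receives 10
    q.deliver_one()           # q receives the marker and records its state
    for _ in range(3):
        p.deliver_one()       # p receives 5, 20, then q's marker
    return p.recorded_state, q.recorded_state, p.channel_state, q.channel_state
```

In this run the snapshot records p = 90, q = 35 and the messages 5 and 20 in flight from q to p. The total of 150 matches the system total, even though no single instant of the run had exactly that configuration.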
Class 28:
- Distributed Snapshots - properties and informal proof
- Replication of data for availability and reliability.
Class 29: Last class