Fall 2001. Distributed/Multiprocessor Operating Systems.
Class synopsis
- just a brief outline of topics covered in class
NEW:
NOTE: A more elaborate set of notes may be found at http://calypso30.eas.asu.edu/~godber/classes/cse531/notes.txt
No guarantees of accuracy or completeness.
Class 1: Aug 20: (combined class)
Class 2: Aug 22: (combined class)
- Operating System structure
- Interrupts and System Calls
- Reentrancy
- Locking of data items - atomicity
- Approaches to get atomicity
- Blocking and spinning (the test-and-set-instruction)
Class 3: Aug 27:
- Recap of Operating System structures
- Classification into SISD, SIMD, MISD and MIMD
- Using SIMD machines (how they work)
- Programming UMA systems
- The parallel construct and fork/join parallelism
- Barrier Synchronization
- Implementing barriers using Semaphores
- Spin lock vs. Mutex lock
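A minimal sketch of the "implementing barriers using semaphores" topic above, written in Python. It assumes the common two-turnstile construction (one mutex semaphore plus two counting semaphores); the outline does not say which construction was presented in class, and the names SemaphoreBarrier and run_demo are made up for illustration.

```python
import threading


class SemaphoreBarrier:
    """Reusable barrier for n threads built only from semaphores (illustrative)."""

    def __init__(self, n):
        self.n = n
        self.count = 0
        self.mutex = threading.Semaphore(1)       # protects count
        self.turnstile1 = threading.Semaphore(0)  # opens once all n have arrived
        self.turnstile2 = threading.Semaphore(0)  # opens once all n have passed turnstile1

    def wait(self):
        # Phase 1: the last thread to arrive releases turnstile1 n times.
        self.mutex.acquire()
        self.count += 1
        if self.count == self.n:
            for _ in range(self.n):
                self.turnstile1.release()
        self.mutex.release()
        self.turnstile1.acquire()

        # Phase 2: the last thread to leave releases turnstile2 n times,
        # so no thread can start the next round early.
        self.mutex.acquire()
        self.count -= 1
        if self.count == 0:
            for _ in range(self.n):
                self.turnstile2.release()
        self.mutex.release()
        self.turnstile2.acquire()


def run_demo(n=4, rounds=3):
    """Each worker logs (round, 'before') and (round, 'after') around the barrier."""
    barrier = SemaphoreBarrier(n)
    log = []
    log_lock = threading.Lock()

    def worker():
        for r in range(rounds):
            with log_lock:
                log.append((r, "before"))
            barrier.wait()
            with log_lock:
                log.append((r, "after"))

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

In every round, all "before" entries in the returned log appear ahead of any "after" entry for that round, which is the barrier property.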
Class 4: Aug 29:
- Recap of OS Structure and locking
- Scheduling in UMA operating systems
- Issues: Preemption in Critical Sections, Cache Clobbering, Thread
Interdependence
- Co-Scheduling, Affinity Scheduling, Smart Scheduling
- The NUMA machine
- Switching architecture in NUMA machines - hypercube
- Master-Slave operating systems
- Floating Master and Symmetric Operating Systems
- "Which Processor runs the OS"
Class 5: Sept 3:
Class 6: Sept 5:
- MPI, Open MP
- LINDA - the language and its uses
- Basic concepts of Distributed Computing
Class 7: Sept 10:
- What are Distributed Systems used for?
*- Distributed Applications
*- Information Sharing
*- Resource Sharing
*- Better Price/Performance
*- Higher Reliability
*- Faster Throughput
*- Growth/Flexibility
- Transparency of:
*- access,
*- location,
*- replication,
*- failure
- Reliability
*- Fault avoidance,
*- fault tolerance,
*- failure detection and
*- recovery
- Scalability
*- avoid centralized
*- do things on clients
Class 8: Sept 12:
- Discussion of Distributed Systems issues
- What is the meaning of transparency, reliability and scalability?
- Towards Distributed systems - mainframe -> mini computer -> PC
- Aggregation - make a collection of machines look like one machine
Class 9: Sept 17:
- Logical and Physical centralization
- Client server systems
- Distributed System models
- Message based, Object Based, Shared memory based
- Do processes run inside the kernel?
- File Sharing via NFS
Class 10: Sept 19:
- NFS - make a set of distributed files look centralized
- Mapping files to machines and Unix file trees
- Mount tables and remote mounting
- Name resolution using "namei"
- Scalability problems
- Scalable file system - AFS (Andrew)
- AFS clients, servers and callbacks
Class 11: Sept 24:
- Introduction to message passing
- Types of sends and receives (blocking, non-blocking, synchronous,
asynchronous)
- simple example programs
- discussion of what is a port
- threads and timeouts
- using reply ports to send messages back
- message formats and conventions (or protocols)
Class 12: Sept 26:
- Client server programs, importance of being synchronous
- Implementing ports in a single machine (in kernel)
- Ports are bounded buffers and can be implemented by a queue and some
semaphores
- Distributed implementation needs network message servers
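The point above that a port is a bounded buffer built from a queue and some semaphores can be sketched as follows. This is a single-machine illustration in Python, not the in-kernel implementation discussed in class; the class name Port and the methods send and receive are hypothetical.

```python
import threading
from collections import deque


class Port:
    """A port modeled as a bounded buffer: one queue plus three semaphores."""

    def __init__(self, capacity):
        self.queue = deque()
        self.mutex = threading.Semaphore(1)              # protects the queue
        self.free_slots = threading.Semaphore(capacity)  # senders block when full
        self.messages = threading.Semaphore(0)           # receivers block when empty

    def send(self, msg):
        self.free_slots.acquire()   # wait for room in the buffer
        self.mutex.acquire()
        self.queue.append(msg)
        self.mutex.release()
        self.messages.release()     # signal that one more message is available

    def receive(self):
        self.messages.acquire()     # wait for a message
        self.mutex.acquire()
        msg = self.queue.popleft()
        self.mutex.release()
        self.free_slots.release()   # signal that one more slot is free
        return msg


def run_demo(count=10, capacity=3):
    """One sender and one receiver exchange `count` messages through a small port."""
    port = Port(capacity)
    received = []

    def sender():
        for i in range(count):
            port.send(i)

    def receiver():
        for _ in range(count):
            received.append(port.receive())

    threads = [threading.Thread(target=sender), threading.Thread(target=receiver)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

With a single sender and a single receiver, messages come out in the order they were sent, even though the port holds at most `capacity` messages at a time.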
HOMEWORK ASSIGNED, DUE Oct 17.
Class 13: Oct 1:
- Message passing continued
- Workstation model and processor pool model
- create/delete and bind to ports (nameserver)
- capability based naming of ports
- Single threaded server limitations
Class 14: Oct 3:
- Multithreaded servers
- Multiple processes instead of threads
- Dynamic threads and thread pool
- Limitations and overheads of each approach
- Single threaded servers with hand crafted threading?
- Need for cookies
Class 15: Oct 8:
- Coroutines and transfer
- Implementing a transfer using stacks
- queue of stacks for implementing yield
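A rough sketch of yield implemented with a ready queue, in Python. As a substitution for the hand-built stacks discussed in class, Python generators stand in for coroutines here: each generator object keeps its own suspended frame, so the ready queue holds generators rather than raw stacks. The names run_round_robin and counter are hypothetical.

```python
from collections import deque


def run_round_robin(coroutines):
    """Resume each coroutine in turn until all of them have finished."""
    ready = deque(coroutines)        # the ready queue of suspended coroutines
    while ready:
        current = ready.popleft()    # transfer control to the next coroutine
        try:
            next(current)            # runs until the coroutine's next yield
        except StopIteration:
            continue                 # coroutine finished; do not requeue it
        ready.append(current)        # it yielded; put it at the back of the queue


def counter(name, n, trace):
    """A coroutine that records n steps, yielding the processor after each one."""
    for i in range(n):
        trace.append((name, i))
        yield
```

For example, running run_round_robin([counter("A", 2, trace), counter("B", 3, trace)]) with an empty list trace interleaves the two coroutines step by step until A finishes, after which B runs alone.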
Class 16: Oct 10:
- processes, threads and fibers
- Microkernels
- services in microkernels, overhead
- how to make microkernels efficient
Class 17: Oct 15:
- Review class for midterm exam
Class XX: Oct 17: MID TERM EXAM
Open book/Notes.
Homework Due
Class 18: Oct 22: (combined class)
- Class combined with Section A in SCOB-350
Please go to SCOB 350 for class
- TOPIC:
Class 19: Oct 24: (combined class)
- Class combined with Section A in SCOB-350
Please go to SCOB 350 for class
- TOPIC:
Class 20: Oct 29:
- Locking in distributed systems (advisory locking)
- Lock service, client blocking
- Exclusive locks
- Shared and exclusive locks
- Starvation
- Downgrading of locks
Class 21: Oct 31:
- Exclusive locking and upgrades
- starvation prevention
- unlocking and semantics of upgrade/downgrade
- starvations due to upgrade/downgrade
- Problems with message passing - introduction to RPC
Class 22: Nov 5:
- RPC and IDL
- How RPC Compilers generate clients and servers
- How RPC Works
Class 23: Nov 7:
- DSM
- Sequential consistency
- Page based access detection
- Invalidation
No Class: Nov 12 - Veterans Day:
Class 24: Nov 14:
- DSM page shuttling problems
- Locking and DSM
- Consistency issues
- Release consistency
- Implementation of RC DSM using diff mechanisms
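A small sketch of a diff mechanism for release-consistent DSM, in Python. It assumes the twin-and-diff scheme commonly used by such systems: copy the page before the first write, then at release time send only the bytes that changed. The outline does not say which scheme was covered in class, and the function names are hypothetical.

```python
def make_twin(page):
    """Take an immutable copy of the page before the first write to it."""
    return bytes(page)


def compute_diff(twin, page):
    """At release time, list (offset, new_value) for every byte that changed."""
    return [(i, page[i]) for i in range(len(page)) if page[i] != twin[i]]


def apply_diff(page, diff):
    """Merge a received diff into another copy of the same page."""
    for offset, value in diff:
        page[offset] = value
```

Usage: a writer calls make_twin on a bytearray page, modifies the page, then ships compute_diff(twin, page) to the other copy holders, which call apply_diff. Two writers that touch different bytes of the same page can both be merged this way without shuttling the whole page back and forth.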
Class 25: Nov 19:
- Muddy Children problem
- Hierarchies of knowledge
- Common Knowledge
- Coordinated attack (decisions on the telephone vs. email)
- Impossibility of consensus
- Handling consensus in Distributed Systems
Class 26: Nov 21:
- Event ordering in Distributed Systems
- Time in Distributed Systems
- Lamport clocks (implementation and properties)
- Distributed Mutual Exclusion (Lamport algorithm)
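The Lamport clock rules listed above can be sketched in a few lines of Python. The class and method names are hypothetical; the rules themselves are the standard ones: increment on every local or send event, and on receive take the maximum of the local clock and the message timestamp, plus one.

```python
class LamportClock:
    """Logical clock for one process (illustrative sketch)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event: advance the clock by one."""
        self.time += 1
        return self.time

    def send(self):
        """Send event: advance the clock and return the timestamp to attach."""
        self.time += 1
        return self.time

    def receive(self, msg_time):
        """Receive event: jump past both the local clock and the message timestamp."""
        self.time = max(self.time, msg_time) + 1
        return self.time
```

The key property is that if event a happens before event b, then the clock value of a is less than the clock value of b. The converse does not hold: a smaller timestamp does not imply a happened-before relation.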
Class 27: Nov 26:
- Distributed Mutual Exclusion (Maekawa algorithm)
- Distributed Snapshot algorithm
- Distributed Snapshot examples
Class 28: Nov 28:
- Distributed Snapshot, proof
- Replication and Fault tolerance
Class 29: Dec 3:
Class 30: Dec 5:
- Project demos, as per schedule