Lavoisier S.A.S.
14 rue de Provigny
94236 Cachan cedex
FRANCE

Opening hours: 08:30-12:30 / 13:30-17:30
Tel.: +33 (0)1 47 40 67 00
Fax: +33 (0)1 47 40 67 02


Canonical URL: www.lavoisier.fr/livre/informatique/distributed-system-design/wu/descriptif_1690148
Short URL or permalink: www.lavoisier.fr/livre/notice.asp?ouvrage=1690148

Distributed System Design

Language: English

Author:


Future requirements for computing speed, system reliability, and cost-effectiveness entail the development of alternative computers to replace the traditional von Neumann organization. As computing networks come into being, one of the latest dreams is now possible - distributed computing.
Distributed computing brings transparent access to as much computer power and data as the user needs for accomplishing any given task - simultaneously achieving high performance and reliability.
The subject of distributed computing is diverse, and many researchers are investigating various issues concerning the structure of hardware and the design of distributed software. Distributed System Design defines a distributed system as one that looks to its users like an ordinary system, but runs on a set of autonomous processing elements (PEs) where each PE has a separate physical memory space and the message transmission delay is not negligible. With close cooperation among these PEs, the system supports an arbitrary number of processes and dynamic extensions.
Distributed System Design outlines the main motivations for building a distributed system, including:

  • inherently distributed applications
  • performance/cost
  • resource sharing
  • flexibility and extendibility
  • availability and fault tolerance
  • scalability

Presenting basic concepts, problems, and possible solutions, this reference serves graduate students in distributed system design as well as computer professionals analyzing and designing distributed/open/parallel systems.

Chapters discuss:

  • the scope of distributed computing systems
  • general distributed programming languages and a CSP-like distributed control description language (DCDL)
  • expressing parallelism, interprocess communication and synchronization, and fault-tolerant design
  • two approaches describing a distributed system: the time-space view and the interleaving view
  • mutual exclusion and related issues, including election, bidding, and self-stabilization
  • prevention and detection of deadlock
  • reliability, safety, and security as well as various methods of handling node, communication, Byzantine, and software faults
  • efficient interprocessor communication mechanisms as well as these mechanisms without specific constraints, such as adaptiveness, deadlock-freedom, and fault-tolerance
  • virtual channels and virtual networks
  • load distribution problems
  • synchronization of access to shared data while supporting a high degree of concurrency

  • Chapter 1 introduces some basic concepts, discusses the motivation of distributed computing systems, and presents the scope of distributed computing systems. A brief overview of the book is also provided.
  • Chapter 2 surveys general distributed programming languages and introduces a CSP-like distributed control description language (DCDL). This language is used to describe several control issues such as expressing parallelism, interprocess communication and synchronization, and fault-tolerant design. A list of commonly used symbols in DCDL appears in the Appendix.
  • Chapter 3 treats distributed systems in a formal way. Several concepts such as clock, event, and state are introduced, as well as two approaches that describe a distributed system: the time-space view and the interleaving view.
  • Chapter 4 addresses the problem of mutual exclusion, which is the key issue for distributed system design. Mutual exclusion ensures that mutually conflicting concurrent processes can share resources. We also discuss three issues that are related to the mutual exclusion problem: election, bidding, and self-stabilization.
  • Chapter 5 studies prevention and detection of deadlock in a distributed system. Distributed systems, in general, exhibit a high degree of resource and data sharing, a situation in which deadlocks may happen. This chapter discusses several solutions to deadlock problems that are unique to distributed systems.
  • Chapter 6 studies efficient interprocessor communication mechanisms that are essential to the performance of distributed systems. Three types of communication, one-to-one (unicast), one-to-many (multicast), and one-to-all (broadcast), are studied in this chapter along with their performance.
  • Chapter 7 covers interprocessor communication mechanisms without specific constraints such as adaptiveness, deadlock-freedom, and fault tolerance. Concepts of virtual channels and virtual networks are introduced to achieve various objectives.
  • Chapter 8 deals with the reliability issue in distributed systems. An important objective of using distributed systems is to achieve high dependability, which includes reliability, safety, and security. A fundamental issue is to detect and handle faults that might appear in the system. In this chapter we study various methods of handling node, communication, Byzantine, and software faults in a distributed system.
  • Chapters 9 and 10 cover load distribution problems in a distributed system. Load distribution is a resource management component of a distributed system that focuses on judiciously and transparently redistributing the load of the system among the processors such that overall performance of the system is maximized. Chapter 9 studies static load distribution, where load distribution decisions are made using a priori knowledge of the system and loads cannot be redistributed at run time.
  • Chapter 10 deals with dynamic load distribution algorithms that use system state information (the loads at nodes), at least in part, to make load distribution decisions.
  • Chapter 11 describes distributed data management issues. Two specific issues are covered: (1) synchronization of access to shared data while supporting a high degree of concurrency and (2) reliability.
  • Chapter 12 contains applications of distributed design in operating systems, file systems, shared memory systems, database systems, and heterogeneous processing. Possible future research directions are also listed. The Appendix includes a list of common symbols in DCDL.
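
As a rough illustration of the distinction between static and dynamic load distribution drawn in the summaries of Chapters 9 and 10, the Python sketch below contrasts the two. It is not taken from the book: the function names `static_assign` and `dynamic_rebalance`, the greedy largest-task-first placement, and the threshold rule are all assumptions chosen only to make the idea concrete.

```python
from typing import List, Tuple


def static_assign(task_costs: List[float], num_nodes: int) -> List[int]:
    """Static policy (hypothetical): place every task once, before run time,
    using only a priori cost estimates. Tasks never move afterwards."""
    loads = [0.0] * num_nodes
    placement = [0] * len(task_costs)
    # Assumed heuristic: handle the largest tasks first, and give each one
    # to the node with the smallest estimated load so far.
    order = sorted(range(len(task_costs)), key=lambda i: task_costs[i], reverse=True)
    for i in order:
        node = min(range(num_nodes), key=lambda n: loads[n])
        placement[i] = node
        loads[node] += task_costs[i]
    return placement


def dynamic_rebalance(loads: List[int], threshold: int) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Dynamic policy (hypothetical): one balancing round driven by current
    system state, here the queue length observed at each node."""
    loads = list(loads)  # work on a copy of the observed state
    transfers = []
    for sender in range(len(loads)):
        if loads[sender] > threshold:
            receiver = min(range(len(loads)), key=lambda n: loads[n])
            # Move one task only if the receiver stays lighter than the sender.
            if loads[receiver] + 1 < loads[sender]:
                loads[sender] -= 1
                loads[receiver] += 1
                transfers.append((sender, receiver))
    return loads, transfers


if __name__ == "__main__":
    print(static_assign([4, 3, 2, 1], num_nodes=2))
    print(dynamic_rebalance([5, 1, 0], threshold=3))
```

The static function needs the task costs in advance and never revisits its decision, while the dynamic one reads the current loads each time it is called, which is the contrast the two chapter summaries describe.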

    Professional and Professional Practice & Development