Principles of Parallel and Distributed Computing

Three major milestones have led to the evolution of cloud computing:

  • Mainframes: Large computational facilities leveraging multiple processing units. Even though mainframes cannot be considered distributed systems, they offered large computational power by using multiple processors, which were presented as a single entity to users.
  • Clusters: An alternative technological advancement to the use of mainframes and supercomputers.
  • Grids: Aggregations of geographically distributed, heterogeneous resources shared across administrative domains.
  • Clouds: On-demand, network-accessible pools of virtualized compute, storage, and services.

Eras of Computing

  • Two fundamental and dominant models of computing are sequential and parallel.
    • The sequential era began in the 1940s, and the parallel (and distributed) computing era followed within a decade.
  • Four key elements of computing developed during these eras are
    • Architecture
    • Compilers
    • Applications
    • Problem-solving environments
  • The computing eras started with developments in hardware architectures, which in turn enabled the creation of system software – particularly compilers and operating systems – that supports the management of such systems and the development of applications.
  • The terms parallel computing and distributed computing are often used interchangeably, even though they mean slightly different things.
  • The term parallel implies a tightly coupled system, whereas distributed systems refer to a wider class of systems, including those that are tightly coupled.
  • More precisely, the term parallel computing refers to a model in which the computation is divided among several processors sharing the same memory.
  • The architecture of a parallel computing system is often characterized by the homogeneity of components: each processor is of the same type and it has the same capability as the others.
  • The shared memory has a single address space, which is accessible to all the processors.
  • Parallel programs are then broken down into several units of execution that can be allocated to different processors and can communicate with each other by means of shared memory.
  • Originally, parallel systems were considered to be those architectures that featured multiple processors sharing the same physical memory and that were presented as a single computer.
    • Over time, these restrictions have been relaxed, and parallel systems now include all architectures that are based on the concept of shared memory, whether this is physically present or created with the support of libraries, specific hardware, and highly efficient networking infrastructure.
    • For example, a cluster whose nodes are connected through an InfiniBand network and configured with a distributed shared-memory system can be considered a parallel system.
  • The term distributed computing encompasses any architecture or system that allows the computation to be broken down into units and executed concurrently on different computing elements, whether these are processors on different nodes, processors on the same computer, or cores within the same processor.
  • Distributed computing includes a wider range of systems and applications than parallel computing and is often considered a more general term.
  • Even though it is not a rule, the term distributed often implies that the locations of the computing elements are not the same and such elements might be heterogeneous in terms of hardware and software features.
  • Classic examples of distributed computing systems are
    • Computing Grids
    • Internet Computing Systems
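The shared-memory parallel model described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: several workers run concurrently in a single address space and communicate through shared data structures (here, a list and a lock-guarded accumulator). Note that CPython's GIL serializes bytecode execution, so the point is the memory model, not speedup.

```python
# Shared-memory parallelism: workers communicate through one address space.
import threading

results = [0] * 4          # shared memory: one slot per worker
lock = threading.Lock()    # coordinates access to the shared accumulator
total = 0

def worker(i):
    global total
    partial = sum(range(i * 1000, (i + 1) * 1000))
    results[i] = partial   # each worker writes its own slot, no lock needed
    with lock:             # the shared accumulator does need the lock
        total += partial

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert total == sum(range(4000))  # same answer as the sequential sum
```

A distributed version of the same computation would instead pass the partial sums between separate nodes by message passing, since no common address space exists.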

Elements of Parallel computing

  • Silicon-based processor chips are reaching their physical limits. Processing speed is constrained by the speed of light, and the density of transistors that can be packaged in a processor is constrained by thermodynamic limitations.
  • A viable solution to overcome this limitation is to connect multiple processors working in coordination with each other to solve “Grand Challenge” problems.
  • The first step in this direction led to the development of parallel computing, which encompasses techniques, architectures, and systems for performing multiple activities in parallel.

Parallel Processing

  • Processing of multiple tasks simultaneously on multiple processors is called parallel processing.
  • The parallel program consists of multiple active processes (tasks) simultaneously solving a given problem.
  • A given task is divided into multiple subtasks using a divide-and-conquer technique, and each subtask is processed on a different central processing unit (CPU).
  • Programming on a multiprocessor system using the divide-and-conquer technique is called parallel programming.
  • Many applications today require more computing power than a traditional sequential computer can offer.
  • Parallel Processing provides a cost-effective solution to this problem by increasing the number of CPUs in a computer and by adding an efficient communication system between them.
  • The workload can then be shared between different processors. This setup results in higher computing power and performance than a single-processor system offers.
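The divide-and-conquer scheme above can be sketched with Python's `multiprocessing` module. This is an illustrative sketch, not from the original notes: a summation is divided into subtasks, each subtask is processed by a separate worker process (and thus, typically, a separate CPU), and the partial results are combined.

```python
# Divide-and-conquer parallel processing with a pool of worker processes.
from multiprocessing import Pool

def subtask(bounds):
    lo, hi = bounds
    return sum(range(lo, hi))            # conquer: solve one piece sequentially

def parallel_sum(n, workers=4):
    step = n // workers
    # divide: one (lo, hi) range per worker
    chunks = [(i * step, (i + 1) * step) for i in range(workers)]
    chunks[-1] = (chunks[-1][0], n)      # last chunk absorbs the remainder
    with Pool(workers) as pool:
        partials = pool.map(subtask, chunks)
    return sum(partials)                 # combine the partial results

if __name__ == "__main__":
    print(parallel_sum(1_000_000))       # matches sum(range(1_000_000))
```

`pool.map` is the communication system here: it distributes the subtasks to the workers and collects their results back in order.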

Parallel Processing influencing factors

  • The development of parallel processing is being influenced by many factors. The most prominent include the following:
    • Computational requirements are ever-increasing in the areas of both scientific and business computing. The technical computing problems, which require high-speed computational power, are related to
      • life sciences, aerospace, geographical information systems, mechanical design, analysis, etc.
    • Sequential architectures are reaching physical limitations, as they are constrained by the speed of light and the laws of thermodynamics.
      • The speed at which sequential CPUs can operate is reaching saturation (no more vertical growth); hence, an alternative way to obtain high computation speed is to connect multiple CPUs (an opportunity for horizontal growth).
    • Hardware improvements in pipelining, superscalar execution, and the like are not scalable and require sophisticated compiler technology.
      • Developing such compiler technology is a difficult task.
    • Vector processing works well for certain kinds of problems. It is suitable mostly for scientific problems (involving lots of matrix operations) and graphical processing.
      • It is not useful for other areas, such as databases.
    • The technology of parallel processing is mature and can be exploited commercially;
      • there is already significant R&D work on development tools and environments.
    • Significant developments in networking technology are paving the way for
      • heterogeneous computing.