OS-Multiprocessors and Fault Tolerance


A multiprocessor system is an interconnection of two or more CPUs with memory and input-output equipment. It is controlled by a single operating system that provides interaction between the processors, and all the components of the system cooperate in solving a problem.

One appeal of multiprocessing systems is that if a processor fails, the remaining processors can normally continue operating. A failing processor must somehow inform the other processors to take over; functioning processors must be able to detect a processor that has failed. The operating system must note that a particular processor has failed and is no longer available for allocation.
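One common way for functioning processors to detect a failed one is a periodic heartbeat with a timeout. The following is a minimal sketch under stated assumptions, not a description of any particular operating system: the class `ProcessorMonitor`, its method names, and the use of integer logical time are all hypothetical and chosen only for illustration.

```python
class ProcessorMonitor:
    """Hypothetical model of an OS tracking which processors are alive."""

    def __init__(self, processor_ids, timeout):
        self.timeout = timeout
        # Time of the last heartbeat seen from each processor.
        self.last_seen = {pid: 0 for pid in processor_ids}
        # Processors the OS still considers available for allocation.
        self.available = set(processor_ids)

    def heartbeat(self, pid, now):
        """Record a heartbeat; ignored for processors already marked failed."""
        if pid in self.available:
            self.last_seen[pid] = now

    def detect_failures(self, now):
        """Mark processors silent for longer than the timeout as failed."""
        failed = {pid for pid in self.available
                  if now - self.last_seen[pid] > self.timeout}
        self.available -= failed  # no longer available for allocation
        return failed


monitor = ProcessorMonitor(processor_ids=[0, 1, 2], timeout=3)
for pid in (0, 1, 2):
    monitor.heartbeat(pid, now=1)
for pid in (0, 2):          # processor 1 stops sending heartbeats
    monitor.heartbeat(pid, now=4)

failed = monitor.detect_failures(now=5)
print(sorted(failed), sorted(monitor.available))  # prints: [1] [0, 2]
```

In this sketch the failed processor is simply removed from the allocation set; a real system would also have to reassign whatever work that processor was running.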

Multiprocessing can improve performance in two ways: a single program can be decomposed into tasks that execute in parallel, or multiple independent jobs can be run in parallel. With decreasing hardware costs, it has become common to connect a large number of microprocessors to form a multiprocessor. In this way, large-scale computing power can be achieved without the use of costly ultra-high-speed processors.
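As an illustration of decomposing a program into parallel tasks, the sketch below splits a sum of squares into chunks and hands each chunk to a worker. The function names `split`, `partial_sum`, and `parallel_sum_of_squares` are hypothetical. It uses `ThreadPoolExecutor` from Python's standard library to keep the example simple; in standard CPython builds, CPU-bound work generally needs `ProcessPoolExecutor` (which offers the same `map` interface) to actually run on several processors at once.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """One independent task: sum of squares over a slice of the data."""
    return sum(x * x for x in chunk)


def split(data, parts):
    """Divide data into at most `parts` contiguous chunks."""
    size = max(1, (len(data) + parts - 1) // parts)
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sum_of_squares(data, workers=4):
    """Run the partial sums as separate tasks and combine the results."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))


total = parallel_sum_of_squares(list(range(10)))
print(total)  # prints: 285
```

The decomposition works here because the partial sums do not depend on each other; tasks that share state would need coordination between workers.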

One of the most important capabilities of multiprocessor operating systems is their ability to withstand equipment failures in individual processors and to continue operation; this ability is referred to as fault tolerance.

Fault-tolerant systems can continue operating even when portions of the system fail. This kind of operation is especially important in so-called mission-critical systems. Fault tolerance is appropriate for systems in which it may not be possible for humans to intervene and repair the problem, such as deep-space probes, aircraft, and the like. It is also appropriate for systems in which the consequences of a failure could unfold so quickly that humans could not intervene in time.

Many techniques are commonly used to facilitate fault tolerance. These include:
  •  Critical data for the system and the various processes should be maintained in multiple copies. These copies should reside in separate storage banks so that failures in individual components will not completely destroy the data.
  •  The operating system must be designed so that it can run the maximal configuration of hardware effectively, but it must also be able to run subsets of the hardware effectively in case of failures.
  •  Hardware error detection and correction capability should be implemented so that extensive validation is performed without interfering with the efficient operation of the system.
  •  Idle processor capacity should be utilized to attempt to detect potential failures before they occur.
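
The first technique, keeping multiple copies of critical data in separate storage banks, can be sketched as follows. This is an illustrative model only: the class `ReplicatedStore`, the use of in-memory dictionaries as "banks", and the choice of a CRC-32 checksum (via Python's standard `zlib.crc32`) to detect a damaged copy are all assumptions made for the example.

```python
import zlib


class ReplicatedStore:
    """Hypothetical store that keeps every record in several banks."""

    def __init__(self, banks=3):
        # Each dict stands in for an independent storage bank.
        self.banks = [dict() for _ in range(banks)]

    def write(self, key, value):
        """Store the value and its checksum in every bank."""
        record = (value, zlib.crc32(value))
        for bank in self.banks:
            bank[key] = record

    def read(self, key):
        """Return the first copy whose checksum still matches."""
        for bank in self.banks:
            record = bank.get(key)
            if record is None:
                continue  # this bank lost the record
            value, checksum = record
            if zlib.crc32(value) == checksum:
                return value
        raise LookupError(f"no intact copy of {key!r}")


store = ReplicatedStore(banks=3)
store.write("process_table", b"pid=7 state=ready")

# Simulate a fault in bank 0 by flipping one bit of the stored value.
value, checksum = store.banks[0]["process_table"]
store.banks[0]["process_table"] = (bytes([value[0] ^ 1]) + value[1:], checksum)

recovered = store.read("process_table")
print(recovered)  # prints: b'pid=7 state=ready'
```

The read succeeds because the damaged copy in one bank fails its checksum and an intact copy is found in another; only the loss of every copy makes the data unrecoverable.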
