What is Election Algorithms ?
Many distributed algorithms employ a coordinator process that performs functions needed by the other processes in the system. These functions include enforcing mutual exclusion, maintaining a global wait-for graph for deadlock detection, replacing a lost token, and controlling an input or output device in the system. If the coordinator process fails due to the failure of the site at which it resides, the system can continue only by restarting a new copy of the coordinator on some other site.
The algorithms that determine where a new copy of the coordinator should be restarted are called election algorithms. Election algorithms assume that a unique priority number is associated with each active process in the system. For ease of notation, we assume that the priority number of process P, is /. To simplify our discussion, we assume a one-to-one correspondence between processes and sites and thus refer to both as processes.
The coordinator is always the process with the largest priority number. Hence, when a coordinator fails, the algorithm must elect that active process with the largest priority number. This number must be sent to each active process in the system. In addition, the algorithm must provide a mechanism for a recovered process to identify the current coordinator. In this section, we present examples of election algorithms for two different configurations of distributed systems.
The first algorithm applies to systems where every process can send a message to every other process in the system. The second algorithm applies to systems organized as a ring (logically or physically). Both algorithms require n 2 messages for an election, where n is the number of processes in the system. We assume that a process that has failed knows on recovery that it has indeed failed and thus takes appropriate actions to rejoin the set of active processes
The Bully Algorithm
Suppose that process P; sends a request that is not answered by the coordinator within a time interval T. In this situation, it is assumed that the coordinator has failed, and P; tries to elect itself as the new coordinator. This task is completed through the following algorithm, Process P; sends an election message to every process with a higher priority number. Process P, then waits for a time interval T for an answer from any one of these processes. If no response is received within time T, P,- assumes that all processes with numbers greater than / have failed and elects itself the new coordinator.
Process P; restarts a new copy of the coordinator and sends a message to inform all active processes with priority numbers less than; that P,- is the new coordinator. However, if an answer is received, P, begins a time interval T, waiting to receive a message informing it that a process with a higher priority number has been elected. (That is, some other process is electing itself coordinator and should report the results within time T.)
If no message is sent within T, then the process with a higher number is assumed to have failed, and process P, should restart the algorithm. If Pi is not the coordinator, then, at any time during execution, P,- may receive one of the following two messages from process P,:
1. Pj is the new coordinator (j > /). Process P,, in turn, records this information.
2. Pj has started an election (j < i). Process P,- sends a response to Pj and begins its own election algorithm, provided that P, has not already initiated such an election. The process that completes its algorithm has the highest number and is elected as the coordinator. It has sent its number to all active processes with smaller.
After a failed process recovers, it immediately begins execution of the same algorithm. If there are no active processes with higher numbers, the recovered process forces all processes with lower numbers to let it become the coordinator process, even if there is a currently active coordinator with a lower number.
For this reason, the algorithm is termed the bully algorithm. We can demonstrate the operation of the algorithm with a simple example of a system consisting of processes Pi through Pj. The operations are as follows:
1. All processes are active; P4 is the coordinator process.
2. PT and P4 fail. P2 determines that P4 has failed by sending a request that is not answered within time T. P2 then begins its election algorithm by sending a request to P3.
3. P3 receives the request, responds to P2, and begins its own algorithm by sending an election request to P4.
4. Pi receives Pa's response and begins waiting for an interval T'.
5. Pi does not respond within an interval T, so P3 elects itself the new coordinator and sends the number 3 to P2 and Pi. (Pi does not receive the number, since it has failed.) 6. Later, when P] recovers, it sends an election request to P?, P3, and P4. 7. P2 and P3 respond to Pi and begin their own election algorithms. P3 will again be elected, through the same events as before. 8. Finally, P4 recovers and notifies Pi, Pj, and P3 that it is the current coordinator. (P4 sends no election requests, since it is the process with the highest number in the system.)
The Ring Algorithm
The ring algorithm assumes that the links are unidirectional and that each process sends its messages to the neighbor on the right. The main data structure used by the algorithm is the active list, a list that contains the priority numbers of all active processes in the system when the algorithm ends; each process maintains its own active list. The algorithm works as follows:
1. If process P; detects a coordinator failure, it creates a new active list that is initially empty. It then sends a message elect(i) to its right neighbor and adds the number / to its active list.
2. If Pj receives a message electij) from the process on the left, it must respond in one of three ways: a. If this is the first elect message it has seen or sent, P, creates a new active list with the numbers i and;. It then sends the message ekct(i), followed by the message elect(j). b. If i # /—that is, the message received does not contain P.'s number —then Pj adds / to its active list and forwards the message to its right neighbor. c. If / = /—that is, Pi receives the message eled(i)—then the active list for P now contains the numbers of all the active processes in the system.
Process P; can now determine the largest number in the active list to identify the new coordinator process. This algorithm does not specify how a recovering process determines the number of the current coordinator process. One solution requires a recovering process to send an inquiry message. This message is forwarded around the ring to the current coordinator, which in turn sends a reply containing its number.