What does fail-over really mean in a distributed grid or cluster environment? In the usual sense, users expect their data or logic to fail over automatically to another available grid node when a node crashes. But is this really enough? What if, for example, a grid node is still alive but does not have the resources available to process your job? What if I/O on that node is too slow, or a database connection is not available? The outcome of a computation can also be application-specific: if a computation throws an exception, then depending on the application logic it may or may not be worthwhile to retry the same computation on another node.
The correct approach is to let users control the fail-over logic whenever custom behavior is needed. In GridGain, in addition to the fairly rich standard fail-over policy provided out of the box, we have two pluggability points where a user can plug in custom fail-over behavior: one for the overall application fail-over policy, and another for each individual computation.
The application-level behavior is provided via GridFailoverSpi (GridGain uses SPIs, Service Provider Interfaces, as plugins for any kernel-level functionality). A user simply has to implement the 'failover' method of the Failover SPI interface. Here is a very simple example of fail-over logic that picks another available node using the underlying load balancer:
public class MyFailoverSpi extends GridSpiAdapter
        implements GridFailoverSpi {
    ...

    /**
     * This logic handles fail-over of a computation
     * job from one node to another.
     */
    public GridNode failover(
            GridFailoverContext ctx,
            List topology) {
        GridJobResult failedResult = ctx.getJobResult();

        List newTopology = new ArrayList(topology);

        // Remove failed node from topology to
        // avoid retries on the same node.
        newTopology.remove(failedResult.getNode());

        // Delegate to load balancing.
        return ctx.getBalancedNode(newTopology);
    }

    ...
}
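To try the same idea without a running grid, here is a minimal, self-contained sketch in plain Java. It does not use the GridGain API: `FailoverSketch`, `RoundRobinBalancer`, and the string node names are all hypothetical stand-ins, and the round-robin balancer is only an assumption about what "delegate to load balancing" might look like. In this sketch, returning `null` simply means there is no node left to fail over to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FailoverSketch {
    /** Hypothetical stand-in for a load balancer: plain round-robin. */
    static final class RoundRobinBalancer {
        private int next = 0;

        String pick(List<String> nodes) {
            if (nodes.isEmpty()) {
                return null; // nothing left to fail over to
            }
            String node = nodes.get(next % nodes.size());
            next++;
            return node;
        }
    }

    /** Picks a replacement node, never the one that just failed. */
    static String failover(String failedNode,
                           List<String> topology,
                           RoundRobinBalancer balancer) {
        // Copy first so the caller's view of the topology stays untouched.
        List<String> candidates = new ArrayList<>(topology);

        // Remove the failed node to avoid retrying on the same node.
        candidates.remove(failedNode);

        // Delegate the actual choice to the balancer.
        return balancer.pick(candidates);
    }

    public static void main(String[] args) {
        List<String> topology = Arrays.asList("node-a", "node-b", "node-c");
        RoundRobinBalancer balancer = new RoundRobinBalancer();

        // node-a failed, so the first remaining candidate is chosen.
        System.out.println(failover("node-a", topology, balancer)); // prints node-b

        // A one-node grid has nowhere to fail over to.
        System.out.println(failover("node-a", Arrays.asList("node-a"), balancer)); // prints null
    }
}
```

Copying the topology before removing the failed node keeps the decision free of side effects, so the same list can be reused for the next failed job.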
The computation-specific behavior is overridden at the computation logic level. In GridGain, a computation unit, GridTask, is responsible for splitting your logic into smaller sub-computations, GridJobs, assigning them to remote nodes, and then aggregating the job results into one task result (GridTask is our main MapReduce abstraction). Here is an example of how a GridTask can decide that a job should be failed over to another node:
public class MyGridTask
        extends GridTaskSplitAdapter {
    public List split(int gridSize, Object arg) {
        ...
    }

    /**
     * Callback for every job result that came
     * from remote grid nodes.
     */
    public GridJobResultPolicy result(
            GridJobResult result,
            List receivedResults) {
        if (result.getData().equals(someBadResult)) {
            // Delegate to failover SPI to pick
            // another node.
            return GridJobResultPolicy.FAILOVER;
        }

        // Wait for other results to come in, or
        // reduce if all results have arrived.
        return GridJobResultPolicy.WAIT;
    }

    public Object reduce(List allResults) {
        ...
    }
}
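The effect of returning a fail-over policy from the result callback can be sketched in plain Java as well. Again, nothing here is GridGain code: the `Policy` enum, `BAD_RESULT` marker, and `runWithFailover` driver are hypothetical names made up for illustration, and the driver simply tries nodes in order rather than consulting a real fail-over SPI.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class ResultPolicySketch {
    /** Hypothetical stand-in for the policies a result callback may return. */
    enum Policy { WAIT, FAILOVER }

    /** Hypothetical marker for a result the application refuses to accept. */
    static final int BAD_RESULT = -1;

    /** The per-result decision: bad data means "try another node". */
    static Policy result(int data) {
        return data == BAD_RESULT ? Policy.FAILOVER : Policy.WAIT;
    }

    /**
     * Runs one job, moving on to the next node whenever the callback
     * asks for fail-over. Returns the first accepted result, or throws
     * if every node produced a rejected result.
     */
    static int runWithFailover(List<String> nodes,
                               Function<String, Integer> job) {
        for (String node : nodes) {
            int data = job.apply(node);
            if (result(data) == Policy.WAIT) {
                return data; // accepted, no fail-over needed
            }
            // FAILOVER: fall through and try the next node.
        }
        throw new IllegalStateException(
            "No node produced an acceptable result");
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node-a", "node-b");

        // Pretend node-a is alive but returns the bad marker,
        // for example because its database connection is down.
        int value = runWithFailover(
            nodes, node -> node.equals("node-a") ? BAD_RESULT : 42);

        System.out.println(value); // prints 42
    }
}
```

Note that node-a never crashed in this example. The job was moved only because the application itself judged the result unacceptable, which is exactly the case a crash-only fail-over policy cannot cover.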
Enjoy!

