
Let's take Cloudy Akka for example. While it is a wonderful product with great ideas borrowed from Erlang, it's not really a compute grid. What it provides is a convenient way to execute functions remotely, but it does not provide all the underlying plumbing that we have come to expect from compute grids.
I have worked on the GridGain compute grid for over 5 years (and then on the data grid) and have studied quite a few others. In my opinion, these are the minimum features any product needs before it can call itself a compute grid.
- Auto Discovery - all nodes in the grid should auto-discover each other, i.e. users should never have to manually add nodes to a topology.
- MapReduce - support for splitting execution into multiple sub-jobs and then aggregating the results is just a MUST. Otherwise you are offloading most of the dirty work onto your users, which is not fair.
- Auto Failure Detection - a compute grid must be smart enough to automatically detect node crashes and redistribute the load proportionally among the remaining nodes.
- Fault Tolerance - all failed grid jobs must be automatically failed over to other nodes that are better suited for executing these jobs.
- Load Balancing - compute grid should automatically distribute load equally among nodes, usually utilizing many different policies for load balancing. GridGain even has support for work-stealing, where less-loaded nodes can steal jobs from overloaded nodes.
- Job Collision Resolution - this gives users control over how many jobs can run in parallel, while the remaining jobs wait in queues, ordered according to one of several available collision resolution strategies.
- Auto Deployment - compute grid users should never be forced to manually deploy their libraries on all available grid nodes. This is just way too inconvenient and error-prone. The approach I like the best (available in GridGain) is auto-deployment, where code automatically propagates throughout the grid without any explicit action from users.
- Nested Jobs And Continuations - compute grid jobs should be able to invoke other compute grid jobs when executing remotely. This is a very powerful feature, especially when grid jobs are recursive. Continuations should allow a job to be suspended, releasing its resources, while it waits for the result of another job within the grid.
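To make the MapReduce and Fault Tolerance points above a bit more concrete, here is a minimal sketch in plain Java. It is not GridGain code and says nothing about how GridGain implements these features: the "nodes" are simulated in-process, and every name in it (`SplitFailoverSketch`, `Node`, `splitReduce`, `runWithFailover`) is made up for illustration. It only shows the shape of the idea: split the work into sub-jobs, move a sub-job to another node when its first choice is down, and reduce the partial results locally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.function.Function;

/** Illustration only: nodes are simulated in-process; nothing here is GridGain API. */
final class SplitFailoverSketch {

    /** Hypothetical stand-in for a grid node that may have crashed. */
    static final class Node {
        final String name;
        final boolean crashed;
        int executed; // number of sub-jobs this node completed

        Node(String name, boolean crashed) {
            this.name = name;
            this.crashed = crashed;
        }

        <T> T execute(Callable<T> job) throws Exception {
            if (crashed) {
                throw new IllegalStateException("node " + name + " is down");
            }
            T result = job.call();
            executed++;
            return result;
        }
    }

    /** Creates one sub-job per input, spreads them round-robin, then reduces locally. */
    static <I, R, A> A splitReduce(List<I> inputs, Function<I, R> map,
                                   Function<List<R>, A> reduce, List<Node> nodes) {
        List<R> partial = new ArrayList<>();
        for (int i = 0; i < inputs.size(); i++) {
            I input = inputs.get(i);
            partial.add(runWithFailover(() -> map.apply(input), nodes, i));
        }
        return reduce.apply(partial);
    }

    /** Tries the preferred node first and fails over to the next ones in order. */
    static <T> T runWithFailover(Callable<T> job, List<Node> nodes, int firstChoice) {
        Exception last = null;
        for (int attempt = 0; attempt < nodes.size(); attempt++) {
            Node node = nodes.get((firstChoice + attempt) % nodes.size());
            try {
                return node.execute(job);
            } catch (Exception e) {
                last = e; // any failure (crashed node or failed job) triggers failover
            }
        }
        throw new IllegalStateException("no node could execute the job", last);
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
            new Node("a", false), new Node("b", true), new Node("c", false));
        Function<String, Integer> map = String::length;
        Function<List<Integer>, Integer> reduce =
            lengths -> lengths.stream().mapToInt(Integer::intValue).sum();

        int count = splitReduce(
            Arrays.asList("GridGain REALLY ROCKS!".split(" ")), map, reduce, nodes);
        System.out.println(count); // prints 20
    }
}
```

In this run the sub-job first assigned to the crashed node "b" ends up on node "c", and the caller never notices. A real compute grid additionally has to detect the crash, pick a node that is actually better suited, and do all of this over the network, which is exactly the plumbing I don't want users to write themselves.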
Here is a simple example written in Scala with GridGain Scalar:
scalar {
    // 1. Execute (unicast) a simple job on some remote node.
    grid ucastRun (() => println("> GridGain ROCKS <"))

    // 2. Execute (broadcast) a simple job on all remote nodes.
    grid bcastRun (() => println("> GridGain ROCKS <"))

    // 3. Use MapReduce to split a phrase into multiple words and
    //    print each word on remote nodes.
    grid splitRun (
        for (w <- "GridGain ROCKS".split(" ")) yield () => println(w))

    // 4. Use MapReduce to count the number of characters by spreading
    //    the workload to the grid and reducing on the local node.
    val cnt = grid splitReduce (
        for (w <- "GridGain REALLY ROCKS!".split(" "))
            yield () => w.length, // Map step.
        (s: Seq[Int]) => s.sum)   // Reduce step.
}
Here is a similar example written in Java. It's a bit more verbose, but still pretty simple.
// 1. Execute (unicast) a simple job on some remote node.
grid.run(UNICAST, new GridRunnable() {
    @Override public void run() {
        System.out.println("> GridGain ROCKS <");
    }
});

// 2. Execute (broadcast) a simple job on all remote nodes.
grid.run(BROADCAST, new GridAbsClosure() {
    @Override public void apply() {
        System.out.println("> GridGain ROCKS <");
    }
});

// 3. Use MapReduce to split a phrase into multiple words and
//    print each word on remote nodes.
grid.run(SPREAD, F.yield("> GridGain ROCKS <".split(" "),
    new GridInClosure<String>() {
        @Override public void apply(String word) {
            System.out.println(word);
        }
    }));

// 4. Use MapReduce to count the number of characters by spreading
//    the workload to the grid and reducing on the local node.
int cnt = grid.reduce(
    BALANCE,
    new GridClosure<String, Integer>() { // Create executable logic.
        @Override public Integer apply(String word) {
            return word.length();
        }
    },
    Arrays.asList("GridGain REALLY ROCKS!".split(" ")), // List of words.
    F.sumIntReducer() // Reducer which adds up all the values given to it.
);
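As a sanity check on what step 4 computes, here is the same map and reduce done locally with plain Java streams. No grid is involved, and `LocalCharCount` and `countCharacters` are names I made up for this sketch. The point is only that the grid version performs the same two steps, with the map step running remotely.

```java
import java.util.Arrays;

/** Local-only illustration of the map and reduce steps; not GridGain code. */
final class LocalCharCount {

    static int countCharacters(String phrase) {
        return Arrays.stream(phrase.split(" "))
                     .mapToInt(String::length) // Map step: one length per word.
                     .sum();                   // Reduce step: add the lengths up.
    }

    public static void main(String[] args) {
        // "GridGain" (8) + "REALLY" (6) + "ROCKS!" (6)
        System.out.println(countCharacters("GridGain REALLY ROCKS!")); // prints 20
    }
}
```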
GridGain Scalar is available within the GridGain release and can be downloaded here.