1. I have assembled a list of GridGain how-to tips based on questions we usually get from our users. I will shortly put it on our Wiki as well, but here it is for now:

    1. How to execute a task on a grid?
    GridGain supports annotation-based and API-based task execution. For the annotation-based approach, attach the @Gridify annotation to your Java method. For the API-based approach, call the Grid.execute(..) method directly. There are examples of both approaches on our Wiki.

    2. How to split a task into multiple sub-units of work (jobs) for parallel execution?
    The main abstraction in GridGain is GridTask, which has three methods: map(), result(), and reduce(). In the GridTask.map(..) method you can create multiple GridJobs, which are then executed in parallel. There are also adapters, such as GridTaskAdapter and GridTaskSplitAdapter, that make implementing GridTask much simpler.
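
    To make the split-and-reduce flow concrete, here is a minimal sketch in plain Java. It is not GridGain API: the class SplitReduceSketch and its split, reduce, and execute methods are hypothetical names, and a local thread pool stands in for the grid nodes. It splits a phrase into words, counts the characters of each word as a separate job, and sums the counts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java illustration of the split -> execute -> reduce flow.
// None of these names are GridGain API; a thread pool stands in for grid nodes.
final class SplitReduceSketch {
    // "map" step: turn one argument into several independent jobs.
    static List<Callable<Integer>> split(String phrase) {
        List<Callable<Integer>> jobs = new ArrayList<>();
        for (final String word : phrase.trim().split("\\s+")) {
            jobs.add(() -> word.length());
        }
        return jobs;
    }

    // "reduce" step: combine the per-job results into one value.
    static int reduce(List<Integer> results) {
        int total = 0;
        for (int r : results) {
            total += r;
        }
        return total;
    }

    // Runs all jobs in parallel, collects their results, and reduces them.
    static int execute(String phrase) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : pool.invokeAll(split(phrase))) {
                results.add(f.get());
            }
            return reduce(results);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(execute("Hello Grid World")); // prints 14
    }
}
```

    In a real task the jobs would be shipped to remote nodes rather than to local threads, but the shape of the code (one method that creates jobs, one that aggregates results) is the same idea.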

    3. How to execute the same task on all nodes?
    In the GridTask.map(..) method, or in any of the adapters, simply return as many jobs as there are nodes (you can return the same job instance multiple times). This way every job will be executed on a different grid node. When using GridTaskSplitAdapter, make sure that GridRoundRobinLoadBalancingSpi is configured (this is the default configuration).

    4. How to pick a random node for execution?
    Change the load balancer in the GridGain configuration to GridWeightedRandomLoadBalancingSpi. This way every job will be assigned to a random node.

    5. How to pick a specific node for task execution?
    In the GridTask.map(..) method you get the list of all available GridNode instances. You can inspect every node for metrics and attributes. GridGain uses node metrics to expose all lifetime vitals for every node, such as CPU utilization, heap memory, threads, and averages for job counts and execution times. Also, all system and environment properties of a node are automatically attached to it as attributes, and users can attach any custom attributes they like. You can use all this node information to intelligently select a specific node for job execution (for example, you can pick all Linux nodes with more than 50% of heap memory available).

    Another approach is to configure GridTopologySpi and/or GridLoadBalancingSpi to select the nodes for your jobs.
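
    The "Linux nodes with more than 50% of heap available" example can be sketched in plain Java as follows. The Node class here is a hypothetical stand-in, not GridGain's GridNode, and the "os.name" attribute key is an assumption borrowed from the standard Java system property of that name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class NodeSelectionSketch {
    // Hypothetical stand-in for a grid node: an id, attributes, and heap metrics.
    static final class Node {
        final String id;
        final Map<String, String> attributes = new HashMap<>();
        final long heapUsed;
        final long heapMax;

        Node(String id, String osName, long heapUsed, long heapMax) {
            this.id = id;
            this.attributes.put("os.name", osName); // assumed attribute key
            this.heapUsed = heapUsed;
            this.heapMax = heapMax;
        }

        // Fraction of the heap that is still free, between 0.0 and 1.0.
        double freeHeapRatio() {
            return heapMax == 0 ? 0.0 : (double) (heapMax - heapUsed) / heapMax;
        }
    }

    // Keeps the ids of Linux nodes with more than half of their heap free.
    static List<String> selectLinuxNodesWithFreeHeap(List<Node> nodes) {
        List<String> selected = new ArrayList<>();
        for (Node n : nodes) {
            String os = n.attributes.get("os.name");
            if (os != null && os.toLowerCase().contains("linux") && n.freeHeapRatio() > 0.5) {
                selected.add(n.id);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
            new Node("a", "Linux", 100, 1000),
            new Node("b", "Linux", 800, 1000),
            new Node("c", "Windows XP", 100, 1000));
        System.out.println(selectLinuxNodesWithFreeHeap(nodes)); // prints [a]
    }
}
```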

    6. How to limit task execution to a subset of nodes?
    Use node attributes to segment your grid. Then configure GridAttributesTopologySpi to include only nodes that have specific attributes. This way only the nodes with the configured attributes will be passed to the GridTask.map(..) method.

    7. How to deploy a grid task?
    GridGain supports implicit and explicit task deployment. In implicit mode you don't have to do anything. Simply execute a task and all classes and resources used by it will be automatically loaded onto remote nodes via the GridGain peer class loading mechanism. You also have the option to deploy a GAR file explicitly on all nodes.

    8. How to limit a maximum number of jobs that can execute in parallel?
    GridGain has a notion of GridCollisionSpi. This SPI gets invoked every time a job arrives at a remote node. You can configure the number of parallel jobs for any Collision SPI you choose. All jobs that exceed this number will be queued for execution. You also have the option to reject any job and fail it over to another node.
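
    The "run N at a time, queue the rest" behavior can be illustrated with a fixed-size thread pool. This is a plain-Java sketch of the idea only, not how any Collision SPI is implemented; the class ParallelLimitSketch and its runAndObservePeak method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

final class ParallelLimitSketch {
    // Runs jobCount short jobs with at most maxParallel running at once
    // and returns the highest concurrency actually observed.
    static int runAndObservePeak(int jobCount, int maxParallel) {
        final AtomicInteger running = new AtomicInteger();
        final AtomicInteger peak = new AtomicInteger();
        // A fixed-size pool runs maxParallel jobs and queues the rest.
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        try {
            List<Callable<Void>> jobs = new ArrayList<>();
            for (int i = 0; i < jobCount; i++) {
                jobs.add(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(20); // simulate work
                    } finally {
                        running.decrementAndGet();
                    }
                    return null;
                });
            }
            pool.invokeAll(jobs); // blocks until every job has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // Ten jobs, at most three in flight at any moment.
        System.out.println("peak concurrency: " + runAndObservePeak(10, 3));
    }
}
```

    The observed peak never exceeds the configured limit, because the surplus jobs simply wait in the pool's queue until a slot frees up.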

    9. How to change load balancing policy?
    GridGain comes with multiple GridLoadBalancingSpi implementations out of the box. These implementations cover a wide range of load balancing algorithms, such as round-robin, random, adaptive, and affinity. Two of the more interesting ones are the adaptive policy, which listens to the grid load and automatically self-adjusts to pick the least loaded node, and the affinity policy, which always assigns a job to the same node based on the affinity key provided. The affinity policy is perfect for collocating computations with data and is often used for integration with data grids.

    There is also a concept of Job Stealing which allows less loaded nodes to steal jobs from more loaded nodes.
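
    The difference between the round-robin and affinity policies is easy to see in a few lines of plain Java. This is a sketch of the two ideas, not the code of any GridLoadBalancingSpi implementation; LoadBalancingSketch and its methods are hypothetical names, and nodes are represented by plain strings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class LoadBalancingSketch {
    private final List<String> nodes;
    private final AtomicInteger next = new AtomicInteger();

    LoadBalancingSketch(List<String> nodes) {
        this.nodes = nodes;
    }

    // Round-robin: hand out nodes in order, wrapping around at the end.
    String roundRobin() {
        int i = Math.floorMod(next.getAndIncrement(), nodes.size());
        return nodes.get(i);
    }

    // Affinity: the same key maps to the same node while the node list is unchanged.
    String affinity(Object key) {
        return nodes.get(Math.floorMod(key.hashCode(), nodes.size()));
    }

    public static void main(String[] args) {
        LoadBalancingSketch lb = new LoadBalancingSketch(Arrays.asList("n1", "n2", "n3"));
        System.out.println(lb.roundRobin()); // prints n1
        System.out.println(lb.roundRobin()); // prints n2
        // Both lookups for the same key land on the same node.
        System.out.println(lb.affinity("user-42").equals(lb.affinity("user-42"))); // prints true
    }
}
```

    Note that plain modulo hashing, as used here for brevity, remaps most keys whenever a node joins or leaves, so it only illustrates the "same key, same node" property on a stable node list.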

    10. How to control node fail-over behavior?
    In GridGain all node failures and job rejections are failed over automatically. However, users can fully control fail-over behavior by overriding the GridTask.result(..) method and deciding which cases should be failed over and which should not. Users can also control the maximum number of fail-over hops a job can make before it is considered failed by configuring any of the GridFailoverSpi implementations shipped with GridGain.
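
    The "maximum number of fail-over hops" idea can be sketched in plain Java like this. It is an illustration under simplifying assumptions, not the behavior of any particular GridFailoverSpi: FailoverSketch and executeWithFailover are hypothetical names, nodes are plain strings, and a node failure is modeled as a RuntimeException thrown by the job.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class FailoverSketch {
    // Tries the job on successive nodes and gives up after maxHops fail-overs.
    static <R> R executeWithFailover(List<String> nodes, Function<String, R> job, int maxHops) {
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxHops && attempt < nodes.size(); attempt++) {
            try {
                return job.apply(nodes.get(attempt));
            } catch (RuntimeException e) {
                last = e; // node failed or rejected the job: fail over to the next node
            }
        }
        throw new IllegalStateException("Job failed after exhausting fail-over attempts", last);
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("n1", "n2", "n3");
        // The job fails on n1 and succeeds anywhere else.
        Function<String, String> job = node -> {
            if (node.equals("n1")) {
                throw new RuntimeException("node down");
            }
            return "ok@" + node;
        };
        System.out.println(executeWithFailover(nodes, job, 1)); // prints ok@n2
    }
}
```

    With maxHops set to 0 the same job would be reported as failed, because the first attempt on n1 throws and no fail-over is allowed.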

    For more information visit:
    GridGain Website
    Wiki Documentation
    Javadoc Documentation
    Online Forums

  2. Deployment in grid environments is always a big issue. Many grid companies provide super-tooling support just to make sure that a simple JAR file is distributed across all nodes. Often users have to create Maven or Ant scripts that deploy their applications onto remote servers. To me this whole thing just sounds like a big headache. A developer still has to spend time writing an application locally, then launch a build script to deploy it remotely, and then spend an enormous amount of time with remote debuggers to find out why the heck the app does not work the way it should when deployed within a grid infrastructure.

    The latest shift in the industry is to remove the deployment step altogether. Why spend time on explicit deployment when Java provides all the necessary support to do it automatically?

    For example, JavaRebel from ZeroTurnaround introduced a deploy-less way to work with J2EE. They have a pretty cool implementation of Java class loading (which supports class unloading and reloading, by the way) that allows you to simply change your class in the IDE and immediately observe the new behavior on the application server without any deployment at all. Although it does not support some edge cases, the product does indeed provide a great productivity boost for developers.

    You can find the same "Change->Compile->Run" approach in GridGain. I know that we got our grid deployment right because I hear it from our users, who really appreciate not having to spend any time on deployment, especially in development. When working with GridGain, you simply write your code as you would locally. Then you can start several stand-alone GridGain nodes (often on the same box), hit the Run button in your IDE, and watch your code execute on the grid. There are no build scripts to run or grid nodes to restart. Whenever you change your code, hit the Run button again and watch your new code execute while the old version is automatically undeployed.

    Imagine how useful this becomes in production. Let's say that you have 10 worker grid nodes constantly running in production and several master nodes constantly emitting grid jobs to the worker nodes. If you want to deploy a new version of your application onto the grid, you don't need to bring down your production environment at all. You can simply stop the master nodes one by one and restart them with the new code in the classpath. The restarted nodes will execute the new version of the code while the remaining ones will still work with the old version. The remote worker nodes will happily execute old and new versions of the code side by side without any problems while providing all the required class-loading isolation. Once the last master node is restarted, the old code will be automatically undeployed from all the worker nodes. All this is achieved with absolutely zero downtime. We used this approach when deploying under load onto a large Amazon EC2 cloud and it worked nicely.

    When using our GAR deployment, you would not even have to bring down the master nodes in the above example. Simply deploy new GAR files onto the master nodes and the new code will automatically be deployed on the worker nodes. All the tasks currently running with the old version of the code will be allowed to complete, and then the old version will be automatically undeployed.

    Zero deployment is just one of the many productivity enhancements you can find in GridGain. For example, when debugging you can start multiple grid nodes within the same VM and step through the code in a debugger while your job executes on the grid. When writing unit tests, you can start several grid nodes right from within your test case, simulate any scenario, and validate the results.

    We provide all these productivity enhancements because we are developers ourselves and have first-hand experience with the daily pain points developers usually run into.
