1. In GridGain 3.0 we are introducing a set of annotations that automate grid-enabling of common functionality that works on ranges and collections. The idea is that, given a collection, GridGain can automatically split it into sub-collections, send them to remote nodes for execution, get the results back, reduce them, and return the final result to the user. Think of it as automatic map-reduce for collections.

    Let's take search for example. You can model almost any search as taking a collection of values, picking the right value in that collection, and returning that value. For example, let's assume that we need to find the max value in a collection (forgive the simplicity of the example, in real life you would probably be grid-enabling searches that are a lot more complex than this one).

    In Java this would look like the following:

    public Integer findMax(Collection<Integer> vals) {
        Integer max = Collections.max(vals);

        return max;
    }

    If we were to grid-enable the above functionality, we would have to do the following:
    1. Map Step: Split the initial collection into a number of sub-collections.
    2. Send each sub-collection to a remote node.
    3. Have every remote node find the max value in the sub-collection assigned to it and return it.
    4. Reduce Step: Find the maximum of all values returned from the remote nodes and return it to the user.
    As simple as it is, you can apply the same kind of steps to many other searches you do in real life.
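
The four steps above can be sketched in plain Java without any grid at all. This is only an illustration under stated assumptions: the class `ManualMaxSketch` and its methods are hypothetical names, no GridGain API is used, and the "remote node" is simulated by an ordinary local method call.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical class: a local, single-JVM imitation of steps 1-4.
// "Remote" execution is simulated by a plain method call.
class ManualMaxSketch {
    // Step 1 (map): split the input into chunks of at most chunkSize elements.
    // Assumes chunkSize > 0.
    static List<List<Integer>> split(Collection<Integer> vals, int chunkSize) {
        List<Integer> all = new ArrayList<>(vals);
        List<List<Integer>> chunks = new ArrayList<>();
        for (int i = 0; i < all.size(); i += chunkSize) {
            chunks.add(all.subList(i, Math.min(i + chunkSize, all.size())));
        }
        return chunks;
    }

    // Steps 2-3: stand-in for the job a remote node would run on its chunk.
    static Integer maxOfChunk(List<Integer> chunk) {
        return Collections.max(chunk);
    }

    // Step 4 (reduce): take the maximum of the per-chunk maxima.
    // Assumes vals is non-empty; Collections.max throws on an empty collection.
    static Integer findMax(Collection<Integer> vals, int chunkSize) {
        List<Integer> partialMaxima = new ArrayList<>();
        for (List<Integer> chunk : split(vals, chunkSize)) {
            partialMaxima.add(maxOfChunk(chunk));
        }
        return Collections.max(partialMaxima);
    }
}
```

The reduce step works here because the maximum of the per-chunk maxima equals the maximum of the whole collection. Searches whose partial results cannot be combined this way need a different reduce step.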

    To do this search in GridGain 3.0, all you would have to do is attach the @GridifySetToValue(recursive=true) annotation to your method and you are done:

    @GridifySetToValue(recursive=true)
    public Integer findMax(Collection<Integer> vals) {
        Integer max = Collections.max(vals);

        return max;
    }

    By virtue of attaching a single annotation, you are telling GridGain to perform steps 1 to 4 described above automatically. On top of that, with GridGain's peer-class-loading functionality, no code needs to be explicitly deployed to remote nodes at all. Simply bring up several GridGain images on a cloud and they are ready to start computing whatever you throw at them.

    In the coming weeks I will show how some other annotations can be used to automate grid-enabling of other common tasks we encounter on a daily basis.

    I should also mention that you can achieve the above with relative ease on the current version of GridGain (you would have to do some of the steps manually though).

    Stay tuned for GridGain 3.0 scheduled for release this summer.

     


  2. GridGain will conduct a 3-day advanced public training in Amsterdam, May 18-20, 2009. This training provides in-depth coverage of all features in GridGain. Beyond standard features and use cases, the course goes deep into complex scenarios of homogeneous and heterogeneous environments, SPI development, global grid and advanced security, custom topology resolution, and many other advanced topics.

    This 3-day training is intended for developers, architects and system analysts who are going to use GridGain in a complex environment and want a thorough understanding of every aspect of how GridGain is designed and how it operates. It is taught by the people who actually develop the GridGain product and includes labs that provide hands-on experience.

    For more information visit http://gridgain.eventbrite.com

  3. With the emergence of cloud computing, the term "Hybrid Topology" or "Hybrid Deployment" is becoming more and more common. Let me start with a definition. A "Hybrid Topology" is what you get when you join different cloud deployments into one connected cluster. For example, you can have your local data center forming a joint cluster with several images deployed on a cloud.

    Here is a simple use case. Let's say that you have an application deployed in your local data center that stays idle most of the time and only peaks for 3 hours a day, say from 4pm to 7pm. To make it cost efficient, you want to keep as few nodes as possible most of the time and, once load peaks, you want to automatically detect that and bring up a few more nodes. These new nodes that you bring up to help your existing cluster may be in a totally different data center or a different cloud (e.g. Amazon EC2 or GoGrid), yet you want them to join your cluster and participate in load balancing, job collision resolution, job execution, fail-overs, etc.

    Here are some challenges to consider when setting up hybrid clouds:

    1. On Demand Startup and Shutdown
    Your infrastructure must be able to start up and shut down cloud nodes on demand. Usually you should have a policy in place that monitors some of your application characteristics and reacts to them by starting or stopping cloud nodes. In the simplest case, you can react to CPU utilization: start new nodes if the main cloud gets overloaded and stop nodes if it gets underloaded.

    2. Cloud-based Node Discovery
    The main challenge in setting up regular discovery protocols on clouds is that IP Multicast is not enabled by most cloud vendors (including Amazon and GoGrid), so your node discovery protocol has to work over TCP. However, you do not know the IP addresses of the new nodes started on the cloud in advance either. To work around that, you can use some of the cloud storage infrastructure, like S3 or SimpleDB on Amazon, to store the IP addresses of new nodes for automatic node detection.

    3. One-Directional Communication
    One of the challenges in big enterprises is opening up new ports in firewalls for connectivity with clouds. Quite often you will only be allowed to make outgoing connections to a cloud, and your middleware should support such cases. On top of that, you may sometimes run into a scenario of *disconnected clouds*, where cloud A can talk to cloud B, and cloud B can talk to cloud C, but cloud A cannot talk to cloud C directly. Ideally, in such a case cloud A should be able to talk to cloud C through cloud B.

    4. Latency
    Communication between clouds may take longer than communication between nodes within the same cloud. Often, even communication within the same cloud is significantly slower than communication within a local data center. Your middleware layer should react to and handle such delays properly without breaking the cluster up into pieces.

    5. Reliability and Atomicity
    Many operations on the cloud are unreliable and non-transactional. For example, if you store something on Amazon S3, there is no guarantee that another application can read the stored data right away. There is also no way to ensure that data is not overwritten, or to implement some sort of file locking. The only way to provide such functionality is at the application or middleware layer.
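
To make the first challenge above more concrete, here is a minimal sketch of a CPU-based start/stop policy. Everything in it is an assumption made for illustration: the class `CpuScalingPolicy`, the watermark thresholds, and the node cap are hypothetical and are not part of any GridGain API. The sketch only decides what to do; actually starting or stopping a cloud image is left to whatever provisioning mechanism you use.

```java
// Hypothetical policy: names and thresholds are illustrative, not GridGain API.
class CpuScalingPolicy {
    enum Action { START_NODE, STOP_NODE, NONE }

    private final double lowWatermark;   // below this average CPU load (0.0-1.0), stop a cloud node
    private final double highWatermark;  // above this average CPU load (0.0-1.0), start a cloud node
    private final int maxCloudNodes;     // upper bound on extra cloud nodes, e.g. for cost control

    CpuScalingPolicy(double lowWatermark, double highWatermark, int maxCloudNodes) {
        if (lowWatermark >= highWatermark) {
            throw new IllegalArgumentException("lowWatermark must be below highWatermark");
        }
        this.lowWatermark = lowWatermark;
        this.highWatermark = highWatermark;
        this.maxCloudNodes = maxCloudNodes;
    }

    // Decide what to do given the current average CPU load of the cluster
    // and the number of cloud nodes that are already running.
    Action decide(double avgCpuLoad, int runningCloudNodes) {
        if (avgCpuLoad > highWatermark && runningCloudNodes < maxCloudNodes) {
            return Action.START_NODE;
        }
        if (avgCpuLoad < lowWatermark && runningCloudNodes > 0) {
            return Action.STOP_NODE;
        }
        return Action.NONE;
    }
}
```

Keeping a gap between the two watermarks avoids starting and stopping nodes repeatedly when the load hovers around a single threshold.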

    There are certainly other things that could go wrong, but these turned out to be the main challenges we had to resolve while working on GridGain 3.0. Some of the cool features we plan to support are On Demand Startup and Shutdown Policies (including cost-based policies) and Disconnected Clouds. GridGain 3.0 is planned for release this summer, so stay tuned!


  4. The GridGain project will be presented at the OGF25/EGEE International Conference on March 5th, 2009.

    EGEE, OGF and OGF-Europe are spearheading efforts to connect developers, users and newcomers to distributed computing for the benefit of business and research, now and in the future. The EGEE 4th User Forum / OGF25 & OGF-Europe’s 2nd International Event will catalyse people from diverse sectors to drive forward the evolution of distributed computing and open standards for the knowledge-based economy.

    This premier event in Europe will help to strengthen existing business and research communities and foster new relationships and collaborative developments on a European and global level. Special emphasis will be placed on showcasing high-level technological developments, identifying best practices, evaluating user requirements and deliberating the top priorities for the future.

    EGEE 4th User Forum/OGF25 & OGF-Europe’s 2nd International Event is a multi-faceted event featuring keynote talks delivered by high-profile experts from business, government and research, and a series of parallel and joint sessions focusing on specific sectors and technologies.

    For more information, visit the OGF25/EGEE website.
