Monday, February 9, 2009

Processing Events in Grid Environments

When working with distributed systems, exchanging events between grid nodes is often problematic. Simple broadcasting does not work in the majority of cases. Just imagine a grid of 100 nodes with every node pushing its events to all other nodes. If you push every event at the time it happens, you will end up pushing events every millisecond, which will flood the network. If you let them accumulate and throttle them, you end up pushing very large messages, which is not good for your network either. On top of that, no node cares about most of the events that happen on other nodes.

The right approach is to provide flexible event storage and to query events on demand, as needed.

For example, in GridGain, with its pluggable SPI architecture, an event storage is just an implementation of the event storage interface. By default, GridGain uses in-memory storage that keeps only the last N events in memory and automatically discards older ones. But the user is free to provide a different implementation. For example, it is fairly trivial to provide a DB/SQL-based event storage implementation that selectively stores certain events in a database and ignores all others. It could even throttle events and store them in one batch database operation.
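To make the default behavior concrete, here is a minimal sketch of a bounded in-memory storage that keeps only the last N events and answers filter queries. It is not GridGain's actual SPI: the names SketchEvent and BoundedEventStorage are hypothetical, and the standard java.util.function.Predicate stands in for an event filter.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical event type for illustration; this is not GridGain's GridEvent.
final class SketchEvent {
    final String type;
    final String message;

    SketchEvent(String type, String message) {
        this.type = type;
        this.message = message;
    }
}

// Hypothetical storage that keeps only the most recent `capacity` events.
final class BoundedEventStorage {
    private final int capacity;
    private final Deque<SketchEvent> events = new ArrayDeque<>();

    BoundedEventStorage(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Records an event, discarding the oldest one once the buffer is full.
    synchronized void record(SketchEvent evt) {
        if (events.size() == capacity) {
            events.removeFirst();
        }
        events.addLast(evt);
    }

    // Returns the stored events accepted by the filter, oldest first.
    synchronized List<SketchEvent> query(Predicate<SketchEvent> filter) {
        List<SketchEvent> accepted = new ArrayList<>();
        for (SketchEvent evt : events) {
            if (filter.test(evt)) {
                accepted.add(evt);
            }
        }
        return accepted;
    }
}
```

A DB-backed variant would keep the same two operations and change only where record() writes and where query() reads, which is the point of making the storage pluggable.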

Here is the method that allows you to query events in GridGain:


/**
 * @param filter Implementation of GridEventFilter, which is
 *      a simple filter that either accepts or rejects an event.
 * @param nodes Collection of grid nodes to send the query to.
 * @param timeout Timeout in milliseconds to wait for results.
 * @return Collection of accepted grid events from all queried nodes.
 */
Collection<GridEvent> Grid.queryEvents(
    GridEventFilter filter,
    Collection<GridNode> nodes,
    long timeout
);

So the filter will be sent to the specified remote nodes and as a result you will get a collection of accepted grid events from all the queried nodes.
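The flow of such a query can be sketched without a grid at all. The example below simulates nodes inside one JVM: each node applies the filter to its own local events, and the caller merges the replies. All names here (NodeEvent, NodeEventFilter, SimulatedNode, EventQuerySketch) are hypothetical stand-ins rather than GridGain classes, and skipping nodes that do not answer within the timeout is an assumption of this sketch, not documented GridGain behavior.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical event type; not GridGain's GridEvent.
final class NodeEvent {
    final String nodeId;
    final String type;

    NodeEvent(String nodeId, String type) {
        this.nodeId = nodeId;
        this.type = type;
    }
}

// Hypothetical filter that either accepts or rejects an event.
interface NodeEventFilter {
    boolean accept(NodeEvent evt);
}

// Hypothetical in-process stand-in for a remote grid node.
final class SimulatedNode {
    private final String id;
    private final List<NodeEvent> localEvents = new ArrayList<>();

    SimulatedNode(String id) {
        this.id = id;
    }

    void record(String type) {
        localEvents.add(new NodeEvent(id, type));
    }

    // Runs the filter where the events live, so only accepted events travel.
    List<NodeEvent> queryLocal(NodeEventFilter filter) {
        List<NodeEvent> accepted = new ArrayList<>();
        for (NodeEvent evt : localEvents) {
            if (filter.accept(evt)) {
                accepted.add(evt);
            }
        }
        return accepted;
    }
}

final class EventQuerySketch {
    // Sends the filter to every node and merges whatever comes back in time.
    static List<NodeEvent> queryEvents(
            NodeEventFilter filter, Collection<SimulatedNode> nodes, long timeoutMs) {
        List<Callable<List<NodeEvent>>> calls = new ArrayList<>();
        for (SimulatedNode node : nodes) {
            calls.add(() -> node.queryLocal(filter));
        }

        List<NodeEvent> accepted = new ArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, nodes.size()));
        try {
            for (Future<List<NodeEvent>> reply
                    : pool.invokeAll(calls, timeoutMs, TimeUnit.MILLISECONDS)) {
                try {
                    accepted.addAll(reply.get());
                } catch (CancellationException | ExecutionException e) {
                    // Assumption: a node that timed out or failed is skipped.
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // Preserve the interrupt status.
        } finally {
            pool.shutdownNow();
        }
        return accepted;
    }
}
```

The important design point is that filtering happens on the node that owns the events, so only accepted events cross the network, in contrast to broadcasting everything and filtering on the receiver.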

However, the coolest thing here is that, with GridGain Peer Class Loading, remote nodes can be bare-bones GridGain nodes that have absolutely no knowledge of your event filter class. As a matter of fact, you can simply start up several standalone grid nodes by executing a startup script, then execute any grid task directly from your IDE, and then query for events. All task execution code and event querying code (including the filter you just created) will be automatically deployed to the remote grid nodes.

This approach makes GridGain very productive in development while remaining very effective in production.

 

5 comments:

Sergio Bossa said...

Hi Dmitriy,

thanks for your interesting insight.
Just a question: why not use a publish/subscribe approach, with grid nodes subscribing to events published by other nodes?

Thanks,
Cheers,

Sergio B.

Artur said...

Just a short question - what if my query from IDE contains

System.exit(0);

?

dsetrakyan said...

As far as publish/subscribe goes, we again must be careful to avoid sending event messages as they occur, as this may flood the network, so there must be throttling of some sort.

What we may do in the future is allow users to provide a filter and a frequency when subscribing for events. This way we will avoid sending all the events and will support throttling.

Best.

dsetrakyan said...

Good question about System.exit() :)

For the most part, a grid is executed in a trusted environment and this is not an issue.

However, you can configure a standard Java security policy to prohibit certain functionality. You can also validate the message to make sure that it comes from a trusted sender.

Best.

Jonas said...

Very cool stuff.