1. A pretty cool locking mechanism we use every now and then at GridGain is the concurrent segmented, or "striped", lock. Sometimes your objects are constantly recreated, so you can't really attach a mutex to them. For cases like these you can partition the object space into multiple segments (often by object hashcode) and only acquire the lock on the segment an object belongs to. This way, the more segments you have, the more concurrency you get.

    Here is the code for our StripedLock that I thought would be good to share:
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReentrantLock;

    import org.jetbrains.annotations.Nullable;

    public class StripedLock {
        // Array of underlying locks.
        private final Lock[] locks;
    
        public StripedLock(int concurrencyLevel) {
            locks = new Lock[concurrencyLevel];
    
            for (int i = 0; i < concurrencyLevel; i++)
                locks[i] = new ReentrantLock();
        }
    
        public int concurrencyLevel() {
            return locks.length;
        }
    
        public Lock getLock(int key) {
            return locks[abs(key) % locks.length];
        }
    
        public Lock getLock(long key) {
            return locks[abs((int)(key % locks.length))];
        }
    
        // Math.abs(Integer.MIN_VALUE) is itself negative, so map it to 0.
        private int abs(int key) {
            return key == Integer.MIN_VALUE ? 0 : Math.abs(key);
        }
    
        public Lock getLock(@Nullable Object o) {
            return o == null ? locks[0] : getLock(o.hashCode());
        }
    
        public void lock(int key) {
            getLock(key).lock();
        }
    
        public void lock(long key) {
            getLock(key).lock();
        }
    
        public void unlock(int key) {
            getLock(key).unlock();
        }
    
        public void unlock(long key) {
            getLock(key).unlock();
        }
    
        public void lock(@Nullable Object o) {
            getLock(o).lock();
        }
    
        public void unlock(@Nullable Object o) {
            getLock(o).unlock();
        }
    }
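
    Usage is straightforward: lock by key, do the work, unlock in a finally block. Here is a minimal sketch (the segment count and the updateAccount(...) helper are made up for illustration):
    StripedLock striped = new StripedLock(64);

    long accountId = 12345L;

    // Locks only the segment this key maps to, so keys in other
    // segments can be updated concurrently.
    striped.lock(accountId);

    try {
        updateAccount(accountId); // Hypothetical critical section.
    }
    finally {
        striped.unlock(accountId);
    }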
    

    Simple - yet powerful! Hope you find it useful.

  2. We have added many cool features in GridGain 4.1. One of them is tight integration with the Hadoop ecosystem. There are two ways you can integrate with Hadoop. One is upstream integration, in which you efficiently load data from HDFS into the In-Memory Cache (aka Data Grid), where it gets indexed for low-latency query access. The other is downstream integration, where HDFS is used as a persistent store and data gets flushed into it periodically from the In-Memory Cache using the write-behind cache feature.

    With downstream integration, users are able to hold the latest data in memory and use HDFS as a historical data warehouse without any extra ETL process and without lags in data. Business applications can run queries and get instant analytical feedback on the whole data set, which includes the most recent data held in memory as well as the HDFS-based warehoused data, hence not losing any data in query results at all.

    Here is an example of how an HDFS-based cache store would look in this case. I show how HBase would be used, but you can write directly to HDFS if you like. Note that all we do here is override the load(...), put(...), and remove(...) methods to tell GridGain how to load and update entries in HBase.

    The full source of this example can also be found in the GridGain project on GitHub:
    public class GridCacheHBasePersonStore 
        extends GridCacheStoreAdapter<Long, Person> {
        // Default config path.
        private static final String CONFIG_PATH = "/my/hbase/hbase-site.xml";
    
        // Table name.
        private static final String TABLE_NAME = "persons";
    
        // Maximum allowed pool size.
        private static final int MAX_POOL_SIZE = 4;
    
        // HBase table pool.
        private HTablePool tblPool;
    
        // HBase column descriptor for first name.
        private HColumnDescriptor first = new HColumnDescriptor("firstName");
    
        // HBase column descriptor for last name.
        private HColumnDescriptor last = new HColumnDescriptor("lastName");
    
        public GridCacheHBasePersonStore() throws Exception {
            prepareDb();
        }
    
        // Load entry from HBase.
        @Override 
        public Person load(String cacheName, GridCacheTx tx, Long key) 
            throws GridException {
            HTableInterface t = tblPool.getTable(TABLE_NAME);
    
            try {
                Result r = t.get(new Get(Bytes.toBytes(key)));
    
                if (r == null)
                    throw new GridException("Failed to load key: " + key);
    
                if (r.isEmpty())
                    return null;
    
                Person p = new Person();
    
                p.setId(Bytes.toLong(r.getRow()));
                p.setFirst(Bytes.toString(r.getValue(first.getName(), null)));
                p.setLast(Bytes.toString(r.getValue(last.getName(), null)));
                
                return p;
            }
            catch (IOException e) {
                throw new GridException(e);
            }
            finally {
                close(t);
            }
        }
    
        // Store entry in HBase.
        @Override 
        public void put(String cacheName, GridCacheTx tx, Long key, Person val)
            throws GridException {
            HTableInterface t = tblPool.getTable(TABLE_NAME);
    
            try {
                t.put(new Put(Bytes.toBytes(key))
                    .add(first.getName(), null, Bytes.toBytes(val.getFirst()))
                    .add(last.getName(), null, Bytes.toBytes(val.getLast())));
            }
            catch (IOException e) {
                throw new GridException(e);
            }
            finally {
                close(t);
            }
        }
    
        // Remove entry from HBase.
        @Override 
        public void remove(String cacheName, GridCacheTx tx, Long key) 
            throws GridException {
            HTableInterface t = tblPool.getTable(TABLE_NAME);
    
            try {
                t.delete(new Delete(Bytes.toBytes(key)));
            }
            catch (IOException e) {
                throw new GridException(e);
            }
            finally {
                close(t);
            }
        }
    
        // Initialize HBase database.
        private void prepareDb() throws IOException {
            Configuration cfg = new Configuration();
    
            cfg.addResource(CONFIG_PATH);
    
            HBaseAdmin admin = new HBaseAdmin(cfg);
    
            if (!admin.tableExists(TABLE_NAME)) {
                HTableDescriptor desc = new HTableDescriptor(TABLE_NAME);
    
                desc.addFamily(first);
                desc.addFamily(last);
    
                admin.createTable(desc);
            }
    
            tblPool = new HTablePool(cfg, MAX_POOL_SIZE);
        }
    
        // Close HBase Table.
        private void close(@Nullable HTableInterface t) {
            ...
        }
    }
    
    To configure this store, simply specify it in the cache configuration and enable write-behind if you need data to be flushed to HBase periodically, like so:
    <bean class="org.gridgain.grid.cache.GridCacheConfigurationAdapter">
        ...
        <!-- Setup HBase Cache Store. -->
        <property name="store">
            <bean class="GridCacheHBasePersonStore" scope="singleton"/>
        </property>
    
        <!-- Enable write-behind. -->
        <property name="writeBehindEnabled" value="true"/>
        ...
    </bean>
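
    Once the store is configured, regular cache updates flow into HBase automatically. Here is a minimal usage sketch (the cache name "partitioned" is just an illustration - use whichever cache you configured with this store):
    GridCache<Long, Person> cache = G.grid().cache("partitioned");

    Person p = new Person();

    p.setId(1L);
    p.setFirst("John");
    p.setLast("Doe");

    // With write-behind enabled, this update is queued up and flushed
    // to HBase asynchronously via the store's put(...) method.
    cache.put(p.getId(), p);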
    

  3. Recently, at one of our customer meetings, I was asked whether GridGain comes with its own database. Naturally, my reaction was - why?!? GridGain easily integrates with pretty much any persistent store you wish, including any RDBMS, NoSQL, or HDFS store. But then I thought, why not? We already have cache swap space (disk overflow) storage based on the Google LevelDB key-value database, so why not have the same for the data store?

    Here is how easy it was to add a LevelDB-based data store implementation for the GridGain cache - it literally took me 20 minutes, including unit tests. The store is based on GridGain swap space, but since swap space is based on LevelDB, you essentially get a local LevelDB store for your cached data.
    public class GridCacheSwapSpaceStore<K, V> 
        extends GridCacheStoreAdapter<K, V> {
        // Default class loader.
        private ClassLoader dfltLdr = getClass().getClassLoader();
    
        @GridInstanceResource
        private Grid g; // Auto-injected grid instance
    
        @Override 
        public V load(String cacheName, GridCacheTx tx, K key) 
            throws GridException {
            return g.readFromSwap(spaceName(cacheName), key, classLoader(key));
        }
    
        @Override 
        public void put(String cacheName, GridCacheTx tx, K key, V val) 
            throws GridException {
            g.writeToSwap(spaceName(cacheName), key, val, classLoader(val, key));
        }
    
        @Override 
        public void remove(String cacheName, GridCacheTx tx, K key) 
            throws GridException {
            g.removeFromSwap(spaceName(cacheName), key, null, classLoader(key));
        }
    
        private String spaceName(String cacheName) {
            return cacheName == null ? 
                "gg-spacestore-default" : "gg-spacestore-" + cacheName;
        }
    
        private ClassLoader classLoader(Object... objs) {
            ClassLoader ldr = null;
    
            for (Object o : objs) {
                if (o != null) {
                    // Detect class loader for given object.
                    ldr = U.detectClassLoader(o.getClass());
    
                    if (ldr != dfltLdr)
                        break;
                }
            }
    
            return ldr;
        }
    }
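
    Since it plugs into the same store property shown in the HBase example above, wiring it up takes just a couple of lines. Here is a minimal configuration sketch, assuming the Java setters that back those XML properties (and reusing the Person type from the previous example):
    GridCacheConfigurationAdapter cacheCfg = new GridCacheConfigurationAdapter();

    // Plug in the swap-space (LevelDB-backed) store.
    cacheCfg.setStore(new GridCacheSwapSpaceStore<Long, Person>());

    // Optionally batch up updates and flush them to the store asynchronously.
    cacheCfg.setWriteBehindEnabled(true);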
    
    Quite easily done, in my view. It will become part of the next release of GridGain, so you will have a local persistent store out of the box if needed.

    Plenty more examples of different GridGain cache store implementations can be found on GitHub here.

  4. Most grid products on the market tout their GUI management consoles for managing and monitoring their grids. Trying to use a grid without management is like trying to drive without GPS navigation - you can get from point A to point B, but it takes a lot more effort. For example, if you have more than a couple of nodes in your grid, how do you know which nodes are alive, what data they are caching, or what tasks they are executing? Management consoles usually provide such functionality. At GridGain we have built a very feature-rich management and monitoring GUI tool called Visor (you can see screenshots here).

    However, while being able to manage the grid from a GUI is important, it is equally important to be able to do the same from code. What if you need to start or stop nodes programmatically in reaction to certain parameters of your business logic? Or what if you need to free up some more memory before executing a memory-intensive task? GridGain Remote Closure Execution in combination with Zero Deployment makes this very easy.

    For example, here is how you would start or restart multiple remote nodes at once:
    // By specifying startup parameters directly
    G.grid().startNodes(
        hostSpec, // Mappings of host related parameters
        null, // Common defaults for this start routine. 
        false, // Restart existing nodes, if any.
        2000, // Timeout to wait for nodes startup. 
        5 // Max connections to each host for concurrent startup.
    );
    
    // Or by specifying all startup parameters in a file.
    G.grid().startNodes(
        new File("/my/grid/startup/specification.ini"),
        false // Restart flag.
    );
    
    Or here is how you can compact your in-memory distributed cache on all grid nodes by freeing up internal byte buffers:
    // Get all nodes on which "mycache" is enabled.
    GridProjection gridPrj = G.grid().projectionForCache("mycache");
    
    gridPrj.callAsync(new Callable<Object>() {
        public Object call() throws Exception {
            GridCache cache = G.grid().cache("mycache");
    
            // Compact all entries in cache.
            cache.compactAll(cache.keySet());
    
            return null;
        }
    });
    
    Or here is how you can free all memory used for cache entry backups on all grid nodes and move it onto disk storage:
    // Get all nodes on which "mycache" is enabled.
    GridProjection gridPrj = G.grid().projectionForCache("mycache");
    
    gridPrj.callAsync(new Callable<Object>() {
        public Object call() throws Exception {
            GridCache cache = G.grid().cache("mycache");
    
            // Evict will move in-memory state onto swap storage on disk.
            // Note that we pass a predicate into evict method to make sure 
            // that only backup entries will be evicted.
            cache.evictAll(new GridPredicate<GridCacheEntry<Object, Object>>() {
                public boolean apply(GridCacheEntry<Object, Object> entry) {
                    // Return true if entry is a backup of some other entry.
                    return entry.backup();
                }
            });
    
            return null;
        }
    });
    

    This is just a small subset of the rich set of management hooks available in GridGain. In the above examples we have been utilizing the same remote closure execution that is used for basic MapReduce tasks (we have a more advanced API for more complex MapReduce tasks). A cool thing to note is that the closures created above will be deployed on the grid automatically using GridGain's Zero Deployment functionality - tasks and closures never have to be predeployed on remote grid nodes.