If you ever looked at Apache Ignite, you have probably noticed that it is a fairly rich platform with lots of components. However, despite the extensive feature set, Ignite community aims to make the platform easy to use and understand. Here is how the Ignite community defines their project:

Apache Ignite is
the in-memory computing platform
that is durable, strongly consistent, and highly available
with powerful SQL, key-value and processing APIs

So, in summary, Ignite looks like a distributed data storage that can work both, in-memory and on-disk, and provides SQL, key-value and processing APIs to the data. Sounds simple enough. However, to get a complete picture, perhaps it is better to define Ignite by answering several "Is Ignite a ...?" questions:

Is Ignite a persistent or pure in-memory storage?

Both. Native persistence in Ignite can be turned on and off. This allows Ignite to store data sets bigger than can fit in the available memory. Essentially, the smaller operational data sets can be stored in-memory only, and larger data sets that do not fit in memory can be stored on disk, using memory as a caching layer for better performance.


Is Ignite an in-memory database (IMDB)?

Yes. Even though Ignite durable memory works well in-memory and on-disk, the disk persistence can be disabled and Ignite can act as a pure distributed in-memory database, with support for SQL and distributed joins.


Is Ignite an in-memory data grid (IMDG)?

Yes. Ignite is a full-featured data grid, which can be used either in pure in-memory mode or with Ignite native persistence. It can also automatically integrate with any 3rd party databases, including any RDBMS or NoSQL stores.


Is Ignite a distributed database?

Yes. Data in Ignite is either partitioned or replicated across a cluster of multiple nodes. This provides scalability and adds resiliency to the system. Ignite automatically controls how data is partitioned, however, users can plugin their own distribution (affinity) functions and collocate various pieces of data together for efficiency.


Is Ignite an SQL database?

Not fully. Although Ignite aims to behave like any other relational SQL database, there are differences in how Ignite handles constraints and indexes. Ignite supports primary and secondary indexes, however, the uniqueness can only be enforced for the primary indexes. Ignite also does not support foreign key constraints.

Essentially, Ignite purposely does not support any constraints that would entail a cluster broadcast message for each update and significantly hurt performance and scalability of the system.


Is Ignite a transactional database? 

Not fully. ACID Transactions are supported, but only at key-value API level. Ignite also supports cross-partition transactions, which means that transactions can span keys residing in different partitions on different servers. At SQL level Ignite supports atomic, but not yet transactional consistency. Ignite community plans to implement SQL transactions in version 2.4.


Is Ignite a key-value store?

Yes. Ignite provides a feature rich key-value API, that is JCache (JSR-107) compliant and supports Java, C++, and .NET.

You can find out more about Ignite by visiting the freshly redesigned Ignite website.


1

View comments

If you ever looked at Apache Ignite, you have probably noticed that it is a fairly rich platform with lots of components. However, despite the extensive feature set, Ignite community aims to make the platform easy to use and understand. Here is how the Ignite community defines their project:

Apache Ignite is

the in-memory computing platform

that is durable, strongly consistent, and highly available

with powerful SQL, key-value and processing APIs

So, in summary, Ignite looks like a distributed data storage that can work both, in-memory and on-disk, and provides SQL, key-value and processing APIs to the data. Sounds simple enough.
1

Ignite is the in-memory computing platform

that is durable, strongly consistent, and highly available

with powerful SQL, key-value and processing APIs

Starting with 2.1 release, Apache Ignite has become one of a very few in-memory computing systems that provides its own distributed persistence layer.
5

Today the GridGain team has announced the release of enterprise-grade GridGain In-Memory Data Fabric v. 7.5, based on Apache Ignitetm v. 1.5. For those not familiar with GridGain or Apache Ignite, it provides the ability to distribute, cache, and compute on data in memory, including such features as in-memory data grid, compute grid, ANSI-99 in-memory SQL, real-time streaming, in-memory file system, and many more.

In my previous post I have demonstrated benchmarks for atomic JCache (JSR 107) operations and optimistic transactions between Apache Ignitetm data grid and Hazelcast. In this blog I will focus on benchmarking the pessimistic transactions.

The difference between optimistic and pessimistic modes is in the lock acquisition. In pessimistic mode locks are acquired on first access, while in optimistic mode locking happens during the commit phase.
3

Recently I have been doing many benchmarks comparing the incubating Apache Ignitetm project to other products. In this blog I will describe my experience in comparing Apache Ignite Data Grid vs Hazelcast Data Grid.

Yardstick Framework

I will be using Yardstick Framework for the benchmarks, specifically Yardstick-Docker extension. Yardstick is an open source framework for performing distributed benchmarks.
7

In its 1.0 release Apache Ignitetm added much better streaming support with ability to perform various data transformations, as well as query the streamed data using standard SQL queries. Streaming in Ignite is generally used to ingest continuous large volumes of data into Ignite distributed caches (possibly configured with sliding windows). Streamers can also be used to simply preload large amounts of data into caches on startup.

Here is an example of processing a stream of random numbers.
1

In this example we will stream text into Apache Ignite and count each individual word. We will also issue periodic SQL queries into the stream to query top 10 most popular words.

The example will work as follows:

We will setup up a cache to hold the words as they come from a stream. We will setup a 1 second sliding window to keep the words only for the last 1 second. StreamWords program will stream text data into Ignite. QueryWords program will query top 10 words out of the stream.

Ever seen a product which has duplicated mirrored APIs for synchronous and asynchronous processing? I never liked such APIs as they introduce extra noise to what otherwise could be considered a clean design. There is really no point to have myMethod() and myMethodAsync() methods while all you are trying to do is to change the mode of method execution from synchronous to asynchronous.
1

An easy-to-manage network cluster is a cluster in which all nodes are equal and can be brought up with identical configuration. However, even though all nodes are equal, it often still makes sense to assign application-specific roles to them, like "workers', "clients", or "data-nodes". In Apache Ignite, this concept is called cluster groups.
5

Some of us may have already heard the terms Data Grid and Data Fabric, however, neither of these terms has been well defined in the industry. In this blog, I will try to add some clarity to both terms by outlining some main features for data grids and data fabrics.

What is a Data Grid

Often when doing meetup presentations about Apache Ignite, I ask the crowd if anyone has ever heard of what a Data Grid is. I usually get only a few hands.
4
About me
About me
Blog Archive
Loading