In-Memory Data Grid (IMDG)

 

In my current project I am using SOSS (ScaleOut StateServer), which is built on the IMDG concept. I have found it a good technique for keeping data in out-of-process memory: for example, the controller/action permissions of the various role groups in MVC applications, user sessions, database data that changes infrequently, and other application-level data. Database hits are reduced because the data is read from RAM, which is very fast, so application performance improves considerably. Capacity is limited only by the RAM in the server farm, and RAM is readily available and relatively cheap. More details follow below. I am also aware of a demo in which about 8 lakh (800,000) filtered records were displayed in a console within about 5 seconds through SOSS.

IMDGs have one small disadvantage: if the data is not yet in the grid when the application starts, it has to be loaded first, so a little time is spent on that initial load.

Table of Contents:

  1. Scale out shortcomings
  2. What is an In-Memory Data Grid
  3. Top Benefits of IMDGs
  4. Why IMDG?
  5. What is IMDG?
  6. Factors for Accelerating IMDGs
  7. IMDGs Products (some of them)
  8. Architectural points

Scale out shortcomings:

In a scale-out setup we have many servers (more than two), and each client request is routed to one of them by a load balancer. These servers are not connected to each other.

Scaling out:

  • Offers excellent scalability.
  • But is challenging to implement:
    • How to share data across servers – the servers are not connected to each other.
    • How to maintain high availability – if one server fails, how do we read the data that was on that server?

Scale out (1)

Scale out (2)

In-Memory Data Grids overcome these issues.

What is an In-Memory Data Grid (a.k.a. “Distributed Cache”, “Distributed Data Grid”)?

  • A new “vertical” storage tier:
    • Runs as middleware software.
    • Adds a missing storage layer to boost performance.
    • Uses out-of-process memory.
    • Avoids repeated trips to a backing store.

Vertical storage
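
The "avoids repeated trips to a backing store" point can be illustrated with a minimal cache-aside sketch. It is not SOSS code: `BackingStore` and `GridCache` are hypothetical names, and a plain dictionary stands in for the grid's out-of-process memory.

```python
class BackingStore:
    """Stand-in for a slow database; counts how often it is hit."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.hits = 0

    def load(self, key):
        self.hits += 1
        return self._rows.get(key)


class GridCache:
    """Stand-in for the grid: a dict plays the role of out-of-process memory."""

    def __init__(self, store):
        self._store = store
        self._data = {}

    def get(self, key):
        # Serve from memory when possible; go to the store only on a miss.
        if key not in self._data:
            self._data[key] = self._store.load(key)
        return self._data[key]


store = BackingStore({"role:admin": ["Users/Edit", "Users/Delete"]})
cache = GridCache(store)

first = cache.get("role:admin")   # miss: one trip to the backing store
second = cache.get("role:admin")  # hit: served from memory
print(first == second, store.hits)
```

The first read also shows the start-up cost mentioned earlier: data that is not yet in the grid has to be loaded once before later reads become cheap.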

A new “horizontal” storage tier:

  • Allows data sharing among servers.
  • Scales performance and capacity.
  • Adds high availability.
  • Can be used independently of backing storage.

Horizontal storage
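
As a rough sketch of how data sharing and high availability fit together, the example below writes every key to a primary node and one backup node, then reads it back after the primary has "crashed". The `Node` and `ReplicatedGrid` classes and the CRC-based placement are assumptions made for illustration; real products have their own placement and recovery logic.

```python
import zlib


class Node:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.data = {}


class ReplicatedGrid:
    """Each key is written to a primary node and one backup node."""

    def __init__(self, names):
        self.nodes = [Node(n) for n in names]

    def _owners(self, key):
        # Stable hash, so every server computes the same owners for a key.
        i = zlib.crc32(key.encode()) % len(self.nodes)
        return self.nodes[i], self.nodes[(i + 1) % len(self.nodes)]

    def put(self, key, value):
        for node in self._owners(key):
            if node.alive:
                node.data[key] = value

    def get(self, key):
        # Try the primary first, then fall back to the backup.
        for node in self._owners(key):
            if node.alive and key in node.data:
                return node.data[key]
        raise KeyError(key)


grid = ReplicatedGrid(["node-a", "node-b", "node-c"])
grid.put("session:42", {"user": "asha"})

primary, backup = grid._owners("session:42")
primary.alive = False  # simulate a server crash

recovered = grid.get("session:42")
print(recovered)
```

Any server can call `get` for any key, which is the data-sharing part; the backup copy is what keeps the session readable after the failure.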

 

Top benefits of IMDGs

  1. Faster access time for business logic state or database data – Data that comes from the database or is produced by business logic can be stored in and read from the IMDG very quickly.
  2. Scalable throughput to match growing workload and keep response times low – You can load as much data as the combined RAM of your server farm can hold. Read times also stay nearly constant, because the data is stored as key/value pairs and located by hashing, so lookup speed is largely independent of how much data is stored.
  3. High availability to prevent data loss if a grid server (or network link) fails – IMDGs replicate data, so copies of the same data reside on more than one node. When one node fails, the data can still be read from the other nodes.
  4. Shared access to data across the server farm – Data is shared between the servers in the farm. The servers are always connected, so a request can hit any server and reach data held on any other server with very little delay.
  5. Fast data analysis for quickly and easily mining data using “map/reduce” – A “map/reduce” technique (as in Hadoop) is used, which reduces data analysis time significantly.
  6. Global data access across multiple sites and cloud.

IMDG vs. DBMS (note: increasing throughput has little impact on an IMDG, while a DBMS suffers as throughput increases.)
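
To make the “map/reduce” benefit more concrete, here is a small sketch in which each partition computes a partial result and the partials are then merged. The order records, the three partitions, and the function names are invented for illustration; in a real grid the map step would run in parallel on the servers that hold the data.

```python
from collections import Counter
from functools import reduce

# Hypothetical order records, already spread over three grid partitions.
partitions = [
    [{"region": "north", "amount": 120}, {"region": "south", "amount": 80}],
    [{"region": "north", "amount": 50}],
    [{"region": "south", "amount": 20}, {"region": "east", "amount": 10}],
]


def map_partition(records):
    """Map step: runs where the data lives and returns a small partial result."""
    totals = Counter()
    for record in records:
        totals[record["region"]] += record["amount"]
    return totals


def merge(left, right):
    """Reduce step: combines two partial results into one."""
    return left + right


# Run sequentially here; a grid would run the map step on every server at once.
partials = [map_partition(p) for p in partitions]
totals = reduce(merge, partials, Counter())
print(dict(totals))
```

Only the small per-partition totals travel over the network, not the records themselves, which is where most of the time saving comes from.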

Why IMDG?

  • For high-volume transaction processing, faster access to data is needed to meet business demands.
  • Traditional databases rely on heavy I/O operations, and this limitation created the need for in-memory technologies.

What is IMDG?

  • It is not:
    • An in-memory relational database
    • A NoSQL database
    • A normal relational database
  • It is a different breed of data store:
    • All data is stored in RAM, in distributed mode
    • Servers can be added/removed without disruption
    • Data model is object based (key-value)
    • Failover and high availability
    • Low response time
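
One common way to let servers join or leave without reshuffling everything is consistent hashing. The sketch below is an assumption about how this could work, not a description of any particular product: `HashRing`, the virtual-node count, and the key names are all hypothetical.

```python
import bisect
import hashlib


def _hash(value):
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) virtual nodes
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def owner(self, key):
        # First virtual node clockwise from the key's hash, wrapping around.
        i = bisect.bisect(self._ring, (_hash(key), ""))
        return self._ring[i % len(self._ring)][1]


keys = [f"session:{n}" for n in range(1000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.owner(k) for k in keys}

ring.add("node-d")  # a fourth server joins the grid
after = {k: ring.owner(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved")
```

Only the keys that now belong to the new server change owner; everything else stays where it was, so the rest of the grid keeps serving requests while the new server fills up.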

Factors for Accelerating IMDGs:

  • Industry demand and competitive Advantage.
  • Hardware revolution.
  • Data complexity.
  • Cloud computing.

IMDGs Products (some of them):

  • Oracle Coherence (Java)
  • VMware GemFire (Java)
  • ScaleOut StateServer (SOSS) (.NET)
  • Alachisoft NCache (.NET)

Architectural points:

  • Data Distribution: IMDGs offer data distribution modes such as partitioned, distributed, and replicated.
  • Custom Serialization: Serialization options other than Java serialization are offered that give you higher performance and greater flexibility for data storage, transfer, and language types, e.g. GemFire offers PdxSerializer and Coherence offers POF.
  • Data Querying/OQL: These solutions generally provide a SQL-like query language for accessing the stored data. OQL syntax is very similar to SQL, but there are differences as well.
  • Continuous Queries: IMDGs provide a feature that combines a query result with a continuous stream of related events in order to keep the query result up to date in real time. This capability is called Continuous Query because it has the same effect as if the desired query had zero latency and were being executed several times every millisecond.
  • Failover and High Availability: When one server crashes, client connections automatically reconnect to the other servers.
  • User Authentication: When applications aren’t handling sensitive information, simple username/password authentication is enough. Anything beyond basic authentication, such as LDAP, PKCS, or Kerberos, depends on the product in use, so the product (Coherence, GemFire, etc.) should allow the required authentication technique to be plugged in.
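
The Continuous Query idea described above can be sketched as a query that runs once and is then kept current by change events. `ContinuousQuery`, the event kinds, and the order data are hypothetical and only illustrate the mechanism; each product exposes its own API for this.

```python
class ContinuousQuery:
    """Keeps a query result current by applying change events as they arrive."""

    def __init__(self, predicate, initial_entries):
        self._predicate = predicate
        # Step 1: run the query once to get the starting result.
        self.result = {k: v for k, v in initial_entries.items() if predicate(v)}

    def on_event(self, kind, key, value=None):
        # Step 2: fold each later change into the result instead of re-querying.
        if kind == "delete" or not self._predicate(value):
            self.result.pop(key, None)
        else:  # an insert or update that matches the predicate
            self.result[key] = value


orders = {
    "o1": {"status": "open", "amount": 100},
    "o2": {"status": "closed", "amount": 40},
}
open_orders = ContinuousQuery(lambda o: o["status"] == "open", orders)

open_orders.on_event("insert", "o3", {"status": "open", "amount": 75})
open_orders.on_event("update", "o1", {"status": "closed", "amount": 100})
open_orders.on_event("delete", "o2")

print(sorted(open_orders.result))
```

The client never re-runs the query; it only reacts to events, which is why the result behaves as if the query were being executed continuously.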