Columnar Database—For Faster Analytics

Do your analytical queries involve computing aggregate values from thousands of records?

Compared to a row-oriented RDBMS, a columnar database can often run such analytical queries an order of magnitude faster.

Most analytical queries read a few selected columns across many rows rather than entire rows. By storing data in columns, columnar databases allow you to selectively access only the columns that are relevant to your query, which reduces the data read and improves performance and speed to insight.
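To make the difference concrete, here is a minimal sketch in Python comparing the two layouts for a single aggregate. The sample records, field names, and the "fields touched" counter are illustrative assumptions, not a model of any particular database engine.

```python
# Hypothetical sample data: three sales records.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]


def total_amount_row_store(records):
    """Row layout: each record is read whole, even if one field is needed."""
    fields_touched = 0
    total = 0.0
    for record in records:
        fields_touched += len(record)  # every field of the row is read
        total += record["amount"]
    return total, fields_touched


# Column layout: one contiguous list of values per column.
columns = {name: [r[name] for r in rows] for name in rows[0]}


def total_amount_column_store(cols):
    """Column layout: only the column the query needs is read."""
    amounts = cols["amount"]
    return sum(amounts), len(amounts)


row_total, row_fields = total_amount_row_store(rows)
col_total, col_fields = total_amount_column_store(columns)
print(row_total, row_fields)  # 250.0 9
print(col_total, col_fields)  # 250.0 3
```

Both layouts return the same total, but the column layout touches a third of the values here. The gap widens as tables gain more columns that the query does not use.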

Enterprise Use Cases

As columnar databases can query a large amount of data faster and with greater predictability, they are ideal for read-intensive data warehousing and BI applications. Here’s a sample list of enterprise use cases:

  • Personalize sales pitches by accessing only the relevant data attributes
  • Analyze customer interaction with products
  • Obtain data on the impressions left by advertisements
  • Increase clicks and revenue through targeted advertisements
  • Respond efficiently to customer calls by referencing relevant data

Benefits of Going Columnar

  • Reduce persistent storage through compression
  • Optimize searches on trends and aggregates
  • Reduce I/O by selectively searching columns
  • Execute ad hoc queries easily
  • Speed up BI reporting and analytics
  • Write new columns efficiently
  • Lower data administration overhead
  • Scale with growing data volumes
  • Model data using flexible schema
  • Store structured, semistructured, and unstructured data
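The compression benefit follows from the layout: values in one column share a type and often repeat, so simple encodings work well. The sketch below uses run-length encoding as one common example. It is an illustrative assumption, not the specific scheme used by any given product, and the `region` column is made-up sample data.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse consecutive repeated values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(column)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


# Hypothetical column of low-cardinality values stored contiguously.
region = ["EU", "EU", "EU", "US", "US", "EU"]

encoded = rle_encode(region)
print(encoded)  # [('EU', 3), ('US', 2), ('EU', 1)]
```

Sorting a column before encoding usually produces longer runs and therefore better compression, at the cost of maintaining the sort order.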

Popular Columnar Databases

Apache Cassandra

  • Column-oriented key-value store
  • Stores data on a cluster of nodes
  • Fault-tolerant
  • High availability via multiple seed nodes
  • Tunable consistency
  • Linear scalability
  • Faster writes than reads
  • CQL query language similar to SQL
  • Ideal for web and online applications

Apache HBase

  • Column family-oriented database
  • Uses HDFS and MapReduce
  • Automatic sharding of tables
  • High availability via standby master nodes
  • Strong record-level consistency
  • Linear and modular scalability
  • Enables random reads and writes
  • Supports ACID-level semantics per row
  • Ideal for complex data warehouse use cases
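The tunable consistency listed above is commonly explained with a replica-overlap rule. With N replicas, a read that waits for R replicas is guaranteed to see the latest write acknowledged by W replicas when R + W > N. The helper names below are hypothetical and only illustrate the arithmetic. They are not part of any database's API.

```python
def quorum(replicas: int) -> int:
    """Smallest majority of the replica set."""
    return replicas // 2 + 1


def reads_see_latest_write(replicas: int, read_acks: int, write_acks: int) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    return read_acks + write_acks > replicas


n = 3
print(quorum(n))                                         # 2
print(reads_see_latest_write(n, quorum(n), quorum(n)))   # True
print(reads_see_latest_write(n, 1, 1))                   # False
```

Lower acknowledgement counts favor latency and availability, while higher counts favor consistency, which is the trade-off the setting exposes.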

Featured App

Real-time reports on prospective employers

The client wanted an application that would help users refine their job search and make informed decisions based on an insider’s view of the companies listed. HBase offered the best bet for handling millions of company profiles and processing thousands of online articles featuring the companies every day.

The application uses Flume to collect the articles and load them into HBase. A Natural Language Processing module identifies the companies mentioned in the articles and matches them with those in the list. Over 5,000 articles are matched in 40 minutes, enabling close to real-time reporting on the companies. The references within the articles are then subjected to sentiment analysis, and a MapReduce job generates a report on each company, giving users better insight into prospective employers.
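The matching, sentiment, and reporting steps can be sketched as a small map and reduce in plain Python. Everything here is an assumption for illustration: the company names, the toy word lists, and the substring match stand in for the real NLP module and sentiment model, and the code does not use Flume, HBase, or Hadoop.

```python
from collections import defaultdict

COMPANIES = ["Acme Corp", "Globex"]        # hypothetical company list
POSITIVE = {"great", "growth", "award"}    # toy sentiment lexicon
NEGATIVE = {"layoffs", "lawsuit", "decline"}


def map_article(text):
    """Map step: emit (company, sentiment score) for each company mentioned."""
    lowered = text.lower()
    words = set(lowered.replace(",", " ").replace(".", " ").split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    for company in COMPANIES:
        if company.lower() in lowered:  # stand-in for the NLP matcher
            yield company, score


def reduce_scores(pairs):
    """Reduce step: total mentions and net sentiment per company."""
    report = defaultdict(lambda: {"mentions": 0, "sentiment": 0})
    for company, score in pairs:
        report[company]["mentions"] += 1
        report[company]["sentiment"] += score
    return dict(report)


articles = [
    "Acme Corp wins award after a year of great growth.",
    "Globex faces lawsuit amid layoffs.",
    "Acme Corp reports decline in hiring.",
]

pairs = [pair for text in articles for pair in map_article(text)]
report = reduce_scores(pairs)
print(report["Acme Corp"])  # {'mentions': 2, 'sentiment': 2}
print(report["Globex"])     # {'mentions': 1, 'sentiment': -2}
```

Because each article is mapped independently and the reduce only sums per company, the same shape of computation can be distributed across a cluster.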
