Introduction to NoSQL

Databases arose from the need for software applications to process organized data. A database that follows the relational model and stores data in a tabular format is known as a relational database. A relational database organizes data into rows and columns, with a unique key identifying each row. Relational Database Management System (RDBMS) software was first introduced by IBM and then carried forward by the likes of Oracle, Sybase, Microsoft SQL Server, and MySQL. The “relational” part of the name comes from the mathematical concept of a relation, which is what each table represents; tables can in turn be linked to one another through shared keys. This is the crux of a relational database: it works on well-defined relations within the data. Web applications, e-commerce applications, and most other applications work well with an RDBMS such as MySQL.

However, complications began to arise when enterprises were faced with enormous amounts of data. This data was not neatly organized: it comprised values, images, and even audio and video. Existing RDBMS software, which requires data in an organized form, could not process it properly. The phenomenal growth of the web, and especially cloud computing, required businesses to manage increasingly large volumes of such unorganized data. Paradigms like the Internet of Things (IoT) and the Industrial Internet of Things (IIoT) involved processing an unprecedented amount of data (called Big Data) that was spread across a distributed (geographically or otherwise) system and did not fit neatly into a relational data model. On top of that, this data had to be processed in real time, and existing RDBMSs could not handle this requirement efficiently.

Data scientists and engineers pondered this problem and worked together, in huddles and at conferences, to come up with new and advanced database software designed to meet 21st-century data management demands. The term “NoSQL” was introduced to describe these progressive data management engines, which retained some RDBMS-like qualities but went beyond the limits that constrain traditional SQL-based databases.

The NoSQL Database
So what exactly is NoSQL? The term refers to a non-SQL or non-relational database. NoSQL provides a mechanism for storing and retrieving data that is modeled in ways other than the tabular relations used in relational databases, and it represents a shift away from traditional relational database management systems (RDBMS). It does not rely on standard tables to store data; instead, it stores data in a variety of formats. Broadly speaking, NoSQL databases relax the constraints of the relational model, including strict consistency and fixed schemas. They retain many features of the relational model but amend the underlying technology in significant ways. As most modern businesses have information processing demands that long ago outgrew legacy relational systems, their IT professionals are exploring how NoSQL solutions can better manage big data and the data needs of real-time web applications. NoSQL databases have demonstrated that they can handle real-time and line-of-business applications as well as analytic and enterprise search systems. For this reason, many enterprises have already elevated NoSQL to a primary data provider alongside traditional RDBMSs.

Many people think that NoSQL is a misnomer: it defines what a database is not rather than what it is, and it focuses attention on the presence or absence of the SQL language. Although it’s true that most non-relational systems do not support SQL, it was the departure from the strict transactional and relational data model that motivated most NoSQL database designs. In any case, the name has stuck, and a growing number of enterprises are migrating to NoSQL databases.

Here are a few characteristics of NoSQL databases:

  • Serve as an online processing database, acting as the primary data source / operational data store for online applications
  • Store and retrieve data in many formats, such as key-value stores, graph databases, column-family stores, document stores, and even rows in tables
  • Use data stored in primary source systems for real-time analytics, batch analytics, and enterprise search operations
  • Offer a flexible schema design that can be changed without downtime or service disruption
  • Accommodate structured, semi-structured, and unstructured data
  • Easily operate in the cloud and exploit the benefits of cloud computing
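
To make the flexible-schema point concrete, here is a minimal Python sketch. It uses a plain in-memory list of dictionaries as a stand-in for a document collection; the names `collection`, `insert`, and `find` are hypothetical and are not part of any real NoSQL product's API.

```python
# A toy, in-memory stand-in for a schemaless document collection.
# Assumption: `collection`, `insert`, and `find` are illustrative names only.
from typing import Any

collection: list[dict[str, Any]] = []


def insert(document: dict[str, Any]) -> None:
    """Store a copy of the document as-is; no schema is declared or enforced."""
    collection.append(dict(document))


def find(**criteria: Any) -> list[dict[str, Any]]:
    """Return the documents whose fields match every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Two documents with different sets of fields live side by side.
insert({"name": "Asha", "email": "asha@example.com"})
insert({"name": "Ravi", "phone": "555-0100", "tags": ["vip", "beta"]})

# A new field appears later without any migration step or downtime.
insert({"name": "Meera", "email": "meera@example.com", "loyalty_points": 120})

print(find(name="Ravi"))
```

In a relational table, adding `loyalty_points` would normally mean altering the table definition first; here the new field simply arrives with the data.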

A point to note is that a NoSQL database doesn’t completely discard all the features and functions that define a relational database. In fact, a few NoSQL databases provide a SQL-like query language that helps ease the transition from the RDBMS world. This is a big plus for organizations that need to migrate from traditional RDBMSs to NoSQL.

There are four primary types of NoSQL databases: document databases, key-value databases, column-family (or wide-column) data stores, and graph databases. Since this is only an introductory article on NoSQL, we will not delve further into them.
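
As a rough picture only, the same customer record can be laid out in each of the four styles using plain Python data structures. This is a sketch of the data shapes, not of how any particular product stores data; every key, field name, and value below is made up for illustration.

```python
# One customer and one purchase, sketched in each of the four models.
# Assumption: all keys, field names, and values are invented for illustration.
import json

# Key-value: an opaque value looked up by a single key.
key_value_store = {"customer:42": json.dumps({"name": "Asha", "city": "Pune"})}

# Document: a nested, self-describing record.
document_store = {
    "customers": [
        {"_id": 42, "name": "Asha", "city": "Pune",
         "orders": [{"item": "keyboard", "qty": 1}]},
    ]
}

# Column-family (wide-column): row key -> column family -> columns.
column_family_store = {
    "42": {
        "profile": {"name": "Asha", "city": "Pune"},
        "orders": {"2024-01-05": "keyboard"},
    }
}

# Graph: nodes plus labelled edges between them.
graph_store = {
    "nodes": {
        "c42": {"type": "customer", "name": "Asha"},
        "p7": {"type": "product", "name": "keyboard"},
    },
    "edges": [("c42", "BOUGHT", "p7")],
}

# Reading the same facts back from each model.
name_from_kv = json.loads(key_value_store["customer:42"])["name"]
name_from_doc = document_store["customers"][0]["name"]
name_from_cf = column_family_store["42"]["profile"]["name"]
bought = [
    graph_store["nodes"][dst]["name"]
    for src, label, dst in graph_store["edges"]
    if src == "c42" and label == "BOUGHT"
]
print(name_from_kv, name_from_doc, name_from_cf, bought)
```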

There are many open source and commercially available NoSQL databases today. A few of the more popular names include MongoDB, Amazon DynamoDB, Azure Cosmos DB, and Google Cloud Datastore, but there are many others. While each may offer some unique features, NoSQL databases generally share the following common features:

  • Schema agnostic: A database schema is the description of all possible data and data structures in a relational database. With a NoSQL database, a schema isn’t required, giving you the freedom to store information without doing up-front schema design.
  • Non-relational: NoSQL databases are not built on the mathematical theory of relations that underpins relational databases.
  • Commodity hardware: NoSQL databases are distributable on commodity (off-the-shelf) hardware. In other words, you do not need any specialized hardware, which saves on cost.
  • Highly distributable: Distributed databases can store and process a set of information on more than one device. With a NoSQL database, a cluster of servers can be used to hold a single large database.
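
The idea of one large database held by a cluster of servers can be sketched by hashing each key to decide which server stores it. The Python below is a simplified illustration under stated assumptions: the node names and the `node_for`, `put`, and `get` helpers are hypothetical, the "servers" are just dictionaries, and real NoSQL systems typically use more elaborate partitioning and replication schemes.

```python
# Sketch: spreading keys across a small cluster by hashing the key.
# Assumption: node names and helper functions are hypothetical; each "server"
# is simulated by an in-memory dictionary.
import hashlib
from typing import Optional

NODES = ["node-a", "node-b", "node-c"]


def node_for(key: str) -> str:
    """Pick a node deterministically from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


# Each node holds only its own share of the data.
cluster: dict[str, dict[str, str]] = {node: {} for node in NODES}


def put(key: str, value: str) -> None:
    """Store the value on whichever node owns the key."""
    cluster[node_for(key)][key] = value


def get(key: str) -> Optional[str]:
    """Look the key up on the node that owns it; None if it is absent."""
    return cluster[node_for(key)].get(key)


for i in range(100):
    put(f"user:{i}", f"value-{i}")

print({node: len(shard) for node, shard in cluster.items()})
print(get("user:7"))
```

Because `node_for` depends only on the key, any client can work out where a record lives without a central lookup. In this naive modulo scheme, changing the number of nodes remaps most keys, which is one reason production systems use more sophisticated partitioning.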

NoSQL as a Career Choice

As IoT and IIoT usage grows, there is a growing demand for data engineers who know how to analyze Big Data. The same skills are also needed in Artificial Intelligence (AI) and Machine Learning (ML). If you are adept at managing NoSQL databases, you can look forward to a bright career in progressive, forward-thinking enterprises. There are many institutes in Pune (the city where C-DAC was born) and other cities of India that offer training in NoSQL. It is worth paying a visit to one such training centre if you are serious about a career in NoSQL.