Abstract
Relational database management systems (RDBMS) have been used for many decades. However, these databases face several challenges in meeting the requirements of many organizations, such as high scalability and availability. They cannot handle huge amounts of data and requests efficiently. As a result, well-known organizations such as Google and Amazon have shifted from RDBMS to NoSQL databases. NoSQL databases have several features that address these issues. This paper explains the features, principles, and data models of NoSQL databases. Its main focus, however, is to compare and evaluate two of the most popular NoSQL databases, MongoDB and Cassandra.
Keywords: RDBMS, NoSQL, MongoDB, Cassandra
1. Introduction
In recent years with the new breed of
If you change your application, you have to change your database schema as well. The fixed schema used by RDBMS makes it impossible for them to quickly incorporate new types of data. It is also a poor fit for unstructured and semi-structured data [14, 17].
• Queries that require attributes from more than one table result in join operations, and joins decrease the performance of RDBMS. Both joins and locks have a negative impact on RDBMS performance [4].
• RDBMS provide limited replication techniques. In fact, these databases prioritize consistency over availability [14].
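The fixed-schema limitation described above can be sketched in a few lines. This is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for an RDBMS, a plain list of dictionaries stands in for a schema-free document store, and the table name `users` and the field `twitter` are hypothetical names chosen for the example.

```python
import sqlite3

# Relational side: the schema is declared up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")


def insert_with_new_field(connection):
    """Try to store an attribute ('twitter') alongside id and name."""
    try:
        connection.execute(
            "INSERT INTO users (id, name, twitter) VALUES (2, 'Bob', '@bob')"
        )
        return True
    except sqlite3.OperationalError:
        # Raised when the table has no such column.
        return False


# The insert is rejected until the schema itself is changed.
rejected_before = not insert_with_new_field(conn)
conn.execute("ALTER TABLE users ADD COLUMN twitter TEXT")
accepted_after = insert_with_new_field(conn)
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

# Schema-free side (hypothetical stand-in for a document store):
# a record with an extra field is simply appended.
documents = [{"id": 1, "name": "Ada"}]
documents.append({"id": 2, "name": "Bob", "twitter": "@bob"})
```

In the relational half, the new attribute is only accepted after an explicit `ALTER TABLE`, whereas the dictionary-based half accepts it without any schema change.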
In order to understand NoSQL databases, chapter two will describe their most significant features for meeting the above-mentioned requirements. Since the relational data model is not suitable for some use cases, chapter three will explain the structure and flexibility of the different data models offered by NoSQL databases. Chapter four will compare two widely used NoSQL databases, MongoDB and Cassandra.
2. BASE vs. ACID
Both RDBMS and NoSQL databases use principles that are derived from the CAP theorem. According to this theorem, the following guarantees can be defined [8, 12, 20]:
• Consistency. All users see the same data at any time.
• Availability. When certain nodes fail, the other nodes in the system are able to continue to operate.
•
Most databases follow the ACID (Atomicity, Consistency, Isolation, Durability) properties to ensure that transactions are reliable. Studies show that data stores that guarantee ACID properties generally have poor availability. As a result, Dynamo targets applications that operate with weaker consistency.
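As a hedged illustration of the atomicity property named above, the following sketch uses Python's built-in sqlite3 module. The `accounts` table, the `transfer` helper, and the balances are hypothetical; the point is only that a transaction whose second step fails leaves no trace of its first step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()


def transfer(connection, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This step violates the CHECK constraint on an overdraft.
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(connection):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(connection.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # credit to bob is rolled back
final = balances(conn)
```

The second transfer credits `bob` first and then fails on the debit, so the rollback undoes the credit as well. A store that relaxes these guarantees in favor of availability would not necessarily behave this way.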
A relational database contains data records that have no preset relationships, permitting users to define their own relationships when accessing the data. Since users have considerable control over the data being accessed, relational databases can perform a variety of tasks, such as defining the database; querying the database; adding, editing, and deleting data; modifying the structure of the database; securing data from public access; communicating within the network; and exporting and importing data (Murthy, 2008).
To overcome these limitations, a new database model known as Not Only SQL (NoSQL) emerged with a set of new features. The main objective of NoSQL is not to discard SQL but to serve as an alternative data model that offers new features [1] [2] [3]. NoSQL databases improve on the performance of relational databases through a set of new characteristics and advantages. In contrast to relational databases, NoSQL databases offer flexible, horizontal scalability and can take advantage of new clusters. The rise of NoSQL provides cost-effective management of data in modern web applications. With its new features, NoSQL can be used with applications that have large transaction volumes and require low-latency access to huge datasets, service availability while
Wider insight into relational and non-relational database performance, particularly that of MySQL and Hadoop, was gathered through a literature survey. By reading textbooks and reviewing academic journals and research papers, I found a gap between the performance of relational databases and that of non-relational ones.
Relational databases play a major role in making many apps and programs work. They provide an easy way to store large amounts of data in a consistent, non-duplicating, and maintainable way for developers to use in analytics or software ("Advantages of a relational database", n.d.). However, more and more applications and companies with tremendous amounts of data, such as search engines, social networks, and e-commerce sites, require a level of speed and scalability that relational databases cannot provide ("Why NoSQL?", n.d.). NoSQL is the name given to a quickly growing type of database known as non-relational databases, which are used to store and manage huge amounts of structured, semi-structured, and non-structured data known as "Big Data" ("Why NoSQL?", n.d.). With the advent of social networks and apps with millions of users, non-structured and semi-structured data is growing exponentially, and the value of being able to quickly traverse it, analyze it, and use it for development is also growing quickly (McGuire, Manyika, & Chui, 2012).
The relational model organizes data into multiple tables and assigns a value to attributes in each row and column, with a unique key for each row. Other tables can use these keys to access the data without reorganizing the table.
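A minimal sketch of this key-based organization, again assuming Python's built-in sqlite3 module as the relational engine. The `customers` and `orders` tables and their contents are hypothetical examples, not data from the source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Each row is identified by a unique (primary) key.
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT
    );
    -- orders refers to customers through that key instead of copying data.
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        item        TEXT
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO orders VALUES
        (10, 1, 'keyboard'), (11, 1, 'monitor'), (12, 2, 'mouse');
    """
)

# A join follows the key from one table to the other at query time.
rows = conn.execute(
    """
    SELECT c.name, o.item
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
    """
).fetchall()
```

The customer name is stored once and reached through `customer_id`, which is why neither table has to be reorganized when the other grows.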
For example, Facebook, the most popular social networking website, recently announced its adoption of a NoSQL-based graph data store for efficient storage of user data. In other words, NoSQL has already made its way into the enterprise. However, just like every other widely accepted technology, NoSQL has its own set of advantages and disadvantages. It is important for an enterprise to weigh the pros and cons of a particular new database technology against the existing solutions, based on its own requirements. For example, legacy enterprise applications may require extensive community support from their database vendors, and traditional relational database vendors such as Oracle are already established providers of excellent support. On the other hand, NoSQL has been growing rapidly over the past few years and is consistently evolving in terms of big data handling, data warehousing, and reduced complexity. Hence, there is a need to study the current market of data stores, focusing on the most popular NoSQL data stores and how well they fare against the widely accepted traditional database systems. This requires a study of the commonly used NoSQL data stores.
STRUCTURE OF DATA: The data structure of a relational database consists of tables. Every table is identified by a unique name or label. A table is a collection of rows and columns. Each row of the table is known as a record and each column is known as a field of that table. All the data sets are well organized and logically linked to each other through definite and unique relationships. A table can therefore also be defined as a “structured collection of relationships”. The fundamental aim of developing NoSQL database systems is to handle vast quantities of data easily and effectively in advanced web-scale applications. To achieve this purpose, NoSQL systems are designed as schema-free database systems. There are different ways to define NoSQL databases, which typically depend on the requirements of the data that has to be managed. The main NoSQL data structures include column database, key-value store database, document store database, graph database and
NoSQL databases were created to address the Big Data problem by using a distributed system to deliver excellent performance in data storage and retrieval at very large scale. At this scale, parts of the system often fail, and NoSQL is designed to handle these failures (Chow, 2013) (Ron, Shulman-Peleg, & Bronshtein, 2015). Various companies have adopted different kinds of non-relational databases, commonly referred to as
The paper provides background and related literature on Big Data and traces the development from relational databases to current NoSQL databases, which has been fueled by the growth of Big Data and the importance of managing it. It also surveys the Big Data challenges from the perspective of its characteristics (Volume, Variety, and Velocity) and studies how those challenges can be addressed.
A NoSQL (often interpreted as Not Only SQL) database provides a mechanism for storing and retrieving data that is modelled by means other than the tabular relations used in relational databases. Motivations for this approach include simplicity of design, horizontal scaling, and finer control over availability.
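To make "means other than tabular relations" concrete, the sketch below mimics the four data structures named earlier in this text (key-value, document, column, and graph) with plain Python containers. It is only a toy illustration: all keys, field names, and the `friends_of_friends` helper are hypothetical, and real systems add distribution, persistence, and query languages on top of these shapes.

```python
# Key-value store: an opaque value looked up by a single key.
key_value = {"session:42": b"serialized-session-bytes"}

# Document store: nested, self-describing records; fields may differ
# from one document to the next.
documents = {
    "user:1": {"name": "Ada", "emails": ["ada@example.org"],
               "address": {"city": "London"}},
    "user:2": {"name": "Bob"},
}

# Column (column-family) store: row key -> column family -> columns.
column_family = {
    "user:1": {
        "profile": {"name": "Ada"},
        "activity": {"last_login": "2024-01-01"},
    },
}

# Graph store: nodes with explicit edges (here, an adjacency mapping).
graph = {"Ada": {"Bob"}, "Bob": {"Ada", "Cy"}, "Cy": {"Bob"}}


def friends_of_friends(g, person):
    """Return nodes exactly two hops away from `person`."""
    direct = g.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= g.get(friend, set())
    return two_hops - direct - {person}


suggested = friends_of_friends(graph, "Ada")
```

The graph traversal is a direct walk along edges, whereas the same question in a tabular model would typically be expressed as a self-join.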
The demands on database technology have been ever expanding since its introduction in the 1960s. Today, traffic on the internet requires that millions upon millions of records be stored and queried each second. Data must be highly available and quickly retrievable. Together, these requirements have given rise to new forms of database technology collectively called “NoSQL” or “Not Only SQL”. NoSQL eschews the strict guidelines that govern the creation and function of traditional relational databases. These guidelines are put aside in order to rise to the new demands of an increasingly interconnected world. The rigorous standards and data definitions of relational databases give way in order to provide the ability to rapidly
Modern RDBMS technologies cannot support unstructured data with optimal storage requirements. The design becomes complex and is therefore difficult for developers. Managing unstructured data is cumbersome with conventional RDBMS solutions (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). Moreover, RDBMS turns out to be a costly solution for developing agile web applications with direct data analysis requirements. NoSQL is emerging as an efficient alternative in this situation, one that bridges the issues associated with RDBMS technology. The market growth can be attributed to innovative launches of NoSQL solutions and to collaborative efforts by NoSQL vendors and customers. The efforts of organizations to improve their market offerings are creating demand for NoSQL as back-end support (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). The emergence of agile software development is also creating demand for NoSQL (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). NoSQL databases offer users many more ways to accept data in many different forms. NoSQL is as adaptable as SQL but offers many more uses that can apply to many organizations.
In the initial stages of the evolution of databases, relational database systems were designed as a solution to the problems of flat-file databases. A relational database stores data in multiple tables. This technique helped to overcome issues such as data duplication, data noise, and inconsistency by ensuring that data is entered and stored only once. Later, as data grew in size, handling such large amounts of it became a challenging task. High data velocity, variety, volume, and complexity are a few of the important characteristics that traditional database systems failed to handle successfully. As a result NoSQL came into
In this paper, we review a graph database (Neo4j), part of the emerging family of technologies called NoSQL, and compare it with a traditional relational database (MySQL). MySQL has become almost synonymous with relational databases and has been in use for a long time. However, with the emergence of Big Data there was clearly a need for more flexible databases. Applications such as Facebook's Graph Search clearly show how relationships need to be modeled in a more efficient and sophisticated manner than conventional relational models allow, which is the kind of workload that graph databases such as Neo4j are designed for. We compare MySQL and Neo4j on features such as ACID support, replication, availability, and the query language used by each.