Analysis on Key-Value Stores
Introduction
We have entered the era of Big Data. As Atikoglu, Xu, Frachtenberg, Jiang, and Paleczny (2012) stated, the need for scale-out companies to store large-scale data efficiently and at lower cost is increasing dramatically. As a result, key-value stores have grown in popularity. Fitzpatrick (2004) noted that KV stores play an essential role in many large websites such as Facebook, Twitter, GitHub and Amazon. This paper reviews six popular key-value stores and distinguishes the primary features, performance and availability of each. The six systems are HyperDex, Dynamo, SILT, Project Voldemort (used by LinkedIn), Berkeley DB and LevelDB (used by Google).
A key-value store holds a set of keys and values, with each value associated with a key. A key-value store is essentially implemented as a distributed hash table (Stonebraker, 04/2010). Key-value (KV) stores, commonly regarded as one model of NoSQL database, are widely deployed for data operation and management to enhance Internet services, owing to better scalability, higher efficiency and higher availability than existing relational database systems (Wang, et al., 2014). KV stores sacrifice the relational model in exchange for fast writes, and they typically expose simple methods such as “put()”, “delete()” and “get()”.
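The put/get/delete interface described above can be illustrated with a minimal sketch. It assumes a single-process, in-memory store backed by one hash table; the class name `KeyValueStore` is hypothetical, and a real distributed store would additionally partition and replicate keys across nodes, which is not shown here.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Illustrative in-memory key-value store backed by a hash table (dict)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Insert the value, overwriting any previous value for this key.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Return the stored value, or None if the key is absent.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Remove the key; report whether it was present.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:42", b"alice")
print(store.get("user:42"))
store.delete("user:42")
print(store.get("user:42"))
```

Each operation touches exactly one key, which is what makes this interface easy to scale out: no operation needs to consult more than one entry.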
LevelDB
LevelDB is a very lightweight, simple key-value store with limited operational abilities. LevelDB will store data in
This paper gives insight into the design and implementation of Dynamo, a key-value store that emphasizes availability and reliability. Availability is achieved by sacrificing data consistency under certain failure scenarios.
To overcome these limitations, a new database model known as Not Only SQL (NoSQL) emerged with a set of new features. The main objective of NoSQL is not to discard SQL but to serve as an alternative data model where new features are needed [1] [2] [3]. NoSQL databases improve on the performance of relational databases through a set of new characteristics and advantages. In contrast to relational databases, NoSQL databases provide flexible, horizontal scalability and can take advantage of new clusters. The rise of NoSQL provides cost-effective management of data in modern web applications. With its new features, NoSQL can be used with applications that have large transaction volumes and require low-latency access to huge datasets, service availability while
Key-value stores: Strength: simplest and easiest to implement. Weakness: they do not perform well when querying or updating a particular value, since data is addressed only by key.
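The weakness noted above can be made concrete with a small sketch. It assumes values are stored as opaque JSON strings in a plain dictionary; the sample records and the helper names `find_keys_by_field` and `update_field` are hypothetical and only serve the illustration.

```python
import json
from typing import Dict, List

# The store sees each value as an opaque string; only the key is indexed.
store: Dict[str, str] = {
    "user:1": json.dumps({"name": "Ann", "city": "Oslo"}),
    "user:2": json.dumps({"name": "Bob", "city": "Lima"}),
    "user:3": json.dumps({"name": "Cy", "city": "Oslo"}),
}


def find_keys_by_field(kv: Dict[str, str], field: str, wanted: str) -> List[str]:
    # Querying by content forces a scan that decodes every value.
    matches = []
    for key, blob in kv.items():
        if json.loads(blob).get(field) == wanted:
            matches.append(key)
    return matches


def update_field(kv: Dict[str, str], key: str, field: str, new_value: str) -> None:
    # Changing one field means reading, decoding, modifying and
    # rewriting the whole value.
    record = json.loads(kv[key])
    record[field] = new_value
    kv[key] = json.dumps(record)


print(find_keys_by_field(store, "city", "Oslo"))
update_field(store, "user:2", "city", "Oslo")
print(json.loads(store["user:2"]))
```

A lookup by key is a single hash-table access, whereas both helpers do work proportional to the number or size of the stored values.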
Key-value stores provide users with a simple yet powerful interface to data storage and are often used in complex systems. [2] LMDB is a framework that provides high-performance key-value storage
NoSQL is generally interpreted as “Not only SQL” [1]. It is a class of database management systems for non-relational data. Typically, a NoSQL database does not store data in two-dimensional tables. The four general categories of NoSQL database are key-value databases, column databases, document databases, and graph databases [2]. NoSQL databases are an indispensable part of big data. Most companies choose a NoSQL database because it yields better performance than a relational database. Many relational databases have existed for more than 20 years, while most NoSQL databases have a history of less than 5 years. Because NoSQL databases are so young, they expose many security issues. Many NoSQL databases still focus on adding features and improving performance, while strengthening security mechanisms remains a low-priority task. Two data breaches have already occurred at companies that use NoSQL databases (MongoHQ in 2013 and LinkedIn in 2012 [3]).
Column-based or wide-column NoSQL systems: These systems partition a table by column into column families, where every column family is stored in its own files. They also allow versioning of data values. Graph-based NoSQL systems: Data is represented as graphs, and related nodes can be found by traversing the edges using path expressions. Data with the following characteristics is well suited to a NoSQL system: first, rapidly growing data volume; second, columnar growth of data; third, document and tuple data; and last, hierarchical and graph data. Data with the following characteristics may be better suited to a conventional relational database management system: On-Line Transaction Processing with atomicity, consistency, isolation and durability (ACID) requirements; complex data relationships; and complex query requirements [2]. Apache Cassandra is an example of a BigTable-style database; Oracle Coherence and Kyoto Cabinet are examples of key-value stores; MongoDB and CouchDB are examples of document databases; and Neo4j and FlockDB are examples of graph databases [4]. I have selected document-based data modeling to compare and contrast with relational data modeling.
Cassandra offers very high performance, and its data model is a partitioned row store with tunable consistency.
The author says that conventional data storage systems (databases) work well with structured data but crash under heavy workloads. He describes various distributed file systems such as GFS (Google File System), HDFS (Hadoop Distributed File System), and Amazon S3 (Simple Storage Service). All of these file systems handle unstructured data and support fault tolerance through data replication. S3 in particular integrates well with other Amazon services and provides big data processing capabilities to consumers at an affordable cost in a pay-as-you-go fashion. For storing unstructured and semi-structured data, the author presents solutions used at various companies. He gives the examples of BigTable, used by Google, and PNUTS, used by Yahoo. One that caught my eye is the hybrid data management system proposed by Facebook. It is hybrid in the sense that it combines features of row-based and column-based database systems. Upon further research I found that this system actually enhances the performance of both query processing and load balancing [2]. The author then moves on to describe various available cloud vendors. All these Infrastructure as a Service (IaaS) providers employ virtualization technologies to maximize
The paper provides background and related literature on Big Data and traces the development from relational databases to current NoSQL databases, which has been fueled by the growth of Big Data and the importance of managing it. It also surveys Big Data challenges from the perspective of its characteristics (Volume, Variety and Velocity) and studies how those challenges can be addressed.
A NoSQL (often interpreted as Not Only SQL) database provides a mechanism for storage and retrieval of data that is modeled by means other than the tabular relations used in relational databases. Motivations for this approach include simplicity of design, horizontal scaling and finer control over availability.
NoSQL is able to address the massive traffic loads experienced by database servers at corporations that specialize in data processing like Google, Facebook and Amazon. NoSQL technologies can provide near constant availability, massive user concurrency and lightning fast responses. There are four primary NoSQL database implementation types being used today: document based, wide column (or columnar), key-value and graph. The different properties of SQL and NoSQL databases will be examined and an overview of each NoSQL implementation type along with an example will be given.
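As a rough illustration of how the four implementation types differ, the sketch below models the same small piece of data in each style using plain Python structures. The record contents, the `FOLLOWS` label and the `neighbors` helper are hypothetical, and real systems add storage, indexing and query layers that are not shown.

```python
import json
from typing import Dict, List, Tuple

# Key-value: the store sees only a key and an opaque value.
key_value: Dict[str, str] = {
    "user:1": json.dumps({"name": "Ann", "follows": ["user:2"]}),
}

# Document: the store understands the nested structure of each document.
document: Dict[str, dict] = {
    "user:1": {"name": "Ann", "follows": ["user:2"]},
}

# Wide column: each row key maps to column families, which hold columns.
wide_column: Dict[str, Dict[str, Dict[str, str]]] = {
    "user:1": {
        "profile": {"name": "Ann"},
        "social": {"follows:user:2": "1"},
    },
}

# Graph: nodes plus labeled, directed edges between them.
nodes: Dict[str, dict] = {"user:1": {"name": "Ann"}, "user:2": {"name": "Bob"}}
edges: List[Tuple[str, str, str]] = [("user:1", "FOLLOWS", "user:2")]


def neighbors(node_id: str, label: str) -> List[str]:
    # Follow outgoing edges with the given label from one node.
    return [dst for src, lbl, dst in edges if src == node_id and lbl == label]


print(json.loads(key_value["user:1"])["name"])
print(document["user:1"]["follows"])
print(wide_column["user:1"]["profile"]["name"])
print(neighbors("user:1", "FOLLOWS"))
```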
Modern RDBMS technologies are not capable of supporting unstructured data with optimal space requirements. The design becomes complex and is therefore difficult for developers. Managing unstructured data with conventional RDBMS solutions is correspondingly cumbersome (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). Moreover, an RDBMS turns out to be a costly solution for building agile web applications with moderate data analysis requirements. NoSQL is emerging as an efficient option in this situation, one that addresses the issues associated with RDBMS technology. The market growth can be attributed to innovative launches of NoSQL solutions and collaborative efforts by NoSQL vendors and users. The efforts of organizations to improve their market offerings are creating demand for NoSQL as back-end support (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). The emergence of agile software development is also creating demand for NoSQL (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). NoSQL databases offer users many more ways to accept data in many different forms. NoSQL is as adaptable as SQL but offers many more uses that can apply to many organizations.
In comparison to relational databases, NoSQL databases are better at providing superb performance while handling data of large scale and variable structures
Nowadays there are two major kinds of database management systems used to handle data. The first is the Relational Database Management System (RDBMS), the traditional relational database, which deals with structured data and has been popular for decades, since 1970. The second is the Not only Structured Query Language (NoSQL) database, which deals with semi-structured and unstructured data; NoSQL systems have been gaining popularity with the development of the internet and social media since April 2009. NoSQL databases are intended to overcome the drawbacks of RDBMSs, such as fixed
Amazon DynamoDB is a NoSQL database known for being cloud-based and fast. It is flexible enough to support multiple data models.