Abstract— Big data has become a significant subject as new technologies advance rapidly: smartphones, PCs and laptops, and game consoles all store information in some way. Large companies need a place not only to store all the incoming data but also to analyze it for specific purposes as quickly as possible. Many providers offer this service; this paper discusses one way Google handles data using its own purpose-built platform.
1. INTRODUCTION
In modern times, the amount of data being stored is enormous. Companies must deal with this abundance of data daily, both storing and analyzing it as fast as they can. Google is one such company: it not only stores data but also analyzes data from each user of its products. The platform Google uses for this database management is called BigQuery, which runs in the cloud and provides real-time information. This survey gives an overview of the inner workings of BigQuery to show how the platform accomplishes the job it is designed to do.
2. WHAT IS BIG DATA
Big data is a vast amount of information, both structured and unstructured. Unstructured information is not simple for traditional databases to interpret. Structured information is the opposite: data made from text or images that is not too difficult to interpret. Structured or unstructured, the main business of big data is categorized
Big Data brings many challenges and opportunities, which require us to rethink aspects such as data management in order to attain desirable outputs. The next generation of Big Data analytics (BDA) lies in its data management and its associated systems, principles and platforms. This will enable Big Data to create a new wave of technological advancements.
Experts in the field of Big Data currently face a conundrum: large-scale data analysis must be performed, yet relational database processing languages are impractical for handling the information that is collected and processed. Specifically, the sheer volume of data that must be stored in databases, processed by cloud analytics and queried by applications has led to growth in the data capacity that needs to be handled. Unfortunately, this exponential growth has exceeded the hardware and
Today ‘big data’ is more popular than ever, and database performance is becoming extremely important in every setting, including family life, school, office work and business. Providing reliable and faster database services is the goal of organizations including businesses, schools, and
Managing and analyzing big data is a huge task for organizations of all sizes and across all industries. A business that plans to implement data management tools needs a more realistic way of capturing information about its customers, products, and services. Mined data often runs to terabytes, and organizations need to be able to analyze that data quickly and then pull out the information needed to make managerial decisions. Further, with the surge of social media, smart devices and click-streams, data is generated daily on global networks through interactions. Data management technologies allow a company to combine unstructured and structured data and glean information that business managers can use to make sound business decisions, improve sales and decrease operating costs. Big data integration and analysis has evolved so that organizations can store, manage, and manipulate vast amounts of data and then provide the appropriate information when it is needed to meet business objectives.
Big data is an emerging term that has drawn attention as it gradually influences our daily lives. Big data is a broad and vague concept: because different people look at the big data phenomenon from different perspectives, it is not easy to give a precise definition (Moorthy et al., 2015). The definition of big data is a matter of debate; however, a typical reference is to the collection, management, and analysis of massive amounts of data (McNeely & Hahm, 2014). According to George et al. (2014), big data includes Internet clicks, mobile transactions, user-generated content, and social media, as well as content from sensor networks or business transactions, such as sales queries and purchase transactions. These procedures are significant to
Understanding what big data means is really simple. “It is being generated by everything around us at all times. Every digital process and social media exchange produces it. Systems, sensors and mobile devices transmit it” (Big Data Analytics). Big data is being produced by everyone, every day, so finding ways to manage this data is becoming a challenge. It arrives from multiple sources or touch points such as websites, social media or apps on smartphones at a high velocity, volume and variety. “All kinds of technologies or approaches including mobile devices, remote sensing technologies, software logs, wireless sensor networks, social media etc. are used by organizations to collect big data” (issue, 2013). Now that the meaning of ‘big data’ is clear, it is important to know that this information is useless unless it is processed properly with the right tools. Companies spend a fortune to extract meaningful value from big data; doing so requires optimal processing power, analytics capabilities and skills.
Large web companies (Google, Amazon, and Yahoo) were the first to face the problem of handling large volumes of data in real time. They were therefore the first to develop big data management tools and make them available to open source communities [8]. Gartner [8] uses the “3 Vs” to describe big data:
Big data is an extensive collection of structured and unstructured data. The term also covers the modern technology used to store, manage and analyze data that cannot be stored, managed or analyzed with commonly used software or tools. Since modern technologies have taken over our daily tasks and businesses and organizations operate over the Internet, the production of data has increased significantly in past
As Big Data problems evolve, each application has its own characteristics with respect to its data and analysis process. First, besides the huge amount of historical data, streaming data plays an important role. For instance, GPS ground stations that monitor and predict geological events such as earthquakes generate large amounts of real-time data, which requires streaming data processing. Automatic trading systems in the stock market need dynamic
Big data is not hype; it is the future. The big data industry continues to advance, and big data service providers are making it easier for companies to use big data to drive their businesses. Progressively, greater volumes and varieties of data will be incorporated into more business processes to support better decision making and greater insight. Moreover,
Big Data is data that exceeds the processing capacity of conventional database systems. The data is huge, moves quickly, and does not fit the structure of conventional database systems.
TITLE A: Big Data is fast becoming a ubiquitous term in the world of computers – but what does it actually mean? Explain the fundamental principles of Big Data and discuss the impact it is having, and may continue to have, on modern computing. What challenges does the model bring and in what ways can these be resolved?
Data is a powerful weapon as well as a resource. Having data does not make you powerful; what you do with it makes all the difference. Companies like Amazon, eBay and Netflix already use data to predict user behavior and use those predictions to increase their revenue. But processing data in real time is not an easy task. Today's data has great volume, varies in veracity and is growing at an enormous rate, and hence has been given the name Big Data. Research is constantly under way to find ways to process such huge amounts of data in real time.
This study focuses on HBase, a column-oriented NoSQL database. HBase is Apache’s open source database modeled after Google’s Bigtable technology. It provides a Java API and is built on top of the Hadoop Distributed File System (HDFS) to store and process large quantities of data while maintaining reliability and fault tolerance. Many big enterprises, including Facebook, Twitter and Yahoo, use this database to store and process large quantities of data in an efficient and cost-effective manner.
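To make the column-oriented model more concrete, the following is a minimal in-memory sketch in Python. It is not the HBase client API: the class `ToyColumnStore`, its methods, and the sample row keys are hypothetical names chosen for illustration. It only mirrors the general idea that each cell is addressed by a row key, a column written as `family:qualifier`, and a timestamp, and that rows are scanned in sorted row-key order.

```python
from collections import defaultdict


class ToyColumnStore:
    """Toy illustration of a column-family layout (hypothetical, not HBase's API)."""

    def __init__(self, families):
        # Column families are fixed up front; qualifiers can be added freely.
        self.families = set(families)
        # row key -> "family:qualifier" -> list of (timestamp, value) versions
        self.rows = defaultdict(lambda: defaultdict(list))

    def put(self, row_key, column, value, timestamp):
        """Store one versioned cell under row_key and 'family:qualifier'."""
        family, _, _qualifier = column.partition(":")
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row_key][column].append((timestamp, value))

    def get(self, row_key, column):
        """Return the value with the highest timestamp, or None if absent."""
        versions = self.rows.get(row_key, {}).get(column)
        if not versions:
            return None
        return max(versions, key=lambda version: version[0])[1]

    def scan(self, start_key, stop_key):
        """Yield (row key, latest cells) for start_key <= key < stop_key, in key order."""
        for key in sorted(self.rows):
            if start_key <= key < stop_key:
                yield key, {col: self.get(key, col) for col in self.rows[key]}


store = ToyColumnStore(["info", "stats"])
store.put("user#001", "info:name", "Ada", timestamp=1)
store.put("user#001", "info:name", "Ada L.", timestamp=2)  # newer version of the same cell
store.put("user#002", "stats:logins", 7, timestamp=1)
latest_name = store.get("user#001", "info:name")
```

In this sketch a row only holds the columns that were actually written to it, which is one reason a column-family layout suits sparse data. A real deployment would persist the data on HDFS and distribute row ranges across servers, which this toy does not attempt.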
Big data offers many opportunities that help organizations handle large amounts of data. The financial sector uses it to store finance-related data, and the healthcare sector uses it to store patient health records, doctors' details, and details of medicines and medical equipment. It is also used in the retail sector [5]. Web, social media and mobile companies use it to store user details and data such as likes, search patterns, and calling and messaging records. The manufacturing and government sectors also use it.