About this sample
Words: 1047 | Pages: 2 | 6 min read
Published: Mar 19, 2020
The data that an organization kept on its servers and hard disks used to be just data, until a new term appeared: “big data”. Big data describes a volume of structured or unstructured data so massive that traditional database management software can barely process it. In simpler words, it is data that is already very large and is still growing exponentially every day. Big data has risen so suddenly because there are far more digital platforms from which data can be collected than there were a few decades ago. Storing and managing data has also become cheaper, and the cost is expected to keep falling. Cloud computing platforms have made data accessible from any location at high download speeds. All these factors have fuelled the expansion in the use and analysis of big data.
To explain the meaning of big data, Laney presented the 3 V’s: volume, velocity and variety. These imply that the data is enormous in size, is created rapidly, and comes in various formats from many sources. Nowadays the 3 V’s are considered insufficient to describe big data, so validity, veracity, value, variability, venue and vagueness have been added in an attempt to arrive at a standard definition.
Whenever a user uses any network or digital service, they leave digital footprints that can be collected as data. Such services include anything that can store and manage large amounts of data: social media, e-commerce websites, online payment portals, search engines, digital maps, and so on. For example, suppose a person searches for ‘Good mobile phones under 7000 rupees’ on Google. Google then creates a new thread for this user’s digital footprints. If the user is logged in to a Google account in the browser, the thread is linked to their Google ID; if not, it is identified by the user’s IP address or by the ID of the device being used to access Google. If the user then goes through the search results and opens an online shopping website, Google adds to the thread which website the user prefers. The next time the user searches for a product, the preferred website appears at the top of the results. In short, Google uses the data gathered from users to show targeted search results and advertisements. In addition, if the user buys the phone from the shopping website, the website will recommend a screen guard, a phone case and other accessories for it.
Facebook goes a step further. It monitors every single click of the user and even collects what a user typed and then deleted in a comment box or status update. It also collects data from private chats and about the topics a user follows. Anyone who wants to see Facebook’s data collection and analysis algorithms in action can simply stage a fake chat with friends about buying a particular product (like a calculator or shoes) or using a particular service (like fast food or cloud storage) on the Facebook website, Facebook Messenger, WhatsApp or Instagram (all of which are owned by Facebook), and the participants will soon notice that their news feeds are full of ads related to that product or service.
Now, one can only wonder how much data is being generated every day. The massive size of the data creates the problem of extracting useful information from piles of data clusters. Different methods have been introduced to overcome this problem as far as possible, among them sampling, incremental learning, density-based approaches, data condensation, grid-based approaches, distributed computing, and divide and conquer. Furthermore, the data stored by organisations needs to be protected from falling into the hands of unauthorized persons such as hackers or crackers.
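Of the methods listed above, sampling is the easiest to illustrate. The following is a minimal sketch, assuming the records arrive as a stream too large to hold in memory. It uses reservoir sampling, which is one common sampling technique picked here for illustration and not necessarily what any particular organisation uses. The function name `reservoir_sample` and the simulated stream of records are hypothetical.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Pick k items uniformly at random from a stream of unknown length.

    Only k items are kept in memory at any time, so the stream itself
    can be far larger than the available RAM.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Hypothetical stream of one million records, generated lazily.
    records = (f"record-{n}" for n in range(1_000_000))
    print(reservoir_sample(records, k=5, seed=42))
```

The sample can then be analysed with ordinary tools in place of the full data set, at the cost of some accuracy.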
If we are managing the records of an organization that has a limited amount of data, and changes to that data stay within a manageable quantity, then any DBMS software can handle data management and query processing. But for data giants like Facebook, Google or YouTube, the data is so huge and complex that applying traditional RDBMS concepts leads to breakdowns in the data flow structure and a loss of efficiency, so something was needed to handle the exponential growth of data. Big data came as a solution to this problem and as a necessity for analysing clusters of data. Data analysis is a process in which certain tools are used to transform, filter and remodel data in order to reach a conclusion for a given situation. The accuracy of the analysis directly affects the quality of decision making, which in turn leads to higher efficiency and fewer failures.
Big data has proven to be a powerful asset for data giants because it gives access to an enormous volume of data. By analysing this data, businesses can devise more efficient strategies in less time, since the analysis strengthens their decision-making capabilities, and so conquer the market. It can also be used to detect errors and fraud quickly.
Unstructured data: Data without a proper form or structure is termed unstructured data. The core problem is that it is difficult for non-technical users to understand and difficult to process, since the data follows no particular flow. Examples: images, text, videos, file metadata, social media data, etc.
Semi-structured data: Data that is neither raw nor structured into tables and records. It may be arranged in a tree pattern, which makes it easier to analyse. Examples: JSON and XML files.
Structured data: Data stored in the form of records in tables is known as structured data. It is easy to input, process and analyse. Example: student records of a school managed using an RDBMS.
Big data enables the integration of both structured and unstructured data.
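The difference between the three categories shows up in how much work it takes to get an answer out of each. The sketch below is only an illustration using Python's standard library. All the sample data (the student names, marks, e-mail address and the sentence) is made up, and a CSV string stands in for a table in an RDBMS.

```python
import csv
import io
import json
import re

# Structured: fixed columns, every row has the same fields (made-up records).
structured_csv = "roll_no,name,marks\n1,Asha,82\n2,Ravi,74\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))
average_marks = sum(int(row["marks"]) for row in rows) / len(rows)

# Semi-structured: a tree of nested keys, with no fixed table schema.
semi_structured_json = (
    '{"name": "Asha", "contacts": {"email": "asha@example.com"},'
    ' "tags": ["science", "chess"]}'
)
document = json.loads(semi_structured_json)
email = document["contacts"]["email"]

# Unstructured: free text; any structure has to be extracted by the program.
unstructured_text = (
    "Asha said the new phone is great, but Ravi thinks the phone is overpriced."
)
phone_mentions = len(re.findall(r"\bphone\b", unstructured_text.lower()))

print(average_marks)   # average of the marks column
print(email)           # value reached by walking the tree
print(phone_mentions)  # count found by pattern matching
```

The structured query is a one-line aggregate over a known column, the semi-structured lookup walks a known path in a tree, and the unstructured case has to fall back on pattern matching over raw text.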