Big data is a combination of structured, semistructured and unstructured data collected by organizations that can be mined for information and used in machine learning projects, predictive modelling and other advanced analytics applications.
In earlier days, we stored our data on physical media such as floppy disks, CDs and hard disks, which offered limited capacity. Today the situation is different: we store our data on Internet platforms such as cloud drives, which offer far greater capacity. Moreover, smart devices such as smart watches that monitor our health and smart TVs now play a major role. Knowingly or unknowingly, each of us is handing over data and sharing our thoughts with the Internet.
Big data analytics :
Big data analytics refers to the use of advanced analytics techniques to analyse very large, diverse data sets that include structured, semi-structured and unstructured data from different sources, in sizes ranging from terabytes to zettabytes.
Structured data :
Structured data is highly organized, easily searchable and readily processed by machines. The language used to manage structured data is SQL (Structured Query Language). It resides in an RDBMS (Relational Database Management System) and usually exists in textual or numeric form. It can be human or machine generated.
Examples : names, addresses, credit card numbers, transaction information, etc.
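To make the idea of "easily searchable" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers table, its columns and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory relational database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Asha", "Chennai"), ("Ravi", "Mumbai"), ("Meena", "Chennai")],
)

# Every row follows the same schema, so a search is a single SQL query.
rows = conn.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY name", ("Chennai",)
).fetchall()
chennai_names = [name for (name,) in rows]
print(chennai_names)  # ['Asha', 'Meena']
conn.close()
```

Because the schema is fixed in advance, the database knows exactly where the city of each customer lives, which is what makes this kind of lookup simple.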
Unstructured data :
Unlike structured data, unstructured data is very difficult to search. The amount of unstructured data is much larger than that of structured data. According to commonly cited estimates, 80-90 percent of the data in any organization is unstructured. It does not reside in an RDBMS (Relational Database Management System) and is typically kept in NoSQL databases instead. It can be textual or non-textual, and it can be human or machine generated.
Examples : video files, audio files, images, email messages, etc.
Semi-structured data :
Semi-structured data has some of the patterns of structured data but no rigid schema. It does not reside in relational databases, which are tied to a tabular structure of rows and columns, and is typically kept in NoSQL databases. Even so, it contains tags or other markers that separate semantic elements and impose hierarchies of records and fields within the data.
Examples : HTML, EDI (Electronic Data Interchange), email, NoSQL databases, RDF (Resource Description Framework).
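The role of tags and hierarchies can be illustrated with a small sketch. It assumes a hypothetical tag-based document (an XML snippet, similar in spirit to HTML) and parses it with Python's standard xml.etree.ElementTree module. Note that the two records do not have the same fields, which a fixed table would not allow.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag-based document: tags mark the fields, nesting gives hierarchy.
document = """
<orders>
  <order id="1">
    <customer>Asha</customer>
    <item>Laptop</item>
  </order>
  <order id="2">
    <customer>Ravi</customer>
    <item>Phone</item>
    <note>Gift wrap requested</note>
  </order>
</orders>
""".strip()

root = ET.fromstring(document)

orders = []
for order in root.findall("order"):
    record = {"id": order.get("id")}   # attribute on the <order> tag
    for field in order:                # child tags become fields
        record[field.tag] = field.text
    orders.append(record)

for record in orders:
    print(record)
```

The second order carries an extra note field that the first one lacks. The tags still tell a program what each value means, even though there is no fixed schema.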
Let’s look at the parameters of big data analytics. There are five key parameters, known as the five V’s : volume, velocity, variety, veracity and value.
The first of the five V’s of big data is volume, which refers to the amount of data available. As it relates to the initial size and amount of data acquired, volume is like the foundation of big data. Big data is defined as data with a sufficiently enormous volume. What counts as big data, however, is subjective and changes with the processing power available on the market.
Velocity is the second of the five V’s of big data. It refers to the rate at which data is generated and transferred. This is a critical consideration for businesses that need their data to flow rapidly so that they can make the best possible business decisions.
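As a rough illustration of velocity, the rate of incoming data can be estimated from event timestamps. The sketch below assumes a hypothetical list of ISO-formatted timestamps and simply divides the number of events by the time span they cover.

```python
from datetime import datetime

# Hypothetical arrival times of five events.
timestamps = [
    "2024-01-01T10:00:00",
    "2024-01-01T10:00:01",
    "2024-01-01T10:00:01",
    "2024-01-01T10:00:03",
    "2024-01-01T10:00:04",
]

times = [datetime.fromisoformat(t) for t in timestamps]
elapsed_seconds = (max(times) - min(times)).total_seconds()

# Simple estimate: number of events divided by the observed time span.
events_per_second = len(times) / elapsed_seconds
print(events_per_second)  # 1.25
```

A real pipeline would measure this continuously over a sliding window, but the idea is the same: velocity is data volume per unit of time.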
Variety is the third of the five V’s of big data. The term “variety” refers to the wide range of data types available. An organisation may collect data from a variety of sources, each of which has a different value. Data can come from both inside and outside an organisation. The main difficulty that variety creates is standardising and sharing all of the data collected. The information gathered can be unstructured, semi-structured, or structured.
The fourth of the five V’s of big data is veracity. It relates to the data’s quality and accuracy. Data collected could be incomplete, erroneous, or incapable of providing actual, actionable information. Overall, veracity refers to the level of confidence in the data collected. Data can become cluttered and difficult to use at times. If the data is incomplete, a big volume of data can produce more confusion than insights.
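One simple way to put a number on veracity is to measure how complete the data is. The sketch below assumes a hypothetical set of records and treats None and empty strings as missing values. It is only one possible quality check, not a full measure of accuracy.

```python
# Hypothetical records with some missing values.
records = [
    {"name": "Asha", "age": 34, "city": "Chennai"},
    {"name": "Ravi", "age": None, "city": "Mumbai"},
    {"name": "", "age": 29, "city": None},
]
fields = ["name", "age", "city"]


def is_missing(value):
    """Treat None and empty strings as missing."""
    return value is None or value == ""


total = len(records) * len(fields)
missing = sum(is_missing(r.get(f)) for r in records for f in fields)

# Share of cells that actually contain a value.
completeness = 1 - missing / total
print(f"{missing} of {total} values missing, completeness {completeness:.0%}")
```

Here 3 of the 9 values are missing, so only about two thirds of the data is usable. A large data set with a low completeness score is exactly the case where more volume brings more confusion than insight.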
The fifth and final V in the five V’s of big data is value. This refers to the value that big data may give, and it has to do with what businesses can do with the data they collect. The ability to extract value from big data is a must, as the value of big data rises in direct proportion to the insights that can be gleaned from it.