
What is Big Data?
Big Data refers to data sets so large or complex that traditional data processing and management systems cannot handle them effectively. Examples of such traditional systems include relational databases (RDBMS), on-premises data centers, and traditional batch processing systems.
Big Data is mainly characterized by four V’s:
(A) Volume – The sheer amount of data. By a commonly cited estimate, 2.5 quintillion (2,500,000,000,000,000,000) bytes of data are created every day. Data at this scale is typically too large to be processed by conventional databases and software tools.
(B) Variety – The different forms data takes. Data comes from a variety of sources, such as social media, sensors, and weblogs, and falls into three broad forms:
1. Structured data – traditional database information
2. Unstructured data – text, images, log files, audio, video
3. Semi-structured data – CSV, XML, JSON
(C) Velocity – The speed at which data is created and at which it needs to be processed.
(D) Veracity – The trustworthiness of the data. Because big data is drawn from many different sources, ensuring its reliability and accuracy can be a challenge.
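To make the V’s a little more concrete, here is a minimal Python sketch. It is only an illustration: the sensor name, the temperature values, the `is_trustworthy` helper, and its plausible range are all made up for this example and do not come from any real system. It converts the daily volume figure above into a per-second rate, shows the same reading in structured, semi-structured, and unstructured form, and applies a very simple veracity check.

```python
import json

# Volume and velocity: turn the daily figure quoted above into a per-second rate.
DAILY_BYTES = 2_500_000_000_000_000_000  # 2.5 quintillion bytes per day
SECONDS_PER_DAY = 24 * 60 * 60
terabytes_per_second = DAILY_BYTES / SECONDS_PER_DAY / 10**12
print(f"About {terabytes_per_second:.1f} TB created per second")

# Variety: the same (made-up) sensor reading in three forms.
structured_row = ("sensor-1", 21.5)  # fixed columns, like a database row
semi_structured = '{"sensor": "sensor-1", "temp_c": 21.5}'  # JSON text
unstructured = "Sensor-1 reported about 21.5 degrees this morning."  # free text

# Semi-structured data carries its own field names, so it can be parsed directly.
record = json.loads(semi_structured)


# Veracity: a hypothetical sanity check that rejects missing or implausible values.
def is_trustworthy(reading, low=-50.0, high=60.0):
    temp = reading.get("temp_c")
    return isinstance(temp, (int, float)) and low <= temp <= high


print(is_trustworthy(record))            # plausible reading
print(is_trustworthy({"temp_c": 999}))   # out of range
print(is_trustworthy({"sensor": "x"}))   # value missing
```

At 2.5 quintillion bytes per day, the rate works out to roughly 29 terabytes every second. Real pipelines use far richer validation rules than a single range check, but the idea is the same: data from many sources has to be checked before it can be trusted.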