Before jumping into big data, let's first understand what data is. Yes, it's all about the data before we get to how big it is. Data are units of information: a number, text, an image, audio, video and so on. For example, the data of an employee includes their name, age, designation, years of experience, etc.
In modern technology, every piece of data is an asset and a valuable piece of information, which means data can bring sales or profit to an individual or a business. As data has become more and more valuable, the race to collect it has begun.
So what is big data?
This is where people get confused: simply having a large amount of data does not make it big data. There are some rules that need to be satisfied for data to be considered big data.
What exactly is big data?
The definition of big data is data that contains greater variety, arriving in increasing volumes and with more velocity. This is also known as the 3 Vs.
|Volume||The amount of data matters. With big data, you’ll have to process high volumes of low-density, unstructured data. This can be data of unknown value, such as Twitter data feeds, clickstreams on a web page or a mobile app, or sensor-enabled equipment. For some organizations, this might be tens of terabytes of data. For others, it may be hundreds of petabytes.|
|Velocity||Velocity is the fast rate at which data is received and (perhaps) acted on. Normally, the highest velocity of data streams directly into memory versus being written to disk. Some internet-enabled smart products operate in real time or near real time and will require real-time evaluation and action.|
|Variety||Variety refers to the many types of data that are available. Traditional data types were structured and fit neatly in a relational database. With the rise of big data, data comes in new unstructured data types. Unstructured and semistructured data types, such as text, audio, and video, require additional preprocessing to derive meaning and support metadata.|
The 3 Vs in social media apps
Most social media platforms, like Twitter, Facebook and others, are handling big data. So we now know the rule: data is considered big data if it satisfies the 3 Vs.
Let's consider Facebook!
- When you log into your account, there will be a lot of posts in your feed from your friends list. The volume of data is so huge that you could keep scrolling the page the whole day, which satisfies Volume.
- On a special occasion, you post your memories for your friends and followers. Almost immediately you start getting likes and comments, which arrive within a fraction of a second. That satisfies Velocity.
- Variety is obvious: on Facebook we have an enormous variety of data to play with, such as text posts, images, videos, shared feeds and much more, which satisfies the third V.
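To make the three Vs a bit more concrete, here is a minimal Python sketch that measures them over a tiny, made-up feed. The event list, its fields, and the `summarize_three_vs` function are all hypothetical and only for illustration; real platforms deal with vastly larger numbers and very different tooling.

```python
from collections import Counter

# Hypothetical feed events: (timestamp in seconds, content type, size in bytes).
events = [
    (0.0, "text", 200),
    (0.5, "image", 150_000),
    (1.0, "video", 4_000_000),
    (1.5, "text", 120),
    (2.0, "image", 90_000),
]


def summarize_three_vs(events):
    """Return rough volume, velocity and variety figures for a list of events."""
    if not events:
        return {"volume_bytes": 0, "velocity_events_per_second": 0.0, "variety": {}}

    # Volume: total amount of data received.
    volume_bytes = sum(size for _, _, size in events)

    # Velocity: how many events arrive per second over the observed window.
    duration = events[-1][0] - events[0][0]
    velocity = len(events) / duration if duration > 0 else float(len(events))

    # Variety: how many events of each content type were seen.
    variety = dict(Counter(kind for _, kind, _ in events))

    return {
        "volume_bytes": volume_bytes,
        "velocity_events_per_second": velocity,
        "variety": variety,
    }


summary = summarize_three_vs(events)
print(summary)
```

Even in this toy feed, one video outweighs everything else in volume, which hints at why mixed content types make storage and processing harder.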
Uses of big data
The main use of big data is data analytics and processing. Big data helps businesses with marketing and with user-centric performance. Google ads and Amazon product suggestions are derived from big data processing and strategies. If you search for something on Google, you will often see relevant suggestions and advertisements on Facebook, Amazon and other sites too. Big data also supports data science by providing the datasets used for machine learning.
Learning and exposure
There are a variety of big data tools available to learn, such as Hadoop, Spark, Druid, Hive, etc. Learning these gives good exposure in the IT sector for roles such as data analyst or data scientist, which mostly involve ETL (Extract, Transform, Load) activities as well as analytics for decision making.
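If ETL is new to you, the idea can be sketched in a few lines of plain Python using only the standard library. The sample CSV, the table name `employees`, and the three functions are assumptions made up for this example. Tools like Hadoop or Spark do the same kind of work at a much larger scale, but this is not how they are implemented.

```python
import csv
import io
import sqlite3

# Hypothetical raw data, standing in for a CSV file exported from a source system.
RAW_CSV = """name,age,designation,years_of_experience
Asha,29,Data Analyst,4
Ravi,,Data Scientist,7
Meena,35,Data Engineer,10
"""


def extract(raw_text):
    """Extract: read rows from CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: skip rows with a missing age and convert numbers to int."""
    cleaned = []
    for row in rows:
        if not row["age"]:
            continue  # incomplete record, dropped in this sketch
        cleaned.append(
            (
                row["name"],
                int(row["age"]),
                row["designation"],
                int(row["years_of_experience"]),
            )
        )
    return cleaned


def load(records, connection):
    """Load: write the cleaned records into a SQLite table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS employees "
        "(name TEXT, age INTEGER, designation TEXT, years_of_experience INTEGER)"
    )
    connection.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", records)
    connection.commit()


connection = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), connection)

# A simple analytics query on the loaded data.
average_experience = connection.execute(
    "SELECT AVG(years_of_experience) FROM employees"
).fetchone()[0]
print(average_experience)
```

The last query is the "analytics for decision making" part: once the data is cleaned and loaded, questions about it become simple queries.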