Data Lakes 101: The What, Why and How of Data Lakes
A data lake is a centralized repository that allows you to store all your structured and unstructured data at any scale. You can store your data as-is, without having to first structure the data, and run different types of analytics for making informed decisions.
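The "store as-is, structure later" idea can be illustrated with a small sketch. It is only an assumption-laden toy: a temporary local directory stands in for the lake, and the file names, record fields, and helper functions (land_raw, read_clicks, page_views) are hypothetical, not part of any data lake product. Raw payloads are written without any schema check, and structure is applied only when a specific analysis reads them.

```python
import json
import tempfile
from pathlib import Path


def land_raw(lake_dir, name, payload):
    """Store a payload exactly as received; no schema is enforced on write."""
    path = Path(lake_dir) / name
    path.write_text(payload, encoding="utf-8")
    return path


def read_clicks(lake_dir):
    """Apply a schema at read time: keep only the fields this analysis needs."""
    rows = []
    for path in sorted(Path(lake_dir).glob("*.jsonl")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            record = json.loads(line)
            # Records missing the required fields are skipped, not rejected on write.
            if "user" in record and "page" in record:
                rows.append({"user": str(record["user"]), "page": str(record["page"])})
    return rows


def page_views(rows):
    """One possible analysis over the structured view: views per page."""
    counts = {}
    for row in rows:
        counts[row["page"]] = counts.get(row["page"], 0) + 1
    return counts


def demo():
    # A temporary directory plays the role of the lake (hypothetical setup).
    with tempfile.TemporaryDirectory() as tmp:
        land_raw(
            tmp,
            "clicks.jsonl",
            '{"user": 1, "page": "/home"}\n'
            '{"user": 2, "page": "/home", "ref": "ad"}\n'
            '{"user": 1, "page": "/pricing"}\n',
        )
        # An unstructured log line lands next to the JSON data untouched.
        land_raw(tmp, "device.log", "2024-01-01T00:00:00Z sensor-7 temp=21.5\n")
        return page_views(read_clicks(tmp))


if __name__ == "__main__":
    print(demo())
```

The device log is never parsed here; it simply stays in the lake until some later analysis defines how to read it, which is the main contrast with a store that requires structure up front.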
An Aberdeen survey reported that organizations that implemented a data lake outperformed similar companies by 9% in organic revenue growth.
Data lakes enable new types of analytics, such as machine learning, over new sources such as log files, click-stream data, social media, and internet-connected devices. This helps organizations identify and act on opportunities for business growth faster by attracting and retaining customers, boosting productivity, and proactively maintaining devices.
While a data lake and a data warehouse are both data storage repositories, the commonalities end there. Depending on its requirements, an organization may need both a data warehouse and a data lake, as they serve different needs and use cases.
Examples where Data Lakes have added value include:
Data lakes can harness more data, from more sources, in less time. Empowering users to collaborate and analyze data in different ways leads to better, faster decision making.
Customers such as Netflix, Zillow, NASDAQ, Yelp, iRobot, and FINRA use AWS to run their business-critical analytics workloads.
The AWS Lake Formation service allows developers to create a secure data lake within a few days, a task that was previously highly time consuming. Lake Formation is meant to handle the complications of that setup with just a few clicks.