Databricks and Snowflake overlap in many areas. Firms deploying both need to clearly demarcate the epics and use case journeys…
A straightforward method to automate data ingestion from S3 buckets (data lake) to a Redshift (data warehouse) cluster, by using…
Data Ingestion Challenges: Data ingestion can be complicated. There are usually a variety of data sources, including both SQL and…
AWS Glue is a metadata catalogue service with Extract-Transform-Load logic. The Glue catalogue is based on Hive and is…
Data flowing into the Data Lake changes over time. Changes to data tables are captured by change data capture (CDC). Changes…
Amazon Redshift is a petabyte-scale columnar data warehouse that is very efficient at storing raw data and collecting data…
Data products are the end result of file or data movements to the cloud, ETL, processing, de-duplication, curation and storage…
In simple terms, we can identify the differences between Data Lakes and Data Warehouses. Data Lake: A data lake is…
Digital Transformation: Digital transformation is neither a magic solution nor a buffet of word salads. DT is roughly defined as the…
A typical Technology Stack for a Data Lake. S3 as the Golden Source. Snowflake as a corporate Data Share with…