In essence, Data Operations is based on DevOps (or DevSecOps) and applies the same ideas to the life cycle of…
DataLake: The entire concept of a Data Operations Platform rests on a Data Lake. There is no simple…
Data Operations (‘DataOps’) is inspired by the Agile-based ‘Development Operations’ (‘DevOps’) model. The DevOps model, which usually includes security (DevSecOps),…
The icebergth is hereth. Apache Iceberg is an open-source table format for large-scale data systems, designed to provide efficient and…
Data files or tables are split into smaller units. This is called ‘partitioning’. Partitioning is usually performed against…
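As a rough illustration of partitioning, the sketch below splits a handful of rows into smaller units by the value of one column. The column name `event_date`, the sample rows, and the helpers `partition_by` and `partition_path` are hypothetical, chosen only for this example. The `key=value` directory layout is a widely used Hive-style convention and is an assumption here, not something the excerpt specifies.

```python
from collections import defaultdict


def partition_by(rows, key):
    """Group rows into partitions, one per distinct value of `key`."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[key]].append(row)
    return dict(partitions)


def partition_path(key, value, base="sales"):
    """Build a Hive-style directory path for one partition (assumed layout)."""
    return f"{base}/{key}={value}/"


# Hypothetical sample data: three sales records across two dates.
rows = [
    {"event_date": "2024-01-01", "amount": 10},
    {"event_date": "2024-01-02", "amount": 25},
    {"event_date": "2024-01-01", "amount": 7},
]

partitions = partition_by(rows, "event_date")

for value, members in sorted(partitions.items()):
    print(partition_path("event_date", value), len(members), "rows")
```

A query that filters on `event_date` then only needs to read the matching directory rather than the whole table, which is the main reason for partitioning on a frequently filtered column.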
Parquet is a file format standard used in many enterprises. It allows the standardisation of files and provides a common…
Databricks and Snowflake overlap in many areas. Firms deploying both need to clearly demarcate the epics and use case journeys…
A straightforward method to automate data ingestion from S3 buckets (data lake) to a Redshift (data warehouse) cluster, by using…
Data Ingestion Challenges: Data ingestion can be complicated. There are usually a variety of data sources, including both SQL and…
AWS Glue is a metadata catalogue service with Extract-Transform-Load logic. The Glue catalogue is based on Hive and is…