The icebergth is hereth. Apache Iceberg is an open-source table format for large-scale data systems, designed to provide efficient and…
Data files or tables are parsed into smaller units. This is also called ‘partitioning’. A partition is usually performed against…
Data products are the end result of file or data movements to the cloud; ETL; processing; de-duplication; curation and storage…
In simple terms we can identify the differences between Data Lakes and Data Warehouses. Data Lake: A data lake is…
Digital Transformation Digital transformation not a magic solution nor a buffet of word salads. DT is roughly defined as the…
A typical Technology Stack for a Data Lake. S3 as the Golden Source. Snowflake as a corporate Data Share with…
(ETL engine in the above could be AWS Glue) There are various ways to define performance and what that means. …
Iceberg Cometh Open table formats, such as Apache Iceberg, enable scale-out data warehousing directly on a data lake. This architecture…
A data lake is a centralized repository that allows a firm to store structured and unstructured data at any scale.…
Traditional Data Product Management Federated data management and data product builds and sharing has little to do with traditional data…