Load data from CSV file into Trino Table

To create a table in Trino and load data from a CSV file stored in Azure Data Lake Storage (ADLS), you’ll use Trino’s Hive connector to register the CSV file as a table. The Hive connector, backed by a Hive metastore, allows Trino to query files in ADLS. Below is a step-by-step guide to achieve... Continue Reading →
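As a rough illustration of the approach the excerpt describes, the sketch below builds the kind of `CREATE TABLE` statement that registers a CSV location through the Hive connector. The catalog name `hive`, the schema `demo`, the table `people`, its columns, and the `abfss://` path are all hypothetical placeholders, not values from the post. It assumes the connector's `format`, `external_location`, and `skip_header_line_count` table properties, and that `external_location` points at the directory holding the CSV file rather than at the file itself.

```python
def build_csv_table_ddl(catalog, schema, table, columns, location, skip_header=True):
    """Return a Trino CREATE TABLE statement for an external CSV table."""
    # The Hive connector's CSV format only supports VARCHAR columns,
    # so every column is declared as varchar and cast later in queries.
    column_defs = ",\n    ".join(f"{name} varchar" for name in columns)

    properties = ["format = 'CSV'", f"external_location = '{location}'"]
    if skip_header:
        # Skip the header row so it is not returned as data.
        properties.append("skip_header_line_count = 1")
    props = ",\n    ".join(properties)

    return (
        f"CREATE TABLE {catalog}.{schema}.{table} (\n    {column_defs}\n)\n"
        f"WITH (\n    {props}\n)"
    )


# Hypothetical names and storage path, for illustration only.
ddl = build_csv_table_ddl(
    catalog="hive",
    schema="demo",
    table="people",
    columns=["name", "age"],
    location="abfss://mycontainer@myaccount.dfs.core.windows.net/people/",
)
print(ddl)
```

The generated statement can be pasted into the Trino CLI or sent through any Trino client. Numeric columns such as `age` would then be cast at query time, for example `CAST(age AS integer)`.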

Exam DP-203: Data Engineering on Microsoft Azure Certification Study Blueprint

Theoretical Knowledge: Azure documentation; Data Lake Storage Gen 2 docs; Storage account docs; Azure Synapse docs; Azure Data Factory docs; Azure SQL Database docs; Cosmos DB docs; Azure Databricks docs; Slowly changing dimensions; Azure Synapse: Copy and Transform Data; Azure Databricks: ETL with Scala; Microsoft Learn SCD tutorial; Raspberry Pi IoT Online Simulator; Transact-SQL Language... Continue Reading →

Low Level System design articles

These articles will save you 50+ hours of hopping between resources.
1) Scalability: https://lnkd.in/gq4hW9qx
2) Horizontal vs Vertical Scaling: https://lnkd.in/g8qcwRCy
3) Latency vs Throughput: https://lnkd.in/gDAx6QQd
4) Load Balancing: https://lnkd.in/gefSiXEJ
5) Caching: https://lnkd.in/gAp-9udf
6) ACID Transactions: https://lnkd.in/g-sjsMwX
7) SQL vs NoSQL: https://lnkd.in/gwCe58TU
8) Database Indexes: https://lnkd.in/gE_q5m_g
9) Database Sharding: https://lnkd.in/gFdNxDrU
10) Content Delivery... Continue Reading →

Insert, Update and Delete in PySpark

Here's the scenario: we had two data tables, Table_A and Table_B, each containing a "Name" and an "Age" column. 📋💡

Table_A:
Name | Age
S1 | 20
S2 | 23

Table_B:
Name | Age
S1 | 22
S4 | 27

Our mission was to determine the differences between these tables and assign each one an action: Update, Delete, or Insert 🚀 and here's the solution we came up... Continue Reading →
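The excerpt is cut off before the solution, so what follows is only a minimal sketch of the comparison logic, not the post's actual code. It uses plain Python dictionaries instead of PySpark so that it runs on its own. In PySpark the same idea is commonly written as a full outer join on "Name". The function name `diff_actions` is hypothetical, and the sketch assumes Table_A is the current state and Table_B the incoming state, a direction the excerpt does not specify.

```python
def diff_actions(current, incoming):
    """Map each differing name to the action needed to turn `current` into `incoming`."""
    actions = {}
    for name in sorted(set(current) | set(incoming)):
        if name not in incoming:
            # Present only in the current table: the row should be removed.
            actions[name] = "Delete"
        elif name not in current:
            # Present only in the incoming table: the row is new.
            actions[name] = "Insert"
        elif current[name] != incoming[name]:
            # Present in both tables, but the age changed.
            actions[name] = "Update"
        # Rows that are identical in both tables need no action.
    return actions


# Sample data from the scenario above (Name -> Age).
table_a = {"S1": 20, "S2": 23}
table_b = {"S1": 22, "S4": 27}

actions = diff_actions(table_a, table_b)
print(actions)  # {'S1': 'Update', 'S2': 'Delete', 'S4': 'Insert'}
```

Swapping the two arguments reverses the direction of the comparison, which turns S2 into an Insert and S4 into a Delete.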
