#Databricks #SQL for Data Engineering, Data Science and Machine Learning.
✅ The whole SQL lesson for Databricks is provided here.
1️⃣ Spark SQL sessions as a series. https://lnkd.in/g77DE36a
2️⃣ How to register for Databricks Community Edition. https://lnkd.in/ggAqRgKJ
3️⃣ What is a data warehouse? OLTP and OLAP? https://lnkd.in/gzSuJCBC
4️⃣ How to create a database in Databricks? https://lnkd.in/gzHNFZrv
5️⃣ Databricks File System (DBFS). https://lnkd.in/dHAHkqd3
6️⃣ Spark SQL table, difference between managed table and... Continue Reading →
What are surrogate keys, and how can we handle them during a data warehouse migration?
What is a surrogate key? A surrogate key is simply a unique identifier assigned to each row in a dimension table. Simple, isn't it? Yes. But this may raise a few questions: a primary key is also unique and assigned to each row, so how does a surrogate key differ from the primary key of a table, what... Continue Reading →
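As a minimal sketch of the idea above, the snippet below builds a small dimension table in an in-memory SQLite database. The table name `dim_customer`, its columns, and the sample rows are hypothetical and chosen only for illustration. The surrogate key is generated by the database and carries no business meaning, while the business key from the source system may repeat across rows.

```python
import sqlite3

# In-memory database so the sketch has no side effects.
conn = sqlite3.connect(":memory:")

# Hypothetical dimension table: customer_sk is the surrogate key,
# customer_code is the business (natural) key from the source system.
conn.execute(
    """
    CREATE TABLE dim_customer (
        customer_sk   INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_code TEXT NOT NULL,
        city          TEXT NOT NULL
    )
    """
)

# Illustrative rows: the business key C-100 appears twice (for example,
# two versions of the same customer), yet each row gets its own surrogate key.
source_rows = [("C-100", "Pune"), ("C-200", "Delhi"), ("C-100", "Mumbai")]
conn.executemany(
    "INSERT INTO dim_customer (customer_code, city) VALUES (?, ?)",
    source_rows,
)

dim_rows = conn.execute(
    "SELECT customer_sk, customer_code, city FROM dim_customer ORDER BY customer_sk"
).fetchall()

for row in dim_rows:
    print(row)
```

In this sketch the surrogate key is unique per row even where the business key is not, which is one common reason warehouses do not rely on source-system keys alone.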
Big Data Learning Plan
Step-by-step plan to learn Big Data (all free resources included)
1. Learn SQL basics - https://lnkd.in/g9NEJMVE
SQL is used in many places: Hive, Spark SQL, and RDBMS queries. Joins and windowing functions are very important.
2. Learn programming/Python for data engineering - https://lnkd.in/gr6fFPdU
Learn Python to the extent required for data engineers.
3. Learn the Fundamentals... Continue Reading →
COMPLEX SQL QUERIES
Questions on SQL are based on the following two tables: the Employee table and the Employee Incentive table.

Table Name: Employee

EMPLOYEE_ID  FIRST_NAME  LAST_NAME  SALARY   JOINING_DATE           DEPARTMENT
1            John        Abraham    1000000  01-JAN-13 12.00.00 AM  Banking
2            Michael     Clarke     800000   01-JAN-13 12.00.00 AM  Insurance
3            Roy         Thomas     700000   01-FEB-13 12.00.00 AM  Banking
4            Tom         Jose       600000   01-FEB-13 12.00.00... Continue Reading →
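To experiment with queries against this table, the rows can be loaded into an in-memory SQLite database. This is only a sketch: it uses the three rows shown in full above, stores the joining date as ISO text on the assumption that 01-JAN-13 means 2013-01-01, and runs one illustrative aggregate that is not necessarily one of the post's questions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column names follow the Employee table above; the types are assumptions.
conn.execute(
    """
    CREATE TABLE employee (
        employee_id  INTEGER PRIMARY KEY,
        first_name   TEXT,
        last_name    TEXT,
        salary       INTEGER,
        joining_date TEXT,
        department   TEXT
    )
    """
)

# Only the three rows that appear in full in the excerpt are loaded.
employees = [
    (1, "John", "Abraham", 1000000, "2013-01-01", "Banking"),
    (2, "Michael", "Clarke", 800000, "2013-01-01", "Insurance"),
    (3, "Roy", "Thomas", 700000, "2013-02-01", "Banking"),
]
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?)", employees)

# Illustrative query: total salary per department, highest total first.
salary_by_department = conn.execute(
    """
    SELECT department, SUM(salary) AS total_salary
    FROM employee
    GROUP BY department
    ORDER BY total_salary DESC
    """
).fetchall()

print(salary_by_department)
```

With these three rows, Banking totals 1700000 and Insurance totals 800000.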
Important Services for Data Engineers provided by AWS, Microsoft Azure & GCP
AWS Lambda: AWS Lambda is a serverless compute service that runs code without provisioning or managing servers; you pay only for actual usage.
Amazon Redshift: Amazon Redshift is a fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to analyze vast amounts of data using SQL and existing BI tools.
AWS Glue: AWS Glue is... Continue Reading →
Database Indexes
Spend 2 minutes on this post, and you'll gain a good understanding of database indexing, which might take much longer to learn otherwise!
Imagine managing a large-scale database:
Database Size: 500 GB
Average Query Search Time Without Index: 5 seconds
Number of Records: 50 million
Let's dive into the world of Database Indexing:
1️⃣ What is Database Indexing?
A database index is... Continue Reading →
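The effect of an index on a lookup can be seen on a small scale with SQLite's `EXPLAIN QUERY PLAN`. The sketch below is an assumption-laden toy: the `orders` table, its columns, the index name, and the 10,000 generated rows are all hypothetical and far smaller than the 500 GB scenario described above. It compares the plan for the same query before and after creating an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table with a column we will filter on.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    ((i % 1000, i * 0.5) for i in range(10000)),
)

query = "SELECT * FROM orders WHERE customer_id = 42"


def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[3] for row in rows)


# Without an index on customer_id, SQLite has to scan the whole table.
plan_before = plan(query)

# With the index, the same query can search the index instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording varies between SQLite versions, but the first plan should report a scan of `orders` and the second a search using `idx_orders_customer`.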
Azure Learning Resources
Top 9 Azure #dataengineering tools to learn for FREE. Start your Azure journey here.
1. Azure Data Factory. https://lnkd.in/gEmpbyrM
Project: https://lnkd.in/gFG2aCgy
2. Azure Databricks. https://lnkd.in/gvFwKxaN
Project: https://lnkd.in/gFG2aCgy
3. Azure Stream Analytics. https://lnkd.in/g35VbSTv
4. Azure Synapse Analytics. https://lnkd.in/gCufskNC
5. Azure Data Lake Storage. https://lnkd.in/gcEKjWsc
6. Azure SQL Database. https://lnkd.in/gmHxqxQX
7. Azure PostgreSQL database. https://lnkd.in/grHWJvWZ
8. Azure MariaDB. https://lnkd.in/gYSp7MZi
9. Azure Cosmos DB. https://lnkd.in/g6jPZA36
This is an excellent guide to becoming an Azure data engineer. No need to... Continue Reading →