SCD 2 with PySpark

Implementing a slowly changing dimension (SCD Type 2) in PySpark. Earlier we saw the same approach in SQL: https://lnkd.in/dH6j3MWE

# Define the schema for the DataFrame
schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), True),
    StructField("salary", IntegerType(), True),
    StructField("department", StringType(), True),
    StructField("active", BooleanType(), True),
    StructField("start", StringType(), True),
    StructField("end", StringType(), True)
])

Employee_data = [
    (1, "John", 100, "HR", True, '2023-10-20', None),
    (2, "Alice", 200, "Finance", True, '2023-10-20', None),
    (3, "Bob", 300, "Engineering", True, '2023-10-20', None),
    (4,"Jane",... Continue Reading →
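The excerpt stops before the merge logic, so here is a minimal sketch of the SCD Type 2 idea in plain Python rather than PySpark. It reuses the column names from the schema above (`id`, `name`, `salary`, `department`, `active`, `start`, `end`). The function name `apply_scd2`, the `updates` rows, and the load date are assumptions made for illustration and are not taken from the original post.

```python
def apply_scd2(dimension, updates, today):
    """Return a new list of rows with SCD Type 2 history applied.

    dimension: existing rows (dicts) including active/start/end columns.
    updates:   incoming rows (dicts) with id, name, salary, department.
    today:     load date as a string, matching the string-typed start/end.
    """
    tracked = ("name", "salary", "department")
    incoming = {row["id"]: row for row in updates}
    active_ids = set()
    result = []

    for row in dimension:
        if row["active"]:
            active_ids.add(row["id"])
        new = incoming.get(row["id"])
        changed = (
            row["active"]
            and new is not None
            and any(row[col] != new[col] for col in tracked)
        )
        if changed:
            # Close the current version instead of overwriting it ...
            result.append({**row, "active": False, "end": today})
            # ... and open a new active version with the changed values.
            result.append({**new, "active": True, "start": today, "end": None})
        else:
            result.append(dict(row))

    # Ids with no active row in the dimension are inserted as new rows.
    for key, new in incoming.items():
        if key not in active_ids:
            result.append({**new, "active": True, "start": today, "end": None})
    return result


columns = ("id", "name", "salary", "department", "active", "start", "end")
employees = [
    dict(zip(columns, row))
    for row in [
        (1, "John", 100, "HR", True, "2023-10-20", None),
        (2, "Alice", 200, "Finance", True, "2023-10-20", None),
        (3, "Bob", 300, "Engineering", True, "2023-10-20", None),
    ]
]

# Hypothetical incoming batch: one changed salary and one new employee.
updates = [
    {"id": 1, "name": "John", "salary": 150, "department": "HR"},
    {"id": 5, "name": "Eve", "salary": 250, "department": "Finance"},
]

result = apply_scd2(employees, updates, today="2023-10-21")
for row in result:
    print(row)
```

The key design choice is that a changed row is never updated in place: the old version is closed with an `end` date and `active` set to False, and a new version is appended, which is what preserves the history. In PySpark the same steps are usually expressed as a join between the dimension and the incoming data followed by a union.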

Mastering SCD Type 2: Handling Historical Changes in SQL

Slowly Changing Dimensions (SCD) are a crucial part of data warehousing and analytics. Among the different types of SCD, Type 2 is particularly interesting as it allows us to track historical changes in dimensions such as customer data, product information, and more. In a recent project, I... Continue Reading →

Data Engineering Questions – 1

If you have more than 5 years of #dataengineering experience, expect these questions in your interviews:
1. Explain the architecture of Spark.
2. How does job execution work internally?
3. What happens when you fire a Spark job?
4. How did you tune your jobs?
5. Explain the optimizations you have used in your project.
6. How did you connect... Continue Reading →

ChatGPT for Interviews

ChatGPT can help you land your dream job twice as fast. Here are 10 powerful ChatGPT prompts that will 10X your interview chances.
1. Customizing Your Resume
ChatGPT prompt: "Can you make changes to my resume to fit the [Job Title] role at [Company]? Here's the job description: [Paste Job Description], and resume: [Paste Resume]."
2. Creating a Professional Summary
ChatGPT prompt:... Continue Reading →

Database Indexes

Spend 2 minutes on this post, and you'll gain a good understanding of Database Indexing, which might take much longer to learn otherwise!
Imagine managing a large-scale database:
Database Size: 500 GB
Average Query Search Time Without Index: 5 seconds
Number of Records: 50 million
Let's dive into the world of Database Indexing:
1. What is Database Indexing?
A database index is... Continue Reading →
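To make the effect of an index concrete, here is a small sketch using Python's built-in `sqlite3` module. The table `records`, its columns, and the index name `idx_records_email` are hypothetical, and the data set is far smaller than the 50 million records in the scenario above. It only shows how the query planner switches from a full table scan to an index search once an index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, email TEXT, payload TEXT)"
)
# Populate a hypothetical table with 10,000 rows.
conn.executemany(
    "INSERT INTO records (email, payload) VALUES (?, ?)",
    ((f"user{i}@example.com", "x" * 20) for i in range(10_000)),
)


def query_plan(connection, sql, params=()):
    """Return SQLite's query plan for a statement as one readable string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


lookup = "SELECT id FROM records WHERE email = ?"
args = ("user4242@example.com",)

# Without an index, the planner has to scan every row of the table.
plan_before = query_plan(conn, lookup, args)

# With an index on the filtered column, it can search the index instead.
conn.execute("CREATE INDEX idx_records_email ON records (email)")
plan_after = query_plan(conn, lookup, args)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first plan should report a scan of `records` and the second a search using `idx_records_email`. Other database engines expose the same information through their own `EXPLAIN` variants.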

AWS DE Questions

This post covers AWS data engineering interviews and highlights the most common concepts you can expect to be asked about during the interview process.
1. Start by providing a concise introduction to your professional projects, emphasizing your role as a data engineer.
2. Share your knowledge of cloud platforms (AWS, GCP, Azure) as it pertains to data engineering.
3. Discuss... Continue Reading →

Top 50 SQL Queries for Interviews

-- Q-1. Write an SQL query to fetch “FIRST_NAME” from Worker table using the alias name as <WORKER_NAME>.
select first_name AS WORKER_NAME from worker;
-- Q-2. Write an SQL query to fetch “FIRST_NAME” from Worker table in upper case.
select UPPER(first_name) from worker;
-- Q-3. Write an SQL query to fetch unique values of DEPARTMENT... Continue Reading →

Cloud Data Engineering Road Map

๐Ÿšถ๐Ÿป Cloud Data Engineering Road Map ๐Ÿƒ๐Ÿปโœ… Basic Version Control toolhttps://lnkd.in/gEqyhzZRhttps://lnkd.in/g_t2xKnGhttps://lnkd.in/gZT7QNjS โœ… Data Warehousing Conceptshttps://lnkd.in/gq99PDcp โœ… Core Pythonhttps://lnkd.in/gQpmSnM โœ… Spark SQLhttps://lnkd.in/gDcR5bwM โœ… Databrickshttps://lnkd.in/gSpKBWbJhttps://lnkd.in/gpbMg9nU โœ… Sparkhttps://lnkd.in/gtqRtTPvhttps://lnkd.in/gs2gkqRq โœ… Pysparkhttps://lnkd.in/gmkPpmAXhttps://lnkd.in/gh-_KzjE โœ… Delta Lakehttps://lnkd.in/gt6ggER6 โœ… Cloud ETL Tool + Storagehttps://lnkd.in/gTs8y4Ai โœ… Cloud MPP Warehousehttps://lnkd.in/gMTHCrNZ โœ… Databricks Unity Cataloghttps://lnkd.in/gH6Q2a5K๐Ÿ“• Learn , Lead and Make Leaders ๐Ÿš€. Happy Learning ๐Ÿ“–Follow ๐Ÿ‘‰... Continue Reading →

System Design Challenges

Get a good grasp on these 45 key problems, and you'll be ready for a whopping 95% of your System Design Interview challenges.
Easy
1. Design URL Shortener like TinyURL
2. Design Text Storage Service like Pastebin
3. Design Content Delivery Network (CDN)
4. Design Parking Garage
5. Design Vending Machine
6. Design Distributed Key-Value Store... Continue Reading →

GitHub Repos for Developers

GitHub repos for developers that will reveal thousands of free resources.
1 The Algorithms: https://lnkd.in/dpzAd_vE
2 freeCodeCamp: https://lnkd.in/diBh4dVy
3 Freely available programming books: https://lnkd.in/d2bwBmU9
4 100 Days of ML Coding: https://lnkd.in/dz8dDr9U
5 Project-based tutorials: https://lnkd.in/dSiiKHXK
6 Public APIs: https://lnkd.in/dvGamaUM
7 Coding Interview University: https://lnkd.in/dhY5pCxH
8 Developer Roadmap: https://lnkd.in/dJ4wAG2B
9 Computer Science: https://lnkd.in/d2uFXzPz
10 30 Seconds of Code: https://lnkd.in/dwDNk_VX
11... Continue Reading →
