GitHub Repos for Developers

GitHub repos for developers that will reveal thousands of free resources:
1. The Algorithms: https://lnkd.in/dpzAd_vE
2. freeCodeCamp: https://lnkd.in/diBh4dVy
3. Freely available programming books: https://lnkd.in/d2bwBmU9
4. 100 Days of ML Coding: https://lnkd.in/dz8dDr9U
5. Project-based tutorials: https://lnkd.in/dSiiKHXK
6. Public APIs: https://lnkd.in/dvGamaUM
7. Coding Interview University: https://lnkd.in/dhY5pCxH
8. Developer Roadmap: https://lnkd.in/dJ4wAG2B
9. Computer Science: https://lnkd.in/d2uFXzPz
10. 30 Seconds of Code: https://lnkd.in/dwDNk_VX
11... Continue Reading →

Learn Apache Spark Step by Step

Learn Apache Spark Step by Step (Follow the Sequence)
1. Getting started with Apache Spark: https://lnkd.in/gFRpe3-D
2. A quick introduction to the Spark API: https://lnkd.in/g8Y3tdhX
3. Overview of Spark - RDDs, accumulators, broadcast variables: https://lnkd.in/g7fepuFF
4. Spark SQL, Datasets, and DataFrames: https://lnkd.in/g3iZp7zk
5. PySpark - Processing data with Spark in Python: https://lnkd.in/gBnh6PAi
6. Processing data with SQL on the command line: https://lnkd.in/ggnxDaUu
7. Cluster Overview: https://lnkd.in/guCQnJnv
8. Packaging and deploying... Continue Reading →

Data Engineering Blogs

75 engineering blogs worth reading to improve your system design:
High Scalability https://lnkd.in/eQ4eDw4E
Engineering at Meta https://lnkd.in/e8tiSkEv
AWS Architecture Blog https://lnkd.in/eEchKJif
All Things Distributed https://lnkd.in/emXaQDaS
The Netflix Tech Blog https://lnkd.in/efPuR39b
LinkedIn Engineering Blog https://lnkd.in/ehaePQth
Uber Engineering Blog https://eng.uber.com/
Engineering at Quora https://lnkd.in/em-WkhJd
Pinterest Engineering https://lnkd.in/esBTntjq
Lyft Engineering Blog https://eng.lyft.com/
Twitter Engineering Blog https://lnkd.in/evMFNhEs
Dropbox Engineering Blog https://dropbox.tech/
... Continue Reading →

Insert, Update and Delete in PySpark

Here's the scenario: we had two data tables, Table_A and Table_B, each containing a "Name" and an "Age" column.

Table_A:
Name | Age
------------
S1 | 20
S2 | 23

Table_B:
Name | Age
------------
S1 | 22
S4 | 27

Our mission was to determine the differences between these tables and assign each row an action: Update, Delete, or Insert. Here's the solution we came up... Continue Reading →
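
The full solution sits behind the link, so the following is only a rough sketch of the comparison itself, written in plain Python rather than PySpark so it runs anywhere. It assumes Table_A is the existing state and Table_B is the incoming state (the excerpt does not say which is which), and the function name `classify_changes` is made up for illustration. Taking the union of names plays the same role a full outer join on Name would play in a DataFrame version.

```python
from typing import Dict

def classify_changes(old: Dict[str, int], new: Dict[str, int]) -> Dict[str, str]:
    """Label each name with the action needed to turn `old` into `new`.

    Assumption: `old` is the current table and `new` is the incoming one.
    Rows that are identical in both tables need no action and are omitted.
    """
    actions: Dict[str, str] = {}
    # The union of keys covers rows from both sides, like a full outer join.
    for name in sorted(old.keys() | new.keys()):
        if name not in old:
            actions[name] = "Insert"   # only in the incoming table
        elif name not in new:
            actions[name] = "Delete"   # only in the existing table
        elif old[name] != new[name]:
            actions[name] = "Update"   # in both, but the age changed
    return actions

# The two tables from the scenario, keyed by Name.
table_a = {"S1": 20, "S2": 23}
table_b = {"S1": 22, "S4": 27}

changes = classify_changes(table_a, table_b)
print(changes)  # {'S1': 'Update', 'S2': 'Delete', 'S4': 'Insert'}
```

If the roles of the two tables are reversed, the Insert and Delete labels swap while Update stays the same.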

How to Build an Event-Driven Serverless ETL Pipeline on AWS

๐—˜๐—ง๐—Ÿ => ๐—˜๐˜…๐˜๐—ฟ๐—ฎ๐—ฐ๐˜ | ๐—ง๐—ฟ๐—ฎ๐—ป๐˜€๐—ณ๐—ผ๐—ฟ๐—บ | ๐—Ÿ๐—ผ๐—ฎ๐—ฑEvent-Driven Serverless ETL Pipelines is a data processing architecture that is used to process large amounts of data in real-time.Here data is processed as soon as it is generated, rather than being stored and processed later.This allows for faster processing times and more efficient use of resources.Here are the... Continue Reading →

FREE DATA ENGINEERING COURSES ON CLOUD

Data engineering is the backbone of the modern data-driven world. It's the meticulous process of designing and building systems for collecting, storing, and analyzing data at scale. However, finding comprehensive projects and courses that are also free can be a challenge. To bridge this gap, I've created a list of five end-to-end data engineering courses... Continue Reading →

PySpark UDF

#PySpark_UDF_with_the_help_of_an_example
One of the most important features of Spark SQL and DataFrames is the PySpark UDF (user-defined function), which is used to extend PySpark's built-in capabilities. UDFs in PySpark work similarly to UDFs in conventional databases. We write a Python function and wrap it in PySpark SQL udf() or register it as udf and... Continue Reading →

Delete Duplicates in PySpark DataFrame

#Scenario
There are two ways to handle row duplication in PySpark DataFrames. The distinct() function removes rows that are duplicated across all columns, while dropDuplicates() can drop rows based on one or more selected columns. Here's an example showing how to use the distinct() and dropDuplicates() methods. First, we need... Continue Reading →
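
The post's own PySpark example sits behind the link. As a stand-in, the sketch below imitates the difference between the two methods in plain Python, with rows as dictionaries. The helper names and the sample rows are invented for illustration. One caveat is that this sketch always keeps the first matching row, whereas Spark does not promise which of several matching rows survives a dropDuplicates() on a subset of columns.

```python
from typing import Dict, List, Optional, Sequence

Row = Dict[str, object]

def drop_duplicates(rows: List[Row], subset: Optional[Sequence[str]] = None) -> List[Row]:
    """Keep the first row for each distinct combination of `subset` columns.

    With subset=None every column is compared, which matches distinct().
    """
    seen = set()
    kept: List[Row] = []
    for row in rows:
        columns = subset if subset is not None else sorted(row)
        key = tuple(row[column] for column in columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept

def distinct(rows: List[Row]) -> List[Row]:
    """Drop rows that repeat across all columns."""
    return drop_duplicates(rows, subset=None)

# Hypothetical sample data.
rows = [
    {"name": "S1", "age": 20},
    {"name": "S1", "age": 20},  # exact duplicate of the first row
    {"name": "S1", "age": 22},  # same name, different age
    {"name": "S2", "age": 23},
]

all_columns = distinct(rows)                      # removes only the exact duplicate
by_name = drop_duplicates(rows, subset=["name"])  # one row per name

print(len(all_columns), len(by_name))  # 3 2
```

Comparing all columns leaves both S1 rows with different ages in place, while deduplicating on the name column alone collapses them into one.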
