Pyspark Syntax Cheat Sheet

Quickstart Install on macOS: brew install apache-spark && pip install pyspark Create your first DataFrame: from pyspark.sql import SparkSession spark = SparkSession.builder.getOrCreate() # I/O options: https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/io.html df = spark.read.csv('/path/to/your/input/file') Basics # Show a preview df.show() # Show preview of first / last n rows df.head(5) df.tail(5) # Show preview as JSON (WARNING: in-memory) df =... Continue Reading →

Azure Data Engineer Journey Learning links

Start your Azure journey here.....1. Azure Data Factory.https://lnkd.in/gEmpbyrMProject: https://lnkd.in/gFG2aCgy2. Azure Data bricks.https://lnkd.in/gvFwKxaNproject: https://lnkd.in/gFG2aCgy3. Azure Stream Analytics.https://lnkd.in/g35VbSTv4. Azure Synapse Analytics.https://lnkd.in/gCufskNC5. Azure Data Lake Storage.https://lnkd.in/gcEKjWsc6. Azure SQL database.https://lnkd.in/gmHxqxQX7. Azure Postgres SQL database.https://lnkd.in/grHWJvWZ8. Azure MariaDB.https://lnkd.in/gYSp7MZi9. Azure Cosmos DB.https://lnkd.in/g6jPZA36This is an excellent guide to become azure data engineer. No need to become expert. but learn how to work with... Continue Reading →

Create a website or blog at WordPress.com

Up ↑

Design a site like this with WordPress.com
Get started