1. What is Hadoop MapReduce?
A.) The Hadoop MapReduce framework is used to process large datasets in parallel across a Hadoop cluster.
2. What are the differences between a relational database and HDFS?
RDBMS and HDFS can be compared across 6 major categories: data types, processing, schema on read vs. write, read/write speed, cost, and best-fit use case. ... Continue Reading →
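To make the MapReduce idea above concrete, here is a minimal single-process Python sketch of the map, shuffle, and reduce phases applied to a word count. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. On a real cluster, the framework would run the map and reduce steps in parallel on different nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all values by key, as happens between the map and reduce phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce sequentially (illustrative only)."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

The three functions are kept separate because the map and reduce steps are the parts a developer writes, while grouping by key is handled by the framework.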
Data Scientist Roadmap
How I would relearn Data Science in 2024 to get a job:

Getting Started: ⬇️
- Data Science Intro: DataCamp
- Anaconda Setup: Anaconda Documentation

Programming:
- Python Basics: Real Python
- R Basics: R-bloggers
- SQL Fundamentals: SQLZoo
- Java for Data Science: Udemy - Java Programming and Software Engineering Fundamentals

Mathematics:... Continue Reading →
Google Cloud Compute Engine vs App Engine
Google Cloud Platform provides a wide range of computing services that target broad categories of user needs. It offers six main compute options:
- App Engine
- Compute Engine
- Kubernetes Engine
- Cloud Functions
- Cloud Run
- VMware Engine

Now let’s talk about some of these services in brief.

Compute Engine
The Compute... Continue Reading →
Google Cloud GCloud Commands Cheat Sheet
Google Cloud Config

PURPOSE                          COMMAND
Show current config / project    gcloud config list, gcloud config list project
Show project info                gcloud compute project-info describe
Switch project                   gcloud config set project <project-id>
Set the active account           gcloud config set account <ACCOUNT>
Set default region               gcloud config set compute/region us-west1
Set default zone                 gcloud config set compute/zone us-west1-b
List configurations              gcloud config configurations list
Activate configuration           gcloud config configurations activate

Google Cloud... Continue Reading →
AWS Certification
FREE AWS certificates by Amazon that you can't miss in 2023:
1. Getting Started with Data Analytics on AWS 🔗 https://lnkd.in/dwRhRAzM
2. Practical Data Science on the AWS Cloud Specialization 🔗 https://lnkd.in/d3-3GZbG
3. Getting Started with AWS Machine Learning 🔗 https://lnkd.in/dhAp-Vjh
4. Introduction to Machine Learning on AWS 🔗 https://lnkd.in/detfDCWA
5. Hands-on Machine Learning with AWS and NVIDIA 🔗 https://lnkd.in/dgGvATq2
6. AWS Fundamentals Specialization 🔗 https://lnkd.in/dSV9jhRz
7. Building Modern Python Applications on AWS 🔗 https://lnkd.in/dQAinFGy
8. AWS... Continue Reading →
System Design Blogs
30 Blogs to learn 30 System Design Concepts:
1) Content Delivery Network (CDN): https://lnkd.in/gjJrEJeH
2) Caching: https://lnkd.in/gC9piQbJ
3) Distributed Caching: https://lnkd.in/g7WKydNg
4) Latency vs Throughput: https://lnkd.in/g_amhAtN
5) CAP Theorem: https://lnkd.in/g3hmVamx
6) Load Balancing: https://lnkd.in/gQaa8sXK
7) ACID Transactions: https://lnkd.in/gMe2JqaF
8) SQL vs NoSQL: https://lnkd.in/g3WC_yxn
9) Consistent Hashing: https://lnkd.in/gd3eAQKA
10) Database Index: https://lnkd.in/gCeshYVt
11) Rate Limiting: https://lnkd.in/gWsTDR3m
12) Microservices Architecture: https://lnkd.in/gFXUrz_T
13) Strong vs Eventual Consistency: https://lnkd.in/gJ-uXQXZ
14) REST vs RPC:... Continue Reading →
Databricks lakehouse fundamentals
You can try the free Databricks Lakehouse Fundamentals recorded videos and certification. The link is below.
https://lnkd.in/gXx2GUH8
#lakehouse #databricks
Spark – BTS
Internal working of Apache Spark (don't forget to save it)

Apache Spark works on the principle of in-memory computation, which can make it up to 100x faster than disk-based Hadoop MapReduce on some workloads and makes it a highly performant distributed framework.

Here is a detailed explanation of what happens internally when a Spark job is executed using the spark-submit command:

📋 Step 1: Client application initiates the execution... Continue Reading →