Data Masking in PySpark

Hide credit card number:
Accept a 16-digit credit card number from the user and display only the last 4 digits, replacing the first 12 with asterisks.
Input:
1234567891234567

Output:
************4567

We can use PySpark or plain Python.
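
For the plain Python route, a minimal sketch might look like the following. The function name mask_card_number and the choice to strip spaces and hyphens before validating are assumptions made for illustration; they are not part of the original problem statement.

```python
def mask_card_number(card_number: str) -> str:
    """Mask a 16-digit card number, keeping only the last 4 digits visible."""
    # Assumption: users may type separators, so drop spaces and hyphens first.
    digits = card_number.replace(" ", "").replace("-", "")

    # Reject anything that is not exactly 16 ASCII digits.
    if len(digits) != 16 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a 16-digit credit card number")

    # 12 asterisks followed by the last 4 digits.
    return "*" * 12 + digits[-4:]


print(mask_card_number("1234567891234567"))  # ************4567
```

To take the number from the user, the result of input() could be passed to this function in place of the sample string.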


Code in PySpark:
----------------
from pyspark.sql import SparkSession

# Create a SparkSession
spark = SparkSession.builder.appName("HideCreditCard").getOrCreate()

# Sample input credit card number (16 digits, kept as a string)
input_cc_number = "1234567891234567"

# Hide all characters except the last four digits
hidden_cc_number = "*" * (len(input_cc_number) - 4) + input_cc_number[-4:]

# Create a DataFrame with the original and the hidden credit card number
data = [(input_cc_number, hidden_cc_number)]
df = spark.createDataFrame(data, ["Original_CC_Number", "Hidden_CC_Number"])

# Display the result without truncating the column values
df.show(truncate=False)


This code builds the masked string in plain Python and then creates a PySpark DataFrame with two columns: Original_CC_Number holds the input credit card number, and Hidden_CC_Number holds the same number with asterisks in place of every digit except the last four. To mask a number supplied by the user, assign the user's input to the input_cc_number variable instead of the sample value.
