📝 Research: https://ojitha.blogspot.com.au for my lengthy articles.
Scala - AWS EMR Serverless
AWS EMR Serverless is a cost-effective AWS service to which you can submit Spark jobs written in Scala.
AWS CI/CD pipeline to Copy files to S3 bucket
Sometimes it is necessary to copy files to AWS S3 from a CI/CD build pipeline.
Notes on Introduction to Advanced Bash Usage
These notes record my hands-on work while going through the YouTube talk and its associated presentation. It is recommended to go through the basics first. You can also refer to the Bash Reference Manual for more information.
Pandas type conversion
Sometimes we need to remove unnecessary data from a Pandas DataFrame and store a column in the right data type.
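As a rough illustration of that idea, here is a minimal sketch. The column names and sample values are hypothetical and are not taken from the article. It assumes the unwanted data is a currency symbol and thousands separators inside a string column, and that unparseable rows can be dropped.

```python
import pandas as pd

# Hypothetical raw data: numbers stored as strings, one unparseable value.
raw = pd.DataFrame({
    "item": ["a", "b", "c"],
    "price": ["$1,200.50", "$30", "n/a"],
    "qty": ["3", "4", "5"],
})

clean = raw.copy()

# Remove the characters that are not part of the number.
clean["price"] = clean["price"].str.replace(r"[$,]", "", regex=True)

# Convert to a numeric dtype; values that cannot be parsed become NaN.
clean["price"] = pd.to_numeric(clean["price"], errors="coerce")

# Convert the quantity strings to 64-bit integers.
clean["qty"] = clean["qty"].astype("int64")

# Drop the rows whose price could not be parsed.
clean = clean.dropna(subset=["price"]).reset_index(drop=True)

print(clean.dtypes)
```

Using `errors="coerce"` is one possible design choice: it turns bad values into NaN so they can be inspected or dropped, instead of raising an exception on the first bad row.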
AWS Glue run locally
This blog post explains how to create an AWS Glue container to develop PySpark scripts locally. I’ve already explained how to run Glue locally in Glue Development using Jupyter.