This is my tech workbook. Please visit https://ojitha.blogspot.com.au for my official Tech blog.

Semantic search with ELSER in Elasticsearch

May 11, 2024

Elastic Learned Sparse EncodeR (ELSER) is a retrieval model trained by Elastic that lets you perform semantic search and retrieve more relevant results.

Summary of ELSER process

  1. Install ELSER v2: only once (DevOps will do this for you)
  2. Create the source index, where you insert all your documents
  3. Create the target index
  4. Create the ingestion pipeline
  5. Run the reindex process to create the embeddings
  6. Run semantic searches using text expansion queries
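Steps 3 to 6 can be sketched as the request bodies you would send to Elasticsearch. This is a minimal sketch, not the exact setup from the post: the index names, pipeline name and field names are hypothetical placeholders, and the model id is assumed to be the Linux-optimised ELSER v2 id. The `sparse_vector` mapping is assumed for 8.11; some earlier 8.x versions use `rank_features` instead.

```python
import json

# Assumption: model id of the Linux-optimised ELSER v2 deployment.
ELSER_MODEL_ID = ".elser_model_2_linux-x86_64"

# Hypothetical names, chosen only for illustration.
SOURCE_INDEX = "docs-source"
TARGET_INDEX = "docs-elser"
PIPELINE_ID = "elser-v2-pipeline"
TEXT_FIELD = "content"
EMBEDDING_FIELD = "content_embedding"


def target_index_body():
    """Step 3: body for PUT /<target index>, with a field for the ELSER tokens."""
    return {
        "mappings": {
            "properties": {
                EMBEDDING_FIELD: {"type": "sparse_vector"},
                TEXT_FIELD: {"type": "text"},
            }
        }
    }


def pipeline_body():
    """Step 4: body for PUT /_ingest/pipeline/<pipeline id>."""
    return {
        "processors": [
            {
                "inference": {
                    "model_id": ELSER_MODEL_ID,
                    "input_output": [
                        {"input_field": TEXT_FIELD, "output_field": EMBEDDING_FIELD}
                    ],
                }
            }
        ]
    }


def reindex_body(batch_size=50):
    """Step 5: body for POST /_reindex, routing documents through the pipeline."""
    return {
        "source": {"index": SOURCE_INDEX, "size": batch_size},
        "dest": {"index": TARGET_INDEX, "pipeline": PIPELINE_ID},
    }


def search_body(question):
    """Step 6: body for GET /<target index>/_search using a text expansion query."""
    return {
        "query": {
            "text_expansion": {
                EMBEDDING_FIELD: {
                    "model_id": ELSER_MODEL_ID,
                    "model_text": question,
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the query body so it can be pasted into Kibana Dev Tools.
    print(json.dumps(search_body("how do I tune index refresh?"), indent=2))
```

The functions only build dictionaries, so they can be checked without a running cluster and then passed to whichever client you use to talk to Elasticsearch.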

This post uses Docker to demonstrate the Linux-optimised ELSER v2 model. The Elasticsearch version is 8.11.1.

More…

Kafka PySpark streaming example

July 18, 2023

The diagram shows the Kafka producer reading from Wikimedia and writing to a Kafka topic. The Spark consumer then pulls the data from that topic and writes the stream batches to disk.

Architecture of the streaming application
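The consumer side of this flow can be sketched with Spark Structured Streaming. This is a minimal sketch under assumptions, not the code from the post: the function names, the JSON output format and the `earliest` starting offset are illustrative choices, and the Spark session is assumed to have been started with the Kafka connector package (`spark-sql-kafka-0-10`) available.

```python
def kafka_source_options(bootstrap_servers, topic):
    """Options for Spark's Kafka source (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Assumption: read the topic from the beginning on the first run.
        "startingOffsets": "earliest",
    }


def start_stream_to_disk(spark, bootstrap_servers, topic, output_path, checkpoint_path):
    """Pull records from a Kafka topic and write each micro-batch to disk.

    `spark` is an existing SparkSession; the returned object is a StreamingQuery.
    """
    events = (
        spark.readStream.format("kafka")
        .options(**kafka_source_options(bootstrap_servers, topic))
        .load()
        # Kafka delivers the payload as bytes; cast it to a string column.
        .selectExpr("CAST(value AS STRING) AS value")
    )
    return (
        events.writeStream.format("json")
        .option("path", output_path)
        # The file sink needs a checkpoint location to track progress.
        .option("checkpointLocation", checkpoint_path)
        .outputMode("append")
        .start()
    )
```

A caller would typically invoke `start_stream_to_disk(...)` and then call `awaitTermination()` on the returned query to keep the job running.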

More…

Terraform For each iteration

July 8, 2023

This post explains Terraform's for_each looping technique. In the example, 3 buckets are created to demonstrate the idea.

create 3 S3 buckets

In the first step, we will create the above 3 buckets starting from 0.

More…

Spark to create a table in AWS Redshift

June 13, 2023

In this post, Spark reads the data from a CSV file into a DataFrame and saves that DataFrame as a Redshift table.

Spark to Redshift
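One way to express the CSV to Redshift flow is through Spark's generic JDBC writer. This is a sketch under assumptions: the post may use a dedicated Redshift connector instead, the function names are hypothetical, and the Redshift JDBC driver is assumed to be on the Spark classpath.

```python
def redshift_jdbc_options(host, database, table, user, password, port=5439):
    """Options for Spark's JDBC writer pointed at Redshift (hypothetical helper)."""
    return {
        "url": f"jdbc:redshift://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def csv_to_redshift(spark, csv_path, options):
    """Read a CSV file into a DataFrame and save it as a Redshift table.

    `spark` is an existing SparkSession; `options` comes from
    redshift_jdbc_options().
    """
    # Treat the first row as column names and let Spark infer column types.
    df = spark.read.csv(csv_path, header=True, inferSchema=True)
    # Assumption: replace the table if it already exists.
    df.write.format("jdbc").options(**options).mode("overwrite").save()
    return df
```

For large tables, a connector that stages the data in S3 and issues a COPY is usually faster than row-based JDBC inserts, so treat this as a starting point only.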

I also explain how to create a table in Postgres, use Jupyter magics, and plot a diagram.

More…