Semantic Search Tutorial

[Screenshot: Copernicus Services semantic search app interface in dark mode]

A complete semantic search tutorial covering:

  • data mining with requests and beautifulsoup
  • preprocessing in pandas
  • chunking the document text into smaller paragraphs sized to fit the ML model
  • creating embeddings for each chunk
  • calculating the mean embedding for each document
  • saving the data as gzipped JSON (small file size, and quick and easy to read in JavaScript with pako.js)
  • creating a static web app based on transformers.js on GitHub Pages
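
The middle steps of this list (chunking, per-chunk embeddings, the mean embedding per document, and gzipped JSON output) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tutorial's actual code: the function names, the record layout, the 100-word chunk size, and especially `toy_embed` are hypothetical. `toy_embed` is a deterministic stand-in that keeps the sketch runnable without downloading a model; a real pipeline would call a sentence-embedding model there instead.

```python
import gzip
import json


def chunk_text(text, max_words=100):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def toy_embed(chunk, dim=4):
    """Placeholder embedding (assumption): NOT a real model.

    Buckets character codes into `dim` slots and normalizes the result,
    so the output is deterministic but carries no semantic meaning.
    """
    vec = [0.0] * dim
    for i, ch in enumerate(chunk):
        vec[i % dim] += ord(ch)
    total = sum(vec) or 1.0
    return [v / total for v in vec]


def mean_embedding(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_record(doc_id, text, embed=toy_embed, max_words=100):
    """Chunk one document, embed each chunk, and add the document mean."""
    chunks = chunk_text(text, max_words)
    vectors = [embed(chunk) for chunk in chunks]
    return {
        "id": doc_id,
        "chunks": chunks,
        "chunk_embeddings": vectors,
        "mean_embedding": mean_embedding(vectors),
    }


def save_gzipped_json(records, path):
    """Write records as gzip-compressed JSON text."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(records, f)


def load_gzipped_json(path):
    """Read records back from a gzip-compressed JSON file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)
```

Calling `build_record("doc-1", text)` for each document and passing the resulting list to `save_gzipped_json(records, "docs.json.gz")` yields a single compressed file that a static page can fetch and inflate in the browser. Note that `build_record` raises `ValueError` for an empty document, since a mean over zero chunks is undefined.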

App here