IT - Data and Artificial Intelligence Intern
September 2025 – March 2026
- Built a geospatial data mining pipeline using Computer Vision and Google Earth Engine to extract infrastructure data from satellite maps; utilized Levenshtein distance to ensure high-accuracy data loading into BigQuery.
- Developed a high-precision OCR platform powered by Tesseract.js and Docling to automate and validate data extraction from power billing statements. Implemented a bounding-box annotation system to target specific document regions, and integrated PyMuPDF for document processing and LLM-ready data preparation.
- Conducted R&D on multi-agent workflows using LangGraph, Flowise, and Pydantic AI to develop an in-house Agentic Data Analyst platform that automates data extraction from Excel, CSV, and BigQuery to generate visualizations and key insights. Deployed the system on GCP using Docker and Google Artifact Registry, with Okta authentication restricting access to employees.





