Projects
Over time, I have got my hands dirty in projects from multiple realms. Here is a snapshot.
SRT: Shiprocket || WHJ: Whitehat Jr || EXL: EXL Services
Professional
Open Hackathons Mentorship
- Kiefer AI Open Hackathon 2026 | Athens, Greece - Mentored team Big O No on GPU-accelerated NLP for Greek digital media.
- Big O No
- Built Hellenic Insight, a real-time Greek news NLP platform for political bias and ideological framing detection.
- Developed an asynchronous ingestion and processing pipeline using Crawl4AI, Scrapy, Kafka, PySpark, Apache Spark, MongoDB, and React.
- Fine-tuned Qwen 3.6 35B with LoRA/Unsloth and served it through vLLM, achieving an approximately 10% F1 improvement with strong gains in right and center-right classification quality.
- AI for Science Australian Hackathon 2026 | Melbourne, Australia - Mentored teams WA-Ag Futures and Team_MCCG on GPU-accelerated AI for science workloads.
- WA-Ag Futures
- Built WA Ag-LandInsight, a paddock-level agricultural intelligence platform for Western Australia’s wheatbelt.
- Processed approximately 2TB of spatial data in about 14 minutes, achieving 20x faster preprocessing and reducing model training from hours to about 2 minutes.
- Applied RAPIDS, cuDF, cuML, CuPy, XGBoost, PyTorch, NIM/Nemotron, and dashboarding patterns to deliver decision-ready insights for farmers and policymakers.
- Team_MCCG
- Optimised physics-based ML workflows for chemical property prediction across LDA+, QPep, and related computational chemistry models.
- Achieved a 3.0x speedup for the LDA+ model and a 2.7x reduction in QPep training time through PyTorch refactoring, data-loading improvements, and multi-GPU scaling.
- Used PyTorch, Nsight Systems, and profiling-guided software architecture improvements to connect multiple chemical ML models through a prototype GUI platform.
- India Open Hackathon 2025 | Mumbai, India - Mentored team IMD_AIML on GPU-accelerated AI for weather and climate applications.
- IMD_AIML
- Built an AI-for-weather workflow to estimate surface and maximum air temperature from satellite and in-situ meteorological datasets for heat-risk applications.
- Combined IMD observations, AWS/synoptic station data, LST, OLR, NDVI, DEM, and LULC features to improve spatial estimates beyond station-only coverage.
- Explored PyTorch, TensorFlow, XGBoost, Transformers, ESRGAN/SRGAN, xarray, netCDF4, GeoPandas, and A100 GPU resources for satellite downscaling and trustworthy temperature modeling.
- India GSI Open Hackathon 2025 | Bengaluru, India - Mentored team GenSureAI on agentic AI for insurance claims processing.
- GenSureAI
- Won the hackathon with a multi-agent RAG workflow for medical insurance claim summarization, discrepancy detection, and adjuster decision support.
- Built pipelines to flag non-pertinent medical bills, summarize claim timelines, and support claims Q&A, with the team estimating more than 50% reduction in analysis time.
- Used NVIDIA NIM, NeMo Retriever, NeMo Agent Toolkit/AIQ, LangChain, LangGraph, Milvus, Streamlit, FastAPI, PyMuPDF, Tesseract, FAISS, and models including Llama 3.3 Nemotron, Palmyra Med, and Kimi K2.
- FPT AI Open Hackathon 2025 | Hanoi, Vietnam - Mentored team Jarvis AI on multimodal agentic AI for customer support workflows.
- Jarvis AI
- Built a multichannel AI assistant that centralizes customer requests from channels such as email, social platforms, hotlines, Zalo, Messenger, and WhatsApp into a unified support workflow.
- Integrated FPT AI Marketplace models into the application and built a GraphRAG pipeline for document-based Q&A, business-wide search, summarization, and synthesis.
- Used LlamaIndex, ReactJS, browser-extension context, and models including Qwen2.5-Coder-32B-Instruct, DeepSeek-R1-Distill-Llama-70B, Qwen3-32B, and SaoLa Llama variants to improve answer accuracy and support smarter human handoff.
Generative AI
Algo/Tools/Libraries Used: langchain, GPTs, OpenAI, Vector DataStores, whisper, gpt-4, LLaMA
[SRT] Engineered “SR-Copilot”, a RAG-based application, to streamline e-commerce and SR product interactions by efficiently addressing user queries and suggesting pertinent products and support ticket updates. Link
[SRT] Prototyped an automated pipeline for processing customer support calls, rating interaction quality, transcribing content, identifying key pointers, and assessing buyer sentiment. (ASR→NLP→TTS)
Impact: Reduced the number of agent call service requests by 7.6%.
[SRT] Created an LLM-based chat app that lets non-technical stakeholders ask questions in natural language and run the resulting database queries over Snowflake and Redshift.
Tabular Machine Learning
Algo/Tools/Libraries Used: pandas, polars, xgboost, catboost, scikit-learn, rapids, dask, AWS Sagemaker, S3, ELK
- [SRT] An ML pipeline that predicts a customer’s propensity to reject an e-commerce order at delivery. Link
Impact: Saved over 35MM in shipping costs for 250+ D2C sellers.
- [SRT] An ML solution combining NLP and tabular data analysis to identify fraudulent seller behaviors.
Impact: Achieved an 18% reduction in fraud cases across multiple categories (KYC, weight fraud, etc.).
- [WHJ] A classifier model to identify promising retention leads for sales pitch prioritization.
Impact: Enhanced retention performance by 8%.
- [EXL] A suite of 12 ML models for an insurance client that estimate a potential customer’s:
- propensity to respond to a marketing campaign and convert.
- estimated chargeable premium and loss ratio, if converted.
Implemented an automated single-window customer selection and offer allocation framework on top of the models.
Impact: The process boosted performance by 27%.
Natural Language Processing
Algo/Tools/Libraries Used: PyTorch, BERT, Word2Vec, NER, nltk, flashtext, gensim, rapidfuzz, transformers, sentiment analysis, semantic search, clustering, text similarity
- [SRT] An unsupervised learning framework to enhance, standardise and validate over 6MM delivery addresses.
Impact: A state-of-the-art address intelligence agent that can successfully parse over 100K localities from 500+ Indian districts.
- [SRT] Developed an address deduplication and syntax correction pipeline; the whole suite led to a 20% increase in deliveries.
Impact: Successfully identified customers belonging to the same household, leading to better customer segmentation.
- [SRT] Engineered an intelligent system for instant product categorization, categorising over 1.3 million uniquely named products.
Impact: Categorisation helped in better targeting for our other products.
Clustering and Segmentation
Algo/Tools/Libraries Used: DBSCAN, k-means, sklearn, ANOVA, noSQL, mongo
- [SRT] Created a custom seller segmentation for 100K+ D2C sellers registered on the platform.
- [SRT] Designed a novel clustering method to segregate over 100 million unique buyers for tagging and behavioral analysis.
- [WHJ] An SQL-based segmentation for improved student-teacher mappings, for better learning outcomes.
Forecasting and Geo-Spatial
Algo/Tools/Libraries Used: fb-prophet, time-series analysis, ARIMA, kepler, geopandas, scipy, statsmodels, spatial-clustering
- [SRT] Probed hyper-local e-commerce geo-coordinate data to identify ideal locations for dark stores.
- [SRT] Built demand forecasts for fast-moving goods to improve control over inventory and order management.
Miscellaneous
Algo/Tools/Libraries Used: SQL, Redshift, Tableau, Snowflake, Excel, Google Workspace APIs
- [ALL] Data pipelines to prepare training data and to refresh features for inference.
- [ALL] Custom dashboards and reports in GSheets and Tableau to track performance of models and raise flags for changes.
- [EXL] Converted unoptimised SAS-based code to R and Python modules for faster processing and lower running costs by eliminating license fees.
- [EXL] Optimised SQL data processing pipelines from Oracle to Redshift.
Personal
Generative AI
- An audio-to-text ML app that converts expenses to JSON and builds a custom report. Link