Nitanshu Joshi

Data Scientist | Machine Learning Engineer | AI Engineer

Transforming business problems into AI-based applications

Professional Summary

Data Scientist with 2 years of experience deploying LLM/NLP applications and building vector-search recommendation systems. Partners with stakeholders to deliver RAG, AI-agent, and AWS ML solutions, including work that reduced churn by 15%.

Key Achievements

500+ New User Acquisitions
20% Engagement Boost
30% Turnaround Time Reduction
95% Fraud Detection Recall

Technical Skills

Languages
SQL, Python, PySpark, Pandas, NumPy, Scikit-learn, TensorFlow, PyTorch
Tools & Technologies
Hugging Face, Git, Tableau, Power BI, MS Excel
Cloud & Deployment
AWS EC2, AWS ECS, AWS Lambda, AWS SageMaker, AWS Bedrock, MS Azure, Streamlit, Docker, CI/CD, MLOps
Machine Learning
Regression, Classification, Clustering, Deep Neural Networks, Random Forest, XGBoost, NLP
Statistical & Analytical
Predictive Modelling, Statistical Analysis, Time Series Forecasting, A/B Testing, Data Mining
Data Engineering
Spark, Airflow, Kafka, MySQL, PostgreSQL, MongoDB, ETL
GenAI
MCP, LLM, LLM Fine-Tuning, LoRA, QLoRA, PEFT, AI Agents, Quantization, LangChain, LangGraph, CrewAI
Other Skills
Business Analysis, Root Cause Analysis, Issue Tree Framework

Work Experience

Data Scientist

GradMeet LLC | Oct 2024 – May 2025
  • Launched a vector-search recommendation engine on AWS EC2, bringing in 500+ new users and boosting engagement by 20% through data-driven personalization
  • Boosted satisfaction for 100+ users by integrating NLP-driven sentiment analysis and regex into feedback processing, enhancing personalized experiences
  • Built an internal RAG system to organize documents and enable fast, accurate retrieval
  • Partnered with Marketing and Finance teams to validate growth strategies using A/B testing and root cause analysis, directly supporting business objectives
  • Productionized analytics with AWS/Docker/CI/CD, improving reliability and time-to-insight for iterative experiments

Data Scientist

Indiana University Bloomington | Sep 2023 – Aug 2024
  • Improved clustering of spatial genomic data, raising cluster quality by 18% (Silhouette score) and 68% (Davies-Bouldin index) using advanced statistical methods
  • Engineered a Graph Neural Network that integrated 6,000 genes and 23,000 drug–cell line pairs via a feed-forward network, reducing prediction error (RMSE) by 7% and raising explained variance (R-squared) by 2%
  • Streamlined validation workflows for 15,000+ RNA-seq samples, reducing manual review time by 15% and accelerating three concurrent lab research projects
  • Translated complex 2D gene patterns into actionable insights and presented key findings to stakeholders, supporting data-driven decision-making in biomedical research

Associate Instructor

Indiana University Bloomington | Jan 2023 – May 2023
  • Mentored 30+ students in machine learning workshops, resulting in a 2% average performance improvement across participants

Research Data Analyst

Biostatistics Consulting Center, Indiana University Bloomington | Aug 2022 – Dec 2022
  • Spearheaded analysis of 35,000+ COVID-19 PCR test records, pinpointing and remedying process bottlenecks to slash lab turnaround time by 30%
  • Built Power BI dashboards for real-time test tracking, cutting testing backlogs by 40% and saving over 10 staff hours weekly with Python automation
  • Maintained 99% diagnostic accuracy by rigorously validating laboratory processes with ANOVA and t-tests, supporting reliable results for critical healthcare decisions

Education

MS in Data Science

Indiana University, Bloomington, IN, USA | May 2023

Post Graduate Diploma in Data Science

IIIT Bangalore, India | Aug 2021

Bachelor's in Computer Science

Symbiosis International University, Pune, India | May 2019

Featured Projects

ArXiv Multi-Agent Research Assistant

AI-powered system for researching top papers using CrewAI and LangChain

CrewAI LangChain Python

Real-time Sentiment Analysis Pipeline

Streaming data pipeline for Reddit sentiment analysis using Kafka and Spark

Kafka Spark NLP

Credit Card Fraud Detection

XGBoost model deployed on Azure with 95% recall

XGBoost Azure ML

RAG Chat Application

PDF chat application using locally hosted LLMs and ChromaDB

RAG ChromaDB Ollama

Certifications & Publications