My Story

Ever since the early 2010s, when Amazon.com was booming, I have been fascinated by its product recommendations and curious about the science behind them. The way massive amounts of data are stored, analyzed, visualized, and used to make predictions was riveting to my 10-year-old brain, and it sparked my interest in data analytics and machine learning.

I completed my Bachelor's in Computer Engineering with Honors in Artificial Intelligence and Machine Learning in July 2023. Presently, I’m pursuing the MS in Information Systems at Santa Clara University.

My experience includes internships at Druva, Josh Software, and Bolt IoT. At Druva, I enhanced data pipelines by building data transformation models with Data Build Tool (dbt), improved Snowflake warehouse performance through query optimization, developed real-time monitoring dashboards, and built machine learning models for predictive analytics. As a Senior Year Project Intern - ML Engineer at Josh Software, my team and I developed an ML model for automatic syntax error detection and correction in Python code. At Bolt IoT, I implemented a Smart Parking System that makes it easier to find an empty parking spot in a multi-story parking lot.

Apart from studies, I’m an avid reader, especially when it comes to contemporary fiction and mystery thrillers. My favorite authors are J. K. Rowling, Dan Brown, and Agatha Christie. I also like to paint and am an amateur guitarist. After a long day, I like to unwind by baking cakes or cookies.

Currently, I am actively seeking internship and full-time opportunities in analytics and machine learning. If you are looking for a dedicated individual with a strong background in data analytics, machine learning, and database management, let's connect!

Education

  • Santa Clara University - Leavey School of Business, Santa Clara, CA

    Master of Science in Information Systems

    September 2023 - June 2025
    GPA: 3.89/4.0

    Relevant Coursework: Data Analytics (Python), Machine Learning, Database Management Systems, Big Data Modeling and Analytics, Information Systems Analysis and Design

  • Savitribai Phule Pune University, India

    Bachelor of Engineering in Computer Engineering, Honors in Artificial Intelligence and Machine Learning

    August 2019 - June 2023
    GPA: 4.0/4.0

    Relevant Coursework: Statistics, Data Structures, Database Management, Data Science and Big Data Analytics, Artificial Intelligence, Machine Learning, Deep Learning, Natural Language Processing

Experience

  • Lightmatter

    Data Systems Intern

    June 2025 - Present
    • Evaluating Snowflake AI tools to improve pipeline efficiency and enable advanced analytics in reporting workflows
    • Built a Sigma data app backed by a Dagster-dbt-Snowflake pipeline to automate headcount allocation and reporting, eliminating manual Excel workflows and improving efficiency by 35%
    • Developed a Streamlit dashboard to analyze photonics engineering R&R (Reproducibility and Repeatability) test data for enhanced experiment validation and insight generation
  • Keywords Studios

    AI Research Intern

    April 2025 - June 2025
    • Applied prompt engineering to design context-aware prompt-response pairs for training AI agents across diverse scenarios
    • Developed prompts using chain of thought techniques to improve AI agent reasoning and output accuracy
  • Santa Clara University

    Research Assistant

    January 2025 - April 2025
    • Analyzed the impact and uncertainties surrounding the Inflation Reduction Act by leveraging Python and advanced NLP techniques, including ChatGPT, to extract insights and identify policy-related uncertainties from economic documents
    • Leveraged ArcGIS to visualize and analyze spatial data, assessing the impact of forest treatment efforts on wildfire reduction

  • Druva

    Data Analytics Intern

    June 2024 - December 2024
    • Created a Streamlit app for automatic Snowpipe creation, enabling real-time ingestion of AWS S3 bucket data into a Snowflake warehouse, improving operational efficiency by 12% and reducing manual data handling time by 5%
    • Designed and implemented dbt models for data ingestion, data transformation and data modeling, streamlining ETL processes from external sources and improving data pipeline efficiency by 16%
    • Built interactive Sigma dashboards for KPI tracking, reducing service downtime by 25% through proactive monitoring
    • Leveraged Zapier to automate reporting workflows, reducing manual intervention and ensuring timely updates for the team
    • Deployed Activepieces on AWS EC2 as a POC to optimize workflow automation, demonstrating potential for a 30% reduction in operational costs by minimizing dependency on third-party automation tools
    • Extracted, transformed, and integrated email data with existing propensity models and performed email sentiment analysis to improve churn prediction accuracy by 15%
  • Vanguard

    Practicum Project - Data Science Intern

    January 2024 - June 2024
    • Analyzed discrepancies between self-designated benchmarks and the actual performance of equity mutual funds
    • Employed data mining and web scraping with Selenium to extract and clean financial data from various sources
    • Applied prompt engineering to extract benchmark information from yfinance data, achieving an 80% success rate in benchmark extraction
    • Implemented a SARIMAX model for time-series analysis, forecasting, and performance evaluation
    • Provided actionable insights for optimizing benchmark criteria, empowering Vanguard to make informed investment decisions
  • Josh Software, Pune, India

    Senior Year Project Intern - Machine Learning Engineer

    August 2022 - May 2023
    • Automated syntax error detection and correction in Python code with an accuracy of 72%
    • Applied ETL, data preprocessing, feature engineering and data wrangling to prepare GitHub data for analysis
    • Engineered an encoder-decoder transformer model with LSTM cells for effective code analysis
  • Inventrom Private Limited - Bolt IoT, Remote

    IoT Intern

    January 2022 - March 2022
    • Created a web application that shows real-time parking lot status to help drivers locate available spaces in a multi-story parking facility, reducing the time and fuel spent searching for a spot by 25%
    • Used an ultrasonic sensor and an Arduino Nano board to detect the presence or absence of a car in a parking spot
    • Crafted a Vue.js-based UI to showcase parking lot status, incorporating LED indicators to display available spots

Projects

  • Tableau Dashboard: Road Accidents Analysis

    May 2024

    The Road Accidents Analysis Dashboard project aims to provide comprehensive insights into road safety by analyzing historical accident data. Through a series of interactive visualizations, including hexbin maps and donut charts, the dashboard highlights key trends and patterns in accident occurrence by time and by contributing factors such as light and road conditions.

  • Tableau Dashboard: British Airways Review Analysis

    April 2024

    The goal of this project was to create a comprehensive Tableau dashboard to analyze customer reviews of British Airways, aiming to extract actionable insights for enhancing customer satisfaction and service quality. The dashboard aggregates and visualizes key metrics and trends derived from customer feedback data. It provides the necessary information to make data-driven decisions and implement targeted improvements to enhance the overall customer experience with British Airways.

  • Airbnb Price Prediction and Recommendation

    January 2024 - March 2024

    The aim of this project was to build an ML model for price prediction and to automatically recommend Airbnb listings based on user activity. For price prediction, I compared several models: Linear Regression, Lasso Regression, Ridge Regression, Decision Tree Regressor, and the ensemble models Bagging Regressor, Gradient Boosting Regressor, and Extreme Gradient Boosting Regressor. Extreme Gradient Boosting Regressor performed best, with an R² score of 0.60. The recommendation system, implemented using cosine similarity, recommended the top 5 Airbnb listings most similar to previously viewed listings.

  • Netflix Database Design

    January 2024 - March 2024

    The “Netflix Database Design” project aims to develop a comprehensive database system tailored for a streaming service akin to Netflix. The primary objective is to create a scalable, optimized schema capable of managing vast amounts of user data, viewing habits, content metadata, and operational information.

  • Hospital Dataset Analysis

    September 2023 - December 2023

    This project aims to uncover key trends and patterns impacting hospital operations. Missing values were handled during data cleaning. Visualizations created with Seaborn and Matplotlib revealed common diseases across age groups. Regression analysis was performed to predict patient mortality rate.

  • Vaccination Management System

    August 2022 - December 2022

    I designed and implemented a user-friendly COVID Vaccination Management System, streamlining appointment scheduling for users and enabling administrators to efficiently manage vaccine doses and centers.

  • Tourist Destination Recommendation System

    August 2022 - October 2022

    The objective of this project was to build a recommendation system that suggests places based on user input, which I implemented using cosine similarity. The system suggests destinations similar to a user-provided location, taking into account factors such as destination category (historical, beaches, cultural, adventure, etc.), visa requirements for Indian citizens, and minimum budget per day.

  • Cross Notes: Extension for Google Docs

    May 2021 - June 2021

    A Google Docs extension that streamlines cross-referencing of meeting notes. It includes a feature that automatically fills sections referenced by hashtag and section name with the designated content.

  • B7- Badaam Saat - Multiplayer Card Game

    April 2020 - July 2020

    B7- Badaam Saat is a real-time, multiplayer web game centered around cards, featuring live chat functionality, proxy play, and skip chance options.

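The Airbnb and Tourist Destination recommendation projects above both rely on cosine similarity. The following is a minimal sketch of that general idea, not the code from either project: the feature encoding, the destination names, and the `recommend` helper are all hypothetical and chosen only for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)

def recommend(query, catalog, top_k=5):
    """Return the names of the top_k catalog items most similar to query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]

# Hypothetical items encoded as [historical, beach, adventure, scaled_budget]
catalog = {
    "Destination A": [1.0, 0.0, 0.0, 0.3],
    "Destination B": [0.0, 1.0, 0.0, 0.5],
    "Destination C": [0.9, 0.0, 0.1, 0.4],
    "Destination D": [0.0, 0.2, 1.0, 0.8],
}

# Hypothetical query: a historical place with a modest daily budget
query = [1.0, 0.0, 0.0, 0.35]
print(recommend(query, catalog, top_k=2))
```

Because cosine similarity compares direction rather than magnitude, features are usually scaled to comparable ranges before scoring so that no single feature, such as budget, dominates the ranking.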