Work Experience
Arizona State University
Data Scientist 08 Jan. 2023 - Present Phoenix, Arizona, USA
• Built an end-to-end ETL pipeline with AWS Glue and Spark to process student performance streaming data (source: ZyBook vendor) arriving in AWS through Kinesis data streams, using NumPy and pandas for data manipulation; optimized the workflow to cut pipeline run time from 385 seconds to 164 seconds.
• Extended the pipeline with an AWS Lambda function that filters out irrelevant information and reduces the dimensionality of the processed data using the PCA implementation in scikit-learn.
• Implemented A/B testing methodologies to experiment with diverse content strategies for influencers, optimizing the effectiveness of content and enhancing overall engagement by 20%.
• Built a Virtual Academic Advisor using LSTM and ARIMA models on AWS SageMaker, predicting future course grades with 92% accuracy and enabling personalized course recommendations.
Rio Tinto
Data Engineer Intern 20 May 2024 - 09 Aug. 2024 Boron, California, USA
• Spearheaded the reengineering of the Borax data stream pipeline using PySpark, increasing processing speed from 32 to 45 shards per second and improving throughput and efficiency for mine site data modeling.
• Pioneered the development of an interactive dashboard application using Palantir Foundry and integrated Apache Spark MLlib to cluster blasting patterns, supporting data visualization and operational analysis.
• Digitized the entire blasting process, conducted on-site testing in the mine pit, and developed automated webhooks to upload blasting data into SAP, reducing manual data entry time by 75% and increasing data upload accuracy to 99.8%.
• Led a critical project to assess Expected Borax Sales and generated the G6 Reconciliation chart for upper management.
Tata Consultancy Services (TCS Digital)
Machine Learning Engineer (Systems Engineer) 20 May 2021 - 02 Nov. 2022 Gujarat, India
• Performed feature extraction using PyTorch and deep neural network models (e.g. Vision Transformer) to identify instrument size depth from 2D images.
• Migrated the on-premises workflow pipeline to AWS SageMaker, tuned the model hyperparameters using Bayesian optimization (Statistics & Applied Mathematics), and reduced the tuning time.
• Implemented parameter-efficient fine-tuning on the Fluorescent Penetrant Inspection (FPI) vision model to detect cracks in mechanical gears, training and validating the fine-tuned model in an AWS SageMaker notebook.
• Utilized SQL for data manipulation and extraction, ensuring efficient data processing and management within the Big Data ecosystem.
• Employed scikit-learn, Keras, and TensorFlow to develop and fine-tune machine learning models, achieving a 20% improvement in detection accuracy.
BVM Engineering College
Undergraduate Research Assistant 20 May 2021 - 02 Nov. 2022 Gujarat, India
• Tuned hyperparameters of transformer-based models (mainly BERT) using Python and PyTorch, achieving a 15% increase in model accuracy for neural networks and other advanced algorithms.
• Leveraged Apache Spark and MLlib for distributed data processing and machine learning tasks, integrating Big Data solutions on the Azure cloud platform, resulting in a 20% reduction in processing time and a 30% improvement in scalability.