Get To Know More About

Shouvik Sengupta

Occupation

Data Scientist

Education

B.Tech. Electrical and Electronics Engineering
M.S. Data Science

Data Scientist with expertise in advanced analytics, machine learning, and software engineering. Skilled at transforming raw data into actionable insights and communicating technical concepts to diverse audiences. Seeking a challenging role to apply advanced data science techniques, programming skills, and problem-solving abilities to build data-driven business solutions.

Explore My

Skills

Programming Languages

Python

RStudio

SQL

MATLAB

Frameworks

TensorFlow

Basic

Scikit-Learn

PyTorch

PySpark

Flask

Cloud / Data

AWS

Azure

PostgreSQL

MongoDB

Airflow

Kafka

Other Tools

Power BI

Tableau

Docker

Git

Explore My

Experience

Virufy

Data Scientist

Sep 2024 - present

  • Developed and deployed AI models on AWS that detect COVID-19 from cough audio recordings to support early diagnosis
  • Researched and adapted technologies to build and scale cost-effective ETL pipelines, reducing processing time by 5%

University of Colorado Boulder

Data Science Course Assistant

Sep 2023 - May 2024

  • Mentored students, graded coursework, and developed tests for a Machine Learning course, fostering a supportive learning experience for 60 students

Tata Consultancy Services

Software Developer

Jun 2021 - Aug 2022

  • Orchestrated the migration of RESTful microservices from Python to Java Spring Boot, improving performance by 20%
  • Improved performance by 10% by implementing Kafka-based messaging and MongoDB for data storage and backup
  • Used Mockito for unit testing and deployed the project in Docker containers on Amazon Web Services
  • Automated build, testing, and deployment to AWS using Jenkins, enabling CI/CD workflows for robustness and scalability

Schneider Electric Infrastructure LTD

Intern

May 2019 - Jul 2019

  • Built an Arduino-based rack-in/rack-out counter for switchgear, reducing cost by 50% compared to its mechanical counterpart
  • Participated in and delivered presentations, and automated weekly updates of FMEA Excel and PowerPoint files using Python

Browse My Recent

Projects

Driver Profiling

The project focuses on developing models using CNN-LSTM and XGBoost to analyze telematics data, leveraging isolation forests to detect anomalies, assess driving patterns, and score driver safety. Interactive Tableau dashboards made insights accessible to non-technical audiences, supporting safer driving practices and decision-making.


COVID Detection

The Chest X-ray Pneumonia Detection project involved deploying a web application on AWS EC2, powered by deep learning, that identifies pneumonia from chest X-rays with 87% accuracy to support medical diagnosis. The workflow incorporated MLflow, DVC, GitHub Actions, and modular coding to ensure efficient experiment tracking, version control, and seamless CI/CD integration.


Conversation Emotion-Cause Pair Extraction

The Conversation Emotion-Cause Pair Extraction project focused on modeling the extraction of emotion-cause pairs from conversations. The project was led with effective delegation and goal setting and achieved 61% accuracy in emotion classification by fine-tuning a RoBERTa model with PyTorch, building expertise in NLP. Transfer learning was applied to a BERT SQuAD Hugging Face model, attaining a 62.2% proportional F1 score in causal span detection.


Statistical Analysis of Greenhouse Gases

The Statistical Analysis of Greenhouse Gases project involved using regression modeling to forecast CO and NOx levels, providing actionable insights for emission management. The work included exploratory data analysis (EDA), data cleaning, feature selection, and diagnostic techniques, resulting in a 6% improvement in the adjusted R², enhancing the model's predictive accuracy.

English Premier League Predictor

The English Premier League Predictor project involved building an ETL pipeline with Airflow and AWS to ingest and store EPL data in Amazon Redshift for efficient SQL-based extraction and analysis. Machine learning models were trained to predict match outcomes, and Power BI dashboards were developed to support data-driven decision-making.

Anime Recommendation

The anime recommendation web application, built using Flask, featured a recommendation model based on content-based filtering. Anime data was web-scraped, and user reviews were classified as positive or negative using BERT sentiment analysis, enhancing the personalized user experience.

Get in Touch

Contact Me