Shutonu Mitra

Software | Cloud | AI

Hello!

I am a backend-leaning Software Engineer with experience building
scalable APIs, cloud-native services, and
reliable backend systems in production environments.
I focus on applied AI, integrating machine learning into real-world
software through NLP-driven services,
graph-based modeling, and data-intensive platforms.
Reach out to me at shutonumitra@gmail.com

Interests

Backend Engineering
I design and build scalable APIs, microservices, and backend architectures with a focus on performance, fault tolerance, and long-term maintainability.

Cloud & Distributed Systems
Experience building and operating cloud-native systems using containerization, managed services, and distributed data pipelines on modern cloud platforms.

Applied AI
I focus on integrating AI and LLM-based features into production systems, emphasizing observability, failure handling, and real-world user impact.

Machine Learning
Experience developing and deploying ML and NLP-driven services with attention to evaluation quality, scalability, and maintainable model workflows.

Education

Master of Science – Computer Science

Aug 2023 – May 2025

Virginia Polytechnic Institute & State University, USA

Coursework:

  • Software Engineering
  • Distributed Systems
  • Operating Systems
  • Cloud Computing & Applications
  • Big Data Technologies
  • Machine Learning I & II
  • Fundamentals of Information Security
  • Web & Mobile Application Development




Bachelor of Science – Computer Science & Engineering

2018 – 2022

Military Institute of Science and Technology, Bangladesh

Coursework:

  • C Programming
  • Object-Oriented Programming (C++ and Java)
  • Data Structures & Algorithms
  • Database Management Systems
  • Operating Systems (Linux)
  • Software Engineering
  • Computer Architecture
  • Microprocessor & Systems Engineering
  • Networks & Security
  • Compiler Design
  • Object-Oriented Analysis & Design
  • Web Application Development
  • Mobile Application Development
  • Human–Computer Interaction




Higher Secondary School Certificate – Science

2015 – 2017

Holy Cross College, Dhaka, Bangladesh

HSC Result: 5.00 (on a scale of 5.00)





Secondary School Certificate – Science

2010 – 2014

Holy Cross School, Dhaka, Bangladesh

SSC Result: 5.00 (on a scale of 5.00)

JSC Result: 5.00 (on a scale of 5.00)

Professional Experience

Mitra US Computer Solutions

Full Stack Software Engineer
Jun 2025 – Present

Remote, USA

  • Developed an Azure-deployed C#/.NET backend and Angular (TypeScript) internal portal to centralize hiring-to-onboarding tracking across Greenhouse, WinTeam, and Checkr for HR and operations.
  • Implemented cross-system integrations from Greenhouse (GHO) into WinTeam using .NET and T-SQL, mapping recruiting and onboarding entities into ERP schemas; integrated Checkr background-check and SSN verification statuses with validation and state propagation.
  • Debugged cross-system integration failures by tracing requests and workflow state transitions and validating SQL Server data to identify schema and mapping mismatches, restoring reliable onboarding processing.

Curanostics Inc.

Back End Engineer
May 2024 – May 2025

Atlanta, Georgia, USA

  • Built and scaled patient-facing REST APIs as containerized microservices on Google Cloud, improving latency, throughput, and system reliability.
  • Delivered production clinician dashboards with a strong focus on performance, resilience, and user experience.
  • Established CI/CD pipelines and monitoring to improve deployment frequency and reduce incident recovery time.
  • Integrated a production-ready RAG-based medical summarization and Q&A service with observability and failure handling.

SELISE Digital Platforms

Data Engineer
Dec 2021 – Mar 2023

Dhaka, Bangladesh

  • Designed and optimized SQL and NoSQL data stores supporting high-throughput, low-latency production workloads.
  • Built and automated large-scale Spark and Airflow pipelines on AWS for data ingestion and analytics.
  • Containerized ETL and ML services and deployed them on Kubernetes with autoscaling and rollback strategies.
  • Collaborated with product and analytics teams to deliver BI and ML-backed features used across multiple teams.

Teletalk Bangladesh Ltd.

Software Engineer Intern
Feb 2021 – Apr 2021

Dhaka, Bangladesh

  • Engineered backend services for a nationwide, high-traffic exam results platform serving millions of users.
  • Implemented performant search and ingestion pipelines using Java, Spring Boot, Redis, and MySQL.
  • Diagnosed and resolved production issues related to concurrency, caching, and edge-case handling.

Handshake AI Solutions

AI Research Fellow
Aug 2025 – Present

Remote, USA

  • Evaluated large-scale AI agent reasoning traces to identify reproducibility, fairness, and reliability failures.
  • Designed realistic failure and adversarial scenarios to stress-test LLM evaluation pipelines.
  • Contributed to improving model reliability by exposing high-severity integrity breakdowns missed by standard benchmarks.

Virginia Tech

Machine Learning Engineer – Robotics & IoT
Jun 2025 – Present

Blacksburg, Virginia, USA

  • Built and evaluated reinforcement learning agents with reliability-focused success and retry metrics.
  • Developed privacy-preserving sensing pipelines using mmWave radar data for indoor fall detection.
  • Productionized ML workflows on AWS SageMaker with automated retraining and low-latency inference.

Virginia Tech

Graduate Research Assistant
Aug 2024 – May 2025

Virginia, USA

  • Built data pipelines and ML services for large-scale cyber-fraud and social scam analysis.
  • Combined graph-based models and NLP pipelines to improve forecasting and explainability.
  • Designed visualization systems to surface risk drivers and support decision-making.

Projects

Job AI Copilot

Job Application AI Copilot is a production-focused AI platform that helps candidates tailor resumes and reason about job fit through explainable, role-specific feedback. It uses agentic AI pipelines to parse job descriptions, analyze resumes, and surface skill gaps while preserving factual accuracy. The system is built on stateless APIs and serverless infrastructure to support concurrent usage and burst workloads. AI components are treated as modular system parts with bounded reasoning, prompt versioning, and failure-aware workflows. This project showcases applied AI engineered as a reliable backend system, not a one-off LLM demo. Code

Paper to Code

Paper to Code is an AI-powered tool that converts algorithm descriptions from research papers into executable code using structured LLM reasoning. It ingests PDFs, extracts relevant algorithmic content, and translates it into clear, runnable implementations. The system treats LLMs as reasoning engines, preserving algorithmic intent while avoiding unsupported or hallucinated steps. Its architecture emphasizes simplicity, statelessness, and reproducibility over black-box generation. This project highlights how applied AI can accelerate the path from research papers to working code for developers and researchers. Code

Predictis

Predictis is a system consisting of a wearable device and a mobile application that lets users easily check their future risk of cardiovascular diseases (CVDs). It uses Internet of Things (IoT) and Machine Learning (ML) techniques to classify users into three risk levels (Green, Yellow, and Red zones) with an F1 score of 80.4%, and into two risk levels (presence or absence of CVD) with an F1 score of 91%. A stacking classifier combining the best-performing ML algorithms predicts the risk levels of end users using the UCI Repository dataset. The system was also evaluated for effectiveness, efficiency, and user satisfaction, with satisfactory results in each. Code

Neural-Network Design

A Python-based repository focused on key Machine Learning and Time Series Analysis algorithms. It includes implementations of ARMA model simulations, neural network training algorithms such as backpropagation, and hybrid models like Radial Basis Function (RBF) networks with backpropagation. The project covers foundational concepts like ARMA model identification using GPAC and classic Perceptron learning for binary classification. Code

AWS Lambda Functions

This project contains AWS Lambda functions that serve as the backend for a MySQL-supported web application. The functions handle various REST API endpoints, such as fetching, adding, and managing categories and books in the database. All APIs are exposed via Amazon API Gateway and return data in JSON format. The database is hosted on AWS RDS with schema details provided in the project. The architecture includes IAM roles, CloudWatch logging, and reusable Lambda layers for efficient resource management. Code

Opinion Mining of Tweets Related to Mental Health

An exploratory data analysis project that tracks public opinion on common mental health problems such as anxiety, depression, OCD, schizophrenia, and PTSD through time-series analysis of Twitter data. The data were collected from Twitter using snscrape and annotated into three classes (positive, negative, and neutral) using the Valence Aware Dictionary and sEntiment Reasoner (VADER). The analyzer model, a Recurrent Neural Network with an LSTM architecture, achieves an accuracy of 85%. Code

Body Signal Analysis of Smokers and Drinkers

This project encompasses a thorough analysis of machine learning methodologies applied to the Smoking and Drinking Dataset with Body Signal. The project delves into regression analysis, classification, clustering, and association rule mining, providing valuable insights and recommendations for future work.

Code

Malicious URL Detection

The project uses machine learning algorithms to detect and prevent malicious URLs, aiming to enhance online security and protect user data. The work includes developing and evaluating several machine learning models, including Decision Trees, Random Forest, AdaBoost, KNN, SGD, ExtraTree, SVC, Gaussian NB, and 1D-CNN, for their effectiveness in identifying malicious URLs.

Code

PPD Coach

PPD Coach is an application for detecting Postpartum Depression in Bangladeshi mothers using visual questionnaires. The study focuses on developing scenario-based visual representations of the EPDS questionnaire as 3D animation videos and integrating them into an Android application to make the system available to end users. Code

PaperTown

PaperTown is a bookstore web application built with a React client, a Tomcat server, and a MySQL database. The architecture is designed with accessibility, performance, and scalability in mind. The prototype pairs a single-page client written in React and TypeScript with a monolithic server exposing a RESTful API. The goal is a web application that can handle a large amount of traffic while providing a good user experience. Technologies used: MySQL, React, TypeScript, JavaScript, CSS, JDBC, REST API, Java. Code

ASMA

A system for managing a hair salon. It tracks employee salaries, inventory, and customer data, and handles billing, covering the salon's entire day-to-day operation in a single system. It maintains stock records for incoming inventory, manages individual employee salaries, and stores customer billing data. Technologies used: Oracle, PHP, HTML, JavaScript. Code

Gas Leakage Detector

An IoT-based system that detects gas leaks, sends an alert message via SMS to the user's cellphone, and triggers an alarm. Code

JAM KOM – A DEEP LEARNING BASED TRAFFIC UPDATE SYSTEM

A concept for a mobile application called Best Route Analyzer, which suggests the route with the minimum waiting time once the source and destination areas are given.

SeekNShare

A website built with HTML, CSS, and JavaScript, with Firebase as the backend. Code

Graphics-3D & 2D

Two animated graphics projects. The 2D project uses OpenGL (C and C++). The 3D project was built in Blender, with modelling, texturing, shading, rigging of the complex mesh, and animation, using multiple light sources and camera angles.

A Game of Cyphers

A C program that applies popular encryption algorithms to encrypt a given message. Code

Research and Publication

Graduate Research Assistant

Social Cyber Vulnerability Index Project

As a Graduate Research Assistant, I am collaborating with faculty to extract high-quality data from social media platforms for detecting cyber fraud cascades. I am playing a key role in defining social cyber vulnerability metrics, which led to a 25% improvement in social scam classification accuracy. I developed advanced multimodal data analytics and algorithms, utilizing Graph Neural Networks and Reinforcement Learning, while applying cutting-edge NLP and deep learning techniques to analyze unstructured text data, ultimately reducing false positives by 20%. Additionally, I am contributing to the creation of a geospatial dashboard that maps the Social Cyber Vulnerability Index (SCVI) across different regions, allowing for the identification of high-risk areas and enabling targeted interventions.

Publication:

Shutonu Mitra, Qi Zhang, Hemant Purohit, Chang-Tien Lu, and Jin-Hee Cho, "Towards Inclusive Cybersecurity: Protecting the Vulnerable with Social Cyber Vulnerability Metrics," The Sixth IEEE International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications, 2024. (Accepted)

Undergrad Thesis

Detection and Prevention of Cyberbullying from Social Media by Exploring Machine Learning Algorithms

This study introduces a framework for preventing cyberbullying on social media by detecting bullying text posts using sentiment analysis and bullying features, exploring several ML algorithms. The dataset corpora extracted from Twitter are diverse in content. The taxonomy designed for bullying characteristics covers most of the grounds on which a person is bullied, such as racism, prejudice against women, disrespectful or insulting words, and aggressiveness. Embedding methods such as BoW and TF-IDF were applied to thoroughly preprocessed tweets, which were then fed to the selected classifiers: Liblinear-based Logistic Regression, Linear SVM, Multinomial Naive Bayes, Random Forest, and BiLSTM-RNN with GloVe embeddings. Several performance evaluation measures show that the Random Forest model with TF-IDF embedding performs best in both cases. The highest accuracy achieved (F1 score) is 80.8% for the bully identification model and 87.7% for the bully classification model. View Thesis Book Code

Publication:

S. Mitra, T. Tasnim, M. A. R. Islam, N. I. Khan and M. S. Majib, "A Framework to Detect and Prevent Cyberbullying from Social Media by Exploring Machine Learning Algorithms," 2021 International Conference on Computer, Communication, Chemical, Materials and Electronic Engineering (IC4ME2), 2021, pp. 1-4, doi: 10.1109/IC4ME253898.2021.9768450. View paper

Research Article

Predictis: An IoT and Machine Learning-based System to Predict Risk Level of Cardio-Vascular Diseases

This study proposed a system consisting of a wearable device and a mobile application that lets users learn their future risk of CVDs. Internet of Things (IoT) and Machine Learning (ML) techniques were used to develop a system that classifies users into three risk levels (high, moderate, and low risk of CVD) with an F1 score of 80.4%, and into two risk levels (high and low risk of CVD) with an F1 score of 91%. A stacking classifier combining the best-performing ML algorithms predicts the risk levels of end users using the UCI Repository dataset. The resulting system lets users check and monitor their likelihood of developing CVD in the near future using real-time data. The system was also evaluated from a Human-Computer Interaction (HCI) point of view. It thus offers a promising solution for the biomedical sector. Code

Publication:

Islam, M.N., Raiyan, K.R., Mitra, S. et al. Predictis: an IoT and machine learning-based system to predict risk level of cardio-vascular diseases. BMC Health Serv Res 23, 171 (2023). https://doi.org/10.1186/s12913-023-09104-4
View paper

Ongoing Research

[1] "PPD COACH: An Application for Detecting Postpartum Depression in Bangladeshi Mothers using Visual Questionnaires." Ongoing Research (2023 – Present)

Contact