Interests
Artificial Intelligence
I'm passionate about creating and launching cutting-edge AI-powered applications that solve real-world challenges and drive innovation.
Machine Learning and NLP
I have research experience in Natural Language Processing, ML/DL, and Social Media Analytics.
Big Data and Cloud
I have experience with Big Data and Cloud technologies such as AWS, Spark, Kafka, and BigQuery.
Software Development
I have experience in full-stack Android and web application development, including the MERN stack.
Education
Master of Science in Computer Science and Applications
Aug 2023 - May 2025
Virginia Polytechnic Institute and State University (Virginia Tech), Virginia, USA.
Major: Data Analytics and Artificial Intelligence
CGPA: 3.83 (on a scale of 4.00)
Bachelor of Science in Computer Science and Engineering
2018 - 2022
Military Institute of Science and Technology (MIST), Dhaka, Bangladesh.
CGPA: 3.65 (on a scale of 4.00)
Higher Secondary School Certificate in Science
2015 - 2017
Holy Cross College, Dhaka, Bangladesh
HSC Result: 5.00 (on a scale of 5.00)
Secondary School Certificate in Science
2010 - 2014
Holy Cross School, Dhaka, Bangladesh
SSC Result: 5.00 (on a scale of 5.00)
JSC Result: 5.00 (on a scale of 5.00)
Experience
Curanostics Inc.
Machine Learning Engineer
Suwanee, Georgia, USA
May 2024 - Aug 2024
- Integrated AI features with REST APIs and interactive healthcare data visualizations, boosting user engagement by 25%.
- Designed scalable back-end systems, improving performance by 40% with thorough testing and debugging.
- Developed ML models for healthcare data analysis using RAG, SciBERT, BioGPT, and Med-PaLM on Google Cloud Platform.
Selise Digital Platform
Data Analytics and Data Engineering
Dhaka, Bangladesh
Dec 2021 - March 2023
- Optimized SQL/NoSQL databases, enhancing performance by 35% through query tuning and indexing.
- Automated ETL processes using Databricks, Kubernetes, and AWS (S3, EMR), reducing pipeline defects by 40%.
- Deployed ML models in production with AWS Lambda and Step Functions, improving client satisfaction by 19%.
Teletalk Bangladesh LTD
Software Engineer Intern
Dhaka, Bangladesh
February 2021 - April 2021
- Developed and optimized applications using Java, Python, and C++, improving performance through RESTful APIs and efficient data retrieval.
- Collaborated with cross-functional teams to design user-friendly interfaces using React and Angular.
- Reduced production downtime by 30% through debugging, problem-solving, and participation in Agile processes.
- Ensured robust deployment and stability through comprehensive testing across both frontend and backend environments.
Projects
Predictis
Predictis is a system consisting of a wearable device and a mobile application that lets users learn their future risk of cardiovascular diseases (CVDs) easily and efficiently. Internet of Things (IoT) and Machine Learning (ML) techniques were used to build a system that classifies users into three risk levels (Green, Yellow, and Red zones) with an F1 score of 80.4%, and into two risk levels (presence and absence of CVD) with an F1 score of 91%. A stacking classifier incorporating the best-performing ML algorithms predicts users' risk levels using the UCI Repository dataset. The system was further evaluated for effectiveness, efficiency, and user satisfaction, with satisfactory results in each. Code
Neural-Network Design
A Python-based repository focused on key Machine Learning and Time Series Analysis algorithms. It includes implementations of ARMA model simulations, neural network training algorithms such as backpropagation, and hybrid models like Radial Basis Function (RBF) networks with backpropagation. The project covers foundational concepts like ARMA model identification using GPAC and classic Perceptron learning for binary classification. Code
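As an illustration of the classic Perceptron learning mentioned above, here is a minimal sketch in plain Python. It is not taken from the repository: the function names, the learning rate, the epoch limit, and the toy logical-AND dataset are all assumptions chosen for the example.

```python
def train_perceptron(samples, labels, lr=1.0, epochs=20):
    """Perceptron rule: w <- w + lr * (y - y_hat) * x, b <- b + lr * (y - y_hat)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            y_hat = 1 if activation > 0 else 0
            update = lr * (y - y_hat)
            if update != 0:
                # Misclassified sample: move the decision boundary toward it.
                weights = [w + update * xi for w, xi in zip(weights, x)]
                bias += update
                errors += 1
        if errors == 0:
            # A full pass with no mistakes means the data is separated.
            break
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the sample falls on the positive side of the boundary."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Toy, linearly separable data (logical AND); hypothetical example input.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]
w, b = train_perceptron(X, y)
predictions = [predict(w, b, x) for x in X]
```

Because logical AND is linearly separable, the rule stops updating once every sample is classified correctly; on data that is not separable it would simply run until the epoch limit.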
AWS Lambda Functions
This project contains AWS Lambda functions that serve as the backend for a MySQL-supported web application. The functions handle various REST API endpoints, such as fetching, adding, and managing categories and books in the database. All APIs are exposed via Amazon API Gateway and return data in JSON format. The database is hosted on AWS RDS with schema details provided in the project. The architecture includes IAM roles, CloudWatch logging, and reusable Lambda layers for efficient resource management. Code
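A rough sketch of what one such handler could look like is shown below. It assumes API Gateway's Lambda proxy integration (the event carries `httpMethod`, `path`, and `body`, and the response is a dict with `statusCode`, `headers`, and a JSON string `body`). The `/categories` route, the `json_response` helper, and the in-memory `CATEGORIES` list are hypothetical stand-ins for the project's real routes and its MySQL tables on RDS.

```python
import json

# Hypothetical in-memory stand-in for the categories table hosted on RDS.
CATEGORIES = [{"id": 1, "name": "Fiction"}, {"id": 2, "name": "Science"}]


def json_response(status_code, payload):
    """Build a response in the shape the Lambda proxy integration expects."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def lambda_handler(event, context):
    """Route a proxy-integration event to a (hypothetical) categories endpoint."""
    method = event.get("httpMethod")
    path = event.get("path")

    if method == "GET" and path == "/categories":
        return json_response(200, CATEGORIES)

    if method == "POST" and path == "/categories":
        body = json.loads(event.get("body") or "{}")
        name = body.get("name")
        if not name:
            return json_response(400, {"error": "name is required"})
        new_category = {"id": len(CATEGORIES) + 1, "name": name}
        CATEGORIES.append(new_category)
        return json_response(201, new_category)

    return json_response(404, {"error": "not found"})


# Local invocation with a hand-built event; no AWS account is needed.
listing = lambda_handler({"httpMethod": "GET", "path": "/categories"}, None)
created = lambda_handler(
    {"httpMethod": "POST", "path": "/categories", "body": json.dumps({"name": "History"})},
    None,
)
```

In a deployed function the list would be replaced by queries against the database, with the connection code typically shared through a Lambda layer.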
Opinion Mining of Tweets Related to Mental Health
An Exploratory Data Analysis project that reflects public opinion on common mental health problems such as anxiety, depression, OCD, schizophrenia, and PTSD through time-series analysis of Twitter data. The data were collected from Twitter using snscrape and annotated with the Valence Aware Dictionary and sEntiment Reasoner (VADER) into three classes: positive, negative, and neutral. The analyzer model, a Recurrent Neural Network with LSTM architecture, achieves an accuracy of 85%. Code
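The three-class annotation step can be sketched as follows. The project's actual cutoffs are not stated, so this example assumes the thresholds conventionally used with VADER's compound score (0.05 and -0.05). The function name and the sample (month, score) pairs are hypothetical; in practice the scores would come from a VADER analyzer rather than being hard-coded.

```python
from collections import Counter


def label_from_compound(compound, threshold=0.05):
    """Map a compound sentiment score in [-1, 1] to one of three classes."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical (month, compound score) pairs standing in for scored tweets.
scored_tweets = [
    ("2021-01", 0.62),
    ("2021-01", -0.40),
    ("2021-02", 0.01),
    ("2021-02", -0.05),
]

# Count labels per month, the kind of table a time-series plot is built from.
monthly_counts = {}
for month, compound in scored_tweets:
    monthly_counts.setdefault(month, Counter())[label_from_compound(compound)] += 1
```

Grouping the labels by month gives the per-period class counts that a sentiment-over-time chart would visualize.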
Body Signal Analysis of Smokers and Drinkers
This project encompasses a thorough analysis of machine learning methodologies applied to the Smoking and Drinking Dataset with Body Signal. The project delves into regression analysis, classification, clustering, and association rule mining, providing valuable insights and recommendations for future work. Code
Malicious URL Detection
The project uses machine learning algorithms to detect and prevent malicious URLs, aiming to enhance online security and protect user data. The work includes developing and evaluating a range of models, including Decision Trees, Random Forest, AdaBoost, KNN, SGD, ExtraTree, SVC, Gaussian NB, and 1D-CNN, for their effectiveness in identifying malicious URLs. Code
PPD Coach
PPD Coach is an application for detecting postpartum depression in Bangladeshi mothers using visual questionnaires. The study focuses on developing scenario-based visual representations of the EPDS questionnaire as 3D animation videos and integrating them into an Android application to make the system available to end users. Code
PaperTown
PaperTown is a bookstore web application built with a React client, a Tomcat server, and a MySQL database. The architecture emphasizes accessibility, performance, and scalability. The prototype pairs a single-page client written in React and TypeScript with a monolithic server that exposes a RESTful API. The goal is a high-quality web application that can handle heavy traffic while providing a good user experience. Technologies used: MySQL, React, TypeScript, JavaScript, CSS, JDBC, REST API, Java. Code
ASMA
A management system for a hair salon. It tracks employee salaries, inventory, and customer data, and handles billing, covering the salon's entire operation in a single system. It maintains incoming inventory stock, manages individual employee salaries, and stores customer billing records and data. Technologies used: Oracle, PHP, HTML, JavaScript. Code
Gas Leakage Detector
An IoT-based system that detects gas leaks, sends an alert message via SMS to the user's cellphone, and triggers an alarm. Code
JAM KOM – A DEEP LEARNING BASED TRAFFIC UPDATE SYSTEM
This project proposes a mobile application called Best Route Analyzer that suggests the route with the minimum waiting time once the source and destination areas are given.
Graphics-3D & 2D
Two animated graphics projects. The 2D project uses OpenGL (C and C++). The 3D project was built in Blender, with modelling, texturing, shading, rigging of a complex mesh, and animation, using multiple light sources and camera angles.
Research and Publication
Graduate Research Assistant
Social Cyber Vulnerability Index Project
As a Graduate Research Assistant, I am collaborating with faculty to extract high-quality data from social media platforms for detecting cyber fraud cascades. I am playing a key role in defining social cyber vulnerability metrics, which led to a 25% improvement in social scam classification accuracy. I developed advanced multimodal data analytics and algorithms, utilizing Graph Neural Networks and Reinforcement Learning, while applying cutting-edge NLP and deep learning techniques to analyze unstructured text data, ultimately reducing false positives by 20%. Additionally, I am contributing to the creation of a geospatial dashboard that maps the Social Cyber Vulnerability Index (SCVI) across different regions, allowing for the identification of high-risk areas and enabling targeted interventions.
Publication:
Shutonu Mitra, Qi Zhang, Hemant Purohit, Chang-Tien Lu, and Jin-Hee Cho, "Towards Inclusive Cybersecurity: Protecting the Vulnerable with Social Cyber Vulnerability Metrics," The Sixth IEEE International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications, 2024. (Accepted)
Undergrad Thesis
Detection and Prevention of Cyberbullying from Social Media by Exploring Machine Learning Algorithms
This study introduces a framework for preventing cyberbullying on social media by detecting bullying text posts, using sentiment analysis and bullying features and exploring several ML algorithms. The dataset corpora, extracted from Twitter, are diverse in content. The taxonomy designed for bullying characteristics covers most of the grounds on which a person is bullied, such as racism, prejudice against women, disrespectful or insulting words, and aggressiveness. Embedding methods such as BoW and TF-IDF were applied to thoroughly preprocessed tweets before training the selected classifiers: Liblinear-based Logistic Regression, Linear SVM, Multinomial Naive Bayes, Random Forest, and BiLSTM-RNN with GloVe embeddings. Several performance measures show that the Random Forest model with TF-IDF embedding performs best in both cases. The highest accuracy achieved (F1 score) is 80.8% for the bully identification model and 87.7% for the bully classification model. View Thesis Book Code
Publication:
S. Mitra, T. Tasnim, M. A. R. Islam, N. I. Khan and M. S. Majib, "A Framework to Detect and Prevent Cyberbullying from Social Media by Exploring Machine Learning Algorithms," 2021 International Conference on Computer, Communication, Chemical, Materials and Electronic Engineering (IC4ME2), 2021, pp. 1-4, doi: 10.1109/IC4ME253898.2021.9768450. View paper
Research Article
Predictis: An IoT and Machine Learning-based System to Predict Risk Level of Cardio-Vascular Diseases
This study proposed a system consisting of a wearable device and a mobile application that lets users learn their future risk of cardiovascular diseases (CVDs). Internet of Things (IoT) and Machine Learning (ML) techniques were used to build a system that classifies users into three risk levels (high, moderate, and low risk of having CVD) with an F1 score of 80.4%, and into two risk levels (high and low risk of having CVD) with an F1 score of 91%. A stacking classifier incorporating the best-performing ML algorithms predicts users' risk levels using the UCI Repository dataset. The resulting system lets users check and monitor their likelihood of having CVD in the near future using real-time data. The system was also evaluated from a Human-Computer Interaction (HCI) point of view. The system thus offers a promising solution for the biomedical sector. Code
Publication:
Islam, M.N., Raiyan, K.R., Mitra, S. et al. Predictis: an IoT and machine learning-based system to predict risk level of cardio-vascular diseases. BMC Health Serv Res 23, 171 (2023). https://doi.org/10.1186/s12913-023-09104-4
View paper
Ongoing Research
[1] "PPD COACH: An Application for Detecting Postpartum Depression in Bangladeshi Mothers using Visual Questionnaires." Ongoing Research (2023-Present)