Hi, I'm Abhijeet Singh.

Self-driven, quick starter, passionate Data Engineer & ML Developer with a curious mind who enjoys solving complex and challenging real-world problems.

About

Experienced Data Engineer and ML Developer with 7+ years of research and industry experience in designing, building, and scaling end-to-end data and machine learning solutions. Proficient in developing and deploying robust ELT/ETL pipelines, scalable data models, and cloud-based data warehouses on GCP and Azure. Skilled in architecting data workflows for batch and streaming use cases, using tools such as BigQuery, Dataflow, Cloud Composer, Data Factory, and PySpark to process millions of records daily. Specialized in building predictive models and classification systems using NLP, LLMs, and statistical methods. Experienced with end-to-end ML lifecycle management (feature engineering, model training, evaluation, deployment, monitoring, and retraining) using tools like Vertex AI, Azure ML, MLflow, and TensorFlow. Strong advocate of MLOps best practices for production-grade AI systems, with hands-on experience in version control, CI/CD automation, model governance, and infrastructure provisioning using Terraform and Git. Passionate about transforming raw data into actionable insights and intelligent products. Adept at cross-functional collaboration, bridging the gap between engineering, data science, and business to deliver high-impact, scalable solutions.

  • Languages: Python, SQL, C++, JavaScript, MATLAB, PHP, HTML/CSS
  • Tools: GCP, Azure, Linux, Bash, Git, Terraform, Docker, ROS, Cognos, Apache Airflow
  • NLP: NLTK, spaCy, Gensim, Hugging Face, Stanza, LangChain, DeepSpeed, PaddlePaddle
  • Databases: MySQL, Microsoft SQL Server, SQLite, PostgreSQL
  • Libraries: TensorFlow, PyTorch, Keras, scikit-learn, OpenCV, YOLO, Pandas, NumPy
  • Soft Skills: Teamwork, Leadership, Communication, Work Ethic, Time Management, Creativity

I am seeking a challenging position that will enable me to combine my skills in Data Engineering and ML Development, while fostering professional growth, engaging experiences, and personal development.

Experience

Data Engineer II
  • Led design and automation of 10+ data pipelines on Azure ingesting millions of records daily from SAP and legacy systems into Snowflake for market segmentation. Implemented CI/CD with Azure DevOps for streamlined deployments, and used ADF, Azure Functions, and ELT best practices to deliver scalable, cost-efficient solutions (40% faster queries & 99.9% uptime)
  • Designed scalable, optimized data models and built robust backend APIs to manage millions of sales-representative records. Orchestrated end-to-end data pipelines on Azure using Data Factory, Azure SQL, and Azure Functions, delivering high-performance, maintainable solutions aligned with enterprise data architecture standards
  • Built and orchestrated telecom order processing pipelines on Azure using Data Factory, Event Hubs, Azure Databricks, and Synapse. Engineered scalable ETL workflows to support machine learning-driven customer segmentation, enabling efficient data ingestion, transformation, and model scoring with 88%+ accuracy. Ensured seamless integration across services with CI/CD and monitoring
  • Designed and automated end-to-end data pipelines on GCP using Cloud Composer (Airflow), PySpark on Dataproc, and BigQuery to process 200K+ timesheet records daily. Optimized data reconciliation workflows, reducing errors by 30% and improving performance by 40%, while ensuring scalability, reliability, and cost-effective processing
Jan. 2024 - Present | Toronto, ON, Canada
Data Engineer I
  • Designed and deployed a streaming fraud detection pipeline on GCP using Pub/Sub, Dataflow, BigQuery, and Vertex AI. Enabled low-latency model inference with neural networks, achieving a 92% F1 score and $1M+ annual fraud savings. Built with CI/CD, monitoring, and data quality checks for scalability and reliability
  • Built and orchestrated 9 batch prediction pipelines on GCP for customer churn using Cloud Composer, Pub/Sub, BigQuery, and Vertex AI. Automated data ingestion, transformation, and model deployment for scalable, repeatable batch scoring
  • Built an automated machine learning pipeline for finance revenue forecasting on Alteryx, reducing forecast error (MAPE) by 20% and manual effort by 40%
  • Gained a deep understanding of business value and the agile delivery model through close collaboration with clients and the business team
Jan. 2021 - Dec. 2023 | Halifax, NS, Canada
Data Scientist and ML Researcher
  • Developed an NLP-driven binary classification model using TF-IDF and Word2Vec to identify duplicate Quora questions, achieving 84% precision and 90% recall
  • Designed and implemented a sentiment analysis pipeline for online reviews using spaCy-based feature engineering and cross-validation, achieving a 90.08% F1 score with an SVM model
  • Developed a robust malware detection model using advanced feature engineering on ASM and byte files, achieving a multiclass log loss of 0.01 with XGBoost
  • Developed a pickup density forecasting model using clustering for regional segmentation and Fourier features for temporal patterns, achieving 10.16% MAPE with XGBoost
Jan. 2020 - Jan. 2021 | Halifax, NS, Canada
Logistics Data Analyst
  • Handled data cleaning, missing value treatment, and preparation for analysis with full process documentation to support logistics insights
  • Created visualizations using Python and Tableau to present user stories and drive data-informed decisions in logistics operations
Sep. 2018 - July 2019 | India

Skills

Languages

Python
C++
C
PHP
JavaScript
HTML

Tools

Azure Cloud
Google Cloud
Terraform
Linux
Git
Docker

Databases

MySQL
Microsoft SQL Server
SQLite
PostgreSQL

NLP

NLTK
spaCy
Gensim
Hugging Face
Stanza
TextBlob

Libraries

TensorFlow
PyTorch
Keras
scikit-learn
NumPy
Pandas

Visualizations

Streamlit
Tableau
Matplotlib
Seaborn
Plotly
Microsoft Power BI

Soft Skills

Communication
Teamwork
Leadership
Work Ethic
Time Management

Education

Dalhousie University

Halifax, NS, Canada

Degree: Master of Science in Computer Science (Ecommerce)

APJ Abdul Kalam Technical University

Lucknow, India

Degree: B.Tech in Computer Science & Engineering

Contact