From Design to Results

I’m Joshua Orfin, a Senior Data Scientist and ML Engineer with 6+ years of experience in the field

I am a technical professional with experience in designing, implementing, and scaling production-ready ML solutions using Data Science and MLOps best practices. I have worked with multiple large companies across various industries, helping them bring their solutions to production and achieve the desired results.

Email

joshua.orfin@gmail.com

Location

Montréal, Québec, Canada

Education

Bachelor of International Studies – Economics

2012 – 2016

Université de Montréal

M. Sc. Business Analytics – Data Science

2017 – 2019

HEC Montréal

Experience

Managing Data Scientist & ML Engineer Consultant at IBM

January 2023 – Present

I lead technical initiatives and projects while coaching an amazing group of ML Engineers.

In short, I:

– Design, develop, and deliver many large-scale AI solutions (LLM, computer vision, ETL, traditional Data Science techniques, etc.) for various industry clients across Canada.

– Consistently dedicate time to building in-house accelerators that are now used across IBM for client project deliveries, of which the most ambitious and successful has been Falcon MLOps.

Senior Data Scientist & ML Engineer Consultant at IBM

June 2021 – December 2022

In addition to project deliveries, I began leading the AI & Analytics department Incubator: the R&D team responsible for developing strategic platforms and asset accelerators.

Advanced Analytics Consultant at IBM

May 2019 – May 2021

I began my IBM journey through the associates program: a two-year program designed to provide a broad spectrum of structured training and practical experience for graduate hires.

Sample Projects [1]

Senior ML Engineer at a Top 10 Global Airline Company

May 2023 – Present

Passenger Demand
Predict passenger demand between roughly 70k city pairs across the global flight network.

I developed the end-to-end operationalization of the AI solution, deployed all models, and put the MLOps lifecycle in place, including monitoring. This project was particularly demanding because of the sheer volume of data processed (over 800M rows of demand history with 300+ features).

This solution produces accurate demand predictions, which enable better-informed decisions for operations, pricing, and resource allocation planning.

IT Chatbot
Build an IT chatbot that answers common employee IT-related questions.

I designed and implemented the chatbot based on the RAG framework.

The estimated cost savings from this agent are over $140K per month for a single use case alone.

Data Scientist and ML Engineer at a Top 10 Global Public Pension Fund

June 2021 – December 2022

Surprise Prediction
Predict the Wall Street Surprise indicator using time-series and forest-based models on Databricks.

I developed and operationalized a predictive engine to feed models used by fund managers before making major investment decisions on behalf of pension fund contributors.

The dataset was also very large: Wall Street estimates the surprise each quarter, with a one-year forecast, for each ticker.

Data Scientist and Developer at a Top 5 North American Railway Company

May 2019 – May 2021

Vision Inspection
Automate inspections using computer vision (deep learning) models.

I built and deployed over 11 computer vision and 6 Machine Learning models that detect and classify defective parts on train cars.

This helps operators monitor car safety and request manual inspections at the right time.

Tools, Languages, Concepts

Relevant Tools [2]

Docker, Kubernetes, Git, Cloud (Azure & AWS), Databricks (PySpark), ETL Orchestration (e.g. Airflow), Python libraries (e.g. Pandas, NumPy, Semantic Kernel, AutoGen, XGBoost, PyTorch…), Monitoring (e.g. Arize, Grafana).

Languages

Python, SQL, R and SAS.

Concepts

CI/CD, GenAI, AI, Machine Learning, Data Science, MLOps and LLMOps.

  1. For specifics on the tools and technologies used, please reach out to me privately.
  2. I have listed the most relevant tools. I’ve had the privilege of working with many more than those highlighted here.