>> SYSTEM READY
ENGINEERING INTELLIGENCE.
RESEARCHING MULTIMODALITY.
BUILDING AGENTS.

CURRENT_STATUS●

Research Assistant @ Vector Institute / UHN WangLab. Foundation Models, Multimodality & Medical AI.

SHUOLIN (LEO) YIN

BASED IN TORONTO, CANADA
EST. 2023


EXP.

PROFESSIONAL HISTORY
& RESEARCH

Orchestrate large-scale experiments on HPC clusters with SLURM; co-author manuscripts under review at Medical Image Analysis, Nature Communications, and npj Precision Oncology.
Pretrain and finetune DINOv3 on 300M medical images using H100 GPUs with PyTorch DDP, achieving 10–20% improvements over SOTA.
Lead the MICCAI FLARE 2025 technical committee in designing evaluation metrics; release 3 fine-tuned Qwen3/2.5-VL and MedGemma models.
Present a self-supervised pathology foundation model at the NeurIPS 2025 SLCPFM Competition, outperforming SOTA across 5 downstream tasks and 3 cancer types.
Publish a paper on medical vision-language models in the MICCAI Educational Challenge (top 5), released as an open-source end-to-end training cookbook (200+ GitHub stars).
Collaborate cross-functionally with clinicians, translating medical domain knowledge into ML architectures tracked with Weights & Biases.

PyTorch DDP · DINOv3 · H100 / SLURM · Qwen3-VL · MedGemma · Weights & Biases

Built a multi-agent system for outbound sales ops covering discovery, evidence retrieval, outreach generation, and CRM follow-up with LangGraph and MCP.
Implemented citation-first RAG over FDA, CE/MDR references, ARTG, and internal documents, returning sourced excerpts for agent review and audit.
Distilled a task-specific SLM from Qwen3-32B, cutting inference cost by 7× while matching quality within 2% on automated + human evals (n=1,500).
Shipped a CRM automation agent with an MCP layer coordinating workflow across discovery, compliance, and outreach.
Supported pilots across 9 countries by shipping enrichment and retrieval features, reducing research time by 70% and lifting sales-team throughput.
Built an eval test set and guardrails; tracked retrieval hit rate and citation coverage, cutting unsupported claims by 30% via logging and error analysis.

LangGraph · MCP · Citation-first RAG · SLM Distillation · MLOps
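The two tracking metrics named above can be illustrated with a small sketch. It assumes that retrieval hit rate means the share of queries with at least one relevant document among the retrieved results, and that citation coverage means the share of generated claims carrying at least one citation. Both definitions, the function names, and the record layout are illustrative assumptions rather than the production definitions.

```python
def retrieval_hit_rate(retrieved, relevant):
    """Share of queries whose retrieved ids include at least one relevant id.

    `retrieved` and `relevant` are parallel lists (one entry per query),
    each entry being a list of document ids. Hypothetical layout.
    """
    if not retrieved:
        return 0.0
    hits = sum(1 for got, want in zip(retrieved, relevant) if set(got) & set(want))
    return hits / len(retrieved)


def citation_coverage(claims):
    """Share of claims that carry at least one citation.

    Each claim is a dict with a "citations" list. Hypothetical layout.
    """
    if not claims:
        return 0.0
    cited = sum(1 for claim in claims if claim["citations"])
    return cited / len(claims)
```

Logging both numbers per evaluation run makes it possible to tell whether an unsupported claim came from a retrieval miss or from generation ignoring the retrieved sources.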

Built HumanOS, a closed-loop autonomous agent that perceives, plans, intervenes, verifies, and learns across HealthKit, camera, and behavioural telemetry.
Implemented a tiered long-term memory system using Multi-Query RAG with reciprocal rank fusion (RRF) over pgvector (1536-d) to drive persistent personalization.
Built conversational agent flows with OpenAI Agents SDK and LangChain; reached an average post-session user satisfaction of 9/10 across 500 beta users.
Shipped agent-driven interventions spanning VoIP, iOS FamilyControls, pledge gates, and multimodal verification, reducing average task delay by 40%.
Deployed behavioural analytics with TensorFlow.js and Supabase to track engagement metrics, enabling proactive, data-driven intervention.
Productionized infra on Supabase: PostgreSQL + pgvector with RLS, Deno edge functions, pg_cron jobs, and JS bridge handlers across native and React.

OpenAI Agents SDK · LangChain · pgvector · Supabase · TensorFlow.js · iOS / React Native
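The fusion step in the multi-query retrieval bullet above can be sketched as follows. This assumes the fusion is reciprocal rank fusion (RRF) applied to the ranked results returned for several rewrites of one query. The function name, the constant k=60, and the sample ids are illustrative assumptions, not the HumanOS implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (assumption).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for three rewrites of the same user query.
fused = reciprocal_rank_fusion([
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
])
print(fused)
```

RRF uses only rank positions, so results from different query rewrites can be merged without calibrating their raw similarity scores against each other.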

Synthesized medical MLLM literature through systematic review and experimental methodology toward a TPAMI journal submission.
Constructed an end-to-end data pipeline for a multimodal evaluation benchmark with automated quality checks, reducing process time by 75%.
Developed a PyTorch multimodal evaluation framework assessing VLM accuracy, robustness, and cross-modal consistency, tracked with MLflow.
Coordinated interdisciplinary collaboration between medical experts and researchers, designing annotation protocols and improving accuracy by 40%.
Automated medical image labelling workflows in Python with NumPy and Pandas, systematically reducing validation effort.
Deployed evaluation infrastructure on GCP Compute Engine with Docker, enabling reproducibility across teams.

Medical MLLMs · PyTorch · MLflow · GCP Compute Engine · Docker

Engineered a two-tower recommendation system in TensorFlow for volunteer matching, improving match acceptance by 85% on historical placements.
Built a real-time analytics dashboard with Firebase and TensorFlow.js, deploying engagement prediction models for churn analysis.
Developed a full-stack platform with React Native and React.js using a CI/CD pipeline for consistent, robust application builds.
Scaled the non-profit to 50+ members, 100+ partner organizations, and 500+ monthly active users through a four-tier management system.

TensorFlow · TensorFlow.js · Firebase / Firestore · React Native · React.js · CI/CD

PUBS.

PUBLICATIONS
& PRESENTATIONS

01 · First Author

From Concept to Code: A General Framework for Building a Medical Vision-Language Model

Open-source end-to-end training cookbook for practitioners building medical vision-language models — from data curation to deployment (200+ GitHub stars).

MICCAI Educational Challenge

2025 · Challenge

Top 5

02 · First Author

Adapting Pathology Foundation Models for Prediction of Homologous Recombination Deficiency Status in High-Grade Serous Carcinomas Using Whole-Slide Images

Adapting pathology foundation models to predict HRD status from whole-slide images of high-grade serous carcinomas.

npj Precision Oncology

2026 · Journal

Under Review

03 · Co-author

Self-Configuring Multi-Task Learning for Joint 3D Biomedical Image Segmentation and Classification

Medical Image Analysis

2026 · Journal

Under Review

04 · Second Author

Benchmarking and Adapting On-Device Large Language Models for Clinical Decision Support

Nature Communications

2026 · Journal

Under Review

05 · Presenter

Pathology-Native Self-Supervision at Scale: DINOv3 for Cancer Foundation Models

Outperformed CONCHv1.5 and UNIv2 across 5 downstream pathology tasks and 3 cancer types.

NeurIPS SLCPFM Competition

2025 · Workshop

Presented

06 · Organizer

FLARE — Fast, Low-resource, Accurate, Robust, and Effectual Medical Image Analysis Challenge

Technical committee — designed evaluation metrics and released 3 fine-tuned Qwen3/2.5-VL and MedGemma models.

MICCAI

2025 · Challenge

Organizer

6 ITEMS · MEDICAL AI · MULTIMODALITY · FOUNDATION MODELS · ONGOING RESEARCH

PROJECTS

SELECTED BUILDS

AI AGENTS
FULLSTACK
RESEARCH

React Native · YOLOv11 · RAG / LangChain

Echo – AI-Powered Sustainable Fashion Marketplace

Lead Engineer // Mar 2025

Won 1st place in Canada and 2nd globally at BCG & Global Spark Hack the Globe, building an AI marketplace inside a 48-hour window.

OpenAI Agents SDK · MCP · RAG

EZ-Career – Autonomous AI Job Application Agent

AI & Full Stack Engineer // Apr 2024 – Jun 2024

Designed a multi-agent system on the OpenAI Agents SDK using an Agent-as-Tool architecture to orchestrate end-to-end job-application automation.

React Native · LangChain · AWS Lambda

YiXing – AI-Driven Personalized Travel Planner

Project Lead / Co-Founder // Jun 2023 – May 2024

Led development of an AI travel planner using fine-tuned GPT models with LangChain-powered RAG for personalized itineraries.

React Native · FastAPI · Few-shot Learning

ReassurED – LLM-Powered Emergency Care Navigator

Project Lead // Sep 2024

Led a 24-hour hackathon build of a cross-platform emergency healthcare app using React Native (Expo) with real-time Firestore hospital data.

TECHNOLOGIES

LIVE

Python · Java · C/C++ · JavaScript · TypeScript · SQL · Bash · R · MATLAB · PyTorch · TensorFlow · JAX · Transformers · LangChain · OpenCV · YOLO · scikit-learn · RAG · React/React Native · Next.js · Node.js · Express · FastAPI · Flask · PostgreSQL · NoSQL · Firebase · Supabase · RESTful APIs · CI/CD · GCP · Compute Engine · TPU · Firestore · Cloud Run · Vertex AI · AWS · EC2 · S3 · SageMaker · DynamoDB · Bedrock · Azure ML · Multi-Agent Systems · OpenAI Agents SDK · Model Context Protocol (MCP) · Agent Orchestration · Docker · Kubernetes · SLURM · Multi-GPU Training (DDP, FSDP) · Distributed Systems · Weights & Biases · MLflow


EXP.

Research Assistant

Lead ML & Full-Stack Engineer

AI & Full-Stack Engineer, Founding Team

AI Research Assistant

President / Founder
