Yuan Si

AI4SE Researcher · University of Waterloo

Math / Data Science · Combinatorics & Optimization · Pure Math

About

I am a research assistant at the University of Waterloo. My primary research uses large language models and multimodal signals (video, execution traces, code) to automatically debug, tutor, and evaluate block-based programs. This work has produced four first-authored papers, two of which have been accepted (at FSE 2026 and ISSTA 2026). On the side, I also work on analytic number theory and probability: a paper on the Collatz affine random model is under review at Forum of Mathematics, Sigma, and a paper on Guy's four-corner rational distance problem (D19) is under review at the Journal of Number Theory. Previously, I worked at Microsoft Azure & AI Research.

Research Interests

LLM-based Program Repair · Multimodal Debugging · AI for Computing Education · Evaluation of LLMs · Analytic Number Theory · Probability Theory

Publications

Conference Papers
ViScratch: Using Large Language Models and Gameplay Videos for Automated Feedback in Scratch
Yuan Si, Daming Li, Hanyuan Shi, Jialu Zhang
ACM SIGSOFT International Conference on the Foundations of Software Engineering (FSE) 2026 · Accepted
ScratchEval: A Multimodal Evaluation Framework for LLMs in Block-Based Programming
Yuan Si, Simeng Han, Daming Li, Hanyuan Shi, Jialu Zhang
ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA) 2026 · Accepted
Journal Submissions
A Microcanonical Phase Transition for the Collatz Affine Random Model
Yuan Si
Submitted to Forum of Mathematics, Sigma, 2026 · Under Review. Sharp resonant phase transition at the entropy line for Tao's Syracuse affine model, reducing hard-frequency mixing to a primitive ternary Bernoulli-bridge transform.
Mixed Parity, Diagonal Denominator, and the Pell-Chord Genus-Five Obstruction for the Four-Corner Rational Distance Problem
Yuan Si
Submitted to Journal of Number Theory, 2026 · Under Review. Unconditional necessary conditions on Guy's problem D19 (Pillai's unit-square four-distance) and a reduction of the residual obstruction to a non-fixed Pell-chord involution on a family of arithmetic-genus-five curves.
Preprints
EcoScratch: Cost-Effective Multimodal Repair for Scratch Using Execution Feedback
Yuan Si, Ming Wang, Daming Li, Hanyuan Shi, Jialu Zhang
Preprint, 2026 · Under Review
Stitch: Step-by-step LLM Guided Tutoring for Scratch
Yuan Si, Kyle Qi, Daming Li, Hanyuan Shi, Jialu Zhang
Preprint, 2025 · Under Review
Elliptic Decomposition of the Pell-Chord Genus-Five Obstruction for the Four-Corner Rational Distance Problem
Yuan Si
Preprint, 2026. Structural follow-up to Paper I: decomposes the residual genus-five curve into full-2-torsion elliptic pieces via 2-isogeny and Kani–Rosen Jacobian factorization; reduces the four-corner problem to a two-variable Pythagorean-slope exclusion.
Research Reports & Other
Multiplayer Rock-Paper-Scissors: Nash Equilibria via Linear Programming
Yuan Si
Research Report, 2025
Tesla Charging Station Optimization via Independent Dominating Sets
Yuan Si
Research Report, 2025
Public Goods Game: Cooperation Dynamics and Intervention Analysis
Yuan Si
Research Report, 2025
Textbook
A Gentle Introduction to Optimization
Yuan Si
2024 — Adopted as required reading in 3 university courses

Experience

University Researcher — AI in Multidimensional Input

July 2025 – Present
University of Waterloo
  • Led four first-authored papers on LLM-driven debugging, tutoring, and evaluation for Scratch (ViScratch accepted at FSE 2026; ScratchEval accepted at ISSTA 2026).
  • Built multimodal repair systems fusing gameplay video, execution feedback, and project JSON to localize bugs and synthesize fixes via LLM-guided loops.
  • Developed an interactive tutoring system (Stitch) and a 100-project executable benchmark (ScratchEval) for evaluating LLM repair quality on block-based code.

University Researcher — Game Theory & Graph Theory

Dec. 2024 – March 2025
University of Waterloo, Combinatorics & Optimization Department
  • Derived Nash equilibria for multiplayer Rock-Paper-Scissors via linear programming. [report]
  • Modeled optimal Tesla charging station placement using independent dominating sets. [report]
  • Analyzed cooperation dynamics in Public Goods Games. [report]
  • Investigated strategic dynamics in graph-based games (Cops and Robbers).

Microsoft Researcher — AI Trends

May 2024 – Aug. 2024
Microsoft — Azure & AI, Research
  • Investigated AI scribe technologies for healthcare; synthesized findings into research recommendations for physician workflow automation.
  • Designed and evaluated improvements to an insurance chatbot through systematic analysis of user interaction data.

Projects

ResearchOS

Local-first research decision and execution system. Turns a brief into runnable experiments, evidence-checked claims, and a packaged manuscript draft. Provider-agnostic LLM adapters (OpenAI / Anthropic / mock), dual-agent code worker, human-in-the-loop (HITL) approval gates, encrypted secret store.

Python · FastAPI · React · TypeScript · LLM · AGPL-3.0

WizardingWorld

Harry Potter Hogwarts experience mod for Terraria via tModLoader. 590 C# files, 12 multi-phase bosses, Pensieve memory replay framework, three-language localization. Built with Claude Code + Codex in a dual-agent dev workflow.

C# · .NET 8 · tModLoader · MIT

DevToolkit

43 zero-dependency single-file Python CLI tools across 9 categories, including web, data, process, MCP, security, and scaffolding. Copy any file and run.

Python · CLI · MCP