Judy Stephen
I work in ML, building the inference systems and frameworks that make foundation models fast in production. I care about the whole arc: from a GPU kernel to the product someone relies on. Currently at Apple, thinking about what these systems need to get right as they become more central to how people work and create.
Designing and building serving infrastructure for foundation models on Private Cloud Compute. Designed the end-to-end pipeline for image generation, from device to compute nodes, working across framework, server, and privacy teams to make the infrastructure reusable for new clients. Took on speculative decoding on my own initiative after identifying an opportunity in research; shipped it in C++ with a 2.3x speedup. Identified a gap in serving evaluation and built an LLM-as-a-Judge check as a correctness gate for serving changes.
Built the hardware backend that became PyTorch's interface to Apple Silicon. Designed the dispatch layer routing ops to GPU, CPU, and Apple Neural Engine, and implemented operator kernels for each. Result: production Transformers, RNNs, and LSTMs running natively on device.
Compiler performance optimization in C++. 7.5x runtime improvement.
H.G. Chen, S. Jayasuriya, J. Yang, J. Stephen, et al. "ASP Vision: Optically Computing the First Layer of Convolutional Neural Networks Using Angle Sensitive Pixels." CVPR 2016.
Training language models to be honest rather than agreeable. Fine-tuned Mistral-7B-Instruct with DPO and QLoRA on 234 preference pairs generated with the Claude API; reduced the sycophancy rate from 96% to 8% on a held-out eval. TruthfulQA improved by 2%, suggesting no catastrophic forgetting.
Exploring what it takes to make a model genuinely good at conversation. Built an SFT and DPO pipeline on Qwen 2.5 7B and ran into DPO collapse (loss went to 0, TruthfulQA degraded), which I am currently investigating. The failure is informative: over-optimization on conversational preference data breaks general capabilities faster than expected.
MEng project building a hardware accelerator for convolutional neural networks in C++. Won Best AI and Machine Learning Masters Project at Cornell ECE 2017.