dylan lim
email
twitter
linkedin
hi! i'm a bs/ms student at stanford studying computer science.
i work on making ml systems fast. some recent work includes whole model kernel fusion (low latency, high throughput) and writing performant multi-gpu communication primitives.
research
we bought the whole gpu, so we're going to use the whole gpu (hazy research 2025)
look ma, no bubbles! designing low-latency megakernels (hazy research 2025)
one kernel for all your gpus (hazy research 2025)
layoutvlm: 3d scene generation (cvpr 2025)
flexflow: distributed dnn compiler (stanford 2024-25)
things i built
shard — distributed training system for local compute
flexgraph — ml model architecture generation for compiler evaluation
industry
jump trading — multi-node gpu orchestration (summer 2025)
valuenex — 10× faster graph algorithms (spring 2024)
candid — rag pipelines & ml infrastructure (2023-24)
misc
homeschooled military kid from 4 cities
carnegie hall pianist
performed at american country music awards
former american red cross national youth council lead