I study computer science at Cornell University. My research interests are in decision-making (reinforcement learning, bandits) and generative modeling (diffusion models, LLMs). I am fortunate to work with Professors Wen Sun, Robert Kleinberg, and Kianté Brantley.
Currently, I am a research scientist intern at Databricks working on deep research. Previously, I was a research intern at NVIDIA and a software engineering intern at DRW.
Outside of research, I enjoy mathematics, art, music, literature, and drone photography. A picture of me can be found here.
ASCII image of the Humble Administrator's Garden, Suzhou, China
Publications
See my Google Scholar for the most up-to-date list.
We present a system for training enterprise search agents via reinforcement learning that achieves state-of-the-art performance across a diverse suite of hard-to-verify agentic search tasks. Our work makes four core cont…
Recent work has shown that for particular combinations of base model and training algorithm, *reinforcement learning with random rewards* (RLRR) improves the performance of LLMs on certain math reasoning benchmarks. This…
The controllable generation of diffusion models aims to steer the model to generate samples that optimize some given objective functions. It is desirable for a variety of applications including image generation, molecule…
While originally developed for continuous control problems, Proximal Policy Optimization (PPO) has emerged as the workhorse of a variety of reinforcement learning (RL) applications, including the fine-tuning of generati…