Blocking the Cisco Secure Client Socket Filter on macOS
The Cisco Secure Client Socket Filter installed by the Cisco VPN client on my laptop (the VPN is necessary to connect to the USC compute cluster) was causing problems: sometimes when I started up my laptop I couldn’t access the internet even when…
My research AI coding workflow
My eyes generally glaze over reading “how I use AI” posts, but enough of my colleagues have expressed interest that I think this would be worth sharing. My overall strategy for coding these days is to have OpenAI’s Codex agent write and…
Better math fonts for AI venue submissions
Most AI conference venues use Times (or Times New Roman) as the font for submissions. Unfortunately, the math in the templates is set in Computer Modern (or Latin Modern, not sure). Improve your paper’s aesthetics by using Times for the…
Some papers that caught my attention at NeurIPS 2025
Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations shows that language models are fine with however you tokenize your doc. Can we hide information in the tokenization…
Do language models lexicalize phrases?
Do language models “lexicalize” certain multi-token words or phrases (i.e., treat them as atomic units)? How would we measure lexicalization in LMs?
LoRA weight decay thoughts
Irhum makes an observation that I have thought about from time to time: LoRA-adapted model weights decay to the original model during training (not 0). How much of LoRA’s success could be attributed…
Some reading on recursive reasoning transformers
LLMs have built-in MACs
pico.sh seems like a cool, minimal effort tool for blogging, but it doesn’t support math, so I’m going to roll my own.
Bug in my torch code, random links.
Torch code bug that took me an hour to fix: I wrapped a function in @torch.inference_mode, which called another function, which called another function that was trying to call torch.backward.
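A minimal sketch of how this kind of bug can arise, assuming PyTorch is installed. The function names (`training_step`, `evaluate_and_update`, `run`) are hypothetical and only illustrate the call chain described above; the decorator is written with parentheses, which works across PyTorch versions. Inside inference mode autograd records nothing, so the nested backward call has no graph to walk and raises a RuntimeError.

```python
import torch


def training_step(weight, x):
    # Innermost function: builds a loss and tries to backpropagate.
    loss = (weight * x).sum()
    loss.backward()  # needs an autograd graph, which inference mode disables
    return loss


def evaluate_and_update(weight, x):
    # Middle function: nothing here hints that autograd is switched off.
    return training_step(weight, x)


@torch.inference_mode()
def run(weight, x):
    # Outer function: the decorator silently applies to everything it calls.
    return evaluate_and_update(weight, x)


weight = torch.ones(3, requires_grad=True)
x = torch.arange(3.0)

try:
    run(weight, x)
    failed = False
except RuntimeError as err:
    failed = True
    print("backward failed inside inference mode:", err)

# The same inner function works when called outside inference mode.
training_step(weight, x)
print(weight.grad)
```

The error surfaces two calls below the decorator, which is probably why it is easy to miss: the traceback points at the backward call, not at the wrapper that disabled autograd.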
Hello world, MathML in NetNewsWire
I created an RSS feed for my website. I did it by hand, using this template. I don’t use any frameworks for maintaining my website; I just write everything by hand. I don’t change my website often…
The "Right Way" to Ensemble Language Models
Suppose you have n language models with embedding size d, vocabulary size v, and softmax matrices W_1, W_2, …, W_n ∈ ℝ^{v × d} and you want to sample from them as an ensemble. One…
Research Interest Demo
I have created a discord server for people interested in collaborating with the lab. Email me for an invite!
Obtaining logprobs from an LLM API
Many LLM APIs give top-k logprobs in their outputs. What if we want to obtain all the logprobs? Here I present two algorithms for obtaining logprobs from an LLM API. Both of these depend on the API allowing us to add a logit bias to…
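As a rough illustration of why a logit bias is enough, here is one possible approach. It is a sketch under stated assumptions and not necessarily either of the two algorithms from the post. It assumes a hypothetical API (`mock_api_top1`, simulated locally with made-up logits) that returns only the top-1 token and its logprob after applying the bias. Biasing token i by b changes its probability from p to p·e^b / (1 − p + p·e^b), so the unbiased p can be solved for from the biased logprob.

```python
import math

# Hidden from the caller in a real setting; made-up values for this simulation.
HIDDEN_LOGITS = [2.0, 0.5, -1.0, -3.0, 0.0]


def log_softmax(logits):
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def mock_api_top1(logit_bias):
    """Hypothetical stand-in for an API call: top-1 token and its logprob."""
    biased = [x + logit_bias.get(i, 0.0) for i, x in enumerate(HIDDEN_LOGITS)]
    logprobs = log_softmax(biased)
    top = max(range(len(logprobs)), key=logprobs.__getitem__)
    return top, logprobs[top]


def recover_logprob(token, bias=10.0):
    """Recover the unbiased logprob of `token` from one biased query."""
    top, q = mock_api_top1({token: bias})
    if top != token:
        raise ValueError("bias too small to make the token top-1")
    # q = log(p e^b / (1 - p + p e^b))  =>  p = e^q / (e^b (1 - e^q) + e^q)
    return q - math.log(math.exp(bias) * -math.expm1(q) + math.exp(q))


recovered = [recover_logprob(t) for t in range(len(HIDDEN_LOGITS))]
expected = log_softmax(HIDDEN_LOGITS)
for t, (r, e) in enumerate(zip(recovered, expected)):
    print(t, round(r, 6), round(e, 6))
```

The bias has to be large enough to push the token into the top-1 slot, but a very large bias makes the biased logprob so close to 0 that floating-point precision is lost, so the value 10.0 here is only a choice that suits this toy example.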
A differentiable function from binary integer to one-hot representations
I would like to define a differentiable function f : {0, 1}^{log v} → {0, 1}^v that converts binary number representations of log v bits into one-hot vectors. This can be accomplished by using fuzzy logic operators to…
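One possible construction, sketched here under assumptions that may differ from the post's own: use the product as fuzzy AND and 1 − x as fuzzy NOT, so that output entry j is the AND over all bit positions of "input bit i matches bit i of j". The function name `binary_to_one_hot` and the most-significant-bit-first ordering are choices made for this example. Every output is a polynomial in the inputs, so the map is differentiable, and on exact 0/1 inputs it returns an exact one-hot vector.

```python
def binary_to_one_hot(bits):
    """Map soft bits in [0, 1] (most significant first) to a soft one-hot vector.

    Uses product as fuzzy AND and (1 - x) as fuzzy NOT; this is an
    illustrative choice of operators, not necessarily the post's.
    """
    n = len(bits)
    out = []
    for j in range(2 ** n):
        value = 1.0
        for i, b in enumerate(bits):
            bit_of_j = (j >> (n - 1 - i)) & 1
            # Contribute b if position i of j is 1, else NOT b.
            value *= b if bit_of_j else (1.0 - b)
        out.append(value)
    return out


hard = binary_to_one_hot([1.0, 0.0, 1.0])  # binary 101 is index 5
soft = binary_to_one_hot([0.5, 0.5, 0.5])  # maximally uncertain bits
print(hard)
print(soft)
```

With these operators the outputs always sum to 1, so soft inputs give a probability distribution over the v = 2^n positions rather than an arbitrary vector.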
Deep BA Sampling
TL;DR: we can use any intermediate LM representation to prove that a subset of next-token candidates have non-zero probability.
Heavy tails and diversity in model distributions
Direct sampling from model output distributions often gives incoherent outputs. Some have attributed this to a heavy tail, i.e., the model assigns too much probability to low-probability tokens. My goal is to test this hypothesis.
Visualizations
These are some visualizations I have made over the years, both for academia and for fun!
The Softmax Function is Linear
I asked my Twitter followers if they knew that the softmax function is linear. The result was disbelief.
ss.py — a CLI wrapper for the Semantic Scholar API
ss.py is my personal command line tool for searching and
citing academic papers via the Semantic Scholar API. About page. GitHub.
Putting word count in Vim statusline for LaTeX files
This ftplugin updates the word count in the statusline on every save. More frequent updates slow Vim down and cause random rendering problems.
Configuring Zathura
I finally got Zathura (the PDF viewer) configured the way I want it on macOS. I installed it using Homebrew following these instructions. I set up an Automator script to launch Zathura for me…
Washington State
I learned some Blender and used some open source elevation data to make a nice-looking relief map of Washington State. Check out how I made it here. 📍
Camping in Cottonwood Wash
This last weekend, Caitlyn and I backpacked up Cottonwood Wash in Utah’s San Rafael Swell, a beautiful canyon all to ourselves. We even spotted some petroglyphs. ⛰️
Course notes from CS183
These are my notes from the course CS183: Foundations of Machine Learning. They are imperfect and incomplete but I really enjoyed making them. If you would like to make edits…