Michael Leung

Michael Min Wah Leung

Notes on post-training, sequence modelling, and the occasional brain. Code-first, results-honest, written by an ML engineer with a research background in neuroscience.

Writing

Does a single direction mediate refusal? A small reproduction

interpretability
safety
steering
transformers
A reproduction of Arditi et al. 2024 on a 1.5B chat model: ablating one difference-of-means direction lowered held-out harmful refusal where a norm-matched random direction did not, and adding it raised harmless refusal. A first model could not be tested at all, which is part of what the exercise taught me.
Jun 17, 2026
9 min

A probe at layer 0 is a lie detector for your experiment

interpretability
probing
transformers
Two linear probes on Gemma 3 1B hit 100% accuracy and taught opposite lessons. The number is not the finding; the shape of the accuracy-by-layer curve is.
Jun 8, 2026
8 min

Why SFT learned the words but GRPO learned the rules

post-training
GRPO
RLHF
LLMs
Teaching a 14B model a proprietary equipment-naming taxonomy with a hand-tuned reward function, and why ~250 lines of reward code and a quarter-epoch of GRPO closed the gap that more SFT couldn’t.
May 2, 2026
12 min

From consuming a pretrained model to training my own

seq2seq
sign-language
MLX
hybrid-inference
Building a continuous-sign-language Copilot: a Transformer Seq2Seq trained from scratch on How2Sign, two training backends, and a hybrid runtime that reaches 93.6% sentence-level recognition.
May 1, 2026
13 min

Patient-specific filters as biomarkers

neuroscience
signal-processing
interpretability
ICA, FOOOF, and CSP for noisy EEG, and what spatial filters taught me about feature extraction in transformers.
Apr 30, 2026
10 min

© 2026 Michael Leung
