Research Scientist / Postdoc · Duke ECE PhD (May 2026)
Research Scientist / Postdoc — Reliability · Continual Learning · Memory
Almost everything I've published is one question in different costumes: how should a learning system manage its own knowledge — what to keep, what to merge, what to discard, and when to admit it doesn't know? I call it epistemic self-modeling, and it runs from credit-risk interpretability to agent memory.
The thread
I didn't set out with a grand thesis. I kept returning to the same question and can only now name it. The two pillars usually pitched as separate interests, agent memory and LLM reliability, are one problem seen from opposite ends.
A reliable agent has to know what it knows. A good memory is what it knows. SAGE governs what gets into the knowledge store; the reliability work governs what comes out as an answer. Both are acts of epistemic control.
One question, eight years
Globally consistent explanations for credit decisions: full interpretability with no loss in accuracy. The seed wasn't “interpretability” as a topic; it was the instinct that a model should be accountable for why it believes what it believes.
NeurIPS 2018 Workshop · DSS 2022
Catastrophic forgetting is, underneath, a what-to-keep problem. Task-similarity detection asks whether a task is genuinely novel or similar enough to reuse what I already have, and answers with a lightweight test instead of retraining everything.
NeurIPS 2020 · IEEE MLSP 2025
Learning under hard constraints: compress the global model, preserve accuracy, never see the client's data. The discipline of keeping only what is sufficient — and nothing more.
3 patents filed
The clean inversion — not what to keep, but what to reject. VLMs hand out a confident in-set label even when the image belongs to no category they were given. The fix: accept an answer only when the visual and textual arms agree — a cheap, principled rule for knowing when you don't know.
TMLR 2026
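The agreement rule above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes each arm has already produced one score per candidate class, and the names `consistent_label`, `visual_scores`, and `textual_scores` are hypothetical.

```python
from typing import Optional, Sequence


def argmax(scores: Sequence[float]) -> int:
    """Index of the highest score (first one wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def consistent_label(
    visual_scores: Sequence[float],
    textual_scores: Sequence[float],
) -> Optional[int]:
    """Return a class index only when both arms pick the same class.

    Assumption: both sequences score the same classes in the same order.
    None means "fall back": the image may belong to no given category.
    """
    if not visual_scores or len(visual_scores) != len(textual_scores):
        raise ValueError("both arms must score the same non-empty class list")
    visual_pick = argmax(visual_scores)
    textual_pick = argmax(textual_scores)
    return visual_pick if visual_pick == textual_pick else None
```

When the arms disagree the function returns `None` rather than the more confident arm's label, which is the point of the rule: disagreement is treated as evidence of not knowing.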
The field poured itself into the read path — retrieval, vector indexes, knowledge graphs — and left the write decision to an expensive LLM call per fact. I framed memory evolution as novelty detection: a closed-form von Mises–Fisher gate routes clearly-new facts to ADD, redundant ones to NOOP, and sends only the ambiguous cases to the LLM. It is the continual-learning task-similarity test, reincarnated for agent memory.
ACL ARR 2026 · 7/7 over Mem0 · ~3.4× cheaper
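A three-way gate of this kind might look like the sketch below. It is an illustration under stated assumptions, not SAGE's actual code: facts and memories are plain embedding vectors, the score is the von Mises–Fisher log-density of the new fact relative to the mode of a distribution centred on each stored memory (the normalizing constant cancels, leaving kappa * (cosine - 1)), and `kappa`, `redundant_above`, and `novel_below` are hypothetical values chosen only for the example.

```python
import math
from typing import List, Sequence


def normalize(v: Sequence[float]) -> List[float]:
    """Project a vector onto the unit sphere, where the vMF lives."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def vmf_log_ratio(x: Sequence[float], mu: Sequence[float], kappa: float) -> float:
    """log f(x; mu, kappa) - log f(mu; mu, kappa) = kappa * (mu . x - 1).

    The vMF normalizer cancels in the ratio, so no Bessel function is needed.
    The result is 0 at the mode and decreases as x moves away from mu.
    """
    x_unit, mu_unit = normalize(x), normalize(mu)
    cosine = sum(a * b for a, b in zip(x_unit, mu_unit))
    return kappa * (cosine - 1.0)


def route(
    fact: Sequence[float],
    memory: Sequence[Sequence[float]],
    kappa: float = 20.0,            # hypothetical concentration
    redundant_above: float = -1.0,  # hypothetical threshold
    novel_below: float = -6.0,      # hypothetical threshold
) -> str:
    """Route a new fact to ADD, NOOP, or the LLM using its best memory match."""
    if not memory:
        return "ADD"  # nothing stored yet, so everything is novel
    best = max(vmf_log_ratio(fact, m, kappa) for m in memory)
    if best >= redundant_above:
        return "NOOP"  # close to an existing memory: redundant
    if best <= novel_below:
        return "ADD"   # far from every memory: clearly new
    return "LLM"       # ambiguous band: defer to the expensive call


memory = [[1.0, 0.0], [0.0, 1.0]]
decisions = [route(f, memory) for f in ([1.0, 0.01], [-1.0, -1.0], [1.0, 0.6])]
```

Only the middle band between the two thresholds reaches the LLM, so the width of that band is what trades cost against decision quality.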
The throughline is a taste: faced with an expensive deliberation, I reach for a lightweight, principled decision rule — not a bigger model or more API calls. The same geometric, statistics-grounded instinct, eight years apart.
The program
One research program converging from two directions — toward memory that knows how sure it is.
extending SAGE
extending the TMLR 2026 work
Evaluation is its own thread. Today's benchmarks (LoCoMo, LongMemEval) score end-task QA — they can't isolate write-decision quality or memory-conflict handling. Building evaluation that measures the thing directly is part of the program.
Two load-bearing papers and a coherent lineage, not a decade of agent-memory hype. The research also becomes runnable, reproducible artifacts, and sometimes a shipped system with measured impact (Pinterest, ~1% revenue uplift).
The record
SAGE: A Novelty Gate for Efficient Memory Evolution in Agentic LLMs
ACL ARR 2026 (under review)
Fallback-Enabled Closed-Set Classification: Cross-Modal Consistency in Vision-Language Models
TMLR 2026
GAN Memory with No Forgetting
NeurIPS 2020
Model Recycling Framework for Multi-source Data-free Supervised Transfer Learning
IEEE MLSP 2025 (Oral)
Toward Sustainable Continual Learning: Detection and Knowledge Repurposing of Similar Tasks
IEEE MLSP 2025
A Holistic Approach to Interpretability in Financial Lending
Decision Support Systems 2022
Let's talk
Available June 2026 (flexible through end of 2026). I'm looking for research roles where I can own problems end-to-end: formulation, evaluation design, and reproducible artifacts.