Research Scientist / Postdoc · Duke ECE PhD (May 2026)

Teaching learning systems what to keep, what to question, and when to say “I don't know.”

Research Scientist / Postdoc — Reliability · Continual Learning · Memory

Almost everything I've published is one question in different costumes: how should a learning system manage its own knowledge — what to keep, what to merge, what to discard, and when to admit it doesn't know? I call it epistemic self-modeling, and it runs from credit-risk interpretability to agent memory.

3 Patents
7+ Publications
1% Revenue uplift @ Pinterest
3.4× Cheaper memory (SAGE)

The thread

One obsession, in different costumes.

I didn't set out with a grand thesis. I kept returning to the same question and can only now name it. The two pillars usually pitched as separate interests, agent memory and LLM reliability, are one problem seen from opposite ends.

A reliable agent has to know what it knows. A good memory is what it knows. SAGE governs what gets into the knowledge store; the reliability work governs what comes out as an answer. Both are acts of epistemic control.

One question, eight years

The research journey

  1. 2018 · FICO · Prof. Cynthia Rudin

    Reasoning a model can answer for

    Globally consistent explanations for credit decisions: full interpretability with no loss in accuracy. The seed wasn't “interpretability” as a topic; it was the instinct that a model should be accountable for why it believes what it believes.

    NeurIPS 2018 Workshop · DSS 2022

  2. 2020–22 · GAN Memory · Sustainable Continual Learning

    What to keep

    Catastrophic forgetting is, underneath, a what-to-keep problem. Task-similarity detection asks whether a new task is genuinely novel or similar enough to reuse what the model already has, and answers with a lightweight test instead of retraining everything.

    NeurIPS 2020 · IEEE MLSP 2025

  3. 2019–22 · Federated Learning · Samsung

    Minimal sufficient knowledge

    Learning under hard constraints: compress the global model, preserve accuracy, never see the client's data. The discipline of keeping only what is sufficient — and nothing more.

    3 patents filed

  4. 2025 · Cross-Modal Consistency · TMLR

    What to refuse

    The clean inversion — not what to keep, but what to reject. VLMs hand out a confident in-set label even when the image belongs to no category they were given. The fix: accept an answer only when the visual and textual arms agree — a cheap, principled rule for knowing when you don't know.

    TMLR 2026

  5. 2025–now · SAGE

    The lineage snaps into focus

    The field poured itself into the read path — retrieval, vector indexes, knowledge graphs — and left the write decision to an expensive LLM call per fact. I framed memory evolution as novelty detection: a closed-form von Mises–Fisher gate routes clearly-new facts to ADD, redundant ones to NOOP, and sends only the ambiguous cases to the LLM. It is the continual-learning task-similarity test, reincarnated for agent memory.

    ACL ARR 2026 · 7/7 over Mem0 · ~3.4× cheaper
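
The three-way routing described for SAGE can be illustrated with a toy sketch. This is not the SAGE gate itself: the function names, the thresholds `tau_low` and `tau_high`, and the use of a plain maximum cosine similarity are all assumptions made for illustration. The link to the von Mises–Fisher distribution is that its log-density is linear in the cosine between a unit vector and the mean direction, so for a fixed concentration, thresholding cosine similarity is equivalent to thresholding a vMF log-likelihood.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def route_write(candidate, store, tau_low=0.3, tau_high=0.9):
    """Route a candidate fact embedding to ADD, NOOP, or LLM.

    tau_low and tau_high are hypothetical thresholds, not values from SAGE.
    """
    if not store:
        return "ADD"    # nothing stored yet, so the fact is new by definition
    best = max(cosine(candidate, memory) for memory in store)
    if best < tau_low:
        return "ADD"    # far from everything stored: clearly new
    if best > tau_high:
        return "NOOP"   # nearly identical to a stored item: redundant
    return "LLM"        # ambiguous band: defer to the expensive call

store = [[1.0, 0.0], [0.0, 1.0]]
print(route_write([1.0, 0.0], store))    # redundant with the first memory
print(route_write([1.0, 1.0], store))    # falls in the ambiguous band
print(route_write([-1.0, -1.0], store))  # far from both memories
```

The point of the design is that only the middle band pays for an LLM call; the two cheap branches are decided in closed form.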

The throughline is a taste: faced with an expensive deliberation, I reach for a lightweight, principled decision rule — not a bigger model or more API calls. The same geometric, statistics-grounded instinct, eight years apart.

The program

What I'm building toward

One research program converging from two directions — toward memory that knows how sure it is.

Direction I

Memory as a control problem

extending SAGE

  • Forgetting & compression as decisions, not curves — what should a long-horizon agent let decay, and on what evidence?
  • Conflict resolution when stored memories contradict — which wins, and how to represent “this used to be true”?
  • Calibrated thresholds that track their own error rate, not just the store's geometry.

Direction II

Reliability for agents, not just classifiers

extending TMLR

  • Selective action — an agent that abstains, asks a clarifying question, or escalates instead of confabulating a tool call.
  • Cross-path consistency: accept an action only when independent reasoning routes agree.
  • Calibration of operations — knowing when a write or a retrieval is uncertain, not just a final answer.
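
The agreement rule behind cross-path consistency (and the earlier cross-modal version of it) can be written as a few lines of Python. This is a minimal sketch under stated assumptions: `accept_if_unanimous` is a hypothetical helper, each "route" is reduced to the single answer or action it proposes, and disagreement is treated as abstention. How the routes are produced, and what happens after abstaining, is left open.

```python
from typing import Hashable, Optional, Sequence

def accept_if_unanimous(proposals: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the shared proposal if every route agrees, else None (abstain)."""
    if not proposals:
        return None  # no evidence at all: abstain
    first = proposals[0]
    return first if all(p == first for p in proposals) else None

# Two routes agree, so the answer is accepted.
print(accept_if_unanimous(["cat", "cat"]))
# The routes disagree, so the rule abstains and returns None.
print(accept_if_unanimous(["cat", "dog"]))
```

A caller would treat `None` as the signal to ask a clarifying question or escalate rather than act.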

The endpoint is uncertainty-aware memory: a store where every item carries a calibrated confidence, and novelty, trust, and abstention share one coherent epistemic state. A memory item that knows it might be wrong; a retrieval that propagates the doubt; an agent that abstains because its memory is unsure.

Evaluation as a thread

Today's benchmarks (LoCoMo, LongMemEval) score end-task QA, so they can't isolate write-decision quality or memory-conflict handling. Building evaluation that measures those directly is part of the program.

What grounds it

Two load-bearing papers and a coherent lineage — not a decade of agent-memory hype. And research that becomes a runnable, reproducible artifact, sometimes a shipped system with measured impact (Pinterest, ~1% revenue uplift).

The record

Selected publications

  1. SAGE: A Novelty Gate for Efficient Memory Evolution in Agentic LLMs

    ARR (under review)

  2. Fallback-Enabled Closed-Set Classification: Cross-Modal Consistency in Vision-Language Models

    TMLR 2026

  3. GAN Memory with No Forgetting

    NeurIPS 2020

  4. Model Recycling Framework for Multi-source Data-free Supervised Transfer Learning

    IEEE MLSP 2025 (Oral)

  5. Toward Sustainable Continual Learning: Detection and Knowledge Repurposing of Similar Tasks

    IEEE MLSP 2025

  6. A Holistic Approach to Interpretability in Financial Lending

    Decision Support Systems 2022

Let's talk

Hiring a research scientist or postdoc?

Available June 2026 (flexible through end of 2026). I'm looking for research roles where I can own problems end-to-end: formulation, evaluation design, and reproducible artifacts.