● Agent Memory & Systems · Duke ECE PhD (2026)

Memory infrastructure for LLM agents.

ML Engineer / Researcher — Agent Memory · Retrieval · Production Systems

I build the write/retrieve/forget lifecycle for agent memory. SAGE, a novelty gate for efficient memory evolution, beats Mem0 in 7 of 7 settings on the LoCoMo benchmark at ~3.4× lower cost and ~2.5× lower latency. Public arXiv preprint + single-command reproducible code.

3 Patents
7+ Publications
1% Revenue uplift @ Pinterest
3.4× Cheaper memory (SAGE)

Open to

Where I can plug in

Recommendation & retrieval, LLM/VLM reliability, or agent memory — I bring production experience and research depth to all three.

01

Recommendation & Retrieval

Candidate generation, embedding retrieval, and ranking at production scale.

Pinterest GraphSAGE + Faiss · 1% revenue uplift · BERT broad-match CTR

02

LLM / VLM Reliability

Hallucination detection, uncertainty, and trust & safety for foundation models.

TMLR 2026 cross-modal consistency · GPT-4V, Qwen-VL, LLaMA-VL benchmarks

03

Agent Memory

Memory infrastructure for LLM agents across the write / retrieve / forget lifecycle.

SAGE — beats Mem0 7/7 · 3.4× cheaper · code public

The thesis

Three vertices. One machine.

Agent memory is what holds my work together — the same machinery I've shipped and researched for years, wearing an agent costume.

Read path

Retrieval & ranking

A memory system embeds, stores, and retrieves top-k under latency and cost constraints, then ranks by relevance to context. That's a two-tower / ANN problem, the same one I solved in the Pinterest GraphSAGE + Faiss systems I shipped to production.

Pinterest · GraphSAGE + Faiss · 1% revenue uplift
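
A minimal sketch of the read path described above, under stated assumptions: toy 2-D embeddings, and exact brute-force cosine search standing in for an ANN index such as Faiss. The names `MemoryStore`, `write`, and `retrieve` and the stored strings are hypothetical illustrations, not the API or data of any system mentioned on this page.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class MemoryStore:
    """Toy memory store; exact search is an assumption replacing an ANN index."""

    def __init__(self):
        self.items = []  # list of (embedding, text) pairs

    def write(self, embedding, text):
        self.items.append((list(embedding), text))

    def retrieve(self, query, k=2):
        """Return the k stored texts most similar to the query, best first."""
        scored = ((cosine(query, emb), text) for emb, text in self.items)
        best = heapq.nlargest(k, scored, key=lambda pair: pair[0])
        return [text for _, text in best]


store = MemoryStore()
store.write([1.0, 0.0], "user prefers dark mode")
store.write([0.0, 1.0], "user asked about billing")
store.write([0.9, 0.1], "user asked about UI themes")
top = store.retrieve([1.0, 0.0], k=2)
print(top)
```

At production scale the linear scan in `retrieve` would be swapped for an approximate index, which trades a small amount of recall for much lower latency and cost per query.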

Write path

Continual learning

What to write, consolidate, and forget — and how to avoid catastrophic interference with old knowledge — is the continual-learning problem stated verbatim. My Samsung work (continual + federated, 3 patents) is the backbone of the write path.

Samsung · continual + federated · 3 patents

Serving path

Production systems & reliability

Memory at production scale means latency budgets, cost per query, and reliable serving. SAGE's cost and latency reductions are systems results — and my reliability research (TMLR 2026) keeps the read/write paths trustworthy.

SAGE · cost + latency wins · TMLR 2026 reliability

Retrieval (read) + continual learning (write) + systems (serve) = memory infrastructure. I have real artifacts at all three vertices — and they converge on SAGE.

How I got here

The roadmap to agent memory

  1. 2019–2022 → the write path

    Continual Learning

    Research Fellow / Deep Learning Research Intern · Samsung Semiconductor · SOC R&D Lab

    • Led continual & federated learning research — 3 patents, 2 publications.
    • Communication-efficient federated learning via global-model quantization; server-side refinement without client data access.
    • Sustainable continual learning: task-similarity detection + encoder reuse — the same problem class as bounded memory growth & forgetting in agent memory.
    • GAN Memory with No Forgetting (NeurIPS 2020) — parameter-efficient generative replay.
  2. 2022 & 2023 → the read path

    Recommendation Systems

    Research Intern — Ads Retrieval & Targeting · Pinterest Labs

    • 2023: Shipped a graph-based advertiser-similarity retrieval pipeline (GraphSAGE embeddings + Faiss ANN) into Pinterest's auto-targeting; 1% revenue uplift in A/B testing.
    • 2022: Multitask BERT model for broad match — improved ad-query relevance with measurable CTR gains.
    • Owned large-scale retrieval features end-to-end: ingestion, embedding indexing, candidate scoring, online serving, eval.
  3. 2020–Present → the serving path

    Efficient & Reliable ML

    PhD Research — Duke University · Advisor: Prof. Ricardo Henao

    • Cross-modal consistency for hallucination detection in VLMs (TMLR 2026) — reliable identification of low-confidence / “unknown” predictions.
    • Multi-source data-free transfer learning (IEEE MLSP 2025 Oral) — efficient model recycling under white-box & black-box access.
    • Sustainable continual learning (IEEE MLSP 2025) — parameter reuse against superlinear model growth.
  4. 2025–Now → where it all leads

    Agentic Memory

    Memory Management System for AI Agents · Duke University · the convergence

    • SAGE — a novelty gate for efficient memory evolution in agentic LLMs (ARR under review).
    • Cost-efficient, low-latency memory database updates: when to write, summarize, compress, or forget.
    • Beats Mem0 on 7/7 settings · 3.4× lower cost · 2.5× lower latency. Repo public & live.

Selected work

Projects & research

SAGE

Agent Memory · ARR — under review · code public

A novelty gate for efficient memory evolution in agentic LLMs. It frames memory evolution as novelty detection via density estimation, so the system writes or consolidates only content that is novel relative to what is already stored.

  • Beats Mem0 on 7/7 settings
  • 3.4× lower cost · 2.5× lower latency
  • Balances memory freshness vs. compute overhead
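
A toy illustration of the general idea of gating writes by estimated density. This is not SAGE's actual algorithm: the Gaussian kernel, the `bandwidth` and `threshold` values, and the function names `kde_density` and `novelty_gate` are all assumptions made for the sketch.

```python
import math


def kde_density(x, memory, bandwidth=0.5):
    """Mean Gaussian-kernel value of x against stored embeddings (unnormalized)."""
    if not memory:
        return 0.0
    total = 0.0
    for m in memory:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, m))
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / len(memory)


def novelty_gate(x, memory, threshold=0.5, bandwidth=0.5):
    """Write x to memory only if its estimated density is below the threshold."""
    if kde_density(x, memory, bandwidth) < threshold:
        memory.append(list(x))
        return True
    return False


memory = []
decisions = [
    novelty_gate([0.0, 0.0], memory),  # empty memory, so the item is novel
    novelty_gate([0.1, 0.0], memory),  # near-duplicate of a stored item
    novelty_gate([3.0, 3.0], memory),  # far from everything stored
]
print(decisions, len(memory))
```

Skipping the near-duplicate is where the savings would come from in a scheme like this: a write that is gated out never triggers the downstream update or consolidation work.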

Continual & Federated Learning

Continual Learning · Samsung · 3 patents

Communication-efficient federated learning, sustainable continual learning, and continual few-shot learning — the write-path backbone of agent memory.

  • 3 patents filed
  • GAN Memory w/ No Forgetting (NeurIPS 2020)
  • Bounded memory growth & anti-forgetting

Pinterest Ads Retrieval

Retrieval / RecSys · Shipped to production

Graph-based advertiser-similarity retrieval (GraphSAGE + Faiss ANN) plus a multitask BERT broad-match model, integrated into Pinterest's Spinner workflow.

  • 1% revenue uplift (A/B)
  • Measurable CTR improvement
  • End-to-end: indexing → serving → eval

VLM Reliability

Trust & Safety · TMLR 2026

A cross-modal consistency framework that detects hallucinations in vision-language models by comparing visual- and text-grounded reasoning paths.

  • Benchmarked GPT-4V, Qwen-VL, LLaMA-VL
  • Quantified epistemic uncertainty
  • Fallback-enabled closed-set classification

Toolkit

Skills & publications

Memory & Retrieval

  • Embedding retrieval
  • Retrieval & ranking
  • Faiss (IVF / HNSW)
  • Graph indexing
  • Conflict resolution
  • Summarization & fusion
  • Memory lifecycle (write/update/compress/forget)
  • RAG pipelines

LLM & VLM

  • Prompt engineering
  • In-context learning
  • Hallucination detection & conflict resolution
  • Quantization · LoRA · distillation
  • Self-supervised learning
  • VLMs (LLaMA-VL, Qwen-VL, GPT-class)

ML Foundations

  • Representation learning
  • Generative models
  • Continual / federated learning
  • Domain adaptation
  • Interpretable ML
  • Large-scale recommendation

Systems & Infra

  • Production ML pipelines
  • Online inference
  • A/B testing
  • Distributed training (Slurm)
  • Docker · AWS · Spark
  • Hugging Face

Languages & Tools

  • Python
  • C++
  • SQL
  • Bash
  • PyTorch
  • HF Transformers
  • Git
  • Linux

Selected publications

  1. SAGE: A Novelty Gate for Efficient Memory Evolution in Agentic LLMs

    ARR (under review)

  2. Fallback-Enabled Closed-Set Classification: Cross-Modal Consistency in Vision-Language Models

    TMLR 2026

  3. GAN Memory with No Forgetting

    NeurIPS 2020

  4. Model Recycling Framework for Multi-source Data-free Supervised Transfer Learning

    IEEE MLSP 2025 (Oral)

  5. Toward Sustainable Continual Learning: Detection and Knowledge Repurposing of Similar Tasks

    IEEE MLSP 2025

  6. A Holistic Approach to Interpretability in Financial Lending

    Decision Support Systems 2022

Let's talk

Building agent memory or retrieval infra?

Available June 2026. Production systems experience plus research depth across retrieval, continual learning, and reliability. Reach out.