PhD, Duke University (2026) · available now

Sijia Wang (Scarlett)

Research Scientist / ML Engineer — RecSys · LLM Reliability · Agent Memory

I build recommendation, reliability, and memory systems for ML & LLM agents — research depth, production scale.

Duke ECE PhD working across recommendation & retrieval, LLM/VLM reliability, and agent memory. Two Pinterest internships shipping production retrieval and ranking; research spanning NeurIPS 2020, TMLR 2026, and IEEE MLSP 2025 Oral, plus 3 patents on continual learning and model compression.


3 patents

Publications

1% revenue uplift @ Pinterest

3.4× cheaper memory (SAGE)

Open to

Where I can plug in

Recommendation & retrieval, LLM/VLM reliability, or agent memory — I bring production experience and research depth to all three.

Recommendation & Retrieval

Candidate generation, embedding retrieval, and ranking at production scale.

Proof: Pinterest GraphSAGE + Faiss · 1% revenue uplift · BERT broad-match CTR gains

LLM / VLM Reliability

Hallucination detection, uncertainty, and trust & safety for foundation models.

Proof: TMLR 2026 cross-modal consistency · GPT-4V, Qwen-VL, LLaMA-VL benchmarks

Agent Memory

Memory infrastructure for LLM agents across the write / retrieve / forget lifecycle.

Proof: SAGE · beats Mem0 on 7/7 settings · 3.4× lower cost · code public

The thesis

Three vertices. One machine.

Agent memory is what holds my work together — the same machinery I've shipped and researched for years, wearing an agent costume.

Read path

Retrieval & ranking

A memory system embeds, stores, and retrieves top-k under latency and cost constraints, then ranks by relevance to context. That's a two-tower / ANN problem, the same one my Pinterest GraphSAGE + Faiss systems solved in production.

Pinterest · GraphSAGE + Faiss · 1% revenue uplift
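
A minimal sketch of the read path described above, assuming a brute-force cosine scan in place of an ANN index such as Faiss. The function names, memory keys, and 3-d embeddings are hypothetical and only illustrate the embed, store, retrieve top-k, rank loop; this is not the Pinterest pipeline.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_top_k(
    query: List[float], memory: Dict[str, List[float]], k: int
) -> List[Tuple[str, float]]:
    """Score every stored embedding against the query and return the k best (key, score) pairs."""
    scored = [(key, cosine(query, vec)) for key, vec in memory.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # rank by relevance
    return scored[:k]


# Hypothetical toy memory: keys and embeddings are made up for illustration.
memory = {
    "likes hiking": [0.9, 0.1, 0.0],
    "allergic to nuts": [0.0, 1.0, 0.0],
    "prefers trail maps": [0.8, 0.0, 0.2],
}
top = retrieve_top_k([1.0, 0.0, 0.0], memory, k=2)
print([key for key, _ in top])  # ['likes hiking', 'prefers trail maps']
```

A production system would likely swap the linear scan for an approximate index to stay inside the latency budget; the ranking step keeps the same shape.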

Write path

Continual learning

What to write, consolidate, and forget — and how to avoid catastrophic interference with old knowledge — is the continual-learning problem stated verbatim. My Samsung work (continual + federated, 3 patents) is the backbone of the write path.

Samsung · continual + federated · 3 patents

Serving path

Production systems & reliability

Memory at production scale means latency budgets, cost per query, and reliable serving. SAGE's cost and latency reductions are systems results — and my reliability research (TMLR 2026) keeps the read/write paths trustworthy.

SAGE · cost + latency wins · TMLR 2026 reliability

Retrieval (read) + continual learning (write) + systems (serve) = memory infrastructure. I have real artifacts at all three vertices — and they converge on SAGE.

How I got here

The roadmap to agent memory

Each stop solved one face of the same problem. Read the dates top-down — they converge.

2019–2022 → the write path
Continual Learning
Research Fellow / Deep Learning Research Intern · Samsung Semiconductor · SOC R&D Lab
- Led continual & federated learning research — 3 patents, 2 publications.
- Communication-efficient federated learning via global-model quantization; server-side refinement without client data access.
- Sustainable continual learning: task-similarity detection + encoder reuse — the same problem class as bounded memory growth & forgetting in agent memory.
- GAN Memory with No Forgetting (NeurIPS 2020) — parameter-efficient generative replay.
2022 & 2023 → the read path
Recommendation Systems
Research Intern — Ads Retrieval & Targeting · Pinterest Labs
- 2023: Shipped a graph-based advertiser-similarity retrieval pipeline (GraphSAGE embeddings + Faiss ANN) into Pinterest's auto-targeting; 1% revenue uplift in A/B testing.
- 2022: Multitask BERT model for broad match — improved ad-query relevance with measurable CTR gains.
- Owned large-scale retrieval features end-to-end: ingestion, embedding indexing, candidate scoring, online serving, eval.
2020–Present → the serving path
Efficient & Reliable ML
PhD Research — Duke University · Advisor: Prof. Ricardo Henao
- Cross-modal consistency for hallucination detection in VLMs (TMLR 2026) — reliable identification of low-confidence / “unknown” predictions.
- Multi-source data-free transfer learning (IEEE MLSP 2025 Oral) — efficient model recycling under white-box & black-box access.
- Sustainable continual learning (IEEE MLSP 2025) — parameter reuse against superlinear model growth.
2025–Now ◀ where it all leads
Agentic Memory
Memory Management System for AI Agents · Duke University · the convergence
- SAGE — a novelty gate for efficient memory evolution in agentic LLMs (ARR under review).
- Cost-efficient, low-latency memory database updates: when to write, summarize, compress, or forget.
- Beats Mem0 on 7/7 settings · 3.4× lower cost · 2.5× lower latency. Repo public & live.

Selected work

Projects & research

Real artifacts at every vertex of the memory triangle — read, write, and serve.

SAGE

Featured

Agent Memory · ARR (under review) · code public

A novelty gate for efficient memory evolution in agentic LLMs. Frames memory evolution as novelty detection via density estimation, so the system writes/consolidates only what matters.

Beats Mem0 on 7/7 settings
3.4× lower cost · 2.5× lower latency
Balances memory freshness vs. compute overhead
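
To make "novelty detection via density estimation" concrete, here is a toy sketch. It assumes a Gaussian kernel density estimate over stored embeddings and a fixed threshold; the function names, bandwidth, and threshold are hypothetical choices for illustration and are not taken from the SAGE code.

```python
import math
from typing import List


def kde_density(x: List[float], stored: List[List[float]], bandwidth: float = 0.5) -> float:
    """Unnormalized Gaussian kernel density of x under the stored embeddings."""
    if not stored:
        return 0.0  # an empty memory makes everything novel
    total = 0.0
    for s in stored:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / len(stored)


def novelty_gate(
    x: List[float], stored: List[List[float]], threshold: float = 0.5, bandwidth: float = 0.5
) -> bool:
    """Write x to memory only when it falls in a low-density (novel) region."""
    is_novel = kde_density(x, stored, bandwidth) < threshold
    if is_novel:
        stored.append(x)
    return is_novel


store: List[List[float]] = []
# Hypothetical 2-d embeddings: the second is a near-duplicate of the first.
decisions = [novelty_gate(v, store) for v in ([0.0, 0.0], [0.05, 0.0], [2.0, 2.0])]
print(decisions, len(store))  # [True, False, True] 2
```

The near-duplicate is skipped, so memory grows only when something new arrives. That is the intuition behind gating writes to save cost.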

Pinterest Ads Retrieval

Retrieval / RecSys · Shipped to production

Graph-based advertiser-similarity retrieval (GraphSAGE + Faiss ANN) plus a multitask BERT broad-match model, integrated into Pinterest's Spinner workflow.

1% revenue uplift (A/B)
Measurable CTR improvement
End-to-end: indexing → serving → eval

Continual & Federated Learning

Continual Learning · Samsung · 3 patents

Communication-efficient federated learning, sustainable continual learning, and continual few-shot learning — the write-path backbone of agent memory.

3 patents filed
GAN Memory w/ No Forgetting (NeurIPS 2020)
Bounded memory growth & anti-forgetting

VLM Reliability

Trust & Safety · TMLR 2026

A cross-modal consistency framework that detects hallucinations in vision-language models by comparing visual- and text-grounded reasoning paths.

Benchmarked GPT-4V, Qwen-VL, LLaMA-VL
Quantified epistemic uncertainty
Fallback-enabled closed-set classification
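
As a rough illustration of the consistency idea, the sketch below assumes the two reasoning paths have already produced one label each. It accepts a label only when both paths agree and the label belongs to the closed set, and otherwise falls back to "unknown". The function name and labels are hypothetical; the published framework is more involved than a string comparison.

```python
from typing import Set

UNKNOWN = "unknown"


def classify_with_fallback(visual_answer: str, text_answer: str, label_set: Set[str]) -> str:
    """Return the shared label if both paths agree on a known class, else the fallback."""
    v = visual_answer.strip().lower()
    t = text_answer.strip().lower()
    if v == t and v in label_set:
        return v
    return UNKNOWN  # disagreement or out-of-set label: treat as low confidence


labels = {"cat", "dog"}  # hypothetical closed label set
print(classify_with_fallback("Cat", " cat ", labels))     # cat
print(classify_with_fallback("cat", "dog", labels))       # unknown
print(classify_with_fallback("zebra", "zebra", labels))   # unknown
```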

Toolkit

Skills & publications

Research depth and production reach across the memory stack.

Memory & Retrieval

Embedding retrieval
Retrieval & ranking
Faiss (IVF / HNSW)
Graph indexing
Conflict resolution
Summarization & fusion
Memory lifecycle (write/update/compress/forget)
RAG pipelines

LLM & VLM

Prompt engineering
In-context learning
Hallucination detection & conflict resolution
Quantization · LoRA · distillation
Self-supervised learning
VLMs (LLaMA, Qwen-VL, GPT-class)

ML Foundations

Representation learning
Generative models
Continual / federated learning
Domain adaptation
Interpretable ML
Large-scale recommendation

Systems & Infra

Production ML pipelines
Online inference
A/B testing
Distributed training (Slurm)
Docker · AWS · Spark
Hugging Face

Languages & Tools

Python
C++
SQL
Bash
PyTorch
HF Transformers
Git
Linux

Selected publications

SAGE: A Novelty Gate for Efficient Memory Evolution in Agentic LLMs · ARR (under review)
Fallback-Enabled Closed-Set Classification: Cross-Modal Consistency in Vision-Language Models · TMLR 2026
GAN Memory with No Forgetting · NeurIPS 2020
Model Recycling Framework for Multi-source Data-free Supervised Transfer Learning · IEEE MLSP 2025 (Oral)
Toward Sustainable Continual Learning: Detection and Knowledge Repurposing of Similar Tasks · IEEE MLSP 2025
A Holistic Approach to Interpretability in Financial Lending · Decision Support Systems 2022

Let's talk

Hiring for RecSys, LLM reliability, or agent memory?

PhD, Duke University (2026) · available now. I'm open to recommendation & retrieval, LLM/VLM reliability, and agent-memory roles. Reach out.

scarlett.95.wang@gmail.com