Sijia Wang (Scarlett)
Research Scientist / ML Engineer — RecSys · LLM Reliability · Agent Memory
I build recommendation, reliability, and memory systems for ML & LLM agents — research depth, production scale.
Duke ECE PhD working across recommendation & retrieval, LLM/VLM reliability, and agent memory. Two Pinterest internships shipping production retrieval and ranking; research spanning NeurIPS 2020, TMLR 2026, and IEEE MLSP 2025 Oral, plus 3 patents on continual learning and model compression.
Where I can plug in
Recommendation & retrieval, LLM/VLM reliability, or agent memory — I bring production experience and research depth to all three.
Recommendation & Retrieval
Candidate generation, embedding retrieval, and ranking at production scale.
LLM / VLM Reliability
Hallucination detection, uncertainty, and trust & safety for foundation models.
Agent Memory
Memory infrastructure for LLM agents across the write / retrieve / forget lifecycle.
Three vertices. One machine.
Agent memory is what holds my work together — the same machinery I've shipped and researched for years, wearing an agent costume.
Retrieval & ranking
A memory system embeds, stores, and retrieves top-k under latency and cost constraints, then ranks by relevance to context. That's a two-tower / ANN problem — exactly the Pinterest GraphSAGE + Faiss systems I shipped to production.
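As a rough illustration of the embed, store, and retrieve top-k loop described above, here is a minimal sketch. It is not the Pinterest pipeline: it uses an exact cosine-similarity scan in plain Python, where a production system would use an ANN index such as Faiss. The toy vectors, the memory strings, and the function names `cosine` and `retrieve_top_k` are all hypothetical.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_k(query_vec, memory, k=2):
    """Return the k (score, text) pairs most similar to the query.

    `memory` is a list of (text, vector) pairs. This is an exact scan over
    every entry; at scale an ANN index would replace it.
    """
    scored = ((cosine(query_vec, vec), text) for text, vec in memory)
    return heapq.nlargest(k, scored)


# Toy memory store with hand-written 3-d "embeddings" (illustrative only).
memory = [
    ("note about dark mode", [1.0, 0.0, 0.0]),
    ("note about travel plans", [0.0, 1.0, 0.0]),
    ("note about dark themes", [0.9, 0.1, 0.0]),
]

top = retrieve_top_k([1.0, 0.0, 0.0], memory, k=2)
top_texts = [text for _, text in top]
```

With these toy vectors the two "dark" notes rank ahead of the unrelated one; a later ranking stage would reorder candidates by relevance to the current context.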
Continual learning
What to write, consolidate, and forget — and how to avoid catastrophic interference with old knowledge — is the continual-learning problem stated verbatim. My Samsung work (continual + federated, 3 patents) is the backbone of the write path.
Production systems & reliability
Memory at production scale means latency budgets, cost per query, and reliable serving. SAGE's cost and latency reductions are systems results — and my reliability research (TMLR 2026) keeps the read/write paths trustworthy.
The roadmap to agent memory
Each stop solved one face of the same problem. Read the dates top-down — they converge.
- 2019 — 2022 → the write path
Continual Learning
Research Fellow / Deep Learning Research Intern · Samsung Semiconductor · SOC R&D Lab
- Led continual & federated learning research — 3 patents, 2 publications.
- Communication-efficient federated learning via global-model quantization; server-side refinement without client data access.
- Sustainable continual learning: task-similarity detection + encoder reuse — the same problem class as bounded memory growth & forgetting in agent memory.
- GAN Memory with No Forgetting (NeurIPS 2020) — parameter-efficient generative replay.
- 2022 & 2023 → the read path
Recommendation Systems
Research Intern — Ads Retrieval & Targeting · Pinterest Labs
- 2023: Shipped a graph-based advertiser-similarity retrieval pipeline (GraphSAGE embeddings + Faiss ANN) into Pinterest's auto-targeting; 1% revenue uplift in A/B testing.
- 2022: Multitask BERT model for broad match — improved ad-query relevance with measurable CTR gains.
- Owned large-scale retrieval features end-to-end: ingestion, embedding indexing, candidate scoring, online serving, eval.
- 2020 — Present → the serving path
Efficient & Reliable ML
PhD Research — Duke University · Advisor: Prof. Ricardo Henao
- Cross-modal consistency for hallucination detection in VLMs (TMLR 2026) — reliable identification of low-confidence / “unknown” predictions.
- Multi-source data-free transfer learning (IEEE MLSP 2025 Oral) — efficient model recycling under white-box & black-box access.
- Sustainable continual learning (IEEE MLSP 2025) — parameter reuse against superlinear model growth.
- 2025 — Now ◀ where it all leads
Agentic Memory
Memory Management System for AI Agents · Duke University · the convergence
- SAGE — a novelty gate for efficient memory evolution in agentic LLMs (ARR under review).
- Cost-efficient, low-latency memory database updates: when to write, summarize, compress, or forget.
- Beats Mem0 on 7/7 settings · 3.4× lower cost · 2.5× lower latency. Repo public & live.
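To make the idea of a write-time novelty gate concrete, here is a minimal sketch under stated assumptions. It is not the SAGE algorithm: it simply skips a write when a candidate is too similar to something already stored. The similarity measure, the `threshold` value of 0.9, and the names `cosine` and `novelty_gate` are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def novelty_gate(candidate_vec, stored_vecs, threshold=0.9):
    """Return True if the candidate is novel enough to be written.

    A candidate counts as novel when its highest similarity to any stored
    vector is below `threshold` (an assumed value, not a tuned one).
    """
    if not stored_vecs:
        return True
    max_sim = max(cosine(candidate_vec, v) for v in stored_vecs)
    return max_sim < threshold


# Toy stream of candidate memories: the second is a near-duplicate of the first.
store = []
writes = 0
for vec in [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]:
    if novelty_gate(vec, store):
        store.append(vec)
        writes += 1
```

In this toy run the near-duplicate is skipped, so only two of the three candidates are written. Skipping redundant writes is one plausible way such a gate could reduce cost and latency; a fuller system would also decide when to summarize, compress, or forget.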
Projects & research
Real artifacts at every vertex of the memory triangle — read, write, and serve.
Pinterest Ads Retrieval
Graph-based advertiser-similarity retrieval (GraphSAGE + Faiss ANN) plus a multitask BERT broad-match model, integrated into Pinterest's Spinner workflow.
- 1% revenue uplift (A/B)
- Measurable CTR improvement
- End-to-end: indexing → serving → eval
Continual & Federated Learning
Communication-efficient federated learning, sustainable continual learning, and continual few-shot learning — the write-path backbone of agent memory.
- 3 patents filed
- GAN Memory w/ No Forgetting (NeurIPS 2020)
- Bounded memory growth & anti-forgetting
VLM Reliability
A cross-modal consistency framework that detects hallucinations in vision-language models by comparing visual- and text-grounded reasoning paths.
- Benchmarked GPT-4V, Qwen-VL, LLaMA-VL
- Quantified epistemic uncertainty
- Fallback-enabled closed-set classification
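The fallback idea can be sketched in a few lines. This is an assumed simplification, not the published framework: it takes the labels produced by a visual-grounded path and a text-grounded path as given, and falls back to "unknown" when they disagree or fall outside the closed label set. The function name `consistent_or_fallback` and the example labels are hypothetical.

```python
def consistent_or_fallback(visual_label, text_label, classes, fallback="unknown"):
    """Return the shared label only when both paths agree on an in-set class.

    Any disagreement, or agreement on a label outside `classes`, yields
    the fallback value instead of a possibly hallucinated prediction.
    """
    if visual_label == text_label and visual_label in classes:
        return visual_label
    return fallback


classes = {"cat", "dog"}
agree = consistent_or_fallback("cat", "cat", classes)
disagree = consistent_or_fallback("cat", "dog", classes)
out_of_set = consistent_or_fallback("zebra", "zebra", classes)
```

Here only the first call returns a class label; the other two return the fallback.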
Skills & publications
Research depth and production reach across the memory stack.
Memory & Retrieval
- Embedding retrieval
- Retrieval & ranking
- Faiss (IVF / HNSW)
- Graph indexing
- Conflict resolution
- Summarization & fusion
- Memory lifecycle (write/update/compress/forget)
- RAG pipelines
LLM & VLM
- Prompt engineering
- In-context learning
- Hallucination detection & conflict resolution
- Quantization · LoRA · distillation
- Self-supervised learning
- VLMs (LLaMA, Qwen-VL, GPT-class)
ML Foundations
- Representation learning
- Generative models
- Continual / federated learning
- Domain adaptation
- Interpretable ML
- Large-scale recommendation
Systems & Infra
- Production ML pipelines
- Online inference
- A/B testing
- Distributed training (Slurm)
- Docker · AWS · Spark
- Hugging Face
Languages & Tools
- Python
- C++
- SQL
- Bash
- PyTorch
- HF Transformers
- Git
- Linux
Selected publications
SAGE: A Novelty Gate for Efficient Memory Evolution in Agentic LLMs
ARR (under review)
Fallback-Enabled Closed-Set Classification: Cross-Modal Consistency in Vision-Language Models
TMLR 2026
GAN Memory with No Forgetting
NeurIPS 2020
Model Recycling Framework for Multi-source Data-free Supervised Transfer Learning
IEEE MLSP 2025 (Oral)
Toward Sustainable Continual Learning: Detection and Knowledge Repurposing of Similar Tasks
IEEE MLSP 2025
A Holistic Approach to Interpretability in Financial Lending
Decision Support Systems 2022