Paper Cards
162 cards · topic: engineering · organized/paper_cards
All
engineering · 162, llm-infra · 160, agent · 158, evaluation · 158, rag · 155, multimodal · 131, risk · 127, database · 110, AI Agent · 1, RAG evaluation · 1, long-context reasoning · 1, agent security · 1
4.4 OpenComputer: Verifiable Software Worlds for Computer-Use Agents
arXiv:2605.19769
Entry R2: RAG over Thinking Traces - retrieving thinking traces improves reasoning tasks (arXiv 2605.03344v2)
arXiv:2605.03344
9️⃣ arXiv · Benchmarking Multimodal Memory for Realistic User-Agent Interactions (M3Exam) (⭐⭐⭐ reference)
arXiv:2606.07402
2. User as Code: Executable Memory for Personalized Agents
arXiv:2606.16707
2. DIVERGE: Diversity-Enhanced RAG
arXiv:2602.00238
1. Directory-Aware Query and Maintenance in Vector Databases
arXiv:2606.16903
4.2 MRAgent: Memory is Reconstructed, Not Retrieved
arXiv:2606.06036
1️⃣ RTP-LLM · Alibaba's industrial-grade inference engine - arXiv:2605.29639 (⭐⭐⭐⭐⭐ must-read)
arXiv:2605.29639
7️⃣ arXiv · Position Paper: LLM serving needs mathematical optimization, not heuristics alone ⭐⭐⭐⭐⭐ research frontier
arXiv:2605.01280
5. SwiftCache: Efficient LLM Serving for Multi-turn Conversations
arXiv:2606.16135
🟡 Keep 4: "The Last Harness" - Meta-Evolution two-level loop
arXiv:2604.21003
🔴 Keep 3: Agentic Harness Engineering (AHE) - arXiv empirical paper
arXiv:2604.25850
6️⃣ OScaR · Extreme KV cache quantization - arXiv:2605.19660 (⭐⭐⭐ arXiv)
arXiv:2605.19660
Agent runtime / security / harness: supplementary candidates
arXiv:2603.25723
5️⃣ arXiv · Fluid-guided online scheduling + WAIT policy (⭐⭐⭐⭐ supplementary)
arXiv:2504.11320
2. AI Engineering Blueprint for On-Premises RAG (arXiv:2604.01395)
arXiv:2604.01395
1️⃣ arXiv · AIConfigurator: automatic optimization of LLM inference configuration across frameworks (⭐⭐⭐⭐⭐ must-read)
arXiv:2601.06288
MMProLong: effective continued training for long-context vision-language models (close reading · flyP)
arXiv:2605.13831
8. When Iterative RAG Beats Ideal Evidence
arXiv:2601.19827
7. Decentralized Multi-Agent Systems with Shared Context (DeLM)
arXiv:2606.10662
6. Evaluation and Benchmarking of LLM Agents: A Survey
arXiv:2507.21504
5. Context-Fractured Decomposition Attacks on Tool-Using LLM Agents
arXiv:2606.09084
4. Parthenon Law: A Self-Evolving Legal-Agent Framework
arXiv:2606.04602
4. DCD (Domain–Collection–Document)
arXiv:2604.07590
3. Tail-Aware Adaptive-k (TAA-k)
arXiv:2606.11907
🔟 arXiv · Post-deterministic distributed systems: a new foundation for autonomous infrastructure ⭐⭐⭐⭐ research frontier
arXiv:2606.01722
SSGM framework (Stability and Safety-Governed Memory)
arXiv:2603.11768
6. Stratum: a high-performance Rust runtime for agent-generated pipelines
arXiv:2603.03589
3. Experience as Compass: Multi-Agent RAG with Evolving Orchestration (arXiv:2604.00901)
arXiv:2604.00901
14. Online scheduling for LLM inference: hindsight optimal benchmark
arXiv:2502.07115
Survey of multi-agent system bottlenecks (ICLR 2026 paper focus)
arXiv:/inbox/flyp/2026-06-17-multi-agent-bottleneck.md
Substack lead: Sebastian Raschka (@rasbt)
arXiv:/inbox/flyp/2026-06-12-substack-rasbt.md
DrivePI: Spatial-aware 4D MLLM for Unified Autonomous Driving
arXiv:/inbox/flyp/2026-06-11-DrivePI-4D-MLLM-autonomous-driving.md
BabyVision: Visual Reasoning Beyond Language
arXiv:/inbox/flyp/2026-06-16-BabyVision-inverted-competence.md
2026-06-11 Literature review: agents and spatial reasoning
arXiv:/inbox/flyp/2026-06-11-agent-spatial.md
2026-06-10 Multimodal literature briefing
arXiv:/inbox/flyp/2026-06-10-multimodal.md
🔴 Keep · `Toward Secure LLM Agents: Threat Surfaces, Attacks, Defenses, and Benchmarking`
arXiv:2606.10749
🔴 Keep · `PROJECTMEM: A Local-First, Event-Sourced Memory and Judgment Layer for AI Coding Agents`
arXiv:2606.12329
🔴 Keep · `DeNovoSWE: Scaling Long-Horizon Environments for Generating Entire Repositories from Scratch`
arXiv:2606.10728
Entry S1: To Data & Beyond - Important LLM Papers Week of 12-17 Jan 2026
arXiv:2601.09668
Entry E4: arXiv 2605.04595 - queueing theory and stability analysis of the KV cache
arXiv:2605.04595
Entry D1: SIFT - accelerating RAG prefill via attention invariance (arXiv 2606.09441, 2026-06)
arXiv:2606.09441
Entry A2: Text World Models for LLM-based Agents
arXiv:2606.09032
Entry A2: ACL 2026 Findings - survey of the evolution of LLM agent memory mechanisms (arXiv:2605.06716)
arXiv:2605.06716
Entry A1: EvoArena + EvoMem - benchmark for LLM agent memory evolution in dynamic environments (arXiv:2606.13681)
arXiv:2606.13681
Entry A1: BRTR (Beyond Rows to Reasoning) - agentic retrieval framework for multimodal spreadsheets
arXiv:2603.06503
Entry G: Engineering lessons from public-sector ML pipelines (with performance data tables)
arXiv:2511.01545
Entry F: Google's enterprise-customized LLM - real-world code transformation data
arXiv:2605.16517
Entry E-NF2: MLOps architecture guide - 25 guidelines for model integration/deployment (grey literature review)
arXiv:2606.06535
Entry E-NF1: Albireo - tensor-parallel scheduling for LLM inference that breaks past Amdahl's law
arXiv:2606.01927
Entry A02: Corpus2Skill - distilling a document corpus into a navigable skill directory
arXiv:2604.14572
② "Living Databases: A Unified Model for Continuous Schema Evolution, Versioning, and Transformations" (arXiv:2605.00676v1)
arXiv:2605.00676
② "Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG" (arXiv:2501.09136v4, updated 2026-04)
arXiv:2501.09136
① FROAV: A Framework for RAG Observation and Agent Verification (arXiv:2601.07504v1)
arXiv:2601.07504
arXiv-3: A First Look at the Security Issues in the Model Context Protocol Ecosystem
arXiv:2510.16558
[TSseek] Regular Expression-Based Similarity Search for Distributed Time Series Datasets
arXiv:2606.09824
[TOKI] A Bitemporal Operator Algebra for Contradiction Resolution in LLM-Agent Persistent Memory
arXiv:2606.06240
[Larch] Learned Query Optimization for Semantic Predicates
arXiv:2606.07923
[DataEvolver] Automatic Data Preparation for Large Language Models through Multi-Level Self-Evolving
arXiv:2606.07001
[Bespoke-Card] Why Tune When You Can Generate? Synthesizing Workload-Specific Cardinality Estimators
arXiv:2606.09361
9️⃣ arXiv · Next-generation cloud-native in-memory databases: from Redis to Valkey ⭐⭐⭐⭐⭐ must-read evaluation
arXiv:2510.19805
8️⃣ arXiv · Taming the Titans: survey of efficient LLM inference serving (ACL INLG 2025) ⭐⭐⭐⭐ survey paper
arXiv:2504.19720
7️⃣ ByteHouse · Deep dive into ByteDance's cloud-native data warehouse architecture (arXiv) ⭐⭐⭐⭐ system reproduction
arXiv:2602.08226
7. Academic analysis of the Triton attention kernel (arXiv 2511.11581)
arXiv:2511.11581
6. LLM Research Papers: The 2026 List (Jan–May) — Sebastian Raschka
arXiv:2601.21204
6. LLM Research Papers: The 2026 List (Jan–May) — Sebastian Raschka
arXiv:2602.08071
6. LLM Research Papers: The 2026 List (Jan–May) — Sebastian Raschka
arXiv:2602.15763
6. LLM Research Papers: The 2026 List (Jan–May) — Sebastian Raschka
arXiv:2603.15031
6. LLM Research Papers: The 2026 List (Jan–May) — Sebastian Raschka
arXiv:2603.15569
6. LLM Research Papers: The 2026 List (Jan–May) — Sebastian Raschka
arXiv:2604.12374
5️⃣ arXiv · Is Agentic RAG Worth It? An Experimental Comparison of RAG Approaches (⭐⭐⭐⭐ high priority)
arXiv:2601.07711
4️⃣ arXiv · Securing the Agent: Vendor-Neutral, Multitenant Enterprise Retrieval and Tool Use (⭐⭐⭐⭐ high priority)
arXiv:2605.05287
4️⃣ Tangram · Non-uniform KV cache for multi-turn conversations - arXiv:2606.06302 (⭐⭐⭐⭐ fresh arXiv)
arXiv:2606.06302
4. The End of Software Engineering (arXiv:2606.05608)
arXiv:2606.05608
3️⃣ arXiv · Memanto: Typed Semantic Memory with Information-Theoretic Retrieval for Long-Horizon Agents (⭐⭐⭐⭐ high priority)
arXiv:2604.22085
3️⃣ arXiv · MatryoshkaLoRA (⭐⭐⭐⭐ worth watching)
arXiv:2605.07850
3️⃣ Interpretable latency model for speculative decoding - arXiv:2605.15051 (⭐⭐⭐⭐ must-read for tuning)
arXiv:2605.15051
3. Tutti: making SSD-backed KV cache a production option for long contexts
arXiv:2605.03375
3. Kubernetes for GenAI Inference (arXiv:2602.04900v2)
arXiv:2602.04900
3. Flow-Controlled Scheduling for LLM Inference (arXiv 2604.11001)
arXiv:2604.11001
2️⃣ arXiv · Generating Leakage-Free Benchmarks for Robust RAG Evaluation (⭐⭐⭐⭐⭐ must-read evaluation methodology)
arXiv:2605.08838
2️⃣ Cats · Self-speculative cascaded verification for edge inference - arXiv:2605.11186 (⭐⭐⭐⭐ edge inference focus)
arXiv:2605.11186
2. Performance of the distributed vector database Qdrant on HPC (arXiv 2509.12384, 2025-09, continuously updated)
arXiv:2509.12384
2. Systemic Measurement Bias in LLM Inference Benchmarking
arXiv:2605.24217
2. DualPath: breaking the storage bandwidth bottleneck in agentic LLM inference
arXiv:2602.21548
1️⃣1️⃣ arXiv · RAGCap-Bench: Benchmarking Capabilities of LLMs in Agentic RAG Systems (⭐⭐⭐ reference)
arXiv:2510.13910
1️⃣ arXiv · Memory for Autonomous LLM Agents: Mechanisms, Evaluation, and Open Problems (⭐⭐⭐⭐⭐ must-read survey)
arXiv:2603.07670
16. [arxiv:2004.05074 — Paxos vs Raft: Have we reached consensus on distributed consensus?](https://arxiv.org/abs/2004.05074)
arXiv:2004.05074
13. TTKV: Temporal-Tiered KV Cache (HBM+DRAM tiering)
arXiv:2604.19769
12. SoK: Agentic RAG (arXiv 2603.07379, ACL 2026)
arXiv:2603.07379
11. AgenticRAGTracer (arXiv 2602.19127)
arXiv:2602.19127
10. Cloud Native System for LLM Inference Serving (arXiv 2507.18007)
arXiv:2507.18007
1. vLLM Startup Latency: Six-Step Systematic Characterization
arXiv:2606.07362
1. Data Flow Control (DFC): a kernel-level enforcement framework for AI agent data security policies
arXiv:2606.05679
1. AlphaEval: Evaluating Agents in Production
arXiv:2604.12162