Tool Agents: Empowering LLMs to Use Tools and Explore Environments

Published on 2026-04-08 in Blog

A comprehensive deep dive into Tool Agents: from Toolken vocabulary injection and CodeAct execution, to DocPrompting, Toolformer self-learning, visual Set-of-Mark grounding, and autonomous environment exploration.

Coding Agents: Evaluation, Frameworks, and Code LLMs

Published on 2026-04-08 in Blog

A comprehensive deep dive into Coding Agents, detailing fine-grained evaluation benchmarks (SWE-bench, LiveCodeBench), agentic frameworks (SWE-agent vs. Agentless), and the sophisticated mechanisms of code localization, code efficiency, and LLM safety.

When LLMs Learn Memory, Reasoning, and Planning: The Three Core Capabilities of Language Agents

Published on 2026-03-12 in Blog

Starts from the definition of Agents and Language Agents and their three generations, then covers memory (episodic/semantic/procedural, RAG, HippoRAG), reasoning (ReAct, interleaved with action), and planning (reactive, tree search, world models, WebDreamer), ending with a unified picture and the Bitter Lesson.

LLM Reasoning: Prompting, Multi-Path Search, and Iterative Self-Improvement

Published on 2026-03-08 in Blog

From CoT and analogical prompting to self-consistency, ORM/PRM verification, tree-of-thoughts, multi-round self-reflection and token budget allocation, with the Bitter Lesson in mind.

RLHF and Test-Time Compute: Reinforcement Learning and Inference-Time Optimization for LLMs

Published on 2026-03-08 in Blog

From reward design, policy gradient, and PPO to RLHF/RLVR, then inference-time sampling and verification, Archon architecture search, and when to use RL vs test-time scaling.

LLM Basics: Pretraining, Prompting, Fine-tuning and Reinforcement Learning

Published on 2026-03-08 in Blog

An overview of core methods for training and using large language models: compute and scaling, prompting, fine-tuning, and reinforcement learning.

Weile Luo
