AI Agents

TODO

Multi-agent system

LLM agents can interact with each other in a collaborative or competitive manner. This enables them to achieve adcancement through teamwork or adversarial interactions.

In these system, agents can work together to…

Human-Agent Cooperation

LLM agents can interact with humans, providing them with assistance and performing tasks more efficiently and safely

Humans to verify and correct agent

Agent-to-Agent (A2A) Protocol

  • A protocol enabling standardized communication across agents
    • Support agents using different frameworks to communicate
    • A client (local) agent can discover agents by fetching “Agent Card” of available remote agents
    • And then delegate a task to the chosen remote agent
  • Supports streaming and asynchronous push notifications for long tasks

What is LLM Agents

Planning: (TODO)

MCP (Model Context Protocol)

  • Connecting (N) LLMs to (M) external tools/resources to be a NxM problem
  • MCP standardizes the LLM-tool communication into N→1→M process
  • Build with a client-server model
    • MCP client: the agent that needs to call tool/data
    • MCP server: a service to expose external tools and data sources

AI Agent/Workflow Frameworks

  • Frameworks initially proposed to standardize AI workflows, provide some out-of-box design patterns and abstractions
  • Some examples
    • LangChain
    • LlamaIndex: Good RAG support
    • CrewAI and Camel: multi-agent framework for more complex tasks
  • But a lot of necessary, adding complexity for agents, harder to customize
  • Prof opinion for most tasks today:
    • No framework (pure Python)
    • No MCP (can just write your own functions or hooks)
      • if low amount of tools
    • No A2A (no need for multi-agents)
    • Why?
      • LLMs are so good that you don’t need human made processes
      • If you give LLMs enough tools, it’s usually sufficient
      • Focus on prompts, templates from open source

What is AI Agent Infra?

  • Agent testing and evaluation
    • Unit + e2e test, metrics, benchmarks, human-in-the-loop
  • Agent autotuning and optimization
    • Automated prompt tuning, model selection, tool selection, workflow optimization
  • Agent hosting
    • serverless or long-running?
    • stateful or stateless
  • tooling, memory and data
  • monitoring and observability