Agents and Tools in Hugging Face Transformers
This documentation introduces agents: systems that use an LLM as their engine and give it access to external functions called tools. Tools let an agent handle tasks that a standalone LLM does poorly, such as logic, calculation, and web search.
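The engine-plus-tools idea can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the Transformers API: `TOOLBOX`, `register_tool`, `scripted_engine`, and `run_agent` are hypothetical names, and the "engine" is a hard-coded stand-in for an LLM so that the example runs without a model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toolbox: maps a tool name to a plain Python callable.
TOOLBOX: Dict[str, Callable[[str], str]] = {}


def register_tool(func: Callable[[str], str]) -> Callable[[str], str]:
    """Hypothetical decorator: add a function to the toolbox under its own name."""
    TOOLBOX[func.__name__] = func
    return func


@register_tool
def calculator(expression: str) -> str:
    """Add the integers in an expression such as '2 + 40'."""
    return str(sum(int(part) for part in expression.split("+")))


def scripted_engine(task: str, observations: List[str]) -> Tuple[str, str]:
    """Stand-in for an LLM (assumption): returns (tool_name, argument),
    or ("final_answer", text) once it has an observation to report."""
    if not observations:
        return "calculator", "2 + 40"
    return "final_answer", observations[-1]


def run_agent(task: str, engine=scripted_engine, max_steps: int = 5) -> str:
    """Ask the engine for one action at a time and feed tool results back to it."""
    observations: List[str] = []
    for _ in range(max_steps):
        action, argument = engine(task, observations)
        if action == "final_answer":
            return argument
        # Call the chosen tool and record what it returned.
        observations.append(TOOLBOX[action](argument))
    raise RuntimeError("agent did not finish within max_steps")


if __name__ == "__main__":
    print(run_agent("What is 2 + 40?"))
```

A real agent would replace `scripted_engine` with a call to an LLM that chooses the tool and its argument from the task and the observations so far.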
Key Features:
- Tool-Augmented LLMs: Pair an LLM with specialized tools so it can complete tasks it could not handle on its own.
- CodeAgent: Plans first, then executes all of its actions at once; recommended for multimodal tasks.
- ReAct Agents: Plan and execute actions one step at a time, using the observation from each step to inform the next.
- Customizable System Prompts: Tailor agent behavior by editing the system prompt.
- Toolbox Management: Easily add, update, and manage tools in the agent's toolbox.
- Default Toolset: Includes tools for document question answering, image question answering, speech-to-text, text-to-speech, translation, web search, and Python code interpretation.
- Custom Tool Creation: Define your own tools with a simple decorator-based format.
- Code Execution Safety: Code runs in a secured Python interpreter that permits only authorized imports.
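To make the "authorized imports" idea concrete, here is a minimal sketch of an import allow-list check built on the standard-library `ast` module. How the library itself enforces this is not described here, so treat `find_unauthorized_imports` as a hypothetical helper written for illustration, not as the library's implementation.

```python
import ast
from typing import Iterable, List


def find_unauthorized_imports(source: str, authorized: Iterable[str]) -> List[str]:
    """Return the modules imported by `source` whose top-level package
    is not in the `authorized` allow-list (hypothetical helper)."""
    allowed = set(authorized)
    rejected: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have no module name; they are treated as "" and rejected.
            modules = [node.module or ""]
        else:
            continue
        for module in modules:
            # Compare only the top-level package, so "os.path" is checked as "os".
            if module.split(".")[0] not in allowed:
                rejected.append(module)
    return rejected


if __name__ == "__main__":
    snippet = "import math\nimport os.path\nfrom subprocess import run\n"
    print(find_unauthorized_imports(snippet, authorized=["math"]))
```

This is a static check only: it inspects `import` statements before anything runs and does not catch dynamic imports such as `__import__("os")`, so it is not a sandbox by itself.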
Use Cases:
- Building AI systems capable of complex reasoning and task execution.
- Creating multimodal applications that combine language, vision, and audio.
- Automating tasks that require external data access or computation.
- Developing custom tools for specific domains and use cases.