Shortcuts

TextToImage

class agentlego.tools.TextToImage(model='sd', device='cuda', toolmeta=None)[source]

A tool to generate image according to some keywords.

Parameters:
  • model (str) – The stable diffusion model to use. You can choose from “sd” and “sdxl”. Defaults to “sd”.

  • device (str) – The device to load the model. Defaults to ‘cuda’.

  • toolmeta (None | dict | ToolMeta) – The additional info of the tool. Defaults to None.

Default Tool Meta

  • name: TextToImage

  • description: This tool can generate an image according to the input text.

  • inputs:

    • keywords (str): A series of English keywords separated by comma.

  • outputs:

    • ImageIO

Examples

Use the tool directly (without agent)

from agentlego.apis import load_tool

# load tool
tool = load_tool('TextToImage', device='cuda')

# apply tool
image = tool('cute cat')
print(image)

With Lagent

from lagent import ReAct, GPTAPI, ActionExecutor
from agentlego.apis import load_tool

# load tools and build agent
# please set `OPENAI_API_KEY` in your environment variable.
tool = load_tool('TextToImage', device='cuda').to_lagent()
agent = ReAct(GPTAPI(temperature=0.), action_executor=ActionExecutor([tool]))

# agent running with the tool.
ret = agent.chat(f'Please generate an image of cute cat.')
for step in ret.inner_steps[1:]:
    print('------')
    print(step['content'])

With Transformers Agent

from transformers import HfAgent
from agentlego.apis import load_tool

# load tools and build transformers agent
tool = load_tool('TextToImage', device='cuda').to_transformers_agent()
agent = HfAgent('https://api-inference.huggingface.co/models/bigcode/starcoder', additional_tools=[tool])

# agent running with the tool (For demo, we directly specify the tool name here.)
image = agent.run(f'Use the tool `{tool.name}` to generate an image of cat.')
print(image)

Set up

Before using the tool, please confirm you have installed the related dependencies by the below commands.

pip install -U diffusers

Reference

This tool uses a Stable Diffusion model in default settings. See the following paper for details.

@InProceedings{Rombach_2022_CVPR,
    author    = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn},
    title     = {High-Resolution Image Synthesis With Latent Diffusion Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {10684-10695}
}