ImageExpansion
- class agentlego.tools.ImageExpansion(caption_model='blip-base_3rdparty_caption', device='cuda', toolmeta=None)
A tool to expand the given image.
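- Parameters:
caption_model (str): name of the image captioning model. Defaults to 'blip-base_3rdparty_caption'.
device (str): device on which the models run. Defaults to 'cuda'.
toolmeta: optional tool meta used in place of the default one below. Defaults to None.
Default Tool Meta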
name: ImageExpansion
description: This tool can expand the peripheral area of an image based on its content, thus obtaining a larger image.
inputs:
image (ImageIO)
scale (str): expansion ratio, given as either a single float or two floats for the width and height ratios.
outputs:
ImageIO
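The `scale` input is a string, so it has to be turned into numbers before a target size can be computed. The sketch below shows one way to do that and to lay out the larger canvas. It is not AgentLego's implementation: `parse_scale`, `expanded_size`, `centered_box` and `outpaint_mask` are hypothetical helpers, and the comma or whitespace separator between two ratios, as well as centering the original image on the canvas, are assumptions.

```python
from typing import List, Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def parse_scale(scale: str) -> Tuple[float, float]:
    """Hypothetical helper: scale string -> (width_ratio, height_ratio)."""
    # Assumption: two ratios are separated by a comma and/or whitespace.
    parts = scale.replace(',', ' ').split()
    if len(parts) not in (1, 2):
        raise ValueError(f'expected one or two numbers, got {scale!r}')
    ratios = [float(p) for p in parts]
    # A single ratio applies to both width and height.
    return ratios[0], ratios[-1]


def expanded_size(size: Size, scale: str) -> Size:
    """Target canvas size after applying the parsed ratios."""
    w_ratio, h_ratio = parse_scale(scale)
    return round(size[0] * w_ratio), round(size[1] * h_ratio)


def centered_box(src: Size, dst: Size) -> Tuple[int, int, int, int]:
    """(left, top, right, bottom) of the original image centered on the canvas."""
    left = (dst[0] - src[0]) // 2
    top = (dst[1] - src[1]) // 2
    return left, top, left + src[0], top + src[1]


def outpaint_mask(src: Size, dst: Size) -> List[List[int]]:
    """Mask rows: 255 where new content is needed, 0 over the original image."""
    left, top, right, bottom = centered_box(src, dst)
    return [
        [0 if left <= x < right and top <= y < bottom else 255
         for x in range(dst[0])]
        for y in range(dst[1])
    ]


target = expanded_size((512, 384), '1.5, 1.25')
print(target)                            # (768, 480)
print(centered_box((512, 384), target))  # (128, 48, 640, 432)
```

The mask follows the convention used by the inpainting pipelines in diffusers, where white pixels are repainted and black pixels are kept.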
Examples
Use the tool directly (without agent)
from agentlego.apis import load_tool
# load tool
tool = load_tool('ImageExpansion', device='cuda')
# apply tool
image = tool('examples/demo.png', '1.25')
print(image)
With Lagent
from lagent import ReAct, GPTAPI, ActionExecutor
from agentlego.apis import load_tool
# load tools and build agent
# set `OPENAI_API_KEY` in your environment variables before running.
tool = load_tool('ImageExpansion', device='cuda').to_lagent()
agent = ReAct(GPTAPI(temperature=0.), action_executor=ActionExecutor([tool]))
# run the agent with the tool.
img_path = 'examples/demo.png'
ret = agent.chat(f'According to the image `{img_path}`, expand its size to 1.25 times')
for step in ret.inner_steps[1:]:
    print('------')
    print(step['content'])
Set up
Before using this tool, make sure the required dependencies are installed by running the following commands.
pip install -U diffusers
pip install -U openmim
mim install -U mmpretrain
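To confirm that the installation worked, a quick check like the one below may help. It is a minimal sketch that uses only the standard library; `missing_modules` is a hypothetical helper, and the import names assume that `openmim` is imported as `mim` while the other two packages are imported under their package names.

```python
from importlib.util import find_spec

# Import names for the packages installed above (openmim is imported as `mim`).
REQUIRED_MODULES = ('diffusers', 'mim', 'mmpretrain')


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    print('Missing modules:', ', '.join(missing))
else:
    print('All ImageExpansion dependencies were found.')
```

The check only looks the modules up without importing them, so it runs quickly and does not need a GPU.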
Reference
By default, this tool uses BLIP and Stable Diffusion. See the following papers for details.
@inproceedings{li2022blip,
title={BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation},
author={Junnan Li and Dongxu Li and Caiming Xiong and Steven Hoi},
year={2022},
booktitle={ICML},
}
@InProceedings{Rombach_2022_CVPR,
author = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn},
title = {High-Resolution Image Synthesis With Latent Diffusion Models},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2022},
pages = {10684-10695}
}