SemanticSegmentation¶
- class agentlego.tools.SemanticSegmentation(seg_model='mask2former_r50_8xb2-90k_cityscapes-512x1024', device='cuda', toolmeta=None)[source]
A tool to conduct semantic segmentation on an image.
- Parameters:
seg_model (str) – The name of the model used for inference. Available model names can be found in the MMSegmentation repository. Defaults to mask2former_r50_8xb2-90k_cityscapes-512x1024.
device (str) – The device to load the model on. Defaults to 'cuda'.
toolmeta (None | dict | ToolMeta) – The additional info of the tool. Defaults to None.
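Because `device` defaults to 'cuda', loading the tool will typically fail on a machine without a usable GPU. The following is a minimal sketch of picking a device before loading the tool, assuming PyTorch is the backend. `pick_device` is a hypothetical helper written for this illustration and is not part of AgentLego.

```python
import importlib.util


def pick_device(preferred: str = 'cuda') -> str:
    """Return `preferred` if CUDA looks usable, otherwise fall back to 'cpu'.

    Hypothetical helper for illustration; not part of AgentLego.
    """
    # Non-CUDA requests (e.g. 'cpu') are passed through unchanged.
    if not preferred.startswith('cuda'):
        return preferred
    # torch may be absent in a bare environment; treat that as "no GPU".
    if importlib.util.find_spec('torch') is None:
        return 'cpu'
    import torch
    return preferred if torch.cuda.is_available() else 'cpu'


device = pick_device()
print(device)
```

The returned string can then be passed on as the `device` argument, for example `load_tool('SemanticSegmentation', device=device)`.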
Default Tool Meta¶
Name: SemanticSegmentation
Description: This tool can segment all items in the input image and return a segmentation result image. It focus on urban scene images.
Inputs:
image (ImageIO)
Outputs:
ImageIO
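To make "segmentation result image" more concrete: a semantic segmentation model assigns a class ID to every pixel, and a result image is commonly produced by mapping each ID to a fixed color. The sketch below illustrates that idea on a tiny 3x3 grid using a few entries of the standard Cityscapes palette. It is an illustration only and an assumption about the general technique; how this tool actually renders its output is not described here.

```python
# Illustration only: maps a tiny grid of per-pixel class IDs to RGB colors.
# IDs and colors follow the standard Cityscapes train-ID palette (subset).
CITYSCAPES_COLORS = {
    0: (128, 64, 128),   # road
    1: (244, 35, 232),   # sidewalk
    2: (70, 70, 70),     # building
    13: (0, 0, 142),     # car
}


def colorize(label_map):
    """Turn a 2-D list of class IDs into a 2-D list of RGB tuples."""
    unknown = (0, 0, 0)  # classes missing from the subset are drawn black
    return [[CITYSCAPES_COLORS.get(c, unknown) for c in row]
            for row in label_map]


labels = [
    [2, 2, 2],    # buildings along the top
    [1, 13, 0],   # sidewalk, a car, road
    [0, 0, 0],    # road along the bottom
]
image = colorize(labels)
print(image[1][1])  # the "car" pixel
```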
Examples¶
Download the demo resource
wget http://download.openmmlab.com/agentlego/road.jpg
Use the tool directly (without agent)
from agentlego.apis import load_tool
# load tool
tool = load_tool('SemanticSegmentation', device='cuda')
# apply tool
segmentation = tool('road.jpg')
With Lagent
from lagent import ReAct, GPTAPI, ActionExecutor
from agentlego.apis import load_tool
# load tools and build agent
# please set `OPENAI_API_KEY` in your environment variable.
tool = load_tool('SemanticSegmentation', device='cuda').to_lagent()
agent = ReAct(GPTAPI(temperature=0.), action_executor=ActionExecutor([tool]))
# agent running with the tool.
ret = agent.chat('Please segment the city scene image at `road.jpg`')
for step in ret.inner_steps[1:]:
    print('------')
    print(step['content'])
Set up¶
Before using the tool, make sure the required dependencies are installed by running the commands below.
pip install openmim
mim install mmsegmentation
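After installation, a quick way to confirm that the packages are visible to the current Python environment is to look up their import names. This is a minimal sketch using only the standard library; note that the import names differ from the pip names (`mmsegmentation` is imported as `mmseg`, `openmim` as `mim`). `missing_modules` is a hypothetical helper, not part of any of these packages.

```python
import importlib.util

# Import names (not pip names) of the packages this tool relies on.
REQUIRED = ['agentlego', 'mim', 'mmseg']


def missing_modules(names):
    """Return the subset of `names` that cannot be found for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


missing = missing_modules(REQUIRED)
if missing:
    print('Missing modules:', ', '.join(missing))
else:
    print('All dependencies found.')
```

This only checks that the modules can be located; it does not verify version compatibility between them.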
Reference¶
This tool uses a Mask2Former model by default. See the following paper for details.
@inproceedings{cheng2021mask2former,
  title={Masked-attention Mask Transformer for Universal Image Segmentation},
  author={Bowen Cheng and Ishan Misra and Alexander G. Schwing and Alexander Kirillov and Rohit Girdhar},
  booktitle={CVPR},
  year={2022}
}