SemanticSegmentation

class agentlego.tools.SemanticSegmentation(seg_model='mask2former_r50_8xb2-90k_cityscapes-512x1024', device='cuda', toolmeta=None)

A tool that performs semantic segmentation on an image.

Parameters:
  • seg_model (str) – The name of the model used for inference. Available model names can be found in the MMSegmentation repository. Defaults to mask2former_r50_8xb2-90k_cityscapes-512x1024.

  • device (str) – The device on which to load the model. Defaults to 'cuda'.

  • toolmeta (None | dict | ToolMeta) – Additional meta information about the tool. Defaults to None.

Default Tool Meta

  • name: SemanticSegmentation

  • description: This tool can segment all items in the input image and return a segmentation result image. It focus on urban scene images.

  • inputs:

    • image (ImageIO)

  • outputs:

    • ImageIO

Examples

Download the demo resource

wget http://download.openmmlab.com/agentlego/road.jpg

Use the tool directly (without agent)

from agentlego.apis import load_tool

# load tool
tool = load_tool('SemanticSegmentation', device='cuda')

# apply tool
segmentation = tool('road.jpg')
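
The example above hard-codes device='cuda', which may fail on a machine without a usable GPU. The following is a minimal sketch of a CPU fallback. The helper name pick_device is hypothetical and not part of AgentLego, and the sketch assumes the model runs on PyTorch, as MMSegmentation models do.

```python
def pick_device(preferred: str = 'cuda') -> str:
    """Return `preferred` if CUDA is usable, otherwise fall back to 'cpu'.

    Hypothetical helper for illustration; not part of AgentLego.
    """
    # Non-CUDA requests (e.g. 'cpu') are passed through unchanged.
    if not preferred.startswith('cuda'):
        return preferred
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no CUDA runtime to query.
        return 'cpu'
    return preferred if torch.cuda.is_available() else 'cpu'


device = pick_device()
print(device)

# The result can then be passed to the loader shown above, e.g.:
# tool = load_tool('SemanticSegmentation', device=device)
```

Inference on the CPU is likely to be much slower than on a GPU, so the fallback is mainly useful for quick checks.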

With Lagent

from lagent import ReAct, GPTAPI, ActionExecutor
from agentlego.apis import load_tool

# load tools and build agent
# Please set the `OPENAI_API_KEY` environment variable.
tool = load_tool('SemanticSegmentation', device='cuda').to_lagent()
agent = ReAct(GPTAPI(temperature=0.), action_executor=ActionExecutor([tool]))

# Run the agent with the tool.
ret = agent.chat('Please segment the city scene image at `road.jpg`')
for step in ret.inner_steps[1:]:
    print('------')
    print(step['content'])

Set up

Before using the tool, make sure the required dependencies are installed by running the commands below.

pip install openmim
mim install mmsegmentation
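
To confirm that the installation succeeded before loading the tool, a quick importability check can help. The sketch below is illustrative only: the helper name missing_modules is hypothetical, and it checks only the top-level import names agentlego and mmseg (the import name of the MMSegmentation package), not their versions or transitive dependencies.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found.

    Hypothetical helper for illustration; `find_spec` locates a top-level
    module without importing it.
    """
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(['agentlego', 'mmseg'])
if missing:
    print('Missing modules:', ', '.join(missing))
else:
    print('All required modules are importable.')
```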

Reference

This tool uses a Mask2Former model by default. See the following paper for details.

@inproceedings{cheng2021mask2former,
  title={Masked-attention Mask Transformer for Universal Image Segmentation},
  author={Bowen Cheng and Ishan Misra and Alexander G. Schwing and Alexander Kirillov and Rohit Girdhar},
  booktitle={CVPR},
  year={2022}
}