SegmentAnything
- class agentlego.tools.SegmentAnything(sam_model='sam_vit_h_4b8939.pth', device='cuda', toolmeta=None)
A tool that segments all objects in an image.
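- Parameters:
sam_model: The Segment Anything checkpoint to load. Defaults to 'sam_vit_h_4b8939.pth'.
device: The device to run the model on. Defaults to 'cuda'.
toolmeta: Tool meta information that overrides the default tool meta. Defaults to None.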
Default Tool Meta
name: SegmentAnything
description: This tool can segment all items in the image and return a segmentation result image.
inputs:
image (ImageIO)
outputs:
ImageIO: The segmentation result image.
Examples
Download the demo resource
wget http://download.openmmlab.com/agentlego/cups.png
Use the tool directly (without agent)
from agentlego.apis import load_tool
# load tool
tool = load_tool('SegmentAnything', device='cuda')
# apply tool
segmentation = tool('cups.png')
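The example above passes device='cuda', which assumes a GPU is available. As a minimal sketch, the device could be chosen at runtime instead. The helper `pick_device` below is hypothetical (it is not part of AgentLego), and it assumes the tool also accepts 'cpu' as a device string.

```python
def pick_device(preferred="cuda"):
    """Return `preferred` if it looks usable, otherwise fall back to 'cpu'.

    Hypothetical helper for illustration; not part of AgentLego.
    """
    # Non-CUDA choices (e.g. 'cpu') are returned unchanged.
    if not preferred.startswith("cuda"):
        return preferred
    try:
        import torch  # imported lazily so the helper works without PyTorch
    except ImportError:
        return "cpu"
    return preferred if torch.cuda.is_available() else "cpu"


device = pick_device("cuda")
print(device)

# The result can then be passed to the loader shown above:
# tool = load_tool('SegmentAnything', device=device)
```

Running the ViT-H checkpoint on CPU is likely to be much slower than on a GPU, so the fallback is mainly useful for smoke tests.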
With Lagent
from lagent import ReAct, GPTAPI, ActionExecutor
from agentlego.apis import load_tool
# load tools and build agent
# Set the `OPENAI_API_KEY` environment variable before running.
tool = load_tool('SegmentAnything', device='cuda').to_lagent()
agent = ReAct(GPTAPI(temperature=0.), action_executor=ActionExecutor([tool]))
# agent running with the tool.
ret = agent.chat('Please segment the image `cups.png`.')
for step in ret.inner_steps[1:]:
    print('------')
    print(step['content'])
Set up
Before using the tool, install the required dependencies with the following command.
pip install segment_anything
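To confirm that the installation worked before loading the tool, a quick import check can help. The following is a minimal sketch using only the standard library; the function name `missing_dependencies` is illustrative and not part of AgentLego.

```python
import importlib.util


def missing_dependencies(modules):
    """Return the top-level module names in `modules` that cannot be found."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # `segment_anything` is the import name of the package installed above.
    missing = missing_dependencies(["segment_anything"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

The check only verifies that the package is importable; it does not download or validate the model checkpoint.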
Reference
This tool uses a Segment Anything model. See the following paper for details.
@misc{kirillov2023segment,
  title={Segment Anything},
  author={Alexander Kirillov and Eric Mintun and Nikhila Ravi and Hanzi Mao and Chloe Rolland and Laura Gustafson and Tete Xiao and Spencer Whitehead and Alexander C. Berg and Wan-Yen Lo and Piotr Dollár and Ross Girshick},
  year={2023},
  eprint={2304.02643},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}