ImageToDepth
- class agentlego.tools.ImageToDepth(device='cuda', toolmeta=None)
A tool to estimate the depth of an image.
Default Tool Meta
name: ImageToDepth
description: This tool can generate a depth image from an input image.
inputs:
image (ImageIO)
outputs:
ImageIO
Examples
Use the tool directly (without an agent)
from agentlego.apis import load_tool
# load tool
tool = load_tool('ImageToDepth', device='cuda')
# apply tool
depth = tool('examples/demo.png')
print(depth)
With Lagent
from lagent import ReAct, GPTAPI, ActionExecutor
from agentlego.apis import load_tool
# load tools and build agent
# Set the `OPENAI_API_KEY` environment variable before running.
tool = load_tool('ImageToDepth', device='cuda').to_lagent()
agent = ReAct(GPTAPI(temperature=0.), action_executor=ActionExecutor([tool]))
# run the agent with the tool
img_path = 'examples/demo.png'
ret = agent.chat(f'Please estimate the depth of the image `{img_path}`')
for step in ret.inner_steps[1:]:
    print('------')
    print(step['content'])
Set up
Before using the tool, install its dependencies with the following command.
pip install -U transformers
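To confirm the dependency is importable before loading the tool, a small check along these lines may help. This is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of agentlego.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]


# `transformers` is the only dependency named in the install command above.
missing = missing_packages(['transformers'])
if missing:
    print('Missing packages:', ', '.join(missing))
else:
    print('All required packages are importable.')
```

If the check reports a missing package, rerun the install command in the same Python environment that will run the tool.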
Reference
By default, this tool uses a DPT model. See the following paper for details.
@article{DBLP:journals/corr/abs-2103-13413,
author = {Ren{\'{e}} Ranftl and
Alexey Bochkovskiy and
Vladlen Koltun},
title = {Vision Transformers for Dense Prediction},
journal = {CoRR},
volume = {abs/2103.13413},
year = {2021},
url = {https://arxiv.org/abs/2103.13413},
eprinttype = {arXiv},
eprint = {2103.13413},
timestamp = {Wed, 07 Apr 2021 15:31:46 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2103-13413.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
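A dense-prediction model of this kind produces one value per pixel. The sketch below shows one common way such a map could be scaled into an 8-bit grayscale depth image, using min-max normalization. It is not taken from the agentlego source: the function name `depth_to_uint8` and the sample array are hypothetical, and NumPy is assumed to be installed.

```python
import numpy as np


def depth_to_uint8(depth):
    """Min-max scale a float depth map to the 0-255 range of an 8-bit image."""
    depth = np.asarray(depth, dtype=np.float64)
    lo, hi = depth.min(), depth.max()
    if hi == lo:
        # A constant map has no contrast; return zeros rather than divide by zero.
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - lo) / (hi - lo)
    return np.round(scaled * 255).astype(np.uint8)


# Hypothetical 2x3 prediction standing in for real model output.
fake_prediction = np.array([[0.5, 1.0, 1.5], [2.0, 2.5, 3.0]])
gray = depth_to_uint8(fake_prediction)
print(gray)
```

Min-max scaling keeps only the relative ordering of values within one image, so the resulting gray levels are not comparable across different images.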