(a) The task environment includes a camera for real-time top-view capture, a dual-arm robot, one or more tools, and a blue block to be moved to the target location. (b) The architecture of our system: a symbolic task planner uses an LLM to convert unstructured input data into a subtask list, and a manoeuvrability-driven planner computes the tool's manoeuvrability and generates an affordance-oriented motion and path. (c) Execution of the result produced by the system: the arms of the dual-arm robot take turns pushing the blue block from one side to the other through collaboration.
Experiment Setup