In the bustling world of robots, MIT’s Improbable AI Lab has crafted a dynamic trio of models, collectively known as Compositional Foundation Models for Hierarchical Planning (HiP), aimed at elevating robotic decision-making for everyday tasks. While humans effortlessly glide through chores, robots often struggle with the intricate planning involved in each step. HiP, inspired by the prowess of models like OpenAI’s GPT-4, takes a distinctive approach by employing three distinct foundation models, each trained on different data modalities—language, vision, and action. This innovative trio, operating in a hierarchical structure, not only simplifies the planning process but also enhances transparency and adaptability in the face of new information. HiP’s potential spans household chores to complex construction tasks, making it a formidable ally in the world of AI-driven robotics.
Photo by Getty Images
Credits: Alex Shipps, MIT News