{"id":4336,"date":"2025-08-15T15:02:10","date_gmt":"2025-08-15T07:02:10","guid":{"rendered":"https:\/\/www.rzautoassembly.com\/?p=4336"},"modified":"2025-08-15T15:02:10","modified_gmt":"2025-08-15T07:02:10","slug":"the-first-open-source-platform-for-robot-world-models-released-launched-by-zhiyuan-robotics","status":"publish","type":"post","link":"https:\/\/www.rzautoassembly.com\/sk\/the-first-open-source-platform-for-robot-world-models-released-launched-by-zhiyuan-robotics\/","title":{"rendered":"The first open-source platform for robot world models released! Launched by Zhiyuan Robotics"},"content":{"rendered":"<figure id=\"attachment_4337\" aria-describedby=\"caption-attachment-4337\" style=\"width: 300px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.rzautoassembly.com\/sk\/product\/epson-robot\/\"><img fetchpriority=\"high\" decoding=\"async\" class=\"size-medium wp-image-4337\" src=\"https:\/\/www.rzautoassembly.com\/wp-content\/uploads\/2025\/07\/Ne\u0161tandardn\u00e9 automatiza\u010dn\u00e9 zariadenia reklama kreativita-111-1.png\" alt=\"\" width=\"300\" height=\"217\" srcset=\"\" sizes=\"(max-width: 300px) 100vw, 300px\" data-srcset=\"\" \/><\/a><figcaption id=\"caption-attachment-4337\" class=\"wp-caption-text\">Auto parts assembly machine<\/figcaption><\/figure>\n<p><span style=\"font-size: 14pt;\"><strong>Genie Envisioner (GE) Platform Achieves End-to-End \u201cPerception-Decision-Execution\u201d Reasoning for Robots<\/strong><\/span><\/p>\n<p>&nbsp;<\/p>\n<p>The Genie Envisioner (GE) platform integrates future-frame prediction, policy learning, and simulation-based evaluation into a closed-loop architecture centered on video generation. For the first time, a robot can reason end to end, from perception through decision-making to execution, within a single world model.<\/p>\n<p><span style=\"font-size: 14pt;\"><strong>Zhiyuan Robotics Unveils GE 
Platform with Full Open-Source Commitment<\/strong><\/span><\/p>\n<p>&nbsp;<\/p>\n<p>Zhiyuan Robotics recently launched Genie Envisioner (GE), the industry\u2019s first unified world model platform for real-world robot manipulation, and announced that it will open-source all code, pre-trained models, and evaluation tools.<\/p>\n<p>&nbsp;<\/p>\n<p><strong><span style=\"font-size: 14pt;\">GE Platform Architecture and Its Application Value in Precision Manufacturing<\/span><\/strong><\/p>\n<p>&nbsp;<\/p>\n<p>The platform integrates future-frame prediction, policy learning, and simulation-based evaluation into a closed-loop architecture centered on video generation, enabling, for the first time, end-to-end reasoning from perception through decision-making to execution within a single world model. This ability to model physical interactions accurately has broad application prospects in precision manufacturing. For example, <span style=\"color: #00ccff;\"><a style=\"color: #00ccff;\" href=\"https:\/\/www.rzautoassembly.com\/zh\/products\/biological-indicator-assembly-machine\/\"><u>biological indicator assembly machines<\/u><\/a><\/span> use a similar visual-action closed-loop control logic to dock microbial carriers and reaction tubes precisely under sterile conditions. Their sub-millimeter assembly accuracy and the GE platform\u2019s spatiotemporal dynamic modeling are together pushing automated systems toward integrated \u201cperception-decision-execution\u201d.<\/p>\n<p>&nbsp;<\/p>\n<p><strong><span style=\"font-size: 14pt;\">Limitations of Traditional Robot Learning Systems<\/span><\/strong><\/p>\n<p>&nbsp;<\/p>\n<p>Traditional robot learning systems generally follow a staged development pipeline of data collection, model training, and policy evaluation. 
Each stage is independent and relies on task-specific tuning, which makes development complex and iteration cycles long.<\/p>\n<p>&nbsp;<\/p>\n<p><strong><span style=\"font-size: 14pt;\">Technical Foundation for Overcoming the Fragmented Architecture<\/span><\/strong><\/p>\n<p>&nbsp;<\/p>\n<p>The GE platform overcomes this fragmented-architecture bottleneck by building a unified world model based on video generation. Built on approximately 3,000 hours of real robot manipulation video (more than 1 million real-robot data records), the platform establishes a direct mapping from language instructions to visual space and fully retains the spatiotemporal dynamics of the robot\u2019s interaction with its environment.<\/p>\n<p>&nbsp;<\/p>\n<p><strong><span style=\"font-size: 14pt;\">Vision-Centered World Modeling Paradigm and Performance Improvements<\/span><\/strong><\/p>\n<p>&nbsp;<\/p>\n<p>The core breakthrough is a vision-centered world modeling paradigm. Unlike mainstream VLA (Vision-Language-Action) methods, which rely on language abstraction, GE models the interaction dynamics between the robot and its environment directly in visual space, which lets it capture physical laws accurately. This paradigm brings significant performance improvements:<\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-size: 14pt;\"><strong>Improved cross-platform generalization efficiency<\/strong><\/span>: On new robot platforms such as Agilex Cobot Magic, the GE-Act action model performs tasks with high quality using only 1 hour (about 250 demonstrations) of teleoperation data, outperforming the \u03c00 and GR00T models, which require large-scale multi-embodiment pre-training;<\/p>\n<p>&nbsp;<\/p>\n<p>Breakthrough in long-horizon task execution: On continuous tasks of more than 10 steps, such as folding cartons, GE-Act reaches a success rate of 76% (\u03c00: 48%; UniVLA\/GR00T: 0%). 
This is mainly attributed to the explicit modeling of spatiotemporal evolution in visual space and to the design of the sparse memory module.<\/p>\n<p><strong><span style=\"font-size: 14pt;\">The technical architecture consists of three collaborative components:<\/span><\/strong><\/p>\n<p>&nbsp;<\/p>\n<p>GE-Base multi-view video base model: It adopts an autoregressive video generation framework, maintains spatial consistency across three camera views (the head and the wrists of both arms), and uses a sparse memory mechanism to strengthen long-horizon reasoning. Training has two stages: multi-resolution temporal adaptation training at 3-30Hz to improve motion robustness, followed by policy-alignment fine-tuning at a fixed 5Hz sampling rate;<\/p>\n<p>&nbsp;<\/p>\n<p>GE-Act parallel flow-matching action model: A lightweight 160M-parameter architecture converts visual representations into control commands through a cross-attention mechanism and uses \u201cslow-fast\u201d asynchronous inference (video DiT at 5Hz, action model at 30Hz), producing 54 action steps in 200 milliseconds on an RTX 4090 GPU;<\/p>\n<p>&nbsp;<\/p>\n<p>GE-Sim hierarchical action-conditioned simulator: Using Pose2Image conditioning and motion vector encoding, it converts control commands into accurate visual predictions, supports closed-loop policy evaluation and data generation, and can complete thousands of policy rollouts per hour.<\/p>\n<p><strong><span style=\"font-size: 14pt;\">EWMBench Evaluation Suite for Quantifying World Model Quality<\/span><\/strong><\/p>\n<p>To quantify world model quality, the team also released the EWMBench evaluation suite, which assesses modeling ability along the dimensions of scene consistency and trajectory accuracy. In comparisons with models such as Kling and OpenSora, GE-Base leads on key metrics and is highly consistent with human judgment. 
The platform\u2019s project homepage, papers, and code repository are now publicly available, promoting the evolution of embodied intelligence from \u201cpassive execution\u201d to an \u201cimagination-verification-action\u201d paradigm.<\/p>\n<p><span style=\"color: #00ccff;\"><a style=\"color: #00ccff;\" href=\"https:\/\/www.rzautoassembly.com\/sk\/injection-molded-parts-automated-assembly-system-with-auto-loading\/\">Index assembly machine<\/a><\/span><br \/>\n<span style=\"color: #00ccff;\"><a style=\"color: #00ccff;\" href=\"https:\/\/www.rzautoassembly.com\/sk\/products\/\">Intelligent index assembly robot<\/a><\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>Genie Envisioner (GE) Platform Achieves End-to-End Reasoning for Robots\u2019 \u201cPerception-Decision-Execution\u201d \u00a0 The Genie Envisioner (GE) platform innovatively integrates future frame prediction, strategy learning, and simulation evaluation into a closed-loop architecture centered on video generation, achieving for the first time an end-to-end reasoning process for robots to complete from perception to decision-making and then to execution 
[\u2026]<\/p>","protected":false},"author":1,"featured_media":4338,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1,124],"tags":[],"class_list":["post-4336","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news","category-technology"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.rzautoassembly.com\/sk\/wp-json\/wp\/v2\/posts\/4336","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.rzautoassembly.com\/sk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.rzautoassembly.com\/sk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rzautoassembly.com\/sk\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.rzautoassembly.com\/sk\/wp-json\/wp\/v2\/comments?post=4336"}],"version-history":[{"count":0,"href":"https:\/\/www.rzautoassembly.com\/sk\/wp-json\/wp\/v2\/posts\/4336\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.rzautoassembly.com\/sk\/wp-json\/wp\/v2\/media\/4338"}],"wp:attachment":[{"href":"https:\/\/www.rzautoassembly.com\/sk\/wp-json\/wp\/v2\/media?parent=4336"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.rzautoassembly.com\/sk\/wp-json\/wp\/v2\/categories?post=4336"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.rzautoassembly.com\/sk\/wp-json\/wp\/v2\/tags?post=4336"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}