Zhiyuan GE-Sim 2.0: generating worlds with a World Model, Unitree’s formidable rival pushes humanoid robots toward self-evolution

ChainNewsAbmedia

Embodied AI is entering a critical turning point. China’s Zhiyuan Robotics recently released the Genie Envisioner World Simulator 2.0 (GE-Sim 2.0), an attempt to push the World Model beyond a tool that merely understands the environment and into a full world simulator that can directly run, train, and optimize robots.

To appreciate how significant this is, start with a fundamental flaw of the LLM architecture: in terms of training logic, existing LLMs only predict context from huge text corpora. They can learn that the words “an apple” and “falls” often appear together, but they do not truly understand gravity or the causal structure of the physical world.

That is why scientists such as Yann LeCun and Fei-Fei Li have thrown themselves into the World Model track. Once AI can understand 3D environments and make physical predictions, the technology becomes the digital brain of “Physical AI” such as autonomous robots, self-driving, and smart manufacturing. The World Model camp therefore argues that robots will be a crucial carrier, and the entry of a humanoid robot maker like Zhiyuan Robotics symbolizes the vanguard of China’s hardware-driven resurgence.

Previously, TSMC chairman Wei Zhe-jia (C.C. Wei) had said that mainland China keeps making robots that can jump around and bounce, which he dismissed as useless showmanship. The key, he argued, is making the robot “brain” work, and the brains come from Nvidia, AMD, and a group of U.S. companies, with 95% of them manufactured by TSMC. GE-Sim 2.0’s development bottleneck is likewise tied to the progress of China’s own models.

The World Model camp argues that robots are a key carrier

Current mainstream LLMs rely on massive corpora and statistical relationships to understand context and predict the next word. They can learn that “an apple” and “falls” often appear together, but they do not truly understand gravity or the causal structure of the physical world.

This pattern performs extremely well in text generation, programming assistance, and Q&A tasks, but it still has fundamental limitations in scenarios that require understanding real-world structure, reasoning about causal relationships, and long-term planning. The bigger problem is that data sources are gradually running out. LLM training depends heavily on high-quality human data, and in recent years the industry has begun warning that available human text data may be exhausted within the next few years. After that, much as inbreeding leads to genetic defects, models trained on model-generated output will gradually drift away from reality and degrade.

(In-depth analysis: Do LLMs have fundamental flaws? Why is Yann LeCun betting on the AMI World Model route?)

This is also why, in recent years, two heavyweight figures in the AI academic community, Yann LeCun and Fei-Fei Li, known as the “AI godmother,” have both chosen to bet on the next-generation AI architecture called the World Model.

As the author has argued before: looking further ahead, once AI can understand 3D environments and make physical predictions, the technology becomes the digital brain of “Physical AI” such as autonomous robots, self-driving, and smart manufacturing. The World Model camp therefore holds that robots will be a very important carrier, and the entry of humanoid robot maker Zhiyuan Robotics symbolizes the vanguard of China’s hardware-driven resurgence.

Previously, when discussing robots and semiconductor development, TSMC chairman Wei Zhe-jia (C.C. Wei) said plainly that mainland China keeps making robots that jump around and bounce, which is useless and merely looks good. The key, he pointed out, is to make the robot brain function, and the brains come from Nvidia, AMD, and a group of U.S. companies, with 95% of them manufactured by TSMC.

(TSMC’s Wei Zhe-jia quips: China’s robots bounce around, good-looking but useless! The key still comes from Nvidia)

World Model evolution: from understanding the world to learning within the world

In the past few years, the World Model has been regarded as a key technology for AI to understand reality. By learning from image, language, and sensor data, such a model can predict environmental changes and give robots basic decision-making abilities.

But GE-Sim 2.0’s core breakthrough is not just understanding the world: it learns and acts directly inside the world the model generates. By bringing Action in as a core variable, it upgrades traditional state prediction into a complete loop:

State → Action → State Evolution

This means that robots are no longer merely observing and responding; they can actively experiment, self-optimize, and keep learning in a simulated environment. The shift turns the World Model from a “cognitive model” into “training infrastructure.”
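The State → Action → State Evolution loop above can be sketched in code. Everything here is a hypothetical illustration, not GE-Sim 2.0’s actual API: the class names, the `predict_next` interface, and the deliberately trivial toy dynamics are all assumptions made for clarity.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class State:
    frames: list         # multi-view camera frames (placeholder)
    joint_angles: list   # proprioceptive joint readings

class ToyWorldModel:
    """Hypothetical stand-in for a learned world model."""
    def predict_next(self, state: State, action: list) -> State:
        """Roll the generated world one step forward, conditioned on the action."""
        # Toy dynamics: joints simply integrate the commanded deltas.
        new_joints = [j + a for j, a in zip(state.joint_angles, action)]
        return State(frames=state.frames, joint_angles=new_joints)

def rollout(model: ToyWorldModel, state: State, actions: List[list]) -> List[State]:
    """Simulate an action sequence entirely inside the model-generated world."""
    trajectory = [state]
    for action in actions:
        state = model.predict_next(state, action)
        trajectory.append(state)
    return trajectory
```

The point of the sketch is the shape of the loop: the action is an input to the model, and the model, not the real world, produces the next state.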

GE-Sim 2.0: Let robots “evolve” in a virtual world

GE-Sim 2.0 is defined as an “embodied world simulator.” Its core goal is to solve the three major bottlenecks of real-world training: excessive cost, insufficient data, and difficulty scaling. By generating environments with models, the system can train robots at scale without relying on the real world.

Technically, GE-Sim 2.0 integrates three key capabilities. First is “action-driven image generation”: conditioned on the robot’s actions, the model generates the corresponding future frames while maintaining consistency across multiple views, including the head viewpoint and the left- and right-hand operation viewpoints.

Second is “proprioception modeling”: beyond simulating external visuals, the model also predicts the robot’s own joint and action states, bringing its decisions closer to the real physical world.

Third is “automatic task assessment”: through a built-in reward model, the system can automatically determine whether a task is completed, for example “put the blue object into the red box,” and feed the result directly into reinforcement learning. Robots can thus close the full loop in simulation: act, observe the generated outcome, receive a reward, and update the policy.
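The act → simulate → evaluate loop can be made concrete with a minimal sketch. `ToyWorld`, `greedy_policy`, and `task_reward` are hypothetical stand-ins on a one-dimensional toy task, not GE-Sim 2.0’s real components; only the structure of the loop reflects the description above.

```python
class ToyWorld:
    """Stand-in for the generated world: a position on an integer line."""
    def reset(self):
        return 0

    def step(self, state, action):
        return state + action

def greedy_policy(state, goal):
    # Move one unit toward the goal each step.
    if state < goal:
        return 1
    if state > goal:
        return -1
    return 0

def task_reward(state, goal):
    """Stand-in for the built-in reward model: 1.0 once the goal is reached."""
    return 1.0 if state == goal else 0.0

def closed_loop(world, goal=3, horizon=5):
    """Run the full act/simulate/evaluate loop; return accumulated reward."""
    state = world.reset()
    total = 0.0
    for _ in range(horizon):
        action = greedy_policy(state, goal)
        state = world.step(state, action)          # simulation, not reality
        total += task_reward(state, goal)          # automatic task assessment
    return total
```

In a real system the reward signal would drive a reinforcement-learning update instead of just being summed, but the closed loop itself is the same: no real-world trial is needed anywhere in it.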

GE-Sim 2.0 can already achieve “minute-level” stable video generation

Compared with earlier models that could only generate short clips, GE-Sim 2.0 already achieves “minute-level” stable video generation, supporting long-duration task simulation. Trained on large-scale real data (teleoperation, deployment, and interaction data), the model also generalizes better across scenarios and tasks. This is especially critical for humanoid robots: real-world operation is highly variable, and fixed-scene training alone cannot cover it.
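One common way long rollouts are assembled, sketched here as an assumption rather than a description of GE-Sim 2.0’s actual method, is to chain short generated clips, conditioning each chunk on the tail of the previous one. `generate_clip` below is a toy stand-in for a video model that just continues a frame-index sequence.

```python
def generate_clip(context_frames, n_frames):
    """Toy generator: continues a frame-index sequence from its context."""
    start = context_frames[-1] + 1
    return list(range(start, start + n_frames))

def long_rollout(seed_frames, total_frames, chunk=8, context=2):
    """Chain short clips into one long sequence, reconditioning each time."""
    frames = list(seed_frames)
    while len(frames) < total_frames:
        clip = generate_clip(frames[-context:], chunk)
        frames.extend(clip)
    return frames[:total_frames]
```

The engineering difficulty of minute-level generation lies exactly at the chunk boundaries: each new clip must stay consistent with the context it is conditioned on, or errors compound over the rollout.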

The emergence of the World Simulator means that robots can “practice infinitely” in a virtual world—this will bring two structural changes: first, training costs will drop dramatically. Second, the speed of capability iteration will increase exponentially.

Zhiren Robotics: a new force in China’s humanoid robotics

Zhiyuan Robotics was founded in 2023 by Peng Zhihui, one of Huawei’s “genius youth” recruits, and focuses on embodied intelligence that merges AI and robotics.

The company’s core products include:

the “Yuan Zheng” series humanoid robots

the “Lingxi” robot system

the general large model GO-1

It has completed multiple funding rounds, with investors including Sequoia China and Hillhouse Capital, and is viewed as an important player in China’s humanoid robot sector, forming a competitive landscape with Unitree Robotics.

This article, “Zhiyuan GE-Sim 2.0: generate worlds with a World Model, Unitree’s formidable rival pushes humanoid robots toward self-evolution,” first appeared on ChainNews ABMedia.

