Close enough to touch: AI-generated 3D environments in VR
World Labs is developing an AI that generates interactive 3D environments from prompts or images. With devices such as the Meta Quest 3, you can now experience them.
View of an AI-generated 3D environment through VR glasses.
(Image: World Labs)
A fictitious Swiss mountain village against an impressive Alpine backdrop, an imposing steampunk city with pointed towers, arches and magnificent clocks, a forest full of fluorescent giant mushrooms: the AI-generated environments of the WebXR application "Lofi Worlds" invite you to linger and dream. From realistic to fantastic and fairytale-like to artistically stylized: the range of VR scenes is diverse. The application even transports you to Edward Hopper's famous painting "Nighthawks", enhanced by environments that go beyond the original.
Lofi Worlds is an application from AI start-up World Labs and runs on standalone VR headsets such as Meta Quest 3 and Apple Vision Pro. Thanks to WebXR, all you need to do to immerse yourself in the 3D environments is open the corresponding website.
The rendering technology used is 3D Gaussian Splatting ("3DGS"). This means that the 3D environments consist of tiny, spatially arranged blobs of color that merge into continuous landscapes when viewed as a whole. This gives some scenes an impressionistic look. A new feature of Lofi Worlds is that the splats move dynamically, creating the impression of leaves in the wind, for example. They also react to touch and begin to undulate gently as if they were alive.
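To make the idea more concrete, here is a minimal sketch of how splats blend into a pixel color and how a simple wind motion could be applied. It is an illustration under simplifying assumptions, not the code used by Lofi Worlds: real 3DGS uses anisotropic Gaussians and view-dependent color, and the names `Splat`, `splatAlpha`, `composite` and `swayedCenter` as well as all parameter values are hypothetical.

```typescript
type Vec3 = [number, number, number];

// Simplified splat: an isotropic Gaussian blob with a color and a peak opacity.
// (Assumption: real 3DGS splats carry a full 3D covariance instead of one radius.)
interface Splat {
  center: Vec3;
  radius: number;  // standard deviation of the Gaussian falloff
  color: Vec3;     // RGB, each channel in 0..1
  opacity: number; // peak opacity in 0..1
}

// Opacity a splat contributes at a given distance from its center.
function splatAlpha(splat: Splat, distance: number): number {
  const t = distance / splat.radius;
  return splat.opacity * Math.exp(-0.5 * t * t);
}

// Front-to-back alpha compositing of the splats hit by one view ray.
// Samples must be ordered nearest first.
function composite(samples: { color: Vec3; alpha: number }[]): Vec3 {
  const out: Vec3 = [0, 0, 0];
  let transmittance = 1; // fraction of light not yet blocked by nearer splats
  for (const s of samples) {
    const weight = s.alpha * transmittance;
    out[0] += weight * s.color[0];
    out[1] += weight * s.color[1];
    out[2] += weight * s.color[2];
    transmittance *= 1 - s.alpha;
  }
  return out;
}

// Hypothetical wind effect: shift each splat sideways along a sine wave.
// The position-dependent phase keeps neighboring splats from moving in lockstep.
function swayedCenter(
  splat: Splat,
  timeSeconds: number,
  amplitude = 0.02,
  frequencyHz = 0.5
): Vec3 {
  const [x, y, z] = splat.center;
  const phase = x * 1.3 + z * 0.7;
  const offset =
    amplitude * Math.sin(2 * Math.PI * frequencyHz * timeSeconds + phase);
  return [x + offset, y, z];
}
```

Because many semi-transparent blobs overlap in each pixel, the blended result looks continuous rather than blocky, which is also where the soft, painterly look comes from.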
Dynamic Gaussian splats on the web and on almost all devices
What immediately stands out on first use, and detracts from the experience, is the comparatively low resolution and lack of sharpness. Moving around freely is also punished: if you take more than a few steps and change your perspective, you quickly run into dark, empty gaps in the clouds of splats, which destroy the illusion of a continuous 3D world.
Nevertheless, it is remarkable that Lofi Worlds can render interactive Gaussian splats on standalone VR headsets such as Meta Quest 3. The technical basis of the application is Spark, a 3DGS renderer for web applications developed by World Labs and recently released as open source under the MIT license. Spark integrates with the web-based 3D graphics library Three.js and uses WebGL2 to render the Gaussian splats, which ensures broad device compatibility. Lofi Worlds can therefore also be tried out on smartphones and desktop computers. In conjunction with a VR headset, WebXR takes over the stereoscopic display, i.e. separate images for each eye, and enables interaction via hand tracking.
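What "separate images for each eye" means geometrically can be sketched in a few lines. In a real WebXR session the browser supplies a ready-made view for each eye, so an application does not compute this itself; the following is only an illustration of the underlying idea. The function names and the default interpupillary distance of 63 millimeters are assumptions made for this example.

```typescript
type Vec3 = [number, number, number];

// Positions of the two virtual cameras, given the head position and a
// unit vector pointing to the viewer's right.
// Assumption: ipdMeters is the distance between the pupils; 0.063 m is
// used here as a typical adult value.
function eyePositions(
  head: Vec3,
  rightDir: Vec3,
  ipdMeters = 0.063
): { left: Vec3; right: Vec3 } {
  const half = ipdMeters / 2;
  const shift = (sign: number): Vec3 => [
    head[0] + sign * half * rightDir[0],
    head[1] + sign * half * rightDir[1],
    head[2] + sign * half * rightDir[2],
  ];
  return { left: shift(-1), right: shift(1) };
}

// Angle (in radians) between the two eyes' lines of sight when both
// look at a point straight ahead at the given depth. Nearby points
// produce a larger angle, which is what creates the sense of depth.
function vergenceAngle(depthMeters: number, ipdMeters = 0.063): number {
  return 2 * Math.atan(ipdMeters / 2 / depthMeters);
}
```

The scene is rendered once from each of the two positions. Since the angle shrinks quickly with distance, the stereo effect is strongest for content within arm's reach and barely noticeable for a distant mountain backdrop.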
World Labs wants to equip AI with "spatial intelligence"
World Labs is a US start-up co-founded by AI pioneer Fei-Fei Li. Often referred to as the "Godmother of AI", the researcher has led the development of the ImageNet image dataset since 2006, a project that marked a breakthrough in the field of computer vision. Fei-Fei Li sees the next step in the development of visual AI in "spatial intelligence", i.e. the ability to imagine spaces and interact with the objects in them. The start-up is working on so-called "large world models", which are meant to let AI perceive the world as a spatial and physical experience, as humans do, rather than as a two-dimensional, static reality. The start-up has so far raised 230 million US dollars in investment capital.
The start-up wants to develop a generative AI that can create persistent, navigable and geometrically solid 3D environments based on text prompts or individual images. In December, World Labs published the first examples of AI-generated 3D environments, on which Lofi Worlds is also based. However, the technology is not yet available for testing. Sundquist assures us that the start-up is working on further developing the AI models and on offering precisely this option in the future.
According to Sundquist, the potential fields of application are diverse: AI with spatial intelligence could generate virtual worlds for video games, film sets, YouTube backgrounds, training environments for robots and, last but not least, for VR and AR glasses, i.e. the devices that make it possible to experience spatial content in the most impressive way.
(mack)