Cosmos and Nemotron: Nvidia presents new AI models
Nvidia wants to advance the development of physical AI with a world model. The company is also presenting new language models and AI blueprints.
(Image: Michael Vi/Shutterstock.com)
Nvidia has presented Cosmos, a new platform for world foundation models. It is intended for developing AI applications that understand physics and can therefore be used in robotics and autonomous vehicles. Nvidia also announced Nemotron, a family of language models for building enterprise AI agents. These agents can be used in customer support, fraud detection, or the management of supply chains and inventories.
Nvidia world model generates training data for physical AI
Developing AI models with an understanding of physics requires a large volume of training data. With Cosmos, developers can input text, images, and videos as well as sensor and motion data to generate physically accurate training videos that replace tests in the real world. 3D scenarios built in Nvidia Omniverse can also be converted into videos. According to the company, Cosmos running on Nvidia Blackwell can process 20 million hours of video material within two weeks.
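To put the quoted figure in perspective, the following minimal sketch works out the throughput implied by 20 million hours of video in two weeks. Only those two numbers come from the announcement. The function name and the assumption of uninterrupted 24-hour operation are illustrative and not part of any Nvidia specification.

```python
# Back-of-the-envelope throughput implied by the quoted figures.
# Assumption: the pipeline runs around the clock for the full two weeks.
VIDEO_HOURS = 20_000_000   # figure quoted by Nvidia
WALL_CLOCK_DAYS = 14       # "two weeks"


def implied_throughput(video_hours: float, days: float) -> dict:
    """Return the processing rates implied by a total volume and a duration."""
    wall_clock_hours = days * 24
    return {
        "video_hours_per_day": video_hours / days,
        "video_hours_per_wall_clock_hour": video_hours / wall_clock_hours,
        "video_years_total": video_hours / (24 * 365),
    }


result = implied_throughput(VIDEO_HOURS, WALL_CLOCK_DAYS)
for name, value in result.items():
    print(f"{name}: {value:,.0f}")
```

Under these assumptions, the claim corresponds to roughly 1.4 million hours of video per day, or more than 2,000 years of footage in total.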
In addition to the large language model Llama Nemotron, the model family also includes Cosmos Nemotron, a vision language model that draws on the recognition and analysis capabilities of the world model. Combined, the two models can be used for enterprise applications. One conceivable example is a warehouse whose current stock levels are recorded by cameras: an AI application analyzes the images and compares the captured goods with the stored records.
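The comparison step in the warehouse example can be sketched in a few lines. This is a simplified illustration, not Nvidia's implementation: the list of detected labels stands in for the output of a vision language model, and the function name `reconcile`, the item names, and the record format are all hypothetical.

```python
from collections import Counter


def reconcile(detected_labels, stored_records):
    """Compare camera-detected item counts with stored stock records.

    detected_labels: iterable of item names, one entry per detected object
                     (hypothetical output of a vision language model).
    stored_records:  dict mapping item name to the expected count.
    Returns a dict of item -> (detected - expected), mismatches only.
    """
    detected = Counter(detected_labels)
    discrepancies = {}
    # Check every item that appears in either source.
    for item in set(detected) | set(stored_records):
        diff = detected.get(item, 0) - stored_records.get(item, 0)
        if diff != 0:
            discrepancies[item] = diff
    return discrepancies


# Hypothetical sample data for illustration only.
detections = ["pallet_a", "pallet_a", "pallet_b", "crate_c"]
records = {"pallet_a": 2, "pallet_b": 3}
report = reconcile(detections, records)
print(report)
```

A negative value means fewer items were seen than recorded, a positive value means the cameras captured goods that the records do not account for.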
AI templates for typical use cases
In addition to the new AI models, Nvidia presented new blueprints for AI agents, some of which were developed together with partners. The templates cover functions for frequently used applications, so developers can build AI applications tailored to companies without having to implement the basic functions themselves. The blueprints can be used, for example, to create AI agents that comment on code, structure repositories, or carry out automated web searches.
One template developed by Nvidia converts PDF content into podcasts. An AI agent built with it is designed to summarize text, tables, and images from PDF files and present them to users as a monologue or a conversation. The manufacturer promises that users will be able to absorb the information more efficiently and at their own pace. Developers can run the blueprints with ready-made configurations on end devices, in data centers, or in the cloud.
Nvidia provides the AI models in three sizes, ranging from 4 to 14 billion parameters. The smallest size, Nano, is intended for PCs and other end devices, while the Ultra size is aimed at use in data centers. The Cosmos and Nemotron models are licensed under the Open Model License, which permits commercial use, but unlike NVLM they are not open source. In the future, companies will also be able to obtain the models via the Nvidia AI Enterprise platform and as part of NIM microservices. Some of them are already available as previews.
Heise Medien is an official media partner of CES 2025.
(sfe)