Introducing NVIDIA’s AI-powered video conferencing development platform, Maxine

We have all become more dependent on video conferencing for business and pleasure during this period when meeting face-to-face has become so restricted. Once the novelty wore off, people naturally started to look for ways to make themselves, and their surroundings, look better as they spent more and more time in front of their webcams.

NVIDIA Maxine, which is being made available to developers and partners next week, is a new “cloud-native” platform that gives developers a suite of GPU-accelerated AI tools to enhance video calls and “dramatically” reduce their bandwidth requirements. Maxine uses GANs (Generative Adversarial Networks) to analyse the facial points of each person on a call and then algorithmically reanimates the face in the video on the other side. According to NVIDIA, this cuts bandwidth requirements to 10% of what the H.264 standard needs.
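To get a feel for why describing a face by its facial points can be so much lighter than streaming encoded video, here is a rough back-of-the-envelope sketch in Python. It is not NVIDIA's payload format or the Maxine API. The 68 landmarks, 32-bit coordinates, 30 frames per second and the 1,500 kbps H.264 figure are all illustrative assumptions, and the class and function names are made up for this example. The only number taken from the announcement is the 10% fraction.

```python
from dataclasses import dataclass


@dataclass
class KeypointStream:
    """Hypothetical stream of raw facial keypoints (all values are assumptions)."""
    num_keypoints: int     # assumed number of facial landmarks per frame
    coords_per_point: int  # 2 for (x, y)
    bytes_per_coord: int   # 4 for a 32-bit float
    fps: int               # frames per second

    def kbps(self) -> float:
        """Uncompressed keypoint payload in kilobits per second."""
        bytes_per_frame = (
            self.num_keypoints * self.coords_per_point * self.bytes_per_coord
        )
        return bytes_per_frame * self.fps * 8 / 1000


def reduced_bitrate_kbps(h264_kbps: float, fraction: float = 0.10) -> float:
    """Bitrate left after applying the claimed reduction (10% of H.264 by default)."""
    return h264_kbps * fraction


if __name__ == "__main__":
    stream = KeypointStream(num_keypoints=68, coords_per_point=2,
                            bytes_per_coord=4, fps=30)
    assumed_h264_kbps = 1500.0  # assumed bitrate of an ordinary H.264 call
    print(f"Raw keypoint payload: {stream.kbps():.2f} kbps")
    print(f"10% of assumed H.264 bitrate: {reduced_bitrate_kbps(assumed_h264_kbps):.2f} kbps")
```

Under these assumptions the raw keypoint data alone lands in the same range as one tenth of the assumed H.264 bitrate. A real system would also need to send occasional reference frames and audio, so the figures are only indicative.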

Maxine’s face alignment feature automatically adjusts the video image so participants appear to face each other during a call. Gaze correction simulates eye contact, while the auto-frame feature allows the video feed to follow a speaker who moves out of frame. Developers can also give call participants animated avatars that are driven automatically by their voice and tone. For conversational features, they can use NVIDIA’s Jarvis SDK, which includes AI language models for speech recognition, language understanding, and speech generation.

Videoconferencing assistants can be built that take notes and answer questions in humanlike voices, and there are tools that can power translations and transcriptions to help participants understand what’s being discussed.