Articles for category: AI News

NVIDIA’s nGPT: Revolutionizing Transformers with Hypersphere Representation

The Transformer architecture, introduced by Vaswani et al. in 2017, serves as the backbone of contemporary language models. Over the years, numerous modifications to this architecture have been proposed to enhance aspects such as training stability, inference efficiency, context length, and robustness. In a new paper, "nGPT: Normalized Transformer with Representation Learning on the Hypersphere," an NVIDIA research team proposes the normalized Transformer (nGPT), which consolidates key findings in Transformer research under a unified framework and learns faster, reducing the number of training steps by a factor of 4 to 20 depending on sequence length. The researchers summarize their main contributions as follows:
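The paper's central idea is to keep token embeddings and hidden states on the unit hypersphere, replacing the usual residual-plus-LayerNorm update with a normalized interpolation step. The sketch below is only an illustration of that idea under assumptions of our own (the class name, the per-dimension step size `alpha`, and its initialization are hypothetical), not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HypersphereResidual(nn.Module):
    """Illustrative sketch: keep hidden states on the unit hypersphere.

    Instead of the usual `h = h + block(h)` residual followed by LayerNorm,
    each update is re-projected onto the sphere with an L2 normalization.
    The learnable step size `alpha` is a hypothetical stand-in for the
    per-layer step-size parameters described in the nGPT paper.
    """

    def __init__(self, block: nn.Module, dim: int, init_alpha: float = 0.05):
        super().__init__()
        self.block = block
        self.alpha = nn.Parameter(torch.full((dim,), init_alpha))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        h = F.normalize(h, dim=-1)                   # start on the sphere
        update = F.normalize(self.block(h), dim=-1)  # normalized block output
        h = h + self.alpha * (update - h)            # step toward the update
        return F.normalize(h, dim=-1)                # re-project onto the sphere


# Usage sketch: wrap any sub-block (attention or MLP) of a Transformer layer.
mlp = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512))
layer = HypersphereResidual(mlp, dim=512)
out = layer(torch.randn(2, 16, 512))  # (batch, seq, dim), unit-norm per token
```

Because every state is re-projected onto the sphere, dot products between vectors behave like bounded cosine similarities, which the paper argues contributes to the more stable, faster training.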

DeepSeek dims shine of AI stars

China-based DeepSeek shook up the world of generative artificial intelligence (GenAI) early this year with a low-cost but high-performance model that challenges the hegemony of OpenAI and other big-spending behemoths. Since late 2022, just a handful of AI assistants—such as OpenAI’s ChatGPT, Anthropic’s Claude, and Google’s Gemini—have reigned supreme, becoming ever more capable thanks to multi-billion-dollar investments in engineers, data centers, and high-performance AI chips. But then DeepSeek upended the sector with its R1 model, which it said cost just $6 million or so, powered by less-advanced chips. While specialists suspect DeepSeek may have cost more

#IJCAI panel on communicating about AI with the public

Science communication is an invaluable skill for researchers. It can help demystify AI for a broad range of people including policy makers, business leaders, and the public. In a panel session at the 33rd International Joint Conference on Artificial Intelligence (IJCAI-24), Michael Wooldridge and Toby Walsh talked with Peter Stone about lessons they’ve learnt from communicating about AI with different audiences. They gave advice on how to talk to the media, how to tailor your communication for various audiences, and how to approach different methods of communication. They drew on their personal experiences to provide hints and tips for anyone

How to Access Gemma 3 Multimodal?

Google’s commitment to making AI accessible leaps forward with Gemma 3, the latest addition to the Gemma family of open models. After an impressive first year—marked by over 100 million downloads and more than 60,000 community-created variants—the Gemmaverse continues to expand. With Gemma 3, developers gain access to state-of-the-art, lightweight AI models that run efficiently on a variety of devices, from smartphones to high-end workstations. Built on the same technological foundations as Google’s powerful Gemini 2.0 models, Gemma 3 is designed for speed, portability, and responsible AI development. Gemma 3 also comes in a range of sizes (1B, 4B, 12B
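For readers who want to try the multimodal variants hands-on, one plausible route is the Hugging Face Hub. The snippet below is a sketch, not official Google documentation: it assumes a recent `transformers` release with Gemma 3 support, that you have accepted the model license on the Hub, and that `google/gemma-3-4b-it` is the 4B instruction-tuned checkpoint; the image URL is a placeholder.

```python
# Sketch: loading a multimodal Gemma 3 checkpoint from the Hugging Face Hub.
# Assumes recent `transformers` (plus `accelerate` for device_map) and an
# accepted model license; the model id and image URL are illustrative.
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",
    model="google/gemma-3-4b-it",
    device_map="auto",
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/photo.jpg"},
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

result = pipe(text=messages, max_new_tokens=64, return_full_text=False)
print(result[0]["generated_text"])
```

Note that the 1B checkpoint is text-only, so image prompts like this one need one of the larger multimodal sizes.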

Alibaba Researchers Introduce R1-Omni: An Application of Reinforcement Learning with Verifiable Reward (RLVR) to an Omni-Multimodal Large Language Model

Emotion recognition from video involves many nuanced challenges. Models that depend exclusively on either visual or audio signals often miss the intricate interplay between these modalities, leading to misinterpretations of emotional content. A key difficulty is reliably combining visual cues—such as facial expressions or body language—with auditory signals like tone or intonation. Many existing systems also lack the capability to explain their decision-making process, which makes it hard to understand how a specific emotion is detected. Furthermore, these models can sometimes generate reasoning that does not directly reflect the input data, or they might fail to fully utilize important audio
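RLVR's appeal here is that the reward comes from a programmatic check rather than a learned reward model. As a rough, self-contained illustration only (the `<think>`/`<answer>` tag convention, the label set, and the reward weights below are assumptions for the sketch, not details taken from the R1-Omni paper), a verifiable reward for emotion recognition could combine an exact-match accuracy check with a small format bonus:

```python
import re

# Toy sketch of a "verifiable reward" in the spirit of RLVR: the policy's
# output is checked programmatically against a ground-truth label, with a
# small bonus for following the expected output format.

EMOTIONS = {"happy", "sad", "angry", "neutral", "surprised", "fearful"}

def verifiable_reward(model_output: str, gold_label: str) -> float:
    # Format check: the output should contain both a reasoning span and an answer span.
    format_ok = bool(
        re.search(r"<think>.*</think>", model_output, re.S)
        and re.search(r"<answer>.*</answer>", model_output, re.S)
    )
    match = re.search(r"<answer>\s*(\w+)\s*</answer>", model_output)
    predicted = match.group(1).lower() if match else None

    accuracy_reward = 1.0 if predicted == gold_label.lower() else 0.0
    format_reward = 0.5 if format_ok and predicted in EMOTIONS else 0.0
    return accuracy_reward + format_reward

# Example: a well-formed, correct prediction earns the full reward.
output = "<think>The speaker smiles and their tone is upbeat.</think><answer>happy</answer>"
print(verifiable_reward(output, "happy"))  # 1.5
```

Rewards of this shape are what make the "verifiable" part possible: the trainer needs only the ground-truth label and a parser, not a human judgment or a second model.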

ServiceNow Nabs LLM-Powered Chatbot Pioneer Moveworks for $2.85B

Long before ChatGPT ignited the generative AI revolution, a company called Moveworks was busy using a promising new family of language models to help solve tough technological problems in customer service. All that work paid off this week when ServiceNow announced it has agreed to purchase Moveworks for $2.85 billion. Moveworks was founded in 2016 by Bhavin Shah, Vaibhav Nivargi, Varun Singh, and Jiang Chen to develop conversational interfaces, or chatbots, that companies could use to augment human call-center workers. At the time, the company relied on recurrent neural network (RNN) techniques, which were not easy to work with.

Photonic Fabric: Celestial AI Secures $250M Series C Funding

 SANTA CLARA, CA – March 11, 2025 – Celestial AI, creator of the Photonic Fabric optical interconnect, today announced that it has raised $250 million in its Series C1 funding round led by Fidelity Management & Research Company, bringing the total capital raised to date to more than $515 million. New investors include funds and accounts managed by BlackRock, Maverick Silicon, Tiger Global Management and Lip-Bu Tan, as well as participation from existing investors including AMD Ventures, Koch Disruptive Technologies (KDT), Temasek, Temasek’s wholly-owned subsidiary Xora Innovation, Porsche Automobil Holding SE and The Engine Ventures. “With the emergence of complex

GitHub to unbundle Advanced Security

GitHub announced plans to unbundle its GitHub Advanced Security (GHAS) product, breaking it up into two standalone products: GitHub Secret Protection and GitHub Code Security. The unbundling is set to happen on April 1. GitHub Secret Protection will detect and prevent secret leaks before they happen, using push protection, secret scanning, AI-powered detection, security insights, and other capabilities. GitHub Code Security, meanwhile, will help developers identify and remediate vulnerabilities faster with code scanning, Copilot autofix, security campaigns, dependency review action, and more, according to GitHub. Announced March 4, the unbundling is intended to make GitHub’s security offering easier to access
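For teams auditing where they stand ahead of the change, a quick way to see which of these features a repository already has enabled is the GitHub REST API. The sketch below uses Python's `requests` against the standard `GET /repos/{owner}/{repo}` endpoint; the `security_and_analysis` field names reflect the API as it exists today, and how they map onto the new standalone products after April 1 is not something the announcement spells out, so treat that mapping as an open question.

```python
# Sketch: inspect which security features are enabled on a repository via the
# GitHub REST API. Assumes a personal access token with repo access; the
# owner/repo names below are hypothetical placeholders.
import os
import requests

OWNER, REPO = "my-org", "my-repo"
token = os.environ["GITHUB_TOKEN"]

resp = requests.get(
    f"https://api.github.com/repos/{OWNER}/{REPO}",
    headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    },
    timeout=30,
)
resp.raise_for_status()

settings = resp.json().get("security_and_analysis", {}) or {}
for feature in ("secret_scanning", "secret_scanning_push_protection", "advanced_security"):
    status = settings.get(feature, {}).get("status", "unknown")
    print(f"{feature}: {status}")
```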

Gemini Robotics: Google DeepMind’s New AI Models for Robots

Generative AI models are getting closer to taking action in the real world. Already, the big AI companies are introducing AI agents that can take care of web-based busywork for you, ordering your groceries or making your dinner reservation. Today, Google DeepMind announced two generative AI models designed to power tomorrow’s robots. The models are both built on Google Gemini, a multimodal foundation model that can process text, voice, and image data to answer questions, give advice, and generally help out. DeepMind calls the first of the new models, Gemini Robotics, an “advanced vision-language-action model,” meaning that it can take all those same