Articles for category: AI Research

A metadata format for ML-ready datasets

Machine learning (ML) practitioners looking to reuse existing datasets to train an ML model often spend a lot of time understanding the data, making sense of its organization, or figuring out what subset to use as features. So much time, in fact, that progress in the field of ML is hampered by a fundamental obstacle: the wide variety of data representations. ML datasets cover a broad range of content types, from text and structured data to images, audio, and video. Even within datasets that cover the same types of content, every dataset has a unique ad hoc arrangement of files…
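
The article motivates a shared, machine-readable description layer over these heterogeneous files. As a rough sketch of what such a record can look like, here is a schema.org-style dataset description in Python; the field names are illustrative placeholders, not necessarily the exact vocabulary the format defines:

```python
# A rough sketch of a machine-readable dataset description.
# Field names are schema.org-style placeholders, not necessarily
# the exact vocabulary the article's metadata format defines.
dataset_metadata = {
    "@type": "Dataset",
    "name": "toy_image_dataset",
    "description": "Image/label pairs packaged for ML training.",
    "distribution": [
        {
            "@type": "FileObject",
            "name": "images.zip",
            "contentUrl": "https://example.com/images.zip",  # hypothetical URL
            "encodingFormat": "application/zip",
        }
    ],
    "recordSet": [
        {
            "name": "examples",
            "field": [
                {"name": "image", "dataType": "ImageObject"},
                {"name": "label", "dataType": "Text"},
            ],
        }
    ],
}
```

A consumer that understands this vocabulary can locate files, enumerate records, and pick feature columns without reverse-engineering each dataset's ad hoc layout.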

Understanding RL Vision

In this article, we apply interpretability techniques to a reinforcement learning (RL) model trained to play the video game CoinRun. Using attribution combined with dimensionality reduction, we build an interface for exploring the objects detected by the model, and how they influence its value function and policy. We leverage this interface in several ways. Dissecting failure. We perform a step-by-step analysis of the agent’s behavior in cases where it failed to achieve the maximum reward, allowing us to understand what went wrong, and why. For example, one case of failure was caused by an obstacle…
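
The article's interface is interactive, but the underlying recipe can be sketched offline: compute a gradient-times-activation attribution of the value function at a convolutional layer, then reduce the channel dimension with non-negative matrix factorization so a handful of factors (ideally corresponding to objects like coins and obstacles) can be visualized. The arrays below are random stand-ins for real activations and gradients from the CoinRun model:

```python
import numpy as np
from sklearn.decomposition import NMF

# Stand-ins for activations of one conv layer and gradients of the
# value function w.r.t. those activations: (batch, H, W, channels).
acts = np.random.rand(16, 8, 8, 32)
grads = np.random.rand(16, 8, 8, 32)

# Gradient * activation: how much each spatial position/channel
# contributes to the value estimate (one common attribution scheme).
attr = acts * grads

# Collapse 32 channels into 6 factors with NMF so each spatial
# position is described by a few (ideally interpretable) components.
flat = attr.reshape(-1, attr.shape[-1])
flat -= flat.min()  # NMF requires non-negative input
nmf = NMF(n_components=6, init="nndsvd", random_state=0)
factors = nmf.fit_transform(flat).reshape(16, 8, 8, 6)
```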

Accelerating optimization over the space of probability measures

Shi Chen, Qin Li, Oliver Tse, Stephen J. Wright; 26(31):1–40, 2025. Abstract: The acceleration of gradient-based optimization methods is a subject of significant practical and theoretical importance, particularly within machine learning applications. While much attention has been directed towards optimizing within Euclidean space, the need to optimize over spaces of probability measures in machine learning motivates the exploration of accelerated gradient methods in this context, too. To this end, we introduce a Hamiltonian-flow approach analogous to momentum-based approaches in Euclidean space. We demonstrate that, in the continuous-time setting, algorithms based on this…
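
For orientation: in Euclidean space, accelerated methods follow damped second-order dynamics of the form $\ddot{x}_t + \gamma \dot{x}_t + \nabla f(x_t) = 0$. A natural lift to probability measures (a sketch of the general idea; the paper's precise formulation may differ) couples a continuity equation with a damped velocity field driven by the first variation of the objective functional $F$:

```latex
\begin{aligned}
\partial_t \mu_t + \nabla \cdot (\mu_t v_t) &= 0, \\
\partial_t v_t + (v_t \cdot \nabla) v_t &= -\gamma\, v_t - \nabla \frac{\delta F}{\delta \mu}(\mu_t).
\end{aligned}
```

The first equation transports mass along the velocity field $v_t$; the second is the momentum step, with the friction coefficient $\gamma$ playing the role of the damping term in Nesterov-style methods.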

Author Correction: Biophysical neural adaptation mechanisms enable artificial neural networks to capture dynamic retinal computation


Yi-34B, Llama 2, and common practices in LLM training: a fact check of the New York Times

On February 21, 2024, the New York Times published “China’s Rush to Dominate A.I. Comes With a Twist: It Depends on U.S. Technology.” The authors claim that Yi-34B, a recent large language model by the Chinese startup 01.AI, is fundamentally indebted to Meta’s Llama 2: “There was just one twist: Some of the technology in 01.AI’s system came from Llama. Mr. Lee’s start-up then built on Meta’s technology, training its system with new data to make it more powerful.” This assessment is based on a misreading of the cited Hugging Face issue. While we make no claims about the overall…

Google’s research on quantum error correction

Quantum computers have the potential to revolutionize drug discovery, material design and fundamental physics — that is, if we can get them to work reliably. Certain problems, which would take a conventional computer billions of years to solve, would take a quantum computer just hours. However, these new processors are more prone to noise than conventional ones. If we want to make quantum computers more reliable, especially at scale, we need to accurately identify and correct these errors. In a paper published today in Nature, we introduce AlphaQubit, an AI-based decoder that identifies quantum computing errors with state-of-the-art accuracy…
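
The excerpt does not detail AlphaQubit's architecture, but the core framing of decoding as pattern recognition can be shown on a toy problem: learn to predict, from syndrome measurements alone, when a 3-bit repetition code's majority vote will fail. Everything below is a toy stand-in, not AlphaQubit:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
p = 0.15                                  # physical bit-flip probability
flips = rng.random((20000, 3)) < p        # independent errors on 3 bits

s1 = flips[:, 0] ^ flips[:, 1]            # syndrome: parity of neighbours
s2 = flips[:, 1] ^ flips[:, 2]
logical_error = flips.sum(axis=1) >= 2    # majority-vote correction fails

# A "decoder" is just a classifier from syndromes to logical outcomes.
X = np.column_stack([s1, s2]).astype(float)
clf = LogisticRegression().fit(X, logical_error)
print(f"decoder accuracy: {clf.score(X, logical_error):.3f}")
```

Real surface-code decoding replaces these two syndrome bits with long, noisy syndrome histories, which is where learned sequence models can plausibly beat hand-crafted decoders.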

[2410.08309] Swing-by Dynamics in Concept Learning and Compositional Generalization

[Submitted on 10 Oct 2024 (v1), last revised 13 Mar 2025 (this version, v2)] By Yongyi Yang and 5 other authors. Abstract: Prior work has shown that text-conditioned diffusion models can learn to identify and manipulate primitive concepts underlying a compositional data-generating process, enabling generalization to entirely novel, out-of-distribution compositions. Beyond performance evaluations, these studies develop a rich empirical phenomenology of learning dynamics, showing that models generalize sequentially, respecting the compositional hierarchy of the data-generating process. Moreover, concept-centric structures within the data…

[2402.14327] Subobject-level Image Tokenization

[Submitted on 22 Feb 2024 (v1), last revised 12 Mar 2025 (this version, v3)] By Delong Chen and 4 other authors. Abstract: Patch-based image tokenization ignores the morphology of the visual world, limiting effective and efficient learning of image understanding. Inspired by subword tokenization, we introduce subobject-level adaptive token segmentation and explore several approaches, including superpixel, SAM, and a proposed Efficient and PanOptiC (EPOC) image tokenizer. Our EPOC combines boundary detection — a simple task that can be handled well by a compact model — with watershed…
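
EPOC's recipe (per the abstract: a compact boundary model followed by watershed) can be approximated with classical stand-ins; here a Sobel filter plays the role of the learned boundary detector, and each watershed basin yields one "subobject token":

```python
import numpy as np
from skimage import color, data, filters, measure, segmentation

img = color.rgb2gray(data.astronaut())   # any test image
edges = filters.sobel(img)               # cheap stand-in for a learned boundary map

# Flood the boundary map; every catchment basin becomes one segment.
segments = segmentation.watershed(edges, markers=250, compactness=0.001)

# One "token" per segment: here just label, mean intensity, and centroid;
# a real tokenizer would pool learned features instead.
tokens = [(r.label, r.mean_intensity, r.centroid)
          for r in measure.regionprops(segments, intensity_image=img)]
print(len(tokens), "subobject tokens")
```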

[2411.02948] Grounding Natural Language to SQL Translation with Data-Based Self-Explanations

[Submitted on 5 Nov 2024 (v1), last revised 13 Mar 2025 (this version, v2)] By Yuankai Fan and 4 other authors. Abstract: Natural Language Interfaces for Databases empower non-technical users to interact with data using natural language (NL). Advanced approaches, utilizing either neural sequence-to-sequence models or more recent sophisticated large-scale language models, typically implement NL-to-SQL (NL2SQL) translation in an end-to-end fashion. However, like humans, these end-to-end translation models may not always generate the best SQL output on their first try…
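
That observation suggests closing the loop with execution feedback. The sketch below shows a generic generate-execute-verify cycle, not the paper's exact data-grounded self-explanation method; `llm` is a hypothetical callable (prompt string in, completion string out):

```python
import sqlite3

def nl2sql_with_checks(question: str, conn: sqlite3.Connection, llm, max_tries: int = 3):
    """Generate SQL, run it, and let the model inspect sample rows
    before accepting the query. Simplified illustration only."""
    prompt = f"Translate to SQL for SQLite: {question}"
    sql, rows = None, None
    for _ in range(max_tries):
        sql = llm(prompt)
        try:
            rows = conn.execute(sql).fetchmany(5)   # ground in real data
        except sqlite3.Error as err:
            prompt += f"\nCandidate {sql!r} failed with {err}; return corrected SQL."
            continue
        verdict = llm(f"Question: {question}\nSQL: {sql}\nSample rows: {rows}\n"
                      "Does this SQL answer the question? Reply yes or no.")
        if verdict.strip().lower().startswith("yes"):
            break                                    # model accepts its own query
        prompt += f"\nCandidate {sql!r} returned unconvincing rows; try again."
    return sql, rows
```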

[2409.11697] Monomial Matrix Group Equivariant Neural Functional Networks

[Submitted on 18 Sep 2024 (v1), last revised 13 Mar 2025 (this version, v3)] By Viet-Hoang Tran, Thieu N. Vo, Tho H. Tran, An T. Nguyen, and Tan M. Nguyen. Abstract: Neural functional networks (NFNs) have recently gained significant attention due to their diverse applications, ranging from predicting network generalization and network editing to classifying implicit neural representations. Previous NFN designs often depend on permutation symmetries in neural networks’ weights, which traditionally arise from the unordered arrangement of neurons in…
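
The permutation symmetry the abstract builds on is easy to verify directly: relabeling an MLP's hidden neurons (permuting rows of the incoming weight matrix and columns of the outgoing one) leaves the computed function unchanged. A small numpy check, written for this listing rather than taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)  # input -> hidden
W2 = rng.normal(size=(2, 5))                          # hidden -> output
x = rng.normal(size=3)

relu = lambda z: np.maximum(z, 0.0)
forward = lambda A, b, B: B @ relu(A @ x + b)

# Relabel hidden neurons with a random permutation matrix P.
P = np.eye(5)[rng.permutation(5)]
print(np.allclose(forward(W1, b1, W2),
                  forward(P @ W1, P @ b1, W2 @ P.T)))  # True: same function

# Monomial matrix groups generalize P to scaled/signed permutations,
# e.g. for ReLU a positive diagonal rescaling D of the hidden units,
# paired with W2 @ inv(D) on the outgoing side, is also a symmetry.
```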