Articles for category: AI Research

Deep Out-of-Distribution Uncertainty Quantification via Weight Entropy Maximization

Deep Out-of-Distribution Uncertainty Quantification via Weight Entropy Maximization Antoine de Mathelin, François Deheeger, Mathilde Mougeot, Nicolas Vayatis; 26(4):1−68, 2025. Abstract This paper deals with uncertainty quantification and out-of-distribution detection in deep learning using Bayesian and ensemble methods. It proposes a practical solution to the lack of prediction diversity observed recently for standard approaches when used out-of-distribution (Ovadia et al., 2019; Liu et al., 2021). Considering that this issue is mainly related to a lack of weight diversity, we claim that standard methods sample in “over-restricted” regions of the weight space due to the use of “over-regularization” processes, such as weight
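The diversity issue the abstract refers to is typically measured on an ensemble's predictions. As a point of reference only (a generic sketch, not the authors' weight-entropy-maximization method; the `models` list, input batch, and classification setting are assumptions), the predictive entropy of a deep ensemble is a common out-of-distribution score:

```python
# Generic sketch: out-of-distribution scoring with a deep ensemble's
# predictive entropy (not the paper's weight entropy maximization).
import torch
import torch.nn.functional as F

@torch.no_grad()
def predictive_entropy(models, x):
    """models: list of trained classifiers; x: batch of inputs."""
    # Average the ensemble members' softmax outputs.
    probs = torch.stack([F.softmax(m(x), dim=-1) for m in models]).mean(dim=0)
    # Higher entropy => less agreement / more uncertainty => more likely OOD.
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
```

When the ensemble members collapse to nearly identical weights, such scores barely separate in-distribution from out-of-distribution inputs, which is the failure mode the paper targets.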

Cell2fate infers RNA velocity modules to improve cell fate prediction

The cell2fate model Cell2fate builds on established concepts for RNA velocity [1,2], employing a dynamical model to explain variation in spliced (s) and unspliced (u) read counts for individual genes and cells (Fig. 1a), which can be described by two coupled ODEs:

$$\frac{du_g}{dt} = \alpha_g(t) - \beta_g u_g, \qquad (1)$$

$$\frac{ds_g}{dt} = \beta_g u_g - \gamma_g s_g. \qquad (2)$$

Fig. 1: Cell2fate model overview. a–d, Cell2fate allows inferring complex and subtle transcriptional dynamics (a) by modeling gene-specific transcription rates (b) using a smaller number of independent modules with simple dynamics that also give rise to a modular structure in RNA velocity (c) and counts (d). λ denotes the rate of
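To make the two ODEs concrete, the sketch below integrates them for a single gene with a constant transcription rate (illustrative parameter values, not cell2fate's inferred modular rates):

```python
# Sketch: integrate the unspliced/spliced ODEs for one gene with scipy
# (constant transcription rate and illustrative constants, not cell2fate output).
import numpy as np
from scipy.integrate import solve_ivp

alpha, beta, gamma = 2.0, 1.0, 0.5        # transcription, splicing, degradation rates

def rhs(t, y):
    u, s = y
    du = alpha - beta * u                 # du_g/dt = alpha_g(t) - beta_g * u_g
    ds = beta * u - gamma * s             # ds_g/dt = beta_g * u_g - gamma_g * s_g
    return [du, ds]

sol = solve_ivp(rhs, (0.0, 10.0), y0=[0.0, 0.0], t_eval=np.linspace(0.0, 10.0, 50))
u_t, s_t = sol.y                          # unspliced and spliced abundances over time
velocity = beta * u_t - gamma * s_t       # RNA velocity is ds_g/dt
```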

SAEs trained on the same data don’t learn the same features

In this post, we show that when two TopK SAEs are trained on the same data, with the same batch order but with different random initializations, there are many latents in the first SAE that don’t have a close counterpart in the second, and vice versa. Indeed, after training, only about 53% of the features are shared. Furthermore, many of these unshared latents are interpretable. We find that narrower SAEs have a higher feature overlap across random seeds, and as the size of the SAE increases, the overlap decreases. This is consistent with evidence from the feature splitting and absorption
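One simple way to operationalize "close counterpart" (an assumption on our part, not necessarily the post's exact metric) is to match each latent's decoder direction in one SAE to its most similar direction in the other and threshold on cosine similarity:

```python
# Sketch: fraction of latents in SAE A that have a close counterpart in SAE B,
# using max cosine similarity between decoder directions (threshold is illustrative).
import torch
import torch.nn.functional as F

def feature_overlap(decoder_a: torch.Tensor, decoder_b: torch.Tensor, threshold: float = 0.7) -> float:
    """decoder_a, decoder_b: (n_latents, d_model) decoder weight matrices."""
    a = F.normalize(decoder_a, dim=-1)
    b = F.normalize(decoder_b, dim=-1)
    sims = a @ b.T                         # pairwise cosine similarities
    best = sims.max(dim=1).values          # best match in B for each latent in A
    return (best > threshold).float().mean().item()
```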

Stanford CRFM

*Work done while at Stanford CRFM We introduce HELM Safety v1.0 as a collection of 5 safety benchmarks spanning 6 risk categories (violence, fraud, discrimination, sexual content, harassment, deception) and evaluate 24 prominent language models as an ongoing effort to standardize safety evaluations. Content Warning: The transcripts in the evaluations of HELM Safety may be offensive due to their coverage of topics such as discrimination, violence, and other types of harm. Introduction Given the many risks of language models, such as inadvertent bias, development of chemical weapons, and malicious use for scams, many efforts aim to improve the safety of language models.

Virtual Personas for Language Models via an Anthology of Backstories – The Berkeley Artificial Intelligence Research Blog

We introduce Anthology, a method for conditioning LLMs to representative, consistent, and diverse virtual personas by generating and utilizing naturalistic backstories with rich details of individual values and experience. What does it mean for large language models (LLMs) to be trained on massive text corpora, collectively produced by millions and billions of distinctive human authors? In “Language Models as Agent Models”, compelling evidence suggests that recent language models could be considered models of agents: provided with a textual context, LLMs are capable of generating conditional text that represents the characteristics of an agent likely to have produced that context. This
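Mechanically, the conditioning amounts to prefixing a query with a generated backstory; a minimal sketch (hypothetical backstory text and survey question, not Anthology's released pipeline) looks like:

```python
# Sketch: condition a language model on a generated backstory before asking a
# survey question (backstory and question below are hypothetical examples).
def build_persona_prompt(backstory: str, question: str) -> str:
    return (
        f"{backstory}\n\n"
        f"Question: {question}\n"
        f"Answer as this person would:"
    )

backstory = (
    "I grew up in a small farming town, worked nights to pay for community college, "
    "and now teach middle-school science while raising two kids."
)
prompt = build_persona_prompt(backstory, "How much do you trust national news media?")
# `prompt` is then sent to the LLM; Anthology generates many such naturalistic
# backstories to obtain diverse, representative virtual personas.
```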

Start building with Gemini 2.0 Flash and Flash-Lite

Since the launch of the Gemini 2.0 Flash model family, developers are discovering new use cases for this highly efficient family of models. Gemini 2.0 Flash offers stronger performance than 1.5 Flash and 1.5 Pro, plus simplified pricing that makes our 1 million token context window more affordable. Today, Gemini 2.0 Flash-Lite is generally available in the Gemini API for production use in Google AI Studio and for enterprise customers on Vertex AI. 2.0 Flash-Lite offers improved performance over 1.5 Flash across reasoning, multimodal, math and factuality benchmarks. For projects that require long context windows, 2.0 Flash-Lite is an
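For developers getting started, a minimal call through the Gemini API might look like the following (this assumes the `google-genai` Python SDK and an API key from Google AI Studio; consult the official docs for authoritative usage):

```python
# Sketch: basic text generation with Gemini 2.0 Flash-Lite via the google-genai SDK
# (model name and SDK usage assumed from the announcement; verify against the docs).
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")
response = client.models.generate_content(
    model="gemini-2.0-flash-lite",
    contents="In two sentences, when would I pick Flash-Lite over Flash?",
)
print(response.text)
```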

[2410.02490] Stochastic variance-reduced Gaussian variational inference on the Bures-Wasserstein manifold

[Submitted on 3 Oct 2024 (v1), last revised 28 Feb 2025 (this version, v2)] By Hoang Phuc Hau Luu and 4 other authors. Abstract: Optimization in the Bures-Wasserstein space has been gaining popularity in the machine learning community since it draws connections between variational inference and Wasserstein gradient flows. The variational inference objective function of Kullback-Leibler divergence can be written as the sum of the negative entropy and the potential energy, making forward-backward Euler the method of choice. Notably, the backward
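Concretely, the decomposition the abstract alludes to (written here in generic notation, as an assumption about the paper's setup) is, for a target density $\pi(x) \propto e^{-V(x)}$,

$$\mathrm{KL}(q\,\|\,\pi) \;=\; \underbrace{\mathbb{E}_{q}[V(x)]}_{\text{potential energy}} \;+\; \underbrace{\mathbb{E}_{q}[\log q(x)]}_{\text{negative entropy}} \;+\; \text{const},$$

so a forward-backward Euler scheme can treat the smooth potential term with a forward (gradient) step and the entropy term with a backward (proximal) step.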

Exploring Unified Vision-Language Tracking with Multi-Modal Alignment

[Submitted on 7 Jul 2023 (v1), last revised 28 Feb 2025 (this version, v2)] All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment, by Chunhui Zhang and 6 other authors. Abstract: The current mainstream vision-language (VL) tracking framework consists of three parts, i.e., a visual feature extractor, a language feature extractor, and a fusion model. To pursue better performance, a natural modus operandi for VL tracking is employing customized and heavier unimodal encoders and multi-modal fusion models. Albeit effective, existing VL trackers separate feature extraction and feature integration, resulting in
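As a sketch of the three-part pipeline the abstract describes (generic module names and shapes, not the paper's unified architecture):

```python
# Sketch of the conventional three-part VL tracking pipeline described in the
# abstract: separate visual/language encoders followed by a fusion model.
import torch.nn as nn

class ConventionalVLTracker(nn.Module):
    def __init__(self, visual_encoder: nn.Module, language_encoder: nn.Module, d_model: int = 256):
        super().__init__()
        self.visual_encoder = visual_encoder        # e.g. a CNN/ViT backbone
        self.language_encoder = language_encoder    # e.g. a text transformer
        self.fusion = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
        self.head = nn.Linear(d_model, 4)           # bounding-box regression (x, y, w, h)

    def forward(self, frames, text_tokens):
        v = self.visual_encoder(frames)             # (B, N_patches, d_model)
        l = self.language_encoder(text_tokens)      # (B, N_tokens, d_model)
        fused, _ = self.fusion(query=v, key=l, value=l)
        return self.head(fused.mean(dim=1))         # one box prediction per sequence
```

The title suggests the paper replaces this separation of extraction and fusion with a single unified, jointly aligned model.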

[2410.18210] Towards Understanding the Fragility of Multilingual LLMs against Fine-Tuning Attacks

[Submitted on 23 Oct 2024 (v1), last revised 27 Feb 2025 (this version, v2)] By Samuele Poppi and 6 other authors. Abstract: Recent advancements in Large Language Models (LLMs) have sparked widespread concerns about their safety. Recent work demonstrates that safety alignment of LLMs can be easily removed by fine-tuning with a few adversarially chosen instruction-following examples, i.e., fine-tuning attacks. We take a further step to understand fine-tuning attacks in multilingual LLMs. We first discover cross-lingual generalization of fine-tuning attacks:

[2405.18540] Learning diverse attacks on large language models for robust red-teaming and safety tuning

[Submitted on 28 May 2024 (v1), last revised 28 Feb 2025 (this version, v2)] Authors: Seanie Lee, Minsu Kim, Lynn Cherif, David Dobre, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Moksh Jain. Abstract: Red-teaming, or identifying prompts that elicit harmful responses, is a critical step in ensuring the safe and responsible deployment of large language models (LLMs). Developing effective protection against many modes of attack