Articles for category: AI News

Fearing AI will take their jobs, California workers plan a long battle against tech

Meanwhile, people like Amba Kak see opportunities for workers to make gains against technological threats but say that doing so may require strategically picking the right battles. Kak previously advised the Federal Trade Commission and is executive director of the AI Now Institute, a nonprofit that researches the human rights impact of the technology. Kak told CalMatters she plans to pay more attention to activity in state legislatures in places like California and New York, where lawmakers are already considering a bill that protects people from AI in a manner similar to California’s Senate Bill 1047, a controversial bill requiring AI safeguards that

Less is more: How ‘chain of draft’ could cut AI costs by 90% while improving performance

A team of researchers at Zoom Communications has developed a breakthrough technique that could dramatically reduce the cost and computational resources needed for AI systems to tackle complex reasoning problems, potentially transforming how enterprises deploy AI at scale. The method, called chain of draft (CoD), enables large language models (LLMs) to solve problems with minimal words, using as little as 7.6% of the text required by current methods while maintaining or even improving accuracy. The findings were published in a

Opera introduces browser-integrated AI agent

Opera has introduced “Browser Operator,” a native AI agent designed to perform tasks for users directly within the browser. Rather than acting as a separate tool, Browser Operator is an extension of the browser itself—designed to empower users by automating repetitive tasks like purchasing products, completing online forms, and gathering web content. Unlike server-based AI integrations which require sensitive data to be sent to third-party servers, Browser Operator processes tasks locally within the Opera browser. Opera’s demonstration video showcases how Browser Operator can streamline an everyday task like buying socks. Instead of manually scrolling through product pages or filling out

What’s the best AI model to handle $1 million in freelance software engineering?

OpenAI recently introduced SWE-Lancer, a benchmark that tests how well today’s most advanced AI models can handle real-world software engineering tasks. Even though they just tweeted about it 19 hours ago, this paper is already the top paper on AIModels.fyi in the agents category and growing fast. Their paper presents a benchmark that evaluates AI language models on 1,488 actual freelance software engineering tasks from Upwork, collectively worth $1 million in payouts. These tasks were sourced from the open-source Expensify repository. Fig 2. “Evaluation flow for SWE manager tasks; during proposal selection, the model has the ability to browse the

Meta is planning to launch a standalone AI app

According to a report from CNBC, Meta is pushing into the generative AI space by planning to launch its AI chatbot, Meta AI, as a standalone app. The company, led by Mark Zuckerberg, aims to roll out the Meta AI app between April and June 2024 and potentially introduce a subscription model offering advanced features. This development positions Meta AI in direct competition with established

Building advanced AI systems: Challenges and best practices

My name is Akash, co-founder and CEO of Bellum.ai. Our mission is to help companies build reliable AI systems in production. In this talk, I’ll share insights from working with hundreds of companies using AI, highlighting what works, what doesn’t, and where AI development is headed.

The journey to AI innovation

Early experiences with AI

AI has always been on the horizon, but my moment of realization came about four to five years ago, at the beginning of COVID, when I first experimented with GPT-3’s API. It wasn’t perfect, and was prone to generating random, inaccurate responses, but it demonstrated a capability never seen

Understanding Reasoning LLMs – by Sebastian Raschka, PhD

This article describes the four main approaches to building reasoning models, or how we can enhance LLMs with reasoning capabilities. I hope this provides valuable insights and helps you navigate the rapidly evolving literature and hype surrounding this topic. In 2024, the LLM field saw increasing specialization. Beyond pre-training and fine-tuning, we witnessed the rise of specialized applications, from RAGs to code assistants. I expect this trend to accelerate in 2025, with an even greater emphasis on domain- and application-specific optimizations (i.e., “specializations”). Stages 1-3 are the common steps to developing LLMs. Stage 4 specializes LLMs for specific use cases.

Judges Are Fed up With Lawyers Using AI That Hallucinate Court Cases

This article was produced in collaboration with Court Watch, an independent outlet that unearths overlooked court records. Subscribe to them here. After a group of attorneys were caught using AI to cite cases that didn’t actually exist in court documents last month, another lawyer was told to pay $15,000 for his own AI hallucinations that showed up in several briefs.  Attorney Rafael Ramirez, who represented a company called HoosierVac in an ongoing case where the Mid Central Operating Engineers Health and Welfare Fund claims the company is failing to allow the union a full audit of its books and records,