Articles for category: AI Tools

Multimodal Large Language Models

Multimodal Large Language Models (MLLMs) process data from different modalities like text, audio, image, and video. Compared to text-only models, MLLMs achieve richer contextual understanding and can integrate information across modalities, unlocking new areas of application. Prime use cases of MLLMs include content creation, personalized recommendations, and human-machine interaction. Examples of MLLMs that process image and text data include Microsoft’s Kosmos-1, DeepMind’s Flamingo, and the open-source LLaVA. Google’s PaLM-E additionally handles information about a robot’s state and surroundings. Combining different modalities and dealing with different types of data comes with some challenges and limitations, such as alignment of heterogeneous data,

Unlocking the Potential of AI Agents: From Pilots to Production Success

While 85% of global enterprises already use Generative AI (GenAI), organizations face significant challenges scaling these projects beyond the pilot phase. Even the most advanced GenAI models struggle to deliver business-specific, accurate, and well-governed outputs, largely because they lack awareness of relevant enterprise data. While many customers are comfortable deploying GenAI solutions across low-risk, limited-scope use cases, most do not have the confidence to deploy for external or internal use cases that carry financial risk. Today we are excited to introduce several key innovations that will help enterprises scale and deploy AI agents with confidence. These include: Centralized governance for

What the history of the web can teach us about the future of AI

Blog post: https://explosion.ai/blog/history-web-future-ai Video: https://www.youtube.com/live/kpocg6b89Fs?si=pmNN1kX5GJCe1vke&t=3840 Recent advancements in Generative AI are exciting, and will surely have a significant, yet uncertain impact on the future. Are we still going to need developers going forward, or will they be replaced by AI? Is Big Tech monopolizing the technology? And will we become entirely dependent on API providers, sacrificing the spirit of open-source software and data privacy? I believe there is a lot we can learn from another groundbreaking technology: the web. In this talk, I’ll show you what the history of the web can teach us about the future of artificial intelligence,

FLUX.1 Tools – Control and steerability for FLUX

The team at Black Forest Labs is back with FLUX.1 Tools, a new set of models that add control and steerability to their FLUX text-to-image model. The FLUX.1 Tools lineup includes four new features: Fill: Inpainting and outpainting, like a magic AI paintbrush for precise edits. Canny: Use edge detection to generate images with precise structure. Depth: Use depth maps to generate images with realistic perspective. Redux: An adapter for the FLUX.1 base models that you can use to create variations of images. Each of these new features is available for both the FLUX.1 [dev] and FLUX.1 [pro] models, with

Understanding Adversarial Attacks Using Fast Gradient Sign Method

In machine learning and artificial intelligence, adversarial attacks have gained much attention from researchers. These attacks alter the inputs to mislead the model into making wrong predictions. Among these, the Fast Gradient Sign Method (FGSM) is particularly worth mentioning because of its effectiveness and simplicity. The significance of FGSM lies in its ability to expose the vulnerability of modern models to minor variations in input data. These perturbations, which frequently go unnoticed by human observers, degrade prediction accuracy. Understanding and minimizing these vulnerabilities is pivotal to building fault-resistant machine learning systems trusted in practical applications like autonomous
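To make the idea concrete, here is a minimal sketch of FGSM in plain Python. It is not taken from the article: the tiny logistic-regression "model", its weights, the input, and the epsilon value are all illustrative assumptions. The only part that is FGSM proper is the update rule, which moves each input feature by epsilon in the direction of the sign of the loss gradient.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability of class 1 under a toy logistic-regression model (an assumption)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(weights, bias, x, y, epsilon):
    """Return x + epsilon * sign(dLoss/dx) for binary cross-entropy loss."""
    p = predict_proba(weights, bias, x)
    # For logistic regression, the gradient of the cross-entropy loss
    # with respect to the input is (p - y) * w.
    grad = [(p - y) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model parameters and input, chosen only for illustration.
weights = [2.0, -3.0, 1.0]
bias = 0.0
x = [0.2, -0.1, 0.1]
y = 1  # true label

x_adv = fgsm(weights, bias, x, y, epsilon=0.2)
p_clean = predict_proba(weights, bias, x)
p_adv = predict_proba(weights, bias, x_adv)
print(f"clean: {p_clean:.3f}  adversarial: {p_adv:.3f}")
```

In this toy setup, no feature moves by more than epsilon, yet the predicted probability of the true class drops below 0.5, so the prediction flips. Real attacks apply the same rule to a neural network's input gradient, typically obtained through automatic differentiation.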

How C.H. Robinson is transforming the logistics industry with LangChain

C.H. Robinson is one of the world’s largest global logistics providers, managing 37 million shipments a year by ocean, air, rail and truck. It’s known for solving logistics challenges from the simple to the most complex. With the advent of GenAI, the company has created proprietary tech that represents an efficiency breakthrough for its industry and for supply chains around the world. Problem they’re solving: Customers using C.H. Robinson’s digital tools have been able to get instant service for years. But thousands of its 83,000 customers still prefer to conduct many routine transactions by email, requiring people to read the

We Tried OpenAI’s New Deep Research—Here’s What We Found

On Sunday night, OpenAI dropped a new tool called “deep research,” an agentic research assistant. If you give it a question (like, “Can you compile a wardrobe for me based on these pictures?” or “Can you write a history of Every’s business from its founding until now?” or “Look through recently filed 10ks, do you see any uncaught financial discrepancies?”), it will happily go off and compile a full-blown research report that can run upwards of 16,000 words. Crucially, it’s not like previous versions of ChatGPT that are equipped

GB300 & B300 – Reasoning Inference, Amazon, Memory, Supply Chain – SemiAnalysis

Merry Christmas has come thanks to Santa Huang. Nvidia’s Blackwell GPUs have had multiple delays due to silicon, packaging, and backplane issues (discussed here, and numerous times through the Accelerator Model), but that hasn’t stopped Nvidia from continuing their relentless march. They are bringing to market a brand-new GPU only 6 months after GB200 & B200, titled GB300 & B300. While on the surface it sounds incremental, there’s a lot more than meets the eye. The changes are

llm-openrouter 0.4

llm-openrouter 0.4. I found out this morning that OpenRouter include support for a number of (rate-limited) free API models. I occasionally run workshops on top of LLMs (like this one) and being able to provide students with a quick way to obtain an API key against models where they don’t have to set up billing is really valuable to me! This inspired me to upgrade my existing llm-openrouter plugin, and in doing so I closed out a bunch of open feature requests. Consider this post the annotated release notes: I’m trying to get support for LLM’s new schema feature into as

Release notes for Deephaven Core version 0.36

Deephaven Community Core version 0.36.0 is available now, with several new features, improvements, bug fixes, and more. We’ve rounded up the highlights below. Native table iteration in Python: Four new table operations are now available that allow you to iterate over table data in Python efficiently. They are: The first two iterate over the table one row at a time, while the latter two iterate over chunks of rows. All four methods use efficient chunked operations on the backend and return generators to minimize data copies and memory usage, making them ideal for large tables. Take a look at how they’re
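The difference between row-at-a-time and chunked iteration can be sketched with plain Python generators. This is not Deephaven's API: the functions iter_rows and iter_chunks and the dict-of-lists "table" below are hypothetical stand-ins, meant only to illustrate how a generator backed by chunked reads can hand out rows or whole chunks without copying the full table up front.

```python
from typing import Dict, Iterator, List

Columns = Dict[str, List]  # hypothetical columnar "table": column name -> values


def iter_chunks(columns: Columns, chunk_size: int = 2) -> Iterator[Columns]:
    """Yield successive slices of every column, at most chunk_size rows each."""
    names = list(columns)
    n_rows = len(columns[names[0]]) if names else 0
    for start in range(0, n_rows, chunk_size):
        yield {name: columns[name][start:start + chunk_size] for name in names}


def iter_rows(columns: Columns, chunk_size: int = 2) -> Iterator[dict]:
    """Yield one row at a time as a dict, reading the data chunk by chunk."""
    for chunk in iter_chunks(columns, chunk_size):
        names = list(chunk)
        for i in range(len(chunk[names[0]])):
            yield {name: chunk[name][i] for name in names}


# Illustrative data only.
table = {"sym": ["A", "B", "C"], "price": [1.0, 2.0, 3.0]}

rows = list(iter_rows(table))
chunks = list(iter_chunks(table, chunk_size=2))
print(rows[0])
print(chunks[-1])
```

Because both functions are generators, a caller that stops early never touches the remaining rows, which is the memory benefit the release notes describe.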