Articles for category: AI Tools

Why partitioned tables are powerful

“You don’t have to be an engineer to be a racing driver, but you do have to have Mechanical Sympathy.” – Jackie Stewart, racing driver
This simple quote has deep meaning for many facets of life, and its value cannot be overstated when it comes to software. In this article, we’ll touch on an important concept in Deephaven – partitioned tables. As with auto racing, having a grasp of the mechanics behind a system will enable you to maximize its potential. So, let’s take a closer look at partitioned tables and how they can help you get the …
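
To make the idea concrete, here is a minimal sketch of creating and querying a partitioned table with Deephaven's Python API; the table and column names are illustrative and not taken from the article.

```python
# A minimal sketch, assuming a running Deephaven server with the Python API.
from deephaven import empty_table

# Build a small table with a symbol column and a price column.
# In the Deephaven query language, `i` is the row index and backticks quote strings.
trades = empty_table(10).update([
    "Sym = i % 2 == 0 ? `AAPL` : `MSFT`",
    "Price = 100 + i * 0.5",
])

# Split the table into one constituent table per distinct symbol.
partitioned = trades.partition_by(["Sym"])

# Operations can then run per partition; here we fetch one constituent.
aapl = partitioned.get_constituent(["AAPL"])
```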

Fine-Tuning Llama 3.2 Vision

Vision Language Models (VLMs) are powerful AI architectures. Today, we use them for image captioning, scene understanding, and complex mathematical tasks. Large proprietary models such as ChatGPT, Claude, and Gemini excel at tasks like converting equation images to raw LaTeX. However, smaller open-source models like Llama 3.2 Vision struggle, especially in 4-bit quantized format. In this article, we will tackle this use case: fine-tuning Llama 3.2 Vision to convert mathematical equation images to raw LaTeX.
Figure 1. Gradio demo after fine-tuning Llama 3.2 Vision to convert equation images to LaTeX.
The primary aim of …
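
As a rough sketch of the setup such a fine-tune typically involves, the snippet below loads Llama 3.2 Vision in 4-bit and attaches LoRA adapters, assuming Hugging Face transformers, bitsandbytes, and peft are installed. The hyperparameters are illustrative guesses, not the article's actual configuration.

```python
import torch
from transformers import MllamaForConditionalGeneration, AutoProcessor, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

# 4-bit NF4 quantization to fit the 11B model in modest GPU memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = MllamaForConditionalGeneration.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Train only small LoRA adapters on the attention projections,
# leaving the quantized base weights frozen.
lora = LoraConfig(r=16, lora_alpha=32,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```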

Data Machina #261 – by Carlos

Generative AI + Time-Series Forecasting? Many world-class organisations are starting to invest in new GenAI+TS forecasting methods that involve, for example: developing new specialised VAEs, using Vision-Language Models, pre-training the model with trillions of TS data points, or incorporating text embedding and tokenisation into the TS forecasting method. Check out these 6 very recent, interesting papers that show the impressive, rapid evolution in this area. Re-programming LLMs for time-series modelling. This is a great post about how researchers are trying to bridge the information gap between time series and natural language at every stage of training an LLM. Re-programming an LLM for …
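
A toy illustration (mine, not from the newsletter) of the "patching" idea behind methods like Time-LLM and PatchTST: slice a numeric series into fixed-length patches and project each patch into the embedding space an LLM-style backbone can consume.

```python
import torch

series = torch.randn(1, 512)          # (batch, time steps)
patch_len, stride = 16, 16

# Unfold the series into non-overlapping patches: (batch, n_patches, patch_len)
patches = series.unfold(dimension=1, size=patch_len, step=stride)

# A linear projection maps each numeric patch to a d_model-sized "token".
d_model = 768
embed = torch.nn.Linear(patch_len, d_model)
tokens = embed(patches)               # (batch, n_patches, d_model)
print(tokens.shape)                   # torch.Size([1, 32, 768])
```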

DeepSeek Fire-Flyer: What You Need to Know

While DeepSeek has garnered headlines for its increasingly powerful AI models, a key ingredient lies beneath the surface: Fire-Flyer, an ambitious homegrown AI-HPC infrastructure that enables training trillion-parameter models at unprecedented cost efficiency. What makes this software-hardware co-design framework even more remarkable is that DeepSeek has accomplished this infrastructure feat with a team of fewer than 300 employees, showcasing their deep technical expertise in building a system optimized for high-speed data access and efficient computation. This infrastructure-first approach represents a significant competitive advantage, demonstrating how focused investment in the computational backbone can yield outsized results in the rapidly evolving AI …

2024 Year in Review

2024 was a peaceful year of steady progress. With regard to my craft, the prototypes of 2023 were scaled and put into production, and I rediscovered the joy of building in public. On the personal side, I continued the prior year’s focus on health, further improving my diet and exercise habits, leading to measurable results.
Past years: 2020, 2021, 2022, 2023
Reviewing my 2024 goals
✅ Work: Shipped ML/LLM systems that serve customers at scale. 2024 was the year of productionizing the prototypes of 2023. I learned how to deploy these systems reliably, at scale, and cost-effectively, for both customer-facing UXes …

Ngrok Alternative for UDP Tunneling: Exploring Better Options

Why UDP Tunneling Matters for Developers
For developers, network engineers, and IT professionals, tunneling is essential when working with local applications that need external access. Many rely on Ngrok, a widely used tunneling tool that simplifies exposing local servers to the internet. However, a significant limitation of Ngrok is its lack of support for UDP tunneling, which is crucial for applications like multiplayer gaming, VoIP, DNS, and IoT services. Since UDP is widely used for real-time communication due to its low latency, developers often face challenges when exposing their UDP-based applications. Fortunately, alternative solutions exist that support UDP tunneling efficiently.
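
To show the kind of datagram forwarding these tunneling tools perform, here is a bare-bones single-client UDP relay using only the Python standard library. It is an illustration of the mechanism, not a production tunnel; the addresses are placeholders.

```python
import socket

LISTEN = ("0.0.0.0", 5353)
UPSTREAM = ("8.8.8.8", 53)   # example upstream: a public DNS resolver

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(LISTEN)
client = None  # tracks the most recent client (single-client only)

while True:
    data, addr = sock.recvfrom(65535)
    if addr == UPSTREAM:
        # Reply from upstream: relay it back to the client.
        if client is not None:
            sock.sendto(data, client)
    else:
        # Request from a client: remember the sender and forward upstream.
        client = addr
        sock.sendto(data, UPSTREAM)
```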

Title Launch Observability at Netflix Scale

Part 3: System Strategies and Architecture
By Varun Khaitan, with special thanks to my stunning colleagues Mallika Rao, Esmir Mesic, and Hugo Marques.
This blog post is a continuation of Part 2, where we cleared up the ambiguity around title launch observability at Netflix. In this installment, we will explore the strategies, tools, and methodologies that were employed to achieve comprehensive title observability at scale. To create a comprehensive solution, we decided to introduce observability endpoints first. Each microservice in our Personalization stack that integrated with our observability solution had to introduce a new “Title Health” endpoint. Our goal was for …
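
As a hypothetical sketch of what such a per-microservice "Title Health" endpoint could look like, the snippet below uses FastAPI; the route, schema, and field names are my guesses for illustration, not Netflix's actual API.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class TitleHealth(BaseModel):
    title_id: str
    healthy: bool
    checks: dict[str, bool]   # per-dependency health signals

@app.get("/title-health/{title_id}", response_model=TitleHealth)
def title_health(title_id: str) -> TitleHealth:
    # Each service would evaluate its own signals for the title here,
    # e.g. "is metadata present?", "is the title servable in ranking?".
    checks = {"metadata_present": True, "servable": True}
    return TitleHealth(title_id=title_id,
                       healthy=all(checks.values()),
                       checks=checks)
```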

MySQL At Uber

How does Uber achieve 99.99% availability across 2,000+ MySQL® clusters? Learn how we manage our MySQL fleet at scale, from architecture to control plane optimizations.

How To Use mongodump for MongoDB Backups

Maintaining backups is vital for every organization. Data backups act as a safety measure: data can be recovered or restored in case of an emergency. Typically, you create database backups by replicating the database using either built-in tools or specialized external backup services.
Backing Up MongoDB
MongoDB offers multiple built-in backup options depending on the MongoDB deployment method you use. We’ll look briefly at the options, but then we’ll show you how to use one particular option – mongodump – for the backup process.
Built-in backups in MongoDB
Here …
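
For a taste of what the walkthrough covers, here is a small example of driving mongodump from a scheduled Python script. The flags shown (--uri, --gzip, --out) are standard mongodump options, but the connection URI and output path are placeholders.

```python
import subprocess
from datetime import date

uri = "mongodb://localhost:27017"
out_dir = f"/backups/mongo-{date.today().isoformat()}"

# Dump all databases as gzipped BSON into a dated directory.
subprocess.run(
    ["mongodump", "--uri", uri, "--gzip", "--out", out_dir],
    check=True,  # raise if the backup fails so the scheduler can alert
)
print(f"Backup written to {out_dir}")
```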

8-bit Quantization with Lightning Fabric

Takeaways
Readers will learn the basics of Lightning Fabric’s plugin for 8-bit quantization.
Introduction
The aim of 8-bit quantization is to reduce the memory usage of model parameters by using lower-precision types than full (float32) or half (bfloat16) precision. In other words, 8-bit quantization compresses models with billions of parameters, like Llama 2 or SDXL, so that they require less memory. Thankfully, Lightning Fabric makes quantization as easy as setting a mode flag in a plugin!
8-bit Quantization
8-bit quantization is discussed in the popular paper 8-bit Optimizers via Block-wise Quantization and was introduced in FP8 Formats for …
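
A minimal sketch of the plugin in question, assuming lightning and bitsandbytes are installed and a CUDA GPU is available; the Linear layer is a stand-in for a real model.

```python
import torch
from lightning.fabric import Fabric
from lightning.fabric.plugins import BitsandbytesPrecision

# "int8" selects 8-bit quantization; the plugin also offers 4-bit modes.
fabric = Fabric(devices=1, plugins=BitsandbytesPrecision(mode="int8"))
fabric.launch()

# Layers are swapped for quantized equivalents when the module is
# set up and moved to the device.
with fabric.init_module():
    model = torch.nn.Linear(4096, 4096)
model = fabric.setup_module(model)
```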