Articles for category: AI Tools

Hugging Face on AMD Instinct MI300 GPU

Join the next Hugging Cast on June 6th to put your questions to the post authors, watch a live demo of deploying Llama 3 on MI300X on Azure, plus a bonus demo of deploying models locally on a Ryzen AI PC! Register at https://streamyard.com/watch/iMZUvJnmz8BV Introduction: At Hugging Face we want to make it easy to build AI with open models and open source, whichever framework, cloud and stack you want to use. A key component is the ability to deploy AI models on a versatile choice of hardware. Through our collaboration with AMD, for about a year now, we have been investing in multiple different

Incomplete JSON Pretty Printer

Every now and then a log file or a tool I’m using will spit out a bunch of JSON that terminates unexpectedly, meaning I can’t copy it into a text editor and pretty-print it to see what’s going on. The other day I got frustrated with this and had the then-new GPT-4.5 build me, as a web page using an OpenAI canvas, a pretty-printer that didn’t mind incomplete JSON. Here’s the chat and here’s the resulting interactive tool. Today I spotted a bug in the way it indented code, so I pasted it into Claude 3.7 Sonnet

Using Coingecko API with Deephaven

It’s hard to avoid the “crypto buzz” that is filling the news and discussion in the world these days. With the release of a powerful, real-time data engine like Deephaven Community, we figured you should know how to pull in both historical and live crypto data. You can navigate to our ready-to-go Docker container and look at our Cryptocurrency history example and read along for more information. First, make sure you have the Python version of Deephaven running with our example data. For more on how to start that, see our quick start. Navigate to the directory /data/examples/CryptoCurrency/, as we

DataScience SG Meetup – Panel On the Different Roles in Data

I was recently invited by DataScience SG to join a panel discussing the various roles in data (e.g., data scientist, machine learning engineer, data engineer, data analyst, etc.). They were looking for someone who had experience hiring across the different roles and I was happy to share my experience. Considering that it was a Thursday night, it was a great turnout: more than 200 people showed up at Google’s Auditorium to attend and ask great questions. From the meetup page: Ever wondered what the different data roles like AI researcher, data scientist, big data engineer, machine learning engineer, and data analyst

Using TF-IDF Vectors With PHP & PostgreSQL

Vectors in PostgreSQL are used to compare data to find similarities, outliers, groupings, classifications and other things. pgvector is a popular extension that adds vector functionality to PostgreSQL. What is TF-IDF? TF-IDF stands for Term Frequency-Inverse Document Frequency. It’s a way to measure the importance of a word in a document relative to a collection of documents. Term Frequency: Term frequency refers to how often a word is used within a document. In a 100-word document, if the word ‘test’ occurs 5 times, then the term frequency would be 5/100 = 0.05. Inverse Document Frequency: Inverse Document
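The term-frequency arithmetic in the excerpt above can be sketched in a few lines. This is only an illustration in Python rather than the article's PHP and PostgreSQL setup; the function name `term_frequency`, the whitespace tokenization, and the sample document are all assumptions made for the example.

```python
from collections import Counter


def term_frequency(term: str, document: str) -> float:
    """Return the share of the document's words that equal `term`.

    Assumption: words are split on whitespace and compared
    case-insensitively; a real tokenizer would also handle punctuation.
    """
    words = document.lower().split()
    if not words:
        return 0.0
    return Counter(words)[term.lower()] / len(words)


# Hypothetical 100-word document: 'test' appears 5 times among 95 filler words.
document = " ".join(["test"] * 5 + ["filler"] * 95)

print(term_frequency("test", document))  # 5 / 100
```

Dividing the count by the document length is what makes the score comparable between short and long documents.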

Fiscal data in text: Information extraction from audit reports using Natural Language Processing | Data & Policy

Policy Significance Statement: Annual audits by supreme audit institutions produce important information on the health and accuracy of governmental budgets. These reports include the monetary value of discrepancies, missing funds, and corrupt actions. This paper offers a strategy for collecting that information from historical audit reports and creating a database on budgetary discrepancies. It uses machine learning and natural language processing to accelerate and scale the collection of data to thousands of paragraphs. The granularity of the budgetary data obtained through this approach is useful to reformers and policymakers who require detailed data on municipal finances. This approach can also

Hugging Face and Microsoft Deepen Collaboration

Today at Microsoft Build we are happy to announce a broad set of new features and collaborations as Microsoft and Hugging Face deepen their strategic collaboration to make open models and open source AI easier to use everywhere. Together, we will work to enable AI builders across open science, open source, cloud, hardware and developer experiences – read on for announcements today on all fronts! A collaboration for Cloud AI Builders: We are excited to announce two major new experiences to build AI with open models on Microsoft Azure. Expanded HF Collection in Azure Model Catalog: A year ago, Hugging

Tracing the thoughts of a large language model

In a follow-up to the research that brought us the delightful Golden Gate Claude last year, Anthropic have published two new papers about LLM interpretability. To my own personal delight, neither of these papers is published as a PDF. They’re both presented as glorious mobile-friendly HTML pages with linkable sections and even some inline interactive diagrams. More of this please!

Using Redpanda to stream Docker Stats in Deephaven

If you’re anything like me, you use Docker for nearly everything. However, for most of my projects, I try to maximize efficiency but don’t feel the need to go full Kubernetes on the project. That’s where this simple application comes in. I want a live feed of all my Docker containers, since I will (often) forget to change the settings inside my docker-compose.yml and try to do something like load a 2 GB file without thinking. Or, I do a little side project and forget to tidy up the container with a docker compose down. I use some of the

Data Science and Agile (What Works, and What Doesn’t)

Since I last posted on moderating a panel on Data Science and Agile, some have reached out for my views on this. The topic is also discussed in the data science community, with questions on how agile can be incorporated into a data science team and how to get the gains in productivity. Can agile work well with data science? (Hint: If it couldn’t, this post, and the next, wouldn’t exist.) Follow-up: Data Science and Agile (Frameworks for effectiveness). Follow-up: What I Love about Scrum for Data Science. In this post, we’ll discuss the strengths and weaknesses of Agile