Articles for category: AI Tools

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints

Whisper is one of the best open-source speech recognition models and definitely the one most widely used. Hugging Face Inference Endpoints make it very easy to deploy any Whisper model out of the box. However, if you’d like to introduce additional features, like a diarization pipeline to identify speakers, or assisted generation for speculative decoding, things get trickier. The reason is that you need to combine Whisper with additional models, while still exposing a single API endpoint. We’ll solve this challenge using a custom inference handler, which will implement the Automatic Speech Recognition (ASR) and Diarization pipeline on Inference

SMU Masters in IT – How to get started in Data Science

I was recently invited to guest lecture at SMU’s Masters of IT in Business. The grad students were mostly interested in my journey into data science, as well as what the data team does at Lazada. Here’s what I shared with them, largely based on a previous post (How to get Started in Data Science). P.S., here’s what I shared at SMU’s MITB last year. If you found this useful, please cite this write-up as: Yan, Ziyou. (Jun 2017). SMU Masters in IT – How to get started in Data Science. eugeneyan.com. https://eugeneyan.com/speaking/how-to-get-started-in-data-science-talk/. or @article{yan2017masters, title = {SMU Masters in

Complete Guide: Build Your 1st AI Agent (No-Code + Code Tutorial)

Curious about how AI agents actually work, and how to build one yourself? In this hands-on tutorial, I walk you through everything you need to know to get started. Whether you’re a developer looking to build intelligent workflows with the OpenAI Agent SDK or someone who prefers a low-code approach using Lyzr AI, this guide covers it all. We’ll explore how to: create an AI-powered email assistant using OpenAI Agent SDK, Resend, and Firecrawl; deploy your agent efficiently using Nebius AI Studio for low-cost inference; and build the same kind of agent, without writing code, using the Lyzr AI platform. This is a great

GitHub – wjbmattingly/ww2-spacy

WW2 spaCy is a pipeline for processing primary and secondary sources for World War 2 and performing named entity recognition (NER). Currently, the pipeline is focused on United States-based military NER with plans to expand to include other countries in the near future. NOTE: Updates are occurring frequently for this pipeline to expand its labels and the accuracy of the rules. The pipeline is designed to not rely on machine learning so that it can remain modular, meaning a user can take a specific pipe and attach it to their own pipeline without issue. The main component is a SpanRuler

🚀 Accelerating LLM Inference with TGI on Intel Gaudi

We’re excited to announce the native integration of Intel Gaudi hardware support directly into Text Generation Inference (TGI), our production-ready serving solution for Large Language Models (LLMs). This integration brings the power of Intel’s specialized AI accelerators to our high-performance inference stack, enabling more deployment options for the open-source AI community 🎉 ✨ What’s New? We’ve fully integrated Gaudi support into TGI’s main codebase in PR #3091. Previously, we maintained a separate fork for Gaudi devices at tgi-gaudi. This was cumbersome for users and prevented us from supporting the latest TGI features at launch. Now using the new TGI multi-backend

Tech in Asia – My Journey in Data Science and Advice for others

Recently, Christopher, Managing Partner at Tri5 Ventures, reached out for an interview about “The Life of a Data Scientist”. The intent is to share knowledge and insight with people aspiring to enter the field, or those currently practicing data science. The article was published a week ago on Tech in Asia and can be found here: “4 Singapore-based data scientists share how data has been impacting lives”. It covers data science professionals across multiple backgrounds, including researchers, entrepreneurs, and startups. A few people have asked if I could build on what was shared in the article, so I’m reproducing my

HarmonyOS NEXT Practical: Pop up Bottom Menu

Goal: Implement a bottom menu that can pop up and close. Knowledge points: A half-modal page (bindSheet) is, by default, a modal pop-up that does not fill the screen, so part of the underlying parent view stays visible and users keep the context of the parent view while interacting with the half modal. Half-modal pages are suitable for displaying simple tasks or information panels, such as personal information, text introductions, sharing panels, creating schedules, adding content, etc. If you need to display a half-modal page that may affect the parent view, the half modal supports configuring it as a

Reflections on a year of spaCy consulting at Explosion

We’ve been offering consulting services through spaCy Tailored Pipelines for almost a year! I thought I’d post some lessons I’ve learned from chatting with practitioners about their NLP challenges, developing production-ready NLP pipelines for clients, and working with an open-source development team. First, it’s always worth the time to understand a client’s problem as fully as possible. We try to be thorough yet expeditious in our interactions – it can take a lot of time and energy to get everyone to a shared understanding of when a challenge is a fit for our services. Not everything results in a project,

Bringing the Artificial Analysis LLM Performance Leaderboard to Hugging Face

Building applications with LLMs requires considering more than just quality: for many use cases, speed and price are equally or more important. For consumer applications and chat experiences, speed and responsiveness are critical to user engagement. Users expect near-instant responses, and delays can directly lead to reduced engagement. When building more complex applications involving tool use or agentic systems, speed and cost become even more important, and can become the limiting factor on overall system capability. The time taken by sequential requests to LLMs can quickly stack up for each user request, adding to the cost. This is why Artificial Analysis

OMSCS CS6300 (Software Development Process) Review and Tips

You might also be interested in this OMSCS FAQ I wrote after graduation. Or view all OMSCS related writing here: omscs. Recently, I completed the Georgia Tech OMSCS Software Development Process (CS6300) course over the summer. It was very enriching—I learnt about proper software engineering practices and created apps in Java and Android. Here’s an overview of my experience, for those who are considering taking it. Why did I take this course? Since entering the data and technology industry a couple of years ago, I’ve always felt the need to improve my skills in software engineering. This is compounded by