Articles for category: AI Tools

Accelerate StarCoder with 🤗 Optimum Intel on Xeon: Q8/Q4 and Speculative Decoding

Recently, code generation models have become very popular, especially with the release of state-of-the-art open-source models such as BigCode’s StarCoder and Meta AI’s Code Llama. A growing number of works focus on making Large Language Models (LLMs) more optimized and accessible. In this blog, we are happy to share the latest results of LLM optimization on Intel Xeon, focusing on the popular code generation LLM, StarCoder. The StarCoder model is a cutting-edge LLM designed to assist users with various coding tasks such as code completion, bug fixing, code summarization, and even generating code snippets from natural language descriptions.
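
As a rough illustration of the idea, the snippet below loads StarCoder through the OpenVINO backend of 🤗 Optimum Intel with 8-bit weight quantization. This is only one possible route, not necessarily the blog's exact pipeline (which also covers INT4 and speculative decoding):

```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer, pipeline

model_id = "bigcode/starcoder"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# export=True converts the checkpoint to OpenVINO format; load_in_8bit
# quantizes the weights to INT8 for faster CPU inference on Xeon.
model = OVModelForCausalLM.from_pretrained(model_id, export=True, load_in_8bit=True)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(generator("def fibonacci(n):", max_new_tokens=64)[0]["generated_text"])
```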

Git tricks 2/3

Hi! I’m back with the second part of this series. Sorry for the delay. I don’t have a lot of experience writing posts, and it takes me a moment. 😆 In this post, I want to share more cool Git commands and use cases inspired by my experience. I hope you enjoy it! By the way, I generated an image for the post with Mistral to have a cool banner picture. 😅 Let me know what you think! Git stash: The git stash command is incredibly useful when you need to switch contexts but don’t want to commit your current

A Leaderboard for Real World Use Cases

Today, the Patronus team is excited to announce the new Enterprise Scenarios Leaderboard, built using the Hugging Face Leaderboard Template in collaboration with their teams. The leaderboard aims to evaluate the performance of language models on real-world enterprise use cases. We currently support 6 diverse tasks – FinanceBench, Legal Confidentiality, Creative Writing, Customer Support Dialogue, Toxicity, and Enterprise PII. We measure the performance of models on metrics such as accuracy, engagingness, toxicity, relevance, and Enterprise PII. Why do we need a leaderboard for real-world use cases? We felt there was a need for an LLM leaderboard focused on real-world,

Introducing the MCP Community Portal: Bridging AI Assistants and External Tools

Hello Community! I’m excited to share a project I’ve been working on: the MCP Community Portal (https://github.com/ajeetraina/mcp-portal). This is a modern, community-driven collection of Docker Model Context Protocol (MCP) servers, tools, and resources that aims to revolutionize how AI assistants like Claude interact with external systems. What is the MCP Portal all about? The Model Context Protocol (MCP) is an emerging standard that enables AI assistants to interact directly with external tools, databases, and services without requiring complex API integrations or extensive coding knowledge. The MCP Portal serves as: A central hub for MCP servers – Pre-built Docker containers that

Release v1.1.0 · explosion/spacy-transformers · GitHub

✨ New features and improvements
- Refactor and improve transformer serialization for better support of inline transformer components and replacing listeners.
- Provide the transformer model output as ModelOutput instead of tuples in TransformerData.model_output and FullTransformerBatch.model_output. For backwards compatibility, the tuple format remains available under TransformerData.tensors and FullTransformerBatch.tensors. See more details in the transformer API docs.
- Add support for transformer_config settings such as output_attentions. Additional output is stored under TransformerData.model_output. More details in the TransformerModel docs.
- Add support for mixed-precision training.
- Improve training speed by streamlining allocations for tokenizer output.
- Extend support for transformers up to v4.11.x.
🔴 Bug fixes
- Fix support
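
The transformer_config setting is passed through to the underlying Hugging Face model, so extra outputs such as attentions end up in TransformerData.model_output. A minimal sketch of how this might be wired up (assuming a recent spacy-transformers; the model name and config overrides here are illustrative):

```python
import spacy

nlp = spacy.blank("en")
# Override only the pieces we care about; the rest of the component's
# default config (architecture, span getter, tokenizer settings) is merged in.
nlp.add_pipe(
    "transformer",
    config={
        "model": {
            "name": "roberta-base",
            "transformer_config": {"output_attentions": True},
        }
    },
)
nlp.initialize()

doc = nlp("spaCy now exposes the full ModelOutput.")
# model_output is a Hugging Face ModelOutput rather than a plain tuple;
# the attentions are present because we requested them above.
print(doc._.trf_data.model_output.keys())
```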

Patch Time Series Transformer in Hugging Face

In this blog, we provide examples of how to get started with PatchTST. We first demonstrate the forecasting capability of PatchTST on the Electricity dataset. We will then demonstrate the transfer learning capability of PatchTST by using the previously trained model to do zero-shot forecasting on the electrical transformer (ETTh1) dataset. The zero-shot forecasting performance will denote the test performance of the model in the target domain, without any training on the target domain. Subsequently, we will do linear probing and then fine-tuning of the pretrained model on the train part of the target data, and will validate the forecasting
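
PatchTST ships with the 🤗 Transformers library, so the forecasting model can be instantiated directly from a config. A minimal sketch (the channel count matches ETTh1’s 7 series, but the other hyperparameters are illustrative, not the blog's exact settings):

```python
from transformers import PatchTSTConfig, PatchTSTForPrediction

# Illustrative configuration: 7 input channels as in ETTh1, a 512-step
# context window split into patches, and a 96-step forecast horizon.
config = PatchTSTConfig(
    num_input_channels=7,
    context_length=512,
    patch_length=16,
    prediction_length=96,
)
model = PatchTSTForPrediction(config)

# For zero-shot forecasting you would instead load the checkpoint pretrained
# on Electricity with PatchTSTForPrediction.from_pretrained(...) and evaluate
# it on ETTh1 without any further training.
```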

Introducing Cypress.stop()

Have you ever wanted to stop execution of a spec file? Cypress has been rolling out some nifty features lately, and one that caught my eye is the new Cypress.stop() command. This little gem allows you to stop test execution programmatically, bringing more flexibility to your Cypress test suite. Before this came to light, I had been using Cypress.runner.stop() to achieve the same thing in a slightly different way. In this article, I’ll break down what Cypress.stop() is, why it’s useful, and how you can integrate it into your tests. What is Cypress.stop()? Cypress.stop() is a newly introduced command that immediately halts a

explosion/floret: 🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy

floret is an extended version of fastText that can produce word representations for any word from a compact vector table. It combines:
- fastText’s subwords to provide embeddings for any word
- Bloom embeddings (“hashing trick”) for a compact vector table
To learn more about floret, check out our blog post on floret vectors. For a hands-on introduction, experiment with English vectors in this example notebook: intro_to_floret. To build from source:
git clone https://github.com/explosion/floret
cd floret
make
This produces the main binary floret. Install the Python wrapper with pip, or install from source in developer mode:
git clone https://github.com/explosion/floret
cd floret
pip install -r requirements.txt
pip
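
The Python wrapper follows fastText's training API, with extra parameters for the Bloom-embedding ("hashing trick") vector table. A rough sketch of training compact vectors follows; the parameter names and the save_vectors helper are assumptions based on my reading of the floret docs, so check them against the repository:

```python
import floret

# Assumed API, mirroring fastText's train_unsupervised plus floret options:
# mode="floret" enables the Bloom-embedding table, hash_count/bucket control
# how compact it is, and minn/maxn set the subword n-gram lengths.
model = floret.train_unsupervised(
    "corpus.txt",          # hypothetical plain-text training corpus
    mode="floret",
    hash_count=2,
    bucket=50000,
    minn=4,
    maxn=5,
)
model.save_vectors("vectors.floret")  # assumed export for use with spaCy
```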

Hugging Face Text Generation Inference available for AWS Inferentia2

We are excited to announce the general availability of Hugging Face Text Generation Inference (TGI) on AWS Inferentia2 and Amazon SageMaker. Text Generation Inference (TGI) is a purpose-built solution for deploying and serving Large Language Models (LLMs) for production workloads at scale. TGI enables high-performance text generation using Tensor Parallelism and continuous batching for the most popular open LLMs, including Llama, Mistral, and more. Text Generation Inference is used in production by companies such as Grammarly, Uber, Deutsche Telekom, and many more. The integration of TGI into Amazon SageMaker, in combination with AWS Inferentia2, presents a powerful solution and viable
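
For illustration, a deployment with the SageMaker Python SDK might look roughly like the sketch below. This is not the announcement's exact recipe: the model id, Neuron core count, and token limits are placeholders, and the "huggingface-neuronx" backend string is an assumption about the SDK's LLM image helper.

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()
# TGI container built for AWS Inferentia2 (Neuron); backend name assumed.
image_uri = get_huggingface_llm_image_uri("huggingface-neuronx")

model = HuggingFaceModel(
    image_uri=image_uri,
    role=role,
    env={
        "HF_MODEL_ID": "mistralai/Mistral-7B-Instruct-v0.1",  # placeholder model
        "HF_NUM_CORES": "2",            # Neuron cores per replica (assumption)
        "MAX_INPUT_LENGTH": "2048",
        "MAX_TOTAL_TOKENS": "4096",
    },
)

# Deploy on an Inferentia2 instance and send a simple generation request.
predictor = model.deploy(initial_instance_count=1, instance_type="ml.inf2.xlarge")
print(predictor.predict({"inputs": "def quicksort(arr):"}))
```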