Articles for category: AI Tools

Forward vs. Reverse Proxy: A Developer Friendly Guide

When dealing with web infrastructure, proxies play a crucial role in managing traffic, securing connections, and optimizing performance. However, not all proxies serve the same purpose. Forward proxies and reverse proxies may seem similar at first glance, but they serve different roles in a network. Understanding their distinctions is essential for developers working on network security, scalability, and access control. What Is a Proxy Server? A proxy server is an intermediary between clients and servers, handling requests and responses. Instead of direct communication between a client and a destination server, the proxy intercepts the request, processes it, and forwards it

Victoria’s Blog

TL;DR: The SpanRuler component of spaCy allows you to create rules to recognize spans or entities within your data. Lj and I created a spaCy project to showcase the functionality of the SpanRuler within a NER pipeline, but when we didn’t see the improvement we were looking for in the initial pipeline evaluation, I looked into the data and found some inconsistencies in the annotations. This led me to go back and create a Prodigy workflow to relabel data to get more consistent annotations. Machine learning is rarely a linear process that magically produces results, and iterating between your models

Training and Finetuning Embedding Models with Sentence Transformers v3

Sentence Transformers is a Python library for using and training embedding models for a wide range of applications, such as retrieval augmented generation, semantic search, semantic textual similarity, paraphrase mining, and more. Its v3.0 update is the largest since the project’s inception, introducing a new training approach. In this blogpost, I’ll show you how to use it to finetune Sentence Transformer models to improve their performance on specific tasks. You can also use this method to train new Sentence Transformer models from scratch. Finetuning Sentence Transformers now involves several components, including datasets, loss functions, training arguments, evaluators, and the new

Become a Strava PowerUser with Deephaven

It’s the time of year for comfort food and Turkey Trots! Maybe you’re training for a race or simply maintaining an exercise routine. Personally, I consistently use my fitness watch to ensure I’m meeting my step goals and to track my progress as I try to shave a few seconds off my mile time. The free Strava app beloved by runners and cyclists is another great resource to motivate you in your fitness journey and connect with a community. You can track a variety of exercises and store your results in the app. Did you know you can also download

OMSCS CS7646 (Machine Learning for Trading) Review and Tips

You might also be interested in this OMSCS FAQ I wrote after graduation. Or view all OMSCS related writing here: omscs. The 2019 spring term ended a week ago and I’ve been procrastinating on how ML4T (and IHI) went. I’ve known all along that writing is DIFFICULT, but recently it seems significantly more so. Perhaps it’s because I’ve noticed this site has been getting a lot more traffic recently. This includes having Prof Thad Starner commenting on my post for his course on Artificial Intelligence. This has increased my own expectations of my writing, making it harder for me to

Business Intelligence: Turning Data into Actionable Insights

In today’s data-driven world, Business Intelligence (BI) is a game-changer. Companies leveraging BI tools gain real-time analytics, predictive insights, and automated reporting to make smarter decisions and improve operational efficiency. Key Benefits of Business Intelligence Solutions: 📊 Data Visualization – Transform raw data into meaningful dashboards. 🔍 Predictive Analytics – Forecast trends and make proactive business decisions. 📈 Optimized Performance – Identify inefficiencies and boost productivity. ⚡ Real-Time Decision Making – Access critical insights on demand.

NLP, before and after spaCy — textacy 0.13.0 documentation

textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals (tokenization, part-of-speech tagging, dependency parsing, etc.) delegated to another library, textacy focuses primarily on the tasks that come before and follow after. Features: Access and extend spaCy’s core functionality for working with one or many documents through convenient methods and custom extensions; load prepared datasets with both text content and metadata, from Congressional speeches to historical literature to Reddit comments; clean, normalize, and explore raw text before processing it with spaCy; extract structured information

Benchmarking Text Generation Inference

In this blog we will be exploring Text Generation Inference’s (TGI) little brother, the TGI Benchmarking tool. It will help us profile TGI beyond simple throughput and better understand the tradeoffs involved in tuning a deployment for your needs. If you have ever felt like LLM deployments cost too much, or if you want to tune your deployment to improve performance, this blog is for you! I’ll show you how to do this in a convenient Hugging Face Space. You can take the results and use them on an Inference Endpoint or other

Detect credit card fraud with Deephaven

Credit card fraud causes billions of dollars in damages each year. The most infamous cases have affected tens to hundreds of millions of consumers in single attacks through the unlawful exposure of personally identifiable information (PII) related to credit cards. Isolated cases are also common, and can be caused by a variety of methods including skimming, social engineering, and application fraud. In order to protect their customers, credit card companies rely on fraud detection and prevention software to analyze credit card purchases. These programs look for unusual or unexpected patterns to classify them as possibly fraudulent. In this blog, we

OMSCS CS6440 (Intro to Health Informatics) Review and Tips

You might also be interested in this OMSCS FAQ I wrote after graduation. Or view all OMSCS related writing here: omscs. Why take this class? In general, I’m keenly interested in HealthTech (and EdTech). One key reason I decided to enroll in OMSCS was due to the electives on health tech. The syllabus for IHI looks like a primer on the key technologies and standards in health technology, such as FHIR, EHRs, PHRs, etc. In addition, the weekly live lectures were a draw as they involved guest lecturers who were actively building health tech applications, both in startups and established