Articles for category: AI Tools

How to Interview and Hire ML/AI Engineers

Hiring well is the highest-leverage activity we can do for the mission and organization. And running effective interviews is key to hiring well. We can think of the interview process as a system: given a candidate, it assesses whether they are a good fit for the role and team. Thus, to hire well, the interview system should be reliable and valid, with minimal noise. In this write-up, Jason and I will share a few things we’ve learned about interviewing candidates for machine learning (ML) and AI roles. First, we’ll discuss what technical and non-technical qualities to assess. Then, we’ll share

Unlocking the Secrets of Remote Development: Insights from Craig Cannon

In a recent interview, Craig Cannon, Director of Marketing at Y Combinator, shared valuable insights on effectively working with remote developers. With a fully distributed engineering team for his personal project, Cannon’s experience sheds light on the growing trend of remote work in the tech industry. Key Takeaways Remote teams can be more cost-effective and efficient. Clear communication and documentation are essential for success. Hiring strong communicators is crucial to overcoming remote work challenges. The Appeal of Remote Teams Cannon explained that many founders opt for remote teams for two primary reasons: immediate onboarding and cost-effectiveness. For contractors, the need

Transformers Key-Value Caching Explained

As the complexity and size of transformer-based models grow, so does the need to optimize their inference speed, especially in chat applications where users expect immediate replies. Key-value (KV) caching is a clever trick to do that: at inference time, key and value matrices are calculated for each generated token. KV caching stores these matrices in memory so that when subsequent tokens are generated, we only compute the keys and values for the new tokens instead of recomputing everything. The inference speedup from KV caching comes at the cost of increased memory consumption. When memory is a
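To make the mechanism concrete, here is a minimal NumPy sketch of single-head attention with a KV cache. The weight matrices (`Wq`, `Wk`, `Wv`) and token embeddings are random placeholders, not from any real model; the point is that appending one key/value row per step gives the same result as recomputing keys and values for the whole sequence.

```python
import numpy as np

def attention(q, K, V):
    # Scaled dot-product attention for a single query vector.
    scores = q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

rng = np.random.default_rng(0)
d = 8
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))  # placeholder projections
tokens = rng.normal(size=(5, d))                          # embeddings for 5 generated tokens

# With caching: append only the new token's key and value row each step,
# instead of recomputing K and V for the entire sequence.
K_cache = np.empty((0, d))
V_cache = np.empty((0, d))
for x in tokens:
    K_cache = np.vstack([K_cache, x @ Wk])
    V_cache = np.vstack([V_cache, x @ Wv])
    out_cached = attention(x @ Wq, K_cache, V_cache)

# Full recompute for the final step matches the cached result.
K_full, V_full = tokens @ Wk, tokens @ Wv
out_full = attention(tokens[-1] @ Wq, K_full, V_full)
assert np.allclose(out_cached, out_full)
```

The memory trade-off is also visible here: the cache grows by one row of keys and one of values per generated token, per layer and per head in a real model.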

Databricks Data Intelligence Day 2025: In-Person Seminar in Seoul

As companies increasingly rely on data-driven insights for strategic decision-making, the latest trends in data intelligence platforms point toward more sophisticated, scalable, and secure solutions. Data intelligence, the collective term for the tools and methods that help organizations better understand and use the information they collect, store, and leverage, is growing ever more important because it makes AI application development and organization-wide data access easier. In an era being transformed by AI, leaders in every industry will take on the central role of running and supporting their organizations with data and AI. Attend the Data Intelligence Day conference at the Grand InterContinental Seoul Parnas hotel on April 29, 2025 to maximize the AI and data capabilities your organization needs. Reliable

FLUX is fast and it’s open source

FLUX is now much faster on Replicate, and we’ve made our optimizations open-source so you can see exactly how they work and build upon them. Here are the end-to-end speeds: FLUX.1 [schnell] at 512×512 and 4 steps: 0.29 seconds (P90: 0.49 seconds) FLUX.1 [schnell] at 1024×1024 and 4 steps: 0.72 seconds (P90: 0.95 seconds) FLUX.1 [dev] at 1024×1024 and 28 steps: 3.03 seconds (P90: 3.90 seconds) This is from the west coast of the US using the Python client. Here’s a demo of FLUX.1 [schnell]. (It’s live, just start typing!) Here’s the full app, and source code, if you’d like

The Hidden Bottleneck: How GPU Memory Hierarchy Affects Your Computing Experience

The GPU memory hierarchy is increasingly becoming an area of interest for deep learning researchers and practitioners alike. By building an intuition around the memory hierarchy, developers can minimize memory access latency, maximize memory bandwidth, and reduce power consumption, leading to shorter processing times, accelerated data transfer, and cost-effective compute usage. A thorough understanding of memory architecture will enable developers to achieve peak GPU capabilities at scale. CUDA (Compute Unified Device Architecture) is a parallel computing platform developed by NVIDIA for programming GPUs. The execution of a CUDA program begins when the host code (CPU serial code) calls a kernel function.

How Infor is Transforming Enterprise AI using LangGraph and LangSmith

Infor is a leading enterprise software company that provides cloud-based multi-tenant solutions tailored to specific industries like Aerospace & Defense, Automotive, Distribution, Fashion, Food & Beverage, Healthcare, and Industrial Manufacturing. Their solutions are offered to customers as cloud suites, a comprehensive set of integrated software applications delivered as Software-as-a-Service (SaaS) across multiple AWS regions. These suites help organizations streamline operations, boost productivity, and reduce IT costs by leveraging cloud infrastructure. Infor OS (Operating Service) is the cloud-based platform that powers all Infor cloud suite applications and services, providing a unified cloud experience that enhances functionality, security, and system interoperability for