Articles for category: AI Tools

[Tutorial] Chapter 7: Workflow – DEV Community

Originally published at https://www.nocobase.com/en/tutorials/task-tutorial-workflow. Congratulations on reaching the final chapter! Here, we’ll introduce and briefly explore the powerful workflow features in NocoBase. This feature lets you automate tasks within the system, saving time and enhancing efficiency. Challenge Solution from the Previous Chapter Before diving in, let’s quickly recap the solution to the last challenge. We successfully set up comment permissions for the “Partner” role as follows: Add Permission: Allows users to post comments. View Permission: Allows users to view all comments. Edit Permission: Users can edit only their own comments. Delete Permission: Users can delete only

Setting your ML project up for success

What can you do to maximize the probability of success for your Machine Learning solution? Throughout my 15 years as a data scientist in academia, big pharma, and consulting, one common theme has emerged: the most reliable predictor of success for any NLP or ML-based solution is whether you involve the data science team early on. By introducing your data scientists to the domain experts right from the start of the project, you can iteratively refine and improve both your data and your ML models. I’ve worked on a few projects throughout the years where data was pretty much

Fully Transparent and Permissive Self-Alignment for Code Generation

Instruction tuning is a fine-tuning approach that gives large language models (LLMs) the capability to follow natural, human-written instructions. However, for programming tasks, most models are tuned either on human-written instructions (which are very expensive) or on instructions generated by huge, proprietary LLMs (which may not be permitted). We introduce StarCoder2-15B-Instruct-v0.1, the very first entirely self-aligned code LLM trained with a fully permissive and transparent pipeline. Our open-source pipeline uses StarCoder2-15B to generate thousands of instruction-response pairs, which are then used to fine-tune StarCoder2-15B itself, without any human annotations or distilled data from huge, proprietary LLMs. StarCoder2-15B-Instruct
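The self-alignment loop described above can be sketched in a few lines. This is a minimal illustration only, not the project's actual pipeline: `self_align_pairs` and the `llm` callable are hypothetical stand-ins for the base model's completion API, and the real pipeline additionally filters seed snippets and validates responses by executing generated tests.

```python
def self_align_pairs(llm, seed_snippets):
    """Sketch of a self-alignment data pipeline: the base model writes an
    instruction for each seed code snippet, then answers its own instruction.
    The (instruction, response) pairs become fine-tuning data, with no human
    annotation and no external teacher model involved."""
    pairs = []
    for snippet in seed_snippets:
        # Step 1: the model invents a task that the seed code would solve.
        instruction = llm(f"Write a programming task that this code solves:\n{snippet}")
        # Step 2: the same model answers the task it just wrote.
        response = llm(f"Solve this task:\n{instruction}")
        pairs.append({"instruction": instruction, "response": response})
    return pairs

# Demo with a dummy "model" that returns a fixed string.
demo = self_align_pairs(lambda prompt: "placeholder", ["def add(a, b): return a + b"])
```

Because the same checkpoint plays both roles, every pair inherits the base model's permissive provenance, which is the point of calling the pipeline fully transparent.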

OMSCS CS6476 (Computer Vision) Review and Tips

You might also be interested in this OMSCS FAQ I wrote after graduation. Or view all OMSCS related writing here: omscs. I recently completed my first course for the Georgia Tech OMSCS (CS6476 Computer Vision) and wanted to share some thoughts on it. Why choose this course? I recently built APIs for image classification and reverse image search using deep learning libraries. Through that process, I gained an understanding of how images work as a data structure, and how to apply machine learning to them to build useful data products. Nonetheless, there was a yearning to get a more in-depth

Idea Code App: My Build Story

Alright, let’s hear it. We’ve all been there: you have an idea for an app that could help you out, and you realize it could help others as well. But you don’t know how to design a good application, or, for that matter, how to build one at all. Now you go on YouTube and start searching, and the number of videos is overwhelming. You watch one, then two, but when it comes down to actually starting from nothing, things get a little rough. You don’t know the app flow, how users will interact with it, or even what

Multi hash embeddings in spaCy

In this technical report, we first lay out a bit of history and introduce the embedding methods in spaCy in detail. Second, we critically evaluate the hash embedding architecture with multi-embeddings on Named Entity Recognition datasets from a variety of domains and languages. The experiments validate most key design choices behind spaCy’s embedders, but we also uncover a few surprising results.
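The core hashing trick behind these embedders can be shown in a toy NumPy sketch. This is an illustration of the general multi-hash idea only, not spaCy's actual `MultiHashEmbed` layer (which hashes several token attributes such as the norm, prefix, suffix, and shape); the function name, table size, and seed scheme here are all made up for the example.

```python
import numpy as np

def multi_hash_embed(token, table, num_hashes=4):
    """Hash the token several times into a small table and sum the rows.
    Individual collisions differ across hash seeds, so the summed vector
    stays near-unique even though the table is far smaller than the vocab."""
    rows, dim = table.shape
    vec = np.zeros(dim)
    for seed in range(num_hashes):
        # Each seed gives an independent bucket for the same token.
        vec += table[hash((seed, token)) % rows]
    return vec

rng = np.random.default_rng(0)
table = rng.normal(size=(1000, 64))  # 1,000 rows instead of a full-vocab table
v_apple = multi_hash_embed("apple", table)
v_apple_again = multi_hash_embed("apple", table)  # deterministic per token
```

The appeal is memory: the table size is fixed up front regardless of vocabulary size, and unseen tokens still get a vector instead of an out-of-vocabulary fallback.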

Improving Prompt Consistency with Structured Generations

Recently, the Leaderboards and Evals research team at Hugging Face ran small experiments that highlighted how fickle evaluation can be. For a given task, results are extremely sensitive to minuscule changes in prompt format! However, this is not what we want: a model prompted with the same amount of information as input should output similar results. We discussed this with our friends at Dottxt, who had an idea: what if there was a way to increase consistency across prompt formats? So, let’s dig in! Context: Evaluation Sensitivity to Format Changes It has become increasingly clear that LLM benchmark performance
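The structured-generation idea can be sketched with a crude rejection-sampling stand-in. To be clear about assumptions: `generate` below is a hypothetical placeholder for any LLM sampling call, and real libraries in this space (such as Dottxt's Outlines) constrain the output at the token level during decoding rather than resampling whole completions as this toy does.

```python
import re

def constrained_choice(generate, prompt, choices, max_tries=10):
    """Brute-force stand-in for constrained decoding: reject any sample
    whose full text is not exactly one of the allowed answers, so
    formatting noise cannot leak into what gets scored."""
    pattern = re.compile("|".join(map(re.escape, choices)))
    for _ in range(max_tries):
        out = generate(prompt).strip()
        if pattern.fullmatch(out):
            return out
    return None

# Fake model: the first sample carries formatting noise, the second is clean.
samples = iter(["  Paris is the capital.  ", "Paris"])
answer = constrained_choice(lambda p: next(samples), "Capital of France?", ["Paris", "London"])
```

Because every accepted output has an identical shape, downstream scoring no longer depends on which prompt template happened to be used, which is the consistency property the post is after.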

How to get started in Data Science

More than a handful of times I’ve been asked how to get into the field of data science. This includes SMU’s Master of IT in Business classes, regular meet-ups (e.g., DataScience SG), and requests via email/LinkedIn. Though the conversations that follow differ depending on the person’s background, a significant portion is applicable to most people. I’m no data science rockstar. Neither am I an instructor who teaches how to get into data science. Nonetheless, here’s some previously shared advice on “How to get started in Data Science”, documented here so it can be shared in a more scalable

Integrate AI in E-commerce: Have You Tried This Success Shortcut?!

Everyone is racing to develop the next giant success story with GenAI or with niche AI integrations, and we wouldn’t want you to be left behind! High Reward AI & E-commerce Integrations Although many solutions are proving transformative after they integrate AI, here’s something you should try first, so that you can then innovate on it further! Hyper-Personalized Experiences Ever gotten a recommendation for a product within seconds of describing the product you want? Yes? Scary, isn’t it? That creepy feeling is exactly what hyper-personalization should avoid. Instead, AI-powered recommendation engines will analyze customer data and preferences