March 28, 2025


Introducing spaCy v3.5 · Explosion


We’re excited to release v3.5 of the spaCy Natural Language
Processing library. spaCy v3.5 introduces three new CLI commands, adds fuzzy
matching, provides improvements to our entity linking functionality, and
includes a range of language updates and bug fixes.

New CLI commands

  • apply applies a pipeline to one or more
    .txt, .jsonl or .spacy files
  • benchmark speed profiles a pipeline’s
    speed with a warmup and a confidence interval
  • find-threshold tests a range of
    threshold values for spancat, textcat_multilabel, etc., to identify the
    optimal one.

Examples of how to run these commands can be found in our
CLI documentation as well as in our
v3.5 usage notes.
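To make the idea behind find-threshold more concrete, here is a minimal plain-Python sketch of a threshold sweep. It is not spaCy's implementation: the helper name f1_at_threshold, the toy scores and gold labels, and the choice of F1 as the metric are all assumptions made for illustration.

```python
def f1_at_threshold(scores, gold, threshold):
    """F1 score when every score >= threshold counts as a positive prediction."""
    predicted = [score >= threshold for score in scores]
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical model scores and gold labels for one category (toy data).
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
gold = [True, True, False, True, False, False]

# Candidate thresholds to try; the range and step are arbitrary choices.
thresholds = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

# Keep the threshold with the highest F1 on this data.
best_threshold = max(thresholds, key=lambda t: f1_at_threshold(scores, gold, t))
best_f1 = f1_at_threshold(scores, gold, best_threshold)
print(best_threshold, round(best_f1, 3))
```

In practice the scores would come from running a trained pipeline over held-out data, which is the part the CLI command takes care of.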

Fuzzy matching

The new FUZZY operator allows
fuzzy matches based on
Levenshtein edit distance:

pattern = [{"LOWER": {"FUZZY": "definitely"}}]
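
For readers unfamiliar with Levenshtein edit distance, the following is a small plain-Python sketch of the metric itself. It is a textbook dynamic-programming version written for illustration, not the code spaCy uses internally, and the function name levenshtein and the sample misspellings are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # previous[j] is the distance between the prefix of a seen so far and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


target = "definitely"
for word in ["definitely", "definately", "definitly", "defiantly"]:
    print(word, levenshtein(word, target))
```

A fuzzy pattern accepts a token when this distance to the pattern string stays within an allowed number of edits, so common misspellings such as "definately" can still match.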

The FUZZY and REGEX operators are now also supported for lists with IN and
NOT_IN:

pattern = [{"TEXT": {"REGEX": {"NOT_IN": ["^awe(some)?$", "^wonder(ful)?"]}}}]
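
The effect of combining REGEX with NOT_IN can be sketched in plain Python with the standard re module. This sketch assumes each expression is applied with search semantics (as re.search does) and that a token passes only if none of the expressions match; the helper name passes_not_in and the word list are hypothetical.

```python
import re

# The two expressions from the pattern above.
excluded = ["^awe(some)?$", "^wonder(ful)?"]


def passes_not_in(text, patterns):
    """True if text matches none of the regular expressions."""
    return not any(re.search(pattern, text) for pattern in patterns)


words = ["awe", "awesome", "awesomeness", "wonder", "wonderfully", "great"]
kept = [word for word in words if passes_not_in(word, excluded)]
print(kept)
```

Note that the second expression has no trailing $ anchor, so under these assumptions it also excludes longer words that merely start with "wonder", whereas the first expression only excludes "awe" and "awesome" exactly.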

Entity linking

The entity linker’s knowledge base has been refactored for easier customization.
KnowledgeBase is now an abstract class and the
default implementation is the new class
InMemoryLookupKB.
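
The design described here, an abstract base class plus a default in-memory implementation, can be illustrated with a toy sketch. The classes below are hypothetical stand-ins and do not reproduce spaCy's KnowledgeBase or InMemoryLookupKB APIs; the method names and entity IDs are made up for the example.

```python
from abc import ABC, abstractmethod


class ToyKnowledgeBase(ABC):
    """Hypothetical abstract interface: subclasses decide how candidates are stored."""

    @abstractmethod
    def get_candidates(self, mention):
        """Return a list of candidate entity IDs for a mention string."""


class ToyInMemoryKB(ToyKnowledgeBase):
    """Hypothetical default implementation backed by a plain dict."""

    def __init__(self):
        self._aliases = {}

    def add_alias(self, alias, entity_ids):
        # Store a copy so later changes to the caller's list have no effect.
        self._aliases[alias] = list(entity_ids)

    def get_candidates(self, mention):
        return self._aliases.get(mention, [])


kb = ToyInMemoryKB()
kb.add_alias("Douglas", ["ENTITY_1", "ENTITY_2"])
print(kb.get_candidates("Douglas"))
print(kb.get_candidates("Unknown"))
```

With this split, a custom knowledge base (for example one backed by a database or a remote service) only needs to subclass the abstract interface and implement the lookup itself.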

Read more about all the improvements, updates and bug fixes:

Many cool new plugins, extensions, pipelines and tutorials have been added to
the spaCy universe and
spaCy projects since v3.4:


Additionally, the spaCy team has added demo projects for two newer components:

