March 29, 2025

ikayaniaamirshahzad@gmail.com

Our Year in Review · Explosion


The year 2021 is coming to an end, and like the previous year, it was shaped by
unique challenges that impacted our work together. For Explosion, it was a very
productive year. We found an investor that fits our strategy, we released spaCy
v3, the work on Prodigy Teams is in full swing, and the team has grown
a lot. So here’s our look back at our highlights of the year 2021.

  • 💫 Feb 1: We kicked off February with the big release of
    spaCy v3.0, which features new
    transformer-based pipelines that get spaCy’s accuracy right up to the current
    state-of-the-art, and a new workflow system to help you take projects from
    prototype to production. If you’re interested in what spaCy v3 is all about,
    check out our video, where Ines
    and Matt guide you through some of the most exciting new features!
  • 🪐 Feb 1: As an add-on to our spaCy v3 release, we published
    spaCy projects, which allow managing
    end-to-end spaCy workflows for different use cases and domains.
  • 📺 Feb 1: Along with spaCy v3, Ines published the
    behind the scene spaCy v3 design concepts
    video.
  • 📺 Feb 1: Sofie celebrated the release of spaCy v3 with her tutorial on
    implementing a
    trainable entity relation extraction component in spaCy v3.
  • 📺 Feb 3: At the
    Contributing.Today meetup
    with Guido van Rossum, Sofie presented the new
    spaCy v3 features.

  • 💫 Mar 4: We released 1.0 of our
    spaCy and Stanza package, which
    allows you to use the latest Stanza (StanfordNLP) research models directly in
    spaCy.
  • 📺 Mar 17: March saw
    a new episode of Vincent
    Warmerdam’s “Intro to NLP with spaCy” series. In this episode, Vincent
    showcased the project system of spaCy v3.
  • 📺 Mar 29: Ines joined the at the German Python Podcast to talk about
    Natural Language Processing with spaCy.
  • 🥳 Mar 30: At the end of March, we celebrated that spaCy reached
    20k+ stars on GitHub.

  • 📺 Apr 22: Ines was invited as a guest on Microsoft’s
    A bit of AI show to talk about
    her journey into AI.
  • 📺 Apr 30: Later that month, Ines joined the
    Snorkel Science Talks.
    She discussed her path into machine learning, fundamental design decisions
    behind spaCy, and the importance of bringing together different stakeholders
    in the machine learning development process.

  • 💫 Jul 7: We released spaCy v3.1,
    which allows using predicted annotations during training. In addition, the
    release includes a SpanCategorizer component for predicting arbitrary and
    overlapping spans. You can create training data for it using Prodigy’s new
    annotation UI for overlapping spans.
  • 🤗 Jul 13:
    Hugging Face welcomed spaCy to their Hub.
    You can now upload any spaCy pipeline using the
    spacy-huggingface-hub
    CLI, with auto-generated pretty READMEs and a interactive visualizers to try
    your pipeline in the browser.
  • 🥳 Jul 14: Sofie became
    team lead for
    spaCy.
  • Aug 12: We’ve partnered with Weights & Biases and the tracking of
    reproducible spaCy NLP pipelines
    became even easier.
  • Aug 12: We released
    Prodigy v1.11, which includes a
    bunch of new features, including a new installation process via pip and new
    wheels for Python 3.9 and ARM architectures, a new recipe and UI for
    annotating overlapping and nested spans,
    new recipes for improving a sentence recognizer model, further training and
    data export recipes that seamlessly integrate with spaCy’s config system.
  • 📺 Aug 17: Ines was live on the radio and joined the
    Byte Into IT show on Melbourne’s Triple R radio
    station.

  • 💥 Sep 2: A big moment for us – we sold
    5% of Explosion.
    Since founding Explosion in 2016, we’ve run the company as a profitable
    business. Our next step is Prodigy Teams, and doing this project well is much
    more important to us than doing it cheaply, so we decided to consider an
    external investment. With SignalFire, we found an
    investor that fits our strategy.

  • 💫 Nov 5: We released spaCy v3.2,
    which improved performance for spaCy on Apple M1 and Nvidia GPU, added Doc
    input for pipelines, and provided registered scoring functions.
  • 🍏 Nov 5: Along with the new spaCy 3.2, we published our
    thinc-apple-ops package to
    accelerate spaCy on macOS by calling into Apple’s native “Accelerate” library.
  • 🌸 Nov 5: We also presented Adriane’s recent work on our new
    floret library, which uses fastText
    and Bloom embeddings for compact, full-coverage vectors with spaCy.
  • 🌳 Nov 17: In mid-November, Daniël presented our new experimental machine
    learning-based lemmatizer
    that posts accuracies above 95% for many languages.
  • ✍️ Nov 17: Our machine learning engineer Lj Miranda published a
    detailed technical overview
    on using spaCy’s project config system, traversing our stack in increasing
    levels of abstraction.
  • 🛡️ Nov 17: The Guardian
    wrote about
    how their data science team used spaCy and Prodigy to
    train a machine learning model that helps extract quotes from news articles
    and match them to the correct source.
  • 📺 Nov 30 Weights & Bias hosted an
    AMA with Ines, where she talked
    about software development, Python, startups and product building.

  • ✍️ Dec 8: For
    KDNuggets,
    Ines shared her perspective on AI and Machine Learning developments in 2021
    and key trends for 2022.
  • 🏫 Dec 9: We’ve updated our interactive
    NLP course for spaCy v3! The updated course is
    available in English, Spanish, German, and Japanese. More languages will
    follow.
  • ✍️ Dec 14: To demonstrate the performance of spaCy v3.2, Adriane compiled
    a series of UD benchmarks
    comparable to the Stanza and Trankit evaluations on Universal Dependencies
    v2.5.
  • 🦑 Dec 15: Our machine learning engineer
    Edward published his
    blog post on Healthsea, an end-to-end
    spaCy pipeline to
    analyze user reviews to supplementary products and
    extract their potential effects on health.
  • 📺 Dec 17: Ines was invited as a guest on the TalkPython podcast to
    discuss
    machine learning ethics and EU laws.

With the community and the team continuing to grow, we look forward to making 2022 even better. Thanks for all your support!





Source link

Leave a Comment