March 29, 2025

Our Year in Review · Explosion


While 2020 hasn’t been easy for anyone, at Explosion we’ve considered ourselves
relatively fortunate in this most interesting year. We’ve always worked remotely,
so we’ve been able to take both pride and comfort in continuing to ship good
software. Here’s a look back at what we’ve been up to.

  • 🔮 Jan 28: 2020 started with a big release: the alpha of
    Thinc v8.0, a lightweight deep learning library that
    offers an elegant, type-checked, functional-programming API for composing
    models, with support for layers defined in other frameworks such as PyTorch,
    TensorFlow or MXNet. Thinc was re-written from the ground up to support some
    of the new workflows coming to spaCy v3.0, including
    a flexible training configuration system and the ability to plug in model
    implementations written in any framework.

  • 🎤 Feb 8: In February, Matt and Ines were invited to PyCon Colombia in
    Medellín – thanks to the team for organizing such an awesome event! Ines
    presented a keynote titled
    “The Future of NLP in Python”
    about how new Python tooling and advancements in Natural Language Processing
    help with closing the gap between prototype and production, making it easier
    to ship powerful natural language understanding pipelines.
  • 📺 Feb 8: At PyCon Colombia, Ines was also
    interviewed by Karolina Ladino
    and they talked about the history of spaCy and how to get into programming,
    machine learning and NLP.

  • 📺 Mar 2: March started with a
    new episode of
    Vincent Warmerdam’s popular video series,
    “Intro to NLP with spaCy”. In this episode, he explored the processing
    pipeline and trained a simple NER model to detect programming languages.
  • 📺 Mar 16: Ines published an end-to-end
    video tutorial showing how to
    use our annotation tool Prodigy to train a named entity
    recognition model from scratch, by taking advantage of semi-automatic
    annotation and modern transfer learning techniques.
  • 💻 Mar 20: Sebastián released Typer,
    a library for building modern CLIs, powered by Python type hints. We’ve been
    using it extensively in our projects ever since!
  • 📺 Mar 24: In the next Prodigy
    tutorial video, Ines showed how
    to build fully custom annotation workflows and UIs for image captioning, and
    how to plug in a simple PyTorch image captioning model. Also: cats! 😺
  • 📻 Mar 30: Towards the end of the month, Matt joined the
    Podcast.__init__
    podcast again and discussed Explosion’s developer tools stack and what’s next
    for spaCy, Thinc and Prodigy.
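
As a rough illustration of what "powered by Python type hints" can mean for a CLI library like Typer, the sketch below uses only the standard library to derive a command-line parser from a function's signature. It is not Typer's API or implementation: the helpers `build_parser` and `run` and the `greet` command are hypothetical names made up for this example, and the sketch only handles simple types such as `str` and `int`.

```python
import argparse
import inspect


def build_parser(func):
    """Build an argparse parser from a function's signature and type hints."""
    parser = argparse.ArgumentParser(description=func.__doc__)
    for name, param in inspect.signature(func).parameters.items():
        # Fall back to str when a parameter has no annotation.
        if param.annotation is inspect.Parameter.empty:
            arg_type = str
        else:
            arg_type = param.annotation
        if param.default is inspect.Parameter.empty:
            # No default value: treat it as a required positional argument.
            parser.add_argument(name, type=arg_type)
        else:
            # Has a default value: expose it as an optional --flag.
            parser.add_argument(f"--{name}", type=arg_type, default=param.default)
    return parser


def run(func, argv=None):
    """Parse argv according to func's signature, then call func."""
    args = build_parser(func).parse_args(argv)
    return func(**vars(args))


# Hypothetical command: the type hints alone describe the CLI.
def greet(name: str, count: int = 1) -> str:
    """Greet someone one or more times."""
    return " ".join(f"Hello {name}!" for _ in range(count))


if __name__ == "__main__":
    print(run(greet))
```

Saved as a script, this could be called with a name and an optional `--count` value. The point of the sketch is that the function signature is the single source of truth for argument names, types and defaults, so there is no separate parser definition to keep in sync.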

  • 🏫 Apr 21: In April, we released the first translation of the free spaCy
    online course, Modernes NLP mit spaCy,
    featuring German instructions and text examples.
  • 📻 Apr 26: Ines was also invited as a guest on the
    Chai Time Data Science podcast
    and talked about her NLP journey, spaCy and Prodigy, open-source development,
    and tattoos.

  • 🏫 May 6: May started off with a Japanese translation of the free spaCy
    online course:
    spaCy を使った先進的な自然言語処理. Special
    thanks to Yohei Tamura!
  • 📺 May 7: A day later, Sofie released an end-to-end video tutorial showing
    how to train your own
    custom Entity Linking model
    with spaCy to disambiguate different mentions of a person name to unique
    identifiers in a knowledge base, and how to create your own training data from
    scratch.
  • 🏫 May 11: ¡Hola! The free spaCy online course was released in Spanish,
    complete with Spanish text examples:
    NLP avanzado con spaCy. Thanks to
    Camila Gutierrez!
  • 📺 May 14: May featured even more additions to the free spaCy course:
    Ines recorded video versions in
    English and
    German that you can view as
    standalone lessons on YouTube, or watch as part of the interactive online
    course.

  • 📺 Jun 13: June saw
    another new episode of Vincent
    Warmerdam’s “Intro to NLP with spaCy” series. In this episode, he digs deeper
    into the performance of the NER model he trained, using a rule-based
    classifier to probe for errors and improve the training data.
  • 💫 Jun 16: We also released spaCy v2.3, which added
    trained pipelines for Chinese, Japanese, Danish, Polish and Romanian, updated
    all 15 model families with word vectors and improved accuracy, while also
    decreasing model size and loading times for models with vectors.
  • Jun 16: Prodigy got a big upgrade in June with the
    release of v1.10.0. The version
    includes a bunch of new features, interfaces and recipes for dependency and
    relation annotation, audio and video annotation, as well as a new and improved
    manual image annotation interface with support for editing shapes and bounding
    boxes.
  • 📺 Jun 16: To show you the new Prodigy features in action, Ines recorded
    a video walkthrough that
    includes examples of dependency and relation annotation, coreference
    resolution, biomedical event extraction, audio and video annotation, NER
    annotation for fine-tuning transformers and more!
  • 🎤 Jun 18: At Rasa’s Level 3 AI Assistant conference, Ines talked about
    “Designing Practical NLP Solutions”,
    how to break down larger business problems into solvable machine learning
    tasks, and how to make your NLP projects fail less.
  • 💻 Jun 21:
    spacy-streamlit was released!
    It’s a Python library containing building blocks and visualizers for
    integrating spaCy pipelines into Streamlit apps.
  • 📺 Jun 25: Finally, we published a
    Spanish video version of the
    free online course, presented by
    Camila Gutierrez. ¡Practiquemos!

  • 📻 Oct 4: Sebastián was a guest on the
    Talk Python podcast
    to discuss building modern and fast APIs with FastAPI.
  • 📻 Oct 13: On the
    DevJourney Podcast,
    Ines shared her personal software development journey, from getting her first
    computer to becoming a core developer of spaCy and founding Explosion.
  • 💫 Oct 15: In mid-October, we finally published the long-awaited
    nightly pre-release of spaCy v3.0! spaCy v3.0
    features all-new transformer-based pipelines that bring spaCy’s accuracy right
    up to the current state of the art. You can use any pretrained transformer to
    train your own pipelines, and even share one transformer between multiple
    components with multi-task learning. Training is now fully configurable and
    extensible, and you can define your own custom models using PyTorch,
    TensorFlow and other frameworks. The new spaCy projects system lets you
    describe whole end-to-end workflows in a single file, giving you an easy path
    from prototype to production, and making it easy to clone and adapt
    best-practice projects for your own use cases.
  • 🎤 Oct 26: In her
    keynote at Global AI Live,
    Ines presented the upcoming spaCy v3.0 and how it makes it easier than ever to
    bring state-of-the-art NLP projects from prototype to production.
  • 🐍 Oct 27: Ines was honored to be recognized as a
    Python Software Foundation Fellow,
    due to her work with Explosion on spaCy and other projects.
  • 📻 Oct 29: Wrapping up October, Ines and Sofie joined the
    Gradient Dissent podcast hosted
    by Weights & Biases to talk about spaCy v3.0 and the new features, the
    motivation behind the new release and the various design decisions we made
    along the way.

  • 📰 Dec 4: For
    KDNuggets,
    Ines shared her perspective on AI and Machine Learning developments in 2020
    and key trends for 2021.
  • 💫 Dec 11: In December, GitHub introduced discussion boards, so we
    officially launched the
    spaCy discussion board! Come
    join the community and ask for help with your code, share tips, tricks and
    best practices, discuss features and project ideas, collaborate on language
    support, show off what you’ve built and stay up to date with the latest spaCy
    news!
  • 💘 Dec 14: To celebrate another year (and Ines’ birthday!), we started
    another round of sending
    spaCy stickers to
    the community! This time with new designs, including cool holographic styles.
    You can still
    sign up here
    to receive yours!
  • 📻 Dec 28: Wrapping up 2020, Ines joined the
    Python Year in Review episode
    of Talk Python to look back at what 2020 had in store, and what to
    expect for 2021.

With the community and the team continuing to grow, we look forward to making 2021 even better. Thanks for all your support!




