March 30, 2025
spaCy and the future of multi-lingual NLP
The bad More competitive • “winning” a shared task could be worth millions (fame and future salary) • few researchers care about the languages More expensive • experiments now cost a lot to run (especially GPU) • results are unpredictable • pressure to run fewer experiments on fewer datasets Faster moving, less careful • huge pressure to publish • volume of publications makes reviewing more random • dynamics promote incremental work Source link