Accelerating Text Processing With RAPIDS

Short talk

See how you can speed up text processing on GPUs without ever leaving Python. In this session, we work up from DataFrame APIs, through higher-level specialized text manipulation functions, to scikit-learn-style text vectorizers and integration with spaCy and Hugging Face, to get 100x+ faster NLP pipelines. We also cover scaling these workflows to multiple nodes using Dask.

Speaker

Vibhu Jawa

Vibhu Jawa is a Software Engineer and Data Scientist on the RAPIDS team at NVIDIA, where he focuses on building GPU-accelerated data science products. Before joining NVIDIA, Vibhu pursued his master’s in computer science at Johns Hopkins, where his research focused on NLP and building interpretable machine learning models for healthcare.