Token Classification with spaCy with Live Demo

Arsh Anwar
3 min read · Oct 27, 2022

Did You Know?

  • Every day we generate about 2.5 quintillion bytes of data, which is roughly 2.5e+9 GB.
  • Google processes 3.5 billion requests per day.
  • Amazon has hosted 1,000,000,000 gigabytes of data across more than 1,400,000 servers.

With this enormous amount of data, we need a way to pick out and classify the pieces that matter to us. For text, identifying named entities and assigning each one a category is done with a technique called Named Entity Recognition (NER), a form of Token Classification.

An NER model first identifies an entity and then assigns it to the most suitable class. Some common types of named entities are:

1. Organisations:

  • NASA, TrueFoundry, Google, etc.

2. Places:

  • Los Angeles, New York, Delhi.

3. Money:

  • 1 Billion Dollars, 50 British Pounds.

4. Date:

  • 15th August

5. Person:

  • Elon Musk, Richard Feynman, Subhas Chandra Bose.
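To make the "identify, then categorize" idea concrete, here is a minimal sketch of how per-token labels can be grouped into entities. It assumes the common BIO tagging scheme (B- begins an entity, I- continues it, O is outside any entity), which this post does not otherwise cover; the function name, the sample sentence, and the tags are made up for illustration.

```python
def bio_to_entities(tokens, tags):
    """Group per-token BIO tags into (entity_text, label) pairs."""
    entities = []
    current_tokens, current_label = [], None

    def flush():
        # Store the entity collected so far, if any
        if current_tokens:
            entities.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        if tag.startswith("I-") and tag[2:] == current_label:
            # Continue the entity that is already open
            current_tokens.append(token)
        elif tag.startswith(("B-", "I-")):
            # A new entity starts here (a stray I- tag is treated like B-)
            flush()
            current_tokens, current_label = [token], tag[2:]
        else:
            # "O": the token is outside any entity
            flush()
            current_tokens, current_label = [], None
    flush()
    return entities


# Hypothetical example sentence with hand-written tags
tokens = ["Elon", "Musk", "visited", "NASA", "in", "Los", "Angeles"]
tags = ["B-PERSON", "I-PERSON", "O", "B-ORG", "O", "B-GPE", "I-GPE"]
print(bio_to_entities(tokens, tags))
# [('Elon Musk', 'PERSON'), ('NASA', 'ORG'), ('Los Angeles', 'GPE')]
```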

How well an NER model recognizes named entities depends on the data it was trained on. NER has several uses.

NER may be used for content categorization by collecting the different Named Entities in a text and understanding the content themes based on that data. In academia and research, NER may be used to obtain data and information from a wide range of textual content more quickly. NER is extremely useful for extracting information from large text collections.
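As a rough illustration of content categorization, the sketch below counts how often each entity label appears in a text. The entity list is hand-written sample data and the function name is hypothetical; in practice the pairs would come from an NER model.

```python
from collections import Counter


def label_profile(entities):
    """Count how often each entity label occurs in a list of (text, label) pairs."""
    return Counter(label for _, label in entities)


# Hypothetical entities extracted from one document
entities = [
    ("NASA", "ORG"),
    ("Google", "ORG"),
    ("Delhi", "GPE"),
    ("Elon Musk", "PERSON"),
]
profile = label_profile(entities)
print(profile.most_common(1))  # [('ORG', 2)]
```

A document dominated by ORG labels is probably about companies or institutions, while one dominated by GPE labels is more likely about places.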

Image Source: MonkeyLearn

In today’s blog, we will create a pipeline that classifies the tokens present in a text. We are going to build this pipeline with spaCy. We will also create a live demo using Gradio and deploy it on TrueFoundry.

Let’s begin

Importing Libraries

import os
import gradio as gr
import spacy
from spacy import displacy

Creating our Pipeline

Here, we will create a function named ‘ner’. It takes ‘text’ and ‘model name’ as input and returns ‘position_tokens’, ‘meta_data’, and the ‘dependency tree’.
In this function, token classification can be done with either of two spaCy models:

  • spaCy’s English Transformer-based Model
  • spaCy’s Small NER Model
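A function like this could be sketched as follows. This is not necessarily the code from the repository: the model names `en_core_web_sm` and `en_core_web_trf`, the helper `spans_to_highlight`, and the exact contents of `meta_data` are assumptions made for illustration. Both models must be installed separately (for example with `python -m spacy download en_core_web_sm`).

```python
def spans_to_highlight(text, spans):
    """Turn non-overlapping (start, end, label) spans into (substring, label)
    pairs that cover the whole text; unlabeled stretches get label None."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start > cursor:
            pieces.append((text[cursor:start], None))
        pieces.append((text[start:end], label))
        cursor = end
    if cursor < len(text):
        pieces.append((text[cursor:], None))
    return pieces


def ner(text, model_name="en_core_web_sm"):
    """Run a spaCy pipeline and return position_tokens, meta_data, dependency_tree."""
    # Imported lazily so the helper above is usable without spaCy installed
    import spacy
    from spacy import displacy

    nlp = spacy.load(model_name)  # assumed model name, see the note above
    doc = nlp(text)

    # Character offsets and labels of the recognized entities
    spans = [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]
    position_tokens = spans_to_highlight(text, spans)

    # Assumed metadata fields; adjust to whatever the demo should display
    meta_data = {
        "char_count": len(text),
        "token_count": len(doc),
        "entity_count": len(doc.ents),
    }

    # Dependency tree rendered as a standalone HTML page
    dependency_tree = displacy.render(doc, style="dep", page=True, jupyter=False)
    return position_tokens, meta_data, dependency_tree
```

Calling `ner("Elon Musk visited Delhi", "en_core_web_sm")` returns the three values in the order described above. In a real app, loading each model once and caching it avoids reloading on every request.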

Building Gradio App

We are going to create our Gradio app and run it on port 8080.
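A minimal sketch of such an app is shown below. The component choices, labels, model names, and the `0.0.0.0` bind address are assumptions (binding to all interfaces is typical when the app runs inside a container); only the port number comes from this post. `ner_fn` stands for a function that takes a text and a model name and returns three values, like the ‘ner’ function described earlier.

```python
SERVER_PORT = 8080  # port used in this post
MODEL_CHOICES = ["en_core_web_sm", "en_core_web_trf"]  # assumed model names


def launch_kwargs():
    """Keyword arguments passed to Gradio's launch()."""
    # 0.0.0.0 is an assumption so the app is reachable from outside a container
    return {"server_name": "0.0.0.0", "server_port": SERVER_PORT}


def build_demo(ner_fn):
    """Wire a function with signature (text, model_name) into a Gradio interface."""
    import gradio as gr  # imported lazily; requires `pip install gradio`

    return gr.Interface(
        fn=ner_fn,
        inputs=[
            gr.Textbox(lines=4, label="Text"),
            gr.Dropdown(choices=MODEL_CHOICES, value=MODEL_CHOICES[0], label="Model"),
        ],
        outputs=[
            gr.HighlightedText(label="Entities"),
            gr.JSON(label="Metadata"),
            gr.HTML(label="Dependency tree"),
        ],
        title="Token Classification with spaCy",
    )


def main(ner_fn):
    """Build the interface and serve it on port 8080."""
    build_demo(ner_fn).launch(**launch_kwargs())
```

Calling `main(ner)` starts the server, after which the app is reachable at http://localhost:8080 on the machine running it.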

Test Drive

Running our App

Voilà! It runs. Now we will work on deploying it.

Deployment

We are going to use TrueFoundry for our deployment.

Logging into TrueFoundry

Heading to Deployment Section

1) Creating a new deployment

2) Select the Service option and Workspace name

3) Fill out properties and submit

4) Deploying

5) Successful Deployment

Final Thoughts

After the deployment is done, you will be able to use the Gradio App.

The app is deployed here: https://token-classification-arsh-dev.tfy-ctl-euwe1-develop.develop.truefoundry.tech/

Code

The code above is also available in my repository (linked in the references below).

References:

  1. TrueFoundry: https://truefoundry.com/
  2. TrueFoundry App: https://app.truefoundry.com/
  3. TrueFoundry Docs: http://docs.truefoundry.com/
  4. Code: https://github.com/d4rk-lucif3r/Token-Classification-with-Spacy

Arsh Anwar

AI/ML expert. Built LuciferML (100k+ downloads). Co-founder @Revca, building smart solutions for a sustainable future.