Token Classification with spaCy, with a Live Demo

Oct 27, 2022

Did You Know?

Every day we generate 2.5 quintillion bytes of data, which is roughly 2.5 billion GB (2.5e+9 GB).
Google processes 3.5 billion requests per day.
Amazon has hosted 1,000,000,000 gigabytes of data across more than 1,400,000 servers.

With this humongous amount of data, we need a way to classify it before we can use it. Classifying the named entities in text is done with a technique called Named Entity Recognition (NER), also known as token classification.

An NER model first identifies an entity and then assigns it to the most suitable class. Some common types of named entities are:

1. Organisations:

NASA, TrueFoundry, Google, etc.

2. Places:

Los Angeles, New York, Delhi.

3. Money:

1 billion dollars, 50 British pounds.

4. Date:

15th August

5. Person:

Elon Musk, Richard Feynman, Subhas Chandra Bose.
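The two steps described above (identify an entity, then assign it a class) can be illustrated with a toy sketch. This is not how spaCy or any trained NER model works internally: it uses a hand-written lookup table, and the table contents, the label names (ORG, PLACE, PERSON), and the function name are all assumptions made for illustration.

```python
import re

# Hypothetical lookup table built from the examples above. A trained NER
# model learns such associations from annotated text instead.
GAZETTEER = {
    "NASA": "ORG",
    "Google": "ORG",
    "Los Angeles": "PLACE",
    "New York": "PLACE",
    "Elon Musk": "PERSON",
    "Richard Feynman": "PERSON",
}

def find_entities(text):
    """Return (entity_text, label, start, end) tuples sorted by position."""
    found = []
    for name, label in GAZETTEER.items():
        # Step 1: identify where the entity occurs in the text.
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            # Step 2: attach the class recorded for the matched entity.
            found.append((match.group(), label, match.start(), match.end()))
    return sorted(found, key=lambda item: item[2])

sentence = "Elon Musk visited NASA in Los Angeles."
entities = find_entities(sentence)
for entity_text, label, start, end in entities:
    print(entity_text, label, start, end)
```

A lookup table like this cannot recognize names it has never seen, which is exactly the gap a trained model is meant to fill.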

How well an NER model recognizes named entities depends on the data it was trained on. NER has several uses.

NER may be used for content categorization by collecting the different Named Entities in a text and understanding the content themes based on that data. In academia and research, NER may be used to obtain data and information from a wide range of textual content more quickly. NER is extremely useful for extracting information from large text collections.
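As a rough sketch of the content-categorization idea, one could count the entity labels and mentions found in a text and treat the most frequent ones as a hint about its theme. The entity list below is made-up sample data standing in for the output of an NER step, and the function name is hypothetical.

```python
from collections import Counter

# Hypothetical output of an NER step: (entity text, entity label) pairs.
sample_entities = [
    ("NASA", "ORG"),
    ("Elon Musk", "PERSON"),
    ("Google", "ORG"),
    ("Los Angeles", "PLACE"),
    ("NASA", "ORG"),
]

def summarize_entities(pairs):
    """Count how often each label and each entity mention occurs."""
    label_counts = Counter(label for _, label in pairs)
    mention_counts = Counter(text for text, _ in pairs)
    return label_counts, mention_counts

label_counts, mention_counts = summarize_entities(sample_entities)
print(label_counts.most_common(1))    # dominant entity type
print(mention_counts.most_common(1))  # most-mentioned entity
```

Here a text dominated by organisation mentions would be flagged as being mostly about organisations; a real system would likely combine such counts with other signals.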

In today’s blog, we will create a pipeline that classifies the tokens present in a text. We are going to build this pipeline with spaCy. We will also create a live demo using Gradio and deploy it on TrueFoundry.

Let’s begin

Importing Libraries

import os
import gradio as gr
import spacy
from spacy import displacy

Creating our Pipeline

Here, we will create a function named ‘ner’. It takes ‘text’ and ‘model name’ as input and returns ‘position_tokens’, ‘meta_data’, and ‘dependency tree’.
In this function, token classification can be done with two spaCy models: