The goal of Named Entity Recognition is to locate and classify named entities in a sequence. The named entities are pre-defined categories chosen according to the use case such as names of people, organizations, places, codes, time notations, monetary values, etc. Essentially, NER aims to assign a class to each token (usually a single word) in a sequence. Because of this, NER is also referred to as token classification.
Model developing Process
For this we are using pre-trained models from simple transformers which build over Hugging face Library.
The process of performing Named Entity Recognition in Simple Transformers does not deviate from the standard pattern.
- Initialize a NERModel
- Train the model with train_model()
- Evaluate the model with eval_model()
- Make predictions on (unlabelled) data with predict()
Supported Model Types using simple transformers
1 . ALBERT = albert 2 . BERT = bert 3 . BERTweet = bertweet 4 . BigBird = bigbird 5 . CamemBERT = camembert 6 . DeBERTa = deberta 7 . DeBERTa = deberta 8 . DeBERTaV2 = deberta-v2 9 . DistilBERT = distilbert 10 . ELECTRA = electra 11 . HerBERT = herbert 12 . LayoutLM = layoutlm 13 . Longformer = longformer 14 . MobileBERT = mobilebert 15 . MPNet = mpnet 16 . RoBERTa = roberta 17 . SqueezeBert = squeezebert 18 . XLM = xlm 19 . XLM-RoBERTa = xlmroberta 20 . XLNet = xlnet
The above models completely uses concept encoders and decoders
A DataFrame containing the 3 columns sentence_id, words, labels. Each value in words will have a
labels value. The sentence_id determines which words belong to a given sentence. I.e. the words from
sequence should be assigned the same unique sentence_id.
Named entity recognition depends on the lables . Model can develop in different lable format here we are using an couple of lablels ['O', 'B-geo', 'B-gpe', 'B-per', 'I-geo', 'B-org', 'I-org', 'B-tim', 'B-art', 'I-art', 'I-per', 'I-gpe', 'I-tim', 'B-nat', 'B-eve', 'I-eve', 'I-nat']
Explanation of the labels
- O = Outside of a named entity
- B-MIS = Beginning of a miscellaneous entity right after another miscellaneous entity
- I-MIS = Miscellaneous entity
- B-PER = Beginning of a person's name right after another person's name
- I-PER = Person's name
- B-ORG = Beginning of an organisation right after another organisation
- I-ORG = Organisation
- B-LOC = Beginning of a location right after another location
- I-LOC = Location
Model used for Named entity recognition
- Arguments.num_train_epochs = 3
- Arguments.train_batch_size = 32
- Arguments.eval_batch_size = 32
- Arguments.learning_rate = 4e-5
- Arguments.max_seq_length = 128
- Arguments.adam_epsilon = 1e-8
- Arguments.do_lower_case = True
- Arguments.n_gpu = 1
- Arguments.overwrite_output_dir = True
Follow technical report.docx file for complete explanation about models used and for dataset.