
Advancements in NLP have skyrocketed in the past couple of years. In research, machine learning and deep learning paved the way for new techniques and possibilities. Many of these techniques have already gone mainstream, like ELMo, PyText, BERT, etc. They provide many useful features, like predicting text, detecting entities in text, and, probably most important, understanding the context of text.

The industry followed, and various frameworks and tools have been developed based on these new research techniques. Today we have several options when it comes to NLP frameworks. In this article, I cover one of them – Flair. I was impressed by its ease of use, the possibilities it provides, and how you can achieve state-of-the-art results in a matter of minutes. So, let’s dive into it.


1. What is Flair?

Flair is an amazing NLP framework built on top of the PyTorch deep learning framework. It was built and open-sourced by the Humboldt University of Berlin and Zalando Research. In essence, it provides three big families of functionalities. First, it exposes state-of-the-art natural language processing (NLP) models that you can easily use for transfer learning.

Second, it provides various embeddings, so you can use Flair as an embedding library. Finally, it exposes various datasets and allows you to train your own models, so you can use it as a deep learning framework. Flair is a multilingual library, so it is very good if you work with multiple languages, especially if you need to work with German.


1.1 Flair SOTA NLP Models

As we already mentioned, the first big Flair functionality is that it exposes various SOTA NLP models which you can use via transfer learning. Some of the models that can be used out of the box are:

1. Named-Entity Recognition (NER): Detect words in the text that represent a person, a location, or an organization.

2. Parts-of-Speech Tagging (PoS): Mark all the words in the given text based on the “part of speech” they belong to.

3. Text Classification: Classify text based on the defined criteria.

These models are really good and, if not SOTA, they are very close to it. Check out the results of the models that Flair provides against current SOTA models for NER:

Flair NER SOTA models

1.2 Flair as Embedding Library

We will explore this topic in more detail in the following chapters; however, it is important to realize that Flair provides various word and document embedding techniques, like BERT embeddings, ELMo, etc. It is especially interesting that it provides its own embeddings – Flair Embeddings, or Contextual String Embeddings.

Contextual String Embeddings

This is a novel type of word embedding which is character-based. These embeddings are trained without any explicit notion of words and thus fundamentally model words as sequences of characters. Also, they are contextualized by their surrounding text, meaning that the same word will have different embeddings depending on its contextual use.
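
To make this concrete, here is a minimal sketch (the sentences are my own examples) showing that the same word receives different vectors depending on its context:

from flair.data import Sentence
from flair.embeddings import FlairEmbeddings

embedding = FlairEmbeddings('news-forward')

# The same word "nail" in two different contexts...
sentence_1 = Sentence('The nail polish dries fast.')
sentence_2 = Sentence('He hit the nail with a hammer.')
embedding.embed(sentence_1)
embedding.embed(sentence_2)

# ...produces two different embedding vectors
print(sentence_1[1].embedding[:5])
print(sentence_2[3].embedding[:5])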

Find out more about Contextual String Embeddings in the paper Contextual String Embeddings for Sequence Labeling (Akbik et al., COLING 2018).

In this paper, we propose a novel type of contextualized character-level word embedding which we hypothesize to combine the best attributes of the above-mentioned embeddings; namely, the ability to (1) pre-train on large unlabeled corpora, (2) capture word meaning in context and therefore produce different embeddings for polysemous words depending on their usage, and (3) model words and context fundamentally as sequences of characters, to both better handle rare and misspelled words as well as model subword structures such as prefixes and endings.

Alan Akbik

Professor of Machine Learning, Humboldt-Universität zu Berlin

2. Installation & Prerequisites

Before installing Flair, make sure that you have Python 3.6 or higher and PyTorch 1.5 or higher installed. Once you have these, just run the following command:

pip install flair

3. Flair Basics

In general, there are several important Flair data types. Let’s start with the simplest one and work our way up from there.

3.1 Sentence Class

At the core of the Flair library is the Sentence class:

from flair.data import Sentence

This data type is used for the majority of operations in the library. As the name suggests, it represents a single sentence. Every Sentence is tokenized automatically; the default tokenization is done by the segtok library under the hood:

# Automatic tokenization using the segtok library.
sentence = Sentence('I love Rubik\'s Code blog!')
print(sentence)
Sentence: "I love Rubik 's Code blog !"   [− Tokens: 7]

You can access each token of the sentence either by using a zero-based index or by using the get_token method, which takes the one-based token ID (that is why both lines below print the same token):

print(sentence[1])
print(sentence.get_token(2))
Token: 2 love
Token: 2 love
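
You can also iterate over the tokens of a Sentence directly; each token exposes its one-based idx and its text:

for token in sentence:
    print(token.idx, token.text)
1 I
2 love
3 Rubik
4 's
5 Code
6 blog
7 !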

Of course, you can choose not to use a tokenizer, in which case the sentence is split only on whitespace:

sentence = Sentence('I love Rubik\'s Code blog!', use_tokenizer=False)
print(sentence) # Note that text is still split on whitespaces, so 5 tokens are returned here
Sentence: "I love Rubik's Code blog!"   [− Tokens: 5]

If you need some other tokenization, for example for Japanese, you can do that as well:

from flair.tokenization import JapaneseTokenizer

japanese_tokenizer = JapaneseTokenizer("janome")
sentence = Sentence('愛は法です。意志の下での愛。', use_tokenizer=japanese_tokenizer)
print(sentence)
Sentence: "愛 は 法 です 。 意志 の 下 で の 愛 。"   [− Tokens: 12]

3.2 SequenceTagger and TextClassifier Classes

To load all those neat SOTA models you can use two classes: SequenceTagger and TextClassifier. The SequenceTagger is used for NER and PoS operations, while TextClassifier is used, as the name suggests, for text classification. We will see how these classes are used in the following chapters, but the most important method of both is the load method:

from flair.models import SequenceTagger
from flair.models import TextClassifier

tagger = SequenceTagger.load('ner')
classifier = TextClassifier.load('en-sentiment')

4. Flair Embeddings

As we mentioned, Flair provides various embedding types through the flair.embeddings module. You can use simple GloVe embeddings, the novel Flair embeddings, or stack them together. Here is what you need to import first:

from flair.data import Sentence
from flair.embeddings import WordEmbeddings
from flair.embeddings import FlairEmbeddings
from flair.embeddings import StackedEmbeddings

sentence = Sentence('I love Rubik\'s Code blog!')

In the snippet above we also initialized the sentence “I love Rubik’s Code blog!”. To load various embeddings we use the WordEmbeddings class. For example, if we want to load GloVe embeddings and apply them to the previously initialized Sentence, all we have to do is:

glove_embedding = WordEmbeddings('glove')
glove_embedding.embed(sentence)

The results can be accessed by using the embedding property of a token in a sentence:

print(sentence[1])
print(sentence[1].embedding)
Token: 2 love
tensor([ 2.5975e-01,  5.5833e-01,  5.7986e-01, -2.1361e-01,  1.3084e-01,
         9.4385e-01, -4.2817e-01, -3.7420e-01, -9.4499e-02, -4.3344e-01,
        -2.0937e-01,  3.4702e-01,  8.2516e-02,  7.9735e-01,  1.6606e-01,
        -2.6878e-01,  5.8830e-01,  6.7397e-01, -4.9965e-01,  1.4764e+00,
         5.5261e-01,  2.5295e-02, -1.6068e-01, -1.3878e-01,  4.8686e-01,
         1.1420e+00,  5.6195e-02, -7.3306e-01,  8.6932e-01, -3.5892e-01,
        -5.1877e-01,  9.0402e-01,  4.9249e-01, -1.4915e-01,  4.8493e-02,
         2.6096e-01,  1.1352e-01,  4.1275e-01,  5.3803e-01, -4.4950e-01,
         8.5733e-02,  9.1184e-02,  5.0177e-03, -3.4645e-01, -1.1058e-01,
        -2.2235e-01, -6.5290e-01, -5.1838e-02,  5.3791e-01, -8.1040e-01,
        -1.8253e-01,  2.4194e-01,  5.4855e-01,  8.7731e-01,  2.2165e-01,
        -2.7124e+00,  4.9405e-01,  4.4703e-01,  5.5882e-01,  2.6076e-01,
         2.3760e-01,  1.0668e+00, -5.6971e-01, -6.4960e-01,  3.3511e-01,
         3.4609e-01,  1.1033e+00,  8.5261e-02,  2.4847e-02, -4.5453e-01,
         7.7012e-02,  2.1321e-01,  1.0444e-01,  6.7157e-02, -3.4261e-01,
         8.5534e-01,  1.3361e-01, -4.3296e-01, -5.6726e-01, -2.1348e-01,
        -3.3277e-01,  3.4351e-01,  3.2164e-01,  4.4527e-01, -1.3208e+00,
        -1.3270e-01, -7.0820e-01, -4.8472e-01, -6.9396e-01, -2.6080e-01,
        -4.7099e-01, -5.7492e-02,  9.3587e-02,  4.0006e-01, -4.3419e-01,
        -2.7364e-01, -7.7017e-01, -8.4028e-01, -1.5620e-03,  6.2223e-01])

If we want to use the novel Flair Embeddings, we can use the class of the same name. This class also provides various options, like forward and backward embeddings:

flair_embedding = FlairEmbeddings('news-forward')

flair_embedding.embed(sentence)

print(sentence[1])
print(sentence[1].embedding)
Token: 2 love
tensor([ 0.2598,  0.5583,  0.5799,  ..., -0.0039, -0.0130,  0.0047])

The cool thing is that you can stack both of these embeddings together if you want to use them that way:

stacked_embeddings = StackedEmbeddings([
                                        glove_embedding,
                                        flair_embedding
                                       ])

stacked_embeddings.embed(sentence)
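
After stacking, the embedding of each token is simply the concatenation of the individual vectors. A quick sanity check (assuming the 'glove' vectors are 100-dimensional and the 'news-forward' vectors are 2048-dimensional):

# Concatenation of GloVe (100 dims) and news-forward (2048 dims)
print(sentence[1].embedding.shape)  # torch.Size([2148])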

5. Named Entity Recognition (NER) with Flair

Ok, let’s finally see how we can use some of those Flair state-of-the-art NLP models that we talked so much about. We start with Named Entity Recognition or NER. Here is how Wikipedia defines NER:

Named-entity recognition (NER) (also known as (named) entity identification, entity chunking, and entity extraction) is a subtask of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, percentages, etc.

Wikipedia

In essence, it is the process of detecting words in unstructured text that represent names of organizations, persons, cities, etc. It is a commonly used and very powerful NLP technique. We already know that we can use the SequenceTagger class for this. A very cool thing is that Flair is multilingual.

This turned out to be really important in a project I am working on at the moment, so I am really happy that there is support for other languages. Let’s start with English and then see how we can do the same in other languages.

5.1 Named Entity Recognition (NER) in English

First, we import all necessary modules and initialize the Sentence:

from flair.data import Sentence
from flair.models import SequenceTagger

sentence = Sentence('Rubiks Code is located in Berlin!')

Then we load the English NER model using SequenceTagger:

en_ner = SequenceTagger.load('ner')

The first time you run the previous line, the NER model is downloaded from HuggingFace. The model is then loaded into the en_ner variable. After that, we can run it on the defined text and see how it performs:

en_ner.predict(sentence)

print(sentence.to_tagged_string())
Rubiks <B-ORG> Code <E-ORG> is located in Berlin <S-LOC> !

Note that we used the to_tagged_string method to retrieve the tagging information from the sentence. The model successfully identified and tagged Rubiks Code as an organization and Berlin as a location. The same information can be extracted from the sentence like this:

print(sentence)
print('NER tags:')

# iterate over entities and print
for entity in sentence.get_spans('ner'):
    print(entity)
Sentence: "Rubiks Code is located in Berlin !"   [− Tokens: 7  − Token-Labels: "Rubiks <B-ORG> Code <E-ORG> is located in Berlin <S-LOC> !"]
NER tags:
Span [1,2]: "Rubiks Code"   [− Labels: ORG (0.5957)]
Span [6]: "Berlin"   [− Labels: LOC (0.9998)]

Here we can also see the probability, i.e. the confidence, with which the model identified each entity. This is amazing: the code is clean and easy to use.
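
If you need the entities in a structured form, for example to store them in a database, each span also exposes text, tag, and score attributes:

for entity in sentence.get_spans('ner'):
    print(entity.text, entity.tag, entity.score)
Rubiks Code ORG 0.5957...
Berlin LOC 0.9998...

Even better, we can do the same thing in other languages too.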

5.2 Named Entity Recognition (NER) in German and French

The code for German NER is almost the same:

sentence = Sentence('Rubiks Code befindet sich in Berlin!')

# Loading German NER model
de_ner = SequenceTagger.load('de-ner')

de_ner.predict(sentence)

print(sentence)
print('NER tags:')

# iterate over entities and print
for entity in sentence.get_spans('ner'):
    print(entity)
Sentence: "Rubiks Code befindet sich in Berlin !"   [− Tokens: 7  − Token-Labels: "Rubiks <S-PER> Code befindet sich in Berlin <S-LOC> !"]
NER tags:
Span [1]: "Rubiks"   [− Labels: PER (0.9822)]
Span [6]: "Berlin"   [− Labels: LOC (0.9986)]

It is interesting that the German model didn’t identify Rubiks Code as an organization, but it did identify Rubiks as a person.


We can do the same thing in French:

fr_ner = SequenceTagger.load('fr-ner')

sentence = Sentence('Rubiks Code est à Berlin !')

fr_ner.predict(sentence)

print(sentence)
print('NER tags:')

# iterate over entities and print
for entity in sentence.get_spans('ner'):
    print(entity)
Sentence: "Rubiks Code est à Berlin !"   [− Tokens: 6  − Token-Labels: "Rubiks <B-ORG> Code <E-ORG> est à Berlin <S-LOC> !"]
NER tags:
Span [1,2]: "Rubiks Code"   [− Labels: ORG (0.5733)]
Span [5]: "Berlin"   [− Labels: LOC (0.873)]

6. Text Classification with Flair

Flair provides many ways to do text classification. In this chapter, we learn how to implement simple sentiment analysis with it. For this purpose, we use the TextClassifier class. As with NER, we can do this in multiple languages.

In the past couple of years, sentiment analysis has become one of the essential tools for monitoring and understanding customer feedback. With it, detection of the underlying emotional tone of messages and responses is fully automated, which means that businesses can understand what the customer needs better and faster, and provide better products and services.


Sentiment analysis is, in a nutshell, the most common text classification task. It is the process of analyzing pieces of text to determine their sentiment: positive, negative, or neutral. Understanding the social sentiment of your brand, product, or service while monitoring online conversations is essential for modern business, and sentiment analysis is the first step towards that.

6.1 Simple Sentiment Analysis in English

Let’s start by importing all the necessary modules and initializing an obviously positive sentence:

from flair.data import Sentence
from flair.models import TextClassifier

sentence = Sentence('I love Rubik\'s Code blog!')

Now, let’s load the model:

classifier = TextClassifier.load('en-sentiment')

As in the previous examples, the model will be downloaded on the first run. Here is how we use it:

classifier.predict(sentence)

print('Sentiment: ', sentence.labels)
Sentiment:  [POSITIVE (0.9826)]

We can see that the model classified the previous sentence as POSITIVE with pretty high confidence. This is really good; let’s check whether it works with negative sentences as well:

sentence = Sentence('I don\'t like pineapple!')

classifier.predict(sentence)

print('Sentiment: ', sentence.labels)
Sentiment:  [NEGATIVE (0.9988)]
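
In practice, you often want to classify many messages at once. The predict method also accepts a list of sentences and processes them in mini-batches (a small sketch with made-up messages):

sentences = [Sentence('Great service!'), Sentence('Terrible experience.')]

classifier.predict(sentences, mini_batch_size=32)

for s in sentences:
    print(s.labels)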

6.2 Detecting Offensive Language in German

This is not the only type of classification that can be done with Flair. We can, for example, detect offensive language in German. Note that the OTHER label in the output below means that the text was classified as not offensive:

sentence = Sentence('Ich liebe dich!')

classifier = TextClassifier.load('de-offensive-language')
classifier.predict(sentence)

print('Offensive: ', sentence.labels)
Offensive:  [OTHER (1.0)]

7. Flair Datasets

The Flair library provides a number of datasets through the Corpus object. This object consists of a list of train sentences (the training dataset), a list of dev sentences (the validation dataset), and a list of test sentences (the test dataset). Using this object you can either load a dataset provided by Flair or load your own dataset (more on that later). Here is how you can load the English corpus:

import flair.datasets

corpus = flair.datasets.UD_ENGLISH()

print(f'Train size: {len(corpus.train)}')
print(f'Test size: {len(corpus.test)}')
print(f'Dev size: {len(corpus.dev)}')
Train size: 12543
Test size: 2077
Dev size: 2001

After this, you can easily access sentences from the corpus by index:

print(corpus.test[9])
Sentence: "I 'm staying away from the stock ."   [− Tokens: 8  − Token-Labels: "I <I/PRON/PRP/nsubj/Nom/Sing/1/Prs> 'm <be/AUX/VBP/aux/Ind/Pres/Fin> staying <stay/VERB/VBG/root/Pres/Part> away <away/ADV/RB/advmod> from <from/ADP/IN/case> the <the/DET/DT/det/Def/Art> stock <stock/NOUN/NN/obl/Sing> . <./PUNCT/./punct>"]

The German corpus is also available:

corpus = flair.datasets.UD_GERMAN()

print(f'Train size: {len(corpus.train)}')
print(f'Test size: {len(corpus.test)}')
print(f'Dev size: {len(corpus.dev)}')

print(corpus.test[9])
Sentence: "Absolut empfehlenswert ist auch der Service ."   [− Tokens: 7  − Token-Labels: "Absolut <absolut/ADV/ADJD/advmod> empfehlenswert <empfehlenswert/ADJ/ADJD/root> ist <sein/AUX/VAFIN/cop/Ind/Sing/3/Pres/Fin> auch <auch/ADV/ADV/advmod> der <der/DET/ART/det/Nom/Def/Masc/Sing/Art> Service <Service/NOUN/NN/nsubj/Nom/Masc/Sing> . <./PUNCT/$./punct>"]

8. Training a Model

Finally, let’s explore how you can train a model using Flair. Technically, you can use Flair as a standard deep learning framework. Here we train a model that can classify messages from the Spam SMS Dataset. Spam detection was one of the first machine learning tasks used on the Internet. It falls under NLP and text classification, which sounds like a perfect match for the Flair framework. The dataset is heavily used in the literature and is great for beginners. First, let’s import all the necessary modules:

import pandas as pd
from pathlib import Path

from flair.data import Sentence
from flair.embeddings import WordEmbeddings, FlairEmbeddings, DocumentRNNEmbeddings
from flair.models import TextClassifier
from flair.trainers import ModelTrainer
from flair.data import Corpus
from flair.datasets import ClassificationCorpus

We use various modules that we explored in the previous chapters. We already know how to use the Sentence, TextClassifier, and embedding classes. Here we use the Corpus object more explicitly, along with the ClassificationCorpus class for loading data from .csv files. The ModelTrainer class is used to train the model, as the name suggests.

8.1 Load and Prepare Data

Ok, so let’s load and pre-process the data from the dataset:

data = pd.read_csv(".\\data\\spam.csv", encoding='latin-1')

data.sample(frac=1).drop_duplicates()
data = data[['v1', 'v2']].rename(columns={"v1":"label", "v2":"text"})
data['label'] = '__label__' + data['label'].astype(str)
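
The __label__ prefix is added because ClassificationCorpus expects data in the FastText format, where each line starts with __label__<class> followed by the text. After this transformation, a row looks like this (an illustrative, made-up message):

__label__spam	WINNER!! You have won a free prize, call now!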

Ok, now we need to transfer this data from the Pandas DataFrame into a Corpus object. For that, we split it into train, validation, and test subsets and utilize the ClassificationCorpus class:

border_1 = int(len(data)*0.8)
border_2 = int(len(data)*0.9)

data.iloc[0:border_1].to_csv('.\\data\\train.csv', sep='\t', index = False, header = False)
data.iloc[border_1:border_2].to_csv('.\\data\\test.csv', sep='\t', index = False, header = False)
data.iloc[border_2:].to_csv('.\\data\\dev.csv', sep='\t', index = False, header = False)

corpus: Corpus = ClassificationCorpus(Path('.\\data\\'), test_file='test.csv', dev_file='dev.csv', train_file='train.csv', label_type='topic')

Finally, we need to build word and document embeddings from this data. We use GloVe and Flair embeddings for the word embeddings and DocumentRNNEmbeddings for the document embeddings:

word_embeddings = [WordEmbeddings('glove'), FlairEmbeddings('news-forward-fast'), FlairEmbeddings('news-backward-fast')]
document_embeddings = DocumentRNNEmbeddings(word_embeddings, hidden_size=512, reproject_words=True, reproject_words_dimension=256)

8.2 Train a Model

We put all that together and train a model:

classifier = TextClassifier(document_embeddings, label_dictionary=corpus.make_label_dictionary(label_type='topic'), multi_label=False, label_type='topic')
trainer = ModelTrainer(classifier, corpus)
trainer.train('.\\model', max_epochs=5)

First, we create an object of the TextClassifier class, passing along the document embeddings, the label dictionary, and the label type. Then we create a ModelTrainer object using the previously created classifier and corpus objects. Finally, we run the train method.

The output will be located in the model folder, where information about each epoch, the weight values, and the model itself are stored. The output of this process looks something like this:

2021-09-10 16:59:35,414 Computing label dictionary. Progress:
100%|██████████| 4457/4457 [00:03<00:00, 1340.41it/s]
2021-09-10 17:00:02,226 Corpus contains the labels: topic (#4457)
2021-09-10 17:00:02,229 Created (for label 'topic') Dictionary with 2 tags: ham, spam
2021-09-10 17:00:02,237 ----------------------------------------------------------------------------------------------------
2021-09-10 17:00:02,239 Model: "TextClassifier(
  (loss_function): CrossEntropyLoss()
  (document_embeddings): DocumentRNNEmbeddings(
    (embeddings): StackedEmbeddings(
      (list_embedding_0): WordEmbeddings('glove')
      (list_embedding_1): FlairEmbeddings(
        (lm): LanguageModel(
          (drop): Dropout(p=0.25, inplace=False)
          (encoder): Embedding(275, 100)
          (rnn): LSTM(100, 1024)
          (decoder): Linear(in_features=1024, out_features=275, bias=True)
        )
      )
      (list_embedding_2): FlairEmbeddings(
        (lm): LanguageModel(
          (drop): Dropout(p=0.25, inplace=False)
          (encoder): Embedding(275, 100)
          (rnn): LSTM(100, 1024)
          (decoder): Linear(in_features=1024, out_features=275, bias=True)
        )
      )
    )
...

2021-09-10 17:00:02,262 ----------------------------------------------------------------------------------------------------
2021-09-10 17:00:02,264 Device: cpu
2021-09-10 17:00:02,265 ----------------------------------------------------------------------------------------------------
2021-09-10 17:00:02,267 Embeddings storage mode: cpu
2021-09-10 17:00:02,271 ----------------------------------------------------------------------------------------------------

2021-09-10 17:01:08,981 epoch 1 - iter 14/140 - loss 0.01010582 - samples/sec: 9.13 - lr: 0.100000
2021-09-10 17:02:08,546 epoch 1 - iter 28/140 - loss 0.00864684 - samples/sec: 7.54 - lr: 0.100000
2021-09-10 17:03:05,738 epoch 1 - iter 42/140 - loss 0.00766546 - samples/sec: 7.88 - lr: 0.100000
2021-09-10 17:04:15,945 epoch 1 - iter 56/140 - loss 0.00688913 - samples/sec: 6.87 - lr: 0.100000
2021-09-10 17:05:24,812 epoch 1 - iter 70/140 - loss 0.00605645 - samples/sec: 6.51 - lr: 0.100000
2021-09-10 17:06:28,052 epoch 1 - iter 84/140 - loss 0.00563861 - samples/sec: 7.09 - lr: 0.100000
2021-09-10 17:07:27,420 epoch 1 - iter 98/140 - loss 0.00548316 - samples/sec: 7.82 - lr: 0.100000
2021-09-10 17:08:16,254 epoch 1 - iter 112/140 - loss 0.00507662 - samples/sec: 9.18 - lr: 0.100000
2021-09-10 17:08:59,090 epoch 1 - iter 126/140 - loss 0.00479925 - samples/sec: 10.47 - lr: 0.100000
2021-09-10 17:09:47,542 epoch 1 - iter 140/140 - loss 0.00471874 - samples/sec: 9.25 - lr: 0.100000
2021-09-10 17:09:51,488 ----------------------------------------------------------------------------------------------------
2021-09-10 17:09:51,489 EPOCH 1 done: loss 0.0047 - lr 0.1000000
2021-09-10 17:11:09,362 DEV : loss 0.00234626024030149 - f1-score (micro avg)  0.9785
2021-09-10 17:11:09,708 BAD EPOCHS (no improvement): 0
2021-09-10 17:11:09,710 saving best model
2021-09-10 17:11:14,686 ----------------------------------------------------------------------------------------------------
2021-09-10 17:12:19,000 epoch 2 - iter 14/140 - loss 0.00292956 - samples/sec: 9.45 - lr: 0.100000
2021-09-10 17:13:38,299 epoch 2 - iter 28/140 - loss 0.00347677 - samples/sec: 5.66 - lr: 0.100000
2021-09-10 17:14:33,241 epoch 2 - iter 42/140 - loss 0.00307422 - samples/sec: 8.19 - lr: 0.100000
2021-09-10 17:15:45,948 epoch 2 - iter 56/140 - loss 0.00310895 - samples/sec: 6.62 - lr: 0.100000
2021-09-10 17:16:36,750 epoch 2 - iter 70/140 - loss 0.00291028 - samples/sec: 8.83 - lr: 0.100000
2021-09-10 17:17:32,092 epoch 2 - iter 84/140 - loss 0.00280403 - samples/sec: 8.11 - lr: 0.100000
2021-09-10 17:18:36,880 epoch 2 - iter 98/140 - loss 0.00281082 - samples/sec: 6.98 - lr: 0.100000
2021-09-10 17:19:37,738 epoch 2 - iter 112/140 - loss 0.00272288 - samples/sec: 7.79 - lr: 0.100000
2021-09-10 17:20:29,745 epoch 2 - iter 126/140 - loss 0.00259132 - samples/sec: 8.63 - lr: 0.100000
...

   macro avg     0.9732    0.9226    0.9459       557
weighted avg     0.9765    0.9767    0.9760       557
 samples avg     0.9767    0.9767    0.9767       557

2021-09-10 17:53:27,039 ----------------------------------------------------------------------------------------------------
{'test_score': 0.9766606822262118,
 'dev_score_history': [0.978494623655914,
  0.982078853046595,
  0.985663082437276,
  0.9838709677419355,
  0.985663082437276],
 'train_loss_history': [0.004718736432190163,
  0.0025660763221981146,
  0.0022709675352163288,
  0.0019738855226341923,
  0.0018342424205162503],
 'dev_loss_history': [tensor(0.0023),
  tensor(0.0019),
  tensor(0.0023),
  tensor(0.0016),
  tensor(0.0014)]}
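
Once training finishes, the best-performing model (note the “saving best model” line in the log above) can be loaded back from the model folder and used for predictions, just like the pre-trained models from the earlier chapters. A minimal sketch with a made-up message:

classifier = TextClassifier.load('.\\model\\best-model.pt')

message = Sentence('WINNER!! You have won a free prize, call now!')
classifier.predict(message)

print('Label: ', message.labels)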

Conclusion

In this article, we explored many possibilities of Flair – a powerful NLP framework. We had a chance to learn its basics and to learn how to use it for various NLP tasks such as NER and Sentiment Analysis.


Nikola M. Zivkovic

Nikola M. Zivkovic is the author of the books: Ultimate Guide to Machine Learning and Deep Learning for Programmers. He loves knowledge sharing, and he is an experienced speaker. You can find him speaking at meetups, conferences, and as a guest lecturer at the University of Novi Sad.