Suppose we are playing a word game: I think of a few words and you have to guess them.

As usual, you ask me some questions, identify the subject and try to guess. A possible end of the game is:

- **I think**: rabbit, carrot, orange.
- **You try**: dog, carrot, peach, orange.

We can define two quantities to summarize how good you are at this game:

- **precision**: the number of *correct guesses over the number of tries* = 2/4
- **recall**: the number of *correct guesses over the number of words to guess* = 2/3
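As a minimal sketch, the two definitions can be computed from the word lists of the example game. The function name `precision_recall` is only an illustrative choice, and the sketch assumes each word is guessed at most once, so the guesses can be treated as a set.

```python
def precision_recall(guesses, targets):
    """Return (precision, recall) for the guessed words against the target words."""
    guesses, targets = set(guesses), set(targets)
    correct = len(guesses & targets)  # words that were both guessed and thought of
    precision = correct / len(guesses) if guesses else 0.0
    recall = correct / len(targets) if targets else 0.0
    return precision, recall


precision, recall = precision_recall(
    ["dog", "carrot", "peach", "orange"],  # what you try
    ["rabbit", "carrot", "orange"],        # what I think
)
print(precision, recall)  # precision = 2/4, recall = 2/3
```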

It’s easy to see that these two numbers are in competition with each other.

If you start reading out the whole dictionary, you will end up with a perfect recall at the price of a very small precision:

- **precision** = 3/10,000
- **recall** = 3/3

On the other hand, spending a lot of time investigating a single word can result in a high precision at the price of a small recall:

- **precision** = 1/1
- **recall** = 1/3

We have two numbers, precision and recall, and we need to choose a strategy to play. So we need some way to merge them into a single value, rank the methods and choose the best one.

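The F1 score is a standard way to mix the two numbers into a single score. It is the harmonic mean of precision and recall:

$$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$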

Let’s compute the F1 score for the three proposed solutions:

- **your** method scores **F1 = 0.57**
- the **dictionary** method scores **F1 = 0.0006**
- the **investigator** method scores **F1 = 0.5**

So the best score belongs to your method!
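The three scores can be reproduced with a short sketch that takes F1 as the harmonic mean of precision and recall, 2 · P · R / (P + R). The helper `f1_score` and the `strategies` dictionary are illustrative names, and the precision and recall values are the ones from the three scenarios in the text.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) for each strategy described in the text
strategies = {
    "your": (2 / 4, 2 / 3),
    "dictionary": (3 / 10_000, 3 / 3),
    "investigator": (1 / 1, 1 / 3),
}

scores = {name: f1_score(p, r) for name, (p, r) in strategies.items()}
for name, score in scores.items():
    print(f"{name}: F1 = {score:.4f}")

# Rank the strategies by their single merged score
best = max(scores, key=scores.get)
print("best strategy:", best)
```

Because the harmonic mean is dragged down by the smaller of the two values, a strategy cannot reach a high F1 by maximizing only one of precision or recall, which is why the dictionary method scores so poorly.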