Information Retrieval Evaluation Metrics

We define two common information retrieval (IR) metrics: MAP@K and NDCG.

MAP@K

Mean Average Precision at K (MAP@K): are the predicted items relevant? Are the most relevant items on top?

MAP of a set of queries = $\frac{\sum_{q=1}^{Q}AP(q)}{Q}$
$\text{P@K} = \frac{\text{number of relevant items in top K results}}{K}$
$\text{AP@K} = \frac{1}{\text{GTP}} \sum_{k=1}^{N} \text{P@K} \times \text{rel}@K$
- GTP: total number of ground truth positives
- rel@K: indicator function which equals 1 if document at rank $k$ is relevant, 0 otherwise

_config.yml

If all the relevant documents are returned at the front, then AP is a perfect 1.0. Hence, $AP$ metric penalize models that are not able to return a ranked list with true positives leading the list.

Normalized Discounted Cumulative Gain

Cumulative Gain (CG) = $\sum_{i=1}^{n}r_i$
- for a recommendation set of $n$ documents, $r_i$ is relevance of document $i$
- to inject position aware scoring, we do DCG
Discounted CG (DCG) = $\sum_{i=1}^{n} \frac{r_i}{log_{2}(i+1)}$
- An alternative expression is $\sum_{i=1}^{n} \frac{2^{r_i}-1}{log_{2}(i+1)}$, which penalizes more heavily (vs the above expression), if documents with higher relevance are ranked lower.
- If relevance scores are binary (0 or 1), then the 2 expressions are equal
- Example. Let’s look at 2 rankings. Set-A = [2,3,3,1,2], Set-B = [3,3,2,2,1]
  - $DCG_{A} = \frac{2}{log_{2}(1+1)} + \frac{3}{log_{2}(2+1)} + \frac{3}{log_{2}(3+1)} + \frac{1}{log_{2}(4+1)} + \frac{2}{log_{2}(5+1)} \approx 6.64$
  - $DCG_{B} = \frac{3}{log_{2}(1+1)} + \frac{3}{log_{2}(2+1)} + \frac{2}{log_{2}(3+1)} + \frac{2}{log_{2}(4+1)} + \frac{1}{log_{2}(5+1)} \approx 7.14$
- Con of DCG: it is an absolute score. The more relevant documents that a query has, the higher the DCG score. But each query will have different number of relevant documents. Hence we do NDCG, which bounds the score to 0.0 - 1.0.
Normalized DCG (NDCG) = $\frac{DCG}{iDCG}$
- Compute DCG of recommended/predicted order. Compute DCG of ideal order (iDCG).
- For the above examples, NDCG = $\frac{6.64}{7.14} \approx 0.93$

Written on December 5, 2022