Fairness Metrics - API Reference
Auto-generated documentation for fairness metric classes.
warprec.evaluation.metrics.fairness.biasdisparitybd.BiasDisparityBD
Bases: TopKMetric
Bias Disparity (BD) metric.
This metric measures the relative disparity in bias between the distribution of recommended items and the distribution of items in the training set, aggregated over user and item clusters. It is computed as the relative difference between BiasDisparityBR (bias in recommendations) and BiasDisparityBS (bias in the training set).
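Reading "relative difference" in the usual sense, with the training-set bias as the reference value, the relationship can be written as below. The symbols $u$ (a user cluster) and $c$ (an item cluster) are notation introduced here for illustration, and the choice of BS as the denominator is an assumption rather than something this page states.

$$
\mathrm{BD}(u, c) = \frac{\mathrm{BR}(u, c) - \mathrm{BS}(u, c)}{\mathrm{BS}(u, c)}
$$

Under this reading, a value of 0 means the recommendations reproduce the bias already present in the training data, while positive values mean the recommendations amplify it.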
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `k` | `int` | Cutoff for top-k recommendations (used by BiasDisparityBR). | required |
| `num_items` | `int` | Number of items in the training set. | required |
| `user_cluster` | `Tensor` | Lookup tensor of user clusters. | required |
| `item_cluster` | `Tensor` | Lookup tensor of item clusters. | required |
| `dist_sync_on_step` | `bool` | Whether to synchronize metric state across distributed processes. | `False` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |
Source code in warprec/evaluation/metrics/fairness/biasdisparitybd.py
warprec.evaluation.metrics.fairness.biasdisparitybr.BiasDisparityBR
Bases: TopKMetric
The BiasDisparityBR@K (Bias Disparity - Bias Recommendations) metric.
This metric computes the disparity between the distribution of recommended items and the global item distribution per user cluster, averaged over users in the cluster.
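As a rough illustration of how the `category_sum`, `total_sum` and `PC` accumulators could be combined into a bias score, the sketch below divides each user cluster's share of recommendations per item cluster by the global share of that item cluster. It uses plain Python lists in place of tensors, the function name `bias_from_counts` is hypothetical, and the ratio form is an assumption rather than WarpRec's confirmed implementation.

```python
def bias_from_counts(category_sum, total_sum, pc):
    """Return bias[g][c] = (category_sum[g][c] / total_sum[g]) / pc[c].

    category_sum: recommendation counts per (user cluster, item cluster)
    total_sum:    total recommendations per user cluster
    pc:           global share of items in each item cluster
    """
    bias = []
    for g, row in enumerate(category_sum):
        bias_row = []
        for c, count in enumerate(row):
            if total_sum[g] == 0 or pc[c] == 0:
                bias_row.append(float("nan"))  # undefined without data
            else:
                bias_row.append((count / total_sum[g]) / pc[c])
        bias.append(bias_row)
    return bias


# Toy data: two user clusters, two item clusters.
category_sum = [[8, 2], [5, 5]]
total_sum = [10, 10]
pc = [0.5, 0.5]
bias = bias_from_counts(category_sum, total_sum, pc)
```

In this toy case the first user cluster receives item cluster 0 more often than its global share would suggest (ratio above 1), while the second user cluster matches the global distribution (ratio of 1).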
Attributes:
| Name | Type | Description |
|---|---|---|
| `user_clusters` | `Tensor` | Tensor mapping each user to a user cluster. |
| `item_clusters` | `Tensor` | Tensor mapping each item to an item cluster. |
| `PC` | `Tensor` | Global distribution of items across item clusters. |
| `category_sum` | `Tensor` | Accumulator tensor counting recommended items per user-item cluster pair. |
| `total_sum` | `Tensor` | Accumulator tensor counting total recommendations per user cluster. |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `k` | `int` | Cutoff for top-k recommendations. | required |
| `num_items` | `int` | Number of items in the training set. | required |
| `user_cluster` | `Tensor` | Lookup tensor of user clusters. | required |
| `item_cluster` | `Tensor` | Lookup tensor of item clusters. | required |
| `dist_sync_on_step` | `bool` | Whether to synchronize metric state across distributed processes. | `False` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |
Source code in warprec/evaluation/metrics/fairness/biasdisparitybr.py
warprec.evaluation.metrics.fairness.biasdisparitybs.BiasDisparityBS
Bases: BaseMetric
BiasDisparityBS measures the bias present in the training set across user and item clusters.
This metric quantifies how the distribution of items that users interacted with in the training set deviates from the global item distribution within each user cluster. It helps to identify whether certain user groups already interact disproportionately with specific item categories, relative to the global item distribution, before any recommendations are made.
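The accumulators described in the Attributes table below can be pictured as simple counters over positive interactions. The following sketch shows one way such counts could be built from a list of `(user, item)` pairs. It uses plain Python lists instead of tensors, and the function name `accumulate_counts` and the toy data are hypothetical.

```python
def accumulate_counts(interactions, user_cluster, item_cluster,
                      n_user_clusters, n_item_clusters):
    """Count positive interactions per (user cluster, item cluster) pair."""
    category_sum = [[0] * n_item_clusters for _ in range(n_user_clusters)]
    total_sum = [0] * n_user_clusters
    for user, item in interactions:
        g = user_cluster[user]   # cluster of the interacting user
        c = item_cluster[item]   # cluster of the item interacted with
        category_sum[g][c] += 1
        total_sum[g] += 1
    return category_sum, total_sum


user_cluster = [0, 0, 1]      # user index -> user cluster
item_cluster = [0, 1, 1, 0]   # item index -> item cluster
interactions = [(0, 0), (0, 1), (1, 0), (2, 2), (2, 3)]
category_sum, total_sum = accumulate_counts(
    interactions, user_cluster, item_cluster, 2, 2
)
```

Dividing each row of `category_sum` by the matching entry of `total_sum` gives the per-cluster interaction distribution that is then compared against the global distribution `PC`.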
Attributes:
| Name | Type | Description |
|---|---|---|
| `user_clusters` | `Tensor` | Tensor mapping each user to a user cluster. |
| `item_clusters` | `Tensor` | Tensor mapping each item to an item cluster. |
| `PC` | `Tensor` | Global distribution of items across item clusters. |
| `category_sum` | `Tensor` | Accumulated counts of positive interactions per user-item cluster pair. |
| `total_sum` | `Tensor` | Accumulated counts of positive interactions per user cluster. |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `num_items` | `int` | Number of items in the training set. | required |
| `user_cluster` | `Tensor` | Lookup tensor of user clusters. | required |
| `item_cluster` | `Tensor` | Lookup tensor of item clusters. | required |
| `dist_sync_on_step` | `bool` | Whether to synchronize metric state across distributed processes. | `False` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |
Source code in warprec/evaluation/metrics/fairness/biasdisparitybs.py
warprec.evaluation.metrics.fairness.itemmadranking.ItemMADRanking
Bases: TopKMetric
Item MAD Ranking (ItemMADRanking) metric.
This metric measures the disparity in item exposure across different item clusters in the top-k recommendations, by computing the Mean Absolute Deviation (MAD) of the average discounted relevance scores per cluster. The goal is to evaluate whether some item clusters receive consistently higher or lower exposure than others.
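The computation described above can be sketched in two steps: accumulate a count and a discounted gain per item, then compare cluster-level averages. The code below is a minimal sketch in plain Python, not WarpRec's implementation. The logarithmic discount, the pairwise form of the MAD, and all function names are assumptions made for illustration.

```python
import math
from itertools import combinations


def accumulate(rec_lists, relevance, num_items):
    """Count recommendations and sum discounted relevance per item."""
    item_counts = [0] * num_items
    item_gains = [0.0] * num_items
    for user, items in enumerate(rec_lists):
        for rank, item in enumerate(items, start=1):
            item_counts[item] += 1
            # Assumed discount: relevance / log2(rank + 1).
            item_gains[item] += relevance[user][item] / math.log2(rank + 1)
    return item_counts, item_gains


def pairwise_mad(values):
    """Mean absolute difference over every pair of values."""
    pairs = list(combinations(values, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs) if pairs else 0.0


def item_mad(item_counts, item_gains, item_cluster):
    """MAD of the per-cluster mean of each item's average gain."""
    per_cluster = {}
    for count, gain, c in zip(item_counts, item_gains, item_cluster):
        if count > 0:  # ignore items that were never recommended
            per_cluster.setdefault(c, []).append(gain / count)
    means = [sum(v) / len(v) for v in per_cluster.values()]
    return pairwise_mad(means)


rec_lists = [[0, 2], [1, 2]]         # top-2 lists for two users
relevance = [[1, 0, 1], [0, 1, 0]]   # relevance[user][item]
item_cluster = [0, 0, 1]             # item index -> item cluster
counts, gains = accumulate(rec_lists, relevance, num_items=3)
mad = item_mad(counts, gains, item_cluster)
```

A value of 0 would mean every item cluster receives the same average discounted gain. Larger values indicate that some clusters are systematically ranked higher or matched to more relevant users than others.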
Attributes:
| Name | Type | Description |
|---|---|---|
| `num_items` | `int` | Number of items in the training set. |
| `item_clusters` | `Tensor` | Tensor mapping each item to an item cluster. |
| `item_counts` | `Tensor` | Tensor counting how many times each item was recommended. |
| `item_gains` | `Tensor` | Tensor of accumulated gains for each recommended item. |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `k` | `int` | Cutoff for top-k recommendations. | required |
| `num_items` | `int` | Number of items in the training set. | required |
| `item_cluster` | `Tensor` | Lookup tensor of item clusters. | required |
| `dist_sync_on_step` | `bool` | Whether to synchronize metric state across distributed processes. | `False` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |
Source code in warprec/evaluation/metrics/fairness/itemmadranking.py
warprec.evaluation.metrics.fairness.itemmadrating.ItemMADRating
Bases: TopKMetric
Item MAD Rating (ItemMADRating) metric.
This metric measures the disparity in the average rating received by items across different item clusters, considering only the items that were recommended and were relevant to the user. It computes the Mean Absolute Deviation (MAD) of the average rating per item cluster. The goal is to evaluate whether some item clusters receive consistently higher or lower average ratings when they are successfully recommended (i.e., recommended to a user for whom they are relevant).
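The key difference from a plain exposure count is the filter: an item contributes only when it is both recommended and relevant. The sketch below illustrates that filter and the resulting per-cluster average rating using plain Python containers. The function names and the use of a per-user dictionary of ground-truth ratings are hypothetical choices for this example.

```python
def accumulate_relevant(rec_lists, ratings, num_items):
    """Accumulate counts and ratings only for recommended, relevant items."""
    item_counts = [0] * num_items
    item_gains = [0.0] * num_items
    for user, items in enumerate(rec_lists):
        for item in items:
            rating = ratings[user].get(item)
            if rating is not None:  # recommended and relevant
                item_counts[item] += 1
                item_gains[item] += rating
    return item_counts, item_gains


def cluster_average_rating(item_counts, item_gains, item_cluster):
    """Average, per cluster, of each item's mean rating."""
    per_cluster = {}
    for count, gain, c in zip(item_counts, item_gains, item_cluster):
        if count > 0:
            per_cluster.setdefault(c, []).append(gain / count)
    return {c: sum(v) / len(v) for c, v in per_cluster.items()}


rec_lists = [[0, 1], [1, 2]]                     # top-2 lists for two users
ratings = [{0: 5.0, 2: 3.0}, {1: 4.0, 2: 2.0}]   # relevant items per user
item_cluster = [0, 0, 1]
item_counts, item_gains = accumulate_relevant(rec_lists, ratings, 3)
cluster_avg = cluster_average_rating(item_counts, item_gains, item_cluster)
```

Item 1 is recommended to the first user but is not relevant to them, so it is counted only once (for the second user). The MAD is then taken over the values of `cluster_avg`.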
Attributes:
| Name | Type | Description |
|---|---|---|
| `num_items` | `int` | Number of items in the training set. |
| `item_clusters` | `Tensor` | Tensor mapping each item to an item cluster. |
| `item_counts` | `Tensor` | Tensor counting how many times each item was both recommended and relevant. |
| `item_gains` | `Tensor` | Tensor of summed ratings (relevance) for each item that was both recommended and relevant. |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `k` | `int` | Cutoff for top-k recommendations. | required |
| `num_items` | `int` | Number of items in the training set. | required |
| `item_cluster` | `Tensor` | Lookup tensor of item clusters. | required |
| `dist_sync_on_step` | `bool` | Whether to synchronize metric state across distributed processes. | `False` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |
Source code in warprec/evaluation/metrics/fairness/itemmadrating.py
warprec.evaluation.metrics.fairness.reo.REO
Bases: TopKMetric
Ranking-based Equal Opportunity (REO) metric.
This metric evaluates the fairness of a recommender system by comparing the proportion of recommended items from different item clusters (or groups) among the relevant items in the ground truth. It calculates the standard deviation of these proportions divided by their mean, providing a measure of how equally the system recommends relevant items across different groups.
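The "standard deviation divided by mean" step is a coefficient of variation over per-cluster hit proportions. The sketch below illustrates it with the standard library. The function name `reo_score` is hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption, since this page does not say which estimator is used.

```python
from statistics import mean, pstdev


def reo_score(cluster_recommendations, cluster_total_items):
    """Std / mean of per-cluster shares of relevant items that were recommended."""
    props = [
        rec / total
        for rec, total in zip(cluster_recommendations, cluster_total_items)
        if total > 0  # skip clusters with no relevant items
    ]
    if not props:
        return 0.0
    mu = mean(props)
    return pstdev(props) / mu if mu > 0 else 0.0


# Cluster 0: 30 of 50 relevant items recommended; cluster 1: 10 of 50.
unequal = reo_score([30, 10], [50, 50])
# Both clusters have half of their relevant items recommended.
equal = reo_score([10, 20], [20, 40])
```

Lower values indicate more equal treatment: when every cluster has the same share of its relevant items recommended, the score is 0.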
Attributes:
| Name | Type | Description |
|---|---|---|
| `item_clusters` | `Tensor` | A tensor mapping item index to its cluster ID. |
| `cluster_recommendations` | `Tensor` | Accumulator for the total count of relevant recommended items per cluster. |
| `cluster_total_items` | `Tensor` | Accumulator for the total count of relevant items per cluster in the ground truth. |
| `n_effective_clusters` | `int` | The number of unique item clusters, excluding the fallback cluster. |
| `n_item_clusters` | `int` | The total number of unique item clusters, including the fallback cluster. |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `k` | `int` | Cutoff for top-k recommendations. | required |
| `*args` | `Any` | The argument list. | `()` |
| `item_cluster` | `Tensor` | Lookup tensor of item clusters. | `None` |
| `dist_sync_on_step` | `bool` | Whether to synchronize metric state across distributed processes. | `False` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |
Source code in warprec/evaluation/metrics/fairness/reo.py
warprec.evaluation.metrics.fairness.rsp.RSP
Bases: TopKMetric
Ranking-based Statistical Parity (RSP) metric.
This metric evaluates the fairness of a recommender system by comparing the proportion of recommended items from different item clusters (or groups) out of the pool of items not seen during training. It calculates the standard deviation of these proportions divided by their mean, providing a measure of how equally the system recommends items across different groups, regardless of relevance in the test set.
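One way to obtain the pool of "items not seen during training" per cluster is to note that an item with `n` training interactions is unseen by `num_users - n` users, provided each user interacts with an item at most once. The sketch below uses that assumption to derive the denominators and then applies the standard-deviation-over-mean step. All function names are hypothetical, and the population standard deviation is an assumed choice.

```python
from statistics import mean, pstdev


def rsp_denominators(num_users, item_interactions, item_cluster, n_clusters):
    """Per cluster, count (user, item) pairs absent from the training set."""
    denom = [0] * n_clusters
    for count, c in zip(item_interactions, item_cluster):
        # Assumes each of the `count` interactions comes from a distinct user.
        denom[c] += num_users - count
    return denom


def rsp_score(cluster_recommendations, denominator_counts):
    """Std / mean of per-cluster recommendation proportions."""
    props = [
        rec / den
        for rec, den in zip(cluster_recommendations, denominator_counts)
        if den > 0
    ]
    if not props:
        return 0.0
    mu = mean(props)
    return pstdev(props) / mu if mu > 0 else 0.0


item_interactions = [3, 1, 0]   # training interactions per item
item_cluster = [0, 0, 1]        # item index -> item cluster
denom = rsp_denominators(4, item_interactions, item_cluster, 2)
score = rsp_score([2, 1], denom)
```

Unlike REO, nothing here depends on test-set relevance: the proportions describe how often each cluster is recommended relative to how often it could have been.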
Attributes:
| Name | Type | Description |
|---|---|---|
| `item_clusters` | `Tensor` | A tensor mapping item index to its cluster ID. |
| `cluster_recommendations` | `Tensor` | Accumulator for the total count of recommended items per cluster in the top-k. |
| `denominator_counts` | `Tensor` | Pre-calculated total count of items per cluster not in the training set across all users. |
| `n_effective_clusters` | `int` | The number of unique item clusters, excluding the fallback cluster. |
| `n_item_clusters` | `int` | The total number of unique item clusters, including the fallback cluster. |
| `user_interactions` | `Tensor` | Accumulator for counting how many times each user has been evaluated. |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `k` | `int` | Cutoff for top-k recommendations. | required |
| `num_users` | `int` | Number of users in the training set. | required |
| `item_interactions` | `Tensor` | Tensor containing counts of item interactions in the training set. | required |
| `item_cluster` | `Tensor` | Lookup tensor of item clusters. | `None` |
| `dist_sync_on_step` | `bool` | Whether to synchronize metric state across distributed processes. | `False` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |
Source code in warprec/evaluation/metrics/fairness/rsp.py
warprec.evaluation.metrics.fairness.usermadranking.UserMADRanking
Bases: UserAverageTopKMetric
User MAD Ranking (UserMADRanking) metric.
This metric measures the disparity in user exposure across different user clusters in the top-k recommendations, by computing the Mean Absolute Deviation (MAD) of the average per-user nDCG scores per cluster. The MAD is computed as the mean of absolute differences between every pair of cluster-level averages.
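The pairwise MAD described above can be illustrated with a few per-user nDCG values. The sketch below takes the per-user scores as given (it does not compute nDCG itself), averages them per user cluster, and then averages the absolute differences over every pair of clusters. The function name `user_mad` and the toy values are hypothetical.

```python
from itertools import combinations


def user_mad(user_scores, user_cluster):
    """Mean absolute difference between every pair of cluster-level means."""
    totals, counts = {}, {}
    for score, c in zip(user_scores, user_cluster):
        totals[c] = totals.get(c, 0.0) + score
        counts[c] = counts.get(c, 0) + 1
    means = [totals[c] / counts[c] for c in sorted(totals)]
    pairs = list(combinations(means, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs) if pairs else 0.0


ndcg = [0.9, 0.7, 0.4, 0.2, 0.6]   # per-user nDCG@k, assumed precomputed
clusters = [0, 0, 1, 1, 2]         # user index -> user cluster
mad = user_mad(ndcg, clusters)
```

Here the cluster means are 0.8, 0.3 and 0.6, so the three pairwise gaps are 0.5, 0.2 and 0.3, and their mean is one third.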
Attributes:
| Name | Type | Description |
|---|---|---|
| `user_clusters` | `Tensor` | Tensor mapping each user to a user cluster. |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `k` | `int` | Cutoff for top-k recommendations. | required |
| `num_users` | `int` | Number of users in the training set. | required |
| `user_cluster` | `Tensor` | Lookup tensor of user clusters. | required |
| `dist_sync_on_step` | `bool` | Whether to synchronize metric state across distributed processes. | `False` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |
Source code in warprec/evaluation/metrics/fairness/usermadranking.py
warprec.evaluation.metrics.fairness.usermadrating.UserMADRating
Bases: UserAverageTopKMetric
User MAD Rating (UserMADRating) metric.
This metric measures the disparity in the average rating/score received by users across different user clusters, considering the average rating of their top-k recommended items. For each user it takes the average rating of the top-k items, averages these per-user values within each user cluster, and computes the Mean Absolute Deviation (MAD) of the resulting cluster-level averages. The MAD is computed as the mean of absolute differences between every pair of cluster-level averages.
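The per-user step that distinguishes this metric is the average over each user's top-k values. The sketch below assumes those values are the scores attached to each user's candidate items (this page does not specify whether they are predicted scores or ground-truth ratings), and the function names are hypothetical.

```python
from itertools import combinations


def mean_topk(scores, k):
    """Average of the k highest scores for one user."""
    top = sorted(scores, reverse=True)[:k]
    return sum(top) / len(top)


def rating_mad(score_rows, user_cluster, k):
    """Pairwise MAD of cluster-level means of per-user top-k averages."""
    per_cluster = {}
    for scores, c in zip(score_rows, user_cluster):
        per_cluster.setdefault(c, []).append(mean_topk(scores, k))
    means = [sum(v) / len(v) for v in per_cluster.values()]
    pairs = list(combinations(means, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs) if pairs else 0.0


score_rows = [[5, 3, 1, 4], [2, 2, 2, 2], [4, 1, 0, 3]]  # one row per user
user_cluster = [0, 1, 1]                                 # user -> cluster
mad = rating_mad(score_rows, user_cluster, k=2)
```

With `k=2` the per-user averages are 4.5, 2.0 and 3.5, giving cluster means of 4.5 and 2.75 and a MAD of 1.75.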
Attributes:
| Name | Type | Description |
|---|---|---|
| `user_clusters` | `Tensor` | Tensor mapping each user to a user cluster. |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `k` | `int` | Cutoff for top-k recommendations. | required |
| `num_users` | `int` | Number of users in the training set. | required |
| `user_cluster` | `Tensor` | Lookup tensor of user clusters. | required |
| `dist_sync_on_step` | `bool` | Whether to synchronize metric state across distributed processes. | `False` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |