this project is about Shapley value calculation in Java (and Kotlin) introduced by Lloyd Shapley. To have a complete view you can have a look to the wikipedia article https://en.wikipedia.org/wiki/Shapley_value.
In a cooperative game, some players have the possibility to forge coalitions to reach a common goal : increase a revenue, or share a cost.
A difficulty in cooperation game theory is the fair distribution of profits among the players. The Shapley value provides on possible answer to this question.
First a little joke :
- The great question for the father of Lloyd Shapley, Harlow Shapley was the size of the universe (see https://en.wikipedia.org/wiki/Great_Debate_(astronomy))
- The great question for Lloyd Shapley was how to share a taxi ...
Only the son received the Nobel price (in economics, in 2012)
The formula are done with the web site https://www.codecogs.com/latex/eqneditor.php
- N is the set containing all the players
- S is a subset of N
- Characteristic function is writen down 'v()'
- The Shapley value is the Greek letter phi ()
To be more complete the Shapley value for the player i is writen like that
In a cooperative game
- with n players N={1,2,..., n}
- a coalition is a subset of N
The characteristic function associates for each coalition a real number. With this function, we can know the revenue (or the cost) for the a coalition S : v(S). The value of v({}) (empty set) is 0, because if there is no player there is no cost or revenue.
The Shapley value is based on four axioms :
- Efficiency
The sum of the Shapley values of all participants are equals to the value generated by the grand coalition.
- Null player
The Shapley value of a null player i is null. A null player is a player who never contributes.
- Symmetry
if i and j are two equivalent actors, they will have the same Shapley value.
- Additivity
If you bring more, you get more
To respect these 4 axioms there is only one solution, one way to calculate the Shapley value. The calculation is the average of the marginal contribution.
How to share a taxi travel ? Amy, Bob and Clare are sharing a taxi. We imagine they are going to the same direction.
- Amy must pay 6 to go home
- Bob must pay 12 to go home
- Clare must pay 42 to go home
If they share the taxi, the Shapley value will give :
- Amy must pay 2
- Bob must pay 5
- Clare must pay 35
- The total 2+5+35 = 42
With this example, we can see that Shapley value is for everybody efficient. Because Amy, Bob, and Clare pay less when they share than when they travel alone but the sum of the three values are enough to pay the bill.
Here is several examples with a few participants. You can imagine that with big groups the calculation becomes very difficult.
characteristic function :
shapley value
@Test
public void testCalculateOneParticipant() {
CharacteristicFunction cfunction =
new CharacteristicFunction.CharacteristicFunctionBuilder(1)
.addCoalition(1.0, 1).build();
ShapleyValue s = new ShapleyValue(cfunction);
s.calculate();
Map<Integer,Double> output =s.getResult();
double phi1 = output.get(1);
assertEquals(phi1, 1.0, 0.01);
}
Order | marginal contribution 1 | marginal contribution 2 |
---|---|---|
1 2 | v({1})=1 | v({1,2})-v({1}) =3 |
2 1 | v({1,2})-v({2})=2 | v({2})= 2 |
@Test
public void testCalculateTwoParticipants() {
CharacteristicFunction cfunction =
new CharacteristicFunction.CharacteristicFunctionBuilder(2)
.addCoalition(1.0, 1)
.addCoalition(2.0, 2)
.addCoalition(4.0, 1, 2).build();
ShapleyValue s = new ShapleyValue(cfunction);
s.calculate();
Map<Integer,Double> output =s.getResult();
double phi1 = output.get(1);
double phi2 = output.get(2);
assertEquals(phi1, 1.5, 0.01);
assertEquals(phi2, 2.5, 0.01);
}
- v({1}) = 80
- v({2}) = 56
- v({3}) = 70
- v({1,2}) = 80
- v({1,3}) = 85
- v({2,3}) = 72
- v({1,2,3}) = 90
@Test
public void testCalculateThreeParticipants() {
CharacteristicFunction cfunction =
new CharacteristicFunction.CharacteristicFunctionBuilder(3)
.addCoalition(80.0, 1)
.addCoalition(56.0, 2)
.addCoalition(70.0, 3)
.addCoalition(80.0, 1, 2)
.addCoalition(85.0, 1, 3)
.addCoalition(72.0, 2, 3)
.addCoalition(90.0, 1, 2, 3).build();
ShapleyValue s = new ShapleyValue(cfunction);
s.calculate();
Map<Integer,Double> output =s.getResult();
double v1 = output.get(1);
double v2 = output.get(2);
double v3 = output.get(3);
assertEquals(v1, 39.2, 0.1);
assertEquals(v2, 20.7, 0.1);
assertEquals(v3, 30.2, 0.1);
}
In the glove game the players have a left or a right glove. The goal is to form a pair. A coalition with at least one left glove and one right glove wins (v(S)=1) otherwise loses (v(S)=0).
We have three players R1, R2, L3 with R1 and R2 having a right glove and L3 have a left glove.
Using the symmetrie axiom we expect that the Shapley value is the same for R1 and R2. We expect also that L3 receives more than R1 and R2 because L3 is mandatory if we want to win the game whereas R1 or R2 are not mandatory to win the game. R1 and R2 participate less than L3.
S | V(S) |
---|---|
{} | 0 |
{R1} | 0 |
{R2} | 0 |
{L3} | 0 |
{R1,R2} | 0 |
{R1,L3} | 1 |
{R2,L3} | 1 |
{R1,R2,L3} | 1 |
Order | marginal contribution R1 | marginal contribution R2 | marginal contribution L3 |
---|---|---|---|
R1 R2 L3 | 0 | 0 | 1 |
R1 L3 R2 | 0 | 0 | 1 |
R2 R1 L3 | 0 | 0 | 1 |
R2 L3 R1 | 0 | 0 | 1 |
L3 R1 R2 | 1 | 0 | 0 |
L3 R2 R1 | 0 | 1 | 0 |
@Test
public void testEvaluationThreePlayersOne() {
GloveGameApplication evaluation =
new GloveGameApplication.GloveGameApplicationBuilder()
.addPlayer("Adam1", Hand.LEFT)
.addPlayer("Adam2", Hand.LEFT)
.addPlayer("Lea1", Hand.RIGHT)
.build();
Map<String,Double> output = evaluation.calculate();
double phiAdam1= output.get("Adam1");
double phiAdam2= output.get("Adam2");
double phiLea1 = output.get("Lea1");
assertEquals(phiAdam1, 0.16, 0.01);
assertEquals(phiAdam2, 0.16, 0.01);
assertEquals(phiLea1, 0.67, 0.01);
}
This example is : Amy, Bob and Clare are sharing a taxi. We imagine they are going to the same direction.
- Amy must pay 6 to go home
- Bob must pay 12 to go home
- Clare must pay 42 to go home
S | V(S) |
---|---|
{} | 0 |
{A} | 6 |
{B} | 12 |
{C} | 42 |
{A,B} | 12 |
{A,C} | 42 |
{B,C} | 42 |
{A,B,C} | 42 |
Order | marginal contribution A | marginal contribution B | marginal contribution C |
---|---|---|---|
A B C | 6 | 6 | 30 |
A C B | 6 | 0 | 36 |
B A C | 0 | 12 | 30 |
B C A | 0 | 12 | 30 |
C A B | 0 | 0 | 42 |
C B A | 0 | 0 | 42 |
@Test
public void testCalculateThreeParticipants() {
TaxiApplication taxiApplication =
new TaxiApplication.TaxiApplicationBuilder()
.addUser(6.0, "A")
.addUser(12.0, "B")
.addUser(42.0, "C")
.build();
Map<String,Double> output = taxiApplication.calculate();
double phiA = output.get("A");
double phiB = output.get("B");
double phiC = output.get("C");
assertEquals(phiA, 2.0, 0.01);
assertEquals(phiB, 5.0, 0.01);
assertEquals(phiC, 35.0, 0.01);
}
Example: There are four fraudulent transactions T1, T2, T3, T4. There are four rules trying to detect the fraud events.
- Rule1 detects T1, T2, T3
- Rule2 detects T1, T2, T3
- Rule3 detects T1, T2, T3
- Rule4 detects T4
The Shapley value evaluates the contribution of each rules (the sum will be normalized to 1) The Rules4 detects 1/4 of the event (alone) so we can expect phiRule4=0.25 The rules 1, 2, 3 detect the same events and should have the same Shapley value. phiRule1=phiRule2=phiRule2=0.25
@Test
public void testEvaluationFourRules() {
FraudRuleApplication evaluation =
new FraudRuleApplication.FraudRuleApplicationBuilder()
.addRule("Rule1", 1,2,3)
.addRule("Rule2", 1,2,3)
.addRule("Rule3", 1,2,3)
.addRule("Rule4", 4)
.build();
Map<String,Double> output = evaluation.calculate();
double phiRule1 = output.get("Rule1");
double phiRule2 = output.get("Rule2");
double phiRule3 = output.get("Rule3");
double phiRule4 = output.get("Rule4");
assertEquals(phiRule1, 0.25, 0.01);
assertEquals(phiRule2, 0.25, 0.01);
assertEquals(phiRule3, 0.25, 0.01);
assertEquals(phiRule4, 0.25, 0.01);
}
In fact to measure the quality of the fraud rules we can used the metric "f1 score" based on the "recall" and the "precision". These three metrics f1-score, recall and precision are between 0 and 1.
A bad precision shows us that the rule makes mistake some genuine transactions are considered as fraudulent. A bad recall shows us that the rule does not find out all the fraudulent transactions.
The fscore combines the recall and the precision following this formula :
Please see the article in Wikipedia https://en.wikipedia.org/wiki/F1_score for more information.
two rules, two transactions
FraudRuleV2Application evaluation =
new FraudRuleV2Application.FraudRuleV2ApplicationBuilder()
.addRule(new RuledTransaction("1,1,0")) //T1 is fraudulent, R1 considers it as fraudulent
.addRule(new RuledTransaction("1,0,1")) //T2 is fraudulent, R2 considers it as fraudulent
.build();
It means that:
- T1 is fraudulent and the rules R1 considers it as fraudulent but R2 does not consider it as fraudulent (false negative).
- T2 is fraudulent and the rules R1 does not consider it as fraudulent but R2 considers it as fraudulent.
fraudulent | Rule1 | Rule2 | Rule1 and Rule2 | |
---|---|---|---|---|
T1 | fraudulent | 1 | 0 | 1 |
T2 | fradulent | 0 | 1 | 1 |
for the two rules the precision is 1 (there is no false positive) the recall is 1/2=0.5 (only one fraudulent transaction is found whereas there is two fraudulent transasactions)
if we conbine the two rules (with the OR operator) then the two fraudulent transaction are found and recall, precision and f1score are equals to 1.
S | V(S) |
---|---|
{} | 0 |
{R1} | 0.667 |
{R2} | 0.667 |
{R1 R2} | 1 |
Permutation | marginal contribution R1 | marginal contribution R2 |
---|---|---|
1 2 | v({1})=0.667 | v({1,2})-v({1}) =0.133 |
2 1 | v({1,2})-v({2})= 0.133 | v({2})= 0.667 |
Shapley Value | (0.667+0.133)/2 =0.5 | (0.667+0.133)/2 =0.5 |
We can see the symetry because the contribution of R1 and R2 are the same and their Shapley value are the same.
FraudRuleV2Application evaluation =
new FraudRuleV2Application.FraudRuleV2ApplicationBuilder()
.addRule(new RuledTransaction("1,1,0,1")) //T1 is fraudulent
.addRule(new RuledTransaction("1,1,1,0")) //T2 is fraudulent
.addRule(new RuledTransaction("0,0,1,1")) //T3 is genuine
.build();
S | Precision | Recall | V(S) |
---|---|---|---|
{} | 0 | ||
{R1} | 1 | 1 | 1 |
{R2} | 0.5 | 0.5 | 0.5 |
{R3} | 0.5 | 0.5 | 0.5 |
{R1 R2} | 2/3 | 1 | 0.8 |
{R1 R3} | 2/3 | 1 | 0.8 |
{R2 R3} | 2/3 | 1 | 0.8 |
{R1 R2 R3} | 2/3 | 1 | 0.8 |
Permutation | marginal contribution R1 | marginal contribution R2 | marginal contribution R3 |
---|---|---|---|
1 2 3 | v({R1})=1 | v({R1,R2})-v({R1})= -0.2 | v({R1,R2,R3})-v({R1,R2}) =0 |
1 3 2 | v({R1})=1 | v({R1,R2,R3})-v({R1,R3}) =0 | v({R1,R3})-v({R1})= -0.2 |
2 1 3 | v({R1,R2})-v({R2})= 0.3 | v({R2})= 0.5 | v({R1,R2,R3})-v({R1,R2}) =0 |
2 3 1 | v({R1,R2,R3})-v({R2,R3}) =0 | v({R2})= 0.5 | v({R2,R3})-v({R2})= 0.3 |
3 1 2 | v({R1,R3})-v({R3})= 0.3 | v({R1,R2,R3})-v({R1,R3}) =0 | v({R3})= 0.5 |
3 2 1 | v({R1,R2,R3})-v({R2,R3}) =0 | v({R2,R3})-v({R3})= 0.3 | v({R3})= 0.5 |
Shapley Value | (1+1+0.3+0+0.3+0)/6 = 0.433 | (-0.2+0+0.5+0.5+0+0.3)/6 = 0.183 | (0-0.2+0+0.3+0.5+0.5)/6 = 0.183 |
We can see the symetry because the contribution of R2 and R are the same and their Shapley value are the same.
FraudRuleV2Application evaluation =
new FraudRuleV2Application.FraudRuleV2ApplicationBuilder()
.addRule(new RuledTransaction("1,1,1,1,0")) //T1 is fraudulent
.addRule(new RuledTransaction("1,1,1,1,0")) //T2 is fraudulent
.addRule(new RuledTransaction("1,1,1,1,0")) //T3 is fraudulent
.addRule(new RuledTransaction("1,0,0,0,1")) //T4 is fraudulent
.addRule(new RuledTransaction("0,1,1,1,0")) //T5 is genuine
.build();
S | Precision | Recall | V(S) |
---|---|---|---|
{} | 0 | ||
{R1} | 3/4 | 3/4 | 3/4 |
{R2} | 3/4 | 3/4 | 3/4 |
{R3} | 3/4 | 3/4 | 3/4 |
{R4} | 1 | 1/4 | 2/5 |
{R1 R2} | 3/4 | 3/4 | 3/4 |
{R1 R3} | 3/4 | 3/4 | 3/4 |
{R1 R4} | 4/5 | 1 | 8/9 |
{R2 R3} | 3/4 | 3/4 | 3/4 |
{R2 R4} | 4/5 | 1 | 8/9 |
{R3 R4} | 4/5 | 1 | 8/9 |
{R1 R2 R3} | 3/4 | 3/4 | 3/4 |
{R1 R2 R4} | 4/5 | 1 | 8/9 |
{R2 R3 R4} | 4/5 | 1 | 8/9 |
{R1 R2 R3 R4} | 4/5 | 1 | 8/9 |
Permutation | m cont R1 | m cont R2 | m cont R3 | m cont R4 |
---|---|---|---|---|
1 2 3 4 | 3/4 | 0 | 0 | 5/36 |
1 2 4 3 | 3/4 | 0 | 0 | 5/36 |
1 3 2 4 | 3/4 | 0 | 0 | 5/36 |
1 3 4 2 | 3/4 | 0 | 0 | 5/36 |
1 4 2 3 | 3/4 | 0 | 0 | 5/36 |
1 4 3 2 | 3/4 | 0 | 0 | 5/36 |
------------- | ---------- | ---------- | ---------- | --------- |
2 1 3 4 | 0 | 3/4 | 0 | 5/36 |
2 1 4 3 | 0 | 3/4 | 0 | 5/36 |
2 3 1 4 | 0 | 3/4 | 0 | 5/36 |
2 3 4 1 | 0 | 3/4 | 0 | 5/36 |
2 4 1 3 | 0 | 3/4 | 0 | 5/36 |
2 4 3 1 | 0 | 3/4 | 0 | 5/36 |
------------- | ---------- | ---------- | ---------- | --------- |
3 1 2 4 | 0 | 0 | 3/4 | 5/36 |
3 1 4 2 | 0 | 0 | 3/4 | 5/36 |
3 2 1 4 | 0 | 0 | 3/4 | 5/36 |
3 2 4 1 | 0 | 0 | 3/4 | 5/36 |
3 4 1 2 | 0 | 0 | 3/4 | 5/36 |
3 4 2 1 | 0 | 0 | 3/4 | 5/36 |
------------- | ---------- | ---------- | ---------- | --------- |
4 1 3 2 | 22/45 | 0 | 0 | 2/5 |
4 1 2 3 | 22/45 | 0 | 0 | 2/5 |
4 2 1 3 | 0 | 22/45 | 0 | 2/5 |
4 2 3 1 | 0 | 22/45 | 0 | 2/5 |
4 3 1 2 | 0 | 0 | 22/45 | 2/5 |
4 3 2 1 | 0 | 0 | 22/45 | 2/5 |
------------- | ---------- | ---------- | ---------- | --------- |
Shapley Value | 493/2160 | 493/2160 | 493/2160 | 49/240 |
Shapley Value | 0.2282 | 0.2282 | 0.2282 | 0.2042 |
Question: The parliament of Micronesia is made up of four political parties, A, B, C, and D, which have 45, 25, 15, and 15 representatives, respectively. If a coalition has the majority it receives 1, if a coalition has no majority it receives 0.
the solution is
- phiA=0.5
- phiB=1/6
- phiC=1/6
- phiD=1/6 B, C, and D have the same Shapley value, so the same influence.
The Belgium parliament has 150 members and a lot of parties (13 parties, figures of the 18/04/2018)
- NVA 31
- PS 23
- MR 20
- CD&V 18
- openVLD 14
- PSA 13
- EcoloGroen 12
- CDH 9
- VB 3
- Defi 2
- PTB 2
- VW 2
- PP 1
In fact, it becomes very difficult to calculate the Shapley value for this 13 parties. Because the number of permutation possible is huge : factorial(13)= a lot around 6 milliard or 6 billion.
But it is possible to take a smaller set of permutations taken randomly. In this case 500 000 permutations seems to be enough to have a precision of 0.1 percent and takes 30 seconds calculation with a old PC.
With a normalized results to 100, here is the result finally not so far from the proportional result
- NVA 22.9
- PS 15.4
- MR 13.5
- CD&V 12.0
- openVLD 8.8
- PSA 8.2
- EcoloGroen 7.6
- CDH 6.0
- VB 1.6
- Defi 1.1
- PTB 1.1
- VW 1.1
- PP 0.6
The Belgium parliament will have 150 members and a lot of parties (12 parties, figures of the 10/06/2019) Here Ecolo and Groen are considered as two parties. Member per parties
- NVA 25
- PS 20
- VB 18
- MR 14
- Ecolo 13
- CD&V 12
- openVLD 12
- PTB-PDVA 12
- PSA 9
- Groen 8
- CDH 5
- Defi 2
Shapley value (normalized to 100)
- NVA 18,1
- PS 13.7
- VB 12.4
- MR 9.1
- Ecolo 8.5
- CD&V 7.9
- openVLD 7.9
- PTB-PDVA 7.9
- PSA 5.4
- Groen 5.1
- CDH 3.1
- Defi 1.0
With my old pc I can calculate 1 000 000 coalitions in 31,5s. So here is a first evaluation of an exact calculation of the parliament application of the Shapley value following the number of parties
nb parties | exact calculation evaluation |
---|---|
10 | 2 min |
11 | 21 min |
12 | 4 hours |
13 | 2 days |
14 | 32 days |
- how to share a taxi https://www.youtube.com/watch?v=aThG4YAFErw
- course about game theory https://www.youtube.com/watch?v=qcLZMYPdpH4
https://math.stackexchange.com/questions/1310344/calculating-shapley-value-on-voting-game
- full course about game theory https://www.coursera.org/learn/game-theory-1/