## Commentary: What is happening here?
### Setup
Background: We are using the data from [an amplification experiment](https://observablehq.com/@jjj/untitled/2) and cutting the results in a different way
We are taking a prediction (P_initial), and for every prediction (Pn) that follows, we are asking:
- Is (Pn) closer to the last aggregate before it (AGn), or to (P_initial)?
- How much better, or worse, does (Pn) do in comparison to (AGn)? Call this quantity X.
- If (Pn) is closer to (P_initial) than to (AGn), increase or decrease the helpfulness of (P_initial)'s user by X, depending on whether (Pn) does better or worse than (AGn).
- If there are m predictions (P_initial) such that (Pn) is closer to (P_initial) than to (AGn), then instead of X, use X/m.
With this, what we want to answer is: Do predictions influenced by someone do better? Is this person a positive influence?
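Below is a minimal sketch of this rule in Python, not the notebook's actual code: the `Prediction` fields, the `distance` helper, and the choice of measuring X as the reduction in distance to the resolution are all illustrative assumptions.

```python
# A minimal sketch of the rule above, not the notebook's actual code.
# `Prediction`, `distance`, and the way X is measured are illustrative assumptions.
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class Prediction:
    user: str
    cdf: list            # the prediction's cdf, evaluated on a common grid
    aggregate_cdf: list  # the aggregate AGn just before this prediction was made

def distance(cdf_a, cdf_b):
    # Sum of |cdf_a - cdf_b| over a common grid; proportional to Integral(|cdf1 - cdf2|)
    # (see the note on distances below).
    return sum(abs(a - b) for a, b in zip(cdf_a, cdf_b))

def x_score(p, resolution_cdf):
    # How much better (positive) or worse (negative) p does than the aggregate AGn
    # before it; here measured as the reduction in distance to the resolution.
    return distance(p.aggregate_cdf, resolution_cdf) - distance(p.cdf, resolution_cdf)

def helpfulness(predictions, resolution_cdf):
    # predictions: time-ordered list of Prediction objects for one question.
    scores = defaultdict(float)
    for n, p_n in enumerate(predictions):
        x = x_score(p_n, resolution_cdf)
        # Earlier predictions P_initial that P_n is closer to than to AG_n:
        influencers = [
            p_i for p_i in predictions[:n]
            if distance(p_n.cdf, p_i.cdf) < distance(p_n.cdf, p_n.aggregate_cdf)
        ]
        # Credit (or blame) is split evenly among the m influencers: X/m each.
        for p_i in influencers:
            scores[p_i.user] += x / len(influencers)
    return dict(scores)
```

Summing `helpfulness` over all questions would then give per-user totals like the ones discussed below.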
### Comments
- Predicting early and wrong is penalized much more strongly than predicting late and wrong.
- This set-up penalizes users who make a lot of predictions, for the same reason the normal scoring system does: people don't outperform the market. But this isn't the only factor; compare Misha Yagudin and JK, or geesh and holomanga.
- The following users are particularly interesting:
- Reprisal
- Individually, he did pretty badly in:
- [Unbound Prometheus: Europe had a lower birthrate than China [1]](https://www.foretold.io/c/f19015f5-55d8-4fd6-8621-df79ac072e15?state=closed)
- [Unbound Prometheus: The Han Dynasty in China established a state monopoly on salt and metals [1]](https://www.foretold.io/c/f19015f5-55d8-4fd6-8621-df79ac072e15/m/050b48c9-022e-4b53-95ec-a8f371292b93)
- [Unbound Prometheus: pre-Industrial Revolution, average French wage was what percent of the British wage? [2]](https://www.foretold.io/c/f19015f5-55d8-4fd6-8621-df79ac072e15/m/7a2774d8-6b6e-468c-84cd-d91104ebefbb)
- etc.
- But correctly in:
- [Unbound Prometheus: Pre-Industrial Britain had a legal climate more favorable to industrialization than continental Europe [5]](https://www.foretold.io/c/f19015f5-55d8-4fd6-8621-df79ac072e15/m/bbe62da4-e100-4935-a419-30477a167540), where people moved in his direction.
- [https://www.foretold.io/c/f19015f5-55d8-4fd6-8621-df79ac072e15/m/e827ff9f-7327-4243-9e72-cbf9e6da9376](https://www.foretold.io/c/f19015f5-55d8-4fd6-8621-df79ac072e15/m/e827ff9f-7327-4243-9e72-cbf9e6da9376)
- etc.
- Because in the first set of questions he predicted relatively late, and because his errors were not repeated (i.e., he didn't have much influence), he gets a small positive score. Personally, I would be very curious to see how well he does if he participates in the next rounds, and in particular whether he learns to use wider intervals.
- holomanga & geesh. They are the users who earned the most money, but both have an (unadjusted) negative score. They also tend to predict early, and so get hit hard. But holomanga gets hit (much) harder.
- NunoSempere. This is me. My position depends on the specific distance used (see below), which in turn depends on a number of judgement calls; with other distances I come out slightly negative.
- Elizabeth. Her answer to [Unbound Prometheus: Just before the Industrial Rev…ere less friendly towards science than Europe [4]](https://www.foretold.io/c/f19015f5-55d8-4fd6-8621-df79ac072e15?state=closed) is substantially different from her prior, which is interesting because her prior wasn't neutral. I think that Elizabeth's Bayes factor, i.e., the direction of her update, is more interesting in this case than the resolution. In general, I would have wanted there to be more cases in which Elizabeth, as opposed to the Priors Bot, had made the initial prediction (or are both the same?).
- Note on degrees of freedom & judgement calls: Geesh is in general "more helpful" than holomanga, but the degree to which this is so depends on the distance used. Elizabeth is generally negative. I am sometimes slightly negative. Reprisal is usually near the top, but can move down the ranking.
- To compute the distance between two cdfs, I am using Integral(|cdf1 - cdf2|). A next step, perhaps with a better distance (KL divergence?), would be to make the influence of one prediction on another depend on that distance. But if you multiply by a factor of 1/(1 + constant1·distance), or of (constant2 - constant3·distance), you already have an infinite number of degrees of freedom. The principled way to do this, Shapley values, requires information which we do not have. (A small numerical sketch of this distance is included at the end of this section.)
- Relationship to Shapley value: tenuous. This is the Shapley value with a lot of terms missing, or, alternatively, the counterfactual value divided by the number of participants.
- The first-order effect is missing.
- I get the impression that getting something similar to Shapley values from here wouldn't be that difficult?
- A very convenient assumption to make would be that later users whose predictions are very similar to someone else's previous prediction would, in the counterfactual, have predicted the aggregate (or something closer to the aggregate) instead; a sketch of this substitution follows below.
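For concreteness, here is one way that convenient assumption could be written down, reusing the hypothetical `Prediction` and `distance` helpers from the first sketch; the similarity threshold is made up, and this is only a sketch of the counterfactual substitution, not a Shapley computation.

```python
# Sketch of the convenient assumption above: to estimate a user's counterfactual
# contribution, drop their predictions, and assume that anyone who predicted
# something "very similar" to one of them would have predicted the aggregate
# before them (AGn) instead. Threshold and helpers are illustrative assumptions.

def counterfactual_predictions(predictions, removed_user, threshold=0.5):
    # predictions: time-ordered list of Prediction objects (as in the first sketch).
    counterfactual = []
    for n, p in enumerate(predictions):
        if p.user == removed_user:
            continue  # the removed user's own predictions disappear
        earlier_by_removed = [q for q in predictions[:n] if q.user == removed_user]
        copied_them = any(distance(p.cdf, q.cdf) < threshold for q in earlier_by_removed)
        if copied_them:
            # Assume this user would have predicted the prior aggregate instead.
            # (In a fuller treatment the later aggregates AGn would also need recomputing.)
            counterfactual.append(Prediction(p.user, p.aggregate_cdf, p.aggregate_cdf))
        else:
            counterfactual.append(p)
    return counterfactual
```

The difference between a question's scores computed on the actual predictions and on the counterfactual ones would then serve as a rough estimate of the removed user's contribution.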
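As a small numerical illustration of the distance mentioned in the note above (again a sketch: the grid, the example distributions, and constant1 are made up), Integral(|cdf1 - cdf2|) and one possible influence weighting could be computed like this:

```python
# Numerical sketch of Integral(|cdf1 - cdf2|) on a grid; the grid, the example
# distributions, and constant1 are all made up for illustration.
import numpy as np
from scipy.stats import lognorm

grid = np.linspace(0, 100, 2001)
dx = grid[1] - grid[0]

cdf1 = lognorm(s=0.5, scale=20).cdf(grid)
cdf2 = lognorm(s=0.9, scale=30).cdf(grid)

# Integral(|cdf1 - cdf2|), approximated as a Riemann sum.
d = np.sum(np.abs(cdf1 - cdf2)) * dx

# One of the (many possible) influence weightings mentioned above.
constant1 = 1.0
influence_weight = 1 / (1 + constant1 * d)
print(d, influence_weight)
```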