Prediction markets have been suggested as an important way to aggregate information from many people into an accurate estimate of the probability of a future event. They have achieved some popularity in the crypto community. They have also been proposed as a building block in larger technocratic governance schemes, such as futarchy.
Moreover, mechanisms that share some similarities with prediction markets have been used to run prediction competitions on sites such as the Good Judgment Project and Metaculus.
However, there is a lingering question: if prediction markets are indeed a very accurate way to arrive at an estimate, why aren't they used more often? And how successful have they really been at predicting the outcomes of big events, such as elections? One answer that Robin Hanson gives is that social incentives in large organizations are structured away from actually seeking accurate predictions.
I think this answer is not the whole truth, and in this post I will explore why. I will suggest a better philosophical attitude with which to approach prediction markets, compare several mechanisms, and propose a *mathematically novel* mechanism in Part 2 that improves on several existing designs.
My vision for the future is a robust many-to-many information market, where people contribute money to get estimates from predictors about a future event, and where participation is a robust Nash equilibrium for rational people.
What are the types of belief aggregators?
Normally, what we call a prediction market is a continuous double-auction mechanism, but this is neither the only way to implement a prediction market nor the only way to set up a "belief aggregation system." A belief aggregation system is a process that takes money and/or probability estimates from people in some fashion and outputs an aggregated probability, as well as payouts towards "better predictions".
So we are going to look at the general problem of "belief aggregation systems", of which the double auction is one type. I am going to look at three different systems, which have different inputs and mechanisms.
1. Continuous double auction
The continuous double auction is the most familiar mechanism. At each point in time the market displays a probability, and if you believe that probability is too high or too low, you can buy or sell shares in the event from a counterparty who thinks the opposite. The inputs to the mechanism are the amount of money you bet and whether you think the probability should be higher or lower.
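As a minimal sketch (the function name and numbers are mine, not from any particular exchange), here is the expected value of buying shares of a binary contract that pays $1 per share if the event happens:

```python
def buy_ev(market_price: float, my_probability: float, n_shares: int = 1) -> float:
    """Expected profit from buying n_shares of a binary contract that pays
    $1 per share if the event happens, at the current market price."""
    return n_shares * (my_probability - market_price)

# The market says 30%, but I believe 40%: buying has positive expected value.
print(round(buy_ev(market_price=0.30, my_probability=0.40), 2))  # 0.1 per share
# Note that my expected gain is exactly my counterparty's expected loss: zero-sum.
```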
There is a generic problem with many prediction markets, especially the zero-sum double auction: they pit people against each other. This makes it irrational to bet if you believe other people have better information about the event than you do, even if you don't necessarily know which way that information points. In expectation, if every market participant were truly rational and had good estimates of the expected value of everyone else's information, no trades would happen, so the Nash equilibrium of the market is the market not existing. This is really bad: zero-sum prediction markets contain an inherent contradiction about the rationality of the participants, who *ought* to rationally bet on their beliefs, yet *ought* to irrationally ignore the question of whether they actually have better-than-average insider information compared to the other market participants.
2. Scoring rule payout (such as Brier score payout)
Brier score payout is the default way to score predictions in places like the Good Judgment Project. The mechanism is as follows: everyone announces their probability p of the outcome, and each person gets penalized by either (1-p)^2 or p^2, depending on whether the event happened or not. This is simply the squared error, where p is the prediction and 0 or 1 is the outcome.
Our input here is just the subjective probability of the event. The Brier score does not take into account when the prediction is made. It also does not involve money, or any real way to distinguish good predictors from bad ones. There are ways to modify this, which will be discussed below.
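A minimal sketch of the penalty, assuming a single binary question (the function name is mine):

```python
def brier_penalty(p: float, outcome: int) -> float:
    """Quadratic penalty for predicting probability p of a binary event:
    (1 - p)^2 if the event happened (outcome = 1), p^2 if not (outcome = 0)."""
    return (outcome - p) ** 2

# Predicting 80% on an event that happens is penalized less than predicting 30%:
print(round(brier_penalty(0.8, 1), 4))  # 0.04
print(round(brier_penalty(0.3, 1), 4))  # 0.49
```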
3. LMSR - Logarithmic Market Scoring Rule (or the analogous mechanism for any proper scoring rule)
LMSR, or the Logarithmic Market Scoring Rule, was invented and described by Robin Hanson in these papers: Link 1 Link 2 Link 3
From the paper:
More formally, a market scoring rule always has a current probability distribution p, which is equal to the last report that anyone has made to this rule. Anyone can at any time inspect this distribution, or change any part of it by making a new report. If someone chooses to give a report r_t at time t, and the actual state eventually turns out to be i, this person will be paid a reward

c_i = s_i(r_t, r_{t-1}) = s_i(r_t) - s_i(r_{t-1}),

where s_i(r) is some proper scoring rule, and r_{t-1} is the last report made. Since this amount c_i can be negative, the person must show an ability to pay this if needed, such as by depositing or escrowing an amount -min_i c_i ... He maximizes the expected value of s_i(r_t, r_{t-1}) by maximizing the expected value of s_i(r_t), and so wants to honestly report his beliefs here whenever he would for a simple scoring rule. Whenever someone's beliefs differ from the current distribution p, he expects to profit on average from making a report.

LMSR specifically uses the logarithmic scoring rule, s_i(r) = ln(r_i); however, any proper market scoring rule (such as one based on the Brier score) would work as well.
To summarize LMSR: just like the continuous double auction, the market presents a probability to each participant. However, unlike the double auction, where the input is *the amount of money* you wish to bet plus a direction, the input here is your probability, and the amount of money you win (or owe) is determined by the difference between the probabilities.
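Here is a rough sketch of that settlement rule for a binary event, assuming the log scoring rule with a liquidity parameter b (the function name and the choice b = 100 are mine):

```python
import math

def msr_payout(old_p: float, new_p: float, outcome: int, b: float = 100.0) -> float:
    """Settlement for a trader who moved the market's probability for a binary
    event from old_p to new_p, using the log scoring rule s(r) = b * ln(r)
    applied to the probability assigned to the realized outcome."""
    r_old = old_p if outcome == 1 else 1.0 - old_p
    r_new = new_p if outcome == 1 else 1.0 - new_p
    return b * (math.log(r_new) - math.log(r_old))

# Moving the market from 30% to 60% wins money if the event happens...
print(round(msr_payout(0.30, 0.60, outcome=1), 2))  # 69.31
# ...and loses money if it doesn't. The worst case must be escrowed up front.
print(round(msr_payout(0.30, 0.60, outcome=0), 2))  # -55.96
```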
The market creator's first report acts as a subsidy of the market, although it is not necessarily all spent. It could be spent with an additional mechanism discussed below.
The zero-sum nature of the continuous double auction is mitigated here by subsidies from the market creator. Subsidies make sense from an ordinary capitalist point of view, since it frequently takes a lot of work for market participants to come up with a probability. Rewarding that work, even when it is not necessarily better than the average participant's, makes participation in the market positive-sum, justifying the work put in and allowing rational institutions to partake as well.
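One property worth spelling out: the payments along a chain of reports telescope, so the creator's total cost depends only on the first and last reports, which bounds the subsidy in advance. A quick numeric check (the numbers are made up):

```python
import math

b = 100.0
reports = [0.5, 0.3, 0.6, 0.9]  # creator's initial report, then three traders

# Sum of every trader's payout if the event happens...
total = sum(b * (math.log(new) - math.log(old))
            for old, new in zip(reports, reports[1:]))
print(round(total, 2))  # 58.78

# ...equals a single move from the first report to the last: the creator's
# cost is b * ln(r_final / r_initial), no matter how many trades occur.
print(round(b * math.log(reports[-1] / reports[0]), 2))  # 58.78
```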
Problems of each setup:
In addition to the problem of subsidies, each of the three schemes suffers from different interface problems.
a) The double auction doesn't actually solicit your true probability. If the market says 30% and you are a small investor, your action is the same whether you think the true probability is 40% or 80%. It also needs a counterparty to trade with, so it struggles in extremely shallow markets. Some implementations, like PredictIt, also have trouble with extremely *popular* markets, since they can only store so many open orders.
b) LMSR does not let you decouple the amount of money you put at risk from the probability you report. If you think the market is very wrong but only want to bet a small amount of money, LMSR does not allow you to do this. Subsidies, however, are easily added to this mechanism.
c) Simple Brier scoring involves no money, nor does it let you benefit from betting earlier. The advantage of both the double auction and LMSR is that they are effectively "time-dependent" and reward you merely for moving the probability towards the truth. Theoretically, a person who bets 90% when the market is at 30% should be rewarded more than somebody who later bets 91% when the market is already at 90%; with simple Brier scoring, however, the later bettor scores slightly better, as the check below illustrates.
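To make the comparison concrete, here is a hypothetical numeric check of that example, assuming the event happens (function names are mine):

```python
import math

def brier_reward(p: float, outcome: int) -> float:
    """Simple Brier reward (higher is better): the negative quadratic penalty."""
    return -((outcome - p) ** 2)

def msr_reward(old_p: float, new_p: float, outcome: int) -> float:
    """Log market scoring rule reward for moving the market from old_p to new_p."""
    r_old = old_p if outcome == 1 else 1 - old_p
    r_new = new_p if outcome == 1 else 1 - new_p
    return math.log(r_new) - math.log(r_old)

# Suppose the event happens. Simple Brier pays the late 91% bettor slightly MORE:
print(round(brier_reward(0.90, 1), 4), round(brier_reward(0.91, 1), 4))  # -0.01 -0.0081
# A market scoring rule pays the early mover (30% -> 90%) far more:
print(round(msr_reward(0.30, 0.90, 1), 3))  # 1.099
print(round(msr_reward(0.90, 0.91, 1), 3))  # 0.011
```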
Now, you can modify simple Brier averaging to involve bets from participants, OR you can modify it to allow time dependence; however, it is not obvious how to allow BOTH bets and time dependence while keeping all the incentives intact. I show that this is somewhat possible, given a relaxation of some other conditions.
So the question is: can we create a system where people input BOTH the amount of money they want to bet AND their probability, instead of just one or the other?
To summarize, the things we want from the ideal belief aggregation framework:
1. Expected payoff is maximized when the person truthfully reports their subjective probability of the event (i.e., the mechanism is a proper scoring rule) (non-negotiable)
2. Allows specifying both a probability and an amount of money. The expectation is that a system which lets you state your actual probability converges faster than one that doesn't (such as the continuous auction).
3. Allows people to place bets without a counterparty
4. Easy to subsidize
5. Able to cash out early
6. Incentivizes betting both early on and close to the event; i.e., is "time-dependent", rewarding people for moving the current market estimate closer to the truth
7. Outputs a single probability
8. No arbitrage allowed, meaning one cannot make a set of trades that is guaranteed to make money
A note on subsidies: I think some people could argue that double-auction markets are still "easy to subsidize" (criterion 4 above). One proposed scheme is the market maker doing noise trades on the market. I think this is a fairly bad scheme, in that it:
1. Makes the market less truthful in the short term
2. Is prone to being gamed if the trades are public
3. Might not work if the "noise" trades happen to point the same way
4. Leaves unclear how much subsidy will be available, making expected value calculations difficult for market participants
5. Uses randomness, which is always a bad sign in major algorithms.
So the general problem we need to solve is: how many of these 8 properties can we implement in a single scheme? It's not obvious to me that all 8 are possible; however, implementing 2, 4, and 6 together (specifying both probability and money, easy subsidization, and time dependence) would be a massive improvement in prediction market interfaces.
I believe I have a working scheme for this, which is original math research!
Coming in Part 2 …
A note on terminology:
A proper scoring rule is a function from prediction × outcome -> R whose expected value is maximized by reporting the true probability. Since we are dealing only with binary outcomes (0 or 1), the requirement works out as follows:

scoring rule = s(x), where x is the reported estimate

ev(x, p) = p * s(x) + (1 - p) * s(1 - x)

ev(x, p) should be maximal when x = p, so

d/dx ev(x, p) = p * s'(x) - (1 - p) * s'(1 - x) should be 0 when x = p

Setting x = p gives p * s'(p) = (1 - p) * s'(1 - p), i.e.

s'(p) / s'(1 - p) = (1 - p) / p
Many functions satisfy this requirement for s'(x):

s'(x) = 1/x => s(x) = ln(x) => log scoring rule

s'(x) = 1 - x => s(x) = x - x^2/2 => a linear transform of the (negative) Brier score, i.e., squared error

s'(x) = (1 - x) / (2x^2 - 2x + 1)^(3/2) => s(x) = x / sqrt(x^2 + (1-x)^2) => spherical scoring rule
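As a sanity check, here is a small script (names mine) that numerically confirms each of these rules is proper, i.e., that the expected score is maximized by reporting the true probability:

```python
import math

def expected_score(s, x: float, p: float) -> float:
    """ev(x, p) = p * s(x) + (1 - p) * s(1 - x): the expected score for
    reporting x when the true probability of the event is p."""
    return p * s(x) + (1 - p) * s(1 - x)

log_rule  = math.log                                     # s(x) = ln(x)
quad_rule = lambda x: x - x**2 / 2                       # linear transform of Brier
spherical = lambda x: x / math.sqrt(x**2 + (1 - x)**2)   # spherical rule

grid = [i / 1000 for i in range(1, 1000)]
for rule in (log_rule, quad_rule, spherical):
    # For each rule, the report maximizing the expected score should be
    # the true probability itself (here p = 0.7).
    best_x = max(grid, key=lambda x: expected_score(rule, x, 0.7))
    print(best_x)  # 0.7 every time
```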