October 19, 2019
Against Lie Inflation by Scott Alexander
I think of this as a sort of sensitivity-and-specificity statistics problem, setting a threshold to divide the population into two groups. If you have a very strict threshold for “abuser”, maybe only someone who inflicts serious physical injuries, then you can use it to separate the most abusive 1% of people from the other 99%. If you have a very weak threshold for “abuser”, so low that 99% of people qualify, then you can use it to separate the 1% least abusive people from the other 99%. If you set it in the middle, you can separate the more abusive half of the population from the less abusive half. If “abuser” picks out the most abusive 1% of people, it transmits a lot of information in a small number of cases. If it picks out the most abusive 99% of people, it transmits very little information in a large number of cases (and now “not an abuser” transmits a large amount of information in a small number of cases!). If the boundary is set at 50%, it transmits an equal moderate amount of information about everyone.
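One way to make "transmits a lot of information" concrete is Shannon surprisal: learning that an event of probability p happened conveys -log2(p) bits. The sketch below is an illustration under that assumption, not a formalism the post itself uses. The function names and the `fraction_labeled` parameter are hypothetical, and it treats the label as applied to a fixed fraction of the population.

```python
import math


def surprisal_bits(p):
    """Bits conveyed by learning that an event of probability p occurred."""
    return -math.log2(p)


def label_report(fraction_labeled):
    """Information carried by a binary label covering `fraction_labeled` of people.

    Returns the bits conveyed when someone has the label, the bits conveyed
    when they do not, and the average bits per person (the binary entropy).
    """
    q = fraction_labeled
    if q <= 0.0 or q >= 1.0:
        # Everyone falls in one category, so hearing the label changes nothing.
        # The empty category never occurs; it is reported as 0.0 for simplicity.
        return {"bits_if_labeled": 0.0, "bits_if_not": 0.0, "average_bits": 0.0}
    bits_yes = surprisal_bits(q)
    bits_no = surprisal_bits(1.0 - q)
    average = q * bits_yes + (1.0 - q) * bits_no
    return {"bits_if_labeled": bits_yes, "bits_if_not": bits_no, "average_bits": average}


if __name__ == "__main__":
    # Strict threshold, middle threshold, weak threshold, and a line drawn
    # outside the population entirely.
    for q in (0.01, 0.5, 0.99, 1.0):
        print(q, label_report(q))
```

Under this reading, a 1% label carries several bits about the few people it applies to and almost nothing about everyone else, a 99% label is the mirror image, and a 50% label carries exactly one bit about each person. The average is highest at 50%, but that number says nothing about which distinctions are worth drawing, so it does not by itself make 50-50 the best threshold.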
There’s no rule that 50-50 is always the best. For example, if the word “murderer” referred to anyone in the more murderous half of the population, that would be much worse than the current system, where it refers to a much smaller set of people, who you have much more reason to worry about as a discrete group. You’re going to have to find the right threshold for each individual concept.
But it’s never the right decision to draw the line outside the population, so that literally 100% of people fall in one category and 0% in the other.