In our new book, we argue that most talk, including mass media news and academic talk, isn’t really about info, at least the obvious base-level info. But to study talk, it helps to think about what it would in fact look like if it were mostly about info. And as with effective altruism, such an exercise can also be useful for those who see themselves as having unusually sincere preferences, i.e., who actually care about info. So in this post let’s consider what info based talk would actually look like.
From an info perspective, a piece of “news” is a package that includes a claim that can be true or false, a sufficient explanation of what this claim means, and some support, perhaps implicit, to convince the reader of this claim. Here are a few relevant aspects of each such claim:
Surprise – how low a probability a reader would have previously assigned to this claim.
Confidence – how high a probability a reader would assign after reading this news.
Importance – how much the probability of this claim matters to the reader.
Commonality – how many potential readers consider this topic important.
Recency – how recently this news became available.
Support Type – what kind of support is offered for a reader to believe this claim.
Support Space – how many words it takes to show the support to a reader.
Definition Space – how many words it takes to explain what this claim means.
Bandwidth – number of channels of communication used at once to tell a reader about this news.
Chunk – size of a hard-to-divide model containing news, such as a tweet or a book.
Okay, the amount of info that some news gives a reader on a claim is the ratio of its confidence to its surprise. The value of this info multiplies this info amount by the claim’s importance to that reader. The total value of this news to all readers (roughly) multiplies this individual value by its commonality. Valuable news tells many people to put high confidence in claims that they previously thought rather unlikely, on topics they consider important.
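To make that arithmetic concrete, here is a minimal sketch in Python. It rests on one reading of the definitions above, which is an assumption: "surprise" is represented by the prior probability a reader assigned to the claim, "confidence" by the probability assigned after reading, and the news is taken to have raised the claim's probability. The function names and the example numbers are hypothetical, chosen only for illustration.

```python
def info_amount(prior, posterior):
    """Info from a piece of news, read as posterior confidence over prior probability.

    Assumption: 'surprise' is captured by a low prior, so the ratio grows
    as the claim was previously thought less likely.
    """
    if not (0 < prior <= 1) or not (0 <= posterior <= 1):
        raise ValueError("probabilities must lie in (0, 1] and [0, 1]")
    return posterior / prior


def reader_value(prior, posterior, importance):
    """Value to one reader: info amount times the claim's importance to them."""
    return info_amount(prior, posterior) * importance


def total_value(prior, posterior, importance, commonality):
    """Rough total value: one reader's value times the number of interested readers."""
    return reader_value(prior, posterior, importance) * commonality


if __name__ == "__main__":
    # A claim a reader put at 25% before, and at 75% after reading the news.
    print(info_amount(0.25, 0.75))
    # The same news, with importance 2.0 for each of 1000 interested readers.
    print(total_value(0.25, 0.75, importance=2.0, commonality=1000))
```

Under this reading, lowering the prior or raising the posterior both raise the value, which matches the summary that valuable news moves many readers to high confidence in important claims they had thought unlikely.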
A reader who knew most everything that is currently known would focus mostly on recent news. Real people, however, who know very little of what is known, would in contrast focus mostly on much less recent news. Waiting to process recent news allows time for many small pieces of news to be integrated into large chunks that share common elements of definition and support, and that make better use of higher bandwidth.
In a world mainly interested in getting news for its info, most news would be produced by specialists in particular news topics. And there’d be far more news on topics of common interest to many readers, relative to niche topics of interest only to smaller sets of readers.
The cost of reading news to a reader is any financial cost, plus a time cost for reading (or watching etc.). This time cost is mostly set by the space required for that news, divided by the effective bandwidth used. Total space is roughly definition space plus support space. If the claim offered is a small variation on many similar previous claims already seen by a reader, little space may be required for its definition. In contrast, claims strange to a reader may take a lot more space to explain.
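The cost side can be sketched the same way. This is only an illustration under stated assumptions: space is measured in words, bandwidth in words absorbed per minute, and `time_price` is a hypothetical conversion factor (money per minute) introduced here so that time and financial cost can be added. None of these units or names come from the text above.

```python
def reading_time(definition_space, support_space, bandwidth):
    """Time cost: total space (definition plus support) divided by effective bandwidth."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    total_space = definition_space + support_space
    return total_space / bandwidth


def reading_cost(financial_cost, definition_space, support_space,
                 bandwidth, time_price):
    """Total cost to a reader: financial cost plus priced reading time.

    Assumption: time_price converts minutes into the same units as financial_cost.
    """
    minutes = reading_time(definition_space, support_space, bandwidth)
    return financial_cost + time_price * minutes


if __name__ == "__main__":
    # A familiar claim needs little definition space; a strange one needs more.
    familiar = reading_time(definition_space=50, support_space=450, bandwidth=250)
    strange = reading_time(definition_space=550, support_space=450, bandwidth=250)
    print(familiar, strange)
```

With support held fixed, the strange claim costs more to read purely because of the extra space needed to explain what it means.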
When the support offered for a claim is popularity or authority, such support may be seen as weak, but it can often be given quite concisely. However, when the support offered is an explicit argument, that can seem strong, but it can also take a lot more space. Some claims are self-evident to readers upon being merely stated, or after a single example. If prediction markets were common, market odds could offer concise yet strong support for many claims. The smallest news items will usually not come with arguments.
Given the big advantages of modularity, in news as in anything else, we need a big gain to justify the modularity costs of clumping news together into hard-to-divide units, like articles and books. There are two obvious gain cases here: 1) many related claims, and 2) one focus claim requiring much explanation or support. The first case has a high correlation in reader interest across a set of claims, at least for a certain set of readers. Here a sufficient degree of shared explanation or support across these claims could justify a package that explains and supports them all together.
The second case is where a single focal claim requires either a great deal of explanation to even make clear what is being claimed, or it requires extensive detailed arguments to persuade readers. Or both. Of course there can be mixes of these two cases. For example, if in making the effort to support one main claim, one has already done most of the work needed to support a related but less important claim, one might include that related claim in the same chunk.
For most readers, most of the claims that are important enough to be the focus of a large chunk are also relatively easy to understand. As a result, most of the space in most large focused chunks is devoted to support. And as argument is the main support that requires a lot of space, most of the space in big chunks focused on a central claim is devoted to supporting arguments. Also, to justify the cost of a large chunk with a large value for the reader, most large focused chunks focus on claims to which readers initially assign a low probability.
So how does all this compare to our actual world of talk today? There are a lot of parallels, but also some big deviations. Our real world has a lot of local artisan production on topics of narrow interest. That is, people just chat with each other about random stuff. Even for news produced by efficient specialists, an awful lot of it seems to be on topics of relatively low importance to readers. Readers seem to care more about commonality than about importance. And there’s a huge puzzling focus on the most recently available news.
Books are some of our largest common chunks of news today, and each one usually purports to offer recent news on arguments supporting a central claim that is relatively easy to understand. It seems puzzling that so few big chunks are explicitly justified via shared explanation and justification of many related small claims, or that so many big chunks seem to cover neither many related claims nor a single central claim. It also seems puzzling that most focal claims of books are not very surprising to most readers. Readers do not seem to be proportionally more interested in the books with more surprising focal claims. And given how much space is devoted to arguments for focal claims, it is somewhat surprising that books often neglect to even mention other kinds of support, such as popularity or authority.
While I do think alternative theories, in which news is not mainly about info, can explain many of these puzzles, a discussion of that will have to wait for another post.
This definitional change doesn't solve the fundamental issue that it is the absolute difference and not the ratio that needs to be used as the central element before multiplication ... of this I am highly confident ( :)) and now that I have brought this to your attention, I am sure you know that you can model this out with dice and cards etc. to prove it for yourself.
But the big picture here is you deserve considerable credit for putting together a really interesting question and model, so let's not get too hung up on mathematical minutiae. The big point I wanted to make was that the value-of-info literature had addressed a fair amount of this, and you are aware of that, so we are all good.
With that minor caveat out of the way, I think you are asking some very probing questions here ... why do we have limited regard for the value of information anywhere in society? Why are we a society that doesn't really invest in information at all ... basically why are we so foolish?
My high-level response is that we get what we deserve and that as humans our lack of real respect for and acceptance of the truth/reality causes us to harm ourselves in very material ways, news being merely one of them. If we were more sincere about finding and acting on the truth/reality then we would probably be more happy and well-adjusted beings.
When giving formulas, I had in mind applying them where the news was that a claim had *risen* in probability. One can of course represent them with the complement event whose probability has lowered, but that is a lot less natural.