Is Fake News spam?

And if not, what do we do about it?

May 28, 2019

Can here. Hope you Americans had a great Memorial Day! Let’s talk about spam.

Tasty, Tasty Spam

Bill Gates probably never said “640K ought to be enough for anybody,” but he definitely did say email spam would be solved in two years, back in 2004. “Two years from now, spam will be solved” were his exact words, in front of a bunch of bigwigs at Davos.

Needless to say, spam was not solved in 2006, but it was eventually solved. There’s still a ton of email spam, mind you, clogging the tubes, but all in all, most of what people consider spam rarely hits their inboxes and instead goes to their spam folders. This is progress!

There are a bunch of reasons why and how spam was “solved” in the narrow sense. First of all, lots of stakeholders decided to play together, from industry to governments to the individual players in the field. There was a bout of regulation in the US, the motherlode being the cutely named (I should know…) CAN-SPAM Act. As a result, there were a bunch of high-profile cases both in the US and other countries, and people did go to jail. To top it all off, the email people came together and agreed on a few protocols to better authenticate both their servers (like SPF) and the emails themselves (like DKIM).

But there’s also the fact that the technology to detect and file away the actual emails just got better. We always had the technology to send a ton of emails for cheap, but that’s much easier than being able to read each of those emails individually and make a decision on the spot. The first is a horizontally scalable problem; you can just throw money at it as long as you make it back on the other end. Making computers think and understand requires more of a breakthrough.

Obviously, there’s a bit of a chicken-and-egg problem (solution?) here. If you are going to use machine learning to detect spam, the more signal you have, the better your algorithms are going to get. This is why, for example, it helps that more and more emails go through a few providers like Google and Microsoft. It helps not just with machine learning, but also with being able to stop a ton of spam in one go with blacklists and such.

There are big downsides to this, as we keep talking about here again and again, but it is what it is. Economies of scale are a powerful force.

Fake News: Artist formerly known as Spam?

I mention all this because if you look at the spam problem long enough, and squint a bit, it starts to resemble the fake news problem. Replace Eudora with Facebook and Nigerian princesses with some Russian-government trolling, and you have a system where the cost of distributing material is lower than the returns, and the entire thing flies off the rails. This isn’t really a new line of thinking, and I’ll credit some tweets by Benedict Evans (who, ironically, blocked me on Twitter) for some of the terminology I'm using here.

Anyway. It's natural to think that the previous approaches should work on this problem too: 1) centralize to get better data and leverage (i.e. one tweak fixes everything), and 2) apply machine learning. Rinse, repeat. Simple enough, really.

If you are, say, Facebook dealing with a huge anti-trust problem, this could be a bit of a godsend. If the problems you have created are so big that they are putting entire liberal democracies in the West at risk, and fanning genocidal flames in Southeast Asia, then you can make the argument that “only someone as big as me (centralized) and someone who has the technical chops (machine learning) can solve this problem.” I am not saying that Facebook would rather have the fake news problem around the world than the anti-trust troubles at home, but I am saying you would be incentivized to think that way. It’d at least color your thinking a bit.

It’s good to check your assumptions every once in a while.

What if fake news is not a spam-like problem but actually is something else, that requires different types of solutions?

For example, a defining quality of spam is not just that it is unsolicited, but that it is annoying. It gets in the way of the useful stuff. Not only that, it is crap that you do not want to read, even though there are enough people who do read it to make it worthwhile to send.

Fake news, on the other hand, is almost always the opposite. You want to read that stuff. For example, Casey Newton pointed to this study in his Interface newsletter that says some of the “fake news” is even more engaging than the real news.

It is eye-opening.

On Facebook, while many more users interact with mainstream content overall, individual junk news stories can still hugely outperform even the best, most important professionally produced stories, drawing as much as four times the volume of shares, likes, and comments.

This sort of makes sense, if you think about the entire genre of literature called urban legends, or conspiracy theories in general. A secret cabal that runs the world is definitely more interesting than a bunch of old people mangling legal documents and yelling at each other on C-SPAN.

And before you think only a nutjob here and there would believe in conspiracy theories, consider that more than a third of Americans don't even buy into climate science. This is the stuff your boring real news, which takes hours of research to produce, has to compete against:

A quarter believe that our previous president maybe or definitely was (or is?) the anti-Christ. According to a survey by Public Policy Polling, 15 percent believe that the “media or the government adds secret mind-controlling technology to television broadcast signals,” and another 15 percent think that’s possible. A quarter of Americans believe in witches. Remarkably, the same fraction, or maybe less, believes that the Bible consists mainly of legends and fables—the same proportion that believes U.S. officials were complicit in the 9/11 attacks.

Good luck fitting all that to print, The New York Times.

And there’s also the difference between the motivations of people who send spam and those who create and distribute the fake news.

Fake news is not about profits

The reason spam flared up in the first place, making a quick buck, also made it easy (I mean, bear with me) to both detect and punish those behind it, further making it less attractive. There are only so many ways to get people to make a purchase on your website and get that money into your bank account. In the global financial system, there are ways (and loopholes) to track people and tip off law enforcement to knock on someone’s door. Laundering money is equally hard, which is why you only see relatively large amounts being laundered (and caught).

Fake news, however, comes in many forms. A big chunk exists for the same reason spam exists: the zero-cost distribution means that if you can make something go viral on a platform and slap a few ads on it, you can make a quick buck.

But how about politically motivated fake news? Stuff that a bored Redditor creates by slowing down a politician’s speech to make her sound drunk and incoherent (have they even listened to Trump?) is an interesting example. How do you protect against a lone wolf, when the wolf can inflict damage at a massive scale?

We’ve seen this happen multiple times in India, for example. You can just crop a video from one event, add a new caption to it and get a bunch of people lynched. Obviously, the bulk of the blame lies with the physical perpetrators of the crime. But you can’t just shrug this behavior off as people being crazy when it happens over and over again, to the point of genocidal action, while you are raking in the profits by the billions.

Not that we are making the problem easier on ourselves. One of the big gains of a centralized system, the argument goes, is that it allows you to collect more data and build better algorithms. Will Facebook be able to gather enough data when they can’t look at the content at all because all the chats are now entirely end-to-end encrypted? Will just looking at the metadata be enough? We don’t really have good answers to these questions.

What Do We Do?

There are some easy wins here, at least in theory. I think a great deal of fake news is spam-like and can be eliminated by similar techniques. Yet, I don’t think that will make the pain go away as much as it did for spam. We’ll need a multi-pronged approach.

Lack of timely, accountable information from social media companies encourages a reactive approach, often too late to fix the damage, let alone prevent it or really understand what happened. Similarly, without the fear of users flocking to competitors to keep them in check, companies engage in extremely risky behaviors.

Moreover, these behaviors and their results are generally hidden from the public or hard to even detect, and only discovered through painstaking investigation by journalists. This doesn’t scale, and the power asymmetry, let alone the animosity, between the two industries will only get worse. Regulations around other critical industries (like finance) and individual companies are much tighter, and can be a starting point.

But there are also some other fundamental issues we’d want to discuss. Do we really want to have a truly anonymous internet? For years, the anonymity the internet allowed was considered a feature, including by yours truly. But a dogmatic anonymity fervor should not disallow accountability.

Furthermore, we should think about whether we want to run our major information distribution channels on advertising-based networks, and get all our news from a few sources that aren’t accountable to anyone.

🙏 A Kind Request

The Margins has a small, but powerful group of readers. We have prominent technologists, influential (and sometimes insightful :-)) venture capitalists, business leaders and definitely a firebrand of journalists among our subscribers. Thanks for being one.

We now want to share our insights, thoughts, and banter with even more people.

If you like what we write, can I ask you to share your testimonials with us? If you are on Twitter, you can tweet mentioning our account, @The_Margins, to your followers. Or you can send us a quote via email to include in our landing pages, and other various marketing materials. Any restrictions you might have, we’ll honor them.

Of course, please keep forwarding our pieces.

Thanks for your consideration!

What I’m Reading

How the Kleiner Perkins Empire Fell: Kleiner Perkins is as iconic and blue-chip as they come when it comes to Silicon Valley venture capital firms. (Disclaimer: I worked at a company where Kleiner Perkins was a major investor, and John Doerr was on our board.) In recent years, however, the firm has gone through a bit of turmoil, and arguably lost a bit of its (never intentionally claimed) luster. This is an interesting overview:

The firm’s heart may have been in the right place, but its investments flopped. Some, like electric-car maker Fisker Automotive, went bankrupt. Others, like fuel-cell manufacturer Bloom Energy, took 16 years from Kleiner’s investment in 2002 to go public. The result was a tarnished brand at a time Kleiner’s competitors were killing it with investments in the digital economy. Accel Partners, for example, was the early backer of Facebook. Union Square Ventures was among the first to put money into Twitter. And Benchmark Capital, which scored in the web’s first era by investing in eBay, staked Uber in its early days.

The problem with Ben Thompson's 'aggregation theory': I am a big fan of Ben Thompson’s Stratechery, and have been a paying subscriber for years. This is, in my humble opinion, a fair criticism of his infamous Aggregation Theory. It argues that aggregation theory is really just using new terms for old concepts. Thompson later responded in his newsletter. From the criticism:

The problem I have with the [aggregation] theory is that it implies there is something fundamentally new or unique about the economics of the brave-new-world of tech, when in reality, the old economic rules still work just fine. This, in turn, creates the raw material to rationalize bubble thinking/valuations, instead of more level-headed analysis. The reality is that from time immemorial, it has always been the case that certain points in the supply chain make more money than others, reflecting differences in market power. Porter's Five Forces, for instance, has long been used as a framework for analysing where and how much market power exists, and explaining and predicting why some firms make more money than others. If your suppliers for e.g. have a lot of bargaining power, all else held constant, you tend to be less profitable, and vice-versa.

Margins by Ranjan Roy and Can Duruk