Facebook’s New Spam-Killer Hints at the Future of Coding
Louis Brandy pauses before answering, needing some extra time to choose his words. “I’m going to get in so much trouble,” he says. The question, you see, touches on an eternally controversial topic: the future of computer programming languages.
Brandy is a software engineer at Facebook, and alongside a team of other Facebookers, he spent the last two years rebuilding the system that removes spam—malicious, offensive, or otherwise unwanted messages—from the world’s largest social network. That’s no small task—Facebook juggles messages from more than 1.5 billion people worldwide. To tackle the problem, Brandy and team made an unusual choice: they used a programming language called Haskell.
In the early ’90s, a committee of academics built Haskell as a kind of experiment in language design, and all these years later, it remains on the fringes of mainstream programming. At GitHub—the primary repository for software code on the ‘net—Haskell ranks 23rd on the list of the most popular languages. Even so, Facebook chose it as the basis for its enormously complex anti-spam system, which went live earlier this year. As I chat with Brandy inside the new Facebook building in Menlo Park, California, I’m trying to understand what this choice says about the evolution of programming languages as a whole.
That may seem an innocent enough question, but any straightforward discussion of the merits of one programming language over another is inevitably met with at least a modicum of vitriol as it spills into the wider community of software developers. Coders choose programming languages for any number of technical reasons, but they also choose them for very personal reasons—and these personal reasons inevitably intertwine with the technical. If Brandy praises Haskell too heavily—or indeed criticizes it too heavily—so many others will cry foul. They’ll probably cry foul anyway.
What he does say is that Haskell is ideally suited to fighting Facebook spam because it’s so adept at executing many different tasks at the same time—and because it gives engineers the tools they need to code all these tasks on the fly. Facebook’s social network is so large and spammers are changing their techniques so quickly that the company needs a way of both building and operating its anti-spam engine at speed. “Latency is the most important thing. We want to be able to stop attacks immediately,” says Brandy, who worked with Facebookers such as Jonathan Coens and noted Haskell guru Simon Marlow in building the system. “We want to run as many checks in the shortest amount of time, and that’s where Haskell helps us.”
If you consider that Facebook, like Google and Amazon, represents where the rest of the internet is going (as the internet grows, many other online services will face the problems Facebook faces today), the company’s Haskell project can indeed point the way for the programming world as a whole. That doesn’t mean Haskell will be ubiquitous in the years to come. Because it’s so different from traditional programming languages, coders often have trouble learning to use it; undoubtedly, this will prevent widespread adoption. But Facebook’s work is a sign that other languages will move in Haskell’s general direction.
Indeed, they already are. Newer languages such as Google Go and Mozilla’s Rust are designed so that developers can build massively parallel code and build it at speed. And as Brandy points out, other projects are building Haskell-like software libraries for additional languages, including “reactive” programming projects like RxJava.
For some coders, languages like Go and Rust aren’t quite as proficient as Haskell. But they’re much easier to learn. And they at least approach the ideal that the Haskell community has sought so diligently over the last twenty-five years. “Haskell has pushed so many languages forward,” says Mathias Biilmann, a coder who has ample experience with Haskell and is steeped in a wide range of other languages. “And I’m sure it will continue to do so.”
Biilmann builds software for running internet sites, these days at a San Francisco startup called Netlify, and in previous years at an outfit in Spain. At one point, while building a tool that could automatically resize images when someone opened a site on their particular device, he found that Haskell was the ideal language, mostly because it was so adept at running tasks concurrently. In a world where sites handle so many different tasks for so many different people, this is a valuable quality. “You get so many requests for image resizing,” says the Denmark-born Biilmann. “You have to be able to manage lots of concurrent connections.”
Haskell is good at this because it’s a “purely functional programming language.” In essence, you build programs around a series of functions, and each function can operate independently of all the others. That means, among other things, that you can execute the functions in any order you like. You needn’t run them sequentially.
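To make the idea concrete, here is a minimal sketch in Haskell. The `Message` type, the two checks, and the `spam.example` address are all invented for illustration and have nothing to do with Facebook's actual rules. Each check is a pure function, meaning its answer depends only on its argument, so combining the checks in either order gives the same verdict.

```haskell
import Data.Char (isAlpha, isUpper, toLower)
import Data.List (isInfixOf)

-- Hypothetical message type, invented for this sketch.
data Message = Message
  { sender :: String
  , body   :: String
  }

-- Pure check: flags a made-up spam address, ignoring letter case.
hasSuspiciousLink :: Message -> Bool
hasSuspiciousLink msg = "http://spam.example" `isInfixOf` map toLower (body msg)

-- Pure check: true when the body has letters and all of them are uppercase.
isShouting :: Message -> Bool
isShouting msg = not (null letters) && all isUpper letters
  where
    letters = filter isAlpha (body msg)

-- Neither check reads or changes outside state, so the order in which
-- they are combined cannot change the result.
verdictA, verdictB :: Message -> Bool
verdictA msg = hasSuspiciousLink msg || isShouting msg
verdictB msg = isShouting msg || hasSuspiciousLink msg
```

Loaded into GHCi, `verdictA` and `verdictB` agree on any message. Order still matters when one function needs another's result; the freedom applies to functions that are independent of each other, as these two are.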
This can improve speed, Biilmann says, but it can also help coders wrap their heads around what they’re doing. “With most languages, you say: ‘First, you do this. Then, you do that,’” he explains. “And once you start doing this with hundreds of processes running at the same time, it becomes very hard for humans to reason about what’s actually happening and in what order things need to happen.”
These same basic characteristics are what made Haskell so attractive to Facebook. The company needed a language that could help engineers write “rules” for identifying spam on its social network. Identifying spam involves gathering data from a wide range of machines inside the company’s massive computing centers, and Haskell provided a way of doing this quickly. “It’s safe in Haskell to run two functions at the same time. You know there will be no side effects. And that’s not true of most other languages,” Brandy says. “It lets you take things that look serial and do them at the same time.”
What’s more, says Marlow, Facebook engineers can write these rules without worrying too much about how they’ll be executed. “We wanted to abstract away from concurrency,” he says. “Even though concurrency is needed to get efficiency, we didn’t want our spam-fighting engineers to have to worry about it. Haskell is really good at abstracting things.”
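A rough sketch of that separation, using only GHC's standard `base` library, might look like the following. It is not Facebook's system: the `checkAll` runner and the two rules (`tooManyLinks`, `tooLong`) are hypothetical names made up here. The rules are ordinary pure functions with no mention of threads, and the runner alone decides to evaluate them concurrently.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)
import Data.List (isPrefixOf)

-- Hypothetical runner: evaluates each pure check on its own lightweight
-- thread and returns the results in the same order as the checks.
-- Simplification: a check that throws would leave its slot empty,
-- so this sketch assumes the checks are total.
checkAll :: [a -> Bool] -> a -> IO [Bool]
checkAll checks input = do
  slots <- mapM spawn checks
  mapM takeMVar slots
  where
    spawn check = do
      slot <- newEmptyMVar
      _ <- forkIO (evaluate (check input) >>= putMVar slot)
      return slot

-- Hypothetical rules, written with no knowledge of concurrency.
tooManyLinks :: String -> Bool
tooManyLinks text = length (filter (isPrefixOf "http") (words text)) > 2

tooLong :: String -> Bool
tooLong text = length text > 500

-- A message counts as spam here if any rule fires.
isSpam :: String -> IO Bool
isSpam text = or <$> checkAll [tooManyLinks, tooLong] text
```

Because the rules have no side effects, running them on separate threads cannot change their answers, which is what lets the runner be swapped or tuned without touching the rules themselves.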
John Edstrom, who uses Facebook’s system to fight spam on Instagram, the photo-centric social network owned by Facebook, underlines how valuable this can be. “With a lot of these rules, we’re writing them as we’re being attacked. We’re like: ‘Oh crap. We have to get these out fast,’” he says. “If we’re working in a purely functional language that we know doesn’t have side effects, the faster we’re able to move.”
This too is important across the larger programming universe. Modern internet services must evolve quickly, not only to serve their ever-expanding and ever-changing community of users, but to keep up with the competition.
‘It Would Not Be a Bad Thing’
The thing is: Biilmann no longer uses Haskell. It’s not entirely practical. Not enough people know how to use it, and this is unlikely to change. “Haskell is like a programming language from an alternate future that is never going to happen,” he says. “It solves all these problems it promises to solve. But it’s so different that there is no chance it will become common.”
Today, in building modern services that require extreme concurrency, Biilmann is more likely to use something like Go or Rust. These aren’t quite as powerful as Haskell, he says, but they’re on the right path. And they’re more suited to the mainstream programmer. “Today, if I were to rewrite my image resizer, I would probably rewrite it in Go,” he says. “It probably solves 80 percent of the problems that Haskell solves for a service like that, and it basically has no learning curve.”
At Facebook, Brandy says, Haskell’s breed of parallelism isn’t suited to every task. And he acknowledges that it can be difficult for some coders to learn. But he’s confident that its techniques will become more important as the years pass. “There is certainly potential for this kind of thing,” he says. “Every company is basically writing code that is kinda like this. You have to. You see a lot of programming languages that pop up and feel like this, under the hood.”
What about Haskell itself? In the long run, could it evolve to the point where it becomes the norm? Could coders evolve to the point where they embrace it in large numbers? “I don’t know,” Brandy says. “But I don’t think it would be a bad thing.”