Baku, July 14, AZERTAC
The robot wars won't be won by spambots, if Google engineers have anything to say about it.
The company announced on its Gmail blog on Thursday that it has been using Google's artificial neural network to help with e-mail spam filtering. Already, the company says, it has been able to block 99.9 percent of spam from reaching inboxes, while incorrectly classifying legitimate e-mail as spam only 0.05 percent of the time.
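To put those two rates in perspective, here is a rough back-of-the-envelope sketch. Only the 99.9 percent and 0.05 percent figures come from the announcement; the message volumes are hypothetical numbers assumed purely for illustration.

```python
# Rates quoted in Google's announcement.
SPAM_BLOCK_RATE = 0.999        # share of spam kept out of the inbox
FALSE_POSITIVE_RATE = 0.0005   # share of legitimate mail wrongly flagged (0.05 percent)


def expected_errors(spam_count, ham_count):
    """Return (spam that slips through, legitimate mail wrongly flagged)."""
    missed_spam = spam_count * (1 - SPAM_BLOCK_RATE)
    flagged_ham = ham_count * FALSE_POSITIVE_RATE
    return missed_spam, flagged_ham


# Hypothetical volumes, assumed for illustration only.
missed, flagged = expected_errors(spam_count=10_000, ham_count=10_000)
print(f"spam reaching the inbox: about {missed:.0f}")
print(f"legitimate mail sent to spam: about {flagged:.0f}")
```

Under those assumed volumes, roughly 10 spam messages would still reach the inbox and about 5 legitimate messages would be misfiled as spam.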
And it's all thanks to data collection.
For the most part, Google's system is based on Gmail's "report spam" and "not spam" buttons. By taking this user input and referencing other user actions, the Internet giant can learn what counts as spam and what doesn't. For e-mails sent with malicious intent, the server can learn to recognize them, parse them, and redirect them away from the inbox.
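How a filter might learn from those button clicks can be sketched in a few lines. This is a toy word-tally model, not Google's system, whose internals the announcement does not describe. The class name `FeedbackFilter`, its methods, and the sample messages are all hypothetical.

```python
from collections import Counter


class FeedbackFilter:
    """Toy filter that learns from 'report spam' / 'not spam' clicks."""

    def __init__(self):
        self.spam_words = Counter()  # word counts from messages reported as spam
        self.ham_words = Counter()   # word counts from messages marked 'not spam'

    def report(self, message, is_spam):
        """Record one user click as a training example."""
        target = self.spam_words if is_spam else self.ham_words
        target.update(message.lower().split())

    def spam_score(self, message):
        """Average, over the words, of the smoothed share of spam reports."""
        words = message.lower().split()
        if not words:
            return 0.5  # no evidence either way
        shares = [
            (self.spam_words[w] + 1) / (self.spam_words[w] + self.ham_words[w] + 2)
            for w in words
        ]
        return sum(shares) / len(shares)


# Hypothetical user feedback, invented for illustration.
mail_filter = FeedbackFilter()
mail_filter.report("win a free prize now", is_spam=True)
mail_filter.report("free prize inside click now", is_spam=True)
mail_filter.report("meeting notes for monday", is_spam=False)
mail_filter.report("lunch on monday", is_spam=False)

print(mail_filter.spam_score("claim your free prize"))   # above 0.5
print(mail_filter.spam_score("monday meeting agenda"))   # below 0.5
```

Words the filter has never seen score a neutral 0.5, so a message made up entirely of unfamiliar words gives this toy model nothing to go on.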
But spam can still make it past blockers in a variety of ways, the company says. Often, spam succeeds by using previously unaccounted-for domains (new ones such as .xyz or .horse can get past filters) or by mimicking desired e-mails (or "ham"). Despite new filters, spammers find ways to circumvent them.
Though we may not have completely eradicated spam as computer scientists had thought we would, Internet companies have been able to at least limit its pervasiveness.
The remaining problem lies in detecting which e-mails are not junk. "Blacklisting is an efficient anti-spam mechanism, but is becoming more and more prone to false positives," reads a paper from MIT's Spam Conference 2010, which brought experts together to discuss the future of spam detection. Often, the "coarse granularity" of blacklists sweeps non-malicious addresses into the junk bin, the report says.
And even with whitelists, or lists of approved online addresses, the report asserts that services are just using heuristics to curb spam rather than adopting any computational approach.
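The false-positive problem with coarse blacklists can be illustrated with a small sketch. The paper's own mechanisms are not reproduced here. The example assumes, for illustration, a blacklist keyed on whole domains versus one keyed on individual addresses, and every domain and address below is made up.

```python
# Hypothetical blacklist entries, invented for illustration.
DOMAIN_BLACKLIST = {"bulkmail.example"}
ADDRESS_BLACKLIST = {"offers@bulkmail.example"}


def domain_of(address):
    """Return the part of the address after the last '@', lowercased."""
    return address.rsplit("@", 1)[-1].lower()


def blocked_by_domain(address):
    """Coarse rule: reject everyone at a listed domain."""
    return domain_of(address) in DOMAIN_BLACKLIST


def blocked_by_address(address):
    """Finer rule: reject only the listed address."""
    return address.lower() in ADDRESS_BLACKLIST


senders = ["offers@bulkmail.example", "alice@bulkmail.example"]
for sender in senders:
    print(sender, blocked_by_domain(sender), blocked_by_address(sender))
```

The domain-level rule blocks both senders, including the presumably innocent `alice`, while the address-level rule blocks only the listed one. The finer rule avoids that false positive but has to be updated for every new spamming address.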
So Google is using its "neural network" – a series of learning supercomputers designed to "think" and identify imagery – to detect spam and help close that remaining tenth of a percent of error.
This type of artificial intelligence grew out of a branch of machine learning known as "deep learning." These neural networks attempt to mimic higher-level thought and abstraction, and many see them as one of the roots for the development of artificial intelligence.
Google thinks this can stop junk. Instead of utilizing white- or blacklists to identify spam or ham e-mails, its neural network can use natural-language processing and information from other users to draw conclusions about the messages being analyzed.
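Google has not published the details of its network, so the following is only a minimal sketch of the general idea of scoring a message from its words rather than from its sender. It trains a single logistic unit, the simplest building block of a neural network and not a deep network, on a handful of made-up messages. The function names, training data, and settings are all assumptions.

```python
import math


def featurize(message, vocabulary):
    """Bag-of-words vector: 1.0 if the word appears in the message, else 0.0."""
    words = set(message.lower().split())
    return [1.0 if w in words else 0.0 for w in vocabulary]


def predict(weights, bias, features):
    """Sigmoid of the weighted sum: an estimated probability of spam."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, vocabulary, epochs=200, learning_rate=0.5):
    """Fit one logistic unit by stochastic gradient descent."""
    weights = [0.0] * len(vocabulary)
    bias = 0.0
    for _ in range(epochs):
        for message, label in examples:
            x = featurize(message, vocabulary)
            error = predict(weights, bias, x) - label
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


# Hypothetical labeled messages (1 = spam, 0 = ham), invented for illustration.
examples = [
    ("win a free prize now", 1),
    ("free prize inside click now", 1),
    ("meeting notes for monday", 0),
    ("lunch on monday", 0),
]
vocabulary = sorted({w for message, _ in examples for w in message.split()})
weights, bias = train(examples, vocabulary)

spam_like = predict(weights, bias, featurize("free prize today", vocabulary))
ham_like = predict(weights, bias, featurize("notes for monday lunch", vocabulary))
print(spam_like, ham_like)
```

Unlike a blacklist, this unit never looks at who sent the message. It weighs the words themselves, so a new sender using familiar spam vocabulary can still be scored as likely spam.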
But neural networks have their own problems, says Anselm Blumer, associate professor of computer science at Tufts University. To Dr. Blumer, these artificial "neural networks" approach learning from a perspective that is wholly different from how people actually think.