Leveraging Email Lists for Detecting Botnet IPs

Sunday, March 04, 2012

Gianluca Stringhini


Although spam volumes are at a historic low, spam remains a big problem for network and system administrators.

Since most spam today comes from botnets, spam mitigation research tends to blend with botnet detection. By detecting and shutting down a spamming botnet's infrastructure, researchers can reduce worldwide spam levels.

Usually, this can be achieved in two ways: 

  • By detecting and taking down the Command and Control infrastructure
  • By detecting machines as they get infected, and cleaning them up

The first approach requires reverse engineering the Command and Control protocol of a botnet and understanding which servers are critical to its infrastructure. This can be a very complicated task, especially for botnets that use multi-layer infrastructures or peer-to-peer schemes.

Once researchers have identified the critical parts of the botnet infrastructure, they can start mitigation steps (for example, sinkholing the DNS requests to the domains associated with those hosts, or asking the ISPs hosting them to take them down). This approach was followed by Microsoft during the Rustock takedown in 2011, as well as by our group at UC Santa Barbara during the attempted Cutwail takedown in 2010.

Although useful, this approach has two drawbacks. First, it depends heavily on the botnet being analyzed: the detection techniques developed to find the critical nodes of one botnet might not apply to a second one. Second, the effects of such takedowns tend to be ephemeral: it often doesn't take long for the botmasters to set up new servers and bring their botnet back up.

The second approach suffers from similar problems. The methods used to propagate vary from botnet to botnet, so there is no technique that can easily monitor machines as they get infected (at least, not from the network vantage point). Also, since the popular trend is to infect users by having them click on malicious email attachments, this is not an easy problem to solve.

In our research, we propose a third way of performing botnet mitigation. Instead of learning different features that allow us to identify and attack each botnet separately, we study how bots behave when sending spam. The intuition is that some behavioral characteristics are common across multiple botnets, and that they make it possible to distinguish bot-infected machines sending spam from legitimate users sending legitimate email.

As a first step in this direction, we developed a system called BotMagnifier. The system is explained in detail in this paper, which was published at the USENIX Security Symposium last August.

The idea behind BotMagnifier is that bots belonging to the same botnet share the same codebase and take orders from the same set of C&C servers. Based on this insight, it should be possible to detect bot-infected machines by learning the spamming behavior of a subset of known bots and then looking in a network traffic dataset for more machines (i.e., IP addresses) that behaved in the same way. In particular, the system looks for groups of bots that contact the same set of SMTP servers while spamming.

The reasoning is that, while the email templates and the bot IP addresses a botnet uses might change over time, the victim email lists spammers use to spread their malicious content stay reasonably constant.
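The matching step described above can be sketched in a few lines of Python. This is a simplified illustration and not the actual BotMagnifier algorithm from the paper: the function name `magnify`, the `(sender_ip, dest_server)` log format, and the `min_overlap` and `min_targets` thresholds are all assumptions made for the example.

```python
from collections import defaultdict

def magnify(seed_ips, transactions, min_overlap=0.8, min_targets=3):
    """Return IPs outside seed_ips that contacted most of the seeds' mail servers.

    transactions: iterable of (sender_ip, dest_server) pairs (hypothetical format).
    min_overlap, min_targets: illustrative thresholds, not values from the paper.
    """
    # Collect the set of destination mail servers contacted by each sender IP.
    dests_by_ip = defaultdict(set)
    for sender_ip, dest_server in transactions:
        dests_by_ip[sender_ip].add(dest_server)

    # Target set: every mail server contacted by at least one known bot.
    target_set = set()
    for ip in seed_ips:
        target_set |= dests_by_ip.get(ip, set())
    if len(target_set) < min_targets:
        return set()  # too little evidence to generalize from

    # Flag unknown IPs whose destinations cover enough of the target set.
    candidates = set()
    for ip, dests in dests_by_ip.items():
        if ip in seed_ips:
            continue
        overlap = len(dests & target_set) / len(target_set)
        if overlap >= min_overlap:
            candidates.add(ip)
    return candidates
```

A plain overlap ratio like this would also flag a large legitimate sender that happens to reach many of the same servers, so a real deployment would need additional checks before treating a candidate as a bot.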

Having an extensive list of bot-infected machines is useful for many purposes: it helps track the size of the world's largest spamming botnets, and ISPs can use it to clean up their networks by removing or sanitizing the infected machines.

In our experimental setup, we looked at the IP addresses that sent emails to our spam trap, grouped them by campaign (i.e., botnet), and learned the set of destinations they targeted by looking at the logs of our Spamhaus mirror.
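As a rough sketch of this setup, the two inputs can be combined as follows. The record formats and the function names `build_seed_pools` and `learn_target_sets` are hypothetical, and the campaign label is taken as given, since how emails are assigned to a campaign is not described here.

```python
from collections import defaultdict

def build_seed_pools(spamtrap_records):
    """Map each campaign label to the sender IPs seen at the spam trap.

    spamtrap_records: iterable of (sender_ip, campaign) pairs (assumed format).
    """
    pools = defaultdict(set)
    for sender_ip, campaign in spamtrap_records:
        pools[campaign].add(sender_ip)
    return dict(pools)

def learn_target_sets(seed_pools, transaction_log):
    """Map each campaign label to the destinations its seed IPs contacted.

    transaction_log: iterable of (sender_ip, dest_server) pairs (assumed format).
    """
    # Index the log by sender so each seed IP can be looked up directly.
    dests_by_ip = defaultdict(set)
    for sender_ip, dest_server in transaction_log:
        dests_by_ip[sender_ip].add(dest_server)

    # A campaign's target set is the union over all of its seed IPs.
    return {
        campaign: set().union(*(dests_by_ip.get(ip, set()) for ip in ips))
        for campaign, ips in seed_pools.items()
    }
```

The resulting per-campaign destination sets are what a later matching step would compare other IP addresses against.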

By tracking IP addresses over a period of four months, we were able to observe important events in the lifetime of large spamming botnets, such as takedowns. We were also able to track hundreds of thousands of IPs corresponding to infected machines.

The approach is not limited to the datasets we used. A network administrator could apply it to their own network to detect spamming machines in it.

This technique is hard for botmasters to evade. Even if they try to be stealthy, they still have to contact a large number of mail servers for their business to be profitable.

Gianluca Stringhini is a PhD candidate working as a research assistant at UC Santa Barbara. His research interests are Network Security, Botnets, and Spam Mitigation. You can follow him on Twitter at @gianlucaSB.
