Tuesday, May 26, 2009

Thwarting phishing on the cheap

If you've been following my blog then you know that phishing has been a real problem at my organization. Users are falling for phishing messages no matter how poorly written they are and we've suffered way too many intrusions because of it. Even though I fear that it is largely impractical, I feel like I have to do something to stem the tide. Since this is unquestionably the largest source of intrusions into our network, even an inefficient method of catching these might be better than nothing.

So the first question is: how can I prevent the phishing messages from getting to my users? That would be the most effective way to combat this. But we've got two Barracuda Spam firewalls already working on this problem. It is unlikely that I'm going to do a better job of blocking the phish messages on the way in. So I decided to focus on messages on the way out.

Here is what I'm trying out right now. This is far from a perfect solution and it is also not the finished product. This is my first toe in the water at using Snort to detect the responses to phishing messages.

First, I made a new ruletype called phishinghole. This is in my /etc/snort/snort.conf file.

ruletype phishinghole
{
    # The phishinghole rule type gathers up any alerts that could
    # be responses to phishing messages and keeps the tcpdumps
    # in one easy to read file
    type alert
    output alert_full: alert
    output alert_syslog: LOG_LOCAL4 LOG_ALERT
    output log_tcpdump: phishinghole.pcap
}

So when I write up a rule that uses this alert type, an event will be written to the normal alert file, it will also go to our syslog servers, and the packet itself will be written to a file called phishinghole.pcap. That way, when I want to inspect today's catch, I don't have to go through a huge packet capture file looking for just the packets that interest me.

I also want to be able to classify these events properly, so I added this one line to my classification.config file:
config classification: phishing-response,Possible Response to phishing message,5

Next up, I need to write some rules. I'm going to look for any traffic coming from my network and going to some other network on port 25. That's email. I'm going to use my new phishinghole ruletype, and I want to log the message that this might be the response to a phishing email:
phishinghole tcp $HOME_NET any -> !$HOME_NET 25 (msg:"Possible Phishing Response";

Next we need to define the content that we're going to look for. When I first started doing this, I came up with three regular expressions to look for possible permutations of password, username, and email. However, I decided that I didn't want to take the performance hit of running three regular expression searches against every single packet that leaves the organization. I decided instead to look for the word password and if that matches, then run the other two regular expressions. That should trim down the number of packets I have to look at.
content:"password"; nocase;

Now the magic is in the regular expressions. This is the Perl Compatible Regular Expression I'm using to search for username. It will match regardless of case, and whether or not the word is broken in two with a space or a dash: "user name", "user-name", "User Name", etc. The second regular expression looks for the word email in a similar fashion.
pcre:"/user[\-|\s]?name/i"; pcre:"/e?[\-|\s]?mail/i";
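
To see what these patterns will and won't match before loading them into Snort, the same logic can be approximated in Python. This is only a sketch: Python's re module is not PCRE (though patterns this simple behave the same in both), and the helper name looks_like_phish_reply and the sample strings are made up for illustration. It also mirrors the cheap-check-first idea, so the regular expressions only run when the literal word password is present.

```python
import re

# Same patterns as in the Snort rule; re.IGNORECASE stands in for the /i flag.
USERNAME_RE = re.compile(r"user[\-|\s]?name", re.IGNORECASE)
EMAIL_RE = re.compile(r"e?[\-|\s]?mail", re.IGNORECASE)


def looks_like_phish_reply(payload: str) -> bool:
    """Hypothetical helper approximating the rule's content + pcre checks."""
    # Cheap substring test first, like content:"password"; nocase;
    if "password" not in payload.lower():
        return False
    # Only now pay for the two regular expression searches.
    return bool(USERNAME_RE.search(payload)) and bool(EMAIL_RE.search(payload))


if __name__ == "__main__":
    samples = [
        "Username: jdoe Password: hunter2 E-mail: jdoe@example.edu",
        "User Name, e mail and PASSWORD are all in here",
        "user name and email but nothing else",
    ]
    for sample in samples:
        print(looks_like_phish_reply(sample), "-", sample)
```

Two quirks show up when playing with this. Inside a character class the | is a literal pipe rather than "or", so the first pattern also accepts user|name. And because the e is optional, the second pattern matches a bare "mail" (and therefore words like voicemail), which is one likely source of false positives.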

I still need to review these messages manually to see if someone really did respond to a phishing message or just told the guy to go to hell. So I want to gather a few extra packets to get as much context as reasonably possible. This next part of the rule tells Snort to record three additional packets in the conversation.
tag:session,3,packets;

I also want to make sure my rules are classified properly. I want these events to bubble up to the top of my priority stack, so I put this into the rule:
classtype:phishing-response;

And I ended the rule with a sid (today's date) and a revision number:
sid:20090525; rev:1;)

I repeated the process for three more reasonable permutations of the word password. I figure that a phisher can't get too crazy with the spelling or capitalization or the message will lose credibility. Here are the finished rules that I came up with.
phishinghole tcp $HOME_NET any -> !$HOME_NET 25 (msg:"Possible phishing response"; content:"password"; nocase; pcre:"/user[\-|\s]?name/i"; pcre:"/e?[\-|\s]?mail/i"; tag:session,3,packets; classtype:phishing-response; sid:20090525; rev:1;)

phishinghole tcp $HOME_NET any -> !$HOME_NET 25 (msg:"Possible phishing response"; content:"pass word"; nocase; pcre:"/user[\-|\s]?name/i"; pcre:"/e?[\-|\s]?mail/i"; tag:session,3,packets; classtype:phishing-response; sid:20090525; rev:2;)

phishinghole tcp $HOME_NET any -> !$HOME_NET 25 (msg:"Possible phishing response"; content:"pass-word"; nocase; pcre:"/user[\-|\s]?name/i"; pcre:"/e?[\-|\s]?mail/i"; tag:session,3,packets; classtype:phishing-response; sid:20090525; rev:3;)

Now for the analysis
Since these are .pcap files, it's tempting to open up Wireshark and start poking through them. That's what I did and it works just fine. However, after a couple of days I realized that I was getting way too many false positives, and I needed a new way to separate the wheat from the chaff. Since the only things in the .pcaps are email snippets, all of the data I need to sift through is plain text. So I ran the strings command against the pcaps to get a dump of all the text in the file. Then I talked to our system administrators to find out if I could get a plain text feed of active user accounts on the domain. Now I can do something like this to find out if I need to look at the file in more detail:
strings phishinghole.pcap | grep -f listofusernames

If I get any hits then I know I have an email that contains the words username, password, and email, plus a valid username on our network. Even if that turns out to be a false positive, it is worth investigating. It's also pretty easy to find the username in the .pcap file when you know which username you're looking for, so you can then see the message in context.
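
The same check can be sketched in Python, which might be worth trying if the grep gets slow with a long account list. This is a rough sketch under several assumptions: the account feed has one username per line, names are compared case-insensitively, and usernames are made of letters, digits, dots, underscores and hyphens. The function names and the script name in the usage comment are placeholders. Note that grep -f treats each line as a pattern and matches substrings, while this only reports whole-token matches, so it is a related but stricter test.

```python
import re
import sys

# Runs of 4+ printable ASCII bytes, roughly what strings(1) prints by default.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")
# Assumed username alphabet; adjust to the real account naming scheme.
TOKEN = re.compile(r"[A-Za-z0-9._-]+")


def load_usernames(lines):
    """Build a lowercase set from an iterable of username lines."""
    return {line.strip().lower() for line in lines if line.strip()}


def usernames_in_capture(data: bytes, usernames: set) -> set:
    """Return the known usernames that appear as whole tokens in the data."""
    found = set()
    for run in PRINTABLE_RUN.findall(data):
        for token in TOKEN.findall(run.decode("ascii")):
            # Drop punctuation stuck to the ends, e.g. "jdoe." at a sentence end.
            name = token.strip("._-").lower()
            if name in usernames:  # set lookup, independent of list size
                found.add(name)
    return found


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_users.py phishinghole.pcap listofusernames
    with open(sys.argv[2]) as fh:
        known = load_usernames(fh)
    with open(sys.argv[1], "rb") as fh:
        hits = usernames_in_capture(fh.read(), known)
    for name in sorted(hits):
        print(name)
```

Each word in the capture costs one set lookup no matter how many accounts are in the list, so the running time should depend mostly on the size of the capture rather than on the number of users.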

I'm going to let this run for a few more days and see if I'm satisfied with it. If I think I've got a winner here, I'll automate the process further.


Mike Kruzel said...

Are legitimate emails getting flagged? Any users complaining?

Black Fist said...

Users aren't complaining because I'm not actually blocking anything. Right now the idea is to see if I can identify the responses and take action on them manually before the potential intruder uses the credentials that were sent off.

Am I getting false positives? Oh yeah, lots of them. I mentioned that I was trying to grep the strings file against a list of active user accounts but that hasn't been working for me. With the number of user accounts that we have it seems that the grep takes forever, so I'm still having to resort to looking through the messages. I suspect that I will be scrapping this idea soon unless I have another idea on how to improve it.