Regular Expressions: Finding Email Addresses

Recently I fixed the Sendmail configuration on one of our boxes, and I’m now inundated with over 1700 bounce messages a day from old and expired email addresses. This is the second day, so I decided to combat it:

  1. Create a procmail script to redirect all the bounced mails into a file.
  2. Grab all the email addresses from that file and create SQL statements to disable sending mail to those users of our site.

With Google’s help, I came up with the following procmail recipe, and stuffed it into my .procmailrc:

:0:
* ^From: .*MAILER-DAEMON.*
* ^Subject:.*(Undeliverable|failure notice|Returned mail:|Delivery (Status )?Notification|Mail System Error|Delivery fail|Nondeliverable mail|Message status - undeliverable|Mail Delivery Problem|Notification d'état de la distribution).*
RETURNS

That should catch almost all the returns sent to my inbox.
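Before trusting the recipe with real mail, it can help to try the Subject pattern against a few sample subject lines. The following is only a rough sketch: is_bounce_subject is a hypothetical helper name, grep -E is used as a stand-in for procmail's egrep-style matching, and -i is there because procmail conditions ignore case by default. The two alternatives containing non-ASCII characters are left out to keep the sketch independent of locale settings.

```shell
#!/bin/sh
# Hypothetical helper (not part of procmail): succeeds when the given
# header line looks like a bounce subject.
# Assumption: grep -E is close enough to procmail's regex dialect for a
# quick check; -i ignores case, -q suppresses output.
is_bounce_subject() {
    printf '%s\n' "$1" | grep -Eiq '^Subject:.*(Undeliverable|failure notice|Returned mail:|Delivery (Status )?Notification|Mail System Error|Delivery fail|Nondeliverable mail|Mail Delivery Problem)'
}

# Example use:
#   is_bounce_subject 'Subject: Returned mail: see transcript for details' && echo bounce
```

A grep match is not a guarantee that procmail will behave the same way, so it is still worth watching the RETURNS file for the first day or so.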
I used the following code to extract the email addresses from the RETURNS file. It can probably be done better, but this works well enough.

awk -F '<' '/</ {print $2}' RETURNS | awk -F '>' '{print $1}' | sort | uniq
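Since the post is about regular expressions, here is one possible alternative that matches the addresses directly instead of splitting on angle brackets. Treat it as a sketch under assumptions: extract_addresses is a hypothetical function name, the pattern is a deliberately loose approximation of an email address rather than a full RFC 5322 matcher, and the -o option is assumed to be available (GNU and BSD grep have it, but it is not in POSIX).

```shell
#!/bin/sh
# Hypothetical helper: reads mail text on stdin and prints each
# address-like string once, sorted.
# Assumption: grep supports -o (print only the matching part of a line).
extract_addresses() {
    grep -oE '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}' \
        | sort -u
}

# Example use:
#   extract_addresses < RETURNS > addresses.txt
```

This also picks up anything else shaped like an address, such as the MAILER-DAEMON sender or Message-ID values, so the output still needs a pass to weed out bogus entries.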

I then grepped out bogus lines, such as the ones SMTP servers add, opened the file in vi and added SQL statements around each email address. I expect a lot less email in my inbox tomorrow.
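The hand-editing in vi could also be scripted. The sketch below assumes a cleaned file with one address per line; the function name addresses_to_sql and the table and column names (users, send_mail, email) are placeholders made up for illustration, not the real schema of the site.

```shell
#!/bin/sh
# Hypothetical helper: reads one address per line on stdin and prints
# one UPDATE statement per address.
# Assumption: users, send_mail and email are placeholder names.
addresses_to_sql() {
    # 1. drop empty lines
    # 2. double any single quote so it cannot end the SQL string early
    # 3. wrap the line ("&" is the whole matched line) in an UPDATE
    sed -e '/^$/d' \
        -e "s/'/''/g" \
        -e "s/.*/UPDATE users SET send_mail = 0 WHERE email = '&';/"
}

# Example use (addresses.txt is a hypothetical cleaned list):
#   addresses_to_sql < addresses.txt > disable.sql
```

Doubling single quotes covers standard SQL string quoting only, so it is worth reading through the generated file before running it against a live database.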

Here’s a handy online regex tester if you want to test a regular expression easily.


Published by

Donncha

Donncha Ó Caoimh is a software developer at Automattic and WordPress plugin developer. He posts photos at In Photos and can also be found on Google+ and Twitter.