Deploying DMARC Without Breaking Everything



Too scary? Messing with the configuration on your domain email is scary, especially if you're already sending a lot of it. You have to worry that you're going to screw something up and break all of the email communications for the entire company.

That's absolutely what I was worried about when I was rolling this out. There was already enough craziness going on at the company with the site transition and all of the phishing attempts. If I broke our email and people couldn't communicate with each other about their potential purchases, it was going to get ugly.

In 2012 there weren't a lot of tools to help with this process like there are today. Not many people had any idea how this stuff worked yet and I was dealing with a legacy system spread across 26 OpenBSD servers running Perl that hadn't been updated since Y2K (yes, really) alongside a dozens-of-dynos Rails site in a language that I was in the process of learning.

It was scary. One of the reasons I'm such a big advocate for DMARC today, though, is that it was so painless and so easy to roll out that there's no risk whatsoever. It's not overnight, but DMARC provides you with all of the tools that you need to be very careful and ensure that everything is set up like it should be before anything is actually enforced.

Last week, we explained what DMARC is and how it works. I strongly recommend reading it if you haven't already. This is the point that our guide is going to break into subsections.

Deploying DMARC varies depending on the size of your company as well as the number of servers and services you are using to send email, which are usually related. The smaller you are, the easier it is to set up and grow it as your system grows. If you're larger, it's more of a project but it still isn't going to require a lot of man-hours. It will require some calendar time, because you'll set up a configuration and wait a few days to gather information so that you can see what steps you need to take next. Rinse and repeat until you're confident everything is correct.

ANNOUNCEMENT: At BSides on October 28th, 2023 I announced the upcoming 2024 private beta of dmarcSTAR, a service that approaches DMARC in what I believe is the ideal manner after over a decade of experience with the specification. Read more at dmarcSTAR.com.

DMARC Deployment Goals

With any DMARC deployment, we have the same goal. We want a strictly enforced DMARC policy by the time we are done. When we start, we simply want to gather information safely until we are confident that an enforced policy can be used.

There are three policy types, referenced by the p= attribute of a DMARC record.

p=none;
We have a DMARC record, but we don't want the rules to be enforced at all.
p=quarantine;
Anything that doesn't pass DMARC should be sent to the user's spam folder. This is the lowest "enforced" setting and gives your support staff the ability to have customers check their spam folders if an email they're expecting is missing after turning it on. By the time we get here, you won't need it.
p=reject;
Don't even deliver messages that fail DMARC checks. They aren't from us.

The goal is to get to a p=reject; policy for every domain. A lot of people like the idea of sticking with p=quarantine; but this is still a dangerous place to be. A quarantined message is still delivered, so a user's email filter rules that move messages claiming to be from you into specific folders can pull a failing message out of the spam folder and sidestep the DMARC result. Additionally, once you get to p=reject; you'll find that it's easier to keep all of your email rules up to date because nothing new will work until it's been properly configured. This helps prevent Shadow IT too.

The goal is always to get to p=reject;.

Deployment: New Domain, No Email

Wait, why do I need DMARC with a new domain that isn't sending any email? Remember how we said that anyone can send email claiming to be from you? We meant it. Go and set up a simple SPF and DMARC record to disable all email for the domain until you're ready to use it. That way, you can ensure nobody else is using it for you in the meantime. This is the "Only you can prevent forest fires" of the email world. Listen to Smokey Mail.

Go to your DNS and add these two TXT records. Take a look at dig TXT slimmeryetimbers.com and dig TXT _dmarc.slimmeryetimbers.com as an example.

slimmeryetimbers.com.        100 IN TXT "v=spf1 -all"
_dmarc.slimmeryetimbers.com. 300 IN TXT "v=DMARC1; p=reject; aspf=s; adkim=s;"

The first sets an empty SPF record with strict enforcement, meaning no IP addresses are authorized to send email on behalf of this domain. The second sets a DMARC record with a reject policy that tells receiving mail servers to not even bother sending failing messages to spam and drop them entirely instead. The aspf and adkim settings put both into "strict" mode.
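If you have a pile of unused domains, typing these two records by hand gets old fast. Here's a minimal sketch in Python that prints the same pair of TXT records for any domain. The function name and the default TTL of 300 are my own choices for illustration, and you'd still need to paste the output into whatever DNS provider you use.

```python
def parked_domain_records(domain: str, ttl: int = 300) -> list[str]:
    """Return zone-file style TXT lines that disable email for an unused domain."""
    domain = domain.rstrip(".")
    return [
        # Empty SPF record: no IP address is authorized to send for this domain.
        f'{domain}. {ttl} IN TXT "v=spf1 -all"',
        # Reject policy with strict SPF and DKIM alignment.
        f'_dmarc.{domain}. {ttl} IN TXT "v=DMARC1; p=reject; aspf=s; adkim=s;"',
    ]


for line in parked_domain_records("slimmeryetimbers.com"):
    print(line)
```

Loop it over a list of parked domains and you have the records for all of them in one go.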

Do that on all your unused domains and you're doing your part to make the internet less spammy while protecting your own domain reputation.

Deployment: Startup, Few Mail Sources

Let's say Slimmer Ye Timbers is plundering the local gym market. We're sending emails out like cannonballs: class schedules and reminders, newsletters, billing information, company communications as we talk to suitors about opening more pirate-themed gym franchises! Aye, it's busy. How do we roll out DMARC without disrupting any of this?

There are three steps in the process: Recon, Implement, Enforce.

Recon

The very first thing we're going to do is publish a DMARC record. You're probably thinking, "Wait! I haven't done any preparation for this at all!?" and that's because you don't need to. A DMARC record with a p=none; policy will have no impact at all on your current email delivery. The one thing it will do is help you gain information about what is currently going on in your email system. Part of the specification allows you to set an email address that will receive daily aggregate activity reports from email servers that receive email claiming to be from your domain. These can come to you personally or you can have them sent to a service that can help visualize the reports for you.

When I started this back in 2012, there weren't any tools to help read these reports so I just read them directly. If you don't have a lot of email sources then this approach should be fine because the reports aren't very hard to read. Let's take a look at one of the reports now.

<record>  
  <row>  
    <source_ip>207.126.144.129</source_ip>  
    <count>237</count>  
    <policy_evaluated>  
      <disposition>none</disposition>  
    </policy_evaluated>  
  </row>  
  <identifiers>  
    <header_from>slimmeryetimbers.com</header_from>  
  </identifiers>  
  <auth_results>  
    <dkim>  
      <domain>slimmeryetimbers.com</domain>  
      <result>pass</result>  
      <human_result/>  
    </dkim>  
    <spf>  
      <domain>slimmeryetimbers.com</domain>  
      <result>pass</result>  
    </spf>  
  </auth_results>  
</record>

Every IP address that delivered email claiming to be from you gets its own <record> entry, identified by its <source_ip>. You can see the <count> field indicating how many emails were received from that particular IP address as well. This is why it doesn't matter how much email you're sending. Whether that number is 10 or 20,000,000, the size of the report doesn't change for you.

Next you'll see the <policy_evaluated> which gives a short summary of the state of your DMARC record at the time these entries were processed. We can see here that our <disposition> was none reflecting the p=none; policy in our DMARC record.

Next, you'll see the <header_from> which indicates the domain name found in the email that triggered this action.

Lastly, we'll see the <auth_results> which tell us whether our mail from this IP address passed the DMARC <dkim> and <spf> checks. We can see the <result> is pass for both here.
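If you'd rather not squint at XML, the fields described above are easy to pull out with a few lines of Python. This is only a sketch using the standard library: it assumes the report has no XML namespace and it only reads the first <dkim> and <spf> result in each record, while real reports can carry several. The function name and the sample wrapped in a <feedback> element are mine, for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<feedback>
  <record>
    <row>
      <source_ip>207.126.144.129</source_ip>
      <count>237</count>
      <policy_evaluated><disposition>none</disposition></policy_evaluated>
    </row>
    <identifiers><header_from>slimmeryetimbers.com</header_from></identifiers>
    <auth_results>
      <dkim><domain>slimmeryetimbers.com</domain><result>pass</result></dkim>
      <spf><domain>slimmeryetimbers.com</domain><result>pass</result></spf>
    </auth_results>
  </record>
</feedback>"""


def summarize_report(xml_text: str) -> list[dict]:
    """Return one small dictionary per <record> in an aggregate report."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter("record"):
        rows.append({
            "source_ip": record.findtext("row/source_ip"),
            "count": int(record.findtext("row/count", default="0")),
            "disposition": record.findtext("row/policy_evaluated/disposition"),
            # Search anywhere in the record so the wrapper element name doesn't matter.
            "header_from": record.findtext(".//header_from"),
            # Only the first dkim/spf result is read; a record may contain more.
            "dkim": record.findtext("auth_results/dkim/result"),
            "spf": record.findtext("auth_results/spf/result"),
        })
    return rows


for row in summarize_report(SAMPLE):
    print(row["source_ip"], row["count"], "dkim:", row["dkim"], "spf:", row["spf"])
```

One line per sending IP is usually all you need to see what's passing and what isn't.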

So, select the email address where you want to send reports, such as [email protected] and publish the following DMARC record (with your chosen email address instead). You'll publish it to your DNS as a TXT record on the _dmarc subdomain of your domain, so _dmarc.slimmeryetimbers.com in our example. RUA indicates where to send the "aggregate" reports, which summarize daily activity.

"v=DMARC1; p=none; rua=mailto:[email protected];"
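A typo in this record is easy to miss, so it can help to sanity check the string before you publish it. Below is a rough sketch that splits a record into its tags and checks the two that matter most. The helper name and the dmarc-reports@example.com address are placeholders I made up, and this is not a full validator for the specification.

```python
VALID_POLICIES = {"none", "quarantine", "reject"}


def parse_dmarc(record: str) -> dict[str, str]:
    """Split a DMARC TXT value into a tag -> value dictionary."""
    tags: dict[str, str] = {}
    for part in record.split(";"):
        key, sep, value = part.partition("=")
        if sep:  # skip empty pieces such as the one after a trailing ';'
            tags[key.strip().lower()] = value.strip()
    if tags.get("v") != "DMARC1":
        raise ValueError("missing v=DMARC1")
    if tags.get("p") not in VALID_POLICIES:
        raise ValueError("p= must be none, quarantine, or reject")
    return tags


# Placeholder address; substitute the mailbox you actually chose.
tags = parse_dmarc("v=DMARC1; p=none; rua=mailto:dmarc-reports@example.com;")
print(tags["p"], tags["rua"])
```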

Keep in mind, you will not get a report from every mail server that gets email claiming to be from your domain. More and more companies will provide DMARC reports though. There's a list of DMARC report providers with volume published by dmarcian* if you'd like to take a look to see which providers you can expect to see reports from. Some other enterprise providers, like Proofpoint, have the capability of producing reports but it's turned off by default.

Implement

Our goal with these reports is to discover all of the email sending sources that belong to us and work towards ensuring that both SPF and DKIM are passing for all of them. Now that we are collecting daily reports, we need to identify our various email sources. Let's look at our hypothetical email sources for Slimmer Ye Timbers.

Class schedules and reminders - Assume something like Sendgrid or Postmark
Newsletters - Usually Mailchimp or Constant Contact
Billing information - Probably Square or Clover
Company Email - Let's assume Google Workspace

Now, the first thing we want to do before we even bother trying to isolate each of these in our reports is to check with each provider to see if they have instructions for setting up SPF & DKIM. These technologies have been around for a while and DMARC is going on 10 years. Any business that provides an email service is going to be well aware of them, should have instructions somewhere and very likely already made you set it up without you even knowing about it.

Let's try to find the pages for all of the above providers by Googling "setup spf and dkim with ..."

Sendgrid - Explanation of SPF & DKIM which they automate
Postmark - DKIM | SPF takes some additional steps, which they explain here
Mailchimp - DKIM & SPF are handled through the authenticate domain process
Constant Contact - DKIM | SPF
Square - Couldn't find anything.
Clover - Couldn't find anything.
Google Workspace - SPF | DKIM

After some quick Googling, we found instructions for almost all of them. If no instructions are present, it's entirely possible that setup may not be necessary. For example, Square and Clover may opt to send payment receipts directly from their own system rather than from your domain. After a quick look in my email, I can verify that is the case as well by finding a receipt from a company using Square from [email protected]. We only need to setup SPF/DKIM/DMARC for mail that is sent from our domain.

Some services will make this optional as well. Constant Contact, for example, will send using their domain by default but encourages their customers to setup their own domain in order to build up their email reputation over time. If given the option, always set up your own domain.

Now, as we go and follow the instructions for these companies you will notice that you are creating both SPF and DKIM records.

In most cases, there will be multiple DKIM records and each record will have its own unique subdomain. Usually something like em1234._domainkey.slimmeryetimbers.com as a CNAME record or a TXT record. If you're provided with a CNAME record, it's because the provider will automatically handle DKIM key rotation for you. If you're provided with a TXT record, it's a key that you'll need to make a note to rotate periodically. If you can find an alternative provider that gives you a CNAME option, I would strongly recommend it so that you never have to think about this again.

For SPF, you may see records that appear to overlap by creating TXT records. Usually it will appear to be a complete SPF record that has some form of include:_spf.google.com indicating the provider domain name. The include record here is actually just like a CNAME and you can combine SPF records by just putting everything in the middle together. For example, if you were using both Protonmail and Google each one would have told you to create an SPF record like this:

# Protonmail
v=spf1 include:_spf.protonmail.ch ~all

# Google Workspace
v=spf1 include:_spf.google.com ~all

We don't want to create two SPF records on the same subdomain or it will cause an error when mail servers try to check. Your root domain slimmeryetimbers.com can only have one SPF record, so we need to combine these to use both services. To do that, we just combine the parts between the v=spf1 and the ~all, like so...

v=spf1 include:_spf.protonmail.ch include:_spf.google.com ~all
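The combining step is mechanical enough to script. Here's a small sketch that takes the records each provider gave you, keeps everything between v=spf1 and the trailing all, drops duplicates, and closes with a single qualifier. The function name is my own, and it assumes each input is a simple one-line record like the ones above.

```python
def merge_spf(*records: str, qualifier: str = "~all") -> str:
    """Combine several provider SPF records into one record."""
    mechanisms: list[str] = []
    for record in records:
        terms = record.split()
        if not terms or terms[0].lower() != "v=spf1":
            raise ValueError(f"not an SPF record: {record!r}")
        for term in terms[1:]:
            # Skip each record's own trailing "all" term (+all, ~all, -all, ?all).
            if term.lstrip("+-~?").lower() == "all":
                continue
            if term not in mechanisms:
                mechanisms.append(term)
    return " ".join(["v=spf1", *mechanisms, qualifier])


print(merge_spf(
    "v=spf1 include:_spf.protonmail.ch ~all",
    "v=spf1 include:_spf.google.com ~all",
))
```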

Each include lets that provider update the associated IP addresses for the service, just like the CNAME on our DKIM keys. If we keep adding includes to this record, it could also become invalid because SPF allows at most 10 DNS lookups per check. Ideally, each service we're sending email from should be on its own subdomain except for our primary corporate email. We saw our Square email earlier came from messaging.squareup.com and this is the right idea. Some transactional email providers (like Sendgrid, Mailgun, etc) will force this by only allowing you to use a subdomain. We'll talk about that more in our Enterprise section.
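To keep an eye on the lookup limit, you can at least count the terms in your own record that trigger a DNS lookup. The sketch below does only that. It does not resolve anything, so lookups nested inside a provider's include are not counted and the real total is often higher. The names are mine, for illustration.

```python
# SPF terms that cost a DNS lookup when a receiver evaluates the record.
LOOKUP_TERMS = {"include", "a", "mx", "ptr", "exists", "redirect"}


def count_direct_lookups(spf: str) -> int:
    """Count lookup-causing terms written directly in one SPF record."""
    count = 0
    for term in spf.split()[1:]:  # skip the leading v=spf1
        name = term.lstrip("+-~?").lower()
        # Strip any argument: "include:domain", "mx/24", "redirect=domain".
        for sep in (":", "=", "/"):
            name = name.split(sep, 1)[0]
        if name in LOOKUP_TERMS:
            count += 1
    return count


record = "v=spf1 include:_spf.protonmail.ch include:_spf.google.com ~all"
print(count_direct_lookups(record), "direct lookups (nested includes not counted)")
```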

One thing to remember when you're setting up your SPF record is what the [+,~,-]all at the end means. This is an indicator of how strictly to enforce the record itself and there are 3 options.

+all allows any IP address to pass your SPF check. You should never use this. Ever.
~all will softfail an IP address that doesn't match, which will flag the IP address as not passing but defer any suggestion of what to do.
-all will strictly fail an IP address that doesn't match and tell the mail server to discard it.

Out of all of these, you should only be using the ~all to softfail mismatches. As we've discussed, SPF doesn't survive forwarding because it changes the IP address. Strictly enforcing this rule with -all could result in legitimate messages not being delivered. With the softfail, the final decision will fall to our DMARC policy.

After making all of these changes, we're going to wait a couple of days for new DMARC reports and then review them to try to see if we can figure out whether everything that's supposed to be passing is actually passing. This is the point in the process where we hit a loop. No matter how many potential email sources you have, you're going to gather reports, make adjustments, then wait for new reports to see if they worked. Rinse and repeat this process until you feel confident that everything that should be passing is passing.

If you don't want to review the reports yourself and you would prefer to use some type of tool to help with the job, dmarcian* provides an excellent DMARC XML to Human Converter that you can upload reports to in order to try to make some sense of them. You can also sign up for an account, where they'll provide you with an email address to put in your DMARC record to have all of your reports sent there if you wish. You can also upload existing reports to your account.

It's possible that by reviewing reports, you'll find things that you didn't know about. It's also entirely possible that you'll find a lot of mail that isn't yours at all! Most people do. Sometimes as much as 80% of the traffic claiming to be from your domain isn't real at all.
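Spotting those surprises is mostly a matter of comparing the IP addresses in your reports against the ones you already know about. Here's a minimal sketch of that triage. It assumes you've already reduced each report record to a dictionary with source_ip and count keys, and the addresses shown are documentation-range placeholders, not real senders.

```python
from collections import Counter


def unknown_senders(rows: list[dict], known_ips: set[str]) -> list[tuple[str, int]]:
    """Total message counts per source IP that is not on the known list."""
    totals: Counter = Counter()
    for row in rows:
        if row["source_ip"] not in known_ips:
            totals[row["source_ip"]] += row["count"]
    # Largest senders first, so the biggest unknowns get looked at first.
    return totals.most_common()


# Hypothetical data for illustration only.
rows = [
    {"source_ip": "192.0.2.10", "count": 237},
    {"source_ip": "198.51.100.7", "count": 12},
    {"source_ip": "203.0.113.99", "count": 4100},
    {"source_ip": "198.51.100.7", "count": 30},
]
known = {"192.0.2.10"}

for ip, total in unknown_senders(rows, known):
    print(ip, total)
```

Anything that shows up here is either a sender you forgot about or someone pretending to be you, and the reports will help you tell which.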

If you discover IP addresses sending legitimate email traffic, usually a web server, you have a couple of options. You can either add the IP address to your SPF record and configure it with its own DKIM key or you can configure the server to send email through an email provider that you've already configured, like Sendgrid or Postmark. I'd strongly recommend the latter option, because it reduces your overall email footprint and gives you one less thing to keep up with. Most providers will have guides to configure server tools like Sendmail to deliver through their email servers. Websites built with Content Management Systems like WordPress usually have plugins available to allow you to deliver email through these services as well. With the state of email today, it's generally a best practice to avoid having individual servers sending email unless you're committed to maintaining them. Each IP address that sends will be tracked and given its own reputation score over time that is used by spam filters.

I discovered 5 of our 26 unpatched BSD servers were actually sending email thanks to these reports. After digging into the Perl code, I got a peek into the world of email from days gone by. These 5 servers were actively checking if they were being blacklisted and if so, would relay all of their email to a different server...until it got blacklisted. They apparently observed a pattern of blacklist removal after about 6 months. Because of the amount of user generated content (comments, forum posts, sales messages) that were being emailed from these servers, a great deal of it was flagged as spam.

We set the servers up to send through our Sendgrid account, which dropped them off of our DMARC reports permanently while ensuring their emails were compliant. We also helped the spam problem by using SpamAssassin to filter outgoing mail from these servers.

Enforce

Once you are confident that everything which should be passing in the reports is actually passing, it's time to enforce our rules. We also want to apply this enforcement very carefully, just in case despite all of our efforts, there was an unexpected problem. As it turns out, DMARC even makes this easy.

First, you're going to let your support staff know that you're going to begin enforcing the DMARC policy. Let them know that you don't expect anything to happen, but if they start hearing from customers who aren't seeing an email they expect to see, they should ask those customers to check their spam folder and then tell you immediately.

Next, we're going to enforce our policy by switching to the p=quarantine; setting...but only for a small percentage of our email. That's right, there's an optional DMARC argument that will let you request the policy only be enforced on a specific percentage of email which allows you to slowly ramp up the enforcement so that you can back off if you start hearing about problems. We'll start with 10%.

"v=DMARC1; p=quarantine; pct=10; rua=mailto:[email protected];"

And now we wait. Let it run this way for a day or two, see if anything is missing. Touch base with support to see if they have any reports of missing email that ended up in "Spam".

If there are no problems, we increase the percentage to 20%.

"v=DMARC1; p=quarantine; pct=20; rua=mailto:[email protected];"

And wait again. Still no problems? Move to 30%. Then 40%. Then 50%.

Once you hit 50% and you're still not experiencing any issues, you should feel comfortable that everything is working as expected. From here you can either keep slowly increasing the percentage per day or go ahead and jump straight to 100% by removing the pct entirely.

"v=DMARC1; p=quarantine; rua=mailto:[email protected];"

Now every message that doesn't pass our DMARC checks should be going directly to the spam folder. Let this simmer for a week or two. Check in with everybody you know to be sending email from different email tools just to make absolutely sure things are working as they should.

At this point, it's time to move to the final phase: p=reject;

"v=DMARC1; p=reject; rua=mailto:[email protected];"

Now, any mail that doesn't pass our checks but claims to be from us won't be delivered at all. This is where we want to be.
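If it helps to see the whole ramp in one place, here's a short sketch that prints each record in the order you'd publish it. The step sizes match the walkthrough above, the function name is my own, and dmarc-reports@example.com is a placeholder for whatever reporting address you chose. The waiting and the checking in with support between steps is still on you.

```python
def enforcement_ramp(rua: str, steps: tuple[int, ...] = (10, 20, 30, 40, 50)) -> list[str]:
    """Return the DMARC records to publish, one per stage, in order."""
    records = [
        f"v=DMARC1; p=quarantine; pct={pct}; rua=mailto:{rua};" for pct in steps
    ]
    # Dropping pct means the policy applies to 100% of failing mail.
    records.append(f"v=DMARC1; p=quarantine; rua=mailto:{rua};")
    # Final stage: refuse failing mail outright.
    records.append(f"v=DMARC1; p=reject; rua=mailto:{rua};")
    return records


for stage, record in enumerate(enforcement_ramp("dmarc-reports@example.com"), start=1):
    print(f"stage {stage}: {record}")
```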

p=Quarantine is Not Enough

Even more, with the reject policy in place no new tools that we set up to send email on our behalf will work until we've set them up properly with SPF and DKIM. That will help us make sure that nothing sneaks in unexpectedly.

How could that happen? Let's say you're growing so much that you hire someone to handle marketing. Without telling you, they might sign up for an email service that they've used at a previous job. If you have a p=quarantine; policy set up, this tool will still deliver successfully, while your new marketing person may just wonder why the messages are going to spam. They could even start asking customers to "add us to your contact list to get our emails!" and this is all bad advice. If the messages don't get through at all, they know something isn't set up right and must ask for help to get it set up correctly.

In a much larger enterprise this problem gets multiplied into what's called Shadow IT, where 3rd party services are set up without the knowledge or approval of your IT team. But with your p=reject; policy now in place, the long term maintenance of your email rules becomes a virtually automatic process. Because it has to...otherwise it won't work.

That was easy

All of that may seem like a lot of detail, but let's think about what was really involved.

Recon: We set up a p=none; record so we could collect reports
Implement: We reviewed the reports while we made sure everything we found was set up with aligned SPF and DKIM
Enforce: We slowly turned on our DMARC policy after we knew everything was set up correctly, just to make sure we didn't miss anything. Eventually, we reached full p=reject; enforcement!

Easy peasy!

DMARC in the Enterprise

Congratulations! We've reached the hilltop of DMARC for our startup! Continue in the enterprise guide next week to learn about some more advanced topics, including:

Politics of selling the DMARC project internally to different departments
Scale and subdomains
Overcoming the SPF 10 Lookup Limit permanently, without SPF Flattening
Shadow IT - Find and retake control
DMARC Project Management in the Enterprise
Adapting company policies for long term DMARC management
PII and Failure/Forensic Emails
Long term maintenance and monitoring

*Disclosure: I worked for dmarcian for 3 years, but have no current affiliation nor am I receiving any form of compensation for mentioning their tools in this article. I know the tools well and I know that the leadership of the company has played an active role in advancing DMARC adoption globally since its inception.


