This question is quickly becoming the hottest question in the channel, especially after the Kaseya incident. In this article, I’ll illustrate the kill chain of an RMM-based attack, then share some thoughts on attack surface reduction and on ditching the RMM altogether.
The Biggest Problem with RMM
By far, the biggest problem with our RMM tools is that they are, quite literally, RATs. There is little to no functional difference between an RMM and a Remote Access Trojan. Granted, we typically use our RMMs for different things than a threat actor would use a RAT for. However, the functionality is the same: it allows you to discreetly manipulate a system, remotely.
The catch is that we deploy this RAT, and then we tell our endpoint protection solution “not to worry about that little guy.” We ignore the “ltsvc” folder or the “kworking” folder or whatever folder your tool of choice uses. For now, this is a “necessary evil.” After all, we’ve just deployed a RAT. It does some sketchy things, even if the intent is good. All is well and good, until it’s not.
Hijacking the Good Guys
The term “living off the land” applies here. Threat actors will use tools already available on the system to fly under the radar of detection. To help here, we need to leverage attack surface reduction, and limit the land they can live off of. Here’s the problem! We’ve deployed an easy-to-use RAT that runs as SYSTEM, and we’ve told our security tools to ignore it. So we have, in effect, given away the whole castle. The security of your environments is 100% dependent on your and the vendor’s ability to secure access to the RAT.
Still following me? Let’s look at Kaseya. Kaseya got stuck in the unfortunate situation that every software vendor eventually does: there was a zero-day bug in their code. It’s very much an unfortunate circle-of-life conversation. Womp womp, score 1 for the bad guys. This is why we need robust vulnerability disclosure programs, so we have a higher chance of a good hacker finding our bugs for us. In this attack, REvil took advantage of the vast land we’d given away, hijacked the good tools, and used them for bad. Let’s look at how this works.
The RMM Attack Kill Chain
Let’s put Kaseya aside; zero-days are not super common. Let’s take the more common approach in our attack (and something that has come to fruition in multiple large-scale attacks): the forgotten “service account” with full admin access and no MFA. So getting in was pretty easy: we popped that credential. Let’s get into our RMM tool and have at it:
1. Disable Endpoint Protection. Yep, we’ve somehow integrated our RAT with our EPP. So we’ve given it carte blanche to turn it off, uninstall it, add exceptions, put it in detect mode, whatever. So I click two buttons and disable EPP across your org. But remember, I don’t really have to do this, because you’ve already excluded the working directory of the RMM tool.
2. Upload my Packages. Many RMMs have this incredibly handy feature where I can upload my files right to the RMM, so my endpoints can download them. Thanks RMM vendor! So, I upload my payload (def-not-ransomware.exe) to the file store. Now, when endpoints download this, DNS security and other controls (such as firewalls) will see business as usual. Why? Because machines connect to rmm.domkirby.com for legitimate purposes all day long. Moreover, I’m connecting over TLS, so anything that isn’t inspecting that traffic is blind to what is being downloaded (it never sees the /files/def-not-ransomware.exe path).
3. Create an Automation Task. The scripting available in many RMM platforms is quite incredible. For real, I used to write some wicked stuff with the help of friends. I automated so many things (more on the good later). I’m gonna use that same awesome tool to download def-not-ransomware.exe to your endpoints, in the excluded working directory, on the machines where I’ve disabled EPP.
4. Click Go. Now, I select all, select run. I don’t care about versions or dependencies or any of that. I want to hit EVERYTHING at once and whatever sticks, sticks.
That’s it, you’re pwned. More specifically, most of your clients are pwned. Once I gained access to the RMM, I used four steps to execute my payload. And I did it entirely within the confines of your RMM, so the machines have seen nothing but expected behavior thus far. By the way, def-not-ransomware.exe is such a lie of a filename. Here’s my bitcoin wallet. Pay up or I splatter your clients’ business all over somethingsomethingdarkside.onion.
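Stepping out of the attacker's chair for a second: those steps leave a very recognizable burst of activity from a single account. Here is a minimal Python sketch of spotting that burst, assuming your RMM lets you export an audit log. The action names (file_uploaded, task_created, task_run_all), the AuditEvent shape, and the one-hour window are all hypothetical stand-ins of mine, not any vendor's actual log format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical action names. Map them to whatever your RMM's audit log records.
# Disabling EPP is left out on purpose: with the working directory excluded,
# the attacker never has to do it.
CHAIN = ("file_uploaded", "task_created", "task_run_all")


@dataclass(frozen=True)
class AuditEvent:
    when: datetime
    account: str
    action: str


def find_kill_chains(events, window=timedelta(hours=1)):
    """Return accounts that performed every CHAIN action, in order, within `window`."""
    by_account = {}
    for ev in sorted(events, key=lambda e: e.when):
        by_account.setdefault(ev.account, []).append(ev)

    flagged = set()
    for account, evs in by_account.items():
        for i, first in enumerate(evs):
            if first.action != CHAIN[0]:
                continue
            step = 1  # index of the next CHAIN action we are waiting for
            for ev in evs[i + 1:]:
                if ev.when - first.when > window:
                    break  # too slow to be one burst; try the next starting point
                if ev.action == CHAIN[step]:
                    step += 1
                    if step == len(CHAIN):
                        flagged.add(account)
                        break
    return sorted(flagged)


# Made-up sample data: one account runs the whole chain in minutes,
# the other spreads similar actions across an afternoon.
t0 = datetime(2024, 1, 1, 9, 0)
events = [
    AuditEvent(t0, "svc-warranty", "file_uploaded"),
    AuditEvent(t0 + timedelta(minutes=3), "svc-warranty", "task_created"),
    AuditEvent(t0 + timedelta(minutes=4), "svc-warranty", "task_run_all"),
    AuditEvent(t0, "tech-amy", "file_uploaded"),
    AuditEvent(t0 + timedelta(minutes=5), "tech-amy", "task_created"),
    AuditEvent(t0 + timedelta(hours=3), "tech-amy", "task_run_all"),
]
suspects = find_kill_chains(events)
```

A legitimate tech will trip this now and then, so treat a hit as a reason to pick up the phone rather than as proof of compromise.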
Roll the Tape Back!
In that (admittedly simplified, but accurate) kill chain, where could you have stopped me? The most obvious answer is the credential part. GET RID OF THOSE SERVICE ACCOUNTS, or at least implement granular access control. That Warranty Master (or whatever) account does not need unlimited rights to your RMM. Let’s dig deeper though.
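To make the credential point concrete, here is a rough Python sketch of the kind of check I mean, assuming you can export your RMM accounts along with the permissions each one holds. The RmmAccount fields, the permission names, and the warranty-sync account are invented for illustration; your RMM will call these things something else.

```python
from dataclasses import dataclass, field


@dataclass
class RmmAccount:
    name: str
    is_service: bool
    mfa_enabled: bool
    granted: set = field(default_factory=set)  # permissions the account holds
    needed: set = field(default_factory=set)   # permissions its job actually requires


def audit_account(acct):
    """Return a list of plain-English findings for one account."""
    findings = []
    if not acct.mfa_enabled:
        findings.append("no MFA")
    excess = acct.granted - acct.needed
    if excess:
        findings.append("excess permissions: " + ", ".join(sorted(excess)))
    if acct.is_service and "admin" in acct.granted:
        findings.append("service account with full admin")
    return findings


# Hypothetical example: an integration account that only needs to read inventory.
warranty = RmmAccount(
    name="warranty-sync",
    is_service=True,
    mfa_enabled=False,
    granted={"admin", "read_inventory"},
    needed={"read_inventory"},
)
warranty_findings = audit_account(warranty)
```

The hard part is filling in the needed set honestly for each account. Once that is written down, the difference between granted and needed is your to-do list.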
First Order of Business – Reduce RMM Capabilities
Why in the world do I need pre-built tasks to turn off EPP? Even worse, why do I need a special plugin that lets me manipulate settings for everyone at once? I get it, it makes things a little more convenient. But I’m literally killing off a major defensive line here. Everything involved in administering EPP should happen within the EPP portal, period. The RMM should have no pre-programmed methodology to kill it (remember, though, that it runs as SYSTEM, so an educated threat actor will work around this pretty quickly).
Stop Ignoring your RMM
Like I said before, the difference between an RMM and a RAT is who deployed it and why. They are the same in pretty much every other way. If we’re going to deploy something like that to our machines, we at least need to watch it. Imagine if, in the Kaseya incident, EPP wasn’t able to be turned off (again a stretch, but just follow me down this rabbit hole). Imagine if the EPP was watching C:\kworking or whatever. Trusted, signed binaries could do as they please. Maybe we need to sign our PowerShell scripts; we can do that on the cheap. As soon as def-not-ransomware gets in there and tries to detonate its unsigned self and start calling crypto libraries and pinging C2, EPP is gonna say “OH HELL NO” (at least we hope so).
This is a harder battle, but vendors like ConnectWise are making progress in committing to getting rid of AV exclusions (zero exclusions, period, should be the goal). In the meantime, I say do two things:
1. Push on your RMM vendor. Call them up, tell them to find a way to stop requiring exclusions. It’s totally doable; enterprise endpoint management can do it just fine.
2. Reduce Exclusions. You DO NOT need to exclude entire working directories. Use certificate-level exceptions for binaries that trip up your EPP (this will require a little R&D work). Point being, entire directories should never be excluded. Anything that’s not signed by the specific certificate of the RMM vendor should be subject to scrutiny (you can also exclude your own code signing certs for PowerShell scripts and such). When you use RMM to download signed, safe packages, they should have no problem doing their job. As soon as def-not-ransomware comes along, your EPP at least has a chance to help.
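Here is a small Python sketch of the exclusion review described above, under the assumption that you can export your EPP exclusion list and that each entry is either path-based or certificate-based. The Exclusion type, the "path" and "certificate" labels, and the placeholder thumbprint strings are my own illustration and not any EPP vendor's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exclusion:
    kind: str   # "path" or "certificate" (hypothetical labels)
    value: str  # a folder or file path, or a signer certificate thumbprint


def review_exclusions(exclusions, trusted_thumbprints):
    """Split exclusions into ones to keep and ones to flag for removal."""
    keep, flag = [], []
    for ex in exclusions:
        if ex.kind == "certificate" and ex.value in trusted_thumbprints:
            keep.append(ex)
        else:
            flag.append(ex)  # any path exclusion, or a cert we don't recognize
    return keep, flag


def needs_scrutiny(signer_thumbprint, trusted_thumbprints):
    """A file skips scrutiny only if it is signed by a trusted certificate.

    Pass None for an unsigned file.
    """
    return signer_thumbprint not in trusted_thumbprints


# Placeholder thumbprints, not real certificate values.
TRUSTED = {"THUMBPRINT-RMM-VENDOR", "THUMBPRINT-OUR-CODE-SIGNING"}

current = [
    Exclusion("path", r"C:\kworking"),
    Exclusion("certificate", "THUMBPRINT-RMM-VENDOR"),
    Exclusion("certificate", "THUMBPRINT-UNKNOWN"),
]
keep, flag = review_exclusions(current, TRUSTED)
```

Run against the sample list, the whole-directory entry and the unrecognized certificate land in flag, and only the vendor's certificate survives. An unsigned def-not-ransomware.exe has no thumbprint at all, so needs_scrutiny(None, TRUSTED) comes back True.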
Beef up your Security Management Program
An RMM is a central, almost universal point of vulnerability in the MSP model. It just is; there’s nothing we can do about that fact. What we can do is manage that risk. Your cyber program needs to carefully document EVERY piece of the RMM. Who has access to what and why? What accounts exist for services? Are there MFA exemptions, and why? Can we centralize our identity and use a hardened identity with SSO to log in to the RMM? You need to have documented SOPs for managing and changing your RMM, and you need to follow them to the letter every single time. If you do this, you’ll at least know your risk and be able to mitigate it over time.
We’ve Hardened it, but can it Go Away?
This is a tough question I can’t answer for you. After all, when I ran an MSP, my RMM did a lot of work I was grateful for, for quite some time. However, I can share some key experiences from my own past.
Modern Workplace is an RMM Killer
For the clients we’d moved to the modern stack, RMM wasn’t doing much of anything. Seriously. It was just sitting there, not even looking pretty. Maybe throwing me the occasional useless stopped service alert that I’d forgotten to disable. Why? Intune did the work: it managed patching and configuration, and reported on device state and compliance. There was little to no on-premises metal for the RMM to help with. Everything I did with RMM, I moved elsewhere.
In one specific environment, we got rid of unattended access altogether. It was something the client asked about, and we decided together to experiment. If we needed to help a user, we used a one-time session. And you know what?! It worked freaking awesome. I didn’t have the risk of constant access to endpoints, but we were able to help people when they needed it. This is something I’d strongly consider making universal; it just eliminates one more angle of attack. Solutions like Control or LogMeIn Rescue make it very easy to point users to a site to enter a session code.
What’s more, all those fancy scripts I had to fix common Windows issues became obsolete. Devices became dumb. Windows having problems? Push a Reset from Intune, and 30 minutes later the machine is fully back with all its apps and the user’s files. So, what does the RMM even do at this point? Sit there in the corner like the awkward kid at the cool kids’ party (trust me I know, I was absolutely that kid)…
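If you're wondering why a one-time session is so much less scary than a standing agent, the idea fits in a few lines. This is a toy Python sketch of single-use, short-lived session codes. It is emphatically not how Control or LogMeIn Rescue implement theirs; the SessionCodes class, the six-digit format, and the ten-minute lifetime are assumptions I made up to show the concept.

```python
import secrets
import time


class SessionCodes:
    """In-memory store of single-use, short-lived support session codes."""

    def __init__(self, ttl_seconds=600):
        self.ttl = ttl_seconds
        self._codes = {}  # code -> expiry timestamp

    def issue(self, now=None):
        """Create a random six-digit code that expires after the TTL."""
        now = time.time() if now is None else now
        code = f"{secrets.randbelow(10**6):06d}"
        self._codes[code] = now + self.ttl
        return code

    def redeem(self, code, now=None):
        """Return True once for a valid, unexpired code; False otherwise."""
        now = time.time() if now is None else now
        expiry = self._codes.pop(code, None)  # pop makes the code single-use
        return expiry is not None and now <= expiry


store = SessionCodes(ttl_seconds=600)
code = store.issue()             # the tech reads this to the user
first_try = store.redeem(code)   # the user enters it and the session starts
second_try = store.redeem(code)  # the same code is now useless
```

The point is that there is nothing lying around to steal: a code that was never issued, was already used, or has expired gets you nowhere, which is a very different posture from an agent that is always listening.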
We managed our EPP (SentinelOne at the time) from the SentinelOne portal. We managed Microsoft systems via Microsoft. We managed identity in Azure AD and provisioned users in SaaS apps through there. When infrastructure was needed, we put it in private cloud (at the time). Everything was present and accounted for, and the RMM wasn’t really monitoring or managing anything.
Do your own Experimentation
I can’t give you the answers here, unfortunately. But, from my experience, being a cloud-native operation greatly reduced my need for RMM. Your mileage may vary, so you need to play with it. Build out a Modern Workplace test environment (or roll it out internally) and see how much work your RMM is really doing for you. With major update revamps in Windows 11, the consistent state configurations of Windows 365, and the mass exodus to browser-based work, you might just find that you can take the RMMless plunge as you modernize your clients. If you experiment and have results to share, please let me know! I’d be extremely curious to see how this goes for others.