Hi everyone. Heard this one?
Three engineers were riding in a car: a mechanical engineer, a chemical engineer, and a Microsoft software engineer. The car stalled, so they rolled it to the side of the road and got out to see what the problem might be.
The mechanical engineer popped the hood, looked in and said “Look. The drive belt is loose. All we have to do is tighten it up and the car will work just fine.”
The chemical engineer replied “No, that’s all wrong. The problem is fuel contamination. We have to drain the fuel, filter it, and then everything will be A-OK.”
The Microsoft software engineer told the other two “No, I’ve seen this problem before. We have to get back in the car, roll up the windows, get out, wait a minute then get back in, roll down the windows and then start the car. That should do it.”
I love that joke, and although we don’t really know for sure, I’m betting the damn car started. When I had to roll a stalled Windows Server 2008 R2 box running DFS to the side of the road, I considered this kind of fix. There were no errors to diagnose, just a pegged CPU with the DFSR service (dfsrs.exe) using 99% of the cycles and a partner server that was performing normally. Removing all of the DFS configuration, including partnerships and folder pairs, from the broken server didn’t help at all. The DFSR service would still peg the CPU with no configuration at all, while its partner had no symptoms. It was probably DFSR database corruption, and I wasn’t about to drill into that thing to find the one file that was jacking it all up.
I set up a temporary replication solution (PeerSync) and proceeded to rip out DFSR and all the entrails that came with it.
- Remove the DFS Role Service. DFS is part of the File Services role, but you can remove the role service by itself. In Server Manager -> Roles you’ll find Role Services listed. Click Distributed File System and Remove. The server will not need to be rebooted.
- Give yourself access to the DFSR folder. Microsoft would rather you not do this in Server 2008 and 2012, so this takes a couple of extra steps. The DFSR folder is in the hidden System Volume Information folder on each drive that you are replicating files from. Enable hidden folders and give yourself full control over all the DFSR folders.
- Kill the DFSR folder. You can’t delete this thing from Explorer; if you try, you’ll get Access Denied even though you have Full Control permissions. Open a command prompt and run the following command to delete the folder:
rd "c:\system volume information\dfsr" /s /q
/s = remove the folder along with all of its subfolders and files, /q = quiet (no confirmation prompt)
The command will remove the database and all configuration files that the DFS Role Service uses. Run this command on every drive that had folders replicating, substituting that drive’s letter for “c” in the command.
- Install the DFS Role Service. This will not require a reboot.
- Re-configure DFS with the partner servers and folder pairs. This will rebuild the DFSR database and config files.
- Validate DFS (sort of…) from the command prompt:
dfsrdiag replicationstate - shows the replication state (der.) of the partners
dfsrdiag pollad /verbose - shows if the server can poll AD successfully
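If you have more than a couple of drives to clean up, typing the rd command by hand for each one gets old. Here is a minimal Python sketch that only builds the command strings so you can eyeball them before pasting them into a command prompt; it does not delete anything. The function name dfsr_cleanup_commands and the drive letters in the example are my own placeholders, not part of any Microsoft tooling, and it assumes the DFSR folder sits directly under System Volume Information on each drive, as described in the steps above.

```python
from typing import Iterable, List


def dfsr_cleanup_commands(drive_letters: Iterable[str]) -> List[str]:
    """Build one 'rd' command per drive that hosted replicated folders.

    Hypothetical helper: it only returns strings and never runs them.
    """
    commands = []
    for raw in drive_letters:
        # Accept "c", "C:" or " d " and normalize to a single upper-case letter.
        letter = raw.strip().rstrip(":").upper()
        if len(letter) != 1 or not (letter.isascii() and letter.isalpha()):
            raise ValueError(f"not a drive letter: {raw!r}")
        # Raw f-string keeps the backslashes in the Windows path literal.
        commands.append(rf'rd "{letter}:\System Volume Information\DFSR" /s /q')
    return commands


# The two validation commands from the last step, kept here for convenience.
VALIDATION_COMMANDS = ["dfsrdiag replicationstate", "dfsrdiag pollad /verbose"]

if __name__ == "__main__":
    # Example drives only; substitute the drives that actually replicated.
    for command in dfsr_cleanup_commands(["C", "D"]):
        print(command)
    for command in VALIDATION_COMMANDS:
        print(command)
```

Keeping this as a dry run is deliberate: the rd command is unforgiving, so you want to read each generated line before running it, and the validation commands only make sense after the role service has been reinstalled and reconfigured.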
After getting back in the server and rolling the “windows” down, it started up fine. The CPU usage and replication have returned to normal. It isn’t an elegant solution, but DFS isn’t either, and if you can afford to do a rip and replace, it may save some time.