Now, I’m not sure when, but at some point copying files became a really big deal. For example, here’s Windows PowerShell. Brand new technology. You can ask for help, and it does have a Copy-Item command. So if you look at the help for that.
Now the problem with this is that it doesn’t really do anything that’s all that different from what Windows Explorer would do if you were just dragging the files from one place to another. In other words, it sets the same ACL as a brand new copied file would have, meaning the copy of the file is going to inherit its parent folder’s permissions. It’s not going to copy the ACL from the original file.
You know, in a lot of cases, when you’re migrating files this does you no good. You could, I suppose, switch over – or fall back – to good old xcopy. Xcopy can copy file ownership and ACL information. It’s got some problems, though. If you try to copy into an environment where the ACL doesn’t match up – Let’s say you had a file that had some local permissions assigned to it, and you’re copying to a different machine that doesn’t have that local account, or if you’re copying across domains and the SIDs don’t all match up, which they won’t, well this all falls apart.
Xcopy is also single-threaded, which means that it’s really, really slow if you’re migrating a lot of files. So I thought, “You know, this can’t be that hard.” So I jumped over, still using PowerShell here, and wrote myself a little script. It’s called Migrate-File, and here’s what it does: It accepts two inputs. One is InputObject, which can be one or more files. The other is TargetPath, and that’s the string for where you want to copy them to.
For each file you give this thing, it starts off by grabbing its access control list. It pipes the file to Get-ACL, and that resulting access control list is stored here. Then it goes ahead and copies the file, to whatever destination you specified, and it passes that file through to the Set-ACL cmdlet and puts that original ACL back onto it. So I ran this, and it occurred to me, I actually run into some of the same problems I had with xcopy.
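The script itself isn’t captured in this transcript, so here is a minimal sketch of what a function like that might look like, based only on the steps just described. The names Migrate-File, InputObject, and TargetPath come from the walkthrough; the parameter types, the attributes, and the local variable names are assumptions.

```powershell
function Migrate-File {
    [CmdletBinding()]
    param(
        # One or more files to migrate (the type is an assumption)
        [Parameter(Mandatory, ValueFromPipeline)]
        [System.IO.FileInfo[]]$InputObject,

        # Where the files should be copied to
        [Parameter(Mandatory)]
        [string]$TargetPath
    )
    process {
        foreach ($file in $InputObject) {
            # Grab the original file's access control list
            $acl = Get-Acl -LiteralPath $file.FullName

            # Copy the file, pass the new copy down the pipeline,
            # and put the original ACL back onto it
            Copy-Item -LiteralPath $file.FullName -Destination $TargetPath -PassThru |
                Set-Acl -AclObject $acl
        }
    }
}
```

You would call it with something like `Get-ChildItem C:\Source -File | Migrate-File -TargetPath D:\Target`, where both paths are just placeholders.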
For one, PowerShell is single-threaded, which means if I’ve got millions of files, then this is going to take millions of hours. If I’ve got access control entries that refer to accounts that don’t exist in the target environment, like in a cross-domain copy, or if I have local permissions, those aren’t going to carry over properly. There’s also no logging.
And I started to think, you know what? I guess I could start to build that, but you start thinking about all the things you need: You’ve got a log. You’ve got to watch for all the open files, because otherwise it’ll get an error because you can’t copy them, so you’ve got to track that error, log that information, maybe retry the file later. Then there’s getting this to run across multiple threads so it’ll perform a little better. All of those things just keep piling on, and it gets increasingly difficult.
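To give a feel for how quickly that grows, here is a rough sketch of just the retry-and-log piece for a single file. None of this is from the original script: the function name Copy-FileWithRetry, the log file location, the attempt count, and the delay are all hypothetical, and it doesn’t touch ACLs or threading at all.

```powershell
# Hypothetical helper: copy one file, retry if the copy fails (for example
# because the source is locked), and append each outcome to a text log.
function Copy-FileWithRetry {
    param(
        [Parameter(Mandatory)] [string]$Path,
        [Parameter(Mandatory)] [string]$Destination,
        [string]$LogPath = 'migrate.log',   # assumed log location
        [int]$MaxAttempts = 3,              # assumed number of attempts
        [int]$DelaySeconds = 5              # assumed wait between attempts
    )
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            # -ErrorAction Stop turns a failed copy into a catchable exception
            Copy-Item -LiteralPath $Path -Destination $Destination -ErrorAction Stop
            Add-Content -LiteralPath $LogPath -Value "$(Get-Date -Format s) OK   $Path (attempt $attempt)"
            return $true
        }
        catch {
            # Record the failure, then wait before trying again
            Add-Content -LiteralPath $LogPath -Value "$(Get-Date -Format s) FAIL $Path (attempt $attempt): $($_.Exception.Message)"
            if ($attempt -lt $MaxAttempts) { Start-Sleep -Seconds $DelaySeconds }
        }
    }
    return $false   # every attempt failed
}
```

And that still leaves the multithreading and the mismatched-account problem completely unsolved.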
I started to look at robocopy, which is not even a Resource Kit utility any more – it’s built right in. You’ve got robocopy, and at least it does a log, and I guess I could set it to retry. I started building out one of these command lines for robocopy, with all of these different options. It winds up being incredibly long and complicated. Still runs fairly slowly, although you can get it to run across multiple threads at least. There’s a bit for that.
So you have it recurse subdirectories. This is how you get it to copy files with security. You can attempt to fix security on files. You can have it attempt to skip files, so if a source file isn’t available you can work with it that way and have it skip the file and log it. You can do multithreaded copies.
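The on-screen command line isn’t in this transcript, so here is a rough sketch of that kind of robocopy call, run from PowerShell and using robocopy’s documented switches. The paths and the retry, wait, and thread values are made-up placeholders, not the ones from the demo.

```powershell
# Hypothetical paths; substitute your own source, destination, and log file.
$source      = 'D:\Shares\Data'
$destination = '\\NewServer\Data'
$logFile     = 'C:\Logs\migration.log'

$robocopyArgs = @(
    $source, $destination
    '/E'        # recurse subdirectories, including empty ones
    '/SEC'      # copy files with security (same as /COPY:DATS)
    '/SECFIX'   # fix file security on all files, even skipped ones
    '/R:2'      # retry a failed copy twice (assumed value) ...
    '/W:5'      # ... waiting 5 seconds between retries (assumed value)
    '/MT:16'    # multithreaded copy with 16 threads (assumed value)
    "/LOG:$logFile"   # write the results to a log file
)

# Splat the array so each element is passed as a separate argument
robocopy @robocopyArgs
```

Keeping the retry count low is one way to get the skip-and-log behavior: a file that stays unavailable is given up on after the retries and recorded as failed in the log.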
It starts to get really complicated. It can’t take care of, for example, the ACL information not matching, local accounts going across machines or domain accounts going across domains. It can’t fix that. It seems like you run into all these edge cases. With something like this, you could spend a couple of hours just getting the perfect command line written, and it works for 90% of your files, but you still have to spend this incredible amount of time on the other 10%.
Sometimes it makes sense to see if somebody’s built a tool that incorporates some of that, but this is what you run up against when you’re doing massive file migrations.