I have a 25TB dataset made up of hundreds of thousands of files, where about 20% of the files by count make up 80% of the total disk usage. I’ve set up an encrypted Google Drive remote which works really well for the large files (completely saturating my 40Mbps uplink using default settings), yet the small files don’t come anywhere close. I’ve tried a number of different settings for the options below, but no combination comes close to saturating my link so far;
Wondering if I am missing something or should be taking a different approach. Any suggestions?
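For reference, the sort of invocation I’ve been testing looks roughly like this (remote name, paths and flag values are illustrative, not my exact settings):

```
rclone copy D:\data gdrive-crypt:backup ^
  --transfers 16 ^
  --checkers 16 ^
  --drive-chunk-size 64M ^
  --fast-list ^
  --progress
```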
I am currently using a Windows 2012 R2 virtual machine with 4GB RAM (16GB available on the host). Rather than reinvent the wheel, I was just wondering if there were any well-regarded community-developed scripts that can be scheduled to back up & verify?
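To give an idea of what I’m after, a minimal sketch of a scheduled job (remote name and paths are placeholders; I’m assuming rclone cryptcheck is the right way to verify through the crypt layer):

```
@echo off
rem Rough sketch of a scheduled backup + verify job.
rem Remote name and paths are placeholders; assumes rclone.exe is on PATH.
set SRC=D:\data
set DST=gdrive-crypt:backup

rem Copy anything new or changed (rclone copy never deletes on the destination)
rclone copy "%SRC%" "%DST%" --log-file C:\rclone\backup.log --log-level INFO

rem Verify checksums through the crypt layer
rclone cryptcheck "%SRC%" "%DST%" --log-file C:\rclone\check.log --log-level INFO
```

That plus a Task Scheduler entry would probably cover it, but I’d rather use something more battle-tested if it exists.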
Oh…bummer. I was trying to get my head around all the rclone parameters and was way off the mark. I was wondering why uploads seemed to shoot off like a rocket and then stall. The 3 files per second limit would explain it.
I guess about the only thing you could do would be to TAR / ZIP files up into batches prior to uploading. I’m not sure how you’d go about doing this in an automated way that still guarantees data integrity. It should be doable though.
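Something along these lines is roughly what I was picturing, purely a sketch (assumes 7-Zip and rclone are on PATH; the directory layout, staging path and remote name are made up):

```
@echo off
rem Archive each top-level directory into one file, record a local hash,
rem then upload the archive so Drive sees a few big files instead of thousands of small ones.
for /D %%D in ("D:\data\*") do (
    7z a -t7z "D:\staging\%%~nxD.7z" "%%D\*"
    certutil -hashfile "D:\staging\%%~nxD.7z" SHA256 >> "D:\staging\hashes.txt"
    rclone copy "D:\staging\%%~nxD.7z" gdrive-crypt:archives
    del "D:\staging\%%~nxD.7z"
)
```

The obvious downside is that one changed file means re-uploading the whole archive, so it would only really make sense for directories that rarely change.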
Fair point. I have about 340,000 files in 25,000 directories to back up, so I’ll see how it goes over the next week or two.
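For scale, 25TB at a sustained 40Mbps works out to roughly 5 million seconds, so close to two months of continuous uploading; the first week or two should at least show how much the small files drag the average down.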
I noticed that when I kicked off rclone to start copying all my data, the progress stats only list about 10,000 files (10TB in total). Would there be any reason it isn’t calculating based on all the files/data in the source path I specified?
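For comparison, I assume something like this would walk the whole tree and report the full object count and total size up front:

```
rclone size D:\data
rclone size gdrive-crypt:backup
```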