Most efficient way to sync new files to an S3-compatible iDrive e2

What is the problem you are having with rclone?

I want to periodically (every day) sync the contents of a Thunderbolt-connected drive to an iDrive e2 bucket, copying only the new files.

I have tried the sync command, but it takes a lot of time to identify and copy the changes.

Run the command 'rclone version' and share the full output of the command.

rclone v1.59.2

  • os/version: darwin 12.6 (64 bit)
  • os/kernel: 21.6.0 (arm64)
  • os/type: darwin
  • os/arch: arm64
  • go/version: go1.18.6
  • go/linking: dynamic
  • go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

iDrive e2

The command you were trying to run (eg rclone copy /tmp remote:tmp)

./rclone sync /Volumes/Fotos/ e2:/rgomez-fotos-backup/ -P --exclude-from ~/Documents/exclusiones-rclone.txt --fast-list

The rclone config contents with secrets removed.

[e2]
type = s3
provider = IDrive
access_key_id = 
secret_access_key = 
endpoint = m6b7.da.idrivee2-33.com

A log from the command with the -vv flag

2022/09/28 18:23:17 DEBUG : rclone: Version "v1.59.2" starting with parameters ["./rclone" "sync" "/Volumes/Fotos/" "e2:/rgomez-fotos-backup/" "-P" "--exclude-from" "/Users/rodrigo/Documents/exclusiones-rclone.txt" "--fast-list" "-vv"]
2022/09/28 18:23:17 DEBUG : Creating backend with remote "/Volumes/Fotos/"
2022/09/28 18:23:17 DEBUG : Using config file from "/Users/rodrigo/.config/rclone/rclone.conf"
2022/09/28 18:23:17 DEBUG : Creating backend with remote "e2:/rgomez-fotos-backup/"
2022/09/28 18:23:17 DEBUG : fs cache: renaming cache item "e2:/rgomez-fotos-backup/" to be canonical "e2:rgomez-fotos-backup"
2022-09-28 18:23:17 DEBUG : $RECYCLE.BIN: Excluded
2022-09-28 18:23:17 DEBUG : .DS_Store: Excluded
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:      2m13.0s

In this last example, there are no changes to sync, but it will probably still take about the same 16 hours it took to back up the ~2TB and ~132,000 files on the drive the first time, just to traverse everything. At least that is what happened when I tried yesterday: it kept running until I came back to the computer and canceled the task.

These are just tests; I still don't have the real data that will be backed up to the cloud, but I want to see how rclone works and how I should create a script or something similar that could just be double-clicked to start syncing the new files on the drive to the cloud.

There will be around 500GB or maybe 1TB of new files to sync on some days. In the end, the total size of the data will be around 32TB, but existing files will not be modified; new files will just be created and added to the drive.

So, how should I set this up so that I could:

  1. plug the drive at night
  2. call rclone, preferably with a predefined script (roughly as sketched after this list), that would sync the drive with the cloud
  3. the syncing should be finished by morning so...
  4. I can unplug the drive and take it with me to write the new data.
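
Something as simple as this is roughly what I have in mind (just a sketch, not tested; the rclone path is a placeholder, the rest is from my tests):

#!/bin/bash
# sync-fotos.command - saved as an executable .command file so it can be double-clicked in Finder
# copies only the new files from the external drive to the iDrive e2 bucket
/path/to/rclone sync /Volumes/Fotos/ e2:/rgomez-fotos-backup/ -P --exclude-from ~/Documents/exclusiones-rclone.txt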

Thanks in advance.

hello and welcome to the forum,

that can be reduced:
on a daily basis, use --max-age=1d
once a week, or as needed, do a full sync without --max-age
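
for example, reusing your command (same paths and exclude file as in your post, adjust as needed):

./rclone sync /Volumes/Fotos/ e2:/rgomez-fotos-backup/ -P --exclude-from ~/Documents/exclusiones-rclone.txt --fast-list --max-age=1d

and for the weekly pass, the same command without --max-age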

Hi Rodrigo,

The files transferred before you cancelled the task will be compared and then skipped the next time you sync, so no worries.

You didn't say how much was synced when you cancelled, so let us assume it was 50% (1TB) in 16 hours. That would correspond to about 17 MB/s (1TB / 16h / 3600s), which is roughly 140-170 Mbps of upload bandwidth once protocol overhead is included.

This sounds reasonably fast, but may not fully saturate your connection. If not, then try doubling the number of concurrent transfers by adding --transfers=8 to your command. Note: You may hit the limitations of your external drive (or iDrive) before the limitations of your internet connection.

I suggest you add --log-file=photo_uploads.log -v to your command to keep track of the uploads. You may also want to add --progress --stats=3s to see the progress.
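
Putting that together, one possible variant of your command would be (same source, destination and exclude file as in your post; the log file name is just a suggestion):

./rclone sync /Volumes/Fotos/ e2:/rgomez-fotos-backup/ --exclude-from ~/Documents/exclusiones-rclone.txt --fast-list --transfers=8 --log-file=photo_uploads.log -v --progress --stats=3s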

Your log shows that it only took 2 minutes to identify the changes, so I would initially focus on the upload speed discussed above.

You may be able to make your sync quicker by running the checks concurrently and starting the transfers before the checks have finished. You can do this by removing --fast-list; you can find the detailed description in the docs. Note: I assume you have free API calls at iDrive.

You may be able to increase the speed of the checks by using more checkers, e.g. --checkers=16; this does not apply if you are using --fast-list.
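
So without --fast-list, the command could look something like this (only a sketch, tune the numbers to what your drive and connection can handle):

./rclone sync /Volumes/Fotos/ e2:/rgomez-fotos-backup/ --exclude-from ~/Documents/exclusiones-rclone.txt --checkers=16 --transfers=8 --log-file=photo_uploads.log -v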

If at some point the checks still take too long, then you may consider switching to the top-up approach, which may be quicker when the number of files to be transferred is considerably less than the number of files in the target. I cannot predict when, so consider it when you start to notice a considerable delay (e.g. 10 minutes) before the transfers start, or long pauses in your transfers where rclone only performs checks.

The top-up approach is explained in this post:

In your case the --max-age should be a duration of at least twice the maximum time between syncs (e.g. 240h), and the full syncs could then be done every week or month.
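
Roughly, the idea is something like this (sketch only; I assume the daily run uses copy with --max-age and --no-traverse as described in the linked post, plus a periodic full sync):

# daily top-up: only consider files modified in the last 240h
./rclone copy /Volumes/Fotos/ e2:/rgomez-fotos-backup/ --exclude-from ~/Documents/exclusiones-rclone.txt --max-age=240h --no-traverse
# weekly or monthly full pass to catch anything missed
./rclone sync /Volumes/Fotos/ e2:/rgomez-fotos-backup/ --exclude-from ~/Documents/exclusiones-rclone.txt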

Hello asdffdsa, Ole.

Thanks for the replies!

I did try yesterday running the following:

./rclone sync /Volumes/Fotos/ e2:/rgomez-fotos-backup/ -P --exclude-from ~/Documents/exclusiones-rclone.txt --fast-list --max-age=2d

And it's been running for almost 15 hours without finishing. There are no changes that pass that filter, so I don't know what's wrong or why it has not finished.

The first run, where there was nothing in the bucket, took 15h52m (or maybe a bit less) to copy the 2.082TiB. But the command did not finish by itself; I "had" to Ctrl+C it. I have a 500 Mbps upload speed, and it used maybe around 350-400 Mbps on average.

I don't know if that's normal. It seems to be the same as all the other times I have run it: it does not finish, whether there are changes or not. I thought that maybe that was by design, that it would keep running and searching for changes, but from what I have read here and in the docs that doesn't seem to be the case... so I don't know if there is something I am not doing correctly.

The files in this test are mostly DNGs, not too big, but the real data that will be uploaded will be recordings from an Arri camera, so the files will be bigger but fewer.

I'll try again, logging the run; right now I don't know whether it's doing something or not.

I do have the --progress flag, but not the --stats, I'll try also with that.

Thanks!

those flags are just for controlling console output; they have nothing to do with your issue.

to understand what rclone is doing, you need to use a log file
and maybe --dry-run: rclone will simulate the sync but will not copy any files.
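
something like (the log file name is just an example):

./rclone sync /Volumes/Fotos/ e2:/rgomez-fotos-backup/ --exclude-from ~/Documents/exclusiones-rclone.txt --max-age=2d --dry-run --log-file=rclone-test.log -vv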

I think I found the cause of the command never finishing: there was an error on the previous runs about a directory that could not be accessed: System Volume Information.

I added that to the exclusions and at least the copy command finished in 1m49s.
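
(The line I added to exclusiones-rclone.txt is along the lines of:

System Volume Information/**

in rclone filter syntax, to skip that directory and everything under it.)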

I'm trying again with the sync to see what happens. This is with the --max-age=2d flag, which would be perfect for the day-to-day uploads.

that makes sense, as by default rclone will retry the entire sync three times.
https://rclone.org/docs/#retries-int
"Retry the entire sync if it fails this many times it fails"

Yes, this was it.

This last sync run finished in 9m14s with nothing to transfer, which was expected. This was with 24 checkers.

I'll do more tests but I think this was the problem.

Thanks a lot!

That is pretty good; perhaps you can get 50 Mbps more by increasing --transfers. I guess --transfers=6 would be sufficient. You could test for other bottlenecks by trying speedtest.net. I would consider anything around 80% of the speedtest rating to be good, unless you want to spend a lot of time on optimization.

I recommend using a similar folder structure and file sizes to get a realistic picture; things sometimes change dramatically with the number of file creations and the number of folders and files.
