When running rclone to sync multi-GB data sets between high-performance file systems on Linux (Lustre / GPFS), it appears to use only 32 KiB read/write sizes (verified with strace). Can this be changed (via a parameter or in source) to something much larger, like 1 MiB? I tried --buffer-size, but it doesn't seem to affect the read/write size. I'm not sure this is really a primary use case for rclone, but the tool is generally desirable for what we are doing (users moving large data sets).
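For reference, this is roughly how I verified the I/O sizes; the paths are placeholders for our actual Lustre/GPFS mount points:

```shell
# Trace only read/write syscalls from rclone and its threads (-f),
# writing the trace to a file so it doesn't interleave with progress output.
strace -f -e trace=read,write -o rclone.strace \
  rclone copy /lustre/src /gpfs/dst

# Each line shows the requested buffer size as the third argument, e.g.
#   read(5, "..."..., 32768) = 32768
less rclone.strace
```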
Run the command 'rclone version' and share the full output of the command.
Using the rclone rpm in RedHat 8.4.
Which cloud storage system are you using? (eg Google Drive)
Local storage (GPFS / Lustre)
The command you were trying to run (eg rclone copy /tmp remote:tmp)
I built the latest version and it did help some. I actually didn't realize there was a post-copy checksum, and disabling that was a significant improvement. I also dropped multi-threaded transfers in favor of simply increasing --transfers for more file-level parallelism. Thanks!
Using 8 transfers/checkers and --ignore-checksum (to skip the post-copy verify, which I'm assuming shouldn't be necessary for a local copy) seems required to hit a stable 1.2 GiB/s. A few samples suggest a buffer size around 64M or more helps a small amount.
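Putting those flags together, the invocation I've settled on looks roughly like this (source/destination paths are placeholders for our file systems):

```shell
# Bulk copy tuned for file-level parallelism on local file systems.
rclone copy /lustre/projects/dataset /gpfs/projects/dataset \
  --transfers 8 \
  --checkers 8 \
  --ignore-checksum \
  --buffer-size 64M \
  --progress
```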
We will need to be able to re-sync data sets: we do an initial sync and then update periodically until we have a window to switch to the new area, so it's important that we don't unnecessarily recopy data. One quirk I have seen is that some files trigger a checksum even though the file is unchanged:
I'm not sure why this occurs. It may have something to do with differing storage technologies on the source and destination. Even if I add "--modify-window 1s" it still sometimes does the seemingly unnecessary read/checksum (I'm not sure whether it's checksumming the source or the destination).
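For the periodic re-sync passes, this is the sort of thing I've been experimenting with; whether --size-only is an acceptable comparison for our data is an assumption on my part (it skips modtime and checksum comparison entirely, so a changed file with an identical size would be missed):

```shell
# Incremental update pass: only copy files that look changed.
rclone sync /lustre/projects/dataset /gpfs/projects/dataset \
  --modify-window 1s \
  --size-only
# Note: --size-only makes --modify-window largely redundant, since
# modification times are no longer compared at all.
```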
My thought is to use rclone for the bulk transfers and then run a final rsync to sync up permissions, ACLs, etc.
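That final metadata pass might look something like this; the paths and the use of --size-only (to keep rsync from recopying data rclone already moved) are assumptions for illustration:

```shell
# Final pass: fix up ownership, permissions, ACLs (-A) and xattrs (-X)
# without retransferring file contents. Trailing slashes matter to rsync.
rsync -aAX --size-only /lustre/projects/dataset/ /gpfs/projects/dataset/
```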