I’m copying data from one remote (Google Drive) to another (AWS S3):
rclone copy --ignore-existing -v GoogleDriveRemote:Files S3Remote:MyBucket/Backup-Documents
My source Google Drive has a deeply nested folder structure, i.e. Folder(s) > Folder(s) > Folder(s) > File(s).
Before putting this into place, I tested the above command and got exactly what I needed, i.e. only new files were copied from source to destination, and the files already on the destination were left as they are.
But now, when I run this command a second or third time, it shows:
2017/06/19 02:01:00 INFO : xxxxxxxxx:Copied (new)
for files it has already copied last time.
Also, the speed is very slow, starting as low as 481 Bytes/s, so it is taking too much time. "Transferred" shows almost 100 GB, but only around 40 GB of files actually got copied.
I have gone through https://github.com/ncw/rclone/issues/517 but did not understand it thoroughly.
I’m running the rclone copy as a cron job on Ubuntu 14.04 64-bit on an EC2 instance.
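For reference, a cron job like this is typically set up via `crontab -e`. This is only a sketch of what such an entry might look like; the schedule, binary path, and log file here are assumptions, not details from the post:

```shell
# Hypothetical crontab entry: run the copy nightly at 02:00 and append
# all output (including rclone's -v logging) to a log file.
# Adjust the path to the rclone binary for your install (`which rclone`).
0 2 * * * /usr/bin/rclone copy --ignore-existing -v GoogleDriveRemote:Files S3Remote:MyBucket/Backup-Documents >> /var/log/rclone-backup.log 2>&1
```

Logging to a file makes it easier to see afterwards which files rclone reported as "Copied (new)" on each run.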
Check Google Drive for duplicates using rclone dedupe GoogleDriveRemote:Files - that is likely the problem.
Dedupe will let you fix the duplicates also - see the docs.
If you want it to go faster, try increasing --checkers. If you use --checksum or --size-only it will run much faster, as rclone doesn’t have to do an extra HTTP query against S3 to check the modtime of each object.
Also try the latest beta. If you have enough memory then you can use --fast-list which will save you on S3 transactions and may or may not be faster.
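Putting those suggestions together, the original command might look something like this; the `--checkers` value here is an illustrative guess, not a recommendation from the thread:

```shell
# Sketch: more parallel checkers, size-only comparison (skips the
# per-object modtime query on S3), and --fast-list to batch directory
# listings into fewer API calls at the cost of more memory.
rclone copy --ignore-existing --checkers 16 --size-only --fast-list -v \
    GoogleDriveRemote:Files S3Remote:MyBucket/Backup-Documents
```

Note that `--size-only` will skip re-copying a changed file whose size happens to be identical, so it trades a little safety for speed.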
Thank you @ncw for the reply. I cross-checked for duplicates, but there are none.
The newly noticed issue is that it is copying the same file multiple times. I confirmed this in S3: the file’s Last Modified time kept changing. I have attached a screenshot regarding this, please have a look.