RCLONE equivalent of cp -a?

Hello @ncw and other experts. I need to find a solution for a current cpu usage problem. Here's an overview-ish screenshot:

Currently, I use rclone on a seedbox to upload files to GDrive. To trick the software there into thinking those files are still present on the filesystem after a recursive rclone move command, I have another server that allows FUSE and uses rclone mount. For the longest time I've just been doing a "cp -a /mnt/rclonedir /mnt/fakedir" style copy (cp with --attributes-only) to copy the files without their content. The result is a perfect replica of near-zero size that I can rsync over to the seedbox (especially useful because of rsync's --delete, which handles files that have been removed). This has worked well, but now I want to add additional functionality to that meager VM. Is there an rclone-native way to recurse a remote and copy down the entire directory structure as empty files, similar to cp -a, without chewing up the processor?

You don't seem to be sorted on CPU so it's hard to tell.

The server is highly unresponsive, and the only process using more than a few percent is rclone according to htop. I mean is there a way to accomplish what I'm doing without the mount command?

I've looked at the docs but I keep copying the contents of the files too. A recursive touch?

rclone is pretty light on CPU most of the time, and I haven't seen it peg mine before. How many CPUs are you working with? Is it a small machine?

I normally just do a rclone move /local/path gcrypt:/files

Oh yes this is a single core $5/mo Vultr vm. 1 vcpu.

I'm just looking for a method to minimize my cpu usage across the board because I am going to add a couple other lightweight services to this vm.

You'd probably want to really limit transfers/checkers/buffer size, so something like 1/1/0M, and slowly turn it up using --use-mmap. That's a teeny tiny machine :slight_smile:

Pretend I have unlimited cpu and ram resources, but no FUSE system allowed. What would you recommend under those circumstances?

I'm confused, then, as to what your question is.

I just run rclone copy/move and upload my stuff with all the defaults as that works well for my use case.

I can't leave the files on the source drive.
I also can't use FUSE on the source drive.

I need the equivalent of what my automation server does:

  1. I have an rclone mount at /mnt/GCrypt
  2. I run a "cp -a" for each root folder in the mount that I want replicated, creating a replica directory structure (including blank files) in a destination we'll call /mnt/GDirClone. It looks like cp -a -r /mnt/GCrypt/{Movies,TV,Anime\ Movies,Anime\ TV,Cartoons}, but it's split into several commands that fork, and wait is used.
  3. I rsync --delete /mnt/GDirClone to the box where I can't leave files on the drive

This process fools Sonarr and Radarr into thinking those files are still there, so they no longer search for them. I've played all morning with find and cp and rsync and can't find a great way to do it outside of what I've already done, but the tiny box I am using seems to lose its rclone mount connection and the copy fails. I'm wondering if there is an rclone-only method to achieve this. I'm open to anything really, but I need an improvement on what I've already written. It's become insufficient in a low-resource environment.
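One possible mount-free way to get the same replica, as a minimal sketch rather than a tested recipe: rclone lsf -R prints every path under a remote, with directories ending in a slash, so the listing can be piped into a small loop that creates directories and empty files. The function name build_skeleton and the destination path in the usage comment are made up for illustration; the GCrypt: remote is the one from this thread. It does not remove stale entries (rsync --delete would still handle that downstream), and file names containing newlines would break it.

```shell
# build_skeleton DEST
# Reads "rclone lsf -R" style lines on stdin (directories end in "/")
# and recreates the tree under DEST using zero-size placeholder files.
build_skeleton() {
  bs_dest=$1
  while IFS= read -r bs_entry; do
    case "$bs_entry" in
      "") continue ;;                          # skip blank lines
      */) mkdir -p -- "$bs_dest/$bs_entry" ;;  # directory entry
      *)  mkdir -p -- "$(dirname -- "$bs_dest/$bs_entry")"
          # create an empty file, but never truncate one that already exists
          [ -e "$bs_dest/$bs_entry" ] || : > "$bs_dest/$bs_entry" ;;
    esac
  done
}

# Hypothetical usage (needs rclone and a configured GCrypt: remote):
#   rclone lsf -R --fast-list "GCrypt:Movies" | build_skeleton /mnt/Directory_Sync/Movies
```

Because this only needs a listing, it should run on a box without FUSE, which might make the separate automation server unnecessary.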

Might be worth figuring out why your mount is failing.

But about your question: why aren't you simply using "rclone sync" or move to create the directory structure?

I'm going to go over the full setup... Maybe my approach is inefficient and you guys can make some recommendations.

  • Seedbox - remote - Feral Hosting
    • 1 TB storage locally
    • Multi-tenant configuration .... however
      • Full access to all 32 CPU cores and 128GB RAM (multi-tenancy however)
    • Software
      • Sonarr
        • Transmission
      • Radarr
        • Deluge
      • rclone but no FUSE - security reasons
        • no rclone mounts

I've got a directory here on this seedbox that I've called post-processing that is a full 0-size replica of my library. Radarr and Sonarr move files into this folder structure into their appropriate locations and I run the following as an every 5 minute cron job:

[[ $(pgrep -fu $(whoami) 'rclone move') ]] || /media/sdah1/$(whoami)/bin/rclone move --transfers 10 --min-size 1k --size-only --fast-list --tpslimit 5 /media/sdah1/$(whoami)/private/post-processing/ GCrypt: &

To go over the parameters individually:

  • --transfers 10 - just a recommendation I've received on Plex forums
  • --min-size 1k - I am not trying to clobber my media files with empty files
  • --size-only - If a file of the same size exists, don't bother copying just remove locally, amiright?
  • --fast-list - just a recommendation I've received on Plex forums
  • --tpslimit 5 - I'm told this helps with GDrive Error 428/429 rate limiting

Now an rclone lsd GCrypt: looks like this:
-1 2019-01-06 03:10:58 -1 Anime Movies
-1 2019-01-06 03:21:26 -1 Anime TV
-1 2019-05-18 17:18:24 -1 Cache
-1 2019-01-06 03:11:37 -1 Cartoons
-1 2019-05-25 02:30:37 -1 DB_BAK
-1 2019-01-06 04:07:58 -1 Movies
-1 2019-01-06 03:18:47 -1 TV

Honestly I don't know what is with the Cache folder. I tried using rclone cache at one point, perhaps it's a remnant? I don't even know. DB_BAK is where Plex is storing its database backups every 3 days - we can totally ignore that for this conversation I hope.

On to server 2.

  • Automation box - remote - Vultr
    • 25GB storage locally
    • single-tenant w/ root access configuration .... however
      • 1 cpu core
      • 512 MB ram
    • Software
      • FUSE
      • rclone

I have a systemd service which looks like this:

ExecStart=/usr/bin/rclone mount GCrypt: /mnt/GCrypt --allow-other --read-only --buffer-size 256M --dir-cache-time 6h --drive-chunk-size 128M --vfs-read-chunk-size 128M --vfs-read-chunk-size-limit off --use-mmap
ExecStop=/bin/fusermount -uz /mnt/GCrypt


ls /mnt
Directory_Sync GCrypt

As I look at this I should definitely lower those chunk-size values (rubber duck debugging)... but... This is recursed with the following replication script. Obviously I'm forking the smaller libraries to run in parallel, and then waiting for those jobs to complete while the two large libraries run in parallel.

/bin/cp -n --recursive --attributes-only /mnt/GCrypt/Anime\ Movies/ /mnt/Directory_Sync/ &
/bin/cp -n --recursive --attributes-only /mnt/GCrypt/Anime\ TV/ /mnt/Directory_Sync/ &
/bin/cp -n --recursive --attributes-only /mnt/GCrypt/Cartoons/ /mnt/Directory_Sync/ &
/bin/cp -n --recursive --attributes-only /mnt/GCrypt/Movies/ /mnt/Directory_Sync/
/bin/cp -n --recursive --attributes-only /mnt/GCrypt/TV/ /mnt/Directory_Sync/
/usr/bin/rsync -a --delete --ignore-existing --ignore-errors /mnt/Directory_Sync/ mysupersecretusername@prometheus.feralhosting.com:private/post-processing/
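For reference, the fork-then-wait pattern described above can be sketched as follows. This is an illustration, not the poster's actual script: the function name copy_attrs_parallel is hypothetical, and it assumes GNU cp (for --attributes-only). The key point is that wait blocks until every backgrounded cp has exited, so a following rsync never sees a half-built tree.

```shell
# copy_attrs_parallel DEST SRC...
# Starts one attributes-only copy per source library in the background,
# then waits for all of them before returning.
copy_attrs_parallel() {
  cap_dest=$1
  shift
  for cap_lib in "$@"; do
    # --attributes-only (GNU cp) creates the files without copying their data
    cp -n --recursive --attributes-only "$cap_lib" "$cap_dest/" &
  done
  wait  # block until every background cp has finished
}

# Hypothetical usage, with the paths from this thread:
#   copy_attrs_parallel /mnt/Directory_Sync "/mnt/GCrypt/Anime Movies" "/mnt/GCrypt/Cartoons"
#   rsync -a --delete ... (only after the function returns)
```

On a 1-core, 512 MB machine, running the copies one at a time may well be kinder to the mount than forking them.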

Now that recursive copy is failing, and I'm immediately cutting those values down [but please pretend I'm ignorant of the RAM usage that is being incurred there]. Am I doing this the best way I can? Is there not a way to rclone move those files from the seedbox and immediately touch what used to be there, leaving a 0-size file of the same name? If I could do that, I could save $5/mo by not needing a separate box to run automation. I used to run this automation from a home server, but I'm in the middle of a cross-country move and my internet connectivity will be gone for some time.

The library being managed here is only around 35TB and I know rclone can handle SO much more than that. Your advice here would mean the world to me.

I lowered those values to 64 32 32 respectively and I still get disconnects - no forking this time. How do I troubleshoot this?

root@Automata:~/Automata/Feral_Utils# bash -v feral_directory_sync.sh 
/bin/cp -n --recursive --attributes-only /mnt/GCrypt/Anime\ Movies/ /mnt/Directory_Sync/
/bin/cp -n --recursive --attributes-only /mnt/GCrypt/Anime\ TV/ /mnt/Directory_Sync/
/bin/cp: cannot access '/mnt/GCrypt/Anime TV/Naruto Shippuden/Season 12': Transport endpoint is not connected
/bin/cp: cannot stat '/mnt/GCrypt/Anime TV/Naruto Shippuden/Season 13': Transport endpoint is not connected
/bin/cp: cannot stat '/mnt/GCrypt/Anime TV/Naruto Shippuden/Season 10': Transport endpoint is not connected
/bin/cp: cannot stat '/mnt/GCrypt/Anime TV/Naruto Shippuden/Season 11': Transport endpoint is not connected
/bin/cp: cannot stat '/mnt/GCrypt/Anime TV/Naruto Shippuden/Season 16': Transport endpoint is not connected

Enable debug logging (-vv --log-file /some/path)

While I look into troubleshooting the failures on my mini-instance, I still have to ask: is there a way to accomplish what I've outlined above from the box without FUSE installed? There has to be, I'm just not sure how to go about it.

With or without -vv --log-file log.txt I get the following. Let me preface this by saying that /home/travis doesn't exist anywhere on the filesystem or within any configuration or file that I can find. What gives?

runtime.(*mheap).alloc_m(0x208cb80, 0x1, 0x2a, 0x0)
        /home/travis/.gimme/versions/go1.12.4.linux.amd64/src/runtime/mheap.go:977 +0xc2 fp=0x7ffd98a66358 sp=0x7ffd98a66308 pc=0x4248c2
        /home/travis/.gimme/versions/go1.12.4.linux.amd64/src/runtime/mheap.go:1048 +0x4c fp=0x7ffd98a66390 sp=0x7ffd98a66358 pc=0x4588bc
runtime.(*mheap).alloc(0x208cb80, 0x1, 0x1002a, 0x0)
        /home/travis/.gimme/versions/go1.12.4.linux.amd64/src/runtime/mheap.go:1047 +0x8a fp=0x7ffd98a663e0 sp=0x7ffd98a66390 pc=0x424b9a
runtime.(*mcentral).grow(0x208d980, 0x0)
        /home/travis/.gimme/versions/go1.12.4.linux.amd64/src/runtime/mcentral.go:256 +0x95 fp=0x7ffd98a66428 sp=0x7ffd98a663e0 pc=0x417ad5
runtime.(*mcentral).cacheSpan(0x208d980, 0x7fe917426000)
        /home/travis/.gimme/versions/go1.12.4.linux.amd64/src/runtime/mcentral.go:106 +0x2ff fp=0x7ffd98a66488 sp=0x7ffd98a66428 pc=0x4175df
runtime.(*mcache).refill(0x7fe917426008, 0x2a)
        /home/travis/.gimme/versions/go1.12.4.linux.amd64/src/runtime/mcache.go:135 +0x86 fp=0x7ffd98a664a8 sp=0x7ffd98a66488 pc=0x417076
runtime.(*mcache).nextFree(0x7fe917426008, 0x208582a, 0x7fe917426008, 0x7fe917426000, 0x8)
        /home/travis/.gimme/versions/go1.12.4.linux.amd64/src/runtime/malloc.go:786 +0x88 fp=0x7ffd98a664e0 sp=0x7ffd98a664a8 pc=0x40b998
runtime.mallocgc(0x180, 0x136c360, 0x1, 0x20a5800)
        /home/travis/.gimme/versions/go1.12.4.linux.amd64/src/runtime/malloc.go:939 +0x76e fp=0x7ffd98a66580 sp=0x7ffd98a664e0 pc=0x40c2ae
runtime.newobject(0x136c360, 0x4000)
        /home/travis/.gimme/versions/go1.12.4.linux.amd64/src/runtime/malloc.go:1068 +0x38 fp=0x7ffd98a665b0 sp=0x7ffd98a66580 pc=0x40c6b8
runtime.malg(0xc09e00008000, 0x208f1f0)
        /home/travis/.gimme/versions/go1.12.4.linux.amd64/src/runtime/proc.go:3220 +0x31 fp=0x7ffd98a665f0 sp=0x7ffd98a665b0 pc=0x436ae1
        /home/travis/.gimme/versions/go1.12.4.linux.amd64/src/runtime/proc.go:618 +0xc2 fp=0x7ffd98a66628 sp=0x7ffd98a665f0 pc=0x430512
        /home/travis/.gimme/versions/go1.12.4.linux.amd64/src/runtime/proc.go:540 +0x74 fp=0x7ffd98a66680 sp=0x7ffd98a66628 pc=0x4301a4
runtime.rt0_go(0x7ffd98a666b8, 0x11, 0x7ffd98a666b8, 0x0, 0x0, 0x11, 0x7ffd98a66d78, 0x7ffd98a66d7f, 0x7ffd98a66d85, 0x7ffd98a66d8d, ...)
        /home/travis/.gimme/versions/go1.12.4.linux.amd64/src/runtime/asm_amd64.s:195 +0x11a fp=0x7ffd98a66688 sp=0x7ffd98a66680 pc=0x45aa3a

rclone --version and rclone --help - same output... Guess I'm reinstalling.

Appears my service didn't obey the systemctl stop and disable commands. Running now with -vv --log-file to see what the error is when it fails.

Hm. Are you using the latest stable or a dev build? Run the rclone version command to check. (Ignore the Travis part; that's just the build machine's path.) That may be a bug. I'd open an issue with the log and stack trace.

I'm logging it right now. I think it's just too intensive for the little box.

Which is why I'm back to wondering.... how, after an rclone move, can I replace the file that was removed with the same filename, 0 size. Touch after move, if you will.
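A "touch after move" can be approximated with a wrapper around the move itself. The following is a minimal sketch under stated assumptions, not a built-in rclone feature: it records every file under the source before the move, runs whatever command it is given, and then recreates any file that has disappeared as a 0-size placeholder. The name stub_after_move is hypothetical. Files that failed to upload stay in place and are left alone, and the existing empty placeholders are untouched because --min-size 1k keeps rclone from moving them. Files that arrive while the move is running are not in the snapshot, so they would only get a stub on a later pass if they are still present then.

```shell
# stub_after_move SRC CMD [ARGS...]
# Snapshots the file list under SRC, runs CMD, then replaces every file
# that vanished during CMD with an empty file of the same name.
stub_after_move() {
  sam_src=$1
  shift
  sam_list=$(mktemp) || return 1
  # paths are recorded relative to SRC
  (cd "$sam_src" && find . -type f -print) > "$sam_list"
  "$@"                      # e.g. the rclone move command line
  sam_status=$?
  while IFS= read -r sam_rel; do
    if [ ! -e "$sam_src/$sam_rel" ]; then
      mkdir -p -- "$(dirname -- "$sam_src/$sam_rel")"
      : > "$sam_src/$sam_rel"   # leave a 0-size placeholder behind
    fi
  done < "$sam_list"
  rm -f -- "$sam_list"
  return "$sam_status"      # pass the move's exit status through
}

# Hypothetical usage with the flags from this thread:
#   src=/media/sdah1/$(whoami)/private/post-processing
#   stub_after_move "$src" rclone move --min-size 1k --size-only "$src" GCrypt:
```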