Rclone lsf on Pcloud taking so long

What is the problem you are having with rclone?

Rclone lsf on Pcloud is taking extremely long to list a large directory.

Run the command 'rclone version' and share the full output of the command.

rclone v1.66.0

  • os/version: debian bookworm/sid (64 bit)
  • os/kernel: 6.5.0-28-generic (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.22.1
  • go/linking: static
  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Pcloud

The command you were trying to run (eg rclone copy /tmp remote:tmp)


rclone lsf --hash "SHA1" --files-only --recursive --csv --checkers=8 --fast-list --links --human-readable --ignore-errors --ignore-case --checksum --low-level-retries=100 --retries=3 --tpslimit 100 --verbose=2 --log-file "~/Downloads/Rclone/rclone-lsf-log-file.log" "PcloudChunker:/LARGE-DIRECTORY" > ~/Downloads/Rclone/remote-metadata.csv

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.


[Pcloud]
type = pcloud
client_id = XXX
client_secret = XXX
hostname = eapi.pcloud.com
username = XXX
password = XXX
token = XXX

[PcloudChunker]
type = chunker
remote = Pcloud:
description = Chunker for my remote Pcloud, each chunk will be of the size specified in chunk_size parameter in rclone.conf
chunk_size = 1Gi
hash_type = sha1

### Double check the config for sensitive info before posting publicly


A log from the command that you were trying to run with the -vv flag


2024/04/26 17:11:09 INFO  : Starting transaction limiter: max 100 transactions/s with burst 1
2024/04/26 17:11:09 DEBUG : rclone: Version "v1.66.0" starting with parameters ["rclone" "lsf" "--hash" "SHA1" "--files-only" "--recursive" "--csv" "--checkers=8" "--fast-list" "--links" "--human-readable" "--ignore-errors" "--ignore-case" "--checksum" "--low-level-retries=100" "--retries=3" "--tpslimit" "100" "--verbose=2" "--log-file" "/home/laweitech/Downloads/Rclone/rclone-lsf-log-file.log" "--config" "/home/laweitech/.config/rclone/rclone.conf" "PcloudChunker:/Z-LARGE-DIRECTORY"]
2024/04/26 17:11:09 DEBUG : Creating backend with remote "PcloudChunker:/Z-LARGE-DIRECTORY"

2024/04/26 17:11:16 DEBUG : Using config file from "/home/laweitech/.config/rclone/rclone.conf"
2024/04/26 17:11:16 DEBUG : Creating backend with remote "Pcloud:/Z-LARGE-DIRECTORY"
2024/04/26 17:11:17 DEBUG : fs cache: renaming cache item "Pcloud:/Z-LARGE-DIRECTORY" to be canonical "Pcloud:Z-LARGE-DIRECTORY"
2024/04/26 17:11:17 DEBUG : Reset feature "ListR"
2024/04/26 17:11:18 DEBUG : 5 go routines active

The directory 'Pcloud:Z-LARGE-DIRECTORY' contains 1,348,080 items, totalling 415.0 GB

I had it run for 12 hours 56 minutes and ~/Downloads/Rclone/remote-metadata.csv contains 54,280 records.
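
For scale: 54,280 records in 12 hours 56 minutes is roughly 1.2 files per second, so at that rate the full 1,348,080 items would take on the order of 320 hours, i.e. around two weeks.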

How do I get it to run faster? Should I increase the checkers, e.g. --checkers=8000?

welcome to the forum,

pcloud is known to be very slow.
one of the reasons is that ListR is not supported.

Thank you for the response. Will increasing

--checkers=8000

help in this situation?

rclone lsf is for listing files, so not sure what you want to check?

Alright, well noted: so with rclone lsf, --checkers is not used.

So I basically have to wait, since --fast-list does not help because Pcloud does not support ListR.

Is my understanding right?

yes, that is my understanding; it can be seen in the rclone docs:
https://rclone.org/overview/#optional-features
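
if you want to check it yourself, something like rclone backend features Pcloud: should dump the optional features of the backend, and your own debug log already shows rclone resetting the ListR feature for this run.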

also, given how slow pcloud is, to make it even worse, you are using chunker, which creates many, many small files, so listing takes longer.
is there a reason to use chunker with pcloud?

i would try a much simpler command, without
--checkers=8 --fast-list --ignore-errors --checksum --low-level-retries=100 --retries=3 --tpslimit 100

and with --tpslimit 100 you limit its speed explicitly... any reason to do that?

You should try the same command without all your flags - many of them have no meaning for lsf. Did you choose them randomly?

So run:

rclone lsf --hash "SHA1" --files-only --recursive --csv "PcloudChunker:LARGE-DIRECTORY"

and see how it goes.

rclone lsf --hash "SHA1"
does that actually output the hash?

perhaps you want
rclone lsf --hash=SHA1 --format=ph
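
with --format=ph each output line is the path and the hash; by default lsf separates the fields with ';', or keep your --csv flag if you want comma-separated output for your database.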

fwiw, for testing, pick a smaller directory

@asdffdsa @kapitainsky Thank you guys, your responses are very helpful.

To answer your questions:

I built a bash script using rclone to handle my backups.

default_flags="--checkers=8 \
--links \
--human-readable \
--config "$config_folder/rclone.conf" \
--ignore-errors \
--ignore-case \
--checksum \
--low-level-retries=100 \
--retries=3 \
--tpslimit 100"

I had defined these flags as defaults for my rclone invocations such as rclone sync and rclone copy.

Firstly:

> also, given how slow pcloud is, to make it even worse, you are using chunker, which creates many, many small files, so listing takes longer.
> is there a reason to use chunker with pcloud?

The reason is that "PcloudChunker:/Z-LARGE-DIRECTORY" contains some very large files, some as big as 29GB. If the connection fails during an upload, the upload is not resumed but restarted from scratch, wasting my internet data bundle. So I decided to chunk files larger than 1GB.

using these commands:

full_backup_source='/home/laweitech/Z-LARGE-DIRECTORY'
full_backup_destination='Pcloud:/Z-LARGE-DIRECTORY'
full_backup_destination_chunked='PcloudChunker:/Z-LARGE-DIRECTORY'

chunking_min_size="1G"
larger_files_only_transfer_flags="--min-size $chunking_min_size --transfers=1"
smaller_files_only_transfer_flags="--max-size $chunking_min_size --transfers=4"

# upload the smaller (non-chunked) files straight to the plain remote
rclone copy $default_flags $smaller_files_only_transfer_flags $full_backup_source $full_backup_destination

# larger files go through the chunker remote
rclone sync $default_flags $larger_files_only_transfer_flags $full_backup_source $full_backup_destination_chunked

I successfully backed up the whole folder /home/laweitech/Z-LARGE-DIRECTORY to Pcloud, but it took very long.

Also, I faced rate-limiting issues during the copy and sync operations, so I did a lot of research and ended up using the flags --checkers=5000 --fast-list --ignore-errors --checksum --low-level-retries=100 --retries=3 --tpslimit 100. I added all of these to make sure the backup completes successfully, and that if it fails it retries until it succeeds.

The issue with my approach was that every run had to go through the whole lengthy process again, which takes about 2-3 days to complete.

So I decided to upgrade the script. This time around the script will go through these steps:

  • create DB of local files metadata
  • create DB of remote files metadata (using rclone lsf $default_flags --hash "SHA1" --files-only --recursive --csv --fast-list "PcloudChunker:/LARGE-DIRECTORY" > ~/Downloads/Rclone/remote-metadata.csv)

Backup process

  • Check DB for files to copy -> copy list
  • Check DB for files to sync -> sync list
  • Check DB for files to delete -> to-be-deleted list

The idea is that most of the work will be done locally on my computer, and the script will only contact the remote when it needs to build the remote file metadata database or upload, move, copy or delete files.
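
As a rough sketch of the comparison step (assuming both listings are produced with --format "ph" --csv so the columns are path,hash; local-metadata.csv and the /tmp file names are just placeholders):

#!/usr/bin/env bash
# diff a local and a remote listing, both made with
#   rclone lsf --hash SHA1 --format "ph" --csv --files-only --recursive ...
# assumes the columns are path,hash and that paths contain no commas

local_csv=~/Downloads/Rclone/local-metadata.csv
remote_csv=~/Downloads/Rclone/remote-metadata.csv

# sort both listings by path so they can be compared line by line
sort -t, -k1,1 "$local_csv"  > /tmp/local.sorted
sort -t, -k1,1 "$remote_csv" > /tmp/remote.sorted

# paths that exist locally but not on the remote -> copy list
comm -23 <(cut -d, -f1 /tmp/local.sorted) <(cut -d, -f1 /tmp/remote.sorted) > /tmp/copy-list.txt

# paths that exist on the remote but not locally -> to-be-deleted list
comm -13 <(cut -d, -f1 /tmp/local.sorted) <(cut -d, -f1 /tmp/remote.sorted) > /tmp/delete-list.txt

# paths on both sides whose hashes differ -> sync list
join -t, -j 1 /tmp/local.sorted /tmp/remote.sorted | awk -F, '$2 != $3 {print $1}' > /tmp/sync-list.txt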

To list the files I had to go through the chunker remote, otherwise the chunk files would be treated as individual files instead of being reassembled into their composite files.

So based on your responses, I have to remove a lot of my default flags from rclone lsf.

Also, with rclone copy and rclone sync I used the flag --checkers=5000. I don't really know if that helped speed things up in any way, but rclone used a lot of RAM.

So currently the script is at the 'create DB of remote files metadata' step using rclone lsf, which is taking a long time to complete.

The idea is that this should be better than a plain rclone copy and rclone sync of "PcloudChunker:/LARGE-DIRECTORY", which take a long time because they check each and every file against Pcloud, which is a lengthy process.

With this, the check is done locally, and I use rclone sync --files-from ${sync_files_from_file} to upload the files.
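
Roughly like this, keeping my default flags, with the ${sync_files_from_file} list coming out of the local comparison step:

rclone sync $default_flags --files-from ${sync_files_from_file} /home/laweitech/Z-LARGE-DIRECTORY PcloudChunker:/Z-LARGE-DIRECTORY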

Thank you, yes I used rclone lsf --format 'hstp' --hash 'SHA1' and tested everything with a small directory before using it on my main directory.

you must have the worst internet on planet earth, to run such a complex setup.

not sure that is correct, don't you want rclone copy, not rclone sync?

the best way to do that is --max-age.
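
for example, something like this only considers files modified in the last day, and --no-traverse stops it from listing the whole destination:

rclone copy --max-age 24h --no-traverse /home/laweitech/Z-LARGE-DIRECTORY PcloudChunker:/Z-LARGE-DIRECTORY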

and i think this is a better solution
https://forum.rclone.org/t/recommendations-for-using-rclone-with-a-minio-10m-files/14472/4?u=asdffdsa

Yes, the internet is not great where I am, so I had to use this approach. Thank you for your help. For now I will just wait for rclone lsf to finish building the metadata, since that is where my issue lies, and see how things go from there.

I use rclone copy for the smaller non-chunked files first, and afterwards use rclone sync for the larger, chunked files. This prevents chunked files from being deleted, since sync also deletes files and folders that don't exist in the source folder. It works for my current setup, so it's fine.

ok, if you need more help, let us know.


Yes, I am using a similar algorithm to determine which files are to be deleted and copied.

It is just that rclone lsf is slow in my case, with Pcloud.

ok, good, just want to be sure you saw that other topic

@asdffdsa @kapitainsky

The idea of this approach is to make the backup process stateful, since rclone is stateless.

So after building the file databases, they will be used for subsequent backups and updated accordingly.

This will also help with a slow backend, in my case Pcloud.

yes, for that, i use veeam agent which creates snapshots. each backup is a single file, with just the changed blocks.
so if i have a 50GiB file on local, and i change a single byte, the snapshot is going to be very small.

whereas, rclone has to re-upload the entire 50GiB file.

normally, i do not use rclone for first-copy backups.
i use rclone to copy veeam and 7z files to cloud.

Nice. Will look into your approach, thank you.