Rclone w/ Mega via WebDAV - Tight Limits, Endless Retries, Crashing Megacmd

What is the problem you are having with rclone?

See below

Which cloud storage system are you using? (eg Google Drive)

Mega via WebDAV

The command you were trying to run (eg rclone copy /tmp remote:tmp)

 .\rclone.exe sync LOCAL_REMOTE MEGA_REMOTE_BEHIND_CRYPT -P --check-first

Hi, I have been using rclone for years and it's always been solid, so first I want to thank you for the project.

I don't know whether this is a bug or not; I'll leave that for the devs to decide. This is probably going to be a long post documenting my findings and how I worked out the deficiencies in rclone's default config as well as the quirks of the megacmd software. Granted, megacmd is somewhat poorly developed and maintained, but after going through the whole process I think the dev team at rclone might want to consider a fix.


Recently I decided to back up my data to Mega; I checked the recommended configs for Mega and posts on this forum. I wanted to put crypt on top of the Mega remote to encrypt all my uploads. One of the members here suggested that I use megacmd to expose the remote via WebDAV, then set that up as a WebDAV remote in rclone so that I could place it behind a crypt, which I did.
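For context, the megacmd side of the setup is just a one-liner. A minimal sketch, assuming a Mega folder called /backups (megacmd prints the actual local URL it serves, and that URL is what goes into the rclone webdav remote):

# inside the MEGAcmd shell -- serve a Mega folder over WebDAV
# (folder name is an example; the printed http://127.0.0.1:4443/... URL goes into rclone's webdav config)
webdav /backups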

When uploading to the WebDAV remote, the file is first cached in megacmd's local temp folder; once it is fully cached (shown as 100% transferred in rclone), megacmd takes over and uploads the whole file to the Mega servers. Only once that job is done does rclone get a positive response. This is the typical workflow, and although it sounds fine on paper, it cost me days to figure out what went wrong. More on that later.

So I first did a couple of syncs that mostly involved smaller files (a few hundred MB on average, the largest being ~3GB), using the command listed above with almost all settings at their defaults. I found that after a few hours the megacmd backend would crash, bringing down the WebDAV server and stopping the transfers as a result. I had to rerun the command multiple times to get my files uploaded, but it ended up working out.

Then I started a backup routine that involved much larger files (GBs on average; the largest exceeding 20GB). I used the same command, and the megacmd backend would still crash. (I should note that for this routine I discovered my upload speed to WebDAV depended on the drive holding megacmd's temp folder: on an HDD I maxed out at 5MB/s; after moving the temp folder to an SSD I got 30MB/s. The upload speed from megacmd to the Mega servers was anywhere between 100Mbps and 1Gbps, mostly on the lower end, which also contributed to the issue.) Worse, there were a few dozen files that simply refused to upload. Whenever I reran the sync command above, it would always show those same files pending upload, and once a file had been transferred to WebDAV (100% progress in rclone), moments later rclone would retry it. I then decided to monitor the transfer list in megacmd, and indeed, the SAME files were repeatedly queued for upload.

So the reason behind the megacmd backend crashes seems to be that the repeated upload requests for the same set of files clogged up the transfer queue. It also looks like, when instructed to upload the same file to the same location, megacmd first removes the existing file on the remote, then attempts the transfer and syncs when it completes. Even so, that should have led to at least some progress before each crash, but for me those few dozen (large) files simply never registered on Mega's end.

Moreover, megacmd only uploads one file at a time, and a transfer might sit at 100% for a while before the sync finishes. I mentioned earlier that after the SSD tweak, my upload speed to WebDAV mostly outpaced megacmd's upload speed from the local cache to the Mega servers. For the larger files in my use case it would take ~20 minutes for megacmd to finish uploading a single file, not counting the extra time needed to sync. That becomes a problem for rclone, which waits for the HTTP response from the WebDAV server to confirm the file was uploaded successfully. Without that confirmation it treats the request as a connection / IO error (not exactly sure of the exact classification) and simply reuploads the file, within the low-level-retry limit (10 by default?).
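To put rough numbers on this (purely illustrative, using the figures above): a 20GB file at ~100Mbps (≈12.5MB/s) needs about 20480MB / 12.5MB/s ≈ 27 minutes to go from megacmd's cache to the Mega servers, while rclone's per-request timeout defaults to just a few minutes (5m for --timeout, if I remember the default correctly). So for any large file, the PUT is practically guaranteed to time out before megacmd can answer.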

With this in mind, I had to limit the number of simultaneous transfers and raise the timeouts: --transfers=2 --contimeout=20m --timeout=20m. This largely fixed my problem, but at one point I again saw previously uploaded files being reuploaded. I turned on -vv and noticed this line:

2024/04/27 10:30:10 DEBUG : pacer: low level retry 1/1 (error Put "http://127.0.0.1:4443/crypt/endpoint": net/http: timeout awaiting response headers)

After this the file was uploaded once more. The timeout settings didn't catch this case because it only takes one such error for rclone to attempt a retry. I think this is too tight, given that the error covers what is, in this setup, perfectly normal behaviour.

Also, as a side effect, the repeated retransfers from rclone cause the same files to be cached locally multiple times, which consumes a lot of disk space.


TLDR

  • When using Mega + WebDAV, files get cached locally first (rclone's upload), and then megacmd takes over and uploads them to the Mega servers
  • rclone waits for an HTTP response (from the megacmd backend / Mega servers) to confirm a file was transferred successfully; if the response doesn't arrive within the timeout, it treats this as a connection / IO error and reuploads the same file
  • megacmd doesn't check for duplicates (if one exists, the remote file gets removed first), and if the transfer queue gets too long, the server crashes
  • The default timeout and retry settings are insufficient for this setup

Thanks for reading.


some basic info is missing:

  • rclone config redacted
  • rclone version
  • the exact command
  • the output of a debug log, not just a one-line snippet.

rclone version - recent release obtained from website

rclone v1.66.0
- os/version: Microsoft Windows 11 (64 bit)
- os/kernel: 10.0.22631.3527 (x86_64)
- os/type: windows
- os/arch: amd64
- go/version: go1.22.1
- go/linking: static
- go/tags: cmount

rclone config - just a basic webdav remote with crypt on top

[webdav]
type = webdav
url = http://127.0.0.1:4443/crypt/remote
vendor = other

[crypt]
type = crypt
remote = webdav:
password = XXX
password2 = XXX

Exact command already mentioned above.

Debug log in more detail.
Below is what it looks like when a file gets uploaded successfully:

2024/04/27 12:11:24 DEBUG : file.xyz: Update will use the normal upload strategy (no chunks)
2024/04/27 12:11:29 DEBUG : pacer: Reducing sleep to 11.25ms
2024/04/27 12:11:29 DEBUG : pacer: Reducing sleep to 10ms
2024/04/27 12:11:29 INFO  : file.xyz: Copied (new)
2024/04/27 12:11:29 DEBUG : pacer: low level retry 1/10 (error Mkcol "http://127.0.0.1:4443/crypt/remote/abc": EOF)
// This error seemed to not matter
2024/04/27 12:11:29 DEBUG : pacer: Rate limited, increasing sleep to 20ms
2024/04/27 12:11:30 DEBUG : pacer: Reducing sleep to 15ms

Here's when it goes wrong... (I had already manually set --low-level-retries to 1 to try to combat that IO error, but it doesn't seem to be working; both timeout flags were set to 20 mins)

2024/04/27 12:22:15 DEBUG : file2.xyz: Update will use the normal upload strategy (no chunks)
2024/04/27 12:35:53 DEBUG : pacer: Reducing sleep to 11.25ms
2024/04/27 12:35:53 DEBUG : pacer: Reducing sleep to 10ms
// moments later... file2.xyz takes 10-15 min to be cached locally via WebDAV
// below, it looks like the full 20 min timeout had elapsed
// at the time of the error, megacmd should still have been uploading the file in the background
2024/04/27 12:55:28 DEBUG : pacer: low level retry 1/1 (error Put "http://127.0.0.1:4443/crypt/remote/def": net/http: timeout awaiting response headers)
2024/04/27 12:55:28 DEBUG : pacer: Rate limited, increasing sleep to 20ms
2024/04/27 12:55:29 DEBUG : pacer: Reducing sleep to 15ms
2024/04/27 12:55:29 DEBUG : file2.xyz: Received error: Put "http://127.0.0.1:4443/crypt/remote/def": net/http: timeout awaiting response headers - low level retry 0/10
// 10 retries for full reuploads?
2024/04/27 12:55:29 DEBUG : pacer: Reducing sleep to 11.25ms
2024/04/27 12:55:29 DEBUG : file2.xyz: Update will use the normal upload strategy (no chunks)
// file2.xyz transfer started...again

Since applying the tweaks (a smaller --transfers value, plus --contimeout and --timeout raised to something like 30 minutes), I've uploaded more than 2TB of files in the past day without overwhelming megacmd again.
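For reference, this is roughly the full command shape after the tweaks (the remote names are as in my config; the exact values are simply what happens to work for my connection, so treat this as a sketch rather than a recommendation):

.\rclone.exe sync LOCAL_REMOTE MEGA_REMOTE_BEHIND_CRYPT -P --check-first --transfers=2 --contimeout=30m --timeout=30m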

I think

2024/04/27 12:55:28 DEBUG : pacer: low level retry 1/1 (error Put "http://127.0.0.1:4443/crypt/remote/def": net/http: timeout awaiting response headers)

this "1 strike and out" mechanism is too stringent given the time needed for megacmd's backend to upload the file separately. Is it possible to may be in "webdav" config wizard add mega as a vendor option and relax the timeout there? (Regardless of whether it goes with the default 1 min or 5 min timeout, they would not be enough for large files.) Or maybe add those instructions to the documentation like what's been done here
