Default for copy/move to local HDDs causes extreme fragmentation and slow reads after transfer is complete

What is the problem you are having with rclone?

Default transfer settings to local hard drives use multi-threaded streaming and sparse files, from what I read. This tanks performance and makes reading the files afterward take forever due to the extremely high fragmentation. Moving the files afterward to clean up the fragmented mess takes longer than the download itself, even between two locally attached SATA3-speed drives.

~3TB transfer to a WD Gold 18TB 7200rpm CMR drive (0% fragmented prior to the test) via a 1000Mbps internet connection.

Performance with default settings is very poor:
~20-40MB/s read after completion, 50MB/s during download, 100% disk busy, ~50% drive fragmentation after

(The log does mention the flags needed, but isn't very clear on when to use them: "Writing sparse files: use --local-no-sparse or --multi-thread-streams 0 to disable".)
With multi-threaded streams disabled via --multi-thread-streams 0 (which also disables sparse files), speeds improve a lot:

120MB/s read after completion, >80MB/s during download, 30% disk busy, 0% drive fragmentation

I think if rclone did a best-effort detection of whether the local drive is an SSD or an HDD, it would improve the user experience and likely reduce hard drive wear and tear. From what I read, people still wanted multi-threaded downloading, just not sparsely allocated files; for hard drives you probably want a single thread writing to disk while multiple threads download into a cache. For SSDs it is the opposite: you probably want 20+ threads writing to the SSD to push the queue depth as high as possible, which HDDs really don't tolerate, and which may be more than is reasonable for download threads anyway.
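To make that concrete, here's a rough sketch of what per-drive-type settings might look like with today's flags (the SSD stream count is just my guess from the reasoning above, not a tested value):

rclone copy GoogleDrive1: T:\Transfer --multi-thread-streams 0     (HDD: one sequential writer, no sparse file)
rclone copy GoogleDrive1: D:\Transfer --multi-thread-streams 20    (SSD: many streams to raise queue depth)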

Users might think their hard drive is slow or that something else is wrong, and while there is a log message saying which flag to use, most bulk storage is on hard drives, so I'm not sure it makes sense for the default to cause extreme fragmentation. The message also doesn't tell you that if you have a hard drive you should definitely choose one of the options for best performance.

Windows has the MediaType flag for the disk type, and if it is unknown and the user doesn't specify the target disk type, rclone should probably ask what type of local drive is being targeted. There are probably different ideal settings for a single HDD, SSD, NVMe SSD, and arrays of HDDs/SSDs. The problem is that even after the transfer is complete it can take 24+ hours to transfer the extremely fragmented data to another disk (which fixes the fragmentation), or even longer to defragment the drive in place. Checking this media type flag, the targeted local drives all advertise as HDDs in Windows.
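For reference, this is how I checked the media type in PowerShell (MediaType reports HDD, SSD, SCM, or Unspecified):

Get-PhysicalDisk | Select-Object FriendlyName, MediaType, BusType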

Run the command 'rclone version' and share the full output of the command.

rclone v1.63.1

  • os/version: Microsoft Windows 10 Pro for Workstations 22H2 (64 bit)
  • os/kernel: 10.0.19045.3324 (x86_64)
  • os/type: windows
  • os/arch: amd64
  • go/version: go1.20.6
  • go/linking: static
  • go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

Slow, highly fragmented command:
rclone move GoogleDrive1: T:\Transfer --delete-empty-src-dirs --create-empty-src-dirs --fast-list --progress

vs

Much faster, ~2x speed improvement:
rclone move GoogleDrive1: T:\Transfer --delete-empty-src-dirs --create-empty-src-dirs --fast-list --progress --multi-thread-streams 0

The rclone config contents with secrets removed.

[GoogleDrive1]
type = drive
client_id = [Redacted]
client_secret = [Redacted]
scope = drive
acknowledge_abuse = true
token = [Redacted]
team_drive = 

A log from the command with the -vv flag

Changed command to copy instead of move.

rclone-v1.63.1-windows-amd64>rclone copy GoogleDrive1:Work P:\Test --create-empty-src-dirs --fast-list --progress -vv
2023/09/10 16:01:11 DEBUG : rclone: Version "v1.63.1" starting with parameters ["rclone" "copy" "GoogleDrive1:Work" "P:\\Test" "--create-empty-src-dirs" "--fast-list" "--progress" "-vv"]
2023/09/10 16:01:11 DEBUG : Creating backend with remote "GoogleDrive1:Work"
2023/09/10 16:01:11 DEBUG : Using config file from "C:\\Users\\Username\\.config\\rclone\\rclone.conf"
2023/09/10 16:01:12 DEBUG : Google drive root 'Work': 'root_folder_id = [Redacted]' - save this in the config to speed up startup
2023/09/10 16:01:12 DEBUG : Creating backend with remote "P:\\Test"
2023/09/10 16:01:12 DEBUG : fs cache: renaming cache item "P:\\Test" to be canonical "//?/P:/Test"
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: Need to transfer - File not found at Destination
...
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: Need to transfer - File not found at Destination
2023-09-10 16:01:12 INFO  : Writing sparse files: use --local-no-sparse or --multi-thread-streams 0 to disable
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: Starting multi-thread copy with 4 parts of size 10.000Gi
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: Starting multi-thread copy with 4 parts of size 10.000Gi
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: Starting multi-thread copy with 4 parts of size 10.000Gi
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: Starting multi-thread copy with 4 parts of size 10.000Gi
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: multi-thread copy: stream 4/4 () size 10.000Gi starting
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: Need to transfer - File not found at Destination
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: multi-thread copy: stream 1/4 () size 10.000Gi starting
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: multi-thread copy: stream 4/4 () size 10.000Gi starting
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: multi-thread copy: stream 3/4 () size 10.000Gi starting
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: multi-thread copy: stream 2/4 () size 10.000Gi starting
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: multi-thread copy: stream 1/4 () size 10.000Gi starting
...
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: multi-thread copy: stream 1/4 () size 10.000Gi starting
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: multi-thread copy: stream 3/4 () size 10.000Gi starting
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: multi-thread copy: stream 3/4 () size 10.000Gi starting
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: multi-thread copy: stream 1/4 () size 10.000Gi starting
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: multi-thread copy: stream 2/4 () size 10.000Gi starting
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: Need to transfer - File not found at Destination
...
2023-09-10 16:01:12 DEBUG : [RedactedFilename]: Need to transfer - File not found at Destination
2023-09-10 16:01:12 DEBUG : Local file system at //?/P:/Test: Waiting for checks to finish
2023-09-10 16:01:12 DEBUG : Local file system at //?/P:/Test: Waiting for transfers to finish
2023-09-10 16:01:13 DEBUG : [RedactedFilename]: multi-thread copy: write buffer set to 128Ki
...
2023-09-10 16:01:13 DEBUG : [RedactedFilename]: multi-thread copy: write buffer set to 128Ki
2023-09-10 16:01:52 INFO  : Signal received: interrupt

We've just re-worked the multi-thread streaming for the latest beta
(shortly to be released as v1.64) and one of the consequences of this is that it will do the downloads in 64M chunks rather than in chunks of 1/4 of the size of the file. I think this should reduce the fragmentation of the files - I'd be interested if you gave it a try.

This is discussed here in Revisit sparse file creation on Windows · Issue #4245 · rclone/rclone · GitHub (check the referenced issue also)

It may be possible to improve this further by buffering those 64M chunks in memory and only writing one at a time, which would be a relatively easy tweak.

I don't understand why sparse files are causing a performance problem for you, unless you are using exFAT or VFAT, in which case they will cause a problem as sparse files aren't supported and the OS writes the whole file first.

That is an interesting thought. I guess, failing auto-detection, rclone could have a --local-disk-type hdd flag.
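Usage might look something like this (purely hypothetical, the flag doesn't exist):

rclone copy GoogleDrive1: T:\Transfer --local-disk-type hdd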

What kind of disks are you using when you've seen this performance problem and which file system have they been formatted with?

Thanks for the response. I tried the latest beta release:

rclone v1.64.0-beta.7353.e8879f3e7

  • os/version: Microsoft Windows 10 Pro for Workstations 22H2 (64 bit)
  • os/kernel: 10.0.19045.3324 (x86_64)
  • os/type: windows
  • os/arch: amd64
  • go/version: go1.21.1
  • go/linking: static
  • go/tags: cmount

Tested with the command

rclone copy GoogleDrive1:test A:\test --create-empty-src-dirs --fast-list --progress -vv

Edit: I retested to double-check, and the download speed improved quite a bit with the beta and default settings. I've updated the results to the better second test of the beta version. I ended the test a bit early as it was pretty clear that the read speed would not improve, and the partial file on disk was already very fragmented.

On initial testing of the beta version it does seem to download faster and no longer pegs the hard drive. (In my first attempt I think I messed up and ran two tests at once, which obviously ruined the download results for each settings type.)

I did notice with my previous run of the stable version that the verification between files also took forever compared to the download, which I thought was odd; it's likely the read side being slowed down after the transfer finishes.

Download speed is ~80-90MB/s, ~35% disk busy, which is much better.
Files are still extremely fragmented: ~180,000 fragments for a 33GB file.
Verification is taking longer than the download and will take ~8x the download time.
Verification speed ~12MB/s, 100% disk busy; the hard drive is having a bad day with the 180,000 fragments.
FastCopy read speed ~60MB/s, 100% disk busy, not ideal.

Trying the beta with threaded downloading disabled (and, as a result, no sparse file) gave the following speeds:
Download speed ~80-90MB/s, ~30% disk busy; contig says the files have 1 fragment, which is ideal.
Didn't catch the verify read speed.
FastCopy read speed ~260MB/s, 100% disk busy, faster than the CrystalDiskMark benchmark and the desired state after a download.

Tested downloading to a PCIe NVMe SSD and got 90MB/s (limited by the network) with 0-3% disk busy time; fragmentation obviously doesn't matter for this kind of drive.

The volume is formatted with NTFS and has USN journaling (without range tracking) enabled for use with Everything search.

C:\Windows\system32>fsutil fsinfo ntfsinfo A:
NTFS Version      :                3.1
LFS Version       :                1.1
Bytes Per Sector  :                512
Bytes Per Physical Sector :        4096
Bytes Per Cluster :                65536
Bytes Per FileRecord Segment    :  1024

Benchmark of the single HDDs. All drives are used as single drives, no RAID; I just manage file replicas manually.
Drives used: WD Gold 18TB and Seagate Exos X16 16TB, all in good health with 0% fragmentation before testing,
connected via USB 3.2 5/10Gbps SATA controllers (QNAP TR-004, Sabrent DS-SC5B)
on Intel or ASMedia root controllers with no external hubs.
I thought it was the connection causing problems, but as the benchmark below shows the drives have plenty of bandwidth, and swapping enclosures, controllers, ports, and cables didn't improve things.

Q is queue depth and T is the number of threads. So even though the disk has enough sequential speed, with threading on the transfer doesn't reach it.

CrystalDiskMark 8.0.4 x64 (C) 2007-2021 hiyohiyo
[Read]
  SEQ    1MiB (Q=  8, T= 1):   142.122 MB/s [    135.5 IOPS] < 58662.81 us>
  SEQ    1MiB (Q=  1, T= 1):   140.584 MB/s [    134.1 IOPS] <  7450.52 us>
[Write]
  SEQ    1MiB (Q=  8, T= 1):   144.056 MB/s [    137.4 IOPS] < 57723.52 us>
  SEQ    1MiB (Q=  1, T= 1):   139.245 MB/s [    132.8 IOPS] <  7521.85 us>

Profile: Default  Test: 1 GiB (x5)  Time: Measure 5 sec / Interval 5 sec 

I read the GitHub issue thread. I will let the download finish with the different settings and compare how many fragments the new beta creates vs the current stable release. There is definitely a big improvement for downloading, although disabling threaded writing seems to download at about the same speed for large files.

The increased chunk size does not seem to reduce the fragmentation significantly with the beta defaults, and reads afterwards still run extremely slowly. My first assumption is that Windows is interleaving the IO from the 4 writing threads, causing the low chunk count to balloon into the crazy 180k fragments for some reason.

The penalty after downloading is persistent on disk as well, and with extremely large files it may be impossible to defragment in place; the solution is to move the files to another disk in a serial manner, which reassembles the fragments (very slowly) and brings the speed back up. Since I moved the files to local disk it was impossible to redownload them with the correct settings, and I just had to let it do the slow local-to-local copy with a tool called FastCopy.

I think this is because the new code doesn't write the end of the file first; it only writes blocks within a few multiples of --multi-thread-chunk-size. I was hoping that, since that number is relatively small, Windows would coalesce the writes and write them in sequence, but obviously not.

When downloading files to the local backend these flags come into play:

  --multi-thread-chunk-size SizeSuffix          Chunk size for multi-thread downloads / uploads, if not set by filesystem (default 64Mi)
  --multi-thread-write-buffer-size SizeSuffix   In memory buffer size for writing when in multi-thread mode (default 128Ki)

You could try --multi-thread-write-buffer-size 0 to disable the write buffer, which might help. You could also try setting it large, maybe to the same value as --multi-thread-chunk-size.

It might be that --multi-thread-chunk-size is not optimal, so you could try halving or doubling it and see what difference that makes.
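For example, using your command from above and varying just those two flags:

rclone copy GoogleDrive1:test A:\test --progress -vv --multi-thread-write-buffer-size 0
rclone copy GoogleDrive1:test A:\test --progress -vv --multi-thread-write-buffer-size 64Mi
rclone copy GoogleDrive1:test A:\test --progress -vv --multi-thread-chunk-size 32Mi
rclone copy GoogleDrive1:test A:\test --progress -vv --multi-thread-chunk-size 128Mi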

Ouch!

That makes the fragments about 180kB, which is way too small. This is quite close to --multi-thread-write-buffer-size, which gives me hope that increasing that will decrease the number of fragments.

Do you think this could be having an effect? It sounds like it might, since rclone is writing the blocks out of order.

I think you are right. Looking at the code, the writes will be done in --multi-thread-write-buffer-size chunks.

If you could experiment with the settings above then we can come up with an optimum set for HDDs.

I don't know if I can detect cross-platform whether a disk is an HDD or not, but it looks possible, if hard work...

Thanks for the info; those other parameters do sound like they might affect the actual disk fragmentation.

I do agree that finding out whether the drive is an SSD or not is different on each OS, which makes things complicated. Windows has the media type, which is easy; Linux has the rotational-speed 0=SSD thing, which is maybe easy; and macOS also seems to have a media type that can be queried. Detection could possibly prevent a hard drive from dying early (they get very clicky-clacky and warmer during a 100% busy, purely random workload) and would greatly improve default performance when downloading to slow HDDs. It would obviously break for RAID/network-mounted/crappy USB enclosures... and many other cases.
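For reference, these are the per-OS checks I have in mind (the device names sda and disk0 are just examples):

Windows (PowerShell): Get-PhysicalDisk | Select-Object FriendlyName, MediaType
Linux (1 = rotational HDD, 0 = SSD): cat /sys/block/sda/queue/rotational
macOS: diskutil info disk0 | grep "Solid State"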

4 random 1GB files downloaded from Google Drive to a local HDD over a 1Gbps link.

USN Journal Disabled Sanity Check

rclone copy GoogleDrive1:test A:\Test --create-empty-src-dirs --fast-list --progress -vv

Sanity check: still very fragmented with the baseline default config. I don't think the USN journal adds much traffic, because in the mode it is in only new files generally create an entry, and the way it is allocated keeps it far away from the data part of the NTFS volume. Verifying took forever, possibly because the transfers are all trying to read very random fragments.

Runtime 4m36s
Write 80MB/s USN off
Verify ~16MB/s 100% Busy
FastCopy Read 65MB/s 100% Busy
Fragmentation 4800 frags/file (1GB)

Write Buffer Raised To 64Mi (matching the 64Mi chunk size)
--multi-thread-write-buffer-size 64Mi
Verify still took a long time, but the read afterward is better and the fragmentation was reduced.

Runtime 4m18s
Write 86MB/s USN off
Verify ~18MB/s 100% Busy
Read 120MB/s 100% Busy
Fragmentation 1742 frags/file (1GB)

Write Buffer Disabled
--multi-thread-write-buffer-size 0
Much reduced fragmentation; verify still not super fast, files still quite fragmented.

Runtime 3m54s
Write 87MB/s USN off
Verify ~22MB/s 100% Busy
Read 155MB/s 100% Busy
Fragmentation 295 frags/file (1GB)

Chunk Size 32Mi
Runtime 2m59s
Write 83MB/s USN off
Verify 27MB/s 100% Busy
Read 52MB/s 100% Busy
Fragmentation 5698 frags/file (1GB)

Chunk Size 128Mi
Runtime 4m9s
Write 82MB/s USN off
Verify 25MB/s 100% Busy
Read 52MB/s 100% Busy
Fragmentation 5144 frags/file (1GB)

No Threading: Much Better Performance

--multi-thread-streams 0

Very fast reads, ironically, even though the download was slower.
Runtime 1m20s
Write 50MB/s
Verify instant??? Is it being skipped? There was no read-from-disk step, but hashes were still displayed.
Read 160MB/s
Fragmentation 1 frags/file (1GB)

No Sparse File: Best Performance

--local-no-sparse

Runtime 1m1s
Write 76MB/s
Verify instant??? Is it being skipped? There was no read-from-disk step, but hashes were still displayed.
Read 161MB/s
Fragmentation 1 frags/file (1GB)

Thank you for the excellent set of benchmarks.

For the verify speed, are you measuring the time that it takes rclone to calculate the checksum? That would explain why some verifications are instant: if rclone writes the file sequentially it calculates the checksum as it goes along.

You could simulate this on its own with rclone hashsum md5 /path/to/file. If you want to ignore the CPU time taken by the MD5 calculation then try rclone hashsum crc32 /path/to/file. I'd be interested to see what those transfer rates are for the files above if you've still got them.
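For example, against the test folder from the earlier runs (A:\Test is assumed from your commands above):

rclone hashsum md5 A:\Test
rclone hashsum crc32 A:\Test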

This looks like the most promising result. 295 fragments means the fragments are about 3MB each, which is better. The read speed is pretty close to the maximum also. I'd just like to understand that verify time better.

I'm still puzzled over the difference between --multi-thread-streams 0 and --local-no-sparse.

Perhaps setting --local-no-sparse is forcing Windows to do write coalescing, as otherwise it would have to write 0s to the file. Was that test done with both --multi-thread-streams 0 and --local-no-sparse, or just --local-no-sparse? If the latter, it should have been doing multi-thread downloads, which you can check in the -vv log.
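For example, a multi-thread download shows up in the -vv log as lines like this one from your log above:

2023-09-10 16:01:12 DEBUG : [RedactedFilename]: Starting multi-thread copy with 4 parts of size 10.000Gi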

BTW How are you measuring the fragmentation of a file?

The two tests were done with each flag by itself:

rclone copy GoogleDrive1:test A:\Test --create-empty-src-dirs --fast-list --progress -vv --multi-thread-streams 0
rclone copy GoogleDrive1:test A:\Test --create-empty-src-dirs --fast-list --progress -vv --local-no-sparse

Fragmentation was checked using Sysinternals Contig:

contig -a filenamehere

I will poke at it more later in the day. I didn't keep the -vv logs as there were lots of tests; I will check later what the difference is between the two, or even try having both flags on at the same time.

Doing the verify while it is writing definitely speeds things up over having to read the file again later. The read speed during verify doesn't seem to be optimal in the multi-threaded download cases: even when the fragmentation is lower, the FastCopy tool can copy the files much faster than the effective verify read speed in rclone.

I checked --local-no-sparse on gigantic (4TB) files in the current release version, to test how it behaves with large files that need to be pre-allocated instead of the tiny 1GB ones, and it results in some interesting behavior on Windows 10. --local-no-sparse on 1.63.1 definitely allocates 0s.

Ctrl-C can't cancel the operation, killing rclone becomes impossible (access denied error in Task Manager), and the allocation of the empty multi-TB file continues, likely because it is handed off to a system call that won't return for a very, very long time.

I can't test the beta's behavior yet, as the hard drive will be busy for a long time and I don't really want to force a drive removal during write activity, even if it is all 0s.
