Simultaneous freeze of macOS and rclone

What is the problem you are having with rclone?

I am experiencing what appears to be a simultaneous freeze of macOS and rclone, most likely due to some bug in macOS (Ventura 13.2.1).

rclone works normally until the "Finished making map for --track-renames + Waiting for checks to finish + Waiting for renames to finish" stage, where all activity seems to freeze and macOS stops responding, especially the Finder and any application that interacts with the hard drive.

The debug log below was captured at this freezing moment.

Details of SOURCE / DEST

  • The SOURCE is a hard drive connected to the Mac via Thunderbolt, formatted as case-sensitive HFS+.
  • The DEST is a TrueNAS CORE 13.0-U4 machine, with a ZFS RAIDZ1 pool.
  • The connection between the two machines is a direct 1 Gbps Ethernet LAN (cable to cable, no switch).
  • I am quite confident in the TrueNAS machine; it is stable. On the other hand, I am more wary of macOS.
  • I have deliberately decided not to upgrade to macOS Ventura 13.3 (released yesterday).

Run the command 'rclone version' and share the full output of the command.

rclone v1.62.2
- os/version: darwin 13.2.1 (64 bit)
- os/kernel: 22.3.0 (x86_64)
- os/type: darwin
- os/arch: amd64
- go/version: go1.20.2
- go/linking: dynamic
- go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

sftp

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync /Volumes/Users/vicmarto nas:/mnt/zbackup/Users/vicmarto --rc -vv --copy-links --create-empty-src-dirs --track-renames --local-case-sensitive --progress --log-file=/Users/vicmarto/rclone_log.txt

The rclone config contents with secrets removed.

[nas]
type = sftp
host = 192.168.1.150
user = vicmarto
key_file = /Users/vicmarto/.ssh/id_rsa_nas
key_file_pass = EDITED
md5sum_command = md5 -r
sha1sum_command = sha1 -r
shell_type = unix

A log from the command with the -vv flag

2023/03/28 16:24:47 INFO  : sftp://vicmarto@192.168.1.150:22//mnt/zbackup/Users/vicmarto: Making map for --track-renames
2023/03/28 16:24:47 INFO  : sftp://vicmarto@192.168.1.150:22//mnt/zbackup/Users/vicmarto: Finished making map for --track-renames
2023/03/28 16:24:47 DEBUG : sftp://vicmarto@192.168.1.150:22//mnt/zbackup/Users/vicmarto: Waiting for checks to finish
2023/03/28 16:24:47 DEBUG : sftp://vicmarto@192.168.1.150:22//mnt/zbackup/Users/vicmarto: Waiting for renames to finish
2023/03/28 16:25:47 DEBUG : sftp://vicmarto@192.168.1.150:22//mnt/zbackup/Users/vicmarto: Closing 9 unused connections

A debug log

goroutine profile: total 22
7 @ 0x100072e1f 0x10008f1a9 0x10010362e 0x100103616 0x100103396 0x10010e05e 0x10010e056 0x1004f5fb1 0x1000ad732 0x1001c8697 0x1001c8670 0x100cc5ccc 0x101472c96 0x101473df9 0x101470975 0x100076a81
#	0x100072e1e	syscall.syscall+0x7e							runtime/sys_darwin.go:23
#	0x10008f1a8	syscall.read+0x48							syscall/zsyscall_darwin_amd64.go:1209
#	0x10010362d	syscall.Read+0x2cd							syscall/syscall_unix.go:178
#	0x100103615	internal/poll.ignoringEINTRIO+0x2b5					internal/poll/fd_unix.go:794
#	0x100103395	internal/poll.(*FD).Read+0x35						internal/poll/fd_unix.go:163
#	0x10010e05d	os.(*File).read+0x5d							os/file_posix.go:31
#	0x10010e055	os.(*File).Read+0x55							os/file.go:118
#	0x1004f5fb0	github.com/rclone/rclone/lib/readers.(*contextReader).Read+0x70		github.com/rclone/rclone/lib/readers/context.go:28
#	0x1000ad731	io.copyBuffer+0x1b1							io/io.go:427
#	0x1001c8696	io.Copy+0x416								io/io.go:386
#	0x1001c866f	github.com/rclone/rclone/fs/hash.StreamTypes+0x3ef			github.com/rclone/rclone/fs/hash/hash.go:119
#	0x100cc5ccb	github.com/rclone/rclone/backend/local.(*Object).Hash+0x64b		github.com/rclone/rclone/backend/local/local.go:945
#	0x101472c95	github.com/rclone/rclone/fs/sync.(*syncCopyMove).renameID+0x115		github.com/rclone/rclone/fs/sync/sync.go:699
#	0x101473df8	github.com/rclone/rclone/fs/sync.(*syncCopyMove).tryRename+0x118	github.com/rclone/rclone/fs/sync/sync.go:816
#	0x101470974	github.com/rclone/rclone/fs/sync.(*syncCopyMove).pairRenamer+0xd4	github.com/rclone/rclone/fs/sync/sync.go:408

4 @ 0x100040496 0x100050abe 0x10146d6b8 0x101470ba5 0x100076a81
#	0x10146d6b7	github.com/rclone/rclone/fs/sync.(*pipe).GetMax+0xb7			github.com/rclone/rclone/fs/sync/pipe.go:126
#	0x101470ba4	github.com/rclone/rclone/fs/sync.(*syncCopyMove).pairCopyOrMove+0xe4	github.com/rclone/rclone/fs/sync/sync.go:423

1 @ 0x100034f56 0x10006f045 0x100441db5 0x100441bcd 0x10043e96b 0x10044f365 0x10044feb3 0x1003a68af 0x1003a8a89 0x1004c2356 0x1003a68af 0x1013fb6fa 0x1003a68af 0x1013fb6fa 0x1003a68af 0x1004c0090 0x1003aa236 0x1003a53d2 0x100076a81
#	0x10006f044	runtime/pprof.runtime_goroutineProfileWithLabels+0x24		runtime/mprof.go:844
#	0x100441db4	runtime/pprof.writeRuntimeProfile+0xb4				runtime/pprof/pprof.go:734
#	0x100441bcc	runtime/pprof.writeGoroutine+0x4c				runtime/pprof/pprof.go:694
#	0x10043e96a	runtime/pprof.(*Profile).WriteTo+0x14a				runtime/pprof/pprof.go:329
#	0x10044f364	net/http/pprof.handler.ServeHTTP+0x4a4				net/http/pprof/pprof.go:259
#	0x10044feb2	net/http/pprof.Index+0xf2					net/http/pprof/pprof.go:376
#	0x1003a68ae	net/http.HandlerFunc.ServeHTTP+0x2e				net/http/server.go:2122
#	0x1003a8a88	net/http.(*ServeMux).ServeHTTP+0x148				net/http/server.go:2500
#	0x1004c2355	github.com/go-chi/chi/v5.(*Mux).routeHTTP+0x215			github.com/go-chi/chi/v5@v5.0.8/mux.go:444
#	0x1003a68ae	net/http.HandlerFunc.ServeHTTP+0x2e				net/http/server.go:2122
#	0x1013fb6f9	github.com/go-chi/chi/v5/middleware.SetHeader.func1.1+0x139	github.com/go-chi/chi/v5@v5.0.8/middleware/content_type.go:13
#	0x1003a68ae	net/http.HandlerFunc.ServeHTTP+0x2e				net/http/server.go:2122
#	0x1013fb6f9	github.com/go-chi/chi/v5/middleware.SetHeader.func1.1+0x139	github.com/go-chi/chi/v5@v5.0.8/middleware/content_type.go:13
#	0x1003a68ae	net/http.HandlerFunc.ServeHTTP+0x2e				net/http/server.go:2122
#	0x1004c008f	github.com/go-chi/chi/v5.(*Mux).ServeHTTP+0x30f			github.com/go-chi/chi/v5@v5.0.8/mux.go:90
#	0x1003aa235	net/http.serverHandler.ServeHTTP+0x315				net/http/server.go:2936
#	0x1003a53d1	net/http.(*conn).serve+0x611					net/http/server.go:1995

1 @ 0x10003a4be 0x100072468 0x100416499 0x100076a81
#	0x100072467	os/signal.signal_recv+0x27	runtime/sigqueue.go:149
#	0x100416498	os/signal.loop+0x18		os/signal/signal_unix.go:23

1 @ 0x100040496 0x100009edd 0x1000099d8 0x100416b8f 0x100076a81
#	0x100416b8e	github.com/rclone/rclone/lib/atexit.Register.func1.1+0x2e	github.com/rclone/rclone/lib/atexit/atexit.go:45

1 @ 0x100040496 0x100009edd 0x1000099d8 0x100784665 0x100076a81
#	0x100784664	github.com/rclone/rclone/fs/march.(*March).Run.func2+0x44	github.com/rclone/rclone/fs/march/march.go:208

1 @ 0x100040496 0x100009edd 0x100009a18 0x101466734 0x100076a81
#	0x101466733	github.com/rclone/rclone/cmd.SigInfoHandler.func1+0x93	github.com/rclone/rclone/cmd/siginfo_bsd.go:20

1 @ 0x100040496 0x1000392d7 0x10006f4c9 0x100102092 0x1001035f9 0x1001035e7 0x1001808a9 0x100194205 0x10039ee5f 0x100076a81
#	0x10006f4c8	internal/poll.runtime_pollWait+0x88		runtime/netpoll.go:306
#	0x100102091	internal/poll.(*pollDesc).wait+0x31		internal/poll/fd_poll_runtime.go:84
#	0x1001035f8	internal/poll.(*pollDesc).waitRead+0x298	internal/poll/fd_poll_runtime.go:89
#	0x1001035e6	internal/poll.(*FD).Read+0x286			internal/poll/fd_unix.go:167
#	0x1001808a8	net.(*netFD).Read+0x28				net/fd_posix.go:55
#	0x100194204	net.(*conn).Read+0x44				net/net.go:183
#	0x10039ee5e	net/http.(*connReader).backgroundRead+0x3e	net/http/server.go:674

1 @ 0x100040496 0x1000392d7 0x10006f4c9 0x100102092 0x100107b1d 0x100107b0b 0x100182a95 0x10019f305 0x10019df7d 0x1003aa865 0x1004c8c28 0x100076a81
#	0x10006f4c8	internal/poll.runtime_pollWait+0x88			runtime/netpoll.go:306
#	0x100102091	internal/poll.(*pollDesc).wait+0x31			internal/poll/fd_poll_runtime.go:84
#	0x100107b1c	internal/poll.(*pollDesc).waitRead+0x2bc		internal/poll/fd_poll_runtime.go:89
#	0x100107b0a	internal/poll.(*FD).Accept+0x2aa			internal/poll/fd_unix.go:614
#	0x100182a94	net.(*netFD).accept+0x34				net/fd_unix.go:172
#	0x10019f304	net.(*TCPListener).accept+0x24				net/tcpsock_posix.go:148
#	0x10019df7c	net.(*TCPListener).Accept+0x3c				net/tcpsock.go:297
#	0x1003aa864	net/http.(*Server).Serve+0x384				net/http/server.go:3059
#	0x1004c8c27	github.com/rclone/rclone/lib/http.instance.serve+0xa7	github.com/rclone/rclone/lib/http/server.go:132

1 @ 0x100040496 0x100050abe 0x1009c5f0d 0x100076a81
#	0x1009c5f0c	go.opencensus.io/stats/view.(*worker).start+0xac	go.opencensus.io@v0.24.0/stats/view/worker.go:292

1 @ 0x100040496 0x100050abe 0x101466027 0x100076a81
#	0x101466026	github.com/rclone/rclone/cmd.startProgress.func3+0x106	github.com/rclone/rclone/cmd/progress.go:57

1 @ 0x100040496 0x100051d6f 0x100051d46 0x100071c47 0x10008682b 0x101471597 0x101474492 0x10147588a 0x101475978 0x10168cc0c 0x101461fdd 0x10168cb65 0x101436627 0x101436f3d 0x101463c76 0x101463c6b 0x10169cdd7 0x100040067 0x100076a81
#	0x100071c46	sync.runtime_Semacquire+0x26						runtime/sema.go:62
#	0x10008682a	sync.(*WaitGroup).Wait+0x4a						sync/waitgroup.go:116
#	0x101471596	github.com/rclone/rclone/fs/sync.(*syncCopyMove).stopRenamers+0x76	github.com/rclone/rclone/fs/sync/sync.go:494
#	0x101474491	github.com/rclone/rclone/fs/sync.(*syncCopyMove).run+0x331		github.com/rclone/rclone/fs/sync/sync.go:908
#	0x101475889	github.com/rclone/rclone/fs/sync.runSyncCopyMove+0x229			github.com/rclone/rclone/fs/sync/sync.go:1130
#	0x101475977	github.com/rclone/rclone/fs/sync.Sync+0x77				github.com/rclone/rclone/fs/sync/sync.go:1136
#	0x10168cc0b	github.com/rclone/rclone/cmd/sync.glob..func1.1+0x4b			github.com/rclone/rclone/cmd/sync/sync.go:68
#	0x101461fdc	github.com/rclone/rclone/cmd.Run+0x11c					github.com/rclone/rclone/cmd/cmd.go:257
#	0x10168cb64	github.com/rclone/rclone/cmd/sync.glob..func1+0xc4			github.com/rclone/rclone/cmd/sync/sync.go:66
#	0x101436626	github.com/spf13/cobra.(*Command).execute+0x846				github.com/spf13/cobra@v1.6.1/command.go:920
#	0x101436f3c	github.com/spf13/cobra.(*Command).ExecuteC+0x3bc			github.com/spf13/cobra@v1.6.1/command.go:1044
#	0x101463c75	github.com/spf13/cobra.(*Command).Execute+0x75				github.com/spf13/cobra@v1.6.1/command.go:968
#	0x101463c6a	github.com/rclone/rclone/cmd.Main+0x6a					github.com/rclone/rclone/cmd/cmd.go:567
#	0x10169cdd6	main.main+0x16								github.com/rclone/rclone/rclone.go:14
#	0x100040066	runtime.main+0x206							runtime/proc.go:250

1 @ 0x100072e1f 0x10008e12f 0x100103071 0x100101973 0x10010310f 0x100111a8a 0x10010fc65 0x100cc5cf4 0x101472c96 0x101473df9 0x101470975 0x100076a81
#	0x100072e1e	syscall.syscall+0x7e							runtime/sys_darwin.go:23
#	0x10008e12e	syscall.Close+0x2e							syscall/zsyscall_darwin_amd64.go:509
#	0x100103070	internal/poll.(*FD).destroy+0x50					internal/poll/fd_unix.go:84
#	0x100101972	internal/poll.(*FD).decref+0x52						internal/poll/fd_mutex.go:213
#	0x10010310e	internal/poll.(*FD).Close+0x4e						internal/poll/fd_unix.go:107
#	0x100111a89	os.(*file).close+0x89							os/file_unix.go:262
#	0x10010fc64	os.(*File).Close+0x24							os/file_posix.go:25
#	0x100cc5cf3	github.com/rclone/rclone/backend/local.(*Object).Hash+0x673		github.com/rclone/rclone/backend/local/local.go:946
#	0x101472c95	github.com/rclone/rclone/fs/sync.(*syncCopyMove).renameID+0x115		github.com/rclone/rclone/fs/sync/sync.go:699
#	0x101473df8	github.com/rclone/rclone/fs/sync.(*syncCopyMove).tryRename+0x118	github.com/rclone/rclone/fs/sync/sync.go:816
#	0x101470974	github.com/rclone/rclone/fs/sync.(*syncCopyMove).pairRenamer+0xd4	github.com/rclone/rclone/fs/sync/sync.go:408
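For anyone following along: goroutine dumps like the one above can be pulled from rclone even while it is hung, because --rc exposes Go's standard pprof endpoints on the remote control port (127.0.0.1:5572 by default). A minimal sketch:

```shell
# Fetch a human-readable goroutine dump from a running rclone
# that was started with --rc (default listen address 127.0.0.1:5572).
curl 'http://127.0.0.1:5572/debug/pprof/goroutine?debug=1'
```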

Hi vicmarto,

Did you try the ping suggestion I gave you in this post:
https://forum.rclone.org/t/rclone-crashes-macos-on-local-transfers/35477/43

Thanks for the suggestion, Ole.

I tried using ping during the transfer, with the same result.

The ping did NOT cut off at any time, even when macOS/rclone froze.

I am almost certain that this is a macOS bug. The problem is that, in my case, it only triggers when using rclone. That's why I'm asking for help here.

About your question of networked drives: no, I don't mount any networked drives on my system.

Interesting, and somewhat surprising, that the terminal and network stack continue without any issues while rclone and macOS (the Finder) freeze.

I therefore think you need to systematically reduce data and possibilities in your setup to isolate the root cause.

Here are some ideas for your inspiration:

Are you able to reproduce with a smaller data set (a sub folder)? (trying to reduce the amount of data)

Are you able to reproduce if doing a sync (of a subfolder) from /Volumes/Users/vicmarto to somewhere else on the hard drive e.g. /Volumes/Users/testuser/testfolder? (trying to rule out things related to the NAS)

Are you able to reproduce if doing a sync (of a subfolder) from nas:/mnt/zbackup/Users/vicmarto to somewhere else on the NAS e.g. nas:/mnt/zbackup/Users/testuser/testfolder? (trying to rule out things related to the hard drive)

Are you able to reproduce if doing a sync (of a subfolder) with default rclone parameters from /Volumes/Users/vicmarto to somewhere else on the NAS drive e.g. nas:/mnt/zbackup/Users/testuser/testfolder? (trying to rule out the rclone parameters: --rc --copy-links --create-empty-src-dirs --track-renames --local-case-sensitive)
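The four experiments above can be sketched as commands; SUBFOLDER and the testuser paths are placeholders from the suggestions, not paths from the original report:

```shell
# 1. Smaller data set, same source and destination:
rclone sync /Volumes/Users/vicmarto/SUBFOLDER nas:/mnt/zbackup/Users/vicmarto/SUBFOLDER --rc -vv --progress

# 2. Local-to-local, ruling out the NAS:
rclone sync /Volumes/Users/vicmarto/SUBFOLDER /Volumes/Users/testuser/testfolder --rc -vv --progress

# 3. NAS-to-NAS, ruling out the hard drive:
rclone sync nas:/mnt/zbackup/Users/vicmarto/SUBFOLDER nas:/mnt/zbackup/Users/testuser/testfolder --rc -vv --progress

# 4. Hard drive to NAS, but with default rclone parameters only:
rclone sync /Volumes/Users/vicmarto/SUBFOLDER nas:/mnt/zbackup/Users/testuser/testfolder
```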

I would agree with you. No user space app (like rclone) should be able to hang macOS. rclone mount does have kernel space parts (OSXFUSE/macFUSE) but you aren't using that.

I don't have much mac experience, but if this was a Linux machine I would say that there is a lock taken in the OS that hasn't been released properly.

Your Go trace from rclone shows that it is stuck waiting for disk stuff too.

So it looks like there is a filing system lock that hasn't been released properly on your hard drive FS.

As for why? This is clearly something rclone is triggering. Rclone works filing systems very hard by doing things in parallel. I'd try using these flags

--checkers 1 --transfers 1 --check-first

to minimise the parallelism and see if that helps.
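Applied to the original command, that suggestion would look something like this (a sketch using the paths from the report; the three added flags run one checker and one transfer at a time, and finish all checks before any transfer starts):

```shell
rclone sync /Volumes/Users/vicmarto nas:/mnt/zbackup/Users/vicmarto \
  --checkers 1 --transfers 1 --check-first \
  --rc -vv --copy-links --create-empty-src-dirs --track-renames \
  --local-case-sensitive --progress --log-file=/Users/vicmarto/rclone_log.txt
```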

Thank you very much for your help.

First, with ncw's suggestion to use --checkers 1 --transfers 1 --check-first, I was able to complete the backup.

And then, with Ole's suggestion, I was able to isolate the problem on the external hard drive (connected via Thunderbolt).
In other words, now the NAS does NOT enter the equation. It all comes down to the interaction of 3 factors: external hard drive ↔ macOS ↔ rclone.

Actually, this is enough to freeze macOS/rclone:

rclone check ~/test /Volumes/hdd/test --progress --rc -vv --log-file=/Users/vicmarto/rclone_check_log.txt

The external hard drive is mounted on /Volumes/hdd (I formatted it after completing the backup), and /Volumes/hdd/test consists of 40 files of about 2 GB each. The disk is fine, according to smartctl -t long.

log:

2023/03/31 12:50:47 DEBUG : rclone: Version "v1.62.2" starting with parameters ["rclone" "check" "/Users/vicmarto/test" "/Volumes/hdd/test" "--progress" "--rc" "-vv" "--log-file=/Users/vicmarto/rclone_check_log.txt"]
2023/03/31 12:50:47 NOTICE: Serving remote control on http://127.0.0.1:5572/
2023/03/31 12:50:47 DEBUG : Creating backend with remote "/Users/vicmarto/test"
2023/03/31 12:50:47 DEBUG : Using config file from "/Users/vicmarto/.config/rclone/rclone.conf"
2023/03/31 12:50:47 DEBUG : Creating backend with remote "/Volumes/hdd/test"
2023/03/31 12:50:47 INFO  : Using md5 for hash comparisons
2023/03/31 12:50:47 DEBUG : Local file system at /Volumes/hdd/test: Waiting for checks to finish
2023/03/31 12:50:47 DEBUG : .DS_Store: md5 = 431ced5af2e785f162ee5d2c6c898268 OK
2023/03/31 12:50:47 DEBUG : .DS_Store: OK

debug:

goroutine profile: total 34
8 @ 0x100040496 0x100051d6f 0x100051d46 0x100071c47 0x10008682b 0x1003f9947 0x100796b88 0x10079685f 0x1007aefd4 0x10078ab83 0x10078b408 0x100076a81
#	0x100071c46	sync.runtime_Semacquire+0x26							runtime/sema.go:62
#	0x10008682a	sync.(*WaitGroup).Wait+0x4a							sync/waitgroup.go:116
#	0x1003f9946	golang.org/x/sync/errgroup.(*Group).Wait+0x26					golang.org/x/sync@v0.1.0/errgroup/errgroup.go:53
#	0x100796b87	github.com/rclone/rclone/fs/operations.checkHashes+0x2c7			github.com/rclone/rclone/fs/operations/operations.go:96
#	0x10079685e	github.com/rclone/rclone/fs/operations.CheckHashes+0x15e			github.com/rclone/rclone/fs/operations/operations.go:62
#	0x1007aefd3	github.com/rclone/rclone/fs/operations.Check.func1+0x73				github.com/rclone/rclone/fs/operations/check.go:275
#	0x10078ab82	github.com/rclone/rclone/fs/operations.(*checkMarch).checkIdentical+0x242	github.com/rclone/rclone/fs/operations/check.go:135
#	0x10078b407	github.com/rclone/rclone/fs/operations.(*checkMarch).Match.func1+0xa7		github.com/rclone/rclone/fs/operations/check.go:154

8 @ 0x100072e1f 0x10008f1a9 0x10010362e 0x100103616 0x100103396 0x10010e05e 0x10010e056 0x1004f5fb1 0x1000ad732 0x1001c8697 0x1001c8670 0x100cc5ccc 0x10079716f 0x1003f9ae4 0x100076a81
#	0x100072e1e	syscall.syscall+0x7e						runtime/sys_darwin.go:23
#	0x10008f1a8	syscall.read+0x48						syscall/zsyscall_darwin_amd64.go:1209
#	0x10010362d	syscall.Read+0x2cd						syscall/syscall_unix.go:178
#	0x100103615	internal/poll.ignoringEINTRIO+0x2b5				internal/poll/fd_unix.go:794
#	0x100103395	internal/poll.(*FD).Read+0x35					internal/poll/fd_unix.go:163
#	0x10010e05d	os.(*File).read+0x5d						os/file_posix.go:31
#	0x10010e055	os.(*File).Read+0x55						os/file.go:118
#	0x1004f5fb0	github.com/rclone/rclone/lib/readers.(*contextReader).Read+0x70	github.com/rclone/rclone/lib/readers/context.go:28
#	0x1000ad731	io.copyBuffer+0x1b1						io/io.go:427
#	0x1001c8696	io.Copy+0x416							io/io.go:386
#	0x1001c866f	github.com/rclone/rclone/fs/hash.StreamTypes+0x3ef		github.com/rclone/rclone/fs/hash/hash.go:119
#	0x100cc5ccb	github.com/rclone/rclone/backend/local.(*Object).Hash+0x64b	github.com/rclone/rclone/backend/local/local.go:945
#	0x10079716e	github.com/rclone/rclone/fs/operations.checkHashes.func2+0x4e	github.com/rclone/rclone/fs/operations/operations.go:86
#	0x1003f9ae3	golang.org/x/sync/errgroup.(*Group).Go.func1+0x63		golang.org/x/sync@v0.1.0/errgroup/errgroup.go:75

7 @ 0x100040496 0x100050abe 0x10078483e 0x100076a81
#	0x10078483d	github.com/rclone/rclone/fs/march.(*March).Run.func1+0x15d	github.com/rclone/rclone/fs/march/march.go:159

1 @ 0x100034f56 0x10006f045 0x100441db5 0x100441bcd 0x10043e96b 0x10044f365 0x10044feb3 0x1003a68af 0x1003a8a89 0x1004c2356 0x1003a68af 0x1013fb6fa 0x1003a68af 0x1013fb6fa 0x1003a68af 0x1004c0090 0x1003aa236 0x1003a53d2 0x100076a81
#	0x10006f044	runtime/pprof.runtime_goroutineProfileWithLabels+0x24		runtime/mprof.go:844
#	0x100441db4	runtime/pprof.writeRuntimeProfile+0xb4				runtime/pprof/pprof.go:734
#	0x100441bcc	runtime/pprof.writeGoroutine+0x4c				runtime/pprof/pprof.go:694
#	0x10043e96a	runtime/pprof.(*Profile).WriteTo+0x14a				runtime/pprof/pprof.go:329
#	0x10044f364	net/http/pprof.handler.ServeHTTP+0x4a4				net/http/pprof/pprof.go:259
#	0x10044feb2	net/http/pprof.Index+0xf2					net/http/pprof/pprof.go:376
#	0x1003a68ae	net/http.HandlerFunc.ServeHTTP+0x2e				net/http/server.go:2122
#	0x1003a8a88	net/http.(*ServeMux).ServeHTTP+0x148				net/http/server.go:2500
#	0x1004c2355	github.com/go-chi/chi/v5.(*Mux).routeHTTP+0x215			github.com/go-chi/chi/v5@v5.0.8/mux.go:444
#	0x1003a68ae	net/http.HandlerFunc.ServeHTTP+0x2e				net/http/server.go:2122
#	0x1013fb6f9	github.com/go-chi/chi/v5/middleware.SetHeader.func1.1+0x139	github.com/go-chi/chi/v5@v5.0.8/middleware/content_type.go:13
#	0x1003a68ae	net/http.HandlerFunc.ServeHTTP+0x2e				net/http/server.go:2122
#	0x1013fb6f9	github.com/go-chi/chi/v5/middleware.SetHeader.func1.1+0x139	github.com/go-chi/chi/v5@v5.0.8/middleware/content_type.go:13
#	0x1003a68ae	net/http.HandlerFunc.ServeHTTP+0x2e				net/http/server.go:2122
#	0x1004c008f	github.com/go-chi/chi/v5.(*Mux).ServeHTTP+0x30f			github.com/go-chi/chi/v5@v5.0.8/mux.go:90
#	0x1003aa235	net/http.serverHandler.ServeHTTP+0x315				net/http/server.go:2936
#	0x1003a53d1	net/http.(*conn).serve+0x611					net/http/server.go:1995

1 @ 0x10003a4be 0x100072468 0x100416499 0x100076a81
#	0x100072467	os/signal.signal_recv+0x27	runtime/sigqueue.go:149
#	0x100416498	os/signal.loop+0x18		os/signal/signal_unix.go:23

1 @ 0x100040496 0x1000090ce 0x100008c7d 0x10078b01b 0x100787994 0x1007848dd 0x100076a81
#	0x10078b01a	github.com/rclone/rclone/fs/operations.(*checkMarch).Match+0x29a	github.com/rclone/rclone/fs/operations/check.go:148
#	0x100787993	github.com/rclone/rclone/fs/march.(*March).processJob+0xd93		github.com/rclone/rclone/fs/march/march.go:490
#	0x1007848dc	github.com/rclone/rclone/fs/march.(*March).Run.func1+0x1fc		github.com/rclone/rclone/fs/march/march.go:166

1 @ 0x100040496 0x100009bf0 0x1000099d8 0x100784665 0x100076a81
#	0x100784664	github.com/rclone/rclone/fs/march.(*March).Run.func2+0x44	github.com/rclone/rclone/fs/march/march.go:208

1 @ 0x100040496 0x100009edd 0x1000099d8 0x100416b8f 0x100076a81
#	0x100416b8e	github.com/rclone/rclone/lib/atexit.Register.func1.1+0x2e	github.com/rclone/rclone/lib/atexit/atexit.go:45

1 @ 0x100040496 0x100009edd 0x100009a18 0x101466734 0x100076a81
#	0x101466733	github.com/rclone/rclone/cmd.SigInfoHandler.func1+0x93	github.com/rclone/rclone/cmd/siginfo_bsd.go:20

1 @ 0x100040496 0x1000392d7 0x10006f4c9 0x100102092 0x100107b1d 0x100107b0b 0x100182a95 0x10019f305 0x10019df7d 0x1003aa865 0x1004c8c28 0x100076a81
#	0x10006f4c8	internal/poll.runtime_pollWait+0x88			runtime/netpoll.go:306
#	0x100102091	internal/poll.(*pollDesc).wait+0x31			internal/poll/fd_poll_runtime.go:84
#	0x100107b1c	internal/poll.(*pollDesc).waitRead+0x2bc		internal/poll/fd_poll_runtime.go:89
#	0x100107b0a	internal/poll.(*FD).Accept+0x2aa			internal/poll/fd_unix.go:614
#	0x100182a94	net.(*netFD).accept+0x34				net/fd_unix.go:172
#	0x10019f304	net.(*TCPListener).accept+0x24				net/tcpsock_posix.go:148
#	0x10019df7c	net.(*TCPListener).Accept+0x3c				net/tcpsock.go:297
#	0x1003aa864	net/http.(*Server).Serve+0x384				net/http/server.go:3059
#	0x1004c8c27	github.com/rclone/rclone/lib/http.instance.serve+0xa7	github.com/rclone/rclone/lib/http/server.go:132

1 @ 0x100040496 0x100050abe 0x1009c5f0d 0x100076a81
#	0x1009c5f0c	go.opencensus.io/stats/view.(*worker).start+0xac	go.opencensus.io@v0.24.0/stats/view/worker.go:292

1 @ 0x100040496 0x100050abe 0x101466027 0x100076a81
#	0x101466026	github.com/rclone/rclone/cmd.startProgress.func3+0x106	github.com/rclone/rclone/cmd/progress.go:57

1 @ 0x100040496 0x100051d6f 0x100051d46 0x100071c47 0x10008682b 0x100784525 0x10078b8c5 0x10078bf29 0x101482225 0x101461fdd 0x101481ed9 0x101436642 0x101436f3d 0x101463c76 0x101463c6b 0x10169cdd7 0x100040067 0x100076a81
#	0x100071c46	sync.runtime_Semacquire+0x26				runtime/sema.go:62
#	0x10008682a	sync.(*WaitGroup).Wait+0x4a				sync/waitgroup.go:116
#	0x100784524	github.com/rclone/rclone/fs/march.(*March).Run+0x3a4	github.com/rclone/rclone/fs/march/march.go:213
#	0x10078b8c4	github.com/rclone/rclone/fs/operations.CheckFn+0x224	github.com/rclone/rclone/fs/operations/check.go:231
#	0x10078bf28	github.com/rclone/rclone/fs/operations.Check+0x68	github.com/rclone/rclone/fs/operations/check.go:290
#	0x101482224	github.com/rclone/rclone/cmd/check.glob..func1.1+0x304	github.com/rclone/rclone/cmd/check/check.go:196
#	0x101461fdc	github.com/rclone/rclone/cmd.Run+0x11c			github.com/rclone/rclone/cmd/cmd.go:257
#	0x101481ed8	github.com/rclone/rclone/cmd/check.glob..func1+0x1d8	github.com/rclone/rclone/cmd/check/check.go:176
#	0x101436641	github.com/spf13/cobra.(*Command).execute+0x861		github.com/spf13/cobra@v1.6.1/command.go:916
#	0x101436f3c	github.com/spf13/cobra.(*Command).ExecuteC+0x3bc	github.com/spf13/cobra@v1.6.1/command.go:1044
#	0x101463c75	github.com/spf13/cobra.(*Command).Execute+0x75		github.com/spf13/cobra@v1.6.1/command.go:968
#	0x101463c6a	github.com/rclone/rclone/cmd.Main+0x6a			github.com/rclone/rclone/cmd/cmd.go:567
#	0x10169cdd6	main.main+0x16						github.com/rclone/rclone/rclone.go:14
#	0x100040066	runtime.main+0x206					runtime/proc.go:250

1 @ 0x100072e1f 0x10008f1a9 0x10010362e 0x100103616 0x100103396 0x1001808a9 0x100194205 0x10039ee5f 0x100076a81
#	0x100072e1e	syscall.syscall+0x7e				runtime/sys_darwin.go:23
#	0x10008f1a8	syscall.read+0x48				syscall/zsyscall_darwin_amd64.go:1209
#	0x10010362d	syscall.Read+0x2cd				syscall/syscall_unix.go:178
#	0x100103615	internal/poll.ignoringEINTRIO+0x2b5		internal/poll/fd_unix.go:794
#	0x100103395	internal/poll.(*FD).Read+0x35			internal/poll/fd_unix.go:163
#	0x1001808a8	net.(*netFD).Read+0x28				net/fd_posix.go:55
#	0x100194204	net.(*conn).Read+0x44				net/net.go:183
#	0x10039ee5e	net/http.(*connReader).backgroundRead+0x3e	net/http/server.go:674

Things I tried, all with the same result:

  • Unplug everything connected to the Mac except the external hard drive.
  • Test all four Thunderbolt ports.
  • Another Thunderbolt cable.
  • Upgrade to macOS 13.3 Ventura (released this week).
  • Install a clean macOS 13.3 Ventura: same thing, nothing changes.

With the above, I think @ncw is right in suggesting that it is due to a lock taken in the OS that has not been properly released. The problem is that I have only managed to trigger it with rclone: no other software that stresses the Thunderbolt hard drive has managed to freeze macOS. I have tried recreating the rclone check conditions using find + md5:

find /Volumes/hdd/test -type f -print0 | xargs -L1 -P8 -0 md5

With the above, I get eight md5 processes running in parallel... and it works without problems, on the same files!

log:

MD5 (/Volumes/hdd/test/.DS_Store) = 431ced5af2e785f162ee5d2c6c898268
MD5 (/Volumes/hdd/test/11) = 460a32a8b2210233459d9efa837b7501
MD5 (/Volumes/hdd/test/12) = 4f04c7baf003acbb7d19f5f8e75f9510
MD5 (/Volumes/hdd/test/16) = c0e79137e99d14b7d1e34409987b2d0e
MD5 (/Volumes/hdd/test/10) = 55abc018eb752e5f3ce67ad580a505bf
MD5 (/Volumes/hdd/test/13) = b2095e4b4810ef7875e6d1908ade0033
MD5 (/Volumes/hdd/test/15) = 199ce9b5c784a7e97ba1eff44792c03d
MD5 (/Volumes/hdd/test/14) = fd0c485e8f605c59ea63383dd25081c4
MD5 (/Volumes/hdd/test/1) = fa8ca5f78f3591f54d723522023c73ec
MD5 (/Volumes/hdd/test/19) = a24abf54cd400f1556c0dfe7e512e68a
MD5 (/Volumes/hdd/test/22) = 9c0f2129c3384d07d91ac490429e9fea
MD5 (/Volumes/hdd/test/17) = 101827522a731fb2bb2e7f433bf9f848
MD5 (/Volumes/hdd/test/18) = 74ff13160163dc38075e47664330f699
MD5 (/Volumes/hdd/test/2) = f3a5fafc31e562edd78e5c85b524ad73
MD5 (/Volumes/hdd/test/20) = 3762e339e4cf1c5efe3d20ecaa449c66
MD5 (/Volumes/hdd/test/21) = 04b1353498e091d9e70c204f32a3bbca
MD5 (/Volumes/hdd/test/23) = c8e60e5d20f6159bc9c224f6b3053161
MD5 (/Volumes/hdd/test/24) = 0be8d17ee3ebe62b4a166dd88eb102f7
MD5 (/Volumes/hdd/test/27) = 8e892b29270785e19461c9104932833f
MD5 (/Volumes/hdd/test/25) = b93510d382764ed8853ca1bd70840e09
MD5 (/Volumes/hdd/test/26) = 45502ca50f248daf92f57000b6a010fd
MD5 (/Volumes/hdd/test/28) = b49c09e1cb047989a0d5616ded716216
MD5 (/Volumes/hdd/test/29) = 49835af4a3e12b9cfdb9435e423e741b
MD5 (/Volumes/hdd/test/30) = a7055d209da9dd1e34cf8ece0cb90ac1
MD5 (/Volumes/hdd/test/3) = e04cf921f1d0b969521c439bf3332032
MD5 (/Volumes/hdd/test/31) = 42674de481a4031e90958078d7b0804a
MD5 (/Volumes/hdd/test/33) = 884dbdf8a40182b6e62c97b72e4e5adc
MD5 (/Volumes/hdd/test/32) = cb641a9411be91084035a8b3491d02d4
MD5 (/Volumes/hdd/test/37) = 21b5e47aeed7a88df2351a61e6a0923e
MD5 (/Volumes/hdd/test/34) = 86b07812268e6534e1d2efcb205a04cd
MD5 (/Volumes/hdd/test/35) = 69d91b1b9a16e99c8dc8cda9ba7618db
MD5 (/Volumes/hdd/test/36) = 0a7854b7e5fa48e666dc5061b975a78e
MD5 (/Volumes/hdd/test/38) = 5d8b5435120a4f9e3dc3b9ec6bc5f5c0
MD5 (/Volumes/hdd/test/4) = d7f11d1caaf16f5af3490e3b14911887
MD5 (/Volumes/hdd/test/40) = f6c31b268a967a2a686ae84f3eecb61d
MD5 (/Volumes/hdd/test/39) = 054fe01a96c7ecc11ad78d51d3d85571
MD5 (/Volumes/hdd/test/5) = 58b89da0b30e1e2a8f7e89327728ed7d
MD5 (/Volumes/hdd/test/6) = b49a95393376630e15434ad5f62f6f51
MD5 (/Volumes/hdd/test/7) = e92c591d1ee0b57ac030212d08606ac5
MD5 (/Volumes/hdd/test/8) = f4812122894d06fb3f5b3d462f1caf20
MD5 (/Volumes/hdd/test/9) = bde509a106878e6b704daf30a87534fd

Finally, I tried going back to macOS 12.6.4 Monterey and... it works! No macOS/rclone freezes!

log:

2023/03/31 13:08:48 DEBUG : rclone: Version "v1.62.2" starting with parameters ["rclone" "check" "/Users/vicmarto/test" "/Volumes/hdd/test" "--progress" "--rc" "-vv" "--log-file=/Users/vicmarto/rclone_check_log.txt"]
2023/03/31 13:08:48 NOTICE: Serving remote control on http://127.0.0.1:5572/
2023/03/31 13:08:48 DEBUG : Creating backend with remote "/Users/vicmarto/test"
2023/03/31 13:08:48 NOTICE: Config file "/Users/vicmarto/.config/rclone/rclone.conf" not found - using defaults
2023/03/31 13:08:48 DEBUG : Creating backend with remote "/Volumes/hdd/test"
2023/03/31 13:08:48 INFO  : Using md5 for hash comparisons
2023/03/31 13:08:48 DEBUG : Local file system at /Volumes/hdd/test: Waiting for checks to finish
2023/03/31 13:08:48 DEBUG : .DS_Store: md5 = 431ced5af2e785f162ee5d2c6c898268 OK
2023/03/31 13:08:48 DEBUG : .DS_Store: OK
2023/03/31 13:08:59 DEBUG : 16: md5 = c0e79137e99d14b7d1e34409987b2d0e OK
2023/03/31 13:08:59 DEBUG : 16: OK
2023/03/31 13:08:59 DEBUG : 10: md5 = 55abc018eb752e5f3ce67ad580a505bf OK
2023/03/31 13:08:59 DEBUG : 10: OK
2023/03/31 13:08:59 DEBUG : 1: md5 = fa8ca5f78f3591f54d723522023c73ec OK
2023/03/31 13:08:59 DEBUG : 1: OK
2023/03/31 13:08:59 DEBUG : 15: md5 = 199ce9b5c784a7e97ba1eff44792c03d OK
2023/03/31 13:08:59 DEBUG : 15: OK
2023/03/31 13:08:59 DEBUG : 14: md5 = fd0c485e8f605c59ea63383dd25081c4 OK
2023/03/31 13:08:59 DEBUG : 14: OK
2023/03/31 13:08:59 DEBUG : 13: md5 = b2095e4b4810ef7875e6d1908ade0033 OK
2023/03/31 13:08:59 DEBUG : 13: OK
2023/03/31 13:09:00 DEBUG : 12: md5 = 4f04c7baf003acbb7d19f5f8e75f9510 OK
2023/03/31 13:09:00 DEBUG : 12: OK
2023/03/31 13:09:00 DEBUG : 11: md5 = 460a32a8b2210233459d9efa837b7501 OK
2023/03/31 13:09:00 DEBUG : 11: OK
2023/03/31 13:09:10 DEBUG : 22: md5 = 9c0f2129c3384d07d91ac490429e9fea OK
2023/03/31 13:09:10 DEBUG : 22: OK
2023/03/31 13:09:10 DEBUG : 2: md5 = f3a5fafc31e562edd78e5c85b524ad73 OK
2023/03/31 13:09:10 DEBUG : 2: OK
2023/03/31 13:09:10 DEBUG : 19: md5 = a24abf54cd400f1556c0dfe7e512e68a OK
2023/03/31 13:09:10 DEBUG : 19: OK
2023/03/31 13:09:10 DEBUG : 21: md5 = 04b1353498e091d9e70c204f32a3bbca OK
2023/03/31 13:09:10 DEBUG : 21: OK
2023/03/31 13:09:11 DEBUG : 23: md5 = c8e60e5d20f6159bc9c224f6b3053161 OK
2023/03/31 13:09:11 DEBUG : 23: OK
2023/03/31 13:09:11 DEBUG : 18: md5 = 74ff13160163dc38075e47664330f699 OK
2023/03/31 13:09:11 DEBUG : 18: OK
2023/03/31 13:09:11 DEBUG : 20: md5 = 3762e339e4cf1c5efe3d20ecaa449c66 OK
2023/03/31 13:09:11 DEBUG : 20: OK
2023/03/31 13:09:11 DEBUG : 17: md5 = 101827522a731fb2bb2e7f433bf9f848 OK
2023/03/31 13:09:11 DEBUG : 17: OK
2023/03/31 13:09:20 DEBUG : 27: md5 = 8e892b29270785e19461c9104932833f OK
2023/03/31 13:09:20 DEBUG : 27: OK
2023/03/31 13:09:20 DEBUG : 24: md5 = 0be8d17ee3ebe62b4a166dd88eb102f7 OK
2023/03/31 13:09:20 DEBUG : 24: OK
2023/03/31 13:09:21 DEBUG : 25: md5 = b93510d382764ed8853ca1bd70840e09 OK
2023/03/31 13:09:21 DEBUG : 25: OK
2023/03/31 13:09:21 DEBUG : 30: md5 = a7055d209da9dd1e34cf8ece0cb90ac1 OK
2023/03/31 13:09:21 DEBUG : 30: OK
2023/03/31 13:09:21 DEBUG : 26: md5 = 45502ca50f248daf92f57000b6a010fd OK
2023/03/31 13:09:21 DEBUG : 26: OK
2023/03/31 13:09:21 DEBUG : 29: md5 = 49835af4a3e12b9cfdb9435e423e741b OK
2023/03/31 13:09:21 DEBUG : 29: OK
2023/03/31 13:09:21 DEBUG : 28: md5 = b49c09e1cb047989a0d5616ded716216 OK
2023/03/31 13:09:21 DEBUG : 28: OK
2023/03/31 13:09:22 DEBUG : 3: md5 = e04cf921f1d0b969521c439bf3332032 OK
2023/03/31 13:09:22 DEBUG : 3: OK
2023/03/31 13:09:31 DEBUG : 37: md5 = 21b5e47aeed7a88df2351a61e6a0923e OK
2023/03/31 13:09:31 DEBUG : 37: OK
2023/03/31 13:09:31 DEBUG : 36: md5 = 0a7854b7e5fa48e666dc5061b975a78e OK
2023/03/31 13:09:31 DEBUG : 36: OK
2023/03/31 13:09:31 DEBUG : 31: md5 = 42674de481a4031e90958078d7b0804a OK
2023/03/31 13:09:31 DEBUG : 31: OK
2023/03/31 13:09:32 DEBUG : 38: md5 = 5d8b5435120a4f9e3dc3b9ec6bc5f5c0 OK
2023/03/31 13:09:32 DEBUG : 38: OK
2023/03/31 13:09:32 DEBUG : 33: md5 = 884dbdf8a40182b6e62c97b72e4e5adc OK
2023/03/31 13:09:32 DEBUG : 33: OK
2023/03/31 13:09:33 DEBUG : 35: md5 = 69d91b1b9a16e99c8dc8cda9ba7618db OK
2023/03/31 13:09:33 DEBUG : 35: OK
2023/03/31 13:09:33 DEBUG : 32: md5 = cb641a9411be91084035a8b3491d02d4 OK
2023/03/31 13:09:33 DEBUG : 32: OK
2023/03/31 13:09:34 DEBUG : 34: md5 = 86b07812268e6534e1d2efcb205a04cd OK
2023/03/31 13:09:34 DEBUG : 34: OK
2023/03/31 13:09:40 DEBUG : 39: md5 = 054fe01a96c7ecc11ad78d51d3d85571 OK
2023/03/31 13:09:40 DEBUG : 39: OK
2023/03/31 13:09:41 DEBUG : 40: md5 = f6c31b268a967a2a686ae84f3eecb61d OK
2023/03/31 13:09:41 DEBUG : 40: OK
2023/03/31 13:09:42 DEBUG : 4: md5 = d7f11d1caaf16f5af3490e3b14911887 OK
2023/03/31 13:09:42 DEBUG : 4: OK
2023/03/31 13:09:42 DEBUG : 5: md5 = 58b89da0b30e1e2a8f7e89327728ed7d OK
2023/03/31 13:09:42 DEBUG : 5: OK
2023/03/31 13:09:42 DEBUG : 7: md5 = e92c591d1ee0b57ac030212d08606ac5 OK
2023/03/31 13:09:42 DEBUG : 7: OK
2023/03/31 13:09:43 DEBUG : 6: md5 = b49a95393376630e15434ad5f62f6f51 OK
2023/03/31 13:09:43 DEBUG : 6: OK
2023/03/31 13:09:43 DEBUG : 8: md5 = f4812122894d06fb3f5b3d462f1caf20 OK
2023/03/31 13:09:43 DEBUG : 8: OK
2023/03/31 13:09:44 DEBUG : 9: md5 = bde509a106878e6b704daf30a87534fd OK
2023/03/31 13:09:44 DEBUG : 9: OK
2023/03/31 13:09:44 NOTICE: Local file system at /Volumes/hdd/test: 0 differences found
2023/03/31 13:09:44 NOTICE: Local file system at /Volumes/hdd/test: 41 matching files
2023/03/31 13:09:44 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Checks:                41 / 41, 100%
Elapsed time:        55.5s

2023/03/31 13:09:44 DEBUG : 9 go routines active

debug:

goroutine profile: total 41
8 @ 0x100040496 0x100051d6f 0x100051d46 0x100071c47 0x10008682b 0x1003f9947 0x100796b88 0x10079685f 0x1007aefd4 0x10078ab83 0x10078b408 0x100076a81
#	0x100071c46	sync.runtime_Semacquire+0x26							runtime/sema.go:62
#	0x10008682a	sync.(*WaitGroup).Wait+0x4a							sync/waitgroup.go:116
#	0x1003f9946	golang.org/x/sync/errgroup.(*Group).Wait+0x26					golang.org/x/sync@v0.1.0/errgroup/errgroup.go:53
#	0x100796b87	github.com/rclone/rclone/fs/operations.checkHashes+0x2c7			github.com/rclone/rclone/fs/operations/operations.go:96
#	0x10079685e	github.com/rclone/rclone/fs/operations.CheckHashes+0x15e			github.com/rclone/rclone/fs/operations/operations.go:62
#	0x1007aefd3	github.com/rclone/rclone/fs/operations.Check.func1+0x73				github.com/rclone/rclone/fs/operations/check.go:275
#	0x10078ab82	github.com/rclone/rclone/fs/operations.(*checkMarch).checkIdentical+0x242	github.com/rclone/rclone/fs/operations/check.go:135
#	0x10078b407	github.com/rclone/rclone/fs/operations.(*checkMarch).Match.func1+0xa7		github.com/rclone/rclone/fs/operations/check.go:154

7 @ 0x100040496 0x100050abe 0x10078483e 0x100076a81
#	0x10078483d	github.com/rclone/rclone/fs/march.(*March).Run.func1+0x15d	github.com/rclone/rclone/fs/march/march.go:159

6 @ 0x100072e1f 0x10008f1a9 0x10010362e 0x100103616 0x100103396 0x10010e05e 0x10010e056 0x1004f5fb1 0x1000ad732 0x1001c8697 0x1001c8670 0x100cc5ccc 0x10079716f 0x1003f9ae4 0x100076a81
#	0x100072e1e	syscall.syscall+0x7e						runtime/sys_darwin.go:23
#	0x10008f1a8	syscall.read+0x48						syscall/zsyscall_darwin_amd64.go:1209
#	0x10010362d	syscall.Read+0x2cd						syscall/syscall_unix.go:178
#	0x100103615	internal/poll.ignoringEINTRIO+0x2b5				internal/poll/fd_unix.go:794
#	0x100103395	internal/poll.(*FD).Read+0x35					internal/poll/fd_unix.go:163
#	0x10010e05d	os.(*File).read+0x5d						os/file_posix.go:31
#	0x10010e055	os.(*File).Read+0x55						os/file.go:118
#	0x1004f5fb0	github.com/rclone/rclone/lib/readers.(*contextReader).Read+0x70	github.com/rclone/rclone/lib/readers/context.go:28
#	0x1000ad731	io.copyBuffer+0x1b1						io/io.go:427
#	0x1001c8696	io.Copy+0x416							io/io.go:386
#	0x1001c866f	github.com/rclone/rclone/fs/hash.StreamTypes+0x3ef		github.com/rclone/rclone/fs/hash/hash.go:119
#	0x100cc5ccb	github.com/rclone/rclone/backend/local.(*Object).Hash+0x64b	github.com/rclone/rclone/backend/local/local.go:945
#	0x10079716e	github.com/rclone/rclone/fs/operations.checkHashes.func2+0x4e	github.com/rclone/rclone/fs/operations/operations.go:86
#	0x1003f9ae3	golang.org/x/sync/errgroup.(*Group).Go.func1+0x63		golang.org/x/sync@v0.1.0/errgroup/errgroup.go:75

4 @ 0x1004f5fd4 0x1000ad732 0x1001c8697 0x1001c8670 0x100cc5ccc 0x1007972af 0x1003f9ae4 0x100076a81
#	0x1004f5fd3	github.com/rclone/rclone/lib/readers.(*contextReader).Read+0x93	github.com/rclone/rclone/lib/readers/context.go:23
#	0x1000ad731	io.copyBuffer+0x1b1						io/io.go:427
#	0x1001c8696	io.Copy+0x416							io/io.go:386
#	0x1001c866f	github.com/rclone/rclone/fs/hash.StreamTypes+0x3ef		github.com/rclone/rclone/fs/hash/hash.go:119
#	0x100cc5ccb	github.com/rclone/rclone/backend/local.(*Object).Hash+0x64b	github.com/rclone/rclone/backend/local/local.go:945
#	0x1007972ae	github.com/rclone/rclone/fs/operations.checkHashes.func1+0x4e	github.com/rclone/rclone/fs/operations/operations.go:75
#	0x1003f9ae3	golang.org/x/sync/errgroup.(*Group).Go.func1+0x63		golang.org/x/sync@v0.1.0/errgroup/errgroup.go:75

2 @ 0x100072e1f 0x10008f1a9 0x10010362e 0x100103616 0x100103396 0x10010e05e 0x10010e056 0x1004f5fb1 0x1000ad732 0x1001c8697 0x1001c8670 0x100cc5ccc 0x1007972af 0x1003f9ae4 0x100076a81
#	0x100072e1e	syscall.syscall+0x7e						runtime/sys_darwin.go:23
#	0x10008f1a8	syscall.read+0x48						syscall/zsyscall_darwin_amd64.go:1209
#	0x10010362d	syscall.Read+0x2cd						syscall/syscall_unix.go:178
#	0x100103615	internal/poll.ignoringEINTRIO+0x2b5				internal/poll/fd_unix.go:794
#	0x100103395	internal/poll.(*FD).Read+0x35					internal/poll/fd_unix.go:163
#	0x10010e05d	os.(*File).read+0x5d						os/file_posix.go:31
#	0x10010e055	os.(*File).Read+0x55						os/file.go:118
#	0x1004f5fb0	github.com/rclone/rclone/lib/readers.(*contextReader).Read+0x70	github.com/rclone/rclone/lib/readers/context.go:28
#	0x1000ad731	io.copyBuffer+0x1b1						io/io.go:427
#	0x1001c8696	io.Copy+0x416							io/io.go:386
#	0x1001c866f	github.com/rclone/rclone/fs/hash.StreamTypes+0x3ef		github.com/rclone/rclone/fs/hash/hash.go:119
#	0x100cc5ccb	github.com/rclone/rclone/backend/local.(*Object).Hash+0x64b	github.com/rclone/rclone/backend/local/local.go:945
#	0x1007972ae	github.com/rclone/rclone/fs/operations.checkHashes.func1+0x4e	github.com/rclone/rclone/fs/operations/operations.go:75
#	0x1003f9ae3	golang.org/x/sync/errgroup.(*Group).Go.func1+0x63		golang.org/x/sync@v0.1.0/errgroup/errgroup.go:75

2 @ 0x1004f5fd4 0x1000ad732 0x1001c8697 0x1001c8670 0x100cc5ccc 0x10079716f 0x1003f9ae4 0x100076a81
#	0x1004f5fd3	github.com/rclone/rclone/lib/readers.(*contextReader).Read+0x93	github.com/rclone/rclone/lib/readers/context.go:23
#	0x1000ad731	io.copyBuffer+0x1b1						io/io.go:427
#	0x1001c8696	io.Copy+0x416							io/io.go:386
#	0x1001c866f	github.com/rclone/rclone/fs/hash.StreamTypes+0x3ef		github.com/rclone/rclone/fs/hash/hash.go:119
#	0x100cc5ccb	github.com/rclone/rclone/backend/local.(*Object).Hash+0x64b	github.com/rclone/rclone/backend/local/local.go:945
#	0x10079716e	github.com/rclone/rclone/fs/operations.checkHashes.func2+0x4e	github.com/rclone/rclone/fs/operations/operations.go:86
#	0x1003f9ae3	golang.org/x/sync/errgroup.(*Group).Go.func1+0x63		golang.org/x/sync@v0.1.0/errgroup/errgroup.go:75

1 @ 0x100034f56 0x10006f045 0x100441db5 0x100441bcd 0x10043e96b 0x10044f365 0x10044feb3 0x1003a68af 0x1003a8a89 0x1004c2356 0x1003a68af 0x1013fb6fa 0x1003a68af 0x1013fb6fa 0x1003a68af 0x1004c0090 0x1003aa236 0x1003a53d2 0x100076a81
#	0x10006f044	runtime/pprof.runtime_goroutineProfileWithLabels+0x24		runtime/mprof.go:844
#	0x100441db4	runtime/pprof.writeRuntimeProfile+0xb4				runtime/pprof/pprof.go:734
#	0x100441bcc	runtime/pprof.writeGoroutine+0x4c				runtime/pprof/pprof.go:694
#	0x10043e96a	runtime/pprof.(*Profile).WriteTo+0x14a				runtime/pprof/pprof.go:329
#	0x10044f364	net/http/pprof.handler.ServeHTTP+0x4a4				net/http/pprof/pprof.go:259
#	0x10044feb2	net/http/pprof.Index+0xf2					net/http/pprof/pprof.go:376
#	0x1003a68ae	net/http.HandlerFunc.ServeHTTP+0x2e				net/http/server.go:2122
#	0x1003a8a88	net/http.(*ServeMux).ServeHTTP+0x148				net/http/server.go:2500
#	0x1004c2355	github.com/go-chi/chi/v5.(*Mux).routeHTTP+0x215			github.com/go-chi/chi/v5@v5.0.8/mux.go:444
#	0x1003a68ae	net/http.HandlerFunc.ServeHTTP+0x2e				net/http/server.go:2122
#	0x1013fb6f9	github.com/go-chi/chi/v5/middleware.SetHeader.func1.1+0x139	github.com/go-chi/chi/v5@v5.0.8/middleware/content_type.go:13
#	0x1003a68ae	net/http.HandlerFunc.ServeHTTP+0x2e				net/http/server.go:2122
#	0x1013fb6f9	github.com/go-chi/chi/v5/middleware.SetHeader.func1.1+0x139	github.com/go-chi/chi/v5@v5.0.8/middleware/content_type.go:13
#	0x1003a68ae	net/http.HandlerFunc.ServeHTTP+0x2e				net/http/server.go:2122
#	0x1004c008f	github.com/go-chi/chi/v5.(*Mux).ServeHTTP+0x30f			github.com/go-chi/chi/v5@v5.0.8/mux.go:90
#	0x1003aa235	net/http.serverHandler.ServeHTTP+0x315				net/http/server.go:2936
#	0x1003a53d1	net/http.(*conn).serve+0x611					net/http/server.go:1995

1 @ 0x10003a4be 0x100072468 0x100416499 0x100076a81
#	0x100072467	os/signal.signal_recv+0x27	runtime/sigqueue.go:149
#	0x100416498	os/signal.loop+0x18		os/signal/signal_unix.go:23

1 @ 0x100040496 0x1000090ce 0x100008c7d 0x10078b01b 0x100787994 0x1007848dd 0x100076a81
#	0x10078b01a	github.com/rclone/rclone/fs/operations.(*checkMarch).Match+0x29a	github.com/rclone/rclone/fs/operations/check.go:148
#	0x100787993	github.com/rclone/rclone/fs/march.(*March).processJob+0xd93		github.com/rclone/rclone/fs/march/march.go:490
#	0x1007848dc	github.com/rclone/rclone/fs/march.(*March).Run.func1+0x1fc		github.com/rclone/rclone/fs/march/march.go:166

1 @ 0x100040496 0x100009bf0 0x1000099d8 0x100784665 0x100076a81
#	0x100784664	github.com/rclone/rclone/fs/march.(*March).Run.func2+0x44	github.com/rclone/rclone/fs/march/march.go:208

1 @ 0x100040496 0x100009edd 0x1000099d8 0x100416b8f 0x100076a81
#	0x100416b8e	github.com/rclone/rclone/lib/atexit.Register.func1.1+0x2e	github.com/rclone/rclone/lib/atexit/atexit.go:45

1 @ 0x100040496 0x100009edd 0x100009a18 0x101466734 0x100076a81
#	0x101466733	github.com/rclone/rclone/cmd.SigInfoHandler.func1+0x93	github.com/rclone/rclone/cmd/siginfo_bsd.go:20

1 @ 0x100040496 0x1000392d7 0x10006f4c9 0x100102092 0x1001035f9 0x1001035e7 0x1001808a9 0x100194205 0x10039f3b1 0x1001ae85f 0x1001aea3d 0x1003a553c 0x100076a81
#	0x10006f4c8	internal/poll.runtime_pollWait+0x88		runtime/netpoll.go:306
#	0x100102091	internal/poll.(*pollDesc).wait+0x31		internal/poll/fd_poll_runtime.go:84
#	0x1001035f8	internal/poll.(*pollDesc).waitRead+0x298	internal/poll/fd_poll_runtime.go:89
#	0x1001035e6	internal/poll.(*FD).Read+0x286			internal/poll/fd_unix.go:167
#	0x1001808a8	net.(*netFD).Read+0x28				net/fd_posix.go:55
#	0x100194204	net.(*conn).Read+0x44				net/net.go:183
#	0x10039f3b0	net/http.(*connReader).Read+0x170		net/http/server.go:782
#	0x1001ae85e	bufio.(*Reader).fill+0xfe			bufio/bufio.go:106
#	0x1001aea3c	bufio.(*Reader).Peek+0x5c			bufio/bufio.go:144
#	0x1003a553b	net/http.(*conn).serve+0x77b			net/http/server.go:2030

1 @ 0x100040496 0x1000392d7 0x10006f4c9 0x100102092 0x100107b1d 0x100107b0b 0x100182a95 0x10019f305 0x10019df7d 0x1003aa865 0x1004c8c28 0x100076a81
#	0x10006f4c8	internal/poll.runtime_pollWait+0x88			runtime/netpoll.go:306
#	0x100102091	internal/poll.(*pollDesc).wait+0x31			internal/poll/fd_poll_runtime.go:84
#	0x100107b1c	internal/poll.(*pollDesc).waitRead+0x2bc		internal/poll/fd_poll_runtime.go:89
#	0x100107b0a	internal/poll.(*FD).Accept+0x2aa			internal/poll/fd_unix.go:614
#	0x100182a94	net.(*netFD).accept+0x34				net/fd_unix.go:172
#	0x10019f304	net.(*TCPListener).accept+0x24				net/tcpsock_posix.go:148
#	0x10019df7c	net.(*TCPListener).Accept+0x3c				net/tcpsock.go:297
#	0x1003aa864	net/http.(*Server).Serve+0x384				net/http/server.go:3059
#	0x1004c8c27	github.com/rclone/rclone/lib/http.instance.serve+0xa7	github.com/rclone/rclone/lib/http/server.go:132

1 @ 0x100040496 0x100050abe 0x1009c5f0d 0x100076a81
#	0x1009c5f0c	go.opencensus.io/stats/view.(*worker).start+0xac	go.opencensus.io@v0.24.0/stats/view/worker.go:292

1 @ 0x100040496 0x100050abe 0x101466027 0x100076a81
#	0x101466026	github.com/rclone/rclone/cmd.startProgress.func3+0x106	github.com/rclone/rclone/cmd/progress.go:57

1 @ 0x100040496 0x100051d6f 0x100051d46 0x100071c47 0x10008682b 0x100784525 0x10078b8c5 0x10078bf29 0x101482225 0x101461fdd 0x101481ed9 0x101436642 0x101436f3d 0x101463c76 0x101463c6b 0x10169cdd7 0x100040067 0x100076a81
#	0x100071c46	sync.runtime_Semacquire+0x26				runtime/sema.go:62
#	0x10008682a	sync.(*WaitGroup).Wait+0x4a				sync/waitgroup.go:116
#	0x100784524	github.com/rclone/rclone/fs/march.(*March).Run+0x3a4	github.com/rclone/rclone/fs/march/march.go:213
#	0x10078b8c4	github.com/rclone/rclone/fs/operations.CheckFn+0x224	github.com/rclone/rclone/fs/operations/check.go:231
#	0x10078bf28	github.com/rclone/rclone/fs/operations.Check+0x68	github.com/rclone/rclone/fs/operations/check.go:290
#	0x101482224	github.com/rclone/rclone/cmd/check.glob..func1.1+0x304	github.com/rclone/rclone/cmd/check/check.go:196
#	0x101461fdc	github.com/rclone/rclone/cmd.Run+0x11c			github.com/rclone/rclone/cmd/cmd.go:257
#	0x101481ed8	github.com/rclone/rclone/cmd/check.glob..func1+0x1d8	github.com/rclone/rclone/cmd/check/check.go:176
#	0x101436641	github.com/spf13/cobra.(*Command).execute+0x861		github.com/spf13/cobra@v1.6.1/command.go:916
#	0x101436f3c	github.com/spf13/cobra.(*Command).ExecuteC+0x3bc	github.com/spf13/cobra@v1.6.1/command.go:1044
#	0x101463c75	github.com/spf13/cobra.(*Command).Execute+0x75		github.com/spf13/cobra@v1.6.1/command.go:968
#	0x101463c6a	github.com/rclone/rclone/cmd.Main+0x6a			github.com/rclone/rclone/cmd/cmd.go:567
#	0x10169cdd6	main.main+0x16						github.com/rclone/rclone/rclone.go:14
#	0x100040066	runtime.main+0x206					runtime/proc.go:250

1 @ 0x100076a81

In conclusion: something is up with the macOS Ventura ↔ rclone combination, at least on my system. Is there any other test I can do to isolate the problem, or to get more information to troubleshoot it? Is there any log/test that could help find the cause? Thanks.

Very good debugging :slight_smile:

I'd have suggested something like your 8 parallel MD5s to see if you can find the problem like that but you already thought of that!

A slightly more accurate simulation of rclone would be to run 8 finds, each running one MD5 at a time - rclone will be traversing multiple directories at once, which may or may not make a difference.
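That could be sketched in shell along these lines (a hedged sketch: it hashes a throwaway directory for demonstration - in the thread's setup you would point it at e.g. /Volumes/hdd/test, and on macOS you would use `md5 -r` in place of `md5sum`):

```shell
# Launch 8 concurrent walkers over the same tree, each hashing one file
# at a time - roughly what rclone's 8 parallel checkers do.
dir=$(mktemp -d)
for i in 1 2 3 4; do printf 'data-%s\n' "$i" > "$dir/file$i"; done

out=$(
  for w in 1 2 3 4 5 6 7 8; do
    find "$dir" -type f -print0 | xargs -0 -n1 md5sum &   # macOS: md5 -r
  done
  wait
)
printf '%s\n' "$out"
rm -rf "$dir"
```

Each of the 8 walkers hashes every file, so with the 4 demo files the run prints 32 hash lines.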

Rclone doesn't open files on the local file system in any special way; however, the Go runtime does use async polling of file reads, which is a less travelled path in the OS.

You could try setting the environment variable GOMAXPROCS=1 to limit the Go runtime to using a single thread and then try your rclone check test. This may/may not make a difference.

It might also be worth trying your rclone check test with some older rclones. When you run rclone version it tells you the Go version it was compiled with, and maybe this problem was introduced by a particular Go version.

I might try writing a python program which used threads to read the files simultaneously. I suspect it is the threading of the file reads which is confusing the OS somehow, as your test with lots of md5 processes was fine.

Here is a possibly related Go bug

Though it isn't very similar. It does have some more env vars you can try in it!

Agree, excellent troubleshooting :slightly_smiling_face:

@ncw already gave some very good suggestions, so all I can add is some ideas to eliminate more possibilities:

Did you try these two (trying to eliminate the (drivers related to the) external or internal disk):

rclone check ~/test ~/test --progress --rc -vv --log-file=/Users/vicmarto/rclone_check_log.txt
rclone check /Volumes/hdd/test /Volumes/hdd/test --progress --rc -vv --log-file=/Users/vicmarto/rclone_check_log.txt

You may also want to try this to eliminate networking (--rc), screen output (--progress -vv), disk writes (--log-file -vv):

rclone check ~/test /Volumes/hdd/test -q

I made a little parallel threaded md5summer in Python which simulates what rclone does quite accurately.

Call it with python3 md5summer.py directory

I'd be interested to see if this causes the same problem.

#!/usr/bin/python3
"""
A parallel MD5 sum program
"""

import os
import sys
import threading
import queue
import hashlib
import os.path

q = queue.Queue()
threads = 8

def md5(file_path):
    hash_md5 = hashlib.md5()
    with open(file_path, "rb") as f:
        for chunk in iter(lambda: f.read(1048576), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()

print_mutex = threading.Lock()

def worker():
    """
    Do MD5 sums on the filenames passed in via the queue
    """
    while True:
        file_path = q.get()
        file_hash = md5(file_path)  # avoids shadowing the builtin hash()
        with print_mutex:
            print(f"{file_hash}  {file_path}")
        q.task_done()

def main():
    if len(sys.argv) != 2:
        print(f"Syntax: {sys.argv[0]} <directory>")
        raise SystemExit(2)
    path = sys.argv[1]

    # Start the worker threads
    for _ in range(threads):
        threading.Thread(target=worker, daemon=True).start()

    # Traverse the directory
    for root, dirs, files in os.walk(path):
        for file_path in files:
            q.put(os.path.join(root, file_path))

    # Block until all tasks are done.
    q.join()

if __name__ == "__main__":
    main()

Thank you @ncw for such good ideas. I think we are making progress:

GOMAXPROCS=1 has worked with macOS 13 Ventura!!!
Even GOMAXPROCS=3 seems stable. From GOMAXPROCS=4 onwards the problems start.

Then I tried earlier versions of rclone, to find out where the incompatibility with macOS 13 Ventura started (because with earlier versions of macOS there is no problem!).
I tried every version, one by one, down to v1.53.0, but without success; with all of them I experience the same freezing:

v1.62.1		go1.20.2
v1.62.0		go1.20.2
v1.61.1		go1.19.4
v1.61.0		go1.19.4
v1.60.1		go1.19.3
v1.60.0		go1.19.2
v1.59.2		go1.18.6 
v1.59.1		go1.18.5
v1.59.0		go1.18.3
v1.58.1		go1.17.9	
v1.58.0		go1.17.8
v1.57.0		go1.17.2
v1.56.2		go1.16.8
v1.56.1		go1.16.8
v1.56.0		go1.16.6
v1.55.1		go1.16.3
v1.55.0		go1.16.2
v1.54.1		go1.15.8
v1.54.0		go1.15.7
v1.53.4		go1.15.6
v1.53.3		go1.15.4
v1.53.2		go1.15.3
v1.53.1		go1.15.1
v1.53.0		go1.15

I have not tested anything prior to v1.53.0, because in those versions rclone check only checks one file at a time (instead of eight).

It seems to be some incompatibility between Go and macOS 13 Ventura that only triggers under extreme conditions.


Then I tried md5summer.py (thank you very much for your time) with this result:

On the internal SSD, there is no problem:

python3 md5summer.py ~/test
431ced5af2e785f162ee5d2c6c898268  /Users/vicmarto/test/.DS_Store
884dbdf8a40182b6e62c97b72e4e5adc  /Users/vicmarto/test/33
8e892b29270785e19461c9104932833f  /Users/vicmarto/test/27
cb641a9411be91084035a8b3491d02d4  /Users/vicmarto/test/32
69d91b1b9a16e99c8dc8cda9ba7618db  /Users/vicmarto/test/35
86b07812268e6534e1d2efcb205a04cd  /Users/vicmarto/test/34
bde509a106878e6b704daf30a87534fd  /Users/vicmarto/test/9
74ff13160163dc38075e47664330f699  /Users/vicmarto/test/18
3762e339e4cf1c5efe3d20ecaa449c66  /Users/vicmarto/test/20
e92c591d1ee0b57ac030212d08606ac5  /Users/vicmarto/test/7
49835af4a3e12b9cfdb9435e423e741b  /Users/vicmarto/test/29
460a32a8b2210233459d9efa837b7501  /Users/vicmarto/test/11
b49c09e1cb047989a0d5616ded716216  /Users/vicmarto/test/28
c0e79137e99d14b7d1e34409987b2d0e  /Users/vicmarto/test/16
b49a95393376630e15434ad5f62f6f51  /Users/vicmarto/test/6
101827522a731fb2bb2e7f433bf9f848  /Users/vicmarto/test/17
fa8ca5f78f3591f54d723522023c73ec  /Users/vicmarto/test/1
a24abf54cd400f1556c0dfe7e512e68a  /Users/vicmarto/test/19
55abc018eb752e5f3ce67ad580a505bf  /Users/vicmarto/test/10
f4812122894d06fb3f5b3d462f1caf20  /Users/vicmarto/test/8
45502ca50f248daf92f57000b6a010fd  /Users/vicmarto/test/26
0a7854b7e5fa48e666dc5061b975a78e  /Users/vicmarto/test/36
04b1353498e091d9e70c204f32a3bbca  /Users/vicmarto/test/21
5d8b5435120a4f9e3dc3b9ec6bc5f5c0  /Users/vicmarto/test/38
42674de481a4031e90958078d7b0804a  /Users/vicmarto/test/31
21b5e47aeed7a88df2351a61e6a0923e  /Users/vicmarto/test/37
a7055d209da9dd1e34cf8ece0cb90ac1  /Users/vicmarto/test/30
0be8d17ee3ebe62b4a166dd88eb102f7  /Users/vicmarto/test/24
d7f11d1caaf16f5af3490e3b14911887  /Users/vicmarto/test/4
c8e60e5d20f6159bc9c224f6b3053161  /Users/vicmarto/test/23
054fe01a96c7ecc11ad78d51d3d85571  /Users/vicmarto/test/39
199ce9b5c784a7e97ba1eff44792c03d  /Users/vicmarto/test/15
e04cf921f1d0b969521c439bf3332032  /Users/vicmarto/test/3
f6c31b268a967a2a686ae84f3eecb61d  /Users/vicmarto/test/40
f3a5fafc31e562edd78e5c85b524ad73  /Users/vicmarto/test/2
4f04c7baf003acbb7d19f5f8e75f9510  /Users/vicmarto/test/12
58b89da0b30e1e2a8f7e89327728ed7d  /Users/vicmarto/test/5
b2095e4b4810ef7875e6d1908ade0033  /Users/vicmarto/test/13
9c0f2129c3384d07d91ac490429e9fea  /Users/vicmarto/test/22
fd0c485e8f605c59ea63383dd25081c4  /Users/vicmarto/test/14
b93510d382764ed8853ca1bd70840e09  /Users/vicmarto/test/25

But macOS freezes on the external disk... :frowning:

python3 md5summer.py /Volumes/hdd/test
431ced5af2e785f162ee5d2c6c898268 /Volumes/hdd/test/.DS_Store

What can this mean?

@Ole, thank you very much for the suggestion! I think we have found something very interesting. All the following tests are with version 1.62.2 and WITHOUT GOMAXPROCS=1:

Internal SSD tested against external hard drive:

rclone check ~/test /Volumes/hdd/test -q

Result: Frozen, as usual.


Internal SSD checked against itself:

rclone check ~/test ~/test -q

Result: no problems.


External hard drive checked against itself:

rclone check /Volumes/hdd/test /Volumes/hdd/test -q

and

rclone check /Volumes/hdd/test /Volumes/hdd/test --progress --rc -vv --log-file=/Users/vicmarto/rclone_check_log.txt

Result: no problem !!!!! I thought it was going to freeze!!!!! :open_mouth:


With the above result, can we conclude that the internal SSD + external HDD combination is the conflicting one? It seems so, but the result of md5summer.py puzzles me.

Any other ideas or suggestions would be welcome, what would be the next logical step now?

I use rclone for backups, so fixing this problem is of vital importance to me.

Interesting, but not necessarily puzzling when chasing what seems to be a race condition in either hardware or software.

The point is that we are talking about probabilities that something bad happens. Imagine a traffic light which is sometimes briefly green in both directions; it takes a certain amount of traffic (or occasional bad luck) until something bad happens. So the fact that nothing bad happens cannot be used to say that a component is OK; we can only exclude a component if it wasn't used/active when the bad thing happened.

So we now know for certain that the issue isn't related to:

  • Your NAS
  • Go + rclone - thanks to @ncw !!!
  • Python

That is, the root cause is something in or below macOS, which is huge progress :sweat_smile:

The most likely issue origins are now (ordered by my judgement of decreasing likelihood):

  1. Your external hard disk, incl. Thunderbolt hardware and associated drivers in the Mac.
  2. Your Mac hardware generally (e.g. bad DMA controller or processor cooling)
  3. macOS Ventura
  4. macOS generally

The reason I put your specific hardware at the top is that a more general problem with the "traffic lights" in macOS (Ventura) would also become more visible on the internet, due to a higher than normal occurrence of occasional freezes. I don't see such a solid (above normal) pattern of macOS (Ventura) freezes on disk access when searching the internet.

I like the scientific approach, so my next step would again be to reduce the possibilities, e.g. by trying to disprove that the origin lies in the external hard disk (incl. Thunderbolt HW/SW in the Mac) or in your Mac hardware generally.

Here are some ideas:

Can you reproduce with the external hard disk disconnected if you try hard enough? (e.g. by running rclone check and/or the python program in a loop for some hours using only the internal disk - perhaps also concurrently, to max the stress)

Can you reproduce if using another Mac with the same external hard disk?

Can you find others reporting similar issues when you search for macOS+Thunderbolt, macOS+theExternalDiskBrand, ...?

Can you reproduce with macOS Monterey if you try hard enough?

Can you reproduce if using another external hard disk with Thunderbolt?

Can you reproduce if using another external hard disk without Thunderbolt?
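A soak loop along the lines of the first idea might look like this (a sketch: `md5sum` over a scratch directory stands in for the real pass - in the thread that would be `python3 md5summer.py ~/test` - and the pass limit is kept tiny here, where a real test would run for hours):

```shell
# Repeat a hashing pass until it fails or we hit the pass limit,
# stressing only one disk at a time.
dir=$(mktemp -d)
printf 'stress-test\n' > "$dir/f"
passes=0
max_passes=3    # a real soak test would use a far higher limit
while [ "$passes" -lt "$max_passes" ]; do
  md5sum "$dir"/* > /dev/null || break   # stand-in for: python3 md5summer.py ~/test
  passes=$((passes+1))
done
echo "completed $passes passes"
rm -rf "$dir"
```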

Another Mac user checking in...

I am running

rclone v1.62.2
- os/version: darwin 13.3 (64 bit)
- os/kernel: 22.4.0 (arm64)
- os/type: darwin
- os/arch: arm64 (ARMv8 compatible)
- go/version: go1.20.2
- go/linking: dynamic
- go/tags: cmount

on new 2023 Mac mini with an M2 processor running Ventura 13.3 (22E252).

Do note that this is ARM while @vicmarto is on Intel. May matter.

I ran @ncw's MD5 program with only the modification to print the MD5 of the filename too for privacy.

  • Local SSD: No deadlock
  • USB3 external hard drive HFS+: No deadlock
  • USB3 external SSD APFS: No Deadlock
  • APFS Sparsebundle on USB3 external SSD APFS: No Deadlock

This "doesn't prove a negative" but I haven't been able to reproduce.

I wonder if it is Intel vs ARM?

I have an Intel work laptop, but I may not plug personal external drives into it. So I can test the local SSD today, but an external drive test will have to wait until Monday. It is also running macOS 12 (for now, I need to upgrade).

So that is definitely pointing towards concurrency being the issue.

That was good testing. I know the Go standard library's way of accessing files has changed over the last few versions, but you've tested enough versions that we can rule that out.

Ah... So not a Go specific thing.

It looks like heavy concurrency is triggering this bug.

You could run it in a loop on the internal SSD to see if you can get it to trigger there, to rule out OS vs hardware. I think my money would be on something wrong with the external disk interface - perhaps errors on the cable...

One thing you should probably do is look at the kernel logs to see if there is anything in there. I don't know how you do this on macOS, but I'm sure a bit of Googling will reveal the way. That may show something interesting.
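For what it's worth, macOS keeps kernel messages in the unified log, which can be queried from the shell; something along these lines (the exact predicates are an assumption and may need adjusting) should surface recent error-level messages, or let you watch the kernel live while reproducing the freeze:

```shell
# Show error-level messages from the last hour (macOS unified log)
log show --last 1h --predicate 'messageType == error'

# Stream kernel messages live while reproducing the freeze
log stream --predicate 'process == "kernel"'
```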

I suspect the results will be a little random, so you'll probably need to repeat any passing tests a few times to see if you can get them to fail.

The suggestions from @Ole are all good.

PS I enjoyed writing md5summer.py - I haven't done any threading in python for ages so it was interesting to see how it has progressed.

I am really overwhelmed and grateful for so many good tips/suggestions. @ncw even took the trouble to write md5summer.py (which has turned out to be really helpful). Many thanks to all of you!

I have tried to reproduce the problem on macOS 12 Monterey (on the external hard disk), and macOS 13.3 Ventura (on the internal SSD disk) with the help of md5summer.py for hours: both seem to be fine.

So the logical conclusion is some problem with this Thunderbolt external hard disk + macOS 13 Ventura combination, so I will work in that direction from now on.

At the moment, I am checking the disk for errors.

Good luck!

I wrote a little program you may find helpful for checking the disk

It's been used to find errors on a lot of different types of disk!

Agree, but I doubt you will be able to get much closer to the exact cause of this issue with the tools at your disposal. Sorry!

I therefore suggest you try to find a reasonable workaround, that is, reducing rclone concurrency to around half the level where you see issues (even after testing for hours).

You can use GOMAXPROCS like you tried above and/or limit the number of concurrent checkers/transfers like this:

rclone sync --checkers=4 --transfers=2
rclone sync --checkers=2 --transfers=2
rclone sync --checkers=2 --transfers=1
rclone sync --checkers=1 --transfers=1
rclone sync --checkers=1 --transfers=1 --check-first

It probably takes some testing to find the sweet spot.

Well, it looks like I finally found the cause of the issue: the external Thunderbolt enclosure... Fortunately, the hard drive was fine!

Until recently I didn't have another enclosure to test, and that's why it took me so long.

For posterity: it was NOT a bug in rclone or Go, but a hardware bug or incompatibility that only arose when using rclone, and only under macOS Ventura.


Happy to hear, excellent troubleshooting!