Before I start testing this any further, I would like to make sure I understand this properly.
Recently I started including directories in my sync that frequently have files being written to them.
One of those programs is HandBrake, which converts an MP4 to another format (so files larger than 1 GiB being written).
I was under the impression that rclone would detect files that are being written and ignore them (I don't remember anymore where I read this).
But rclone started happily uploading a file that was still being written by HandBrake.
The file size shown by --progress was 0, but the file was uploading data.
So did I misunderstand, or should rclone ignore files that are currently being written?
If I understood correctly, does this work properly on Windows as far as you know?
Based on this I will either start reproducing the issue or adjust my sync accordingly.
What is your rclone version (output from rclone version)
rclone v1.54.0
os/arch: windows/amd64
go version: go1.15.7
Which OS you are using and how many bits (eg Windows 7, 64 bit)
Windows 10 x64 20H2
Which cloud storage system are you using? (eg Google Drive)
Google Drive
The command you were trying to run (eg rclone copy /tmp remote:tmp)
I was just going to ask if such an option exists because, in my tests, rclone does not detect files being written. It would be great if a new parameter were introduced to do this.
I too am syncing directories that constantly have files being written; it would be great if rclone just ignored those files. I know --ignore-errors exists, but wouldn't that ignore other errors too?
@Animosity022
It didn't for me. But I didn't wait for the upload to catch up, and besides, my upload is sufficiently slow that it might never.
Of course it's possible that on the check run it found the file, but because it was being written the size was 0, and by the time rclone actually got around to uploading it, it might have been completed.
Like I said, I would need to test this further once we know what should happen based on the intended design.
VSS isn't much use for me, since it's specific folders that have the files written to them, after which they are moved. It was never an option to upload those folders in the past, since --track-renames did not exist yet (especially for crypt).
@asdffdsa
Yeah, I know, but before I actually invested the time into creating the situation with debug logs (I don't create those by default), I thought maybe my thinking was wrong, in which case the time would be wasted.
But based on both your replies: am I correct to assume that rclone should produce an error the moment it starts reading the file for upload while it is still being written by another application?
If a program has a file open and is actively writing to it, and VSS were to copy that file as-is, the copy would be corrupted and unusable.
With VSS, to create a snapshot of an open file that is being written to:
An application like SQL Server or Exchange Server has to register itself with VSS via a callback hook.
Then, when VSS wants to back up an SQL database, VSS will contact the database server, which will put that database into a self-consistent state and then tell VSS it is ready.
Only then will VSS copy that file.
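The register/freeze/copy/thaw handshake described above can be modeled in miniature. This is a plain Python sketch purely to illustrate the protocol flow; the real VSS API is COM-based and Windows-only, and all class and method names here are made up:

```python
class Writer:
    """Models an application (e.g. a database server) registered with the
    snapshot service via a callback hook."""

    def __init__(self, name):
        self.name = name
        self.consistent = False

    def on_freeze(self):
        # Callback invoked by the snapshot service: flush buffers and
        # reach a self-consistent on-disk state before the snapshot.
        self.consistent = True

    def on_thaw(self):
        # Snapshot taken; resume normal writing.
        self.consistent = False


class SnapshotService:
    """Models the coordinator (VSS) side of the handshake."""

    def __init__(self):
        self.writers = []

    def register(self, writer):
        self.writers.append(writer)

    def snapshot(self):
        # 1. Ask every registered writer to quiesce.
        for w in self.writers:
            w.on_freeze()
        # 2. Only now is it safe to copy the files; record each
        #    writer's state at snapshot time.
        result = [(w.name, w.consistent) for w in self.writers]
        # 3. Let the writers resume normal operation.
        for w in self.writers:
            w.on_thaw()
        return result
```

The key point the model captures: the copy step happens strictly between freeze and thaw, so an unregistered application (like HandBrake) gets no chance to quiesce and its in-flight files are copied in whatever state they happen to be in.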
True
I would consider SQL Server one of those programs constantly modifying files.
But we are getting sidetracked, and I believe I've wasted enough of your time.
Somebody please confirm or deny this, so that I can either start working on the debug logs or close this thread.
So that means you ran something, but there isn't a log to validate what you are seeing. If a file is being modified / written to, rclone will stop uploading that file. The way HandBrake works, it constantly opens and closes the file and doesn't put a lock on it, so rclone only knows the file changed by detecting the size changing, which shows up in the log.
I can't figure out what you are trying to get answered at this point.
If you want to take a 'snap' of a system that constantly writes data, you use a VSS snapshot on Windows: make a snap and back that up. That gives you a consistent point-in-time copy, which is how every piece of backup software I've ever touched works on Windows (I'm not talking about VMDK/virtual server backups, as those are much different).
Rclone does its best but the detection isn't perfect.
The way it works is: rclone checks the size and modification time of files as they are being uploaded and aborts with a message that starts "can't copy - source file is being updated" if the file changes during the upload.
So if the file is open in another app but doesn't get written to for the time it takes rclone to upload it, then the file will be successfully uploaded.
It would be useful to see what error messages you are getting in your log and that will tell us exactly what is going on.
The logs I had are production environment logs; they only log notice-level events. I would have to reproduce the issue to get debug logs. I also cancelled the job at the time because I thought something wasn't right based on --progress.
I want to know whether or not I understand rclone correctly and the way it is currently designed to work (assuming no bugs or design flaws). I'm asking because I suspect this is more an issue with my understanding of rclone and random things I've read in the past on the forum and in the documentation, and not with rclone itself.
Purely theoretical and system agnostic:
Let's assume a new file is being written by a program and is not constantly being opened and closed by that program, so it holds a write lock for the duration of the rclone session.
Would rclone try uploading, or detect the write lock beforehand?
Would rclone produce an error when starting, during, or after the upload? I guess that depends on when it finds out the file size is changing.
So while I was writing this, @ncw answered a lot of my questions.
Sorry, no logs yet, since I cancelled the job mid-transfer when I noticed something weird in --progress.
But based on your answers and what I know of Windows:
Windows would report the file size of a new file that is still being written as 0, only reporting the actual file size once writing is done. Windows only allocates the full size beforehand when it knows the full size (as with file copying), which does not apply to video conversion, where the final size is not known. The same goes for modification times, which are updated at creation and when writing finishes, but not during.
So based on this, rclone's detection would not work properly on Windows. That's not rclone's fault; it's just the way Windows works.
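Whether the detection works therefore hinges entirely on what the OS reports while the file is open. A quick way to observe this on any platform is to poll `os.stat` while appending; this sketch forces flushes, so on most Unix filesystems the reported size grows step by step, while a writer that buffers internally (or an OS that defers metadata updates for open handles) can make the reported size lag, which is exactly what defeats a size-change check:

```python
import os


def observed_sizes(path, chunks):
    """Append chunks to path, recording what os.stat reports after each write.

    Results vary by OS and by how the writing program buffers and flushes,
    which is the point: a size-change check only sees what stat reports.
    """
    sizes = []
    with open(path, "ab") as f:
        for chunk in chunks:
            f.write(chunk)
            f.flush()              # force the data out of the userspace buffer
            os.fsync(f.fileno())   # ask the OS to commit it
            sizes.append(os.stat(path).st_size)
    return sizes
```

Running this against a file while a converter like HandBrake is writing it (without the explicit flushes, which you can't control from outside) would show whether the sizes your OS reports actually move during the write.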
Thank you @Animosity022, @asdffdsa, and @ncw for your patience and answers.
Maybe I'll try creating a debug log, but I don't think it's worth it based on the available information, unless this is something where there is a desire to improve the existing functionality, or one of you is interested in the debug logs. Personally I'm just as happy excluding the folders in question.
felix@flux:/mnt/c/Users/earlt/Downloads/rclone-v1.53.4-windows-amd64$ ./rclone.exe copy ../Test/TestFile.m4v blah.mkv
2021/03/05 11:12:29 NOTICE: Config file "C:\\Users\\earlt\\.config\\rclone\\rclone.conf" not found - using defaults
2021/03/05 11:12:29 NOTICE: TestFile.m4v: Removing partially written file on error: can't copy - source file is being updated (size changed from 40632368 to 40894512)
2021/03/05 11:12:29 ERROR : TestFile.m4v: Failed to copy: can't copy - source file is being updated (size changed from 40632368 to 40894512)
2021/03/05 11:12:29 ERROR : Attempt 1/3 failed with 1 errors and: can't copy - source file is being updated (size changed from 40632368 to 40894512)
2021/03/05 11:12:29 ERROR : Attempt 2/3 succeeded
Since there is no file locking happening, rclone works off whatever the OS reports, so I'm not sure here. The file isn't locked for writing and it changes; no program would pick that up. It's not a case of rclone working or not working, as it's just reporting what the OS tells it.
Generally what people do is not upload in-process or extraction folders: move files to a completed location when done, to avoid trying to copy or process things that are in flight.
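The "completed location" pattern mentioned above is easy to get right with an atomic rename: write into a staging directory on the same filesystem as the synced directory, then move the finished file into the synced folder in one step. A generic Python sketch (the function name and directory layout are made up for illustration):

```python
import os


def publish(staging_path, synced_dir):
    """Atomically move a finished file from staging into the synced folder.

    Because os.replace is a single atomic rename on the same filesystem,
    the sync tool never sees a half-written file: from its point of view
    the file either doesn't exist yet or is already complete.
    """
    final_path = os.path.join(synced_dir, os.path.basename(staging_path))
    os.replace(staging_path, final_path)
    return final_path
```

So the converter writes to `staging/`, only `synced/` is pointed at rclone, and `publish()` runs once the conversion finishes.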
Since you invested the time, I felt I had to do it as well.
As you can see, my error is a little bit different but amounts to the same result.
It completes eventually, but the file in question was actually still being written, so the upload is incomplete and corrupted.
2021/03/05 17:31:52 DEBUG : rclone: Version "v1.54.0" starting with parameters ["C:\\rclone\\rclone.exe" "sync" "--bwlimit" "3M:off" "--fast-list" "--track-renames" "--track-renames-strategy" "modtime" "--check-first" "--delete-after" "--progress" "--log-file" "C:\\Users\\username\\Desktop\\MirrorD Log\\Rclone TEST 2021-03-05 173151.log" "--log-level" "DEBUG" "local:E:/Rclone test/" "cryptGDrive:TEST"]
2021/03/05 17:31:52 DEBUG : Using config file from "C:\\Users\\username\\.config\\rclone\\rclone.conf"
2021/03/05 17:31:52 INFO : Starting bandwidth limiter at 3M:offBytes/s
2021/03/05 17:31:52 DEBUG : Creating backend with remote "local:E:/Rclone test/"
2021/03/05 17:31:52 DEBUG : fs cache: renaming cache item "local:E:/Rclone test/" to be canonical "//?/E:/Rclone test/"
2021/03/05 17:31:52 DEBUG : Creating backend with remote "cryptGDrive:TEST"
2021/03/05 17:31:52 DEBUG : Creating backend with remote "remoteGDrive:Crypt/m74tehp8uh9l7dulkggbj13qvc"
2021/03/05 17:31:52 DEBUG : remoteGDrive: Loaded invalid token from config file - ignoring
2021/03/05 17:31:52 DEBUG : Keeping previous permissions for config file: -rw-rw-rw-
2021/03/05 17:31:52 DEBUG : remoteGDrive: Saved new token in config file
2021/03/05 17:31:53 INFO : Encrypted drive 'cryptGDrive:TEST': Running all checks before starting transfers
2021/03/05 17:31:53 INFO : Encrypted drive 'cryptGDrive:TEST': Making map for --track-renames
2021/03/05 17:31:53 INFO : Encrypted drive 'cryptGDrive:TEST': Finished making map for --track-renames
2021/03/05 17:31:53 DEBUG : Encrypted drive 'cryptGDrive:TEST': Waiting for checks to finish
2021/03/05 17:31:53 INFO : Encrypted drive 'cryptGDrive:TEST': Checks finished, now starting transfers
2021/03/05 17:31:53 DEBUG : Encrypted drive 'cryptGDrive:TEST': Waiting for renames to finish
2021/03/05 17:31:53 DEBUG : Encrypted drive 'cryptGDrive:TEST': Waiting for transfers to finish
2021/03/05 17:32:13 ERROR : testconv.mp4: corrupted on transfer: sizes differ 48 vs 32768048
2021/03/05 17:32:13 INFO : testconv.mp4: Removing failed copy
2021/03/05 17:32:14 ERROR : Encrypted drive 'cryptGDrive:TEST': not deleting files as there were IO errors
2021/03/05 17:32:14 ERROR : Encrypted drive 'cryptGDrive:TEST': not deleting directories as there were IO errors
2021/03/05 17:32:14 ERROR : Attempt 1/3 failed with 1 errors and: corrupted on transfer: sizes differ 48 vs 32768048
2021/03/05 17:32:14 INFO : Encrypted drive 'cryptGDrive:TEST': Running all checks before starting transfers
2021/03/05 17:32:14 INFO : Encrypted drive 'cryptGDrive:TEST': Making map for --track-renames
2021/03/05 17:32:14 INFO : Encrypted drive 'cryptGDrive:TEST': Finished making map for --track-renames
2021/03/05 17:32:14 DEBUG : Encrypted drive 'cryptGDrive:TEST': Waiting for checks to finish
2021/03/05 17:32:14 INFO : Encrypted drive 'cryptGDrive:TEST': Checks finished, now starting transfers
2021/03/05 17:32:14 DEBUG : Encrypted drive 'cryptGDrive:TEST': Waiting for renames to finish
2021/03/05 17:32:14 DEBUG : Encrypted drive 'cryptGDrive:TEST': Waiting for transfers to finish
2021/03/05 17:32:15 DEBUG : 925dbeon2raajojj0dklpassbk: Sending chunk 0 length 8388608
2021/03/05 17:32:17 DEBUG : 925dbeon2raajojj0dklpassbk: Sending chunk 8388608 length 8388608
2021/03/05 17:32:19 DEBUG : 925dbeon2raajojj0dklpassbk: Sending chunk 16777216 length 8388608
2021/03/05 17:32:22 DEBUG : 925dbeon2raajojj0dklpassbk: Sending chunk 25165824 length 8388608
2021/03/05 17:32:24 DEBUG : 925dbeon2raajojj0dklpassbk: Sending chunk 33554432 length 4465824
2021/03/05 17:32:27 INFO : testconv.mp4: Copied (new)
2021/03/05 17:32:27 DEBUG : Waiting for deletions to finish
2021/03/05 17:32:27 ERROR : Attempt 2/3 succeeded
2021/03/05 17:32:27 INFO :
Transferred: 67.517M / 67.517 MBytes, 100%, 2.031 MBytes/s, ETA 0s
Transferred: 1 / 1, 100%
Elapsed time: 35.4s
2021/03/05 17:32:27 DEBUG : 6 go routines active
You could try the --local-no-check-updated flag, which will allow transfer of files that are being modified on a best-effort basis. That would work if you think it is better to have the file partially uploaded than not uploaded at all.
Not needed. As I said earlier, it's only a temporary folder where the files are written before they are moved to a permanent location.
Syncing the folder was only feasible thanks to --track-renames, but because of the upload issue it's better to add the folder to the exclusion list.