We have multiple MFP printers that we use to scan documents into an SMB share. Next year we're migrating to Chrome OS, and I'd rather not maintain two sets of credentials (one for G Suite, one for Active Directory for the SMB share). While experimenting, I found that I can have Rclone routinely check an SMB share and move files from there to their respective locations in Google Drive. The only problem I'm having so far is that Rclone uploads a copy while the file is still in use - is there a way I can tell Rclone NOT to try moving a file if it's in use by another process? I don't want scans being uploaded partially before the scanner has finished writing the file.
Thank you!
What is your rclone version (output from rclone version)?
That means the file wasn't in use at the time it was being copied.
An example would be like this:
C:\Users\earlt\Downloads\rclone-v1.53.3-windows-amd64>rclone copy blah.txt GD: -vv
2020/12/10 14:05:14 DEBUG : rclone: Version "v1.53.3" starting with parameters ["rclone" "copy" "blah.txt" "GD:" "-vv"]
2020/12/10 14:05:14 DEBUG : Creating backend with remote "blah.txt"
2020/12/10 14:05:14 DEBUG : Using config file from "C:\\Users\\earlt\\.config\\rclone\\rclone.conf"
2020/12/10 14:05:14 DEBUG : fs cache: adding new entry for parent of "blah.txt", "//?/C:/Users/earlt/Downloads/rclone-v1.53.3-windows-amd64"
2020/12/10 14:05:14 DEBUG : Creating backend with remote "GD:"
2020/12/10 14:05:14 DEBUG : blah.txt: Need to transfer - File not found at Destination
2020/12/10 14:05:14 ERROR : blah.txt: Failed to copy: failed to open source object: The process cannot access the file because it is being used by another process.
2020/12/10 14:05:14 ERROR : Attempt 1/3 failed with 1 errors and: failed to open source object: The process cannot access the file because it is being used by another process.
2020/12/10 14:05:15 DEBUG : blah.txt: Need to transfer - File not found at Destination
2020/12/10 14:05:15 ERROR : blah.txt: Failed to copy: failed to open source object: The process cannot access the file because it is being used by another process.
2020/12/10 14:05:15 ERROR : Attempt 2/3 failed with 1 errors and: failed to open source object: The process cannot access the file because it is being used by another process.
2020/12/10 14:05:15 DEBUG : blah.txt: Need to transfer - File not found at Destination
2020/12/10 14:05:15 ERROR : blah.txt: Failed to copy: failed to open source object: The process cannot access the file because it is being used by another process.
2020/12/10 14:05:15 ERROR : Attempt 3/3 failed with 1 errors and: failed to open source object: The process cannot access the file because it is being used by another process.
2020/12/10 14:05:15 INFO :
Transferred: 0 / 0 Bytes, -, 0 Bytes/s, ETA -
Errors: 1 (retrying may help)
Elapsed time: 1.0s
2020/12/10 14:05:15 DEBUG : 4 go routines active
2020/12/10 14:05:15 Failed to copy: failed to open source object: The process cannot access the file because it is being used by another process.
But I had the file open in Adobe Acrobat the whole time, not just while it wasn't being copied.
Adobe Acrobat was just an example. Is there a native way for Rclone to ignore a file until it's no longer being written to?
I'd have to do more testing to see exactly how the MFPs write files to the SMB share - whether they cache the scan and then transfer it, or write directly to the file they save on the share.
Just because a file is in use with Acrobat does not mean that Acrobat has the file open at any given moment.
Acrobat might open a file for editing, save a change, and then close the file.
As a test, you might write a simple batch script, running as the same user that rclone runs as.
As a separate test, I had that file open in Adobe Acrobat and tried deleting it manually in File Explorer. Windows rejected the delete, saying the file is open in a program, and it does so consistently, so the file appears to be constantly open in Adobe Acrobat in one way or another - maybe in a way that Rclone ignores?
UPDATE: I tested scanning a 35-page document and had Rclone running the command every second. While the scanner was still writing the file, Rclone came up with an error saying something along the lines of:
"..source file is being updated. Size changed..."
Looks like the functionality I'm looking for is definitely there!!! THANKS
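For anyone who wants to approximate the same "wait until the file stops changing" behaviour outside of rclone, the idea can be sketched in a short POSIX shell script. This is an illustrative sketch only - the function name, polling interval, and demo file are assumptions, not rclone's actual implementation:

```shell
#!/bin/sh
# Sketch: poll a file's size and only treat it as safe to move once the
# size stops changing between checks - roughly the same signal rclone
# reports as "source file is being updated. Size changed".

wait_until_stable() {
  file=$1
  interval=${2:-2}   # seconds between size checks
  prev=$(wc -c < "$file")
  while :; do
    sleep "$interval"
    cur=$(wc -c < "$file")
    [ "$cur" -eq "$prev" ] && return 0   # size unchanged: assume writing finished
    prev=$cur
  done
}

# Demo with a throwaway file standing in for a scan on the SMB share:
printf 'fake scan data' > /tmp/scan_demo.pdf
wait_until_stable /tmp/scan_demo.pdf 1 && echo "stable: safe to move"
```

Once the file is stable you would hand it to rclone move; in practice rclone's built-in size check makes an external wrapper like this unnecessary, as the test above showed.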
That is a pretty good test. It isn't perfect though.
What you probably want is to include something like a --min-age 5m filter flag which won't attempt to sync things unless they are at least 5 minutes old.
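For example, a scheduled job could run something like the following. The share path and remote name are placeholders for your setup; --min-age itself is a real rclone filter flag:

```shell
# Move scans that haven't been modified in the last 5 minutes.
# \\server\scans and GD:Scans are assumed names - substitute your own.
rclone move \\server\scans GD:Scans --min-age 5m -v
```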
That would be nice, but it's not ideal. We scan a lot of documents, and making people wait up to 5 minutes for their scans to appear in Google Drive is way too long.
Ah, I didn't realise you might have impatient users on the other end!
You can make the number smaller, which might work well. Does the scanner write the PDF all in one go at the end, or does it write it while scanning each page? If the former, setting --min-age 15s might work, provided the scanner takes less than 15 seconds to save the file (which seems likely).
It appears the solution I have with Rclone is working correctly. It detects when the PDF is open and being modified, then moves it once the MFP is done with it.