Idea about corrupt on transfer

What is the problem you are having with rclone?

Files will not copy to GDrive File Stream, either directly from Box or from a local copy, without a "corrupt on transfer file sizes differ" error unless I turn on the flags --ignore-size --ignore-checksum, whereupon they copy but end up a different size. Inspecting the local source and the Google target with Beyond Compare, I find that rclone has apparently padded the tail of the target with 00's to some "nice" blocksize (this is not normal Google behavior: files I upload with Beyond Compare, for example, are not padded). This means that I cannot do a post-compare (with BC) to ensure the files are the same, because they are not. I can send a screengrab if you want. I am hoping to transfer a large number of files directly from Box to Google Drive without having them changed en route and without having to download them to a local drive first and then re-upload to Google.

What is your rclone version (output from rclone version)

1.53.1

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Win 10 64

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone.exe copy "D:\fromBox\1996_BRP_FL_S1083" "G:\My Drive\CCB_Datastore\Projects\1996\1996_BRP_FL_S1083" -i --ignore-size --ignore-checksum

The rclone config contents with secrets removed.

[Box]
type = box
box_sub_type = enterprise
token = {"access_token":"","token_type":"bearer","refresh_token":"","expiry":"2020-10-22T15:51:21.30599-04:00"}

[GoogleFS]
type = drive
scope = drive
token = {"access_token":"","token_type":"Bearer","refresh_token":"","expiry":"2020-10-22T15:52:56.2940105-04:00"}


A log from the command with the -vv flag

Paste  log here

Do you have a log you can share?

Can you tell me where it is created? I ran with -vv and can't locate it (newb, sorry)

I should add (or point out) that I am running in -i mode and stepping through files, so there is actually no error in the command I execute, the error is that the output file has been altered. Not sure a log is going to be much help but happy to provide.

Another piece of info: I can find the file in the Google Cache on the local machine and it has the same (wrong) size as the file I see on the Google Drive after upload. So the file size change (0-padding) happened as rclone made the copy from local drive to Google, not something Google did on the way up to Googleville. I can find other files I have recently uploaded using Beyond Compare to Google Drive and they have not been altered (padded) in size (and in fact, will CRC check as binary same).
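
The padding described above can be checked without a GUI diff tool. Below is a minimal Python sketch, not anything rclone itself provides: the function names are made up for illustration, and it assumes both files fit in memory. It reports how many trailing NUL bytes the target has beyond the source, or None if the target differs in any other way.

```python
from pathlib import Path
from typing import Optional


def nul_padding_length(src: bytes, dst: bytes) -> Optional[int]:
    """Return the number of trailing NUL bytes if dst is src plus only NULs.

    Returns 0 when the two are identical and None when dst differs from
    src in any way other than NUL padding at the end.
    """
    if not dst.startswith(src):
        return None  # content differs, or dst is shorter than src
    tail = dst[len(src):]
    if tail.strip(b"\x00"):
        return None  # something other than NUL bytes follows the source data
    return len(tail)


def file_nul_padding(source_path: str, target_path: str) -> Optional[int]:
    """File-based wrapper (hypothetical helper); reads both files fully."""
    return nul_padding_length(Path(source_path).read_bytes(),
                              Path(target_path).read_bytes())
```

Calling file_nul_padding on the local .flac and its copy under G: would show whether the only difference is zero padding at the end, which is what Beyond Compare showed here.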

You'd use -vv and --log-file rclone.log; if you don't use --log-file, the log is output to the screen. I'd use a log file though.

Grr won't let me upload the .log file

2020/10/23 10:13:09 DEBUG : rclone: Version "v1.53.1" starting with parameters ["rclone.exe" "copy" "D:\fromBox\1996_BRP_FL_S1083" "G:\My Drive\CCB_Datastore\Projects\1996\1996_BRP_FL_S1083\S1083_FL01_FLAC" "-i" "--ignore-checksum" "--ignore-size" "-vv" "--log-file" "C:\users\crp11\rclone.log"]
2020/10/23 10:13:09 DEBUG : Creating backend with remote "D:\fromBox\1996_BRP_FL_S1083"
2020/10/23 10:13:09 DEBUG : Using config file from "C:\Users\crp11\.config\rclone\rclone.conf"
2020/10/23 10:13:09 DEBUG : fs cache: renaming cache item "D:\fromBox\1996_BRP_FL_S1083" to be canonical "//?/D:/fromBox/1996_BRP_FL_S1083"
2020/10/23 10:13:09 DEBUG : Creating backend with remote "G:\My Drive\CCB_Datastore\Projects\1996\1996_BRP_FL_S1083\S1083_FL01_FLAC"
2020/10/23 10:13:09 DEBUG : fs cache: renaming cache item "G:\My Drive\CCB_Datastore\Projects\1996\1996_BRP_FL_S1083\S1083_FL01_FLAC" to be canonical "//?/G:/My Drive/CCB_Datastore/Projects/1996/1996_BRP_FL_S1083/S1083_FL01_FLAC"
2020/10/23 10:13:09 DEBUG : Local file system at //?/G:/My Drive/CCB_Datastore/Projects/1996/1996_BRP_FL_S1083/S1083_FL01_FLAC: Waiting for checks to finish
2020/10/23 10:13:09 DEBUG : Local file system at //?/G:/My Drive/CCB_Datastore/Projects/1996/1996_BRP_FL_S1083/S1083_FL01_FLAC: Waiting for transfers to finish
2020/10/23 10:13:12 INFO : S1083_FL01_19960426_deployInfo/New Smyrna Data.doc: Copied (new)
2020/10/23 10:13:15 INFO : S1083_FL01_19960426_deployInfo/New Smyrna Data.txt: Copied (new)
2020/10/23 10:13:17 INFO : S1083_FL01_FLAC/2CH_44100H_60s/NS_09/S1083FL01_44100H_M02_NS09_19960501_000000.flac: Copied (new)
2020/10/23 10:13:20 NOTICE: Quitting rclone now

hi,

there does not seem to be any problems with that log.

you need to run the original command that generated the problems with a debug log.

what is the G: drive, a GDrive File Stream, a rclone mount or what?
if it is a rclone mount, can you post the mount command?

Animosity022 asked for a log, so I reran the original copy (first post) by appending -vv --log-file rclone.log, ran it, copied the contents of that .log and pasted it since the forum won't let a new user just drag and drop.
I don't think there's anything wrong with the log either! The problem statement is that when I execute the command, the transferred file (the .flac) ends up bigger than it started because something (I suspect rclone) padded it with 0's to some new size for no apparent reason. This is unacceptable: I want the exact same bits on both ends when the copy concludes.
The G: drive in this case is a Google Drive File Stream (not team) mountpoint on a Win10 64-bit Dell PC. I routinely copy thousands of files using this and similar machines from various sources to this same mountpoint and never before have I seen the files "get bigger" as a result. So since I found no prior solutions to similarly reported issues about corruption, I am simply offering the information that rclone padded my files. If there is a flag to say "don't pad my friggin' files" please tell me what it is. Cheers

ok, now i better understand and yes, that is frustrating.

it is possible that the issue is not a rclone bug but something about file stream.
that flac file can only get into the google cache, if google itself put it there.

have you tried to copy that flac file via file stream, on another computer?

i would try a few tests with that flac file

  • rclone local to local
  • copy to gdrive using rclone mount GoogleFS h:
  • copy to gdrive using google website

Thanks, the first 2 ideas are good tests (in a couple zooms right now so will try later and post results). The third won't help due to the fact that once I debug this, what I really need to do is move 100+TB from Box to Google so trying to pipeline this. :grinning:
I did use rclone to pull down files from Box to this PC (D:) and the files are fine, so yes, Google may be involved. Note that i checked the file in the intermediate local cache that Google employs and it is already 'corrupt' at that point, even prior to upload to Google proper. Thanks again

The most efficient way to do this would be to use rclone to copy straight from box to drive and not use google file stream at all. The files will appear in google file stream once they are copied.

This will stream the files through your computer but you won't need local disk space.

I'm not sure why you are seeing the NULs at the end of the blocks when you transfer a file. It sounds like it is something to do with how rclone writes to the file system and for some reason google file stream doesn't like that. One thing you could try is --multi-thread-streams 0 - that will stop rclone making sparse files which might be affecting things.

Thanks, Nick. That sounds like a good idea but I have no idea how to do it. My only (known to me) interface to Google is by firing up Google Drive File Stream on this PC. is there a more direct route? My boss uses ftp (local->Google) but I don't think that will help with the Box->Google need I have. If I missed something where rclone would do this, I'd use it! I'll go back to trying the other tests meanwhile.

My understanding of Google Drive File Stream is that it lets you see the contents of your existing google drive as a mounted drive - is that right?

If so, then follow the setup here to create a google drive remote - call it gdrive (so where the docs use the word "remote" you type in "gdrive").

Check it is working by typing rclone lsf gdrive: - you should see a list of your files.

Do the same thing for the box setup, calling the remote box this time. Check it is working with rclone lsf box:

Now you can use rclone copy to copy straight from box: to gdrive:, something like

rclone copy -v box:path/to/files gdrive:path/to/destination

Add the -i flag to check that rclone is doing what you want before running anything.

executing 'eyes -scales' on local-user
Completely overthinking it here; I had set up both remotes (Box and GoogleFS) thinking those configs were simply dealing with authentication, but missed the concept that those were effectively mount point names, and so reverted to using the Windows drive mount points where I had the Box Drive and Google FS services on the local PC. God knows what those two get up to under the covers. I see now that I can do something vastly simpler like 'rclone copy Box:/dir GoogleFS:/dir', so will proceed with that plan as soon as the big job running on this PC ends. I expect this will work (lsd's did work), but will holler if not.
Thanks much Nick (and the others who responded)!

Let us know if you need any more help :slight_smile:

Completed first substantial copy (17 hrs, 143 GB, 47000 files) from Box to Google Drive with no errors. All good! (yes I know how to use tar, but some of our work requires lots of files, other times I tar first; tars too large don't fit on Box anyway)

For the record, the dumb thing I did that caused the initial report was to build configs for remotes 'Box' and 'GoogleFS' (Google Drive via File Stream) to establish the authentication keys, then because I already had these two services mounted on Windows 10 with Box Drive app showing my Box dirs under C:/users/me/Box and Google File Stream mounted at G:, I tried to:
rclone copy C:/users/me/Box/subdir G:/subdir
and while the copy job ran, it created the corrupt (padded) files

When I did it right, it was simply:
rclone copy Box:/subdir GoogleFS:/subdir
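
The difference between the two commands is whether each argument names a Windows drive path (handled as a local file system, as the debug log's "Local file system at //?/G:/..." lines show) or a configured remote such as Box: or GoogleFS:. A rough Python sketch of that distinction follows. It is a simplified heuristic written for illustration, with a hypothetical function name, and is not rclone's actual path parser.

```python
import re

# Assumption for illustration: a single ASCII letter followed by a colon
# is a Windows drive letter; longer names before the colon are remotes.
_DRIVE_LETTER = re.compile(r"^[A-Za-z]:")


def is_windows_drive_path(spec: str) -> bool:
    """Return True if spec looks like a drive-letter path such as G:\\dir."""
    return bool(_DRIVE_LETTER.match(spec))


if __name__ == "__main__":
    for spec in (r"G:\subdir", "C:/users/me/Box/subdir",
                 "Box:/subdir", "GoogleFS:/subdir"):
        kind = "local drive path" if is_windows_drive_path(spec) else "remote (or other)"
        print(f"{spec!r}: {kind}")
```

With the remote names, rclone talks to the Box and Google Drive APIs itself instead of writing through the Box Drive and File Stream mounts.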

Great, glad it is working now.

I think the null padded files are a bug, but not sure if that is in rclone or Google file stream or the box program. I'd guess probably not rclone since you used it to just copy local files as far as it was concerned.

I have encountered the same problem today when trying to use rclone with google drive file stream on windows. I have managed to pin down the problem to something going wrong when rclone attempts to copy any files to the drive file stream mount. The problem arises even if an attempt is made to copy a local file to the drive file stream mount point with rclone. For example:

C:\Windows\system32>rclone copy D:\Desktop\1.txt "G:\My Drive"
2020/12/24 14:55:30 ERROR : 1.txt: corrupted on transfer: sizes differ 3 vs 512
2020/12/24 14:55:30 ERROR : Attempt 1/3 failed with 1 errors and: corrupted on transfer: sizes differ 3 vs 512
2020/12/24 14:55:30 ERROR : 1.txt: corrupted on transfer: sizes differ 3 vs 512
2020/12/24 14:55:30 ERROR : Attempt 2/3 failed with 1 errors and: corrupted on transfer: sizes differ 3 vs 512
2020/12/24 14:55:30 ERROR : 1.txt: corrupted on transfer: sizes differ 3 vs 512
2020/12/24 14:55:30 ERROR : Attempt 3/3 failed with 1 errors and: corrupted on transfer: sizes differ 3 vs 512
2020/12/24 14:55:30 Failed to copy: corrupted on transfer: sizes differ 3 vs 512

In this scenario D:\ is a NTFS partition of a physical local hard disk and G:\ is the mount point of google drive file stream.
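
The "3 vs 512" in the errors above is consistent with the destination size being rounded up to a whole number of 512-byte blocks, which would match the "nice" blocksize padding reported earlier in the thread. The Python sketch below only illustrates that arithmetic. The 512-byte block size is an assumption taken from this single log, not something established in the thread.

```python
def round_up_to_block(size: int, block: int = 512) -> int:
    """Round size up to the next multiple of block (block=512 is an assumption)."""
    if size < 0 or block <= 0:
        raise ValueError("size must be >= 0 and block must be > 0")
    # Ceiling division using integers only, then scale back up.
    return -(-size // block) * block


if __name__ == "__main__":
    print(round_up_to_block(3))    # the 3-byte 1.txt from the log above
    print(round_up_to_block(513))
```

Comparing round_up_to_block(source_size) against the size reported at the destination for a few more files would show whether the padding really follows a fixed block size.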

Conversely, when attempting to copy files with other command-line tools, no problems arise and no file corruption is present. For example:

C:\Windows\system32>COPY D:\Desktop\1.txt "G:\My Drive"
        1 file(s) copied.

While the bug is irrelevant where local to local copying is concerned, the same situation arises when trying to directly copy from other remotes to drive file stream or trying to write local files to a crypt remote whose files source is a drive file stream mount.

Still no idea what the actual cause of this behavior might be though.

hello and welcome to the forum,

imho, using rclone to copy to gfstream is not a good idea
rclone mount will do the same as gfstream

as per the OP,

and per the rclone author

Thank you!

However, for my case and in my experience on windows at least, rclone mount cannot handle opening small files as smoothly as file stream can. Whilst file stream seems to open files only a bit slower than a mechanical hard drive after accounting for time to transfer the file, rclone mount often makes my viewer software outright freeze for 5-10 seconds each time a new file is opened. Even a private API key or vfs-mode full don't do much to alleviate that. It's not a problem when large files like videos are concerned since those must buffer a bit anyway, but it does make it quite frustrating working with a large amount of image files which is one of my main tasks.

ok, i understand. trying to get the best solution for your use-case.

as a test, copy one file and post
-- debug log with errors
-- the config , redacting secrets.