The flag: --use-server-modtime has no effect on a remote of type azure blob
Note: I am connecting to an azure storage account of type "Datalake Gen2".
When I upload a file "report1.txt" from my Linux server to an Azure Data Lake Gen2 storage, the
rclone lsl command always shows the time when the file was created on Linux (Metadata key Mtime).
I was expecting to see the time when the file was uploaded to the storage container, i.e. the server modification time (Last Modified in the screenshot).
Both commands provide the same result:
rclone lsl datalake2:sbdatalake/reports/ --use-server-modtime
rclone lsl datalake2:sbdatalake/reports/
1048576 2022-01-26 09:55:21.588528368 report1.txt
Note:
2022-01-26 09:55 : is the time the file was created on Linux
I uploaded the file to the container today, 28 Jan.
I was expecting to see the date when I uploaded the file: 2022-01-28 .. in case I use the flag --use-server-modtime
In AWS S3 it works without any issues. When I provide the --use-server-modtime flag it shows the date of the upload to the bucket.
Run the command 'rclone version' and share the full output of the command.
rclone v1.57.0
os/version: Microsoft Windows 10 Pro 2009 (64 bit)
os/kernel: 10.0.19042.1466 (x86_64)
os/type: windows
os/arch: amd64
go/version: go1.17.2
go/linking: dynamic
go/tags: cmount
Which cloud storage system are you using? (eg Google Drive)
Azure Blob Storage, Data Lake Gen2 ( basically Azure Blob with support for directories)
The command you were trying to run (eg rclone copy /tmp remote:tmp)
Both commands provide the same result:
rclone lsl datalake2:sbdatalake/reports/ --use-server-modtime
rclone lsl datalake2:sbdatalake/reports/
1048576 2022-01-26 09:55:21.588528368 report1.txt
![2022-01-28 16_36_25-reports_report1.txt - Microsoft Azure|455x500](upload://n7tJZW6qozTmNwox9LAWtXzbDpo.png)
The rclone config contents with secrets removed.
[datalake2]
type = azureblob
account = stonebranchsd
service_principal_file = C:\coding\datalake\azure-principal.json
azure-principal.json:
{
"appId": "xyz",
"displayName": "abc",
"password": "klm",
"tenant": "abc"
}
The connection to Azure Data Lake Gen2 is via a service principal.
A log from the command with the -vv flag
C:\coding\datalake\rclone>rclone lsl -vv datalake2:sbdatalake/reports/ --use-server-modtime
2022/01/28 17:05:54 DEBUG : rclone: Version "v1.57.0" starting with parameters ["rclone" "lsl" "-vv" "datalake2:sbdatalake/reports/" "--use-server-modtime"]
2022/01/28 17:05:54 DEBUG : Creating backend with remote "datalake2:sbdatalake/reports/"
2022/01/28 17:05:54 DEBUG : Using config file from "C:\\Users\\nils.buer\\AppData\\Roaming\\rclone\\rclone.conf"
2022/01/28 17:05:54 DEBUG : fs cache: renaming cache item "datalake2:sbdatalake/reports/" to be canonical "datalake2:sbdatalake/reports"
1048576 2022-01-26 09:55:21.588528368 report1.txt
2022/01/28 17:05:54 DEBUG : 6 go routines active
C:\coding\datalake\rclone>rclone lsl -vv datalake2:sbdatalake/reports/
2022/01/28 17:06:11 DEBUG : rclone: Version "v1.57.0" starting with parameters ["rclone" "lsl" "-vv" "datalake2:sbdatalake/reports/"]
2022/01/28 17:06:11 DEBUG : Creating backend with remote "datalake2:sbdatalake/reports/"
2022/01/28 17:06:11 DEBUG : Using config file from "C:\\Users\\nils.buer\\AppData\\Roaming\\rclone\\rclone.conf"
2022/01/28 17:06:11 DEBUG : fs cache: renaming cache item "datalake2:sbdatalake/reports/" to be canonical "datalake2:sbdatalake/reports"
1048576 2022-01-26 09:55:21.588528368 report1.txt
2022/01/28 17:06:11 DEBUG : 6 go routines active
Hi,
thanks for supporting me.
What I noticed is that when I upload a file from my local disk to an Azure container using the Azure console, everything works fine (no Metadata.Mtime property is set; see screenshot).
The command: rclone lsf -vv datalake2:sbdatalake/reports/report2.txt --format "tsp" --use-server-modtime
returns the time when I uploaded the file to the container (and not the time when I last modified it on my local file system), which is what I want:
=> 2022-01-29 01:55:48;104;report2.txt
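If it helps anyone comparing these timestamps in a script, here is a minimal Go sketch that splits one line of `rclone lsf --format "tsp"` output into time, size and path. It assumes the default `;` separator and the `2006-01-02 15:04:05` time layout seen in the line above; the `entry` type and `parseTSP` function are names made up for this illustration and are not part of rclone.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// entry is a hypothetical holder for one parsed lsf line.
type entry struct {
	modTime time.Time
	size    int64
	path    string
}

// parseTSP parses a line in "time;size;path" order.
// The ";" separator and the time layout are assumptions taken from the
// sample output in this thread.
func parseTSP(line string) (entry, error) {
	// Split into at most 3 parts so a ";" inside the path is preserved.
	parts := strings.SplitN(line, ";", 3)
	if len(parts) != 3 {
		return entry{}, fmt.Errorf("expected 3 fields, got %d", len(parts))
	}
	// The printed time carries no zone, so time.Parse treats it as UTC here.
	t, err := time.Parse("2006-01-02 15:04:05", parts[0])
	if err != nil {
		return entry{}, fmt.Errorf("bad time %q: %w", parts[0], err)
	}
	size, err := strconv.ParseInt(parts[1], 10, 64)
	if err != nil {
		return entry{}, fmt.Errorf("bad size %q: %w", parts[1], err)
	}
	return entry{modTime: t, size: size, path: parts[2]}, nil
}

func main() {
	e, err := parseTSP("2022-01-29 01:55:48;104;report2.txt")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(e.path, e.size, e.modTime.Format(time.RFC3339))
}
```

Parsing both the flagged and unflagged runs this way makes it easy to check whether the two timestamps actually differ.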
When I upload the same file using the rclone copy command, a Metadata.Mtime property is set, and the lsf command:
rclone lsf -vv datalake2:sbdatalake/reports/ --format "tsp" --use-server-modtime
always returns the time when I last modified the file on my local disk (which was last year).
The flag "--use-server-modtime" does not change this behaviour. The command always returns the value of the Metadata.Mtime property.
Note:
See below the requested log-file from the copy command:
rclone copy windows:C:\demo\report1.txt datalake2:sbdatalake/reports/ --dump=headers --retries=1 --low-level-retries=1 --log-level=DEBUG --log-file=rclone.log
Attachment: rclone.log (9.4 KB)
each example seems to use a different source file with a different set of dates?
so going forward, let's just use the latest example, reports/report2.txt
and using both rclone lsl and rclone lsf?
since your latest example uses rclone lsf, let's use that.
Hi,
I tested it on AWS S3, Azure Blob storage, and Azure Blob storage with ADLS Gen2 (Datalake) enabled. On AWS S3 it works fine: when using the flag --use-server-modtime I see the date when I uploaded the file to the bucket.
On Azure the flag has no effect, regardless of whether ADLS Gen2 is enabled on the Blob storage account or not.
Azure Blob storage with ADLSGen2 (Datalake) enabled
Hi Nick,
I had a quick look. It seems azure blob should work similarly to AWS S3.
When the flag "--use-server-modtime" is set we need to ignore the value in the metadata key: mtime and use the LastModified time ( hopefully returned in the http headers - need to check that).
When the flag "--use-server-modtime" is not set we will first try to read the metadata key mtime and if that isn't present the LastModified returned is used.
I think I need to add to the azureblob.go file the function
func (o *Object) ModTime(ctx context.Context) time.Time {
	if o.fs.ci.UseServerModTime {
		return o.lastModified
	}
	...
I will ask our Software Architect (rclone user: asaglam0) to help me with the implementation of "--use-server-modtime" for Azure and, if possible, also for Google GCS.
Is there a "How-to" document we could read upfront on the development process or should I just send you the changes we made after we tested everything?
Hi Nick,
Abdullah (rclone user: asaglam0) our Software Architect fixed the issue with the --use-server-modtime flag when using Azure Storage buckets as remote. When the flag --use-server-modtime is provided, the server modified time is used instead of the object metadata. The behavior is now the same as it is currently for the AWS S3 remote. Note: Once you accept our fix, we will also implement it for Google GCS. Abdullah plans to upload our changes within the next week.