Gdrive mounted folder visible from HPC-nodes

What is the problem you are having with rclone?

Hi,

I'm working on a Linux cluster where I have access to the SLURM queuing system (I do not have root access).
I have mounted my gdrive from the login node with the command below:

The mount works fine, and from the login node I can see the folders & files.
Nonetheless, if I enter a compute node I can no longer see the gdrive folders under the mount point.

I tried to use the --allow-other option, but unfortunately it did not work.

Are there some flags that could make the mounted folder accessible from all the nodes?

Besides that, are there some flags that could make read speeds faster?

I tried to change the settings described at https://rclone.org/commands/rclone_mount/#vfs-file-caching but saw no improvement.

Frankly speaking, some examples in the help page of how and when to use these settings would be appreciated.

Thanks

Giuseppe

What is your rclone version (output from rclone version)

rclone v1.51.0

  • os/arch: linux/amd64
  • go version: go1.13.7

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Red Hat Enterprise Linux Server release 7.7 (Maipo)

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone mount --daemon -vvv remote: /path/gdriveGA

The rclone config contents with secrets removed.

type = drive
scope = drive

A log from the command with the -vv flag

rclone mount --daemon  -vv  remote: /gpfs/loomis/project/sbsc/hydro/dataproces/gdriveGA
DEBUG : rclone: Version "v1.45" starting with parameters ["rclone" "mount" "--daemon" "-vv" "remote:" "/gpfs/loomis/project/sbsc/hydro/dataproces/gdriveGA"]
DEBUG : Using config file from "/home/ga254/.config/rclone/rclone.conf"
DEBUG : rclone: Version "v1.45" finishing with parameters ["rclone" "mount" "--daemon" "-vv" "remote:" "/gpfs/loomis/project/sbsc/hydro/dataproces/gdriveGA"]

hello and welcome to the forum

Nope, as it's got nothing to do with rclone but with FUSE: it's a user file system and only the user running it can see the mount. allow_other is required; it's a server-level setting that has to be turned on, and it allows everyone to see that user's FUSE mount.

You'd have to turn it on to make a user other than the one running the FUSE process see what's being mounted.
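For reference, enabling that for a non-root user is normally a one-line change that only an administrator can make. A minimal sketch, assuming the admins are willing to enable it on the node where the mount runs:

# /etc/fuse.conf (root only): uncomment or add this line
user_allow_other

# then mount with the flag so other users on that machine can see the mount
rclone mount --daemon --allow-other remote: /path/gdriveGA

Note that a FUSE mount only exists on the machine where the rclone process runs, so allow_other changes visibility for other users on that same machine.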

Thanks!
at the moment I'm using rclone v1.51.0, which is the version available on the HPC cluster.
The --vfs-cache-mode=full definitely improves the speed, but the mounted folder is still not visible from the different nodes.
To solve this issue I did a work-around by creating (and mounting) a unique folder for each job. Unfortunately, this solution is not working either. The folders are mounted correctly with no issues, but the moment I try to read the files (simultaneously from the different nodes) I get some errors that are specific to the software I use (gdal.org, see below). The errors disappear if I launch one job at a time, so there is some issue with the simultaneous reading.
I thought this was due to some overload of the cache, so I tried to set up the mount in this way:

rclone mount --daemon -vvv --vfs-cache-mode=full --vfs-cache-max-size 12G --vfs-read-chunk-size 200M --vfs-cache-max-age 0h1m00s remote: /path/gdriveGA

but no improvement.

Please consider that in GDAL I can control the cache size via GDAL_CACHEMAX. So I think I should probably set it in accordance with the --vfs-cache-* flags, as a function of how many jobs I'm running.
Could it be that I also have to set a unique cache folder for each job?
Does anyone have experience with this multi-core procedure?
Any idea will be appreciated.
Thank you for your support
Best
Giuseppe

ERROR 1: ZIPDecode:Decoding error at scanline 8448, unknown compression method
ERROR 1: TIFFReadEncodedTile() failed.
ERROR 1: /project/fas/sbsc/hydro/dataproces/GDRIVEGA/GDRIVEGA10/TERRA/ppt_acc/1958/tiles20d/ppt_1958_06_h32v12_acc.tif, band 1: IReadBlock failed at X offset 33, Y offset 6: TIFFReadEncodedTile() failed.
ERROR 1: ZIPDecode:Decoding error at scanline 9216, unknown compression method
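If a unique cache folder per job turns out to be needed, rclone's --cache-dir flag can point each mount at its own directory. A minimal sketch of how that might look inside a SLURM batch script; the mount point, cache location and remote name here are placeholders, not the real paths:

#!/bin/bash
# hypothetical per-job locations, keyed on the SLURM job ID
MNT=/path/gdriveGA_${SLURM_JOB_ID}
CACHE=/tmp/rclone-cache-${SLURM_JOB_ID}
mkdir -p "$MNT" "$CACHE"

# each job gets its own mount point and its own VFS cache directory
rclone mount --daemon -vv --vfs-cache-mode full --vfs-cache-max-size 12G --cache-dir "$CACHE" remote: "$MNT"
sleep 5   # give the daemon a moment to finish mounting

# ... run this job's GDAL commands against files under "$MNT" ...

fusermount -u "$MNT"   # unmount when the job is done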

I think you may be hitting bugs we've fixed in the latest release so I would recommend v1.53.3 if you can, especially if you are using --vfs-cache-mode full.

Hi,
thanks for the fast replies,
Now I have installed rclone 1.53 and I'm using it with the following line:

rclone mount --daemon -vvv --vfs-cache-mode full   remote: /path/gdriveGA

What I noticed is that when I run a command that uses files under the rclone mount point, all the files get downloaded under my home directory (/home/ga254/.cache/rclone/vfs/remote/).
In my case this behaviour is not ideal, because under the mount point (/path/gdriveGA) I have several files with an overall size of 1 TB.
The command that I'm trying to run is gdallocationinfo, which is part of gdal.org.
gdallocationinfo retrieves the pixel value (e.g. 25) of a GeoTIFF at a specific latitude (e.g. 20 degrees) and longitude (e.g. 10 degrees).
gdallocationinfo -geoloc /path/gdriveGA/input.tif 10 20
Pixel value = 25
So there is no need to pre-download the full input.tif to retrieve the value of one pixel.

Is there a way to avoid downloading the full file and retrieve only the output of the command (in my case gdallocationinfo)?

And also, can an rclone mount be used in such a way that the multi-core operation (done by the different jobs) does not slow down the computation?
Thank you,
Giuseppe
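One thing that might be worth trying (a sketch, not something confirmed in this thread): with --vfs-cache-mode off (the default) or minimal, rclone streams reads from the remote in chunks instead of downloading whole files into the cache first, which may be enough for a read-only, single-pixel lookup:

rclone mount --daemon -vv --vfs-cache-mode off --vfs-read-chunk-size 32M remote: /path/gdriveGA
gdallocationinfo -geoloc /path/gdriveGA/input.tif 10 20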

hi,

--vfs-cache-mode full on v1.53.x uses a sparse file on file systems that support that.

does the file system that hosts the vfs cache support sparse files?

  • if true, rclone will download only the part of the file that is requested.
  • if false, rclone will download the entire file before you can access the part that was requested.
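If you are not sure, one quick way to check is to create a large empty file with truncate and compare its apparent size with the space it actually occupies on disk. A minimal sketch; the test file name is made up, and the path should point at whichever file system hosts your rclone cache:

truncate -s 1G /home/ga254/.cache/sparse-test   # create a 1 GiB file without writing any data
ls -lh /home/ga254/.cache/sparse-test           # apparent size: about 1 GiB
du -h /home/ga254/.cache/sparse-test            # allocated size: near 0 if sparse files are supported
rm /home/ga254/.cache/sparse-test               # clean up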
