Yandex.Cloud Object Storage, first impression


#1

In November 2018, the Yandex team started test operation of their own Yandex.Cloud solution. I had a chance to take part in the Yandex Object Storage tests to check the general compatibility of its S3-like API with RClone v1.45.

No errors were found, and the RClone - Yandex.Cloud pair works just fine.

Below are a couple of observations I’d like to share with the community.

1. Storage Classes

There are two storage classes in Yandex Object Storage, cold and standard (ref Concepts - Storage class). My choice for cloud backups is cold storage because it’s less expensive than standard.

Yandex cold storage class corresponds to Storage Classes for Infrequently Accessed Objects in Amazon Simple Storage Service.

RClone also has an option to define the S3 storage class. However, I noticed that the RClone config sequence doesn’t support entering or editing the storage class for remotes of the s3 - other type.

To set the cold storage class for all objects in the bucket (remote), you can manually

  • put the following line in the appropriate section of rclone.conf
    storage_class = STANDARD_IA
  • or add this command line option to each RClone call
    --s3-storage-class STANDARD_IA
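
For illustration, here is a minimal sketch of what such an rclone.conf section might look like, written out by a small shell helper so it can be reviewed in a scratch file first. The remote name yandex-cold and the function name write_example_remote are hypothetical, the <...> values are placeholders, and the endpoint is simply the URL used with the AWS CLI later in this post.

```shell
#!/usr/bin/env bash
# Sketch: append an example remote section to a config file.
# The remote name "yandex-cold" and this function name are hypothetical;
# replace the <...> placeholders with real credentials before use.
write_example_remote() {
    local conf="$1"
    # Quoted 'EOF' keeps the text literal (no variable expansion).
    cat >> "$conf" <<'EOF'
[yandex-cold]
type = s3
access_key_id = <key id>
secret_access_key = <secret key>
endpoint = https://storage.yandexcloud.net
storage_class = STANDARD_IA
EOF
}
```

Calling write_example_remote /tmp/rclone-example.conf writes the section to a scratch file, which can then be tried with something like rclone --config /tmp/rclone-example.conf lsd yandex-cold: before merging it into the real rclone.conf.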

2. Multipart Uploads

Yandex Object Storage supports S3 multipart uploads. While testing uploads to the Storage, I could see this clearly when the connection broke suddenly: hundreds of pending chunks were left waiting for the upload to be completed or cancelled.

I didn’t find an RClone command to handle incomplete multipart uploads that are stuck due to a connection failure. A similar case was reported by @peixotorms in his post Multipart Uploads to Digital Ocean Spaces.

To clear the upload queue manually

  • install the AWS CLI (ref MS Windows example),
  • issue the list command (ref list-multipart-uploads)
    aws s3api list-multipart-uploads --endpoint-url=https://storage.yandexcloud.net --bucket <name>
  • save the list of pending uploads
    { "Uploads": [ { "UploadId": "<number>", "Key": "<path>", "Initiated": "<time>", "StorageClass": "<type>" }, { ... } ] }
  • issue the abort command (ref abort-multipart-upload)
    aws s3api abort-multipart-upload --endpoint-url=https://storage.yandexcloud.net --bucket <name> --upload-id <number> --key "<path>"
  • repeat the abort command for all pending uploads in the list

Note: the --endpoint-url option is mandatory for Yandex Object Storage.
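
With hundreds of pending uploads, repeating the abort command by hand gets tedious. The manual steps could be scripted roughly as follows. This is only a sketch under some assumptions: it relies on the AWS CLI's global --query and --output text options to print one Key/UploadId pair per line, the function name abort_pending_uploads is hypothetical, and keys containing tab or newline characters are not handled.

```shell
#!/usr/bin/env bash
# Sketch: abort every pending multipart upload in one bucket.
# ENDPOINT is the URL used in this post; the function name is hypothetical.
ENDPOINT="https://storage.yandexcloud.net"

abort_pending_uploads() {
    local bucket="$1" key upload_id
    # Print one "Key<TAB>UploadId" line per pending upload.
    aws s3api list-multipart-uploads \
        --endpoint-url "$ENDPOINT" --bucket "$bucket" \
        --query 'Uploads[].[Key,UploadId]' --output text |
    while IFS=$'\t' read -r key upload_id; do
        # With no pending uploads the CLI prints "None" with no second
        # field, so lines without an upload id are skipped.
        [ -n "$upload_id" ] || continue
        # stdin is redirected so the inner call cannot consume the list.
        aws s3api abort-multipart-upload \
            --endpoint-url "$ENDPOINT" --bucket "$bucket" \
            --key "$key" --upload-id "$upload_id" < /dev/null
    done
}
```

Usage would be abort_pending_uploads <name>, with the same bucket name as in the commands above. Running the list command on its own first is a good way to check what is about to be aborted.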

I do not think the above observations should be treated as RClone feature requests. It’s more a matter of usability when working with a particular cloud service provider.


#2

Yipee :smile:

Did you try running the s3 integration tests against it?

go install github.com/ncw/rclone/fstest/test_all
test_all -remotes YourRemoteName:

Rclone has configuration for different providers - do you want to add a Yandex provider? This would let rclone ask those questions in the config and give it the opportunity to have default endpoints etc. See the config here.

What do you think rclone should do here?

We’ve been thinking that rclone should cancel outstanding multipart upload requests on quit - would that help?


#3

No, I have neither golang nor github tooling installed and would have to configure everything from scratch.
I can create a temporary test bucket in Yandex.Cloud and make a dedicated service account for you.

I don’t think that’s necessary; only three options have to be defined to get access to a Yandex.Cloud bucket

  • access_key_id
  • secret_access_key
  • endpoint

I see here at least four use cases.

Current task to run

  • old (previous) task
  • new task, but same remote

User’s expectation

  • continue upload
  • abort pending uploads

Thus, use cases and proposed rclone actions are seen as follows

  1. old task restart, complete old pending - by default,
  2. old task restart, abort old pending - alternative to item 1,
  3. new task start, complete old pending - invalid case,
  4. new task start, abort old pending - by default.

I’m not sure it makes sense to complicate rclone by having it store the state and its context.


#4

That would be interesting - can you private message me the details? Thank you

If we created a Yandex provider, then we could

  • have some default endpoints
  • allow it to show the storage classes option

#5

Here are the results of my testing…

  • Yandex.Cloud works well with the rclone s3 backend
  • Currently server side Copy is not implemented at Yandex. This means
    • can’t set modification times on existing objects
    • can’t do server side copy
  • There are some disallowed characters in file names : * ? " < > | ! (unlike Amazon S3), see the docs for details
  • The configurator doesn’t currently prompt for a storage class, though you may put one in yourself: storage_class = STANDARD_IA