Rclone and Yandex

Continuing the discussion from Rclone and Yandex.Disk (2022-09 summary):

@Korwin, can we continue this discussion? This seems to be a real problem when using the service and I'm just wondering if there's anything that you managed to figure out or do about it?

Thanks.

If, since the closure of my topic, no other topics have appeared on the forum about problems with rclone working with cloud storage from Yandex, I have no objections to continuing the discussion in my topic. However, I do not have the authority to open a topic to post new comments.

No, we could neither find out the reasons for the problems with Yandex nor get help from their support service. Its competence and its willingness to meet the needs of paying clients have been questioned many times. We did not get the impression that you can do business with them and be confident in their reliability.

In addition, I am sure that Yandex has not undergone any positive changes over the past four years, but has become worse in many ways. Now that they and mail.ru are state monopolists of the Russian market, they have no reason to make efforts to improve their services. And their ability to do so has diminished significantly since their most qualified specialists left the country.

@Korwin, that makes sense. Thanks a lot for your reply. I'm just looking for a solution that can handle a few TBs of data tbh. Although not great, at least this is one of the options that can be considered, unlike others that have hard-capped their storage limits.

Do you happen to know of, or could you recommend, something that you're using at the moment after the whole Google Workspace issue?

Thanks

@Korwin, it's all good. Thank you for your reply and you trying to help. The situation is crazy but I'm sure there's a way. There's always a way. Maybe not rn, but eventually there is. We'll keep looking :slight_smile:

We were forced to technologically roll back 25 years to the days of shared folders on the local network and VPNs. I would not recommend this solution to anyone now, in the 21st century. There is a feeling that after the sanctions and boycotts came into force, we found ourselves in the Stone Age. Actually, this is the goal that the civilized world has achieved in relation to my country. It’s hard to avoid mentioning politics in a situation where, because of it, you are forced to live as if you were back in the 90s. Sorry I couldn't help you.

Hello,
I'd like to investigate the situation with Yandex Disk more deeply.
I ran an experiment with rclone 1.66.0 and a remote using my own OAuth token/secret on a paid Yandex account.
I also used the --user-agent flag to pretend to be the Chrome browser.
Then I tried to upload a 1.5 GB folder with a dozen video files to Yandex Disk in three ways: with rclone, through the regular web interface at disk.yandex.ru, and with manual curl.exe calls to the REST API: https://yandex.ru/dev/disk-api/doc/en/

The rclone result is always bad: uploading takes more than 1 hour, which is disappointing.
Uploads through the web interface and with curl are always fast and use the whole bandwidth; they took about 5 minutes.
Okay, Yandex Disk slows down rclone intentionally, we know that.

My question is simple. If I understand correctly, rclone uses the same REST API calls as I made manually with curl.exe. Then how does Yandex Disk recognize rclone and force the upload speed down?

My first guess was the User-Agent, but I changed it to "Mozilla/5.0 (Windows NT 10.0; Win64; x64)....." with the appropriate rclone flag and the speed is still slow.

How can I trick Yandex Disk and convince it I don't use rclone?

For clarity, this is why I want Yandex Disk so much. Our company releases some materials for clients, but corporate policies allow some of those clients to download only from Yandex Disk (not from our Nextcloud storage, not by FTP, etc.). Also, Yandex Disk has a nice in-browser preview for most file types and can share a whole folder by public link, not just a single file, which is unavailable via Yandex Cloud S3. So rclone + Yandex Disk is very comfortable for us... except that the upload speed is unacceptably slow.

welcome to the forum,

one way to do that: copy a single file using --dump=headers -vv
that will show all the api requests and responses with timestamps.

it might be that rclone is trying to upload too much data too fast: throttling, multi-thread uploads, chunksize, too many transfers, too many api calls too fast.
the rclone debug log might provide more details.
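To make the timestamps in such a log easier to read, a small script can list the longest pauses between consecutive log lines. This is only a sketch: it assumes each log entry starts with a `YYYY/MM/DD HH:MM:SS` prefix, and the sample lines below are made up to show the input shape, not taken from any real log.

```python
import re
from datetime import datetime

# Assumption: log entries begin like "2024/06/07 18:40:01 DEBUG : ...".
STAMP = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s")


def timestamps(lines):
    """Yield a datetime for every line that starts with a timestamp."""
    for line in lines:
        match = STAMP.match(line)
        if match:
            yield datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")


def largest_gaps(lines, top=3):
    """Return the `top` longest pauses as (duration, time the pause began)."""
    stamps = list(timestamps(lines))
    gaps = [(later - earlier, earlier) for earlier, later in zip(stamps, stamps[1:])]
    return sorted(gaps, reverse=True)[:top]


# Made-up sample lines; header lines without a timestamp are skipped.
sample = [
    "2024/06/07 18:40:01 DEBUG : HTTP REQUEST (hypothetical)",
    "Authorization: XXXX",
    "2024/06/07 18:40:02 DEBUG : HTTP RESPONSE (hypothetical)",
    "2024/06/07 18:55:02 DEBUG : HTTP REQUEST (hypothetical)",
]

for gap, began in largest_gaps(sample):
    print(f"{gap} of silence after {began}")
```

For a real log, pass `open("rclone_log.txt", encoding="utf-8", errors="replace")` instead of `sample`; a long pause right after a PUT request would point at the upload itself rather than at the API calls around it.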

You are right, logs are the key to the solution.
So, I prepared a 1.5 GB file made of random bytes (to avoid interference from Yandex caching/deduplication).
I started uploading it with the command:

rclone.exe --user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36" --dump=headers -vv copy c:\distrib\rclone-v1.66.0-windows-amd64\random_file_1500MB.mp4 yd_nastroim:share/test/ > rclone_log.txt 2>&1

I attach the file rclone_log.txt with all the output here.
rclone_log.txt (71.4 KB)

It took a very long time before I noticed the upload had actually finished, but then I got a "timeout awaiting response headers" error and the procedure started again, so I interrupted it. This error is a known one, so it is beside the point here.
Still, the log says the initial upload lasted from 18:40 to 22:05, about 3.5 hours! The upload speed jumped from low to very low and back, even though I have a stable 100 Mbit wired internet connection.
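To put those numbers in perspective, the average rate can be worked out directly. The sketch below assumes the file is 1500 MB (taken from its name) and ignores protocol overhead; it comes out at roughly 1 Mbit/s, about 1% of a 100 Mbit link.

```python
from datetime import datetime


def average_mbit_per_s(size_mb, start, end, fmt="%H:%M"):
    """Average rate in Mbit/s for size_mb megabytes sent between two clock times."""
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return size_mb * 8 / elapsed.total_seconds()


def ideal_seconds(size_mb, link_mbit_per_s):
    """Transfer time if the link were fully used (no protocol overhead)."""
    return size_mb * 8 / link_mbit_per_s


# Assumption: 1.5 GB is treated as 1500 MB, as in the file name.
rate = average_mbit_per_s(1500, "18:40", "22:05")
print(f"observed average: {rate:.2f} Mbit/s")
print(f"ideal on a 100 Mbit/s link: {ideal_seconds(1500, 100):.0f} s")
```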

Can anybody look at the log and suggest what might be wrong? I couldn't see anything bad, except that rclone uses the hostname cloud-api.yandex.com while the Yandex manuals say it should be cloud-api.yandex.net. But I don't think that can affect the speed. Everything else looks OK to me.

It turns out to be interesting. I ran several tests with various files and sizes, and it seems *.mp4 files are always uploaded very slowly. If the file extension is not "mp4" (I used *.somefile), the file uploads very fast at the full available speed (~10 Mbytes/sec).

I tried changing the HTTP Content-Type header to "application/octet-stream" (test #5): no difference, the *.mp4 file was still uploaded at low speed.

All files except #6 were created from random bytes. In test #6, "spanish_kazakh(1).somefile" is an actual MP4 video with the extension changed manually.
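For anyone who wants to reproduce the random-bytes test files, here is one possible way to create them. It is a sketch, not the method used for the files in this thread: the file names are hypothetical, and the demo writes only about 3 MB into a temporary directory.

```python
import os
import tempfile


def write_random_file(path, size_bytes, chunk=1024 * 1024):
    """Write size_bytes of OS-provided random data to path, one chunk at a time."""
    remaining = size_bytes
    with open(path, "wb") as handle:
        while remaining > 0:
            block = os.urandom(min(chunk, remaining))
            handle.write(block)
            remaining -= len(block)
    return path


# Two files of the same size, each with fresh random content, differing only
# in extension. Names and the small demo size are hypothetical.
demo_dir = tempfile.mkdtemp()
demo_size = 3 * 1024 * 1024 + 17
for name in ("random_demo.mp4", "random_demo.somefile"):
    target = write_random_file(os.path.join(demo_dir, name), demo_size)
    print(target, os.path.getsize(target))
```

For a full-size test, call `write_random_file` with `100 * 1024 * 1024` or more. Giving each file its own random content keeps server-side deduplication from skewing the comparison.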

Here are the tests:

1 - file extension = mp4, large file - slow
rclone.exe --user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36" --dump=headers -vv copy c:\distrib\rclone-v1.66.0-windows-amd64\random_file_1500MB.mp4 yd_nastroim:share/test/ > rclone_log.txt 2>&1

2 - file extension = somefile - fast
rclone.exe --user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36" --dump=headers -vv --timeout 60m copy c:\distrib\rclone-v1.66.0-windows-amd64\random_file_100MB_1.somefile yd_nastroim:share/test/ > rclone_log_2.txt 2>&1

3 - file extension = somefile, large file - fast
rclone.exe --user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36" --dump=headers -vv --timeout 60m copy c:\distrib\rclone-v1.66.0-windows-amd64\random_file_1000MB_3.somefile yd_nastroim:share/test/ > rclone_log_3.txt 2>&1

4 - file extension = mp4 - slow
rclone.exe --user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36" --dump=headers -vv --timeout 60m copy c:\distrib\rclone-v1.66.0-windows-amd64\random_file_100MB_4.mp4 yd_nastroim:share/test/ > rclone_log_4.txt 2>&1

5 - extension = mp4, header set to "Content-Type: application/octet-stream" - slow
rclone.exe --user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36" --dump=headers -vv --timeout 60m --header "Content-Type: application/octet-stream" copy c:\distrib\rclone-v1.66.0-windows-amd64\random_file_100MB_5.mp4 yd_nastroim:share/test/ > rclone_log_5.txt 2>&1

6 - actual video file renamed to *.somefile - slow
rclone.exe --user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36" --dump=headers -vv --timeout 60m copy c:\Users\btaran\Downloads\spanish_kazakh(1).somefile yd_nastroim:share/test/ > rclone_log_6.txt 2>&1

Any ideas how the extension can affect upload speed in such a strange way? Or at least, which side is responsible: rclone or Yandex?

I attach all the log files in case anybody wants to dig in. I interrupted some of the tests once I saw the speed was definitely low.

rclone_log.txt (71.4 KB)
rclone_log_2.txt (13.1 KB)
rclone_log_3.txt (13.6 KB)
rclone_log_4.txt (16.2 KB)
rclone_log_5.txt (9.0 KB)
rclone_log_6.txt (13.1 KB)

try uploading another video file.
perhaps yandex is analyzing the media, to enable previews or look for illegal/pirated content?

might check the timestamps from the dump log.

*.mov (Quicktime video) acts the same as *.mp4.

To me, it seems like quite a bad idea to analyze a video while it is still uploading... To be honest, I would be surprised if Yandex really did that.

I did the same with pure curl.exe and REST API calls, with the same results.

The MP4 file is uploaded very slowly (more than 10 minutes, then I interrupted it):

set BASE_URL=https://cloud-api.yandex.net/v1
set TOKEN=***secret***

rem retrieving URL for uploading
curl.exe --get -X GET --header "Accept: application/json" --header "Content-Type: application/json" --header "Authorization: OAuth %TOKEN%" --data path=share/test/file.mp4 %BASE_URL%/disk/resources/upload

rem upload with URL retrieved earlier
curl.exe -X PUT --header "Accept: application/json" --header "Content-Type: video/mp4" --data-binary "@c:\distrib\rclone-v1.66.0-windows-amd64\random_file_100MB_10.mp4" https://uploader72j.disk.yandex.net:443/upload-target/20240608T001540.529.utd.1rc9f35hlcki17q9jb99q4v1-k72j.32997107

*.somefile was uploaded very fast with full bandwidth used:

rem retrieving URL for uploading
curl.exe -v --get -X GET --header "Accept: application/json" --header "Content-Type: application/json" --header "Authorization: OAuth %TOKEN%" --data path=share/test/random_file_100MB_11.somefile %BASE_URL%/disk/resources/upload

rem upload with URL retrieved earlier
curl.exe -v -X PUT --header "Accept: application/json" --header "Content-Type: application/octet-stream" --data-binary "@c:\distrib\rclone-v1.66.0-windows-amd64\random_file_100MB_11.somefile" https://uploader65j.disk.yandex.net:443/upload-target/20240608T002948.950.utd.1gech5bqt1h8bkysfp0e6uybo-k65j.35659260
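The same two-step upload can be scripted so that both file types are timed in exactly the same way. The sketch below mirrors the curl calls using only the Python standard library. It assumes the JSON reply to the first call carries the upload URL in an "href" field, and the function names are made up for this example. Only the first request is built in the demo line, so nothing is sent over the network.

```python
import json
import os
import time
import urllib.parse
import urllib.request

BASE_URL = "https://cloud-api.yandex.net/v1"  # same base URL as in the curl calls


def upload_url_request(token, remote_path):
    """Build the GET request that asks the API for an upload URL."""
    query = urllib.parse.urlencode({"path": remote_path})
    return urllib.request.Request(
        f"{BASE_URL}/disk/resources/upload?{query}",
        headers={"Accept": "application/json", "Authorization": f"OAuth {token}"},
        method="GET",
    )


def put_request(href, body, size, content_type):
    """Build the PUT request that sends the file body to the returned URL."""
    headers = {"Content-Type": content_type, "Content-Length": str(size)}
    return urllib.request.Request(href, data=body, headers=headers, method="PUT")


def timed_upload(token, local_path, remote_path, content_type):
    """Run both steps; return (HTTP status, seconds spent in the PUT)."""
    with urllib.request.urlopen(upload_url_request(token, remote_path)) as resp:
        href = json.load(resp)["href"]  # assumption: the URL is in the "href" field
    size = os.path.getsize(local_path)
    with open(local_path, "rb") as body:
        started = time.monotonic()
        with urllib.request.urlopen(put_request(href, body, size, content_type)) as resp:
            status = resp.status
    return status, time.monotonic() - started


# Building the first request needs no network access:
print(upload_url_request("TOKEN", "share/test/file.mp4").full_url)
```

Calling `timed_upload` once for a *.mp4 file and once for a *.somefile of the same size gives two directly comparable durations.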

So, it seems rclone is not involved in this behavior. I will try to show these results to Yandex support (not mentioning rclone, I suppose) and report here.

An interesting thing: I thought I could trick Yandex by renaming *.mp4 to something else, uploading at high speed, and renaming the uploaded file back to *.mp4. No: it seems the file content is analyzed, and a file "filename.somefile" with actual video inside is still uploaded slowly.
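The observations so far (random bytes named *.mp4 are slow, real video named *.somefile is slow, random bytes named *.somefile are fast) fit a rule that flags a file either by its extension or by its leading bytes. The sketch below is purely a hypothetical model of that rule, not Yandex's actual logic. It relies on the fact that MP4-family files usually carry the marker "ftyp" at byte offset 4, although not every video container starts this way.

```python
import os

VIDEO_EXTENSIONS = {".mp4", ".mov"}  # the two extensions reported slow in this thread


def has_ftyp_box(head):
    """True if the first bytes look like an ISO base media file (MP4 family)."""
    return len(head) >= 8 and head[4:8] == b"ftyp"


def looks_like_video(filename, head):
    """Hypothetical rule: flag by extension OR by the leading bytes."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in VIDEO_EXTENSIONS or has_ftyp_box(head)


mp4_head = b"\x00\x00\x00\x18ftypmp42"  # typical first bytes of an MP4 file
random_head = bytes(range(16))          # stand-in for random content

print(looks_like_video("random_file.mp4", random_head))       # True (extension)
print(looks_like_video("video.somefile", mp4_head))           # True (content)
print(looks_like_video("random_file.somefile", random_head))  # False
```

If this model is right, a file would have to differ in both name and leading bytes to avoid being flagged, which matches the failed rename trick.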