Wasabi Upload Many Small Files Many Directories

Thank you; despite incomplete information, IMO, you're providing excellent advice. :grinning:
Allow me to start with use case context:

  • vFlyer is a Java web application providing web content authoring services similar to SquareSpace, Wix, WordPress, etc. Our service includes domain resale, which makes custom domain web sites possible: https://www.1240westmain.com/
  • Customers compose complex web pages; the work products are HTML, style sheet, and image files, i.e. "Resources".
  • Resources are served with Nginx which also reverse proxies Apache Tomcat hosting our Java web applications.
  • Resources are NFS shared by a dedicated Red Hat Enterprise Linux 5 server, stored on three RAID5 SCSI array volumes, ext3 formatted, and mounted as:
    • /resources/r1 - 1,833 GiB volume, 1,263 GiB used
    • /resources/r2 - 1,833 GiB volume, 1,664 GiB used
    • /resources/r3 - 1,833 GiB volume, 1,347 GiB used
    • 4,274 GiB used in total in the Production environment, the last of our three environments:
      1. Windows Developer workstation - no change, Resources remain local
      2. Linux Staging integration - Resources copy to Wasabi currently in progress
      3. Linux Production operation - Resources copy to Wasabi planning in progress
  • Production Resources are proxied by BunnyCDN now and in the future.
  • We will rely on rclone mount, per forum topic 19903 (21-Oct-2020, "Linux NFS Server with Rclone and local disk for cache"), to provision a POSIX compliant NFS share, taking great care to mount the remote bucket just once, from a single NFS rclone host.
  • The RAID5 array drives are old and slow, which makes traversing the large directory and file population painful, so we plan to traverse the old volumes just once.
  • The Production Resources move is planned as a "hot" move: Production application writes continue while the move is in progress, using an OverlayFS twist.
  • To make a single traversal of each old volume possible, we added an Ubuntu 20.04 LTS virtual machine that NFS shares an OverlayFS with a 1 TiB XFS upper layer and the old volume as the lower layer, with one OverlayFS mount per old volume.
  • OverlayFS writes to the upper layer only, making the lower layer effectively read only and safe to traverse just once.
  • Later catch-up copy operations will read from the upper layer only, a much smaller data set, as writes are relatively infrequent compared to reads.
  • BunnyCDN caching reduces the rate of read events at our server.
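
To make the catch-up idea concrete, here is a minimal Python sketch of how an OverlayFS upper layer could be scanned for files that need a later copy. It is an illustration only, not something we run today: the function name `scan_upper_layer` is hypothetical. It relies on the kernel-documented convention that OverlayFS records a deletion as a "whiteout", a character device with device number 0/0, so those entries are reported separately rather than copied.

```python
import os
import stat


def scan_upper_layer(upper_dir):
    """Walk an OverlayFS upper layer once.

    Returns (files_to_copy, whiteouts), both sorted lists of paths
    relative to upper_dir. Hypothetical helper for illustration.
    """
    files_to_copy, whiteouts = [], []
    for dirpath, _dirnames, filenames in os.walk(upper_dir):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, upper_dir)
            st = os.lstat(full)
            if stat.S_ISREG(st.st_mode):
                # New or modified since the overlay was mounted.
                files_to_copy.append(rel)
            elif stat.S_ISCHR(st.st_mode) and st.st_rdev == 0:
                # OverlayFS whiteout: the file was deleted from the merged view.
                whiteouts.append(rel)
    return sorted(files_to_copy), sorted(whiteouts)


if __name__ == "__main__":
    import sys

    upper = sys.argv[1] if len(sys.argv) > 1 else "."
    changed, deleted = scan_upper_layer(upper)
    print(f"{len(changed)} files to copy, {len(deleted)} whiteouts")
```

Because only the upper layer is walked, the old RAID5 volumes are never touched again by the catch-up pass.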

Some Production Resources date back to our 2006 service launch; we are considering options to order the copy by modification time for some directories, whereas other directories must be copied completely irrespective of age.
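
One way to picture the modification-time ordering is the Python sketch below. It is only an assumption about how we might do it: `files_by_mtime` and `split_by_age` are hypothetical helpers, and the cutoff is an arbitrary example value. The resulting lists could be written to a file and handed to rclone (for example through its `--files-from` option), but that wiring is left out here and would need checking against the rclone documentation.

```python
import os


def files_by_mtime(root, newest_first=True):
    """Return [(mtime, relative_path), ...] for every file under root."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(full).st_mtime
            except OSError:
                continue  # file disappeared while we were walking
            entries.append((mtime, os.path.relpath(full, root)))
    entries.sort(reverse=newest_first)
    return entries


def split_by_age(entries, cutoff_epoch):
    """Split entries into (recent, older) path lists around cutoff_epoch."""
    recent = [path for mtime, path in entries if mtime >= cutoff_epoch]
    older = [path for mtime, path in entries if mtime < cutoff_epoch]
    return recent, older


if __name__ == "__main__":
    import sys
    import time

    root = sys.argv[1] if len(sys.argv) > 1 else "."
    one_year_ago = time.time() - 365 * 24 * 3600  # example cutoff only
    recent, older = split_by_age(files_by_mtime(root), one_year_ago)
    print(f"{len(recent)} recent files first, {len(older)} older files later")
```

Directories that must be complete irrespective of age would simply skip the split and copy everything.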

Our next steps are to deploy the Production Resources OverlayFS mounts, then begin limited scope Production Resources copy tests from the "read only" legacy volumes to Wasabi.
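
For reference, the one-mount-per-volume layout can be sketched as below. This Python snippet only builds the `mount -t overlay` command lines and does not execute them. Every path other than the three `/resources/rN` names is a made-up placeholder (`/legacy`, `/overlay`, `/export`), not our real layout. OverlayFS requires `workdir` to live on the same filesystem as `upperdir`, which the placeholders reflect.

```python
def overlay_mount_command(lower, upper, work, merged):
    """Build the argv for mounting one OverlayFS (not executed here)."""
    options = f"lowerdir={lower},upperdir={upper},workdir={work}"
    return ["mount", "-t", "overlay", "overlay", "-o", options, merged]


def commands_for_volumes(volumes):
    """One OverlayFS mount per legacy volume; all paths are placeholders."""
    commands = []
    for vol in volumes:
        commands.append(
            overlay_mount_command(
                lower=f"/legacy/{vol}",          # NFS mount of the old volume
                upper=f"/overlay/{vol}/upper",   # on the XFS upper layer
                work=f"/overlay/{vol}/work",     # same filesystem as upper
                merged=f"/export/{vol}",         # what gets NFS shared
            )
        )
    return commands


if __name__ == "__main__":
    for argv in commands_for_volumes(["r1", "r2", "r3"]):
        print(" ".join(argv))
```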

IMO all your suggestions are worthy of consideration. We have these options:

  • continue current Staging Resources rclone copy or interrupt as seems prudent
  • start Production Resources rclone copy tests immediately after OverlayFS deployment

I am working toward having Production Resources OverlayFS mounts completed tonight.

Success looks like fully utilizing the 1G network link between our legacy RHEL5 server, which NFS shares the RAID5 arrays, and the Ubuntu server, which NFS shares the OverlayFS mounts, while rclone copy operations are running.
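
As a back-of-the-envelope check on that goal, the Python sketch below estimates the best-case time to read the used space listed above, assuming "1G" means 1 Gbit/s and ignoring protocol overhead. The `efficiency` parameter is a hypothetical knob for the fraction of the link actually achieved. With many small files on old drives, real throughput is likely to be well below line rate.

```python
GIB = 2**30  # bytes per GiB

# Used space per legacy volume, taken from the figures above.
USED_GIB = {
    "/resources/r1": 1263,
    "/resources/r2": 1664,
    "/resources/r3": 1347,
}


def transfer_hours(gib, link_bits_per_s=1e9, efficiency=1.0):
    """Hours to move `gib` GiB over a link at the given utilization."""
    bits = gib * GIB * 8
    return bits / (link_bits_per_s * efficiency) / 3600


if __name__ == "__main__":
    total = sum(USED_GIB.values())
    print(f"total used: {total} GiB")
    for eff in (1.0, 0.5, 0.25):
        print(f"at {eff:.0%} of 1 Gbit/s: {transfer_hours(total, efficiency=eff):.1f} h")
```

At full line rate the total works out to roughly ten hours, so anything much longer points at the disks or per-file overhead rather than the link.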