Print new line between files when using "rclone cat"?

What is the problem you are having with rclone?

I am trying to output several remote files from an S3 bucket to stdout. Each file contains a single line, and I'd like each file to be printed on its own line. However, rclone cat joins the files without any separator, which makes the output unusable/unreadable. Is there any way to accomplish this?

Run the command 'rclone version' and share the full output of the command.

❯ rclone version
rclone v1.62.2
- os/version: ubuntu 20.04 (64 bit)
- os/kernel: 5.15.90.1-microsoft-standard-WSL2 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.20.2
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Amazon S3

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone cat :s3,env_auth=true:bucket/path

The rclone config contents with secrets removed.

I don't have an rclone config. I just use the inline options, e.g. :s3,env_auth=true:bucket/path

I'm not aware of a way to do that. cat normally concatenates things together, so adding a new line would seemingly break it.

I'd imagine you could script the files out and cat them one by one.


Yeah, that's what I did while I was waiting for an answer. It's just monstrously slow compared to an rclone cat that includes everything.

rclone lsf ':s3,env_auth=true:bucket/path' --include '*/file.txt' -R --files-only | xargs -I{} bash -c 'echo "$(rclone cat ":s3,env_auth=true:bucket/path/{}" 2>/dev/null)"'

I've done it before with s3cmd, which outputs each file to a new line, but that project is seemingly abandoned, so I thought I'd try with rclone.

s3cmd --no-progress get s3://bucket/path/*/file.txt -

Using aws-cli works also, but similarly needs to loop over the results and so is very slow:

FILES=$(aws s3api list-objects --bucket bucket --prefix path --query 'Contents[?ends_with(Key, `file.txt`) == `true`].Key' --out text)
for file in $FILES; do aws s3 cp s3://bucket/$file - ; done

Note how both aws-cli and s3cmd support the common unix syntax of - as the target, meaning output to stdout instead of a file.

Perhaps you can pipe the result through sed/regex to insert the new lines?

Does your file content have an easy recognizable pattern in the beginning or end?


I would, but rclone cat doesn't offer any way to post-process each individual file. It gets the contents of one file and dumps it to stdout, then the next and the next, and so on. There is no option to inject any processing before or after dumping to stdout.

Oh, I see what you mean: let rclone cat dump it all to stdout, then process that, trying to separate each file. No, not really. Each file is minified JSON. It might just be a quoted string, or a full blob.


Perhaps something like this...

}{   -->   }<nl>{
}"   -->   }<nl>"
"{   -->   "<nl>{
""   -->   "<nl>"

... the last one probably needs some extra context matching to ignore empty strings

(if you are OK, that it isn't a 100% fail safe solution)


I see how that could possibly work, but :scream:! lol... I don't need to do this all that often at the moment, so the slowness of looping is good enough for now. Or I'll just use s3cmd, which is probably 100 times faster.

It would be nice if rclone might consider a feature that could handle some kind of processing for combining the files though, even if it is just injecting a separator character. That feels like it could be very powerful. Or just updating copy to use - as a target for stdout, similar to s3cmd and aws-cli and others, and just outputting each file to a new line.

Agree, we just need to find somebody to develop/contribute it or sponsor it - interested? :smile:


If it were python, I could tackle it pretty quickly. But golang is a stretch for me! I always worry about claiming something, and then blocking someone that could work it much faster from picking it up...

Adding a new flag for this should be pretty easy...

A patch like this

--- a/fs/operations/operations.go
+++ b/fs/operations/operations.go
@@ -1259,7 +1259,7 @@ type readCloser struct {
 //
 // if count < 0 then it will be ignored
 // if count >= 0 then only that many characters will be output
-func Cat(ctx context.Context, f fs.Fs, w io.Writer, offset, count int64) error {
+func Cat(ctx context.Context, f fs.Fs, w io.Writer, offset, count int64, sep []byte) error {
 	var mu sync.Mutex
 	ci := fs.GetConfig(ctx)
 	return ListFn(ctx, f, func(o fs.Object) {
@@ -1301,6 +1301,13 @@ func Cat(ctx context.Context, f fs.Fs, w io.Writer, offset, count int64) error {
 			err = fs.CountError(err)
 			fs.Errorf(o, "Failed to send to output: %v", err)
 		}
+		if len(sep) != 0 {
+			_, err = w.Write(sep)
+			if err != nil {
+				err = fs.CountError(err)
+				fs.Errorf(o, "Failed to send separator to output: %v", err)
+			}
+		}
 	})
 }

and then a flag to configure it in cmd/cat/cat.go


Ok, I might be able to play with that next week.

Alright, so I had some free time this afternoon. And I'm close, but I can't figure out how to get it to print the '\n' separator as a newline when passed in as an arg. It just prints it literally... I poked at it for an hour or two, but can't figure it out.

diff --git a/cmd/cat/cat.go b/cmd/cat/cat.go
index 03ed4e6e3..674f5b162 100644
--- a/cmd/cat/cat.go
+++ b/cmd/cat/cat.go
@@ -16,11 +16,12 @@ import (

 // Globals
 var (
-       head    = int64(0)
-       tail    = int64(0)
-       offset  = int64(0)
-       count   = int64(-1)
-       discard = false
+       head      = int64(0)
+       tail      = int64(0)
+       offset    = int64(0)
+       count     = int64(-1)
+       discard   = false
+       separator = string("")
 )

 func init() {
@@ -31,6 +32,7 @@ func init() {
        flags.Int64VarP(cmdFlags, &offset, "offset", "", offset, "Start printing at offset N (or from end if -ve)")
        flags.Int64VarP(cmdFlags, &count, "count", "", count, "Only print N characters")
        flags.BoolVarP(cmdFlags, &discard, "discard", "", discard, "Discard the output instead of printing")
+       flags.StringVarP(cmdFlags, &separator, "separator", "", separator, "Separator to use between objects when printing multiple files")
 }

 var commandDefinition = &cobra.Command{
@@ -82,7 +84,7 @@ Note that if offset is negative it will count from the end, so
                        w = io.Discard
                }
                cmd.Run(false, false, command, func() error {
-                       return operations.Cat(context.Background(), fsrc, w, offset, count)
+                       return operations.Cat(context.Background(), fsrc, w, offset, count, []byte(separator))
                })
        },
 }
diff --git a/fs/operations/operations.go b/fs/operations/operations.go
index 3f6fb6e73..78cf96b9a 100644
--- a/fs/operations/operations.go
+++ b/fs/operations/operations.go
@@ -1259,7 +1259,7 @@ type readCloser struct {
 //
 // if count < 0 then it will be ignored
 // if count >= 0 then only that many characters will be output
-func Cat(ctx context.Context, f fs.Fs, w io.Writer, offset, count int64) error {
+func Cat(ctx context.Context, f fs.Fs, w io.Writer, offset, count int64, sep []byte) error {
        var mu sync.Mutex
        ci := fs.GetConfig(ctx)
        return ListFn(ctx, f, func(o fs.Object) {
@@ -1301,6 +1301,14 @@ func Cat(ctx context.Context, f fs.Fs, w io.Writer, offset, count int64) error {
                        err = fs.CountError(err)
                        fs.Errorf(o, "Failed to send to output: %v", err)
                }
+               if len(sep) > 0 {
+                       sepReader := bytes.NewReader(sep)
+                       _, err = io.Copy(w, sepReader)
+                       if err != nil {
+                               err = fs.CountError(err)
+                               fs.Errorf(o, "Failed to send separator to output: %v", err)
+                       }
+               }
        })
 }

When using rclone cat ... --separator '\n', that code will output:

"file1"\n"file2"\n"file3"\n

If I change the default value of the separator to string("\n") and don't pass --separator at all, then it actually does the right thing:

"file1"
"file2"
"file3"

So I guess there's some trick to reading and parsing the arg? Any tips on that?

echo "try"$'\n'"this"  # in bash
echo "try`nthis"       # in powershell

Either we get the shell to pass in the \n, as per one of @Ole's tips, or we implement some kind of escaping scheme.


echo "try"$'\n'"this" # in bash

:smacks forehead: Of course, $'\n' worked perfectly. I forgot about that trick.

or we implement some kind of escaping scheme

Other than \n, I'm not sure what other users might want to escape? :thinking: Absent some more use cases, I'm a little hesitant to overcomplicate it too much too early... Or maybe, is there some simple golang library/builtin for handling shell escapes already?
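For what it's worth, I'm not aware of anything in the Go standard library aimed at shell escapes specifically, but strconv.UnquoteChar decodes Go-style backslash escapes (\n, \t, \xNN and so on), which may be close enough. A minimal sketch, assuming the flag value arrives as the two literal characters backslash and n; `unescape` is a hypothetical helper name and nothing like it is in the PR:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode/utf8"
)

// unescape expands Go-style backslash escapes in s. Hypothetical helper
// for illustration; it is not part of rclone or of the patch above.
func unescape(s string) (string, error) {
	var b strings.Builder
	for len(s) > 0 {
		// quote=0 lets both ' and " appear unescaped in the input.
		r, multibyte, tail, err := strconv.UnquoteChar(s, 0)
		if err != nil {
			return "", fmt.Errorf("invalid escape in %q: %w", s, err)
		}
		if r < utf8.RuneSelf || !multibyte {
			b.WriteByte(byte(r)) // single byte, e.g. from \n or \xff
		} else {
			b.WriteRune(r)
		}
		s = tail
	}
	return b.String(), nil
}

func main() {
	raw := `\n` // what the program sees for: --separator '\n'
	sep, err := unescape(raw)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("raw flag value: %q (%d bytes)\n", raw, len(raw))
	fmt.Printf("unescaped:      %q (%d bytes)\n", sep, len(sep))
}
```

The trade-off is that a user who wants a literal backslash in the separator would then have to write it doubled, so relying on the shell's $'\n' keeps the flag simpler.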

Credit goes to Google Search :smile:

Agreed. I suggest the tricks are included as an example in the rclone cat docs.


Ok, PR open. I thought the tests would run, but perhaps they are pending approval because I am a first-time contributor?

I have approved the execution of the tests and will leave the rest to @ncw - still new here.

Seems like there are some errors in the new code:

--- FAIL: TestCat (0.01s)
run.go:180: Remote "Local file system at /tmp/rclone1219559205", Local "Local file system at /tmp/rclone2053913187", Modify Window "1ns"
operations_test.go:550: Incorrect output from Cat(0,-1,ABCDEFGHIJ): "ABCDEFGHIJABCDEFGHIJ012345678ABCDEFGHIJ"
operations_test.go:550: Incorrect output from Cat(0,-1,ABCDEFGHIJ): "ABCDEFGHIJABCDEFGHIJ012345678ABCDEFGHIJ"
operations_test.go:550: Incorrect output from Cat(0,5,ABCDE): "ABCDEABCDE01234ABCDE"
operations_test.go:550: Incorrect output from Cat(-3,-1,HIJ): "HIJHIJ678HIJ"
operations_test.go:550: Incorrect output from Cat(1,3,BCD): "BCDBCD123BCD"

You may want to test locally:
https://github.com/rclone/rclone/blob/master/CONTRIBUTING.md#quick-testing