What prevents stdout/stderr from interleaving

Say I run some processes:



#!/usr/bin/env bash

foo &
bar &
baz &

wait


I run the above script like so:



foobarbaz | cat


As far as I can tell, when any of the processes writes to stdout/stderr, its output never interleaves with the others': each line of output seems to be atomic. How does that work? What mechanism ensures that each line is atomic?
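
For concreteness, here is a hypothetical set of stand-ins for foo, bar and baz (these definitions are illustrative guesses, not part of the original script); each emits many labeled lines so that any splicing of lines is easy to spot downstream:

#!/usr/bin/env bash

# Hypothetical stand-ins for foo/bar/baz: each writes 1000 labeled lines.
foo() { for i in $(seq 1000); do echo "foo $i"; done; }
bar() { for i in $(seq 1000); do echo "bar $i"; done; }
baz() { for i in $(seq 1000); do echo "baz $i"; done; }

foo &
bar &
baz &

wait

Piping this through cat and searching for mixed prefixes, e.g. ./foobarbaz | cat | grep 'foo.*ba', is one way to check whether lines ever get spliced.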

shell osx stdout output stderr

asked by Alexander Mills, edited by Jeff Schaller

  • How much data do your commands output? Try making them output a few kilobytes. – Kusalananda

  • You mean where one of the commands outputs a few kB before a newline? – Alexander Mills

1 Answer
It depends on how the programs buffer their output. The stdio library that most programs use for output buffers data to make writing more efficient: instead of sending data to the file as soon as the program calls a library function to write, the function stores the data in a buffer and only actually writes it out once the buffer has filled up. Output is therefore done in batches. More precisely, there are three output modes:



  • Unbuffered: the data is written immediately, without using a buffer. This can be slow if the program writes its output in small pieces, e.g. character by character. This is the default mode for standard error.

  • Fully buffered: the data is only written when the buffer is full. This is the default mode when writing to a pipe or to a regular file, except with stderr.

  • Line-buffered: the data is written after each newline, or when the buffer is full. This is the default mode when writing to a terminal, except with stderr.

Programs can change the buffering mode of each of their files (in C, with setvbuf or setbuf), and can explicitly flush the buffer (fflush). The buffer is also flushed automatically when the program closes the file or exits normally. The sketch below shows one way to override these defaults from the shell.
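
As an aside (not part of the original answer): GNU coreutils provides stdbuf, which overrides the default stdio buffering of the command it runs, so the three modes are easy to experiment with from the shell. On macOS it is not installed by default, but the coreutils package typically provides it as gstdbuf.

# Line-buffered stdout, even though the pipe would otherwise make it
# fully buffered: each line reaches cat as soon as it is printed.
stdbuf -oL ./foo | cat

# Unbuffered stdout: every stdio write is passed through immediately.
stdbuf -o0 ./bar | cat

# Fully buffered stdout with an explicit 8 KiB buffer.
stdbuf -o8K ./baz | cat

Note that stdbuf only affects programs that leave stdio at its defaults; a program that calls setvbuf itself, or bypasses stdio with write(2), is unaffected.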



If all the programs that are writing to the same pipe either use line-buffered mode, or use unbuffered mode and write each line with a single call to an output function, and if the lines are short enough to be written in a single chunk, then the output will be an interleaving of whole lines: POSIX guarantees that writes of at most PIPE_BUF bytes to a pipe are atomic (PIPE_BUF is at least 512 bytes; it is 4096 on Linux and 512 on macOS). But if one of the programs uses fully-buffered mode, or if the lines are too long, then you will see mixed lines, as the sketch below demonstrates.
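
Here is a minimal sketch of that distinction (my illustration, not from the original answer; it assumes bash, whose builtin printf flushes its output after each command). The first writer emits each line with a single write, so its lines stay whole; the second splits every line across two writes, so the other writer can land between the halves:

#!/usr/bin/env bash

# One write per line: atomic, since each line is far below PIPE_BUF.
whole() { for i in $(seq 10000); do printf 'AAAA\n'; done; }

# Two writes per line: another writer can slip in between the halves.
split() { for i in $(seq 10000); do printf 'BB'; printf 'BB\n'; done; }

# Send both writers into one pipe and search for spliced lines.
{ whole & split & wait; } | grep -e AB -e BA

On a typical run this prints a handful of spliced lines such as BBAAAA; changing the second writer to emit each line with a single printf makes them disappear.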



Here is an example where I interleave the output from two programs. I used GNU coreutils on Linux; different versions of these utilities may behave differently.




  • yes aaaa writes aaaa forever in what is essentially equivalent to line-buffered mode. The yes utility actually writes multiple lines at a time, but each time it emits output, the output is a whole number of lines.


  • while true; do echo bbbb; done | grep b writes bbbb forever in fully-buffered mode: grep's stdout is a pipe, so it uses a buffer of 8192 bytes, and each line is 5 bytes long. Since 5 does not divide 8192, the boundaries between writes do not, in general, fall on line boundaries.

Let's pitch them together.



$ (yes aaaa & while true; do echo bbbb; done | grep b) | head -n 999999 | grep -e ab -e ba
bbaaaa
bbbbaaaa
baaaa
bbbaaaa
bbaaaa
bbbaaaa
ab
bbbbaaa


As you can see, yes sometimes interrupted grep and vice versa. Only about 0.001% of the lines got interrupted, but it happened. The output is randomized so the number of interruptions will vary, but I saw at least a few interruptions every time. There would be a higher fraction of interrupted lines if the lines were longer, since the likelihood of an interruption increases as the number of lines per buffer decreases.

answered by Gilles

  • Interesting. So what might be a good way to ensure that all lines are written to cat atomically, such that the cat process receives whole lines from foo/bar/baz and never half a line from one and half a line from another? Is there something I can do in the bash script? – Alexander Mills

  • Sounds like this applies to my case too, where I had hundreds of files and awk produced two (or more) lines of output for the same ID with find -type f -name 'myfiles*' -print0 | xargs -0 awk '{ seen[$1] = seen[$1] $2 } END { for (x in seen) print x, seen[x] }', but with find -type f -name 'myfiles*' -print0 | xargs -0 cat | awk '{ seen[$1] = seen[$1] $2 } END { for (x in seen) print x, seen[x] }' it correctly produced only one line for every ID. – sddgob

  • To prevent any interleaving, I could do that in a programming environment like Node.js, but with bash/shell I'm not sure how to do it. – Alexander Mills

  • @AlexanderMills Write to different files, then process the files separately, or concatenate them (see the sketch below). – Kusalananda
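
A minimal sketch of Kusalananda's suggestion (my illustration, reusing the script from the question; the file names are arbitrary):

#!/usr/bin/env bash

# Give each process its own file, so their writes can never interleave.
foo > foo.out 2> foo.err &
bar > bar.out 2> bar.err &
baz > baz.out 2> baz.err &

wait

# Combine only after every writer has exited: each file's lines are
# contiguous, so no line is spliced with another process's output.
cat foo.out bar.out baz.out

The trade-off is that lines come out grouped per process rather than interleaved chronologically; a chronological line-by-line merge needs a tool that reads each source one line at a time (GNU parallel's --line-buffer is one option).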