Maarten Bruins

asked on

The echo command with respect to file descriptors?

Let's say I type the following "in a terminal":

echo 'bla'

In my case, the shell is bash, so I assume the shell/bash-process receives "echo 'bla'" as standard input? Then it sees "echo", so a child process will be started. So then we will have at least:

ECHO PROCESS:
fd 0 (standard input)   <- terminal-file (keyboard)
fd 1 (standard output)  -> terminal-file (monitor)
fd 2 (standard error)   -> terminal-file (monitor)

I thought that for this process, only "bla" is the standard input. And then the output is also "bla", so I'll see "bla" on my monitor.

I was just playing around a bit with "input redirections" and I noticed that the following does not work:

echo < bla-file.txt

After some Google searches, I found out that "echo" does not read from stdin. However, it prints all of its arguments. So it's working differently than normal. So how should I see/change this:

ECHO PROCESS:
fd 0 (standard input)   <- terminal-file (keyboard)
fd 1 (standard output)  -> terminal-file (monitor)
fd 2 (standard error)   -> terminal-file (monitor)

I thought every process has fds 0, 1 and 2 by default? But if fd 0 were something like this:

fd 0 (standard input)   <- nothing

Then it should still be possible to redirect input to something, so I cannot see it like that. Does this mean that the echo process doesn't have an fd 0 at all? Or should I not see "echo" as a process with an fd table et cetera?

But the echo command displays something on my monitor, so at least this should be there:

fd 1 (standard output)  -> terminal-file (monitor)

So that means that I have to see "echo" as a process with at least fd 1. Or does it work totally differently? How should I see this?

Or maybe only the bash process is involved and it uses fd 1 of the bash process to output "bla". Then I must not see "echo" as a child process with its own fd table. Probably this is how it's working, but maybe someone can confirm this.

Then a command like "echo" differs from other commands (e.g. cat) because "echo" doesn't start a child process, but it stays in the main process (bash). Is this correct?
noci

That a file is PRE-opened by the shell and passed on to the child doesn't mean the child needs to USE it.

echo just doesn't read ... it only prints....
There are more of these commands: printf, seq, mkdir, df (and tail and head when they are given a file name; without one they do read stdin).
Some read sometimes: mv, rm (if -i has been passed).

There are also commands that print nothing and read nothing:
true, false

They all get stdin, stdout & stderr assigned.

More files can be pre-opened in the same way:

touch hello.txt
echo 4>test 105<hello.txt
rm hello.txt

This will pre-open fd 4 for output (creating test as a file) and also pre-open fd 105 for reading from hello.txt (if hello.txt exists; otherwise it is an error).
Not that it helps echo: it won't use them.
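
A minimal sketch to check this for yourself, assuming bash (some other shells only accept single-digit descriptors in redirections). The scratch directory and the variable name workdir are made up for illustration; the point is only that the shell opens the files before echo runs, whether or not echo ever touches them.

```shell
# Use a throwaway directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

touch "$workdir/hello.txt"
# The shell opens fd 4 for writing (creating "test") and fd 105 for reading
# before echo runs. echo uses neither descriptor.
echo 4>"$workdir/test" 105<"$workdir/hello.txt"

# "test" exists now, but it is empty: nothing was ever written to fd 4.
ls -l "$workdir/test"
rm "$workdir/hello.txt"
```

If hello.txt is missing, the shell reports the redirection error itself and echo is never run.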

echo hello >hello.txt
will create a file hello.txt with contents 'hello' + newline.
Oh, and most Unix/Linux commands have a man page:

man echo

shows the capabilities of echo (it needs one A4-size sheet to describe).
Maarten Bruins

ASKER

You're saying:

echo just doesn't read ... it only prints....

I know.

there are more of these commands

I know.

There are also commands that print nothing and read nothing

Irrelevant. And all the other things I also already know.

This doesn't answer my question, because I also already said:

"echo" does not read from stdin. However, it prints all of its arguments.

You're just telling me the same. The question is about what's behind this. And in my question-post I had some subquestions about it.

P.S. About your other command. I know how to use Google and manual pages too, but that's not the question. Again you're only posting general stuff. Before I'm asking a question I'm anyway reading all general stuff about a subject. You're only repeating things I already have read and I already understand.
This all depends on the shell you're using.

With BASH, echo is a builtin so no external process is started to run echo... unless you play games with PATH + override the echo builtin + force it to run as an external process.

This isn't recommended, especially if you have scripts running echo many times, as this can slow scripts down to a snail's pace.

Normally echo simply prints arguments passed, so no STDIN involved... normally...
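
A small sketch for checking which echo you actually get. It calls bash explicitly so the answer does not depend on whatever shell happens to run the snippet; the variable name kind is just for illustration.

```shell
# Ask bash itself how it resolves the name "echo".
kind=$(bash -c 'type -t echo')
echo "bash resolves echo as: $kind"

# List every "echo" bash knows about: the builtin first, then any
# executables found on PATH (the exact paths differ per system).
bash -c 'type -a echo'
```

On a stock bash the first line should report "builtin"; the external program is only used when you call it by path or disable the builtin.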
@David: Thanks! So in bash it's actually just like this:

Or maybe only the bash process is involved and it uses fd 1 of the bash process to output "bla". Then I must not see "echo" as a child process with its own fd table.

and this:

Then a command like "echo" differs from some* other commands (e.g. cat) because "echo" doesn't start a child process, but it stays in the main process (bash).

For bash that's correct, right?

* I added "some" to avoid confusion.
But actually I think you're wrong, because you're saying:

so no STDIN involved

But there is a STDIN involved. The bash process reads "echo 'bla'". This is STDIN. But there is no STDIN involved in a child process. That's something different. Correct me if I'm wrong...
I've tried to explain the echo command in short like this:

BASH PROCESS:
terminal-file   -> FD 0 -> standard input (echo 'bla')
standard output -> FD 1 -> terminal-file (bla)
standard error  -> FD 2 -> terminal-file (empty, because no standard error)
                   FD 3 <- file with echo code
                   FD ...

P.S. With the echo code I mean the code that describes what the echo command
has to do.

Executing the echo code results in a standard output of "bla". Can I see it like that? In reality, probably the file with the echo code in it has no file descriptor, because it's a "kernel-file", so it doesn't need one? Or it has one, but I don't see it? I've used the lsof command to check this, but this doesn't show a fd for that file. So there are two options:

- There is one, but the fd is only known in kernel-space so the lsof command can not show the fd.
- It doesn't have a fd.

How should I see this?
Even internal  commands are processed like regular programs.
try:
echo 1 & echo 2 & echo 3 & echo 4 & echo 5 & echo 6 & echo 8 & echo 9 & echo 10 &

That will start nine background processes running internal commands, so they behave the same as external commands.
The only difference from external commands is the exec() call, which actually loads a different program in place of the currently running one...
Then again even if it is an "internal" command it is still part of the running bash shell program that has stdin/stdout/stderr.
The image loader does NOT need an FD., it uses the mapping service to map the program and shared libraries into memory.
lsof will show that as well (the program will show up as txt, the others as mem):
bash    24104 user  txt    REG    9,5   778744  542321 /bin/bash
bash    24104 user  mem    REG    9,5  2315024  536340 /usr/lib64/locale/locale-archive
bash    24104 user  mem    REG    9,5   374384  148317 /lib64/libncurses.so.6.1
bash    24104 user  mem    REG    9,5  1865472  158846 /lib64/libc-2.26.so
bash    24104 user  mem    REG    9,5   308888  149836 /lib64/libreadline.so.7.0
bash    24104 user  mem    REG    9,5   157112  158148 /lib64/ld-2.26.so
bash    24104 user  mem    REG    9,5    26244  173129 /usr/lib64/gconv/gconv-modules.cache

You're saying:

Even internal  commands are processed like regular programs.

And then you give an example, but you're using "&", which changes my whole example. The & forks and runs the command in a separate sub-shell. So that's not a good example to show me that internal commands are processed like regular programs, because the ampersand changes the whole situation my question is about.

I've tried this in one terminal window:

for i in {1..50000000}; do echo "Test: $i"; done

And meanwhile in another terminal window:

lsof | grep 'echo'

No results, while with the less command (a non-builtin command) I did get some results. So it looks like no other files are opened because of the echo command. The echo command is part of the shell, so everything needed for "echo" is already there (open files in the bash process). It looks like that, but maybe someone can tell me whether this is correct, or what is wrong with thinking about it like that?

Furthermore, you're saying:

Then again even if it is an "internal" command it is still part of the running bash shell program that has stdin/stdout/stderr.

Even? I would be surprised if this were the case for non-builtin commands, but for builtin commands this is already what I would expect. In my post I'm saying the same, so what are you trying to make clear?

I don't really see an answer to my question in your post?

And you're saying:

lsof will show that as well

Again you're just repeating what I also already said:

I've used the lsof command to check this, but this doesn't show a fd for that file.

My question about that was: if the lsof command doesn't show any file descriptor number/integer, does this mean that there is no file descriptor involved? Because I can imagine that a user doesn't need to know the file descriptor of a file that is used purely internally in the kernel. So then the lsof command could "choose" not to show it.

You're saying:

The image loader does NOT need an FD., it uses the mapping service to map the program and shared libraries into memory.

But "does not need" and "does not use" are two different things. But you're saying that it's using mapping services? So then the answer would be that if the lsof command doesn't show a file descriptor number, then there are no file descriptors involved? Is this what you're actually trying to say?
You're really trying to read too much into this, I think ;)

Even though echo is built-in to bash, let's consider it as an external program just to highlight what is happening and reduce confusion (i.e. one thing at a time).

So bash will start our external "echo" program as a child process. All child processes will start with FD 0,1,2 open and connected to the terminal (if no redirections given on the command line). You are right in what you found out, that echo doesn't read from STDIN... but this doesn't mean that STDIN is not there. It just doesn't use it, and that's why in your original question, doing "< bla-file.txt" does nothing, because echo isn't reading from STDIN.
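
A quick sketch of that difference, assuming any POSIX-like shell. The temporary file and the variable names are only for illustration. In both commands fd 0 is connected to the file; only cat actually reads from it.

```shell
tmp=$(mktemp)
printf 'file contents\n' > "$tmp"

# fd 0 points at the file in both cases; only cat reads it.
from_echo=$(echo < "$tmp")   # echo prints its (zero) arguments plus a newline
from_cat=$(cat < "$tmp")     # cat copies stdin to stdout

printf 'echo gave: [%s]\n' "$from_echo"
printf 'cat gave:  [%s]\n' "$from_cat"
rm -f "$tmp"
```

So the redirection itself works fine; echo simply never looks at what is on the other end.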

However, it prints all of its arguments. So it's working differently than normal.

No, not different from normal. All child processes always get all three FDs connected to the terminal (unless redirected).

Hopefully, the many sub-questions that you asked (just after the above quote from your original post) should be clear now.
Now, once you've worked that out, if you want to consider echo as built-in to bash, then it is fairly easy... it's just bash reading from stdin, getting the "echo bla" command, interpreting it, and then printing "bla" to stdout. No child process involved, no changes to any FD's FD tables, nothing... just plain and simple.
Yep. Mapped I/O is handled through the paging mechanism. It's a different way to solve things.
The mappings are set up when the program is activated with the exec*() overlay function, by the kernel together with the dynamic loader (the ld-2.26.so in the lsof listing above; linux-vdso.so is only a small kernel-provided helper object, not the loader).
When the program executes and touches a page of memory that is not loaded yet, the page-fault handler loads it. That is much more efficient for executable code.
The system cannot know in advance which instructions the program will execute next.
Loading everything into memory when it isn't needed just wastes resources (I/O bandwidth).
BTW, the sources of bash are available and can be read. execute_cmd.c does all the heavy lifting:
there is a generic execute_simple_command() function for ALL commands, which for builtins or functions runs execute_subshell_builtin_or_function().
(It might fork a subshell if needed; in the case where it doesn't fork, it DOES save the context (redirections etc.) and goes from there.)
@Noci: Thanks.

@mccarl: Thanks a lot!

First you said that I have to see "echo" as a separate process with its own stdin, stdout, stderr. But one post later you're saying:

Now, once you've worked that out, if you want to consider echo as built-in to bash, then it is fairly easy... it's just bash reading from stdin, getting the "echo bla" command, interpreting it, and then printing "bla" to stdout. No child process involved, no changes to any FD's FD tables, nothing... just plain and simple.

This is the correct explanation, right? This one doesn't conflict with my results. But why did you say in the first place that I have to see it as a separate process with its own stdin, stdout, stderr? Was this just a way to simplify reality to make it easier for me to understand? And is it actually as in your second post about it? If it works like the quote above, then it makes sense to me and I understand it.

I got a bit confused because your answer changed 180 degrees within two posts. So I'm trying to find out what the reason was behind that.
Sorry, yeah I was just trying to simplify things. Yes they are totally different, whether echo is built-in or not, but that doesn't mean either is wrong. On some systems, echo will be built-in and on others not, so it is good to understand both versions.
Yeah I know indeed from: https://www.computerhope.com/unix/uecho.htm

Note: This document covers the standalone program, /bin/echo. Its options are slightly different than the builtin echo command that is included in your shell. If you are using the bash shell, you can determine which echo is the default, using the type command:

But mine was: "echo is a shell builtin", so that's why I was wondering what the difference is. But then at least I know now that I must see it as part of the bash/shell process in my case. And that also explains why I didn't see something when executing "lsof | grep 'echo'". Unlike the echo command, the less command is not builtin, so it's a different process and that's why I saw something when executing "lsof | grep 'less'".

So then I assume that this is the correct answer? If you agree, then maybe you can quote it and say yes, so I can mark it as the correct answer.
Shell built-ins CAN be executed as separate processes (forked etc., without changing images with exec*()),
or they can be executed using a "context" stack that runs them without forking bash, though with its own handling of redirects etc.
When the command is finished, the context stack is popped back to the previous level (bash sources: execute_cmd.c: execute_subshell_builtin_or_function()).
As noci mentioned. You can execute a builtin in many shells or run /bin/echo as a process.

So both approaches are equally correct.
@david, I meant that even internally executed built-ins can need a fork... (e.g. when in a pipeline).
So bash will fork but not exec() to change to another image. It runs the forked bash (which is a straight copy; initially only the PIDs differ), so all data is already available.
Okay wait a minute ;).

First @mccarl said:

it's just bash reading from stdin, getting the "echo bla" command, interpreting it, and then printing "bla" to stdout. No child process involved, no changes to any FD's FD tables, nothing... just plain and simple.

Then immediately after it, @noci said:

Yep. mapped io is handled through the paging mechanism.

Then @mccarl said:

On some systems, echo will be built-in and on others not, so it is good to understand both versions.

This quote doesn't contradict mccarl's quote above and I'm aware of this. Then @David comes by with a useless post (sorry I'm just always honest, nothing personal, I still like you ;)...):

As noci mentioned. You can execute a builtin in many shells or run /bin/echo as a process.

So both approaches are equally correct.

This is exactly the same as what mccarl already said, and it is not what the topic is about, because the topic is about builtin commands and how many processes are involved. It can be mentioned once (by mccarl), but mentioning it again is not necessary ;).

But so far I thought I understood everything. But now @noci is saying this:

@david, I meant that even internally executed built-ins can need a fork... (e.g. when in a pipeline).
So bash will fork but not exec() to change to another image. It runs the forked bash (which is a straight copy; initially only the PIDs differ), so all data is already available.

Now I can start all over again and I don't understand it anymore ;). I can think of something like this:

echo 'bla' | less

Let's say the echo command is builtin. There is a pipe, so I see this as:

BASH PROCESS          LESS PROCESS (forked off bash process)
fd 0 stdin      pe--  fd 0 stdin
fd 1 stdout --pi      fd 1 stdout
fd 2 stderr           fd 2 stderr

But can I not see it like that? Or maybe there is one more fd table involved, but then the bash process is forked, so the echo command is still part of the bash process and cannot be seen as its own separate process (then @mccarl would be right and then I would understand it). Now the two of you are telling me two different things?
Builtin commands are still executed in a shell, which might happen to be the current one or, if needed for redirection, a forked shell.
So there is still a context (process) which holds the references for stdin, stdout and stderr.
There is only one fd table per process.

In the case of echo bla | less,
which is run as:      COMMAND1 | COMMAND2
this is a pipeline: when either the input or the output of a command is a pipe, a forked shell is created...
Then the bash for COMMAND1 sees it needs to run 'echo bla'
and the bash for COMMAND2 sees it needs to run less:

so you get:
bash
     bash for COMMAND1  (piped output)
          it does: builtin('echo', 'bla',0)
                   reassigns FDs so stdin, stdout, stderr are in the right place
                   executes the builtin
                   exit(0);    (note here NO exec() has been done).
     bash for COMMAND2 (piped input)
          reassigns FDs to the right place
          it does: exec("/bin/less", "less", 0)
          (less will call exit() itself when it is done.)
main shell will do
     wait() until both forks exit().

Which would differ from:
echo bla >x.1 ; echo blob >x.2
which will be executed as COMMAND1 ; COMMAND2

bash
    no piping done... sequential commands:
    push SHELL context    (environment variables, redirections...)
        do redirection '>x.1'
            built in echo bla
    pop SHELL context  (restore main shell context, redirections...)
    push SHELL context
        do redirection '>x.2'
            built in echo blob
    pop SHELL context
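
One way to observe the two cases from the outside, as a rough sketch: a variable assignment made next to the builtin echo is lost when the command runs in a forked shell, and kept when it runs in the current one. The variable names v, after_pipe and after_seq are made up for illustration.

```shell
v=before

# COMMAND1 | COMMAND2: the left side runs in a forked copy of the shell,
# so this assignment is lost when that copy exits.
{ v=piped; echo bla; } | cat
after_pipe=$v

# COMMAND1 ; COMMAND2 with a redirection: no fork, the redirection is
# applied and undone inside the current shell, so the assignment sticks.
{ v=sequential; echo bla; } >/dev/null ; echo blob >/dev/null
after_seq=$v

echo "after pipeline:   $after_pipe"
echo "after sequential: $after_seq"
```

The pipeline leaves v at "before", while the sequential case changes it to "sequential", even though echo is a builtin both times.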
Let's say the echo command is builtin. There is a pipe, so I see this as:

Yes, it could be done like the diagram that you have there, but I guess it is a little bit messy. Yes, bash could set the environment up like that, but what happens after the built-in echo runs? The bash process still has its stdout redirected to the pipe. While this is not necessarily a major issue, to resolve it bash would need to make a backup copy of the initial stdout (using a dup call before redirecting to the pipe) and then, after the built-in echo code runs, restore the backed-up stdout into FD 1.
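
The same save-and-restore dance can be sketched at the shell level, where "exec 3>&1" plays the role of the dup call. This is only an illustration of the idea, not what bash does internally: a temporary file stands in for the pipe, and the choice of fd 3 and the variable names are arbitrary.

```shell
out=$(mktemp)

exec 3>&1          # back up the current stdout on fd 3 (like dup)
exec 1>"$out"      # point fd 1 at the file (standing in for the pipe)
echo bla           # builtin echo writes to whatever fd 1 is now
exec 1>&3 3>&-     # restore the original stdout and close the backup

captured=$(cat "$out")
rm -f "$out"
echo "stdout is back; the file received: $captured"
```

Without the last exec line, everything the shell printed afterwards would keep going into the file.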

So what @noci is saying is that instead of having to save this environment and restore it afterwards, bash can just fork a new child process, apply the redirection of stdout to the pipe in this new child process (which has its own fd table, therefore not affecting the parent bash process) and then just "exit" this child process after the built-in echo code has executed. But where this differs from the "fully standalone echo process" is that we fork a new process and then just run the built-in echo code within that process; we are NOT forking and then "exec"ing the standalone echo code. Is that clearer? ;)
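
A small sketch that makes the fork visible, assuming bash is installed (BASHPID is a bash-specific variable holding the PID of the process that expands it). The labels "main" and "piped" and the variable name result are just for illustration.

```shell
# Run under bash explicitly, because BASHPID is bash-specific.
result=$(bash -c '
  echo "main:$BASHPID"            # builtin echo in the shell itself
  echo "piped:$BASHPID" | cat     # builtin echo in a forked copy of bash
')
printf '%s\n' "$result"
```

The two PIDs differ, yet no separate echo program was ever started: the second echo is still the builtin, just running inside the forked copy of bash.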
The post of mccarl is a bit easier for me to understand than the last post of noci. So for now I'll respond to that one. When I was typing my previous post I also had a possible pipe in mind, so that's why I added:

Or maybe there is one more fd table involved

So I think I understand it. I thought noci said something like this:

a builtin-echo-command can have its own process

In some way maybe you can see it like that, but actually the echo command still doesn't have its own process. It uses the (forked) bash-process. The forked bash-process can be there purely for the echo command, but then I still see it as a bash-process instead of an echo-process.

So in summary:

builtin echo command       -> part of bash-process
non-builtin echo command   -> own process

Now that I'm writing this, I'm thinking: the "own process" above is probably also just a forked bash-process in the first place, right? Then I actually understand noci a bit more, because then both are basically just forked bash-processes.

But I think I still see the builtin echo command more as part of the (forked) bash-process, because it doesn't invoke a new program. Does what I'm writing here make sense?
ASKER CERTIFIED SOLUTION
mccarl
Thanks! Then I can close this question. Everyone thanks a lot for all your help/time. I really appreciate it!
You're welcome!