Linux Process Environment Explained for Beginners

July 4, 2014

Processes are fundamental to the Linux operating system. To understand how a Linux process works, it's important to know about its environment: how the main() function is called, how command line arguments are passed to a program, how a program accesses environment variables, how a process is laid out in memory, and the different ways a process can terminate. In this two-part article series, we will cover all these aspects from a beginner's point of view.


The main() Function

As programmers, we generally know that main() is the first function that gets called when a program is executed. But have you ever wondered who calls it, and what happens before main() is executed?

Well, whenever you execute a program from the command line, here is an overview of what happens:

  • The shell calls one of the exec family of functions with the path of the binary, the argument array (argv), and the environment array. The argument count (argc) is not passed explicitly; it is derived by counting the entries in argv.
  • The call enters the kernel through a system call handler, which receives a pointer to the program name string, the argv array pointer, the environment variable array pointer, and more.
  • The kernel then determines the executable file format (for example, ELF or a.out) and, based on it, sets up related data structures such as code size, data segment start, stack segment start, etc.
  • The kernel then allocates user mode pages for the process and copies the argument array and environment variables into those pages.
  • Finally, control is transferred to _start(), which is the entry point of a C executable. _start() then calls main() and passes all the required information to it.

That was just a brief overview. If you want to understand in detail each and every step that happens between when a program is executed and when main() is called, refer to this excellent tutorial.

Command Line Arguments

Now let's understand how a program accesses command line arguments. Let's take the following code as an example:

#include <stdio.h>

int main(void)
{
    printf("\n This is a test program\n");
    return 0;
}

So, as you can see, it's a very basic program that just prints a string. Now, if you want the program to accept command line arguments, you first have to change the parameter list of the main() function:

#include <stdio.h>

int main(int argc, char *argv[])
{
    printf("\n This is a test program\n");
    return 0;
}

In the code above, argc is an integer that represents the number of arguments (including the name of the binary), and argv is an array containing command line arguments as strings. Here are some more modifications that show how you can access command line arguments in code:

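#include <stdio.h>

int main(int argc, char *argv[])
{
    printf("\n This is a test program\n");

    /* argc counts the program name too, so 3 means two user arguments */
    if (argc != 3)
    {
        printf("\n The program expects 2 arguments\n");
        return -1;
    }

    int temp = 0;

    /* Print every entry, starting with argv[0] (the program name) */
    while (temp < argc)
    {
        printf("\n %s \n", argv[temp]);
        temp++;
    }

    return 0;
}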

The program now makes use of argc and argv. It loops argc times and prints every command line argument passed to the program. Here is the output of the program when it is executed:

$ ./arg prog 5

This is a test program

./arg

prog

5

So you can see that the program, when executed from the shell, prints all the command line arguments on stdout.

Environment List

Besides command line arguments, a program also receives information about the context in which it was invoked through the environment list passed to it. A standard environment list contains information such as the user's home directory, the terminal type, and the current locale; you can also define additional variables for other purposes.

By convention, the environment variables are defined in the following format:

name=value

The names are defined in upper case, but this is only a convention.

Just like the command line argument list, the environment list is an array of pointers, each pointing to a null-terminated string, with a NULL pointer marking the end of the array. The environment list can be accessed through the global variable environ, which is defined as a pointer to pointer to char.

Here is an example of how to use the environment list in a C program:

#include <stdio.h>

extern char **environ;

int main(int argc, char *argv[])
{
    printf("\n This is a test program\n");
    char **tmp = environ;

    /* The list ends with a NULL pointer */
    while (*tmp != NULL)
    {
        printf("\n %s \n", *tmp);
        tmp++;
    }

    return 0;
}

So, as you can see, environ is already defined as a global variable, so you just have to declare it as an extern variable. Using a temporary pointer variable, to which you assign the address held by environ, the program loops over the list and prints each environment variable.

Here is the output:

This is a test program

XDG_VTNR=7

SSH_AGENT_PID=1508

XDG_SESSION_ID=c2

CLUTTER_IM_MODULE=xim

SESSION=ubuntu

GPG_AGENT_INFO=/run/user/1000/keyring-TfWiqP/gpg:0:1

TERM=xterm

XDG_MENU_PREFIX=gnome-

SHELL=/bin/bash

VTE_VERSION=3409

WINDOWID=69206026

UPSTART_SESSION=unix:abstract=/com/ubuntu/upstart-session/1000/1441

GNOME_KEYRING_CONTROL=/run/user/1000/keyring-TfWiqP

GTK_MODULES=overlay-scrollbar:unity-gtk-module

USER=himanshu

...

...

...

Conclusion

In this article, we learned how the main() function is called and how the command line arguments and the environment list can be accessed in code.
