First, we create a .c file (the source code) in an editor such as vim
and write our program in the C language.
Then we run the gcc command in the terminal to compile the .c file.
How the gcc command works:
First, the .c file is passed to the preprocessor, which produces a .i file
containing our source code (.c) with the contents of the included
header files (.h) expanded in place.
Then this file is given to the compiler, which can be pictured as
working in two passes:
1) It builds and consults the symbol table, checking that every variable
and function used has been declared and recording its linkage.
2) In the second pass it translates the .i file into assembly, which the
assembler then turns into a .o file. The translation goes through these phases:
a) Lexical analysis
b) Syntax analysis
c) Intermediate code generation
d) Code optimization
e) Object code generation
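The first of these phases, lexical analysis, can be illustrated with a minimal sketch. It is not how gcc is implemented; the function name count_tokens and the three token classes are assumptions made for this example only.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the tokens in a fragment of C-like source text.
 * Token classes (a simplification chosen for this sketch):
 *   - identifiers and keywords: a letter or '_' followed by letters, digits, '_'
 *   - integer literals: a run of digits
 *   - anything else that is not whitespace: a one-character punctuation token
 */
size_t count_tokens(const char *src)
{
    size_t count = 0;
    size_t i = 0;

    while (src[i] != '\0') {
        unsigned char c = (unsigned char)src[i];

        if (isspace(c)) {
            i++;                      /* whitespace only separates tokens */
            continue;
        }
        if (isalpha(c) || c == '_') { /* identifier or keyword */
            while (isalnum((unsigned char)src[i]) || src[i] == '_')
                i++;
        } else if (isdigit(c)) {      /* integer literal */
            while (isdigit((unsigned char)src[i]))
                i++;
        } else {                      /* single-character punctuation */
            i++;
        }
        count++;
    }
    return count;
}
```

For the input "int x = 42;" this sketch sees five tokens: int, x, =, 42 and ;. Syntax analysis would then check that those tokens form a valid declaration.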
Then the hello.o file is given to the linker, which combines it with
the libraries it needs and resolves the symbols left undefined in the
object file. The result is an executable named a.out by default.
The a.out file contains an ELF header, a code segment (the machine
instructions of the complete program, including code pulled in from
libraries) and a data segment.
So after compiling with the gcc command, a.out has been generated and
stored on secondary storage (the disk).
Now we run ./a.out. The loader copies the executable from disk into
RAM, where it is executed.
A program context is then created in RAM, which contains:
a) stack segment (for local variables)
b) file descriptors (links to stdin, stdout and stderr), held in a
descriptor table that records information about these 3 file descriptors
c) code segment (compiled code)
d) data segment (global variables)
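As a rough illustration of where things live, the sketch below marks which segment each object would typically end up in. The names launches and next_ticket are made up for this example, and the exact placement is decided by the compiler and linker (zero-initialised data is often kept in a separate part of the data area usually called BSS).

```c
int launches = 0;               /* global: lives in the data segment */

/* The machine code of this function lives in the code (text) segment. */
int next_ticket(void)
{
    static int last_ticket = 0; /* static: data segment, keeps its value between calls */
    int step = 1;               /* local: created on the stack on every call */

    launches++;
    last_ticket += step;
    return last_ticket;
}
```

Calling next_ticket() twice returns 1 and then 2, because last_ticket survives between calls while step is created afresh on the stack each time.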
A program in execution is called a process.
All processes are managed by the operating system, which keeps track of
each one through a process control block. This block holds, among other
things, the process ID that the operating system allots whenever a
program is loaded into RAM for execution.
So this is how our program goes from being written in C to being
executed, with its output shown on the screen.
First, we write a simple hello world program in the vim editor and save it as hello.c.
Then we compile this program using the gcc command. Compilation is made
up of several distinct stages.
Our program first passes through the preprocessor, where the contents of
stdio.h are merged into it to form a hello.i file.
This hello.i file then passes through the compiler, which can be described
as having 2 passes. In the first pass, a symbol table is created which records
every declared name (variables, constants, functions, library functions and
other stored objects) in its own slot. Pass 1 also reports an error if any of
these is missing or wrong.
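A symbol table can be pictured as a lookup structure keyed by name. The sketch below is a toy version, not gcc's internal representation; the struct layout, the entries and the function lookup_symbol are all assumptions chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

/* One row of a toy symbol table: a name, what kind of thing it is, and its type. */
enum symbol_kind { SYM_VARIABLE, SYM_CONSTANT, SYM_FUNCTION };

struct symbol {
    const char *name;
    enum symbol_kind kind;
    const char *type;
};

/* Entries a compiler might record for hello.c (illustrative values only). */
static const struct symbol table[] = {
    { "main",   SYM_FUNCTION, "int (void)" },
    { "printf", SYM_FUNCTION, "int (const char *, ...)" },
    { "count",  SYM_VARIABLE, "int" },
};

/* Return the entry for `name`, or NULL if it was never declared. */
const struct symbol *lookup_symbol(const char *name)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    }
    return NULL;
}
```

A NULL result from lookup_symbol corresponds to the "missing" case described above, where the compiler would report an undeclared name.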
After pass 1 completes, pass 2 starts. Here the symbol table is consulted and
the program goes through the following analyses and transformations:
1. Lexical Analysis
2. Syntax Analysis
3. Intermediate Code Generation
4. Code Optimization
5. and Object Code Generation.
If these steps complete without any error, a hello.o file is created, which
is the compiled version of hello.c.
This hello.o file then goes to the linker together with the libraries it
depends on (such as the C library). stdio.h and hello.c are not linker inputs;
their contents are already compiled into hello.o.
The linker combines these inputs and resolves the references between them to
give an a.out file, which has 3 parts-
1. ELF (Executable and Linkable Format) header- it describes the layout of the
file: where the code, the data and the references to shared libraries are found.
2. Code Segment- all the machine instructions are present in this segment.
3. Data Segment- global and static data, including initialized variables and
constants, are present in this segment.
After this, we run the ./a.out command. With the help of the loader, this
creates a program context in RAM into which the executable is loaded. Depending
on the number of processes running, there can be multiple program contexts, each
with its own address space; the maximum size of that space depends on the
hardware and the operating system.
The program context has various parts-
1. Stack segment- local variables are present here.
2. File descriptors- there are 3 standard I/O streams: stdin (input stream),
stdout (output stream) and stderr (error stream). They are kept in a descriptor
table where a pointer, type and size are stored for each.
3. Code segment- compiled code is present here.
4. Data segment- global and static variables and constants are present here.
Macros and preprocessor directives are not; they are fully expanded by the
preprocessor and do not exist at run time.
This is the complete process of compilation and execution of a C program.
Running a C program involves 3 major steps:
1. Writing the C program in the editor
2. Compiling the written program (high-level language to machine-level language)
3. Getting the output of the written program.
These are the steps we see from the outside; what happens internally is the
subject of this article.
First we write our program in the editor (here I am using the vim editor).
When we save the program with the extension .c, a file is created. Let us
assume we save it under the name hello.c; then the file hello.c is created.
After hello.c has been created successfully, we need to compile it.
To compile, we run gcc (gcc is the compiler). After gcc is invoked, the
C preprocessor comes into action.
The C preprocessor looks for lines starting with # and includes the header
files named in the program, such as stdio.h and math.h. Header files consist
of the prototype declarations for the C library.
Merging the header files into hello.c with the preprocessor creates the
file hello.i:
hello.i = hello.c + header files
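To make the effect of the preprocessor concrete, here is a small sketch of a source file before preprocessing. The macro names GREETING and SQUARE and the two functions are invented for this example; the comments describe what the corresponding lines would look like in the .i file.

```c
#include <stdio.h>   /* replaced by the text of the header, e.g. the prototype of printf */

#define GREETING "hello, world"     /* object-like macro */
#define SQUARE(x) ((x) * (x))       /* function-like macro */

/* After preprocessing, the body reads: return ((side) * (side)); */
int square_area(int side)
{
    return SQUARE(side);
}

/* After preprocessing, GREETING has been replaced by the string literal. */
const char *greeting(void)
{
    return GREETING;
}
```

Neither the #define lines nor these comments appear in hello.i; only the expanded text reaches the compiler.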
The compiler comes after the creation of the hello.i file. The compiler
has the following steps:
1. lexical analysis
2. syntax analysis
3. intermediate code generation
4. code optimisation
5. assembly code generation
The gcc compiler is described here as a 2-pass process.
Pass 1
* In pass 1 the compiler goes up to intermediate code generation
and creates a symbol table.
* The symbol table contains all the declarations used in the program,
such as variables and their data types, constants, functions etc.
Pass 2
* In pass 2 it goes through all the stages listed above again, and if
there is no error the assembly code is generated.
Once the assembly code is generated, the file hello.s is created. The
assembler takes hello.s as input and saves its output in an object file,
so hello.o is created. hello.o contains machine code, but references to
functions and variables defined elsewhere (for example in the C library)
remain undefined in the object file at this stage.
The linker supplies the definitions of these undefined symbols. It links
the C library with the object file so that every undefined function and
variable gets a definition. It takes hello.o as input and produces a.out
as output.
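The difference between a declaration (enough for the compiler) and a definition (supplied at link time) can be sketched as follows. The function name_length is made up for this example; strlen is the real C library function, declared here by hand instead of through string.h purely to show what the header contributes.

```c
#include <stddef.h>

/* Declaration only: this tells the compiler the function's type.
 * <string.h> would normally supply this line. The definition lives
 * in the C library and is found by the linker. */
size_t strlen(const char *s);

size_t name_length(const char *name)
{
    return strlen(name);
}
```

An object file built from this code typically lists strlen as an undefined symbol until the linker matches it with the definition in the C library.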
a.out consists of three parts
1. ELF (Executable and Linkable Format) header -
describes the layout of the executable and the libraries it needs
2. code segment -
the machine instructions are present in this segment
3. data segment -
global and static variables and constants are present in this segment
a.out is saved on the disk, not in RAM. The loader copies the file
from the disk to RAM and creates the process context.
The process context contains the following segments.
1. Stack segment - contains the local variables
2. File descriptors - there are three standard streams:
a). Standard input - used for input from the user
b). Standard output - used for output on the screen
c). Standard error - used for error messages
3. Data segment - consists of the global variables and constants.
4. Code segment - contains the compiled code.
The process context is tracked by the PCB (process control block). The process control block contains the PID (process ID), PPID (parent process ID) and PC (program counter), among other fields.
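Two of these fields can be read by the process itself on a POSIX system. The sketch below assumes such a system; describe_process is a hypothetical helper, while getpid and getppid are the standard POSIX calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Write "pid=<PID> ppid=<PPID>" into buf and return the number of
 * characters snprintf reports (negative on error). */
int describe_process(char *buf, size_t size)
{
    pid_t pid = getpid();    /* ID of this process */
    pid_t ppid = getppid();  /* ID of the process that started it */

    return snprintf(buf, size, "pid=%ld ppid=%ld", (long)pid, (long)ppid);
}
```

When the program is started from a terminal, the parent ID reported is normally that of the shell.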
This is the process by which the source code of a C program ends up producing output.
Compilation of a C program
The compiler software compiles source code written in a high-level
language and generates object code in a file with the extension .o.
The whole C program goes through four processes
1- Pre-processing
2- Compilation
3- Assembling and
4- Linking
1- Pre-processing-
The source program, saved as filename.c, begins with preprocessing, which
performs tasks such as removing comments, expanding macros and including
header files.
It then generates a temporary file with the extension .i, holding the
contents of the header files and the source file. This file is named
filename.i.
2- Compilation- The GCC compiler is described here as a 2-step process.
First it performs lexical and syntax analysis and generates intermediate
code; second it optimizes the code and generates assembly code. In this
process the compiler shows warnings and errors if the source code is not
written according to the rules of the language.
Compilation generates a temporary file with the extension .s. This file
holds the assembly code for the header and source file contents, together
with symbol information: the names, types and arguments of the symbols
used, including references to C library functions that still have to be
linked.
This filename.s file then goes to the assembler.
3- Assembling - it translates the assembly into low-level machine code
and generates a file with the extension .o.
4- Linking- it is the final phase, in which function calls are linked
with their actual definitions. The linker also adds some extra code
needed to start and end the program and to set up its environment.
After that the a.out file is generated, which has an executable file
format header, a code segment and a data segment.
The program then runs in RAM, where its memory consists of a stack
segment, file descriptors, a code segment and a data segment. The file
descriptors refer to the streams stdin, stdout and stderr, through which
we enter input from an input device and the result is displayed on the
output screen.
By- nickushwaha@gmail.com
How a C program runs:
First we write a C program in the editor, then we compile it, and after
compiling we get a.out as the output file. From the outside it seems to
be a two-step process, but internally it takes a number of steps to
convert a C file into a.out, and this article tries to explain that
internal process.
First we write a C program in the editor (e.g. the vim editor)
and save it with the extension .c and a name
(let's assume the file name is hello.c).
After hello.c is created we need to compile it, and for compiling
we use a compiler (like gcc).
On compiling, the C preprocessor comes into action. It searches for lines
beginning with # and includes the header files
(like stdio.h, math.h, stdlib.h etc.), which contain the
prototype declarations of the C library. It produces a
preprocessed file, hello.i.
So hello.i = header files + hello.c.
After hello.i is created the compiler comes into action and
goes through the following processes:
1. Lexical analysis.
2. Syntax analysis.
3. Intermediate code generation.
4. Code optimization.
5. Assembly code generation.
The gcc compiler is described here as a 2-pass compiler, with compilation
occurring in 2 passes:
In the 1st pass the compiler goes up to intermediate code
generation and creates a symbol table, where it lists the
important information about each name, such as its identifier, data type
and arguments.
In the 2nd pass it goes through all 5 steps, looking up in the symbol
table whatever it needs to know about each name, and if there is no
error it generates an assembly file, hello.s.
The assembler takes the assembly file hello.s as input,
converts it into machine code and creates the object file hello.o as its
output.
The linker links the C library with the object file to supply the
definitions of all undefined functions and variables.
It takes hello.o as input and gives a.out as output.
a.out consists of 3 parts:
1. ELF (Executable and Linkable Format) header.
It describes the layout of the executable and the libraries it needs.
2. Code segment
It contains the machine instructions.
3. Data segment
It contains the global and static variables and constants.
Finally a.out is saved on the disk. The loader copies the file from
the disk to RAM and creates a process context, which contains the
following:
Stack segment:
It contains local variables.
File descriptors:
There are three standard streams: standard input, standard output and
standard error. By default they have the descriptor values 0, 1 and 2
respectively.
Code segment:
It contains compiled code.
Data segment:
It contains global and static variables. Register variables are kept in
CPU registers or on the stack, and volatile is a qualifier rather than a
storage location.
There is a process control block created by the OS which keeps
information about the process context, such as the
PID (process ID), PPID (parent process ID), PC (program counter), etc.
Compilation Phases
There are four phases in the compilation process. You can run all of
these phases with one command -
gcc file_name.c -o file_name
1. Preprocessor : The preprocessor expands header files into the program,
removes comments, and replaces macros with their definitions.
Command : gcc -E file_name.c -o file_name.i
Output : .i file
2. Compiler : The compiler takes the .i file as input and converts it
into assembly code. It is described here as two-pass compilation.
In the first pass, the compiler creates a symbol table for all functions.
The symbol table contains various attributes of a function, such as its
name, its arguments, and a reference to its definition in the GNU C
library (glibc).
In the second pass, the compiler creates the assembly code.
Command : gcc -S file_name.i
Output : .s file
3. Assembler : The assembler takes the .s file as input (the .s file is
written in assembly language) and converts it into native machine code.
Command : gcc -c file_name.s
Output : .o file
4. Linker : In the linker phase, the linker links the program to the
library files (library files contain the definitions of predefined
functions) and saves the executable to secondary memory. When the program
is run, the loader loads this file into memory and creates the process
context of that program.
Command : gcc file_name.o
Output : a.out file
Process Context
Stack : The stack contains local variables and function call information.
Data segment : The data segment contains initialized global and static
variables and constants. Register and volatile are a storage class and a
qualifier, not kinds of data stored here, and macros and preprocessor
directives no longer exist after preprocessing.
Code segment : The code segment is also known as the text segment.
This segment contains the compiled code.
The process context contains file descriptors. A program starts with 3
open connections, and each one is a file descriptor (fd). Different fds
are used for different purposes.
fd 0 - used for standard input
fd 1 - used for standard output
fd 2 - used for standard error
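The numbering above can be checked from inside a program on a POSIX system. This is a minimal sketch; descriptor_of and write_to are hypothetical helper names, while fileno and write are the standard POSIX calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Return the descriptor number behind a stdio stream. */
int descriptor_of(FILE *stream)
{
    return fileno(stream);
}

/* Write a message straight to a descriptor, bypassing stdio buffering.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_to(int fd, const char *message)
{
    return write(fd, message, strlen(message));
}
```

For example, write_to(2, "oops\n") sends the text to standard error, so it still reaches the terminal when standard output is redirected to a file.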
Compilation process of a C program
When we compile a C program, it goes through 4 different phases of
compilation.
These 4 phases of compilation are as follows:-
1.Preprocessor :-(gcc -E)
In this phase of compilation all the included header files are expanded
and added to the source code.
It also removes comments from the source code.
It also replaces macros with their values in the source code.
It takes a .c file as input and generates a .i file.
2.Compiler :-(gcc -S)
In this phase the source code is converted into assembly code.
It takes a .i file as input and generates a .s file.
This phase is also known as a 2-pass compiler.
In the first pass the compiler generates a symbol table for all the
functions and library functions used in the source code.
The symbol table stores the name, the arguments and a reference to the
definition in the GNU C library (glibc) for each function.
In the second pass it converts the source code into assembly code.
This phase is further divided into 5 phases of the compiler.
1.Lexical Analysis.
2.Syntax Analysis.
3.Intermediate code generation.
4.Code optimization.
5.Target (assembly) code generation.
3.Assembler :-(gcc -c)
In this phase the assembly code is converted into machine code.
It takes a .s file as input and generates a .o (object) file.
4.Linker:-
It takes a .o file as input and generates an executable (a.out by default).
The linker links all the functions and library references with their
definitions.
The a.out file contains 3 parts -
ELF (Executable and Linkable Format) header.
Code Segment - contains compiled code.
Data Segment - contains values.
The a.out file is stored in secondary memory (the hard disk).
When we run the a.out file, the loader loads it from secondary memory
into primary memory and creates a process context for it in primary
memory.
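What the shell does when we type the name of an executable can be approximated with the POSIX calls fork, execv and waitpid. This is a sketch under the assumption of a POSIX system, not a description of any particular shell; run_program is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start the program at path argv[0] as a new process and wait for it.
 * Returns the program's exit status (127 if the executable could not be
 * loaded), or -1 if fork/waitpid failed or the program ended abnormally. */
int run_program(char *const argv[])
{
    pid_t child = fork();            /* create a new process context */
    if (child < 0)
        return -1;

    if (child == 0) {
        execv(argv[0], argv);        /* ask the kernel to load the executable */
        _exit(127);                  /* only reached if execv failed */
    }

    int status = 0;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_program((char *[]){"./a.out", NULL}) would load and run the freshly linked executable in its own process context and report its exit status.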
Process Context
The process context table has 3 segments :-
1. Stack Segment :- The stack segment stores local variables.
2. Data Segment :- The data segment stores global variables, static
variables and constants. Register, volatile and other qualifiers describe
how a variable is treated rather than data stored here, and macros and
preprocessor directives are gone after preprocessing.
3. Code Segment :- The code segment stores the compiled code.
The process context table also has 3 file descriptors (FDs) :-
FD0 - For the stdin stream. It receives all input.
FD1 - For the stdout stream. It is used to send all normal output.
FD2 - For the stderr stream. It is used to send all error messages.
Compilation Process And Process Context
There are several phases in the compilation of a C program:
1. Preprocessor Phase :- In this phase the .c file is converted into a
.i file by using the -E option.
It includes the header files and expands them.
It removes all the comments from the .c file.
It replaces all the macros with their values in the .c file.
2. Compilation Phase :- Converts the .i file into a .s file by using the
-S option. There are two passes in this phase.
In the 1st pass of compilation, it maintains the list of all the
declarations, unknown attributes and references to library functions in
the symbol table.
It contains several phases.
Lexical Analysis.
Syntax Analysis.
Intermediate Code Generation.
Code Optimisation.
Target (assembly) Code Generation.
In the 2nd pass of compilation it converts the program into assembly
code.
3. Assembler Phase :- It converts the .s file into a .o file by using
the -c option.
In this phase the assembly code is converted into machine code.
4. Linker :- It takes a .o file as input and converts it into an
executable (a.out by default).
The a.out file is stored in secondary storage.
It links all the functions, attributes and library references to their
definitions.
The a.out file contains 3 parts.
Code Segment (compiled code).
Data Segment (values).
ELF (Executable and Linkable Format) header.
5. When we execute the a.out file, the loader loads it from secondary
storage into primary storage and also creates a process context for it.
Process Context
It contains 3 segments :-
1. Stack Segment :- It stores all local variables.
2. Data Segment :- It stores static variables, global variables and
constants. Register variables are normally kept in CPU registers, and
qualifiers are not stored data.
3. Code Segment :- It stores the compiled code.
The process context table contains 3 file descriptors :-
FD0 :- It is used for the stdin stream and receives all input values.
FD1 :- It is used for the stdout stream and carries all normal output.
FD2 :- It is used for the stderr stream and carries all error messages.
$ gcc -Wall -save-temps filename.c -o filename
First we create a .c file which contains the source code.
Every .c file has to be converted into executable code by compilation.
Every .c file has to pass through five phases before it runs as
executable code
1 . PREPROCESSOR
2 . TWO PASS COMPILER
3 . ASSEMBLER
4 . LINKER
5 . LOADER
1 . PREPROCESSOR
input file : filename.c
output file : filename.i
command : gcc -E filename.c -o filename.i
The preprocessor is the first phase. It performs these operations :-
(i) Include the Header file
(ii) Remove all the comment lines
(iii) Remove all the white spaces
(iv) Expansion of Macros
2 . TWO PASS COMPILER
input file : filename.i
output file : filename.s
command : gcc -S filename.i
The filename.i file passes through five phases, using the symbol table
and error handler, before assembly code is produced. These phases are :-
(i) Lexical Analysis
(ii) Syntax Analysis
(iii) Intermediate Code Generation
(iv) Code Optimization
(v) Target (assembly) Code Generation
The output of this phase contains ( assembly code + links )
3 . ASSEMBLER
input file : filename.s
output file : filename.o
command : gcc -c filename.s
The output of this phase contains ( code segment + data segment + links )
code segment contains : machine instructions
data segment contains : variables and constants
links : references to library symbols, recorded in the symbol table
4 . LINKER
input file : filename.o
output file : filename (a.out if no -o name is given)
This phase produces the final executable code, which contains
( ELF header + code segment + data segment )
ELF : Executable and Linkable Format
The executable is stored in secondary storage
5 . LOADER
When we execute the file, the LOADER loads the executable from
secondary memory into primary memory (RAM)
What does compilation mean?
Compilation is a process in which source code is converted into object code with the help of a compiler. While compiling source code, say a C program, the compiler produces intermediate files that have the same base name as the source file but different extensions.
The stages of the compilation process and the files generated at each stage are as follows:
File_Name.c(Source Code File)->[Preprocessor]->File_Name.i(Intermediate File)->[Compiler]->File_Name.s(Assembly File)->[Assembler]->File_Name.o(Object File)->[Linker]->File_Name.exe or a.out(Executable File)
Preprocessor
# gcc -E File_Name.c -o File_Name.i
Here, the -E option produces the output of the preprocessor, which is an intermediate file, and stops without passing this file on to the compiler.
Compiler
# gcc -S File_Name.c
Assembler
# gcc –c File_Name.s
Linker
# gcc File_Name.o