Utility awk - Basics

awk is a utility developed by Aho, Weinberger and Kernighan; the name comes from their initials (AWK). It is a pattern scanning and processing language. By default it reads from standard input and writes to standard output.

> An awk program can be run in two ways.  
> a. Command line  
> awk 'pattern {print}' input_files  
> awk '/regular expression/ {print}' input_files
> 
> b. Program file  
> awk -f awk_script input_file
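
As a minimal sketch of the program-file form, the snippet below writes a one-rule awk program to a file and runs it with `-f`. The file name `first_field.awk` and the three input lines are made up for illustration; the input is piped in to show that awk reads standard input when no input file is named.

```shell
# Write a one-rule awk program to a file (first_field.awk is a hypothetical name).
cat > first_field.awk <<'EOF'
# For every line containing "b", print its first field.
/b/ { print $1 }
EOF

# No input file is named, so awk reads the piped standard input.
printf 'alpha 1\nbeta 2\ngamma 3\n' | awk -f first_field.awk
```

Only the second input line contains a `b`, so this prints `beta`.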

The examples below use the following file as input.

File: myrecord

> EMPID_001 SBAL C005 DEP_SALES  
> EMPID_002 GIRI C008 DEP_ENGG  
> EMPID_003 JAVV C001 DEP_HR  
> EMPID_004 SIKD C011 DEP_HR  
> EMPID_005 FARO C204 DEP_ENGG  
> EMPID_006 MABR C012 DEP_SALES

a. Simple awk command to print the file

> awk '{print}' myrecord  
> EMPID_001 SBAL C005 DEP_SALES  
> EMPID_002 GIRI C008 DEP_ENGG  
> EMPID_003 JAVV C001 DEP_HR  
> EMPID_004 SIKD C011 DEP_HR  
> EMPID_005 FARO C204 DEP_ENGG  
> EMPID_006 MABR C012 DEP_SALES

b. Print employees in SALES

> awk '/DEP_SALES/ {print}' myrecord  
> EMPID_001 SBAL C005 DEP_SALES  
> EMPID_006 MABR C012 DEP_SALES
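
The regular expression `/DEP_SALES/` is tested against the whole line, so it would also match if `DEP_SALES` appeared in any other column. A hedged alternative, assuming the department is always the fourth field, compares that field directly:

```shell
# Recreate the sample file so the snippet runs on its own.
cat > myrecord <<'EOF'
EMPID_001 SBAL C005 DEP_SALES
EMPID_002 GIRI C008 DEP_ENGG
EMPID_003 JAVV C001 DEP_HR
EMPID_004 SIKD C011 DEP_HR
EMPID_005 FARO C204 DEP_ENGG
EMPID_006 MABR C012 DEP_SALES
EOF

# Select records whose fourth field is exactly DEP_SALES.
awk '$4 == "DEP_SALES" {print}' myrecord
```

With the sample data this selects the same two records as the regular-expression version.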

c. Print only employee names and their departments (splitting the fields: $2 and $4 are the second and fourth whitespace-separated fields)

> awk '{print $2,$4}' myrecord  
> SBAL DEP_SALES  
> GIRI DEP_ENGG  
> JAVV DEP_HR  
> SIKD DEP_HR  
> FARO DEP_ENGG  
> MABR DEP_SALES
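
The comma in `print $2,$4` inserts awk's output field separator, held in the built-in variable `OFS` (a single space by default). A small sketch, assuming comma-separated output is wanted, sets `OFS` in a `BEGIN` block, which runs before any input is read:

```shell
# Recreate the sample file so the snippet runs on its own.
cat > myrecord <<'EOF'
EMPID_001 SBAL C005 DEP_SALES
EMPID_002 GIRI C008 DEP_ENGG
EMPID_003 JAVV C001 DEP_HR
EMPID_004 SIKD C011 DEP_HR
EMPID_005 FARO C204 DEP_ENGG
EMPID_006 MABR C012 DEP_SALES
EOF

# Join the name and department with a comma instead of a space.
awk 'BEGIN { OFS = "," } {print $2,$4}' myrecord
```

The first line of output is `SBAL,DEP_SALES`; the remaining records follow the same pattern.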

d. Print the records with line numbers (NR holds the number of the current record; $0 is the whole record)

> awk '{print NR,$0}' myrecord  
> 1 EMPID_001 SBAL C005 DEP_SALES  
> 2 EMPID_002 GIRI C008 DEP_ENGG  
> 3 EMPID_003 JAVV C001 DEP_HR  
> 4 EMPID_004 SIKD C011 DEP_HR  
> 5 EMPID_005 FARO C204 DEP_ENGG  
> 6 EMPID_006 MABR C012 DEP_SALES
> 
> awk '{print NR,$4}' myrecord  
> 1 DEP_SALES  
> 2 DEP_ENGG  
> 3 DEP_HR  
> 4 DEP_HR  
> 5 DEP_ENGG  
> 6 DEP_SALES
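
Because `NR` keeps counting as records are read, its value after the last record is the total number of records. A brief sketch of that idea, using awk's `END` block (which runs once after all input has been read):

```shell
# Recreate the sample file so the snippet runs on its own.
cat > myrecord <<'EOF'
EMPID_001 SBAL C005 DEP_SALES
EMPID_002 GIRI C008 DEP_ENGG
EMPID_003 JAVV C001 DEP_HR
EMPID_004 SIKD C011 DEP_HR
EMPID_005 FARO C204 DEP_ENGG
EMPID_006 MABR C012 DEP_SALES
EOF

# Print the total number of records in the file.
awk 'END {print NR}' myrecord
```

For the six-line sample file this prints `6`.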

e. Print lines 4-6 of myrecord (a range pattern: start condition, end condition)

> awk 'NR==4, NR==6 {print NR,$0}' myrecord  
> 4 EMPID_004 SIKD C011 DEP_HR  
> 5 EMPID_005 FARO C204 DEP_ENGG  
> 6 EMPID_006 MABR C012 DEP_SALES
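
The form `NR==4, NR==6` is a range pattern: it starts matching at the record where the first condition is true and stops after the record where the second one is true. The same selection can be sketched as a single condition on `NR`:

```shell
# Recreate the sample file so the snippet runs on its own.
cat > myrecord <<'EOF'
EMPID_001 SBAL C005 DEP_SALES
EMPID_002 GIRI C008 DEP_ENGG
EMPID_003 JAVV C001 DEP_HR
EMPID_004 SIKD C011 DEP_HR
EMPID_005 FARO C204 DEP_ENGG
EMPID_006 MABR C012 DEP_SALES
EOF

# Select records 4 through 6 with one boolean condition.
awk 'NR >= 4 && NR <= 6 {print NR,$0}' myrecord
```

On the sample file both forms print the same three records.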

f. Print the first field with line numbers (strings placed side by side are concatenated)

> awk '{print NR "- " $1 }' myrecord  
> 1- EMPID_001  
> 2- EMPID_002  
> 3- EMPID_003  
> 4- EMPID_004  
> 5- EMPID_005  
> 6- EMPID_006

g. Print only the last field (NF is the number of fields in the record, so $NF is the last field)

> awk '{print $NF}' myrecord  
> DEP_SALES  
> DEP_ENGG  
> DEP_HR  
> DEP_HR  
> DEP_ENGG  
> DEP_SALES
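
`$NF` works because awk first evaluates `NF` to the field count and then applies `$` to that number. As a short sketch of the same mechanism, the snippet below prints the field count itself and the second-to-last field, `$(NF-1)`:

```shell
# Recreate the sample file so the snippet runs on its own.
cat > myrecord <<'EOF'
EMPID_001 SBAL C005 DEP_SALES
EMPID_002 GIRI C008 DEP_ENGG
EMPID_003 JAVV C001 DEP_HR
EMPID_004 SIKD C011 DEP_HR
EMPID_005 FARO C204 DEP_ENGG
EMPID_006 MABR C012 DEP_SALES
EOF

# Print the number of fields, then the field just before the last one.
awk '{print NF, $(NF-1)}' myrecord
```

Every record in the sample has four fields, so the first output line is `4 C005`.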