Jan 18 2011
 

Awk has always been a source of both great hatred and great love for me. It is an incredibly powerful command with which it is possible to build real programs.

In this article I will give you 6 examples ready for use with your preferred terminal.

AWK is a data-driven programming language designed for processing text-based data, either in files or data streams. It is an example of a programming language that extensively uses the string datatype, associative arrays (that is, arrays indexed by key strings), and regular expressions.




AWK is one of the early tools to appear in Version 7 Unix and gained popularity as a way to add computational features to a Unix pipeline. A version of the AWK language is a standard feature of nearly every modern Unix-like operating system available today. AWK is mentioned in the Single UNIX Specification as one of the mandatory utilities of a Unix operating system. Besides the Bourne shell, AWK is the only other scripting language available in a standard Unix environment. It is also present amongst the commands required by the Linux Standard Base specification.

Implementations of AWK exist as installed software for almost all other operating systems.

The power, terseness, and limitations of early AWK programs inspired Larry Wall to write Perl just as a new, more powerful POSIX AWK and gawk (GNU AWK) were being defined. Although AWK and sed were designed to support one-liner programs, even the early Bell Labs users of AWK often wrote well-structured large AWK programs.

1 Remove duplicate entries in a file without sorting.

Sorting a file to find duplicates reorders its contents. With awk you can remove duplicate lines while keeping the original order, and then redirect the result into another file.

awk '!x[$0]++'

Example

echo -e "aaa\nbbb\naaa\naa\nccc\naa" | awk '!x[$0]++'

output:

aaa
bbb
aa
ccc
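
The one-liner is terse, so here is a minimal sketch of what it does, written out long-hand. The function name dedupe and the array name seen are made up for illustration; the logic is meant to behave like '!x[$0]++', where the array counts how many times each whole line ($0) has been seen and a line is printed only while its count is still zero.

```shell
# dedupe: print each distinct input line once, in order of first appearance.
# Long-hand sketch of: awk '!x[$0]++'
dedupe() {
    awk '{
        # seen[$0] compares equal to 0 the first time a line appears
        if (seen[$0] == 0) print $0
        # count the occurrence so later copies are skipped
        seen[$0]++
    }' "$@"
}

printf 'aaa\nbbb\naaa\naa\nccc\naa\n' | dedupe
```

Called with file names (dedupe file1 file2) it reads those files instead of standard input.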

2 Sum the size of a selected group of files

With this command you can select some files and sum up their size, for example to sum up the size of all files in a directory use:

ls -l | awk '{s = s+$5 }; END { print s }'

or to sum up all the .mp3 files in the current directory and its subdirectories use:

ls -lR | grep '\.mp3$' | awk '{s = s+$5 }; END { print s }'

And so on: just change the ls command or the filter to select files with different names or types.
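
To make it clearer which column is being added up, here is a small sketch that runs the same sum over a canned listing. The function name sum_sizes and the sample file names and sizes are invented for illustration, and it assumes the usual ls -l layout in which the size is the fifth field.

```shell
# sum_sizes: add up the size column (field 5) of `ls -l` style lines on stdin.
sum_sizes() {
    awk 'NF >= 5 { s += $5 }      # skip the "total" line and blank lines
         END     { print s + 0 }  # s + 0 prints 0 instead of an empty line
    '
}

# Canned listing; names and sizes are made up.
printf '%s\n' \
    'total 12' \
    '-rw-r--r-- 1 user user 1000 Jan 18 2011 a.mp3' \
    '-rw-r--r-- 1 user user 2500 Jan 18 2011 b.mp3' \
    '-rw-r--r-- 1 user user  300 Jan 18 2011 notes.txt' |
    sum_sizes
```

On a real directory the equivalent call would be ls -l | sum_sizes.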

3 Alternative to the previous command using find

Search your whole computer for .mp3 files and sum up their size:

find / -name "*.mp3" -exec ls -l {} \; | awk '{s = s+$5 }; END { print s }'
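
As a hedged variation, the sketch below wraps the same idea in a function and tries it on a throwaway directory instead of /. The name sum_found, the sample files, and the use of mktemp -d are assumptions made for the example. It also restricts the search to regular files with -type f and ends -exec with + rather than \; so that many files are handed to a single ls call.

```shell
# sum_found DIR PATTERN: total size in bytes of regular files under DIR
# whose names match PATTERN (a find -name glob).
sum_found() {
    find "$1" -type f -name "$2" -exec ls -l {} + |
        awk '{ s += $5 } END { print s + 0 }'
}

# Try it on a throwaway directory (assumes mktemp -d is available).
demo=$(mktemp -d)
mkdir "$demo/sub"
printf '12345' > "$demo/one.mp3"        # 5 bytes
printf '1234567' > "$demo/sub/two.mp3"  # 7 bytes
printf 'xx' > "$demo/other.txt"         # not matched by *.mp3
sum_found "$demo" '*.mp3'
rm -r "$demo"
```

The pattern is quoted so that the shell passes it to find unchanged instead of expanding it against the current directory.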

4 Show the most used command in your history

List of commands you use most often:

history | awk '{a[$'`echo "1 2 $HISTTIMEFORMAT" | wc -w`']++}END{for(i in a){print a[i] "\t" i}}' | sort -rn | head
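
The backtick part counts the words in "1 2 $HISTTIMEFORMAT", which gives the number of the field that holds the command name: 2 when HISTTIMEFORMAT is unset, more when each history line also carries a timestamp. A minimal sketch of the same counting logic, with the field number passed in explicitly, might look like the following. The function name top_commands and the canned history lines are made up, and the field is handed to awk with -v instead of being spliced into the program text.

```shell
# top_commands COL: count how often each value of field COL occurs on stdin
# and list the most frequent values first.
top_commands() {
    awk -v col="$1" '{ count[$col]++ }
        END { for (cmd in count) print count[cmd] "\t" cmd }' |
        sort -rn | head
}

# Canned history lines (entry number, then command). HISTTIMEFORMAT is
# assumed to be unset here, so the command name is field 2.
printf '%s\n' '  1  ls -l' '  2  cd /tmp' '  3  ls' '  4  awk 1 file' '  5  ls -a' |
    top_commands 2
```

In an interactive bash session the equivalent would be history | top_commands 2.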

5 Analyze awk fields

Prints each line with its line number, followed by each of its fields with the field number. This is really useful when you are going to parse something with awk but aren't sure exactly where to start.

awk '{print NR": "$0; for(i=1;i<=NF;++i)print "\t"i": "$i}'
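
A short usage sketch on two sample lines shows the shape of the output. The function name show_fields and the input text are made up for illustration; the awk program is the same one, only spaced out.

```shell
# show_fields: print each input line with its record number, followed by its
# fields, one per line, numbered and indented with a tab.
show_fields() {
    awk '{ print NR ": " $0
           for (i = 1; i <= NF; ++i) print "\t" i ": " $i }' "$@"
}

printf 'alpha beta\ngamma\n' | show_fields
```

The first input line is reported as "1: alpha beta" followed by two tab-indented entries, "1: alpha" and "2: beta"; the second line has a single field.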

6 Rename some files

Rename some files by adding the extension .new:

ls -1 pattern | awk '{print "mv "$1" "$1".new"}' | sh

You can vary the pattern to show and change only certain types of files.
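
Because the generated text is fed straight to sh, it can help to look at the commands before running them. The sketch below is one possible way to do that; the function name rename_cmds and the sample files are invented, and it is an assumption of this example (not part of the command above) that names are wrapped in single quotes so that file names with spaces survive. Names containing a single quote or a newline are not handled.

```shell
# rename_cmds: turn file names on stdin (one per line) into mv commands that
# append .new. \047 is the octal escape for a single quote in awk strings.
rename_cmds() {
    awk '{ printf "mv -- \047%s\047 \047%s.new\047\n", $0, $0 }'
}

# Try it in a throwaway directory (assumes mktemp -d is available).
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b c.txt"
(
    cd "$demo" || exit 1
    ls -1 | rename_cmds          # dry run: only prints the commands
    ls -1 | rename_cmds | sh     # real run: executes them
    ls -1                        # show the renamed files
)
rm -r "$demo"
```

Using $0 instead of $1 keeps the whole line as the file name, which is what makes names with spaces work.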



      4 Responses to “6 Tricks with awk”

    1. Hi,
      I found the post very interesting.
      I have always been interested in awk, which I find very versatile and handy.
      I particularly liked example 1 for removing duplicates, although I did not understand how it works; I still need to study some facets of the language.
      I only noticed two inaccuracies, in examples 2 and 3.
      In 2, only the files in the current directory whose names end in .mp3 are listed, because it is the shell that replaces *.mp3 with the list of files and passes them as arguments to the ls command, which does not search for them recursively.
      In 3, no file is found because the value passed to the -name option should be *.mp3, and to keep that value from being replaced with the names of files in the current directory it is best to enclose it in double quotes, like this: "*.mp3".
      I hope this was useful and that my comment is welcome.

      Bye

      • Hi Antonio

        Thanks a lot for pointing this out; the examples have been corrected as you suggested ;)

    2. Instead of: ls -lR |grep .mp3 | awk '{s = s+$5 }; END { print s }'
      you can use awk itself to filter, like this:
      ls -lR | awk '/\.mp3/ {s = s+$5 }; END { print s }'

    3. 6 RENAME SOME FILES

      there is also rename – man rename
