Linux + C

Linux + C – Wrapup

Folks, we’ve pretty much reached the end of this series. Let’s take a look back on all the topics we’ve covered.

Linux + C – Some other great tools

Well, we’ve pretty much run the gamut with Linux and C. However, there are a few other excellent tools you should know about.


Doxygen

Documenting code is a pain for most people. After months and months of diligent coding, they have to go back and create meaningful descriptions of all their library functions and how they operate.

I never have that problem, because I use Doxygen.

Doxygen is a program that translates specially-formatted comments into HTML documentation, complete with diagrams, hyperlinks, and other useful descriptors.

The format is fairly simple. Instead of opening a comment with /*, you open it with /** (the closing */ is unchanged). Within this block, you can use the following tags:

  • @param – describe an input/output argument for the function
  • @brief – Write a brief description
  • @details – Write a longer description
  • @return – describe the return values
  • @author – Put your name down, so people remember you’re the one who wrote the code
  • …and many more

So long as you continue to comment as you go, you will never deal with the documentation struggle that plagues so many.

Protip 1: All meaningful comments and debug logs are useful documentation. Don’t just throw them away when you’re done.

Protip 2: Document as you go. That way, when you come back to your code, you don’t have to re-discover how it works.


Git

When we work on code, we make changes. Sometimes, it makes sense to track those changes, especially when multiple people are changing the code.

Git is a version-control utility (originally written for Linux kernel development) that automates this code-versioning task. When you initialize a git repository, you create a tracking system which records all the “committed” changes to your files.

The important things to remember with git are the basic functions:

  • Checkout – switch your working copy to a particular branch or revision
  • Clone – copy an entire repository, including its history, onto your machine
  • Commit – record the current state of your files as the next version
  • Push – send your committed changes to a remote repository
  • Branch – create a new, separately-tracked line of development within the repository
  • Diff – compare different versions of the code in your repository

It’s a great tool. For information on using this tool in the Open Source community, check out GitHub.

Note: the GitHub site is a free repository hosting service. It is not the official source of the git tool, nor do you need it to run git.
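The basic functions above can be sketched in a single local session (assumes git is installed; no remote server is needed, and the file names and identity are placeholders):

```shell
# A minimal local git session.
mkdir demo && cd demo
git init -q .                             # create a new repository
git config user.name  "Demo User"         # an identity is required to commit
git config user.email "demo@example.com"
echo "hello" > notes.txt
git add notes.txt
git commit -q -m "first version"          # save the current files as version 1
git branch experiment                     # a new, separately-tracked line of work
echo "world" >> notes.txt
git diff --stat                           # compare the working copy to the last commit
```

A push would come last (`git push origin experiment`), but that step only makes sense once a remote repository exists.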


Meld

Sometimes you have two different versions of a file, and you want to know what the differences are.

Meld is a tool that takes two files or directories and prints out the differences between them in a well-designed GUI. It can also merge the files, generating one file with all the contents of the original input files.

Linux + C – Awk Example: Line Counter

Like most things, learning awk is much easier with examples. Today we’ll look at a simple awk script which counts lines of code based on their type.

The Code: linecount

The following code uses a lot of regular expressions to determine the nature of a line. There are three common ways to write these regular-expression tests in awk:

if($0 ~ /REGEX/) {}

/REGEX/ {}

match($0, /REGEX/) {}

For simplicity, I am only using the first format.

Also, if we simply write the /REGEX/ without the braces, the default operation is /REGEX/ { print $0 }. I prefer to do it manually, but the option is there for you.
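For example, these two pipelines print the same thing; the second relies entirely on the default action (the input text is made up):

```shell
# Both commands print every line containing "error".
printf 'ok\nan error here\n' | awk '{ if ($0 ~ /error/) print $0 }'
printf 'ok\nan error here\n' | awk '/error/'
```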

#!/bin/bash
echo ''
echo 'FILE            LINES    CODE    COMMENTS    ART'
echo '--------------------------------------------------------------'

for file in *; do
    if [ -f "$file" ]; then
        awk '
        {
            LINES++;
            if( $0 ~ /^$/           ||
                $0 ~ /^[ \t]*;$/    ||
                $0 ~ /^[ \t]*[{]$/  ||
                $0 ~ /^[ \t]*[}]$/  ||
                $0 ~ /^[ \t]*[(]$/  ||
                $0 ~ /^[ \t]*[)]$/  ||
                $0 ~ /^[ \t]*[}];$/ )
                ART++;
            else if($0 ~ /\/\//)
                COMMENTS++;
            else if($0 ~ /\/\*/) {
                COMMENTS++;
                while(!($0 ~ /\*\//)) {
                    if(getline <= 0) break;
                    LINES++;
                    COMMENTS++;
                }
            }
            else
                CODE++;
        }
        END {
            printf("%16s\t%d\t%d\t%d\t\t%d\n", FILENAME, LINES, CODE, COMMENTS, ART);
        }
        ' "$file"
    fi
done
echo ''

If you’re having trouble following the flow, this is what the program is doing:

  • If the line contains nothing or one of the “art” characters ( ) { } ; or };, classify it as “art” (meaningless, but makes the code prettier)
  • If the line contains //, classify that line as a comment
  • If the line contains /*, classify every line between that line and the line containing */ as a comment
  • Otherwise, classify the line as code

At the end, we print out the total number of lines and the breakdown of those lines.

Note: This code only works properly on C code. It won’t properly count, for example, scripts.

For more information and an excellent tutorial on all the facets of awk, check out the Grymoire (again).

Linux + C – Your New Friend, Awk

In modern internet lingo, awk is a synonym for awkward or abnormal. However, the Linux tool awk (and its brethren, gawk and nawk) is anything but awkward.

Awk – the programmable filter

So far, we’ve looked at a set of filters that each perform a fairly narrow task. less and more are designed to filter output to the screen for easier viewing. sed is designed to use regular expressions to find and replace values in a stream. cat is designed to print a file to the terminal, and echo is designed to print some text to the terminal.

What we’ve lacked so far is a filter whose operations are almost entirely under our control. While we had a lot of power with sed, we could only reliably use it to replace simple patterns in a stream; it wasn’t designed to, say, print a set of fields from a table to our terminal, then create a running sum of those field values.

This is where awk comes into play.

AWK (named after its developers, Aho, Weinberger, and Kernighan) is a pseudo-C language designed to manipulate files (and streams, but that’s not its best use), allowing us to create complex filtering operations with simple syntax and operations. There are a number of things we can teach the filter to do:

  • Find and replace in a similar manner to sed
  • Create multi-dimensional arrays
  • Print individual fields out of a file (extremely useful for filtering log files and raw data)
  • Print only specific lines out of a file, based on length or contents
  • Perform arithmetic operations, including increment and decrement

Perhaps there are even more applications for this fascinating filter, but so far I’ve not had need for them.

Three Stages of Awk

There are three basic phases of an awk program: the beginning, middle, and end. (Ed. Whoa, what a crazy concept)

The beginning phase occurs before awk begins to operate on the file. We can set up variables here, print lines of text, and perform system operations (terminal commands) exactly once, before we start processing any data.

The middle phase is performed on every line of the input file(s). This is where we do most of our filtering.

The end phase occurs exactly once, when the middle phase has finished reading all inputs. Generally, we use this phase to print out some results, or provide a message, or call some system operation.

An awk script basically looks like this:

awk '
    BEGIN { ... }   # beginning: runs exactly once, before any input
          { ... }   # middle: runs on every line of input
    END   { ... }   # end: runs exactly once, after the last line
' input_file
Soon we’ll look at some of the nifty features that make awk such a powerful tool.
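To make the three phases concrete, here is a complete, runnable program that counts the lines of its input (the input text is arbitrary):

```shell
# All three phases in one small awk program.
printf 'a\nb\nc\n' | awk '
    BEGIN { n = 0; print "counting..." }   # once, before any input
          { n++ }                          # for every input line
    END   { print "lines:", n }            # once, after the last line
'
```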

Linux + C – Brief Overview of Sed

via Fort Collins Program

Sed is a powerful editing tool, but difficult for most modern computer users to grasp. We’ll cover some of the basics here so that you can understand it when you see it in a script.

Linux + C – Some Useful Filters

Soon we’ll talk about perhaps the most powerful programmable filter ever conceived (the scripting language AWK), but first let’s look at a few other common filters and their uses.


More

The filter “more” is a program that lets us print output one screen at a time. If you’ve ever run a diff between two files or a recursive ls, you might know that the output can be thousands of lines long. Normally, these lines will scroll past in one long stream, and you will never see more than the few hundred lines of scrollback that the terminal keeps by default.

The solution is simple enough. more stores all these lines in memory and prints them out one screen at a time. This way, you can see all the output without piping it to a file and opening it in vi.

The most common usage of more is as follows:

cat file_name | more

Note: cat is a filter that prints a file out to the terminal. However, any program that creates output can be piped to more.


Less

Unfortunately for more, it has no history. That means that you can only ever scroll down, not up.

The solution? less is a program that acts like more, but it allows you to scroll up and down as well. The wikipedia page on less is more useful (in my mind) than the manual page, but basically you can scroll down with the Space Bar (just like more) and scroll back up with b. You can also scroll by one line using j and k (j down, k up). To quit this program (just like with more), you hit the q key.

We usually call less exactly like we call more:

cat file_name | less


Sed

Sed is a stream editor (as opposed to a file editor). What this means is that the tool takes a stream of input, performs some operation on it, and moves on.

Sed is less powerful (and generally less useful) than awk, but I do have one excellent application: removing tabs from the end of a line.

The following operations will extract the tabs and spaces (respectively) from the end of lines in a file. Very useful for someone like me, who tends to accidentally leave a tab or two in some files.

sed -i 's/[\t]*$//' File_Name

sed -i 's/[ ]*$//' File_name
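You can try the substitution on a stream first, before editing files in place with -i. The sketch below combines both patterns into one (GNU sed is assumed, since it understands \t inside a bracket expression; the input text is made up):

```shell
# Strip trailing tabs and spaces from each line of the stream.
printf 'hello\t\t\nworld   \n' | sed 's/[ \t]*$//'
```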

Bonus: Clear

While we’re talking about printing lines on a screen, it’s worth noting that the program “clear” wipes your terminal window clean. This makes it look somewhat like you just started the terminal, but with all the history and scrollback of the terminal still intact.

If you’re done with all the output on your screen, just run clear once.

Linux + C – Regular Expressions (REGEX)

Many valuable programs in Linux rely on the concept of the Regular Expression (REGEX). It’s important for us to understand the basics of this “language” so that we can better use the tools we’re provided.
