UNIX

Archived Posts from this Category

UNIX Filters

Posted by Bobby Corpus on 02 Nov 2006 | Tagged as: UNIX

JAVAERO - UNIX was designed with simplicity in mind. Each command is supposed to do only one thing and do it well. However, this does not mean that
unix can only do simple things. On the contrary, highly complex tasks can be done by stringing these commands together. The output of one command can be
the input of another.

$ cat poem.txt| less

In this example, the command “cat” writes the contents of the file poem.txt to its output, which would normally be the console. That output is instead fed as input to the “less” command. This is quite
a simplistic example; in fact, we can just do “less poem.txt” to have the same effect. However, it illustrates the concept of a unix pipe. The symbol “|” is called the pipe symbol and separates commands. Notice that the word immediately to the right of the “|” symbol will be interpreted by the shell as a command. An error will occur if the command does not exist.
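A slightly longer pipeline shows how several single-purpose commands combine. The following is only a sketch: the sample words and the variable name top are made up for the illustration.

```shell
# Find the most frequent word in a small sample.
# printf supplies the input so the example does not depend on any file.
top=$(printf 'fox\ncat\nfox\ndog\nfox\ncat\n' |
  sort |            # put identical lines next to each other
  uniq -c |         # collapse each run into "count word"
  sort -rn |        # order by count, largest first
  head -n 1)        # keep only the first line
echo "$top"
```

Each command here does one small job; the pipe symbols do the rest.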

Now that we have some familiarity with unix, let us add more to our arsenal of commands. One very common task we want to do is to find a specified file matching a certain pattern. The command to do this is “find”.

$ find . -name test.c

This command will search for files named test.c in the current directory. It will also search for test.c in all subdirectories contained in the current directory. The first argument to find is the directory to search. In this case, the . tells find to search in the current directory. The option “-name test.c” tells find to report only files whose name matches test.c. In this command, we specified an exact match. We can also tell find to search for all files ending in .c:

$ find . -name \*.c

The “\” backslash is necessary in order for the command to work. The “\” is an escape character. We will have more to say about it in the coming sections.

More advanced shell.

Some characters are treated specially by the shell. When the shell encounters them in a command, it does some preprocessing before executing the command. The most common of these characters have something to do with filename expansion.

Filename expansion allows us to specify a group of files using a pattern. For example, the pattern *.c will expand into all files whose names end in .c.

$ ls

$ ls *.c

is the same as

$ ls blah.c foo.c …

In other words, the shell first looks for all files ending in .c and constructs a list of all these files and makes them the argument to the “ls” command.

The following characters can be used to construct a pattern for filename expansion.

* - matches zero or more characters
? - matches a single character
[] - matches any single character contained within the “[]”. If “!” (or, in bash, “^”) is the first character of the list, it matches any single character not in the list.

$ ls
README
license.txt
main.c
pr.h

To match main.c and pr.h you can either do

$ ls *.?

or

$ ls *.[hc]

The first command will match all files with a single letter extension. The second will match all .c and .h files and is equivalent to the command ls *.h *.c.

To match the file README, you can

$ ls R*
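The patterns above can be tried safely in a scratch directory. This is a minimal sketch, assuming mktemp is available; the variable names are made up for the example. It uses echo instead of ls because echo prints exactly the argument list the shell built from each pattern.

```shell
# Work in a scratch directory so the patterns only see the four sample files.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch README license.txt main.c pr.h

one_char=$(echo *.?)      # names with a single-character extension
c_or_h=$(echo *.[hc])     # extension is exactly "h" or "c"
starts_r=$(echo R*)       # names beginning with "R"

echo "$one_char"
echo "$c_or_h"
echo "$starts_r"

# Clean up the scratch directory.
cd / && rm -r "$dir"
```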

Exercise: Suppose we have the following files in the directory

$ ls
README
INSTALL
main.c
init.c
Makefile

How do you match README and INSTALL?

Since the characters *, ?, and [] are interpreted by the shell, we must avoid filenames that contain them. For example, if we have a file named *, we can’t just do this:

$ rm *

for this will match all files in the directory and will delete all of them. To match the file named * we must quote the * using the “\” backslash character.

$ rm \*

will delete the file named *.
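To see that the escaped form really removes only that one file, here is a small sketch run in a scratch directory. The file name keep.txt is invented for the example.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
touch '*' keep.txt        # single quotes also stop the shell from expanding *

rm \*                     # removes only the file literally named *
remaining=$(ls)           # what is left in the directory
echo "$remaining"

# Clean up the scratch directory.
cd / && rm -r "$dir"
```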

However, filename expansion applies only to files in the current directory, not to files in its subdirectories. To find all files ending in .c in all subdirectories of the current folder, we use the find command.

$ find . -name \*.c

Notice the backslash character. This is necessary in order for the shell not to expand the *.c.
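The difference the backslash makes can be observed directly. The sketch below builds a small hypothetical source tree (all file names are made up) and uses quotes, which protect the pattern in the same way as the backslash. The echo line shows what an unprotected pattern becomes before find ever runs.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
mkdir -p src/lib
touch top.c src/main.c src/lib/util.c src/notes.txt

# Protected pattern: find itself receives *.c and applies it in every directory.
found=$(find . -name '*.c' | sort)
echo "$found"

# Unprotected pattern: the shell expands *.c against the current directory first.
expanded=$(echo find . -name *.c)
echo "$expanded"

# Clean up the scratch directory.
cd / && rm -r "$dir"
```

With the unprotected pattern, find would be asked for top.c only and would miss the other two files.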

REGULAR EXPRESSIONS

A very important skill that any unix professional should acquire is the ability to create and use regular expressions. A regular expression is a way to specify a pattern that can be used in matching text. Regular expressions are similar to shell filename expansion patterns, but they are more powerful. Their power is hard to convey in writing; it is best appreciated by using them.

Three important filters in unix make extensive use of regular expressions, namely: sed, grep and awk.

The following characters are used to construct regular expressions.

^ - matches the beginning of a string
$ - matches the end of a string
. - matches any single character
[..] - matches a single character in the list
[^..] - matches a single character NOT in the list
(..) - used to group a regular expression
| - used to indicate alternative regular expression
* - an operator, it specifies that the preceding regular expression be matched zero or more times
+ - similar to the * operator but requires the preceding regular expression to be matched at least once.
? - similar to the * operator but requires the preceding regular expression to be matched at most once.
{N} - also known as a braced regular expression, is an operator that requires the preceding regex to be matched N times
{N,} - at least N times
{N,M} - N to M times.

The above list will already produce a lot of possible regular expressions. In daily work, we usually only use a few of these constructs.
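A quick way to get a feel for the repetition operators is to run them against a few made-up words. One assumption to note: plain grep uses the basic syntax, in which +, ? and {} are ordinary characters unless escaped, so this sketch uses grep -E, which turns on the extended syntax.

```shell
words=$(printf 'ct\ncat\ncaat\ncaaat\n')

plus=$(printf '%s\n' "$words" | grep -E '^ca+t$')          # one or more "a"
optional=$(printf '%s\n' "$words" | grep -E '^ca?t$')      # zero or one "a"
exactly2=$(printf '%s\n' "$words" | grep -E '^ca{2}t$')    # exactly two "a"
range=$(printf '%s\n' "$words" | grep -E '^ca{2,3}t$')     # two or three "a"

echo "$plus"
echo "$optional"
echo "$exactly2"
echo "$range"
```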

Let us give examples of the above constructs.

$ touch Readme theReadme

To match the file “Readme”,

$ /bin/ls -1 |grep "^Readme"

To match names that end in “Readme” (this matches both “Readme” and “theReadme”)

$ /bin/ls -1 |grep "Readme$"

$ touch main.c main.x test.C test.cxx
$ /bin/ls -1 |grep "\.[ch]$"

will match all files ending in .c or .h. Notice that the “.” is explicitly matched and should be escaped by a backslash. The $ requires that the character “c” or “h” should be the last character on the line. Without the $, the pattern will also match .cxx.

$ /bin/ls -1 | grep "\.[^ch]"

will match all files that have a character other than c or h immediately after a dot.

Suppose we have the following files:

$ /bin/ls *Readme*
theReadme
Readme
dontReadme

If we want to match on theReadme and Readme, we can write

$ /bin/ls -1|grep "^\(the\)*Readme"

Notice that we grouped “the” as a regular expression. The * operator after the regex “the” will match zero or more “the” patterns. The ^ in the beginning of the pattern acts as an anchor. It requires that the pattern occur at the beginning of the line. This prevents the file dontReadme from being matched.

$ touch peterson johnson benson

To match peterson and johnson we use the alternation operator

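$ /bin/ls -1 |grep "\(peter\|john\)son"

The quotation marks already protect the “|” from the shell. The backslash is needed for a different reason: in the basic regular expression syntax that grep uses by default, a bare “|” is an ordinary character, and GNU grep accepts “\|” as the alternation operator.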
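The same match can be written without any backslashes by asking grep for the extended syntax with the -E option. This is a small sketch; the names are fed in with printf instead of ls so that it does not depend on the files in the directory.

```shell
names=$(printf 'peterson\njohnson\nbenson\n')

# Extended syntax: ( ) groups and | alternates, with no backslashes needed.
matched=$(printf '%s\n' "$names" | grep -E '(peter|john)son')
echo "$matched"
```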

Sed is another tool that makes extensive use of regular expressions. Sed is short for stream editor. It has a lot of options but most of the time, sed is used in
find-replace operations.

The syntax of sed find-replace is

sed 's/pattern to find/replacement/g'

$ echo "the quick brown fox jumped over another fox"|sed 's/fox/cat/g'
the quick brown cat jumped over another cat

In the above example, sed substituted the word fox with cat. Since the word “fox” occurred twice in this string, the substitution occurred
twice. To replace only the first occurrence, we omit the “g” in the above sed argument.

$ echo "the quick brown fox jumped over another fox"|sed 's/fox/cat/'
the quick brown cat jumped over another fox

The pattern “fox” is an exact pattern. We can also specify a regular expression in place of fox, for example,

$ echo "the quick brown fox jumped over another fox"|sed 's/f.*x/cat/'
the quick brown cat

This tells sed to substitute the word cat for the string that matched the pattern f.*x. Let us analyze this pattern. It matches strings that start with “f”, have zero or more characters in between (as specified by .*) and end in “x”. You might think that sed will match only the word “fox”. However, sed scans the input string and first meets the letter “f”. Then it examines the next character, which is “o”. Since this satisfies the condition “zero or more characters in between”, sed accepts it and continues to the next character. Upon seeing “x”, sed could stop. However, it does not. It continues to scan the whole string until it reaches the last “x”. Therefore, the text that matched is “fox jumped over another fox”. In this way, sed is said to find the longest match.
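If the intention was to replace only the first “fox”, one common workaround is to forbid the middle part from containing an “x”. The sketch below compares the two patterns; the variable names are made up for the example.

```shell
line='the quick brown fox jumped over another fox'

greedy=$(echo "$line" | sed 's/f.*x/cat/')      # .* runs on to the last "x"
short=$(echo "$line" | sed 's/f[^x]*x/cat/')    # [^x]* must stop at the first "x"

echo "$greedy"
echo "$short"
```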

Editing files using VI editor.

Posted by Bobby Corpus on 02 Nov 2006 | Tagged as: UNIX

JAVAERO - The vi editor is probably the only editor you will use in the unix command line. It is quite different from the ordinary editors you are familiar with in Windows. It will take time to get used to. However, once you have mastered it, you will be more efficient in your work.

To edit a file named poem.txt, invoke vi like this:

$ vi poem.txt

If the file poem.txt exists, vi will open the file for editing. If not, this will tell vi to create a new file named poem.txt.

Unlike the usual Microsoft editors, where you can begin typing your text right away, vi does not allow you to do that. When it is launched, it is said to be in “command mode”. In this mode, vi will interpret your keystrokes as commands. These commands are usually what you see in the toolbar of our familiar editors. In vi, there is no toolbar. Commands are invoked using keystrokes. In order to type any text at all, you need to be in the “edit” mode (vi's own documentation calls it insert mode). You can do this by pressing the “i” key. This is a command that tells vi to switch to the edit mode. Once in the edit mode, you can then start typing. Here is a nice unix poem you can type using vi.

Waka waka bang splat tick tick hash,
Caret quote back-tick dollar dollar dash,
Bang splat equal at dollar under-score,
Percent splat waka waka tilde number four,
Ampersand bracket bracket dot dot slash,
Vertical-bar curly-bracket comma comma CRASH.

Once you are done entering the text, you go back to the command mode using the escape key (ESC). In the command mode, you can save your text and quit by typing “:wq” and pressing Enter.

This is basic vi editing. We will learn other advanced features as we progress.

Getting Started With the BASH Shell

Posted by Bobby Corpus on 02 Nov 2006 | Tagged as: UNIX

Perhaps the first thing you want to know when you are at the shell is where you are. The command pwd will give you your current working directory.

$ pwd
/home/bobby

The ls command will give you a list of all files in the current directory.

$ ls

Given an argument, ls will determine if the argument is a directory. If it is, it will list the files contained in that directory. If it is not a directory but an ordinary file, ls will just list the file, as if echoing what you typed. However, if the file does not exist, ls will tell you that fact.

$ ls non-existent
non-existent not found

Some unix commands require a parameter in order to complete successfully. Some, like the ls command above, have default values. A unix command is structured in the following way:

command_name options arguments

Options are usually preceded by a “-” sign. Options allow you to have more control over the output of a command. For example, the command

$ ls -l

will list down detailed information about the files in the current directory. We will learn how to interpret the output of this command later when we gain more familiarity with the basic commands.

Exercise: List down the files of your parent directory.

Other basic commands.

To change to a directory, we use the cd command. It takes a single argument, which is the directory to change to. If you don’t specify an argument, it will change to your home directory, as defined by the HOME environment variable.

To create a directory, we use the mkdir command. It takes one or more arguments. These arguments are the names of the directories to create.

$ mkdir tmp1 tmp2 tmp3

will create directories tmp1, tmp2 and tmp3 as shown by the ls command below.

$ ls -F

tmp1/ tmp2/ tmp3/

Question: What does the -F option to ls do?

We can remove the directories we just created using the rmdir command.

$ rmdir tmp1 tmp2 tmp3

If a directory is not empty, rmdir will refuse to remove that directory unless it is first emptied of all files.
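This refusal is easy to observe in a scratch directory. The sketch below is only an illustration; the names tmp1 and file, as well as the variable names, are chosen for the example.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
mkdir tmp1
touch tmp1/file

# First attempt: tmp1 still holds a file, so rmdir should fail.
if rmdir tmp1 2>/dev/null; then
  first_try=removed
else
  first_try=refused
fi

rm tmp1/file                        # empty the directory first
rmdir tmp1 && second_try=removed    # now rmdir succeeds

echo "first try: $first_try, second try: $second_try"

# Clean up the scratch directory.
cd / && rm -r "$dir"
```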

Exercise: How do you force rmdir to remove a non-empty directory?

We can create empty files using the command “touch”.

$ touch me

will create the file named “me” in the current working directory.

To remove this file, use the “rm” command.

$ rm me

We should use the rm command with caution because you cannot undelete the file you just deleted. You can tell rm to ask you for confirmation to delete the file using the “-i” commandline switch.

$ rm -i me
me: ? (y/n)

We now know how to do some basic commands for unix. But we are still not able to do anything useful. The next thing we want to do is to view the contents of a file. There are many ways to peek
at the contents of files. For short files that will fit the screen, we can use the “cat” command.

$ cat filename

This will dump on the screen the contents of file named filename. If the file is long and will not fit one
screenful, the contents will just flash quickly on the screen and you will just be able to view the end portion
of the file.

To view longer files, we can use the “more” command.

$ more filename

will let you view the file page by page. The more command is also known as a “pager” command because it lets you view a file page by page. If a file is long, you can scroll to the next page using the space bar. The only problem with the “more” command is that you cannot go back to the previous page. You can only scroll down, not up.

A more powerful pager than “more” is the “less” command. It allows you to scroll up and down. In this respect, we can say that “less” is more.

$ less long-file

To scroll backwards, press the “b” key.

Shell environment variables.

The shell is not only a command interpreter, it is also a programming environment. By an environment, we mean that it gives you the necessary tools to create programs conveniently and according to your own programming style and preference. We will explore that later.

As we have seen before, some commands are able to operate without an argument, like the cd command. The cd command depends on the environment variable HOME, which stores the value of your home directory. To view the value of this variable we use the echo command.

$ echo $HOME
/home/bobby

Using this value, cd command will change to that directory when there are no arguments.
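We can check this dependence on HOME without touching the real home directory. In the sketch below, a scratch directory stands in for the home directory, and everything happens inside a subshell; the variable names are made up for the example.

```shell
scratch=$(mktemp -d)          # stand-in for a home directory

# The $( ) runs its commands in a subshell, so the changed HOME and the
# directory change do not affect the shell running this script.
landed=$(HOME=$scratch; cd && pwd)

echo "cd with no argument went to: $landed"
rmdir "$scratch"              # remove the scratch directory again
```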

There are predefined environment variables when your account is first created. Many commands depend on the definitions of these variables for proper functioning (just like the human body depends on some factors in order to operate properly). Among the most important variables the shell depends on is PATH. The path tells the shell where to find programs. Accidentally changing the value of this variable can give you a big headache, as the shell can no longer find some programs.
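To look at PATH without changing it, something like the following can be used. It is a minimal sketch; the directories printed will differ from one system to another.

```shell
# List the directories in PATH, one per line, in the order the shell searches them.
echo "$PATH" | tr ':' '\n'

# command -v reports which file the shell would run for a given command name.
ls_path=$(command -v ls)
echo "ls is found at: $ls_path"
```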

You can also define your own environment variables. To define an environment variable, the syntax is

name=value

For example, to define the variable QUOTE to “Health is wealth.” we type

$ QUOTE="Health is wealth."

Notice that the value is enclosed in quotation marks. This tells the shell that the spaces are part of the value. If the quotation marks are omitted, an error will occur, like this:

$ QUOTE=Health is wealth
bash: is: command not found

As you can see, the shell interpreted “is” as a command, and having found none, it issues the error “command not found”.
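One detail worth trying out: a variable defined with name=value is known only to the current shell until it is exported, after which the programs started from that shell can see it too. The sketch below checks this by asking a child sh process for the value; the text “not set” is just a placeholder chosen for the example.

```shell
unset QUOTE                         # start clean in case QUOTE already exists
QUOTE="Health is wealth."           # defined in the current shell only

# A child process does not see the variable yet.
child_before=$(sh -c 'echo "${QUOTE:-not set}"')

export QUOTE                        # place it in the environment of child processes
child_after=$(sh -c 'echo "$QUOTE"')

echo "before export: $child_before"
echo "after export:  $child_after"
```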

Unix Questions

Posted by Bobby Corpus on 02 Nov 2006 | Tagged as: UNIX

Unix Questions. This is a commandline exercise. Find a way to execute these instructions in the fastest way possible.
1. cut and paste the following to a file named the_road_not_taken
TWO roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;
Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,
And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.
I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I.
I took the one less traveled by,
And that has made all the difference
2. count the number of words
3. count the number of unique words
4. create a directory for every unique word
5. move the directories to their uppercase equivalent. For example,
mv Yet YET
6. copy the file the_road_not_taken to each directory with the name of the directory in lower case and extension is .txt
for example:
cp the_road_not_taken WORN/worn.txt
7. rename all .txt files to .doc extension. For example
mv worn.txt worn.doc
8. format the file the_road_not_taken in such a way that it will look like this:

TWO |roads |diverged |in |a |yellow |wood, |
And |sorry |I |could |not |travel |both |
And |be |one |traveler, |long |I |stood |
And |looked |down |one |as |far |as |I |could |
To |where |it |bent |in |the |undergrowth;|

9. grep the lines that contain the word “the” in the 4th column.
10. create an executable file named mytest.sh with the content:
#!/bin/sh
while true
do
sleep 5
done
a. launch 5 processes in the background.
b. kill all these processes.