UNIX tutorial
This tutorial teaches some basics about UNIX. Like Microsoft Windows and Apple Mac OS, UNIX is an operating system. By operating system, we mean the suite of programs which make the computer work. The Neumann Cluster runs a variant of UNIX called Linux. Because jobs must be submitted to the supercomputer using UNIX commands, it is important that you, as a user of the supercomputer, obtain at least a basic knowledge of the UNIX environment.
In the first section, you'll find a brief overview of useful commands.
Then you'll find a series of video introductions by Brian Will.
Starting with the subsequent chapter, The UNIX operating system, we introduce the concepts in written form.
If something is not explained well or questions remain, please send your comments and ideas for improving this guide.
Quick Reference Guide
ls, cd, cp, mkdir, rm, mv, echo
Printing a file's content (to the terminal) | |
---|---|
cat file.sh | Completely prints a (text) file to the terminal. |
head -20 file.sh | Prints the first 20 lines of a file to the terminal. |
tail -10 file.sh | Prints the last 10 lines of a file to the terminal. |
Searching a String within a file | |
---|---|
grep -h starccm job.sh | Print all lines which contain the string starccm in the file job.sh |
grep -n starccm job.sh | Print all lines which contain the string starccm in the file job.sh , including the line numbers |
grep -h -i starccm job.sh | Print all lines which contain the string starccm ignoring (upper/lower) case in the file job.sh |
Search for Files | |
---|---|
find ~ | lists all files and directories found in home(~ ) |
find ~ -name run.sh | lists all files and directories found in home(~ ) with the name run.sh |
find ~ -iname run.sh | lists all files and directories found in home(~ ) with the case-insensitive name run.sh , such as "Run.sh", "run.SH" or "rUN.sh" |
find ~ -name run.sh -type f | lists only files found in home(~ ) with the name run.sh |
find ~ -regextype posix-extended -regex '.*\.(sh|tcl)' -type f | lists all files which have the extension .sh or .tcl . |
find ~ -regextype posix-extended -iregex '.*\.(sh|tcl)' -type f | lists all files which have the extension .sh or .tcl , case-insensitively, such as .TCL , and .SH . |
find . -name "*.tcl" | xargs grep -l -i "proc ic_save_tetin" | Search for a string in the files listed by find |
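The find and grep entries above can be tried safely in a scratch directory. The following is a minimal sketch, not part of the original reference; the directory layout and file contents are made up for illustration, and mktemp is assumed to be available (it is on typical Linux systems).

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
demo=$(mktemp -d)
mkdir -p "$demo/case1" "$demo/case2"

# Hypothetical sample files, created only for this demonstration.
printf 'proc ic_save_tetin {} {}\n' > "$demo/case1/mesh.tcl"
printf 'puts "hello"\n' > "$demo/case2/other.tcl"
printf '#!/bin/sh\n' > "$demo/case2/run.sh"

# All regular files whose name ends in .tcl (sorted for stable output).
tcl_files=$(find "$demo" -type f -name '*.tcl' | sort)
echo "$tcl_files"

# Of those files, print only the names of the ones containing the string.
matches=$(find "$demo" -type f -name '*.tcl' | xargs grep -l -i 'proc ic_save_tetin')
echo "$matches"
```

Note the straight quotes around the patterns: they keep the shell from expanding `*.tcl` itself before find sees it.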
top, ps, kill, killall
Video Tutorials
This series can be found on YouTube and was made by Brian Will.
The UNIX operating system
In general, the UNIX operating system is made up of three parts; the kernel, the shell, and the programs.
The kernel
If we think of the UNIX operating system in terms of layers, the kernel is the lowest layer. It interfaces directly with the computer hardware and is responsible for allocating and managing the resources available to programs. It allocates processor time and memory to each program and determines when each program will run. The kernel also provides an interface to programs whereby they may access files, the network, and devices.
The shell
The shell acts as an interface between the user and the kernel. When a user logs in, the login program checks the username and password, and then starts another program called the shell. The shell is a command line interpreter (CLI). It interprets the commands the user types in and executes them. The commands are themselves programs. Once programs terminate, control is returned to the shell and the user receives another prompt ($ on our systems), indicating that another command may be entered.
The programs
One of the main features of UNIX is that it includes a variety of small programs to meet various needs. Typically, each of these programs does one thing and does it well. This modular design allows the functionality of small programs to be mixed and matched. As you become more familiar with UNIX, you will find that this design provides you great flexibility and power to accomplish almost any task. Typically these programs operate on top of the shell, but they may also interface directly with the kernel.
Files and Processes
Everything in UNIX is either a file or a process.
- A process is an executing program identified by a unique PID (process identifier).
- A file is a collection of data.
Some examples include:
- a document (report, essay, etc.)
- the text of a program written in some high-level programming language
- instructions comprehensible directly to the machine and incomprehensible to a casual user, for example, a collection of binary digits (an executable or binary file)
- a directory, containing information about its contents, which may be a mixture of other directories (subdirectories) and ordinary files
The Directory Structure
(explain:)
- / root
- ~ home
- . current directory
- .. parent directory
- /scratch/tmp/
- /cluster/apps/
Using the shell
The default shell for users on the supercomputer is the bash shell, but each user has the ability to customize their shell. Please see the FAQ question "How do I change my password and/or login shell?" from the marylou4 FAQ for more information on how this is done. Most shells, including bash, provide features that make entering commands easier for the user. Here are just a few of those features:
- Tab Completion - By typing part of the name of a command, filename or directory and pressing the [Tab] key, the shell will complete the rest of the name automatically. If the shell finds more than one name beginning with the letters you have typed, it will show you all of the possibilities beginning with that combination.
- The history command - The shell keeps a list of the commands you have typed in. If you need to repeat a command, you can use the up and down arrow keys to scroll up and down the list or enter history for a list of previous commands.
When you open a shell window, a prompt of the following structure is printed:
[cfd01@ws15 ~]$
What does this line mean? cfd01 is the username you logged in with. ws15 is the name of the computer you are logged in to. ~ is a symbol that stands for the path to your home directory. In a different directory, that directory's name is printed instead of ~. Try out the following command (type cd Documents followed by the [Enter] key).
[cfd01@ws15 ~]$ cd Documents
The tilde ~ should now be replaced with Documents in the next line. What did you just type? cd is the command to change the current directory. So you just jumped to the subdirectory Documents of your home directory. The folder structure of a typical Linux system is described here. But first let's have a look at what a command looks like. For simplicity, the part [cfd01@ws15 ~] will be omitted from now on. You will only find the symbol $ at the beginning of a line containing a command.
Command Syntax
In the active terminal window you can run an installed program by simply typing its name and pressing the [Enter] key. This action is called a command.
You may try out whoami or pwd. The use of these commands is pretty obvious; if not, just keep on reading.
Often you need to tell a program more than "just run yourself". Take cd, for example, which you may have tried before. To change the current directory you need to type in the new location. In this context the directory name is a so-called argument for the program cd. Typically, arguments are optional. The same is true for so-called options. The symbol - or -- in front of some characters usually indicates an option. So what is the difference between an argument and an option? Typically an argument is the element the program works on, and an option defines how the work is done.
$ command argument
$ command -option
Let's try out a few commands.
$ ls
$ ls -l
$ mkdir newdirname
$ ls -t
$ ls -l -t
$ cd newdirname
$ ls -lt
$ ls -s
$ ls -ls
$ cd ..
$ ls -a
What do all those options do? It's not necessary to remember all of them, even if they are handy sometimes. To find out how to use a certain program, there are typically three methods available offline that come with your system. The first method is the option -help or --help, available on most installed programs.
$ command --help
The second method is to use the manual database man. Just call man with the program of interest as an argument. You can navigate with the arrow keys and the page up/down keys. To quit man, press the key <q>. Remember it: <q> is also used to quit other frequently used programs.
$ man command
And the third method is similar to the manual database. It's called info. It is used in the same way as man.
Try to find out the meaning of the options in the previous ls example.
$ man ls
A manual can only be read if it is installed on the system where it is called. Which manuals are available differs between systems.
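The difference between arguments and options can be seen in a small experiment. This is a sketch under assumptions: the scratch directory and the file names are invented for illustration, and the directory is passed to ls as an argument instead of changing into it.

```shell
# Throwaway directory with one visible file, one hidden file and a subdirectory.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/visible.txt" "$demo/.hidden"

# Argument only: ls works on the directory given as argument.
plain=$(ls "$demo")

# Option -a plus argument: names starting with a dot are listed as well.
with_hidden=$(ls -a "$demo")

# Short options can be written separately or combined.
separate=$(ls -l -a "$demo")
combined=$(ls -la "$demo")

echo "$plain"
echo "$with_hidden"
[ "$separate" = "$combined" ] && echo "ls -l -a and ls -la print the same listing"
```

The argument tells ls what to work on, while the options -l and -a only change how the listing is produced.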
More details on directories and files
Making a directory
We will now make a subdirectory in your home directory to hold the files you will be creating and using in the course of this tutorial. To make a subdirectory called unixstuff in your current working directory, type
$ mkdir unixstuff
To see the directory you have just created, type
$ ls
Copying Files
cp file1 file2 is the command which makes a copy of file1 in the current working directory and calls it file2.
What we are going to do now is create a file in your home directory and use the cp command to copy it to your unixstuff directory.
First create a file in home:
$ echo "this is a text line" > ~/myfile.txt
Then, cd to your unixstuff directory:
$ cd ~/unixstuff
Now at the UNIX prompt, type,
$ cp ~/myfile.txt .
The above command means copy the file myfile.txt from your home directory (here the parent directory) to the current directory, keeping the name the same. The dot (.) at the end stands for the current directory.
Moving files
mv file1 file2 moves (or renames) file1 to file2. This has the effect of moving rather than copying the file, so you end up with only one file rather than two. It can also be used to rename a file, by moving the file to the same directory, but giving it a different name.
We are now going to move the copy of myfile.txt from unixstuff back to your home directory under the new name moved.txt. First, change directories to your unixstuff directory (can you remember how?). Then, inside the unixstuff directory, type
$ mv myfile.txt ~/moved.txt
Type ls and ls ~ to see if it has worked.
Removing files and directories
To delete (remove) a file, use the rm command. As an example, we are going to create a copy of the myfile.txt file and then delete it.
$ cp ~/myfile.txt tempfile.txt
$ ls
$ rm tempfile.txt
$ ls
You can use the rmdir command to remove a directory (make sure it is empty first). Try to remove the backups directory. You will not be able to, since rmdir will not let you remove a non-empty directory. If you wish to remove a directory and its contents, use the rm -r command, but be careful and make sure this is exactly what you want!
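The whole sequence of mkdir, cp, mv, rm and rmdir can be replayed without touching your real home directory. The sketch below assumes a scratch directory created with mktemp that stands in for home; the backups directory and old.txt are hypothetical and created only to show how rmdir behaves.

```shell
# A scratch directory stands in for your home directory.
demo=$(mktemp -d)
mkdir "$demo/unixstuff"
echo "this is a text line" > "$demo/myfile.txt"

# Copy into unixstuff, keeping the name; then move it back under a new name.
cp "$demo/myfile.txt" "$demo/unixstuff/"
mv "$demo/unixstuff/myfile.txt" "$demo/moved.txt"
ls "$demo"

# rmdir refuses to remove a directory that still contains files.
mkdir "$demo/backups"
touch "$demo/backups/old.txt"
rmdir "$demo/backups" 2>/dev/null || echo "rmdir failed: directory not empty"

# rm -r removes the directory together with its contents.
rm -r "$demo/backups"

# unixstuff is empty again, so rmdir succeeds here.
rmdir "$demo/unixstuff"
```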
Displaying the contents of a file on the screen
cat (concatenate)
The command cat can be used to display the contents of a file on the screen. Type:
$ cat myfile.txt
If a file is longer than the size of the window, it scrolls past, making it unreadable.
less
The command less writes the contents of a file onto the screen a page at a time. Type
$ less myfile.txt
Press [Space] if you want to see another page, and type [q] if you want to quit reading. You can also use the arrow keys and page up/down keys to scroll through the file as you would in a text editor. less is used in preference to cat for long files. To clear the terminal window of previous output, type
$ clear
This will clear all text and leave you with the prompt at the top of the window.
Searching the contents of a file
Simple searching using less
Using less, you can search through a text file for a keyword (pattern). Open a file. Then, still in less (i.e. don't press [q] to quit), type a forward slash [/] followed by the word to search for, such as /line, and press [Enter].
As you can see, less finds and highlights the keyword. Type [n] to search for the next occurrence of the word. You can also hold down [Shift] and type [n] to search for the previous occurrence of the word.
Grep
To follow this section, first prepare a text file named science.txt.
grep is one of many standard UNIX utilities. It searches files for specified words or patterns. First clear the screen, then type
$ grep science science.txt
As you can see, grep has printed out each line containing the word science, or has it?
Try typing
$ grep Science science.txt
The grep command is case sensitive by default; it distinguishes between Science and science. To ignore upper/lower case distinctions, use the -i option, i.e. type
$ grep -i science science.txt
To search for a phrase or pattern, you must enclose it in single quotes (the apostrophe symbol). For example to search for spinning top, type
$ grep -i 'spinning top' science.txt
Some of the other options of grep are:
- -v display those lines that do NOT match
- -n precede each matching line with the line number
- -c print only the total count of matched lines
Try some of them and see the different results. Don't forget that you can use more than one option at a time, for example, the number of lines without the words science or Science is
$ grep -ivc science science.txt
wc (word count)
A handy little utility is the wc command, short for word count. To do a word count on science.txt, type
$ wc -w science.txt
To find out how many lines the file has, type
$ wc -l science.txt
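Since science.txt is not provided here, the sketch below creates a tiny stand-in file; its four lines are invented for illustration and are not the original tutorial file. The counts are stored in variables so they can be compared.

```shell
# Hypothetical four-line replacement for science.txt in a scratch directory.
demo=$(mktemp -d)
cat > "$demo/science.txt" <<'EOF'
Science is organized knowledge.
Modern science relies on experiments.
A spinning top is a classic toy.
The scientific method is iterative.
EOF

# -c counts matching lines; the search is case sensitive by default.
exact=$(grep -c science "$demo/science.txt")
# -i ignores case, so "Science" matches as well.
nocase=$(grep -ic science "$demo/science.txt")
# -v inverts the match: lines containing neither science nor Science.
without=$(grep -ivc science "$demo/science.txt")
# A phrase with a space must be quoted.
phrase=$(grep -ic 'spinning top' "$demo/science.txt")

# Reading from standard input makes wc print only the number.
lines=$(wc -l < "$demo/science.txt")
words=$(wc -w < "$demo/science.txt")

echo "exact=$exact nocase=$nocase without=$without phrase=$phrase"
echo "lines=$lines words=$words"
```

Note that "scientific" does not contain the string science, so the last line of the sample file counts as a non-match.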
Summary | |
---|---|
cp file1 file2 | copy file1 and call it file2 |
mv file1 file2 | move or rename file1 to file2 |
mkdir dirname | creates directory |
rm file | remove a file |
rmdir directory | remove a directory |
cat file | display a file |
less file | display a file a page at a time |
grep 'keyword' file | search a file for keywords |
Redirecting Output
Typically, when you run a command, some form of output (a so-called stream) is generated. This output is divided into the standard output (1) and the error output (2). The error output receives all error messages, e.g., file not found, command crashed, … All other output is sent to the standard output. Both are typically printed to the terminal. But sometimes you want to save the output to a file. This can easily be done by redirecting the standard output to a specified file.
$ command -options arguments > outputfile.txt
Bear in mind that redirecting the output with the angle bracket > overwrites a previously existing file! However, it is possible to append the output to an existing file with the following syntax:
$ command >> existingfile.txt
When you do not specify which kind of output you want to redirect, only standard output is redirected. Error messages need their own redirection. Otherwise, the standard output is saved to a file, while error messages are printed to the terminal and are lost when the terminal is closed or its scrollback is full. For that reason you typically add a redirection which sends error messages to the same destination as the standard output:
$ command > outputfile.txt 2>&1
Of course, it's also possible to write error messages into a separate file:
$ command > outputfile.txt 2>errors.txt
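The redirection variants can be compared with a command that is known to write to both streams. In this sketch the shell function both is a hypothetical stand-in for a real program, and the file names are chosen only for illustration.

```shell
demo=$(mktemp -d)

# Hypothetical command: one line to standard output (1), one to error output (2).
both() {
    echo "normal output"
    echo "error message" >&2
}

# Standard output and error output go to separate files.
both > "$demo/out.txt" 2> "$demo/err.txt"

# Both streams go to the same file; 2>&1 must come after the file redirection.
both > "$demo/all.txt" 2>&1

# >> appends instead of overwriting, so all.txt now holds four lines.
both >> "$demo/all.txt" 2>&1

cat "$demo/out.txt"
cat "$demo/err.txt"
cat "$demo/all.txt"
```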
Now let's say you're running a long computation, such as an OpenFOAM solver, and have redirected all of its output to a log file, but you still want to watch the output continuously while the solver is running. Since not everyone has an OpenFOAM installation, the example below uses the command ping instead.
Run a command
$ ping -c 500 www.ovgu.de > ping.log
then open another terminal window and run tail with the option -f. It will continuously follow the file and print every new line to the terminal.
$ tail -f ping.log
To stop tail, press [CTRL] + [c]. This shortcut does not only work with tail: [CTRL] + [c] sends an interrupt signal to whichever command is currently running in the foreground of the active terminal window.
Piping
The previous section showed how to redirect output to a file. Often you need to process the output of one command with a second command. Let's say you want to count how many files there are in a directory. You could save the output of ls to a file and pass this file to wc to count the filenames. See the example:
$ ls
file1.txt file2.txt file3.txt video.mpeg plot.png
$ ls > /tmp/allfiles.txt
$ wc -l /tmp/allfiles.txt
5 /tmp/allfiles.txt
(The list is written outside the directory here; otherwise allfiles.txt itself would be counted as well.)
This is inconvenient. There is an easier method, which is called piping. This method allows you to send the output of a first command directly to a second command. The symbol | (vertical bar) is used to serially connect commands. To count files in a directory with piping, run:
$ ls | wc -l
5
As you may notice, no intermediate file is written with that method.
It's possible to repeatedly pipe commands. Theoretically, as many times as you need.
$ command1 | command2 | command3 | ....
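A pipeline with more than two stages might look as follows. This is a minimal sketch in a scratch directory; the file names mirror the listing used earlier in this section and are created empty just for counting.

```shell
# Scratch directory with the five example files.
demo=$(mktemp -d)
touch "$demo/file1.txt" "$demo/file2.txt" "$demo/file3.txt" \
      "$demo/video.mpeg" "$demo/plot.png"

# Two stages: list the names, then count the lines.
total=$(ls "$demo" | wc -l)

# Three stages: list, keep only names ending in .txt, then count.
txt=$(ls "$demo" | grep '\.txt$' | wc -l)

echo "files in total: $total"
echo "text files: $txt"
```

Each stage reads what the previous one writes, so no intermediate file appears in the directory and the counts are not distorted.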
Wildcards
The characters * and ?
The character * is called a wildcard, and will match against zero or more characters in a file (or directory) name.
$ ls list*
This will list all files in the current directory starting with list
Now try typing
$ ls *list
This will list all files in the current directory ending with list
The character ? will match exactly one character. So ls ?ouse will match files like house and mouse, but not grouse. Try typing
$ ls ?list
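Wildcards are expanded by the shell before the command runs, which can be made visible with echo. The sketch below uses invented file names in a scratch directory; the command substitution runs in a subshell, so your current directory is not changed.

```shell
# Scratch directory with a few empty files to match against.
demo=$(mktemp -d)
touch "$demo/list" "$demo/list1" "$demo/list22" "$demo/biglist" "$demo/alist" \
      "$demo/house" "$demo/mouse" "$demo/grouse"

# * matches zero or more characters.
starts=$(cd "$demo" && echo list*)
ends=$(cd "$demo" && echo *list)

# ? matches exactly one character.
one_before=$(cd "$demo" && echo ?list)
ouse=$(cd "$demo" && echo ?ouse)

echo "list*  -> $starts"
echo "*list  -> $ends"
echo "?list  -> $one_before"
echo "?ouse  -> $ouse"
```

The name list itself matches both list* and *list, because * may stand for zero characters, while grouse is not matched by ?ouse.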
File system security (access rights)
In your directory, type
$ ls -l
You will see that you now get lots of details about the contents of your directory, similar to the example below.
Each file (and directory) has associated access rights, which may be found by typing ls -l.
In the left-hand column is a 10 symbol string consisting of the symbols d, r, w, x, -, and, occasionally, s or S. If d is present, it will be at the left hand end of the string, and indicates a directory: otherwise "-" will be the starting symbol of the string.
The 9 remaining symbols indicate the permissions, or access rights, and are taken as three sets of 3.
- The left set of 3 gives the file permissions for the user that owns the file (or directory) (dhansen7 in the above example).
- The middle set of 3 gives the permissions for the group of people to whom the file (or directory) belongs (also dhansen7 in the above example). This group name may or may not be the same as the username. If the file is in your home directory, you are the only one who needs access to it and it is most likely the same. However, if you are working on files in an fsl group (files found in the fsl_groups subdirectory of your home directory), the group name is most likely different, as everyone in your group needs to have access to those files. If you are working with other users on files in an fsl group and one or more users of the group are unable to access certain files, it is most likely because this field is set incorrectly or because the group to which the file belongs is incorrect.
- The rightmost group gives the permissions for all others.
The symbols r, w, etc., have slightly different meanings depending on whether they refer to a simple file or to a directory.
Access rights on files.
- r (or -), indicates read permission (or otherwise), that is, the presence or absence of permission to read and copy the file
- w (or -), indicates write permission (or otherwise), that is, the permission (or otherwise) to change a file
- x (or -), indicates execution permission (or otherwise), that is, the permission to execute a file, where appropriate
Access rights on directories.
- r allows users to list files in the directory;
- w means that users may delete files from the directory or move files into it;
- x means the right to access files in the directory. This implies that you may read files in the directory provided you have read permission on the individual files.
So, in order to read a file, you must have execute permission on the directory containing that file, and hence on any directory containing that directory as a subdirectory, and so on, up the tree.
Some examples:
-rwxrwxrwx a file that everyone can read, write and execute (and delete).
-rw------- a file that only the owner can read and write; no one else can read or write it, and no one has execution rights (e.g. your mailbox file).
Changing access rights
Only the owner of a file (or the superuser) can use chmod to change the permissions of a file. The options of chmod are as follows:
Symbol | Meaning |
---|---|
u | user |
g | group |
o | other |
a | all |
r | read |
w | write (and delete) |
x | execute (and access directory) |
+ | add permission |
- | take away permission |
For example, to remove read write and execute permissions on the file biglist for the group and others, type
$ chmod go-rwx biglist
This will leave the other permissions unaffected.
To give read and write permissions on the file biglist to all (this is generally a bad idea),
$ chmod a+rw biglist
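The effect of both chmod calls can be checked by looking at the first ten characters of the ls -l output. This is a sketch under assumptions: biglist is an empty file in a scratch directory, the = operator (set exactly these permissions) is used only to get a known starting point and is not part of the table above, and cut is used to trim the listing.

```shell
demo=$(mktemp -d)
touch "$demo/biglist"

# Known starting point: owner read/write, group and others read only.
chmod u=rw,g=r,o=r "$demo/biglist"
before=$(ls -l "$demo/biglist" | cut -c1-10)

# Remove read, write and execute permission for group and others.
chmod go-rwx "$demo/biglist"
after=$(ls -l "$demo/biglist" | cut -c1-10)

# Give read and write permission to everyone (generally a bad idea).
chmod a+rw "$demo/biglist"
open=$(ls -l "$demo/biglist" | cut -c1-10)

echo "$before -> $after -> $open"
```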
Changing the ownership of a file
The chown command is used to change the ownership of a file or directory, that is, the user and group that own it. Unlike chmod, chown is restricted on Linux: changing the owning user normally requires superuser privileges, while the owner of a file may change its group to another group they belong to. The format of the command is as follows:
$ chown <user>:<group> <file>
So, if we wanted to change the owner of the file biglist to newuser and the group to newgroup, we would type:
$ chown newuser:newgroup biglist
Also, a useful option when using chown on directories is the -R option. If -R is used, the new ownership will be applied to all files and subdirectories contained within the directory. Suppose that I have a folder named sharedgroup which is not accessible to the rest of my group, but I want the group to have access to everything in it. In order to give the rest of the group access, I could run
$ chown -R myusername:mygroup sharedgroup
Processes and Jobs
A process is an executing program identified by a unique PID (process identifier). To see information about your processes, with their associated PID and status, type
$ ps
A process may be in the foreground, in the background, or be suspended. In general the shell does not return the UNIX prompt until the current process has finished executing.
Some processes take a long time to run and hold up the terminal. Backgrounding a long process has the effect that the UNIX prompt is returned immediately, and other tasks can be carried out while the original process continues executing.
Running background processes
To background a process, type an & at the end of the command line. For example, the command sleep waits a given number of seconds before continuing. Type
$ sleep 10
This will wait 10 seconds before returning the command prompt ($). To run sleep in the background, type
$ sleep 10 &
[1] 6259
The & runs the job in the background and returns the prompt right away, allowing you to run other programs while waiting for that one to finish.
The first line in the above example is typed in by the user; the next line, indicating job number and PID, is returned by the machine. The user is notified of a job number (numbered from 1) enclosed in square brackets, together with a PID, and is notified when a background process is finished. Backgrounding is useful for jobs which will take a long time to complete.
Backgrounding a current foreground process
At the prompt, type
$ sleep 100
You can suspend the process running in the foreground by holding down the [Ctrl] key and typing [Z] (written as ^Z). Then, to put it in the background, type
$ bg
Note: do not background programs that require user interaction e.g. pine
When a process is running, backgrounded or suspended, it will be added to a list along with a job number. To examine this list, type
$ jobs
[1] Suspended sleep 100
[2] Running top
[3] Running nano
To resume (foreground) a suspended process, type
$ fg %jobnumber
For example, to resume sleep 100, type
$ fg %1
Typing fg with no job number foregrounds the last suspended process.
Killing a process
kill (terminate or signal a process)
It is sometimes necessary to kill a process (for example, when an executing program is in an infinite loop)
To kill a job running in the foreground, type ^C ([Ctrl] + [C]). For example, run
$ sleep 100
and press ^C while it is running.
To kill a suspended or background process, type
$ kill %jobnumber
For example, run
$ sleep 100 &
$ jobs
If it is job number 4, type
$ kill %4
To check whether this has worked, examine the job list again to see if the process has been removed.
ps (process status)
Alternatively, processes can be killed by finding their process numbers (PIDs) and using kill PID_number
$ sleep 100 &
$ ps
PID TT S TIME COMMAND
20077 pts/5 S 0:05 sleep 100
21563 pts/5 T 0:00 top
21873 pts/5 S 0:25 nano
To kill off the process sleep 100, type
$ kill 20077
and then type ps again to see if it has been removed from the list.
If a process refuses to be killed, use the -9 option, i.e. type
$ kill -9 20077
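In a script you do not have to look up the PID with ps, because the shell stores the PID of the most recent background process in the variable $!. The sketch below relies on that standard shell feature; kill -0, which only tests whether a process exists, is an addition that this tutorial does not otherwise cover.

```shell
# Start a long-running process in the background and remember its PID.
sleep 100 &
pid=$!

# Signal 0 sends nothing; it only checks that the process exists.
kill -0 "$pid" && echo "process $pid is running"

# Ask the process to terminate (default signal TERM).
kill "$pid"

# wait collects the exit status; a status above 128 means "ended by a signal".
status=0
wait "$pid" || status=$?
echo "exit status: $status"

# The process is gone now, so the existence check fails.
kill -0 "$pid" 2>/dev/null || echo "process $pid no longer exists"
```

Only reach for kill -9 when the plain kill has no effect, since a process killed this way gets no chance to clean up.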
Persistent processes
If you just execute a program, such as batch.sh, in your terminal, it will run just fine.
However, almost all programs executed in a terminal will be attached to that very terminal.
This has implications.
If you close a terminal, or log out, then all attached programs will be shut down!
The same is true, if you lose the connection to a remote terminal (opened via ssh).
This behavior is undesired for most intensive computations. To avoid this issue, the computation has to be detached from the terminal.
To detach a program from a terminal, use the program nohup.
Its name is short for "no hang-up".
More detail here: nohup.
When you run a command of the following structure, the program (here batch.sh) will not be killed on logout!
$ nohup ./batch.sh > nohup.out 2>&1 &
The most important parts of this command are nohup and &.
Don't forget the last character (&)!
Be warned that a detached program can no longer be closed or killed as easily. A detached program will run as long as it wants. Make sure the program is able to shut itself down safely. Otherwise, you have to kill the detached program manually. Review the previous section to read how.
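One way to keep a detached program manageable is to write down its PID right after starting it. The following is a sketch under assumptions: the two-line batch.sh is a hypothetical stand-in for a real computation, the file batch.pid is an invented name, and the final wait is there only so the demonstration can show the log.

```shell
demo=$(mktemp -d)

# Hypothetical stand-in for batch.sh: a short "computation".
cat > "$demo/batch.sh" <<'EOF'
#!/bin/sh
sleep 1
echo "computation finished"
EOF

# Start it detached; all output, including error messages, goes to nohup.out.
nohup sh "$demo/batch.sh" > "$demo/nohup.out" 2>&1 &

# Save the PID so the program can still be killed after you log out and back in.
pid=$!
echo "$pid" > "$demo/batch.pid"

# For this demonstration only: wait until it is done, then show the log.
wait "$pid"
cat "$demo/nohup.out"
```

After a new login, a command such as kill followed by the number stored in batch.pid would stop the program if it misbehaves.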
This tutorial contains substantial parts from this tutorial:
https://marylou.byu.edu/documentation/unix-tutorial
This tutorial is licensed under CC BY-NC-SA 2.0.
Remixed by LSS, Sebastian Engel (2018), Previously modified by Daniel Hansen (2009), Original tutorial can be found at http://www.ee.surrey.ac.uk/Teaching/Unix/
M.Stonebank@surrey.ac.uk, 2002