---------------------------------------------------------------------
Basic LINUX (UNIX) Manual
---------------------------------------------------------------------
URL: http://faculty.ucr.edu/~tgirke/Documents/UNIX/linux_manual

INDEX
  (1)  INTRODUCTION
  (2)  BASICS
  (3)  FINDING HELP
  (4)  FINDING THINGS
  (5)  USEFUL UNIX COMMANDS
  (6)  JOB/PROCESS MANAGEMENT
  (7)  TEXT EDITORS
  (8)  THE UNIX SHELL
  (9)  SIMPLE SHELL SCRIPTS (Kevin)
  (10) REMOTE COPY
  (11) UNPACK FILES
  (12) PERMISSIONS & OWNERSHIP
  (13) SIMPLE INSTALLS (Josh)
  (14) DEVICES
  (15) DISPLAY AND ENVIRONMENT VARIABLES
  (16) EXERCISES

---------------------------------------------------------------------
1. INTRODUCTION
---------------------------------------------------------------------
Why UNIX?
  - Multitasking
  - Remote tasking ("real networking")
  - Multiuser
  - Access to shell, programming languages, databases, open-source projects
  - Better performance, less expensive (free), more up-to-date
  - Many more reasons

How to get access
  - Install on local machine (not required)
  - Get an account on one of the bioinformatics servers:
      1. email tgirke@citrus.ucr.edu
      2. requirements for Windows and Mac OS X: http://faculty.ucr.edu/~tgirke/GCG.htm#General

UNIX variants
  - UNIX: Solaris, IRIX, HP-UX, Tru64-UNIX, FreeBSD, LINUX, ...

LINUX distributions
  - RedHat, Debian, Mandrake, Caldera, Slackware, SuSE, ...

---------------------------------------------------------------------
2. BASICS
---------------------------------------------------------------------
Login
  from PuTTY (Windows):
    - open PuTTY and select ssh
    - provide host name (IP) and session name
        $ user name: ...
        $ password: ...
    - for graphics emulation see: http://kansas.ucr.edu/gcg/gcg_simple.html
    - download PuTTY: http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html
  from OS-X (Mac) or local Linux computer:
        $ ssh user@remote_host
        $ password: ...
  changing password:
        $ passwd                # follow instructions

Orientation
  $ pwd                         # present working directory
  $ ls                          # content of pwd
  $ ll                          # like ls, but provides additional info on files and directories
                                # (on most systems an alias for 'ls -l')
  $ ll -a                       # includes hidden files (.name) as well
  $ ll -R                       # lists subdirectories recursively
  $ ll -t                       # sorts files by modification time, newest first
  $ stat my_file                # provides all attributes of a file
  $ whoami                      # shows as who you are logged in
  $ hostname                    # shows on which machine you are

Files and directories
  $ mkdir my_dir                # creates specified directory
  $ cd my_dir                   # switches into specified directory
  $ cd ..                       # moves one directory up
  $ cd ../../                   # moves two directories up (and so on)
  $ cd                          # brings you to the top level of your home directory
  $ rmdir my_dir                # removes empty directory
  $ rm my_file                  # removes file
  $ rm -r my_dir                # removes directory including its content; may ask for confirmation
                                # (e.g. when rm is aliased to 'rm -i'), the 'f' argument turns confirmation off
  $ mv name1 name2              # renames directories or files
  $ mv name path                # moves file/directory as specified in path
  $ cp name path                # copies file/directory as specified in path (-r to include content of directories)

Copy and paste
  Depends on local environment. Usually one of the following methods works:
    Copy:  Ctrl&Shift&c or right/middle mouse click
    Paste: Ctrl&Shift&v or right/middle mouse click

Handy shortcuts
  $ .                           # refers to pwd
  $ ~/                          # refers to user's home directory
  $ history                     # shows all commands you have used recently
  $ !number                     # reruns an old command by providing its history ID number
  $ up(down)_key                # scrolls through command history
  $ TAB                         # completes path/file_name
  $ SHIFT&TAB                   # completes command
  $ Ctrl a                      # cursor to beginning of command line
  $ Ctrl e                      # cursor to end of command line
  $ Ctrl d                      # delete character under cursor
  $ Ctrl k                      # delete line from cursor, content goes into kill buffer
  $ Ctrl y                      # paste content from Ctrl k

---------------------------------------------------------------------
3. FINDING HELP
---------------------------------------------------------------------
  $ man my_command              # general help on a command
  $ man wc                      # manual on program 'word count' wc
  $ wc --help                   # short help on wc
  $ info wc                     # more detailed information system (GNU)
  $ apropos wc                  # retrieves manual pages where wc appears

---------------------------------------------------------------------
4. FINDING THINGS
---------------------------------------------------------------------
Files, applications and directories
  $ find . -name myfile                 # searches for file 'myfile' in and below current directory
  $ find -name "*pattern*"              # searches for *pattern* in and below current directory
  $ find /usr/local -name "*blast*"     # finds file names *blast* in specified directory
  $ find /usr/local -iname "*blast*"    # same as above, but case insensitive
      additional useful arguments: -user user_name, -group group_name, -ctime number_of_days
  $ find ~ -type f -mtime -2            # finds all files you have modified in the last two days
  $ locate pattern                      # finds files and dirs that are listed in the locate database
                                        # (the database is refreshed by 'updatedb')
  $ which my_application                # location of application
  $ whereis my_application              # searches for executables in a set of standard directories,
                                        # doesn't depend on your path
  $ dpkg -l | grep mypattern            # finds Debian packages and refines search with grep pattern

In files
  $ grep pattern file                   # prints lines in 'file' where pattern appears; if the pattern contains
                                        # characters that are special to the shell, use single quotes: '>'
  $ grep -H pattern files               # -H prints file name in front of each matching line
  $ grep 'pattern' file | wc            # pipes lines with pattern into word count wc (see chapter 8)
      wc arguments: -c: show only bytes, -w: show only words, -l: show only lines
  help on regular expressions: $ man 7 regex or man perlre

---------------------------------------------------------------------
5. USEFUL UNIX COMMANDS
---------------------------------------------------------------------
  $ df                          # disk space
  $ free                        # memory info
  $ uname -a                    # shows tech info about machine
  $ bc                          # command-line calculator (to exit type 'quit')
  $ /sbin/ifconfig              # gives IP and other network info
  $ ln -s original_file link_name   # creates symbolic link 'link_name' pointing to 'original_file',
                                # so that the file can be opened from another directory
  $ du -sh                      # displays disk space used by current directory (e.g. a user account)
  $ du -sh *                    # displays usage of each file/directory in it (e.g. individual accounts)
  $ du -s * | sort -nr          # sorts output by size
  $ cat /proc/cpuinfo           # prints CPU info of system

---------------------------------------------------------------------
6. JOB/PROCESS MANAGEMENT
---------------------------------------------------------------------
  $ who                         # who is logged into system
  $ w                           # shows which users are logged into system and what they are doing
  $ ps                          # shows processes run by user
  $ ps -e                       # shows all processes on system; try also '-a' and '-x' arguments
  $ ps aux | grep user_name     # shows all processes of one user
  $ top                         # view top consumers of memory and CPU
  $ Ctrl z, then bg or fg       # Ctrl z suspends a process; 'bg' continues it in the background,
                                # 'fg' brings it back to the foreground
  $ Ctrl c                      # stops an initiated process
  $ kill process_ID             # kills specified process; if this doesn't do it, add -9 as argument.
                                # Jobs started from the current shell can also be addressed
                                # by job number, e.g. %1 (see 'jobs')
  $ renice -n priority_value -p process_ID
                                # changes priority value; values range from -20 to 19 (regular users can
                                # only use 0 to 19), the higher the value the lower the priority, default is 0

---------------------------------------------------------------------
7. TEXT EDITORS
---------------------------------------------------------------------
VI and VIM
  Non-graphical (terminal-based) editor. Vi is available on virtually any UNIX system.
  Vim is an improved version of vi.
EMACS
  Powerful window-based (X Windows) editor. You still need to know keystroke commands to use it.
  Installed on most Linux distributions and on most other Unix systems.
XEMACS
  More sophisticated version of emacs, but usually not installed by default. All common commands
  are available from menus. Very powerful editor, with built-in syntax checking, Web-browsing,
  news-reading, manual-page browsing, etc.
PICO
  Simple terminal-based editor available on most versions of Unix. Uses keystroke commands,
  but they are listed in logical fashion at bottom of screen.

VIM MANUAL (essentials marked with '>')

BASIC EDITING
> $ vim my_file_name            # open/create file with vim
> $ ESC                         # NORMAL (NON-EDITING) MODE
> $ i                           # INSERT MODE
  $ R                           # REPLACE MODE
  $ r                           # replace only one character under cursor
  $ :w                          # save; if you are in insert mode, you have to hit ESC first
  $ :q                          # quit file (refuses if there are unsaved changes)
> $ :q!                         # exits WITHOUT saving any changes you have made
> $ :wq                         # saves and exits
  $ q:                          # history of commands (from NORMAL MODE), to reexecute one of them,
                                # select it and hit enter
  $ :w new_filename             # saves into new file
  $ :#,#w new_filename          # saves specific lines (#,#) to new file
  $ o                           # open new line below cursor (from NORMAL MODE)
  $ O                           # open new line above cursor (from NORMAL MODE)

HELP
  Useful list of vim commands: http://www.fprintf.net/vimCheatSheet.html
  $ vimtutor                    # open vim tutorial from shell
  $ :help                       # opens help within vim, hit :q to get back to your file
  $ :help topic                 # opens help on specified topic
  $ |help_topic| CTRL-]         # when you are in help this command opens help topic specified
                                # between |...|, CTRL-t brings you back to last topic
  $ :help topic CTRL-D          # gives list of help topics that contain key word
  $ : up_key                    # like in the shell, this recalls recent commands

MOVING AROUND IN FILE
  $ $                           # moves cursor to end of line
  $ A                           # same as $, but switches to insert mode
  $ 0 (zero)                    # moves cursor to beginning of line
  $ CTRL-g                      # shows at status line filename and the line you are on
  $ SHIFT-G                     # brings you to bottom of file
  $ line_number SHIFT-G         # type line number (isn't displayed) then SHIFT-G: brings you to specified line

DISPLAY WRAPPING AND LINE NUMBERS
  $ :set nowrap                 # no word wrapping, :set wrap # back to wrapping
  $ :set number                 # shows line numbers, :set nonumber # back to no-number mode

SPLITTING WINDOWS
  $ :split                      # shows same file in two windows
  $ :split my_file              # opens second file in new window
  $ :vsplit                     # splits windows vertically, very useful for tables;
                                # ":set scrollbind" lets you scroll all open windows simultaneously
  $ CTRL-w w                    # switch between windows
  $ :close                      # closes current window
  $ :only                       # closes all windows except current one

SPELL CHECKING & DICTIONARY
  $ aspell -c my_file           # shell command, interactive spell check of file
  $ aspell -l < my_file         # shell command, lists misspelled words
  $ dict my_word                # shell command (in Redhat 8.0)
  Help on how to integrate with Vim can be found at http://www.highley-recommended.com/text-process

PRINTING FILE
  $ :ha                         # prints entire file
  $ :#,#ha                      # prints specified lines: #,#

MERGING/INSERTING FILES
  $ :r my_file                  # inserts content of specified file below the cursor line

UNDO/REDO
  $ u                           # undo last command
  $ U                           # undo all changes on current line
  $ CTRL-R                      # redo one change which was undone

DELETION/CUT (be in NORMAL mode)
  $ x                           # deletes what is under cursor
  $ dw                          # deletes from cursor to end of word including the space
  $ de                          # deletes from cursor to end of word NOT including the space
  $ cw                          # deletes rest of word and lets you then insert, hit ESC to continue with NORMAL mode
  $ c$                          # deletes rest of line and lets you then insert, hit ESC to continue with NORMAL mode
  $ d$                          # deletes from cursor to the end of the line
  $ dd                          # deletes entire line
  $ 2dd                         # deletes two lines (current and next), continues: 3dd, 4dd and so on

PUT (PASTE)
  $ p                           # pastes what was deleted/cut behind cursor

COPY & PASTE
  $ yy                          # copies line, for copying several lines do 2yy, 3yy and so on
  $ p                           # pastes clipboard behind cursor

SEARCH IN FILE (most regular expressions work here)
  $ /my_pattern                 # searches for my_pattern downwards, type n for next match
  $ ?my_pattern                 # searches for my_pattern upwards, type n for next match
  $ :set ic                     # switches to ignore case search (case insensitive)
  $ :set hls                    # switches to highlight search (highlights search hits)

REPLACE (most regular expressions work here)
  $ :s/old_pat/new_pat/         # replaces first occurrence in a line
  $ :s/old_pat/new_pat/g        # replaces all occurrences in a line
  $ :s/old_pat/new_pat/gc       # add 'c' to ask for confirmation
  $ :#,#s/old_pat/new_pat/g     # replaces all occurrences between line numbers: #,#
  $ :%s/old_pat/new_pat/g       # replaces all occurrences in file
  $ :%s/\(pattern1\)\(pattern2\)/\1test\2/g
                                # regular expression to insert; unlike in Perl, you need here '\'
                                # in front of the grouping parentheses
  $ :%s/\(pattern.*\)/\1 my_tag/g
                                # appends something to line containing pattern
                                # (Perl's '.+' must be written '.\+' in Vim; '.*' works unchanged)
  $ :%s/\(pattern\)\(.*\)/\1/g  # removes everything in lines after pattern
  $ :%s/\(At\dg\d\d\d\d\d\.\d\)\(.*\)/\1\t\2/g
                                # inserts tabs between At1g12345.1 and Description
  $ :%s/\n/new_pattern/g        # replaces return signs
  $ :%s/pattern/\r/g            # replaces pattern with return signs
  $ :%s/\(\n\)/\1\1/g           # inserts additional return signs
  $ :%s/\(^At\dg\d\d\d\d\d.\d\t.\{-}\t.\{-}\t.\{-}\t.\{-}\t\).\{-}\t/\1/g
                                # replaces content between 5th and 6th tab (5th column),
                                # '{-}' turns off 'greedy' behavior
  $ :#,#s/\( \{-} \|\.\|\n\)/\1/g
                                # performs simple word count in specified range of text

MATCHING PARENTHESES SEARCH
  - place cursor on (, [ or { and type %   # cursor moves to matching parenthesis

HTML EDITING
  - Convert text file to html format:
  $ :runtime! syntax/2html.vim  # run this command with open file in Vim

SHELL COMMAND IN VIM
  $ :!my_command                # executes any shell command, hit ENTER to return
  $ :sh                         # switches window to shell, 'exit' switches back to vim

MODIFY VIM SETTINGS (in file .vimrc)
  - see last chapter of vimtutor (start from shell)
  - when vim starts to respond very slowly, you may want to delete the .viminf* files
    in your home directory

---------------------------------------------------------------------
8. THE UNIX SHELL
---------------------------------------------------------------------
When you log into UNIX/LINUX the system starts a program called SHELL. It provides you
with a working environment and an interface to the operating system. Usually there are
many different shell programs installed.
  $ finger user_name            # shows which shell you are using
  $ chsh -l                     # gives list of shell programs available on your system
                                # (does not work on all UNIX variants)
  $ shell_name                  # switches to different shell

STDIN, STDOUT, REDIRECTORS, OPERATORS & WILDCARDS
By default, many UNIX commands read from standard input (STDIN) and send their output
to standard out (STDOUT). You can redirect them by using the following commands:
  $ ls > file                   # prints ls output into specified file
  $ command < my_file           # uses file after '<' as STDIN
  $ command >> my_file          # appends output of one command to file
  $ grep pattern file | wc      # pipes (|) output of 'grep' into 'wc'
  $ file.*                      # wildcard (*) to specify many files

Useful shell commands:
  $ more my_file                # views text, use space bar to browse, hit 'q' to exit
  $ less my_file                # more versatile text viewer than 'more'
      Basics for 'less':
        - 'SHIFT G' moves to end of text, 'g' to beginning
        - /pattern              # find forwards
        - ?pattern              # find backwards
  $ cat file1 file2 > cat.out   # concatenates files in output file 'cat.out'
  $ paste file1 file2 > paste.out
                                # merges lines of files and separates them by tabs (useful for tables)
  $ cmp file1 file2             # tells you whether two files are identical
  $ diff fileA fileB            # finds differences between two files
  $ head -number my_file        # prints first lines of a file
  $ tail -number my_file        # prints last lines of a file
  $ split -l number my_file     # splits lines of file into many smaller ones
  $ csplit -f out fasta_batch "%^>%" "/^>/" "{*}"
                                # splits fasta batch file into many files at '>'
  $ join -1 1 -2 1 table1 table2
                                # joins two tables based on specified column numbers
                                # (-1: column in file1, -2: column in file2); assumes join fields
                                # are sorted, if that is not the case, do the following:
  $ sort table1 > table1a; sort table2 > table2a; join -a 1 -t "`echo -e '\t'`" table1a table2a > table3
      -a table_number           # prints all lines of specified table. Default prints only the
                                # lines the two tables have in common.
      -t "`echo -e '\t'`"       # forces join to use tabs as field separator. Default is space(s).
  $ sort my_file                # sorts single file, many files and can merge (-m) them,
                                # -b ignores leading white space, ...
  $ sort -k 2,2 -k 3,3n input_file > output_file
                                # sorts table by column 2 alphabetically and by column 3 numerically,
                                # -k for column, -n for numeric
  $ sort input_file | uniq > output_file
                                # uniq command removes duplicates and makes file/table with unique lines/fields
  $ cat my_table | cut -d , -f1-3
                                # cut command prints only specified sections of a table, -d specifies
                                # here comma as column separator (tab is default), -f specifies column numbers
  $ du -s * | sort -nr          # shows disk space used by different directories/files sorted by size
  $ grep and egrep              # see chapter 4

---------------------------------------------------------------------
9. SIMPLE SHELL SCRIPTS
---------------------------------------------------------------------
Useful One-Liners
  $ for i in *.input; do mv "$i" "${i/name\.old/name\.new}"; done
                                # renames file name.old to name.new
      - To test things first, insert 'echo' between 'do' and 'mv' (above).
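
The dry-run advice above can be sketched roughly as follows. This is only an illustration under stated assumptions: the scratch directory and the file names (sample1.old.input, sample2.old.input) are made up for the example, and bash is assumed because the `${i/old/new}` substitution is a bash feature.

```shell
#!/bin/bash
# Work in a throw-away directory; directory and file names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch sample1.old.input sample2.old.input

# Step 1 (dry run): 'echo' prints the mv commands instead of running them.
for i in *.input; do
    echo mv "$i" "${i/.old/.new}"
done

# Step 2: once the printed commands look right, run the loop without 'echo'.
for i in *.input; do
    mv "$i" "${i/.old/.new}"
done

# Collect the resulting file names for inspection.
renamed=$(ls)
echo "$renamed"
```

Quoting "$i" keeps the loop safe for file names that contain spaces.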
  $ for i in *.input; do ./application "$i"; done
                                # runs application in loops on many input files
  $ for i in *.input; do fastacmd -d /data/../database_name -i "$i" > "$i.out"; done
                                # runs fastacmd in loops on many *.input files and creates *.input.out files
  $ for i in *.pep; do target99 -db /usr/../database_name -seed "$i" -out "$i"; done
                                # runs SAM's target99 on many input files
  $ for j in 0 1 2 3 4 5 6 7 8 9; do grep -iH my_pattern *$j.seq; done
                                # searches in > 10,000 files for pattern and prints occurrences
                                # together with file names
  $ for i in *.pep; do echo -e "$i\n\n17\n33\n\n\n" | ./tmpred "$i" > "$i.out"; done
                                # example of how to run an interactive application (tmpred) that
                                # asks for file name input/output

How to write a script
  - create file which contains in first line: #!/bin/bash
  - place shell commands in file
  - run 'chmod +x my_shell_script' to make it executable
  - run shell script like this: ./my_shell_script
  - when you place it into /usr/local/bin, any user can run it by typing only its name

---------------------------------------------------------------------
10. REMOTE COPY: wget, scp and ncftp
---------------------------------------------------------------------
WGET (file download from the www)
  $ wget http://www...          # file download from www

SCP (secure copy between machines)
  General syntax
  $ scp source target           # copies from source to target
      Use form 'userid@machine_name' if your local and remote user ids are different.
      If they are the same you can use only 'machine_name'.
  Examples
  1) Copy file from Server to Local Machine (type from local machine prompt):
  $ scp user@remote_host:file.name .
                                # '.' copies to pwd, you can specify here any directory,
                                # use wildcards to copy many files at once
  2) Copy file from Local Machine to Server:
  $ scp file.name user@remote_host:~/dir/newfile.name
  3) Copy entire directory from Server to Local Machine (type from local machine prompt):
  $ scp -r user@remote_host:directory/ ~/dir
  4) Copy entire directory from Local Machine to Server (type from local machine prompt):
  $ scp -r directory/ user@remote_host:directory/
  5) Copy between two remote hosts (e.g. from biocore to cache):
     similar to 1) - 4), just be logged into one of the remote hosts, e.g.:
  $ scp -r directory/ user@remote_host:directory/

NICE FTP
  $ ncftp                       # start ncftp
  $ ncftp> open ftp.ncbi.nih.gov
  $ ncftp> cd /blast/executables
  $ ncftp> get blast.linux.tar.Z (skip extension: @)
  $ ncftp> bye

---------------------------------------------------------------------
11. UNPACK FILES
---------------------------------------------------------------------
  $ gunzip my_file.gz           # (or uncompress my_file.Z, or bunzip2 for file.tar.bz2)
  $ tar xvf my_file.tar
  try:
    tar zxf blast.linux.tar.Z
    tar xvzf file.tgz
  options:
    f: use archive file
    p: preserve permissions
    v: list files processed
    x: extract files from archive
    z: filter the archive through gzip

---------------------------------------------------------------------
12. PERMISSIONS & OWNERSHIP
---------------------------------------------------------------------
  $ ls -al                      # shows something like this for each file/dir: drwxrwxrwx
      d                         # directory
      rwx                       # read write execute
      first triplet             # user (u)
      second triplet            # group (g)
      third triplet             # other or world (o)

To assign read and execute permissions to user and group:
  $ chmod ug+rx tgirke/
To remove all permissions from all three user groups:
  $ chmod ugo-rwx tgirke/
      '+' causes the permissions selected to be added
      '-' causes them to be removed
      '=' causes them to be the only permissions that the file has.
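
The effect of '+', '-' and '=' can be checked on a scratch file, as in the following minimal sketch. The temporary file is an assumption made for the example (any file you own would do), and the octal call at the end is shown only for comparison with the symbolic forms.

```shell
#!/bin/bash
# Scratch file created only for this demonstration.
f=$(mktemp)

chmod ugo-rwx "$f"                       # '-' removes: start with no permissions at all
chmod ug+r "$f"                          # '+' adds: user and group may now read
perm_plus=$(ls -l "$f" | cut -c1-10)

chmod u=rw,go= "$f"                      # '=' sets exactly: user rw, nothing for group/other
perm_equal=$(ls -l "$f" | cut -c1-10)

chmod 644 "$f"                           # octal form of rw-r--r--
perm_octal=$(ls -l "$f" | cut -c1-10)

# Show the three permission strings side by side.
echo "$perm_plus $perm_equal $perm_octal"
rm -f "$f"
```

The first ten characters of 'ls -l' are the file type followed by the three triplets described above.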
Example: public_html folder:
  $ chmod +rx public_html/
  or
  $ chmod 755 public_html/

CHANGE OWNERSHIP
  $ chown user my_file          # changes user ownership
  $ chgrp group my_file         # changes group ownership
  $ chown user:group my_file    # changes user & group ownership

---------------------------------------------------------------------
13. SIMPLE INSTALLS
---------------------------------------------------------------------
APPLICATIONS
Applications for general use (job of system admin)
  - to find out if an application is installed type:
  $ which my_application
  $ whereis my_application      # searches for executables in a set of standard directories,
                                # doesn't depend on your path
  - most applications are installed in /usr/local/bin or /usr/bin; you need root permissions
    to install there, so send us an email about what you need.
  - Perl scripts go into /usr/local/bin, Perl modules (*.pm) into /usr/local/share/perl/5.8.0/.
    To copy executables in a batch use command: cp `find -perm -111 -type f` /usr/local/bin

Applications in user accounts
  - create a new directory, download application into this directory and unpack it (see chapter 11)
  - usually you can then already run this application when you specify its location,
    e.g.: /home/user/my_app/blastall
  - if you want you can add this directory to your PATH by typing from this directory:
  $ PATH=.:$PATH; export PATH   # this allows you to run application by providing only its name;
                                # when you do echo $PATH you will see .: added to PATH
  - installation of RPMs:
  $ rpm -i application_name.rpm
  - to check which version of RPM package is installed:
  $ rpm --query package_name
  - Help and upgrade files for RPMs can be found at http://rpmfind.net/

Quick install from Debian project
  - Check whether your application is available at: http://www.debian.org/intro/about,
    then you type (no download):
  $ apt-cache search phylip     # searches for application "phylip" from command line
  $ apt-cache show phylip       # provides description of program
  $ apt-get install phylip      # example for phylip install, manuals can be found in /usr/doc/phylip/,
                                # use zless or lynx to read documentation (don't unzip)
  $ apt-get update              # do once a month to update the Debian package lists
  $ apt-get upgrade -u          # to upgrade after update from above
  $ dpkg -i my_package.deb      # installs Debian package from local package file (e.g. after download)

---------------------------------------------------------------------
14. DEVICES
---------------------------------------------------------------------
Mount/unmount floppy/cdrom
  $ mount /mnt/floppy
  $ mount /mnt/cdrom
  $ eject /mnt/floppy

---------------------------------------------------------------------
15. DISPLAY AND ENVIRONMENT VARIABLES
---------------------------------------------------------------------
  $ xhost user@host             # adds X permissions for user on server
  $ echo $DISPLAY               # shows current display settings
  $ export DISPLAY=:0           # changes environment variable (csh/tcsh: setenv DISPLAY :0)
  $ unset DISPLAY               # removes display variable (csh/tcsh: unsetenv DISPLAY)
  $ printenv                    # prints all environment variables
  $ echo $PATH                  # list of directories that the shell will search when you type a command
  - You can edit your default DISPLAY setting for your account by adding it to file .bash_profile.

---------------------------------------------------------------------
16. EXERCISES (start after you have read chapter 8)
---------------------------------------------------------------------
Exercise 1
  a) Download proteome of Halobacterium spec. from
     ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/Halobacterium_sp/AE004437.faa
     (use wget or netscape for download)
  b) How many predicted proteins are there?
     $ grep '>' AE004437.faa | wc
  c) How many proteins contain the pattern "WxHxxHH"?
     $ grep 'W.H..HH' AE004437.faa | wc
  d) Use the find function (/) in 'less' to fish out the proteins containing this pattern
  e) Run blastall with those proteins against the Human genome, like this:
     blastall -p tblastn -i input.file -d /usr/local/blast/db/NCBI/human/hs_chr -o blast.out -v 10 -b 10
  f) Parse blastall output into Excel spread sheet:
     - using biocore parser
       blastParse -c -i -o
     - using BioPerl parser
       ./bioblastParse.pl < blast.out

Exercise 2
  a) split sample fasta batch file with csplit (use sequence file from exercise 1)
  b) concatenate single fasta files from (a) to one batch file
  c) BLAST two related sequences, retrieve the result in table format and use join
     to identify common hit IDs in the two tables

Exercise 3
  a) write a shell script that executes several BLAST searches at once
     (give each search its own input and output file):
     #!/bin/sh
     blastall -p blastp -d /.../my_database -i /.../my_input1 -o my_out1 -e 1e-6 -v 10 -b 10 &
     blastall -p blastp -d /.../my_database -i /.../my_input2 -o my_out2 -e 1e-6 -v 10 -b 10 &

Exercise 4
  a) download an application, install and run it, e.g.:
     MULTALIN: ftp://ftp.toulouse.inra.fr/pub/multalin

---------------------------------------------------------------------
Webpage update: scp linux_manual tgirke@cache.ucr.edu:~/public_html/Documents/UNIX/linux_manual
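
For Exercise 3, the idea of starting several searches at once with '&' can be tried without BLAST installed. The sketch below is only a stand-in under stated assumptions: 'sleep' plus 'echo' replaces the long-running blastall calls, the output names out1 and out2 are hypothetical, and 'wait' is an addition that makes the script pause until all background jobs have finished.

```shell
#!/bin/bash
# Scratch directory for the two made-up output files.
outdir=$(mktemp -d)

# Each subshell stands in for one long-running search; '&' puts it in the background.
( sleep 1; echo "result for query1" > "$outdir/out1" ) &
( sleep 1; echo "result for query2" > "$outdir/out2" ) &

# Block here until both background jobs are done.
wait

# Count the output files that were written.
n_done=$(( $(ls "$outdir" | wc -l) ))
echo "$n_done output files written"
```

Because both jobs run side by side, the script takes about one second rather than two.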