LINUX ESSENTIALS



Author: Thomas Girke, UC Riverside

New version of this manual

Index

  1. INTRODUCTION
  2. BASICS
  3. UNIX HELP
  4. FINDING THINGS
  5. PERMISSIONS & OWNERSHIP
  6. USEFUL COMMANDS
  7. JOB/PROCESS MANAGEMENT
  8. TEXT VIEWING
  9. TEXT EDITORS
  10. THE UNIX SHELL
  11. SIMPLE SHELL ONE-LINER SCRIPTS
  12. SIMPLE PERL ONE-LINER SCRIPTS
  13. REMOTE COPY
  14. ARCHIVING AND COMPRESSING
  15. SIMPLE INSTALLS
  16. DEVICES
  17. ENVIRONMENT VARIABLES
  18. EXERCISES

  1. INTRODUCTION
    Why UNIX?

    How to get access

    UNIX variants
    • UNIX: Solaris, IRIX, HP-UX, Tru64-UNIX, Free's, LINUX, ...

    LINUX distributions

  2. BASICS
    Syntax for this manual

    Login from Windows:

    Login from Mac OS-X or LINUX

    Changing password:

    Orientation

    Files and directories

    Copy and paste

    Handy shortcuts

  3. UNIX HELP
  4. FINDING THINGS
    Finding files, directories and applications

    Finding things in files

  5. PERMISSIONS & OWNERSHIP
    How does it work

    To assign write and execute permissions to user and group:
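A minimal sketch of this step, assuming the standard symbolic mode syntax of `chmod`. The file name `my_file` and the scratch directory are placeholders created only for the demonstration.

```shell
# Create a scratch file with a known starting mode (rw-r--r--).
workdir=$(mktemp -d)
touch "$workdir/my_file"
chmod 644 "$workdir/my_file"

# u = user (owner), g = group; '+' adds the listed permissions (w = write, x = execute).
chmod ug+wx "$workdir/my_file"

# Keep the first ten characters of the long listing: file type plus permission bits.
perms=$(ls -l "$workdir/my_file" | cut -c1-10)
echo "$perms"
```

The read permission of "others" is left untouched because only `u` and `g` are named.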

    To remove all permissions from all three user groups:
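A minimal sketch of this step, again assuming the symbolic mode syntax of `chmod`; `my_file` is a placeholder created in a scratch directory.

```shell
# Create a scratch file that starts out as rwxr-xr-x.
workdir=$(mktemp -d)
touch "$workdir/my_file"
chmod 755 "$workdir/my_file"

# u = user, g = group, o = others; '-' removes read, write and execute from all three.
chmod ugo-rwx "$workdir/my_file"

# Keep the first ten characters of the long listing: file type plus permission bits.
perms=$(ls -l "$workdir/my_file" | cut -c1-10)
echo "$perms"
```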

    Change ownership

  6. USEFUL UNIX COMMANDS
  7. JOB/PROCESS MANAGEMENT
  8. TEXT VIEWING
  9. TEXT EDITORS
    VI and VIM

    EMACS

    XEMACS

    PICO

    VIM MANUAL
    BASICS
      $ vim my_file_name # open/create file with vim
      $ i # INSERT MODE
      $ ESC # NORMAL (NON-EDITING) MODE
      $ : # commands start with ':'
      $ :w # save command; if you are in editing mode you have to hit ESC first!!
      $ :q # quit file; works only if there are no unsaved changes
      $ :q! # exits WITHOUT saving any changes you have made
      $ :wq # save and quit
      $ R # replace MODE
      $ r # replace only one character under cursor
      $ q: # history of commands (from NORMAL MODE!), to reexecute one of them, select and hit enter!
      $ :w new_filename # saves into new file
      $ :#,#w new_filename # saves specific lines (#,#) to new file
      $ :# # go to specified line number

    HELP
      $ Useful list of vim commands: Vim Commands Cheat Sheet, VimCard, Vim Basics
      $ vimtutor # open vim tutorial from shell
      $ :help # opens help within vim, hit :q to get back to your file
      $ :help <topic> # opens help on specified topic
      $ |help_topic| CTRL-] # when you are in help this command opens help topic specified between |...|, CTRL-t brings you back to last topic
      $ :help <topic> CTRL-D # gives list of help topics that contain key word
      $ : <up-down keys> # like in shell you get recent commands!!!!

    MOVING AROUND IN FILE
      $ $ # moves cursor to end of line
      $ A # same as $, but switches to insert mode
      $ 0 (zero) # moves cursor to beginning of line
      $ CTRL-g # shows at status line filename and the line you are on
      $ SHIFT-G # brings you to bottom of file
      $ <line number> SHIFT-G # type the line number (it isn't displayed), then SHIFT-G brings you to the specified line

    DISPLAY
      WRAPPING AND LINE NUMBERS
      $ :set nowrap # no word wrapping, :set wrap # back to wrapping
      $ :set number # shows line numbers, :set nonumber # back to no-number mode

    WORKING WITH MANY FILES & SPLITTING WINDOWS
      $ vim *.txt # opens many files at once; ':n' switches between files
      $ :wall or :qall # write or quit all open files
      $ vim -o *.txt # opens many files at once and displays them with horizontal split, '-O' does vertical split
      $ :args *.txt # places all the relevant files in the argument list
      $ :all # splits all files in the argument list (buffer) horizontally
      $ CTRL-w w # switch between windows
      $ :split # shows same file in two windows
      $ :split <file-to-open> # opens second file in new window
      $ :vsplit # splits windows vertically, very useful for tables, ":set scrollbind" lets you scroll all open windows simultaneously
      $ :close # closes current window
      $ :only # closes all windows except current one

    SPELL CHECKING & Dictionary
      $ aspell -c <file> # shell command: checks the spelling of a file interactively
      $ aspell -l <my_file> # shell command
      $ :! dict <word> # meaning of word
      $ :! wn 'word' -over # synonyms of word

    PRINTING FILE
      $ :ha # prints entire file
      $ :#,#ha # prints specified lines: #,#

    MERGING/INSERTING FILES
      $ :r <filename> # inserts content of specified file below the line the cursor is on

    UNDO/REDO
      $ u # undo last command
      $ U # undo all changes on current line
      $ CTRL-R # redo one change which was undone

    DELETION/CUT (switch to NORMAL mode)
      $ x # deletes what is under cursor
      $ dw # deletes from cursor to end of word including the space
      $ de # deletes from cursor to end of word NOT including the space
      $ cw # deletes rest of word and lets you then insert, hit ESC to continue with NORMAL mode
      $ c$ # deletes rest of line and lets you then insert, hit ESC to continue with NORMAL mode
      $ d$ # deletes from cursor to the end of the line
      $ dd # deletes entire line
      $ 2dd # deletes two lines (the current line and the next one), continues: 3dd, 4dd and so on.

    PUT (PASTE)
      $ p # uses what was deleted/cut and pastes it behind cursor

    COPY & PASTE
      $ yy # copies line, for copying several lines do 2yy, 3yy and so on
      $ p # pastes clipboard behind cursor

    SEARCH IN FILE (most regular expressions work)
      $ /my_pattern # searches for my_pattern downwards, type n for next match
      $ ?my_pattern # searches for my_pattern upwards, type n for next match
      $ :set ic # switches to ignore case search (case insensitive)
      $ :set hls # switches to highlight search (highlights search hits)

    REPLACE WITH REGULAR EXPRESSIONS (great intro: A Tao of Regular Expressions)
      $ :s/old_pat/new_pat/ # replaces first occurrence in a line
      $ :s/old_pat/new_pat/g # replaces all occurrences in a line
      $ :s/old_pat/new_pat/gc # add 'c' to ask for confirmation
      $ :#,#s/old_pat/new_pat/g # replaces all occurrences between line numbers: #,#
      $ :%s/old_pat/new_pat/g # replaces all occurrences in file
      $ :%s/\(pattern1\)\(pattern2\)/\1test\2/g # inserts 'test' between two patterns; in Vim the grouping parentheses need a preceding '\' (unlike Perl)
      $ :%s/\(pattern.*\)/\1 my_tag/g # appends something to line containing pattern (Perl's '.+' is written '.\+' in Vim; '.*' works in both)
      $ :%s/\(pattern\)\(.*\)/\1/g # removes everything in lines after pattern
      $ :%s/\(At\dg\d\d\d\d\d\.\d\)\(.*\)/\1\t\2/g # inserts tabs between At1g12345.1 and Description
      $ :%s/\n/new_pattern/g # replaces line breaks (return signs) with new_pattern
      $ :%s/pattern/\r/g # replaces pattern with line breaks (return signs)
      $ :%s/\(\n\)/\1\1/g # insert additional return signs
      $ :%s/\(^At\dg\d\d\d\d\d.\d\t.\{-}\t.\{-}\t.\{-}\t.\{-}\t\).\{-}\t/\1/g # removes the content between the 5th and 6th tab (6th column) together with its trailing tab; '\{-}' turns off 'greedy' behavior
      $ :#,#s/\( \{-} \|\.\|\n\)/\1/g # performs simple word count in specified range of text
      $ :%s/\(E\{6,\}\)/<font color="green">\1<\/font>/g # highlight pattern in html colors, here highlighting of >= 6 occurrences of Es
      $ :%s/\([A-Z]\)/\l\1/g # change uppercase to lowercase, '%s/\([A-Z]\)/\u\1/g' does the opposite
      $ :g/my_pattern/ s/\([A-Z]\)/\l\1/g | copy $ # uses 'global' command to apply replace function only on those lines that match a certain pattern. The 'copy $' command after the pipe '|' prints all matching lines at the end of the file.
      $ :args *.txt | all | argdo %s/old_pat/new_pat/ge | update # Command 'args' places all relevant files in the argument list (buffer); 'all' displays each file in a separate split window; command 'argdo' applies the replacement to all files in the argument list (buffer); flag 'e' is necessary to avoid stopping at error messages for files with no matches; command 'update' saves all changes to files that were updated.
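Some of the substitutions above can also be run non-interactively from the shell. The following is a sketch with `sed`, which is not part of the Vim commands listed here. The file `genes.txt` and its two lines are made-up sample data, and `[0-9]` replaces Vim's `\d` because sed's basic regular expressions do not know `\d`.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'At1g12345.1 protein kinase\nAt2g54321.2 unknown protein\n' > genes.txt

# A literal tab character, kept in a variable so the command also works
# with sed versions that do not understand '\t' in the replacement.
tab=$(printf '\t')

# Counterpart of ':%s/\(At\dg\d\d\d\d\d\.\d\)\(.*\)/\1\t\2/g':
# insert a tab between the gene ID and the rest of the line.
sed "s/\(At[0-9]g[0-9]\{5\}\.[0-9]\)\(.*\)/\1${tab}\2/" genes.txt > genes_tab.txt

# Counterpart of ':%s/old_pat/new_pat/g': replace every occurrence on every line.
sed 's/protein/PROTEIN/g' genes.txt > genes_upper.txt

first_id=$(cut -f1 genes_tab.txt | head -n 1)
n_replaced=$(grep -c 'PROTEIN' genes_upper.txt)
echo "$first_id $n_replaced"
```

Unlike `:s` in Vim, `sed` processes every line by default, so no `%` range is needed.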

    MATCHING PARENTHESES SEARCH
      - place cursor on (, [ or { and type % # cursor moves to matching parenthesis

    HTML EDITING
      -Convert text file to html format:
      $ :runtime! syntax/2html.vim # run this command with open file in Vim

    SHELL COMMAND IN VIM
      $ :!<SHELL_COMMAND> <ENTER> # executes any shell command, hit <enter> to return
      $ :sh # switches window to shell, 'exit' switches back to vim

    USING VIM AS TABLE EDITOR
      $ v # starts visual mode for selecting characters
      $ V # starts visual mode for selecting lines
      $ CTRL-V # starts visual mode for selecting blocks (use CTRL-q in gVim under Windows). This allows column-wise selections and operations like inserting and deleting columns. To restrict substitute commands to a column, one can select it and switch to the command-line by typing ':'. After this the substitution syntax for a selected block looks like this: '<,'>s///.
      $ :set scrollbind # starts simultaneous scrolling of 'vsplitted' files. To set to horizontal binding of files, use command ':set scrollopt=hor' (after first one). Run all these commands before the ':split' command.
      $ :AlignCtrl I= \t then :%Align # Aligns tables by column separators (here '\t'); requires the Align utility from Charles Campbell to be installed.
      To sort table rows by selected lines or block, perform the visual select and then hit the F3 key. The rest is interactive. To enable this function, one has to include the Vim sort script from Gerald Lai in the .vimrc file.

    MODIFY VIM SETTINGS (in file .vimrc)
      - see last chapter of vimtutor (start from shell)
      - useful .vimrc sample
      - when vim starts to respond very slowly then one may need to delete the .viminf* files in home directory
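As an illustration only, the display and search settings from the sections above could be collected into a settings file like the one sketched below. The file name `vimrc.sample` is an assumption. It is written to a scratch directory rather than to `~/.vimrc`, so an existing configuration is not overwritten. Inside a .vimrc the commands are written without the leading ':' and comments start with a double quote.

```shell
workdir=$(mktemp -d)

# Write the sample settings file; the quoted 'EOF' keeps the shell
# from interpreting anything inside the here-document.
cat > "$workdir/vimrc.sample" <<'EOF'
" Settings taken from the DISPLAY and SEARCH sections of this manual
set number      " show line numbers
set nowrap      " no line wrapping
set ic          " ignore case when searching
set hls         " highlight search hits
EOF

cat "$workdir/vimrc.sample"
```

After reviewing the file, it can be copied to `~/.vimrc` to make the settings permanent.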

  10. THE UNIX SHELL
    When you log into UNIX/LINUX the system starts a program called SHELL. It provides you with a working environment and interface to the operating system. Usually there are many different shell programs installed.

    STDIN, STDOUT, STDERR, REDIRECTORS, OPERATORS & WILDCARDS (more on this @ LINUX HOWTOs)
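A short sketch of the most common redirectors, the pipe operator and a wildcard, assuming a Bourne-compatible shell such as bash. All file names are placeholders created in a scratch directory.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a.txt b.txt notes.log

# Wildcard: '*.txt' expands to all file names ending in .txt.
# '>' redirects STDOUT into a file (overwriting it).
ls *.txt > listing.out

# '2>' redirects STDERR; the missing file triggers an error message.
ls no_such_file 2> errors.out || true

# '|' pipes STDOUT of one command into STDIN of the next.
n_txt=$(ls *.txt | wc -l | tr -d ' ')

# '<' feeds a file to STDIN; '>>' appends to a file instead of overwriting it.
sort -r < listing.out >> sorted.out

echo "$n_txt text files"
```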

    Useful shell commands

  11. SIMPLE SHELL ONE-LINER SCRIPTS
    Useful One-Liners (script download)

    How to write a script

  12. SIMPLE PERL ONE-LINER SCRIPTS
    Useful One-Liners

  13. REMOTE COPY: WGET, SCP and NCFTP
    WGET (file download from the www)

    SCP (secure copy between machines)

    NICE FTP

  14. ARCHIVING AND COMPRESSING
    Archiving and compressing

    Viewing Archives

    Extracting
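The three steps named above (archiving and compressing, viewing, extracting) can be sketched with `tar` and gzip compression. This is one common way to do it, not necessarily the commands the manual originally listed. The directory `my_dir` and its content are placeholders.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir my_dir
echo "sample" > my_dir/file1.txt

# Archive and compress: c = create, z = gzip compression, f = archive file name.
tar -czf my_dir.tar.gz my_dir

# View the archive contents without extracting: t = list.
tar -tzf my_dir.tar.gz

# Extract into a separate directory: x = extract, -C = target directory.
mkdir restore
tar -xzf my_dir.tar.gz -C restore

content=$(cat restore/my_dir/file1.txt)
echo "$content"
```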

  15. SIMPLE INSTALLS
    System-wide installations

    Applications in user accounts

    Installation of RPMs

    Installation of Debian packages

  16. DEVICES
    Mount/unmount usb/floppy/cdrom

  17. ENVIRONMENT VARIABLES
  18. EXERCISES
    Exercise 1
    1. Download proteome of Halobacterium spec. from ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/Halobacterium_sp/AE004437.faa (use wget or web browser for download)
    2. How many predicted proteins are there?
      • $ grep '>' AE004437.faa | wc
    3. How many proteins contain the pattern "WxHxxH[1-2]"?
      • $ egrep 'W.H..H{1,2}' AE004437.faa | wc
    4. Use the find function (/) in 'less' to fish out the proteins containing this pattern or more elegantly do it with awk:
      • $ awk --posix -v RS='>' '/W.H..(H){1,2}/ { print ">" $0;}' AE004437.faa | less
    5. Create a BLASTable database with formatdb
      • $ formatdb -i AE004437.faa -p T -o T
        '-p F' for nucleotide and '-p T' for protein databases
    6. Generate list of sequence IDs for above pattern match result and retrieve its sequences with fastacmd from formatted database
      • $ fastacmd -d AE004437.faa -i my_IDs > seq
    7. Generate several lists of sequence IDs from various pattern match results and retrieve their sequences in one step using the fastacmd in for loop
      • $ for i in *.my_ids; do fastacmd -d AE004437.faa -i $i > $i.out; done
    8. Run blastall with a few proteins against newly created database or against Halobacterium or UniProt database (/data/UNIPROT/blast/uniprot)
      • $ blastall -p blastp -i input.file -d AE004437.faa -o blastp.out -e 1e-6 -v 10 -b 10 &
    9. Parse blastall output into an Excel spreadsheet:
      • a) using biocore parser
        $ blastParse -c <hits> -i <blast.out> -o <blast.parse>
        b) using BioPerl parser
        $ bioblastParse.pl blast.out
    10. Run HMMPFAM search with above proteins against Pfam database
      • $ hmmpfam -E 0.1 --acc -A0 /data/PFAM/Pfam_ls input.file > output.pfam
        Parse result with BioPerl parser
        $ hmmSummary output.pfam > hmm.summary
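Steps 2 to 4 and the ID list needed for step 6 can be tried without the download on a tiny made-up FASTA file. The sketch below assumes the three invented sequences shown in the here-document. It uses `grep -E` in place of `egrep` and searches for `W.H..H` in awk, which selects the same records as `W.H..H{1,2}` but also works in awk versions without interval expressions.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Invented sample data: two of the three sequences contain the pattern.
cat > sample.faa <<'EOF'
>prot1 hypothetical protein
MKWAHTLHAAG
>prot2 kinase
MSTNPKPQRK
>prot3 transporter
MAWGHCCHHLL
EOF

# Step 2: every FASTA header starts with '>', so counting headers counts proteins.
n_proteins=$(grep -c '>' sample.faa)

# Step 3: count the lines that contain the pattern.
n_hits=$(grep -E -c 'W.H..H{1,2}' sample.faa)

# Steps 4 and 6: RS='>' makes each FASTA entry one awk record;
# $1 is then the sequence ID of every matching entry.
awk -v RS='>' '/W.H..H/ { print $1 }' sample.faa > my_IDs

echo "$n_proteins proteins, $n_hits pattern lines"
cat my_IDs
```

Because `grep` works line by line, a pattern that is split across two wrapped sequence lines is missed, whereas the awk version sees the whole entry (including the line breaks) as one record.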
    Exercise 2
    1. Split sample fasta batch file with csplit (use sequence file from exercise 1).
    2. Concatenate single fasta files from (1) to one batch file.
    3. BLAST two related sequences, retrieve the result in table format and use join to identify common hit IDs in the two tables.
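One possible way to approach this exercise is sketched below on made-up data. It assumes GNU `csplit` (the `-z` option and the `{*}` repeat count are GNU extensions). The two hit tables stand in for real BLAST table output, and their IDs and scores are invented for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '>seqA\nMKWAH\n>seqB\nMSTNP\n>seqC\nMAWGH\n' > batch.faa

# 1. Split at every header line; -s = silent, -z = drop the empty first piece,
#    -f = prefix of the output files (seq_00, seq_01, ...).
csplit -s -z -f seq_ batch.faa '/^>/' '{*}'
n_files=$(ls seq_* | wc -l | tr -d ' ')

# 2. Concatenate the single files back into one batch file.
cat seq_* > rejoined.faa

# 3. Two placeholder hit tables (ID <tab> score); join needs sorted input,
#    so the ID columns are extracted and sorted first.
printf 'hitC\t50\nhitA\t90\nhitB\t70\n' > hits1.tab
printf 'hitD\t40\nhitB\t85\nhitC\t60\n' > hits2.tab
cut -f1 hits1.tab | sort > ids1.txt
cut -f1 hits2.tab | sort > ids2.txt
join ids1.txt ids2.txt > common_ids.txt

cat common_ids.txt
```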
    Exercise 3
    1. Write a shell script that executes several BLAST searches at once (each search needs its own output file):
      • #!/bin/sh
        blastall -p blastp -d /.../my_database -i /.../my_input1 -o my_out1 -e 1e-6 -v 10 -b 10 &
        blastall -p blastp -d /.../my_database -i /.../my_input2 -o my_out2 -e 1e-6 -v 10 -b 10 &
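The same pattern can be written as a loop. The sketch below runs without BLAST installed: `run_search` is a hypothetical stand-in for a long-running command such as the blastall calls above, and the query names are placeholders.

```shell
#!/bin/sh
# run_search is a hypothetical stand-in for a long-running search;
# it only waits a second and writes one line to its output file.
run_search() {
    sleep 1
    echo "finished search for $1" > "$2"
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Start one background job per input name, each with its own output file.
for name in query1 query2 query3; do
    run_search "$name" "$name.out" &
done

# Block until all background jobs have finished.
wait

n_done=$(ls *.out | wc -l | tr -d ' ')
echo "$n_done searches finished"
```

The `wait` call is only needed when later commands in the script depend on the results; without it the script returns immediately while the jobs keep running.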
    Exercise 4
    1. Create multiple alignment with ClustalW (e.g. use sequences with 'W.H..HH' pattern)
      • $ clustalw my_fasta_batch
    Exercise 5
    1. Reformat alignment into PHYLIP format using 'seqret' from EMBOSS
      • $ seqret clustal::my_align.aln phylip::my_align.phylip
    Exercise 6
    1. Create neighbor-joining tree with PHYLIP
      • $ cp my_align.phylip infile
        $ phylip protdist # creates distance matrix
        $ cp outfile infile
        $ phylip neighbor # use default settings
        $ cp outtree intree
        $ phylip retree # displays tree and can use midpoint method for defining root of tree, my typical command sequence is: 'N' 'Y' 'M' 'W' 'R' 'R' 'X'
        $ cp outtree my_tree.dnd
        View your tree in TreeBrowse or open it in TreeView