
CCB Bioinformatics EMBOSS Pipeline Workflows

Overview

This page presents several bioinformatics workflows built on the EMBOSS software suite. It describes a number of EMBOSS tools wrapped as Pipeline modules and demonstrates how to combine them into a couple of integrated, end-to-end workflows using the LONI Pipeline.

Problems addressed by EMBOSS workflows

The most popular EMBOSS applications include:

Program group: Selected applications
Alignment: Local (matcher, water) and global (stretcher, needle) alignment
Coding regions: Synonymous codon use (syco), codon statistics (chips)
Comparison: Large sequence word comparisons (dottup, polydot, wordmatch) and alignment (supermatcher)
Composition: General (compseq), frequent words (wordcount), graphical representation (chaos)
CpG islands: Report (cpgreport) and plot (cpgplot)
DNA features: Repeats (einverted, etandem), DNA melting (dan)
Editing: General editing utilities (cutseq, splitter), features (maskfeat), etc.
Indexing: Database indexing (dbiflat, dbigcg, dbiblast)
Motifs: Searching PROSITE (patmatdb, motifsearch), PRINTS (pscan), TRANSFAC (tfscan), general patterns (fuzznuc, fuzzpro)
Multiple alignment: Interface to ClustalW (emma), display (prettyplot), editing (mse)
Protein features: Functional motifs (antigenic, sigcleave), structural motifs (pepcoil, helixturnhelix), amphipathic regions (pepnet, pepwheel), transmembrane prediction (tmap) and display (topo)
Protein properties: Hydropathy (pepwindow, pepwindowall, octanol), protease sites (digest), general (pepstats)
Sequence formats: Sequence reading/writing/format conversion (seqret, seqretall) and feature format conversion (seqretfeat)
Translation: Codon usage (cusp), reading frames (getorf, showorf, backtranseq)
Utilities: Motif database indexing (rebaseextract, tfextract, prosextract, printsextract), listing databases (showdb), searching for applications (wossname)

Detailed Workflow Usage & Specifications

Chips

Chips: This tool computes codon usage statistics. It calculates Frank Wright's Nc statistic, the effective number of codons used in a sequence.
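To make the Nc statistic concrete, the following is a minimal Python sketch of Wright's formula, Nc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, where Fk is the mean codon homozygosity over the amino acids encoded by k synonymous codons. It is an illustration under stated assumptions, not the code that chips runs: the function name `effective_number_of_codons` is hypothetical, the standard genetic code is assumed, and sequences that leave a degeneracy class without data simply return `None` rather than applying any correction.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group synonymous codons by amino acid; stop codons are excluded.
SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        SYNONYMS[aa].append(codon)


def effective_number_of_codons(seq):
    """Return Wright's Nc for an in-frame coding sequence, or None if undefined."""
    seq = seq.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))

    homozygosity = defaultdict(list)  # degeneracy class -> F values
    for aa, codons in SYNONYMS.items():
        k = len(codons)
        n = sum(counts[c] for c in codons)
        if k == 1 or n < 2:
            continue  # Met/Trp carry no information; F needs n >= 2
        sum_p2 = sum((counts[c] / n) ** 2 for c in codons)
        f = (n * sum_p2 - 1) / (n - 1)
        if f > 0:  # skip non-positive estimates to avoid dividing by zero
            homozygosity[k].append(f)

    # Simplification: require data for every degeneracy class.
    if any(k not in homozygosity for k in (2, 3, 4, 6)):
        return None

    mean_f = {k: sum(v) / len(v) for k, v in homozygosity.items()}
    nc = 2 + 9 / mean_f[2] + 1 / mean_f[3] + 5 / mean_f[4] + 3 / mean_f[6]
    return min(nc, 61.0)  # Nc cannot exceed the 61 sense codons
```

A sequence that uses a single codon for every amino acid gives the minimum value of 20, while perfectly uniform codon usage gives the maximum of 61.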
Matcher: This tool finds the best local alignments between two sequences. It compares two sequences for local similarities using a rigorous algorithm. Matcher is based on Bill Pearson's 'lalign' application, version 2.0u4 (Feb. 1996).
[Figure: Pipeline_EMBOSS_Chip_Results_2010.png]
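As a rough illustration of what a rigorous local alignment computes, the sketch below scores the best local alignment of two sequences with the classic Smith-Waterman dynamic programme and a linear gap penalty. This is a swapped-in textbook technique, not the lalign-derived code that Matcher uses, and it returns only the best score rather than the alignments themselves. The function name and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for the example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for ca in a:
        curr = [0]  # first column is always zero
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Keeping only two rows of the matrix holds memory linear in the length of the second sequence, which is enough when only the score is needed.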

EMBOSS LONI Installation

The LONI EMBOSS installation needs to be provisioned to /usr/local by the LONI IT team. The current test installation is in:
  • /ifs/ccb/CCB_SW_Tools/others/Bioinformatics/EMBOSS_6_2010/EMBOSS-6.3.1/ and Pipelines folder.
  • See the README file for configuration, build, test and validation instructions.

EMBOSS - mrFAST Interoperability and tool-integration example

Footnotes

  • Contact person/group: SIG-FLOW Team
