Commands

clean_multi_ssake.sh

Run to remove intermediate files generated by multi_ssake.py from the current directory.

WARNING: Removes all files matching *.fasta and *paired.fa from the current directory!
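
Because the removal is destructive, it can help to preview what would be matched before running the script. The following is a minimal sketch, not part of vamp: `files_to_be_removed` is a hypothetical helper, and it assumes the script uses ordinary shell globbing for the two patterns named in the warning (so hidden files are skipped).

```python
import fnmatch
import os

# Patterns taken from the warning above.
PATTERNS = ("*.fasta", "*paired.fa")


def files_to_be_removed(directory="."):
    """Return sorted names of regular files in `directory` matching PATTERNS."""
    matches = []
    for name in os.listdir(directory):
        if name.startswith("."):
            # Assumption: mimic shell globbing, which skips hidden files.
            continue
        path = os.path.join(directory, name)
        if os.path.isfile(path) and any(fnmatch.fnmatch(name, p) for p in PATTERNS):
            matches.append(name)
    return sorted(matches)


if __name__ == "__main__":
    # Dry run: list the files, delete nothing.
    for name in files_to_be_removed():
        print(name)
```

Anything listed that should be kept (for example, reference FASTA files) should be moved out of the directory first.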

compare_genomes.py
Finds differences between genomes based on the input multi-aligned fasta file
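
The script's own options are not shown here, so the following is only a sketch of the general idea: given sequences that are already multiply aligned (all the same length, gaps as `-`), report the columns where they disagree. The function names are hypothetical and this is not the script's actual implementation.

```python
def read_fasta(lines):
    """Parse FASTA lines into an ordered list of (name, sequence) tuples."""
    records, name, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Use the first word of the header as the sequence name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def differing_columns(records):
    """Yield (1-based column, {name: base}) where aligned sequences disagree."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) > 1:
        raise ValueError("aligned sequences must all have the same length")
    for i, column in enumerate(zip(*(seq for _, seq in records)), start=1):
        # Compare case-insensitively; a gap counts as a difference.
        if len({base.upper() for base in column}) > 1:
            yield i, {name: base for (name, _), base in zip(records, column)}
```

Column numbers refer to positions in the alignment, not to coordinates in any single genome.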
fastq_to_fasta.py
Converts fastq files to fasta files
Usage: fastq_to_fasta.py [options] fastq_file fasta_file

 Convert fastq file to fasta

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -w WRAP, --wrap=WRAP  Maximum length of lines, 0 means do not wrap (default:
                        0)
  -v, --verbose         verbose output
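
To illustrate what the conversion and the `--wrap` option amount to, here is a minimal sketch. It assumes strict four-line FASTQ records (header, sequence, `+` line, qualities); `fastq_to_fasta` below is an illustrative function, not the script's actual code.

```python
from itertools import islice


def fastq_to_fasta(fastq_lines, wrap=0):
    """Yield FASTA lines for four-line FASTQ records; wrap=0 means do not wrap."""
    it = iter(fastq_lines)
    while True:
        record = [line.rstrip("\r\n") for line in islice(it, 4)]
        if not record:
            return  # end of input
        if (len(record) < 4 or not record[0].startswith("@")
                or not record[2].startswith("+")):
            raise ValueError("malformed FASTQ record: %r" % (record,))
        header, sequence = record[0], record[1]
        # The FASTA header reuses the FASTQ header; qualities are dropped.
        yield ">" + header[1:]
        if wrap > 0:
            for start in range(0, len(sequence), wrap):
                yield sequence[start:start + wrap]
        else:
            yield sequence
```

With `wrap=3`, the read `ACGTACGT` is written as the three lines `ACG`, `TAC`, `GT`; with the default of 0 it stays on one line.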
find_contig_deletions.py
Find contigs in an assembly that have sections deleted and provide an option to insert the missing pieces
gff2gtf_simple.py
Attempts to convert GFF files to GTF files.
Usage: gff2gtf_simple.py [options] gff_file

 Input: GFF. Output: GTF file. Very simple and naive GFF to GTF converter.
Written to handle GFF output from BioPerl's genbank2gff3.pl script for simple
DNA viruses.

Options:
  --version      show program's version number and exit
  -h, --help     show this help message and exit
  -v, --verbose  verbose output
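
The main difference between the two formats is the ninth column: GFF3 uses `key=value` pairs, while GTF expects `gene_id "..."; transcript_id "...";`. The sketch below shows one naive way to rewrite that column. How the real script chooses the gene and transcript IDs is not documented here; taking them from `Parent` (falling back to `ID`) is an assumption made for illustration, and the function names are hypothetical.

```python
def gff3_attributes(field):
    """Parse a GFF3 attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def gff3_line_to_gtf(line):
    """Return a GTF line for one GFF3 feature line, or None for comments/blanks."""
    if not line.strip() or line.startswith("#"):
        return None
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated columns")
    attrs = gff3_attributes(cols[8])
    # Assumption: the parent feature (else the feature's own ID) names the gene.
    gene_id = attrs.get("Parent") or attrs.get("ID") or "unknown"
    cols[8] = 'gene_id "%s"; transcript_id "%s";' % (gene_id, gene_id)
    return "\t".join(cols)
```

Using the same value for both IDs is only reasonable for simple, unspliced annotations such as the small DNA virus genomes the script targets.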
maf_net.py
Stitch together alignment blocks from a MAF file, forming an alignment net.
Usage: maf_net.py [options] maf_file

 Determine the best MAF blocks (by score) that cover a specified
genome

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -r REFERENCE, --reference=REFERENCE
                        Reference species
  -s SPECIES, --species=SPECIES
                        List of species to include
  -c CHROMOSOME, --chromosome=CHROMOSOME
                        Sequence ID of the chromosome for which to generate
                        the alignment net (e.g. NC_001806)
  -o OUTPUT_DIR, --output_dir=OUTPUT_DIR
                        Directory to store output file, default is maf file
                        directory
  --consensus_sequence  Output "consensus sequence" for each species in files
                        named [species].[chromosome].consensus.fasta
  --reference_fasta=REFERENCE_FASTA
                        Check MAF file against this fasta (for
                        troubleshooting, debugging)
  -v, --verbose         verbose output
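
The idea of choosing the best-scoring block over each stretch of the reference can be sketched as follows. This is a simplified illustration rather than the script's algorithm: blocks are reduced to hypothetical `(score, start, end)` tuples with half-open reference coordinates, and MAF parsing is left out.

```python
def best_block_net(blocks):
    """Assign each covered reference interval to its highest-scoring block.

    blocks: list of (score, start, end) with half-open [start, end) coordinates.
    Returns a list of non-overlapping (start, end, block_index) segments.
    """
    # Every block boundary is a point where the best block may change.
    boundaries = sorted({pos for _, start, end in blocks for pos in (start, end)})
    net = []
    for left, right in zip(boundaries, boundaries[1:]):
        covering = [i for i, (_, start, end) in enumerate(blocks)
                    if start <= left and right <= end]
        if not covering:
            continue  # no block covers this stretch of the reference
        best = max(covering, key=lambda i: blocks[i][0])
        if net and net[-1][2] == best and net[-1][1] == left:
            # Extend the previous segment when the same block stays on top.
            net[-1] = (net[-1][0], right, best)
        else:
            net.append((left, right, best))
    return net
```

For blocks `[(100.0, 0, 50), (200.0, 30, 80), (50.0, 70, 120)]` this yields `[(0, 30, 0), (30, 80, 1), (80, 120, 2)]`: the highest-scoring block wins wherever it overlaps its neighbours. The scan is quadratic in the number of blocks, which is acceptable for a sketch but not for large alignments.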
makePairedOutput2EQUALfiles_vamp.pl

See makePairedOutput2UNEQUALfiles_vamp.pl:

Usage: makePairedOutput2EQUALfiles_vamp.pl <fasta file 1> <fasta file 2> <library insert size>
       --- ** Both files must have the same number of records & arranged in the same order
makePairedOutput2UNEQUALfiles_vamp.pl

Modified versions of scripts provided by SSAKE, used to prepare two separate paired-end fastq files for use by SSAKE. The modifications accommodate the new Illumina-style sequence identifiers introduced with CASAVA 1.8:

Usage: makePairedOutput2UNEQUALfiles_vamp.pl <fasta file 1> <fasta file 2> <library insert size>
       --- files could have different # of records & arranged in different order but template ids must match
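
Matching mates by template ID is the part affected by the CASAVA 1.8 identifier change: older identifiers end in `/1` or `/2`, while CASAVA 1.8 puts the read number after a space. The following sketch shows one way to derive a shared template ID and pair records from two files whose order and record counts differ. The function names are hypothetical, and the SSAKE output format itself is deliberately not reproduced.

```python
import re


def template_id(header):
    """Return the template ID shared by both mates of a read pair.

    Handles CASAVA 1.8 style ('name 1:N:0:ATCACG') and older ('name/1') headers.
    """
    name = header.lstrip(">@").split()[0]
    return re.sub(r"/[12]$", "", name)


def pair_reads(forward, reverse):
    """Pair (header, sequence) records by template ID.

    Returns (paired, unpaired): paired holds (template_id, forward_seq,
    reverse_seq); unpaired holds (template_id, seq) for reads without a mate.
    """
    reverse_by_id = {template_id(h): s for h, s in reverse}
    paired, unpaired = [], []
    for header, seq in forward:
        tid = template_id(header)
        if tid in reverse_by_id:
            paired.append((tid, seq, reverse_by_id.pop(tid)))
        else:
            unpaired.append((tid, seq))
    # Whatever is left in the dict had no forward mate.
    unpaired.extend(reverse_by_id.items())
    return paired, unpaired
```

Holding one file in a dictionary keeps the pairing independent of record order, at the cost of memory proportional to that file.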
multi_ssake.py
Run TQSFastq preprocessing and SSAKE using various combinations of parameters and combine the results.
Usage: multi_ssake.py [options] forward_reads reverse_reads

 Run various iterations of SSAKE, varying input files and parameters Collect
results into single list of contigs

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -i INSERT_SIZE, --insert_size=INSERT_SIZE
                        Mean insert size for paired reads (default: 500)
  --config=CONFIG       multi_ssake configuration file that specifies options
                        [default: /Users/lparsons/Documents/workspace_4.2/vamp
                        /makefiles/multi_ssake.config.template]
  --vamp_config=VAMP_CONFIG
                        vamp configuration file that specifies options
                        [default: /Users/lparsons/Documents/workspace_4.2/vamp
                        /makefiles/config.mk.template]
  --qsub                Use qsub to submit commands to cluster (default:
                        False)
  --untrimmed           Include SSAKE assembly of untrimmed reads (not
                        recommended)
  -v, --verbose         verbose output
  -d, --debug           debug (do not execute)
translate_cds.py
Takes GFF and Fasta input and translates spliced CDS regions from DNA to Amino Acid sequence, reporting errors.
Usage: translate_cds.py [options] <genes gff3> <ref fasta>

 Extracts coding sequences (cds) regions from fasta reference and gff file and
translates them into amino acid sequence, output in FASTA format to STDOUT.
Errors during translation are output to STDERR. Genes with translation errors
are not printed.

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  --notrans             Do not translate to amino acid sequence, output DNA
  -i IDATTR, --idattr=IDATTR
                        GFF attribute to use as gene ID. Features with the
                        same ID will be considered parts of the same gene. The
                        default "gene_id" is suitable for GTF files.
  -t FEATURETYPE, --featuretype=FEATURETYPE
                        GFF feature type(s) (3rd column) to be used. Specify
                        the option multiple times for multiple feature types.
                        The default is "CDS" for GFF files and "CDS" and
                        "stop_codon" for GTF files.
  --table=TABLE         NCBI Translation table to use when translating DNA
                        (see http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprint
                        gc.cgi). Default: 1.
  -v, --verbose         verbose output
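
The core steps described above (join the CDS segments of a gene, reverse-complement genes on the minus strand, translate, and flag problems) can be sketched as below. This is an illustration using only the standard NCBI translation table 1; the function names and the specific error checks are assumptions, not the script's actual behaviour.

```python
# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def spliced_cds(reference, segments, strand="+"):
    """Join CDS segments given as 1-based inclusive (start, end) pairs, as in GFF."""
    dna = "".join(reference[start - 1:end] for start, end in sorted(segments))
    if strand == "-":
        dna = dna.translate(COMPLEMENT)[::-1]  # reverse complement
    return dna.upper()


def translate(dna):
    """Translate a spliced CDS; raise ValueError on the problems checked here."""
    if len(dna) % 3 != 0:
        raise ValueError("CDS length %d is not a multiple of 3" % len(dna))
    # Codons containing ambiguous bases are translated as 'X'.
    protein = "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                      for i in range(0, len(dna), 3))
    if "*" in protein[:-1]:
        raise ValueError("internal stop codon")
    return protein.rstrip("*")
```

A caller would catch `ValueError` per gene, write the message to STDERR, and skip that gene in the FASTA output, mirroring the behaviour described in the help text.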
