
Singularity image for PASTEC (transposable elements classification tool).







Singularity container for the transposable elements classification tool PASTEC (from the REPET package).


  • singularity (tested with v.3.9.5)
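Before building or pulling the image, it can help to confirm that the `singularity` command is available. The sketch below is not part of the repository; `have_cmd` is a hypothetical helper name, and the only Singularity call used is `singularity --version`.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the named command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd singularity; then
    # Print the installed version, to compare with the tested v.3.9.5.
    singularity --version
else
    echo "singularity not found on PATH" >&2
fi
```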


Development installation

  • Clone this repo:
git clone
  • Move to the PASTEC directory:
cd PASTEC-singularity
  • To build the .sif image from the definition file:
sudo singularity build pastec_latest.sif PASTEC.def

Only user installation

  • Pull the .sif image directly from the registry:
singularity pull --arch amd64 library://tommasobarberis98/pastec/pastec:latest


Store your project directory path in a variable:
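A minimal sketch of this step, assuming a Bash shell. The variable name `PROJECT_DIR` is the one used by the commands in the following sections; the directory name `pastec_project` is a hypothetical placeholder to replace with your own path.

```shell
# Hypothetical location: replace with the directory that holds your input files.
PROJECT_DIR="${PROJECT_DIR:-$PWD/pastec_project}"
mkdir -p "$PROJECT_DIR"   # create the directory if it does not exist yet
export PROJECT_DIR
echo "PROJECT_DIR=$PROJECT_DIR"
```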


This directory has to contain:

  • the fasta file with consensi sequences:

    According to the PASTEC documentation, each sequence line must contain at most 60 bp, so if you have a "one-line" FASTA file, you can wrap it with the following command:

    sed '/^>/!s/.\{60\}/&\n/g' oneline.fasta > multiline.fasta

    For the sequence headers, it is highly advised to write them as ">XX_i", with XX standing for letters and i standing for numbers.
    So, if you have another type of header, you can try:

    sed 's/>.\+$/>TE/g' old_file.fa | awk '/^>/{$0=$0"_"(++i)}1' > new_file.fa
  • the PASTEClassifier.cfg file (a template is provided at the root of the GitHub repo), in which you update parameters such as the path to a known database of transposable elements (see more here: PASTEClassifier-tuto). Options that you are advised to update:

    • project_name: whatever you want.
    • TE_nucl_bank: /mnt/nucl_bank.fa, for example repbase20.05_ntSeq_cleaned_TE.fa (from RepBase20.05_REPET.embl, which requires a subscription to girinst), but you are free to use any other database. Then you can set:
      • TE_BLRtx: yes ➡️ homology with known TEs using tblastx.
      • TE_BLRn: yes ➡️ detection of helitron extremities.
    • TE_prot_bank: /mnt/prot_bank.fa, for example repbase20.05_aaSeq_cleaned_TE.fa (from RepBase20.05_REPET.embl, which requires a subscription to girinst), but you are free to use any other database. Then you can set:
      • TE_BLRx: yes ➡️ homology with known TEs using blastx.
    • HG_nucl_bank: /mnt/host_genes.fa for the cDNA database of the host genes. Then you can set:
      • HG_BLRn: yes ➡️ homology with host genes.
    • rDNA_bank: /mnt/rdna.fa for the rDNA database. Then you can set:
      • rDNA_BLRn: yes ➡️ homology with rDNA.
    • TE_HMM_profiles: already set to the ProfilesBankForREPET_Pfam32.0.hmm database, but you are free to use another database.
    • Adjustable parameters in the [classif_consensus] section (see PASTEClassifier-tuto).
  • the various databases that you can define in the PASTEClassifier.cfg file.
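The two FASTA recommendations above (sequence lines of at most 60 characters, headers of the form ">XX_i") can be checked before starting a run. The following is a sketch only: `check_fasta` is a hypothetical helper, not a PASTEC tool, and it reads the header rule strictly as letters, an underscore, then digits.

```shell
#!/bin/bash
# Count FASTA lines that break the two recommendations:
#   - headers should look like ">XX_i" (letters, underscore, digits)
#   - sequence lines should hold at most 60 characters
check_fasta() {
    awk '
        /^>/ {
            if ($0 !~ /^>[A-Za-z]+_[0-9]+$/) bad_headers++
            next
        }
        length($0) > 60 { long_lines++ }
        END { printf "bad_headers=%d long_lines=%d\n", bad_headers, long_lines }
    ' "$1"
}
```

For example, `check_fasta "$PROJECT_DIR/consensi.fasta"` prints both counters; two zeros mean the file follows both recommendations.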

Container initialization

PASTEC uses a MySQL database, which must run as a server. To provide this service, initialize the container by creating a Singularity instance:

bash path/to/ -s pastec_latest.sif -d $PROJECT_DIR
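To confirm that the instance was created, its name can be looked up in the output of `singularity instance list`. This sketch assumes that the instance is named `pastec` (the name used by `instance://pastec` in the commands that follow) and that the listing prints a header row followed by one row per instance, with the instance name in the first column; `instance_listed` is a hypothetical helper.

```shell
#!/bin/bash
# Succeeds if an instance with the given name appears in the listing read
# from stdin (expected input: the output of `singularity instance list`).
instance_listed() {
    awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

if command -v singularity >/dev/null 2>&1; then
    if singularity instance list | instance_listed pastec; then
        echo "instance 'pastec' is running"
    else
        echo "instance 'pastec' is not running" >&2
    fi
fi

# When the analysis is finished, the instance can be stopped with:
#   singularity instance stop pastec
```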

Interactive mode

  • Start a shell through the container:
singularity shell instance://pastec
  • Once in the container, run:
python2.7 /opt/PASTEC_linux-x64-2.0/bin/ -i /mnt/consensi.fasta -C /mnt/PASTEClassifier.cfg

NOTE: inside the container, /mnt refers to your PROJECT_DIR, so keep the /mnt prefix and only append the file names.
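The correspondence between /mnt and PROJECT_DIR can be illustrated with a small path translation. This is a sketch: `to_host_path` is a hypothetical helper, and it assumes `PROJECT_DIR` is set on the host as described earlier.

```shell
#!/bin/bash
# Translate a container path under /mnt to the matching host path under
# PROJECT_DIR (hypothetical helper, for illustration only).
to_host_path() {
    case "$1" in
        /mnt)   printf '%s\n' "$PROJECT_DIR" ;;
        /mnt/*) printf '%s\n' "$PROJECT_DIR/${1#/mnt/}" ;;
        *)      printf '%s\n' "$1" ;;   # not under /mnt: returned unchanged
    esac
}
```

With `PROJECT_DIR=/data/proj`, `to_host_path /mnt/consensi.fasta` prints `/data/proj/consensi.fasta`, which is the host file that the container sees as /mnt/consensi.fasta.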

Batch mode

In your project directory ($PROJECT_DIR), create a file containing the PASTEC command (a template is provided in the test folder):

#! /bin/bash 

cd /mnt
python2.7 /opt/PASTEC_linux-x64-2.0/bin/ -i /mnt/consensi.fasta -C /mnt/PASTEClassifier.cfg

Then you can run PASTEC using the container instance as follows:

singularity exec instance://pastec /mnt/
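Because the command above runs the file directly, the file needs the execute permission. A minimal sketch, in which `ensure_executable` is a hypothetical helper and the script name in the usage note is a placeholder for the file you created:

```shell
#!/bin/bash
# Add the execute permission to an existing script file.
# Returns non-zero if the file does not exist.
ensure_executable() {
    if [ ! -f "$1" ]; then
        echo "no such file: $1" >&2
        return 1
    fi
    chmod +x "$1"
}
```

For example: `ensure_executable "$PROJECT_DIR/run_pastec.sh"`, where `run_pastec.sh` is a hypothetical file name.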

NOTE: inside the container, /mnt refers to your PROJECT_DIR, so keep the /mnt prefix and only append the file names.

Other PASTEC options

  • classification rules file name (e.g. PASTEClassifierRules.yaml) [optional]


  • step (0/1/2): 0 (default) runs all steps, 1 detects features, 2 runs the classification; run tool in parallel

-S STEP, --step=STEP

  • clean temporary files [optional] [default: False]

-c, --clean

  • verbosity [optional] [default: 3, from 1 to 4]











