Nextflow pipeline for extracting UniProt IDs and predicting chemical properties of proteins.

aysanraza/protein-chemistry

Protein Chemistry

Protein-Chemistry is a Nextflow pipeline for extracting UniProt IDs and predicting chemical properties of proteins using Propy3. The pipeline takes an input file as an argument; the file should contain either a list of UniProt IDs or protein sequences in FASTA format. The output is a TSV file containing the protein IDs and their predicted chemical properties.
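Since the input may be either a list of UniProt IDs or FASTA sequences, it can help to check a file before starting a run. The following is a minimal sketch, not the pipeline's own parsing code: the function name `parse_input` is hypothetical, and it assumes one accession per line for ID lists. The regular expression is the accession pattern published by UniProt.

```python
import re

# Accession pattern as published in the UniProt help pages (6 or 10 characters).
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def parse_input(text):
    """Classify input text (hypothetical helper, not part of the pipeline).

    Returns ("fasta", {header: sequence}) if any line starts with ">",
    otherwise ("ids", [accession, ...]) assuming one accession per line.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]

    if any(line.startswith(">") for line in lines):
        records = {}
        header = None
        for line in lines:
            if line.startswith(">"):
                header = line[1:]
                records[header] = ""
            elif header is not None:
                # Sequence lines may be wrapped, so concatenate them.
                records[header] += line
        return "fasta", records

    bad = [line for line in lines if not UNIPROT_ACCESSION.fullmatch(line)]
    if bad:
        raise ValueError(f"not UniProt accessions: {bad}")
    return "ids", lines
```

For example, `parse_input(open("input.txt").read())` would return the detected kind together with the parsed entries, or raise `ValueError` if an ID list contains a malformed accession.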

Installation

To run Protein-Chemistry, you need Nextflow, the UniProt Python package, and Propy3 installed. You can install Nextflow using the following command:

conda install -c bioconda nextflow

You can install the UniProt Python package using the following command:

pip install uniprot

You can install Propy3 using the following command:

pip install propy3

Once Nextflow, UniProt, and Propy3 are installed, you can clone the Protein-Chemistry repository and run the pipeline using the following commands:

git clone https://github.com/aysanraza/protein-chemistry.git
cd protein-chemistry
nextflow run pipeline.nf
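Before the first run, it may be worth confirming that the dependencies are visible from the current environment. The sketch below is one way to do that under stated assumptions: the helper `check_dependencies` is hypothetical, and the import names `propy` (for Propy3) and `uniprot` are assumed from the package names rather than taken from this repository.

```python
import importlib.util
import shutil


def check_dependencies(executables=("nextflow",), modules=("propy", "uniprot")):
    """Report which tools and Python modules can be found.

    The module names are assumptions: Propy3 is assumed to import as
    `propy` and the UniProt package as `uniprot`.
    """
    status = {}
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH.
        status[exe] = shutil.which(exe) is not None
    for mod in modules:
        # find_spec returns None for a top-level module that is not installed.
        status[mod] = importlib.util.find_spec(mod) is not None
    return status


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```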

Usage

To run the Protein-Chemistry pipeline, you will need to specify the input and output files on the command line. You can do this using the following arguments:

  • --input_file: The path to the input file.
  • --output_file: The path to the output file.

For example, with input.txt as the input file and output.tsv as the output file, you would use the following command:

nextflow run pipeline.nf --input_file input.txt --output_file output.tsv
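If the pipeline needs to be launched from a Python script, for instance to process several input files in a loop, the same invocation can be assembled programmatically. This is a minimal sketch: `build_command` and `run_pipeline` are hypothetical helper names, and only the `--input_file` and `--output_file` arguments documented above are used.

```python
import subprocess


def build_command(input_file, output_file, script="pipeline.nf"):
    """Assemble the documented nextflow invocation as an argument list."""
    return [
        "nextflow", "run", script,
        "--input_file", str(input_file),
        "--output_file", str(output_file),
    ]


def run_pipeline(input_file, output_file):
    """Run the pipeline and raise CalledProcessError if it fails."""
    subprocess.run(build_command(input_file, output_file), check=True)
```

Passing the command as a list avoids shell quoting problems when file names contain spaces.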

Once the pipeline has finished running, you will find the output TSV file in the directory where you executed the command.
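The resulting TSV file can be loaded with Python's standard `csv` module. The README does not document the column layout, so the sketch below assumes the file starts with a header row; the helper name `read_properties` is hypothetical.

```python
import csv


def read_properties(handle):
    """Read a tab-separated file with a header row into a list of dicts.

    Assumes the first row holds column names; all values are kept as strings.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [dict(row) for row in reader]


if __name__ == "__main__":
    # output.tsv is the file name used in the example command above.
    with open("output.tsv", newline="") as handle:
        for row in read_properties(handle):
            print(row)
```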

About Me:

I am a bioinformatics and machine learning expert, practicing in silico development and analytics in the domain of biology and medicine. I am open to research collaborations; you can email me to discuss.

Thank you,

Ahsan Raza

Master's in Bioinformatics

[email protected]

Islamabad, Pakistan.

💻 Tech Stack:

Python, Shell Script, Anaconda, GitHub, Neo4j, SQLite, MySQL, scikit-learn, SciPy, Plotly, TensorFlow, Pandas, NumPy, Keras, Git, Linux

💰 You can help me by Donating

BuyMeACoffee
