---
layout: reference
---

Glossary

{:auto_ids}
accession
: a unique identifier assigned to each sequence or set of sequences

BLAST
: the Basic Local Alignment Search Tool at NCBI, which searches for similarities between known and unknown biological sequences such as DNA or protein

categorical variable
: a variable that takes on one of a fixed number of values that are names or labels (also called a qualitative variable), as opposed to a quantitative (numerical) variable

cleaned data
: data that has been manipulated post-collection to remove errors or inaccuracies, introduce desired formatting changes, or otherwise prepare the data for analysis

conditional formatting
: formatting that is applied to a specific cell or range of cells depending on a set of criteria

CSV (comma separated values) format
: a plain text file format in which values are separated by commas

factor
: a variable that takes on a limited number of possible values (i.e. categorical data)

Gb
: gigabyte (more commonly abbreviated GB), a unit of file storage or file size

Gbase
: a gigabase, equal to one billion nucleic acid bases (Gbp may indicate one billion base pairs of nucleic acid)

headers
: names at the tops of columns that describe the column contents (sometimes optional)

metadata
: data which describes other data

NGS
: common acronym for "Next Generation Sequencing", a term currently being replaced by "High Throughput Sequencing"

null value
: a value used to record observations missing from a dataset

observation
: a single measurement or record of the object being recorded (e.g. the weight of a particular mouse)

plain text
: unformatted text

quality assurance
: any process which checks data for validity during entry

quality control
: any process which removes problematic data from a dataset

raw data
: data that has not been manipulated and represents actual recorded values

rich text
: formatted text (e.g. text that appears bolded, colored or italicized)

string
: a collection of characters (e.g. "thisisastring")

TSV (tab separated values) format
: a plain text file format in which values are separated by tabs

variable
: a category of data being collected on the object being recorded (e.g. a mouse's weight)
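
Several of the terms above (CSV and TSV formats, headers, observation, variable, null value, factor) can be seen together in one small example. The following is only an illustrative sketch using Python's standard `csv` module. The mouse data, the column names, and the `read_table` helper are hypothetical and are not part of the lesson.

```python
import csv
import io

# Hypothetical example data: two observations of mice, written once as CSV
# and once as TSV. The first line holds the headers. The empty weight field
# in the last row stands for a null value (a missing observation).
CSV_TEXT = "mouse_id,sex,weight_g\nm01,F,21.5\nm02,M,\n"
TSV_TEXT = CSV_TEXT.replace(",", "\t")


def read_table(text, delimiter):
    """Parse delimited plain text into a list of observations (dicts keyed by header)."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = []
    for row in reader:
        # Record empty fields as None; all other values stay as strings.
        rows.append({key: (value if value != "" else None) for key, value in row.items()})
    return rows


csv_rows = read_table(CSV_TEXT, ",")
tsv_rows = read_table(TSV_TEXT, "\t")

# The distinct labels of a categorical variable (the levels of a factor).
sex_levels = sorted({row["sex"] for row in csv_rows})

print(csv_rows == tsv_rows)     # same observations, different separator
print(sex_levels)               # ['F', 'M']
print(csv_rows[1]["weight_g"])  # None
```

Here each row is one observation, each column is a variable, and the only difference between the CSV and TSV versions is the character that separates the values.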

Reference

This page is adapted from the corresponding page of Project Organization and Management for Genomics.