RRAP Data Generators

Provides a single command-line interface to generate data sets for use with ADRIA and the RRAP program.


Warning: requires Python >= 3.11



Installation

TODO: Standalone executable.

From PyPI (TODO)

pip install rrap-dg

Dev version

pip install git+https://github.com/open-AIMS/rrap-dg

For Development

Clone the repository, and navigate to the project folder.

git clone https://github.com/open-AIMS/rrap-dg
cd rrap-dg

It is recommended that any development work be done in a separate environment.

Here, mamba is used to create a local conda environment.

# Create a new environment called rrap-dg
$ mamba create -n rrap-dg python=3.11

# Don't forget to activate the environment
$ mamba activate rrap-dg

# Install local development copy of rrap-dg
(rrap-dg) $ pip install -e .

Note: The first time rrapdg is run, it goes through an initial setup process.

Run the help command to trigger the setup.

(rrap-dg) $ rrapdg --help

Python venv setup

Alternatively, you can use a traditional Python venv. For example:

python -m venv .venv
source .venv/bin/activate
pip install -e .

Overriding Default Settings

This project uses Pydantic's BaseSettings to manage configuration. While default values are provided, you can easily override these settings using a .env file.

Creating a .env File

  1. Create a file named .env in the root directory of the project.
  2. Add your custom settings to this file using the following format, noting all values are optional:
PROVENA_DOMAIN=your.provena.domain.com
PROVENA_REALM_NAME=provena
PROVENA_CLIENT_ID=automated-access

Available Settings

The following settings can be overridden:

  • PROVENA_DOMAIN: The Provena deployment to target (default: "mds.gbrrestoration.org")
  • PROVENA_REALM_NAME: The Keycloak realm name (default: "rrap")
  • PROVENA_CLIENT_ID: The Keycloak client ID (default: "automated-access")
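
The override behaviour can be pictured with a small standard-library sketch. This is not the project's code (Pydantic's BaseSettings does this work itself); the helper names `parse_env_text` and `resolve_settings` are hypothetical, and only the three setting names and their defaults are taken from the list above.

```python
# Defaults as listed above; a .env entry replaces the matching default.
DEFAULTS = {
    "PROVENA_DOMAIN": "mds.gbrrestoration.org",
    "PROVENA_REALM_NAME": "rrap",
    "PROVENA_CLIENT_ID": "automated-access",
}


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments (hypothetical helper)."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def resolve_settings(env_text: str) -> dict[str, str]:
    """Overlay recognised .env entries on the defaults; unknown keys are ignored."""
    overrides = parse_env_text(env_text)
    return {key: overrides.get(key, default) for key, default in DEFAULTS.items()}


settings = resolve_settings("PROVENA_DOMAIN=your.provena.domain.com\n")
print(settings["PROVENA_DOMAIN"])      # your.provena.domain.com
print(settings["PROVENA_REALM_NAME"])  # rrap (default kept)
```

Because every value is optional, a .env file only needs the entries that differ from the defaults.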

rrap-dg Data Packages

rrap-dg data packages are used as inputs by the DHW and Cyclone Mortality data generators. To generate DHW data cubes, the folders MIROC5, NOAA, RECOM and spatial are required. To generate the coral mortality projections due to cyclones, the folder cyclone_mortality is required.

The data package should be named with the following convention:

[cluster name]_rrapdg_[YYYY-MM-DD]

An example for a hypothetical Moore dataset:

Moore_rrapdg_2023-01-24
│   datapackage.json
│   README.md
│
├───MIROC5
│       GBR_maxDHW_MIROC5_rcp26_2021_2099.csv
│       GBR_maxDHW_MIROC5_rcp45_2021_2099.csv
│       GBR_maxDHW_MIROC5_rcp60_2021_2099.csv
│       GBR_maxDHW_MIROC5_rcp85_2021_2099.csv
│
├───NOAA
│       GBR_dhw_hist_noaa.nc
│
├───RECOM
│       Moore_2015_585_dhw_exp.nc
│       Moore_2016_586_dhw_exp.nc
│       Moore_2017_599_dhw_exp.nc
│
├───spatial
│       list_gbr_reefs.csv
│       Moore.gpkg
│
└───cyclones
        coral_cover_cyclone.csv

The most recent data package is available on the RRAP IS Data store: https://hdl.handle.net/102.100.100/481718
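
A data package can be sanity-checked before it is handed to a generator. The sketch below is an illustration only: `parse_package_name` and `missing_dhw_folders` are hypothetical helpers, not part of rrap-dg, and the checks assume nothing beyond the naming convention and the DHW folder list described above.

```python
import re
from datetime import date
from pathlib import Path

# Folders the text above lists as required for DHW data cubes.
DHW_FOLDERS = ("MIROC5", "NOAA", "RECOM", "spatial")

# [cluster name]_rrapdg_[YYYY-MM-DD]
NAME_PATTERN = re.compile(r"^(?P<cluster>.+)_rrapdg_(?P<date>\d{4}-\d{2}-\d{2})$")


def parse_package_name(name: str) -> tuple[str, date]:
    """Split a data package name into cluster name and date (hypothetical helper)."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"{name!r} does not follow [cluster name]_rrapdg_[YYYY-MM-DD]")
    # date.fromisoformat also rejects impossible dates such as 2023-13-40
    return match["cluster"], date.fromisoformat(match["date"])


def missing_dhw_folders(package_dir: Path) -> list[str]:
    """Return the required DHW folders that are absent from package_dir."""
    return [name for name in DHW_FOLDERS if not (package_dir / name).is_dir()]


cluster, created = parse_package_name("Moore_rrapdg_2023-01-24")
print(cluster, created)  # Moore 2023-01-24
```

Calling `missing_dhw_folders(Path("Moore_rrapdg_2023-01-24"))` returns an empty list when all four DHW folders are present.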

Domain Template

Create an empty ADRIA Domain to be filled with data.

(rrap-dg) $ rrapdg template generate [directory]

TODO: Package an ADRIA Domain with data from the M&DS data store.

(rrap-dg) $ rrapdg template package [directory] [spec]

Where spec points to a JSON file defining handle IDs for each dataset to be downloaded from the M&DS data store.

Degree Heating Weeks (DHW) projections

Generate Degree Heating Week projections using combinations of:

  • NOAA Coral Reef Watch (CRW version 3.1) satellite data
  • MIROC5 RCP projections (2021 - 2099)
  • RECOM spatial multi-marine heat wave patterns

This work was ported to Python from the original MATLAB code developed by Dr. Veronique Lago and modified by Chinenye Ani.

Usage:

(rrap-dg) $ rrapdg dhw generate [cluster name] [input data directory] [output directory] [optional settings...]

For example, with default values shown for optional settings:

(rrap-dg) $ rrapdg dhw generate Moore C:/data_package_location C:/temp --n-sims 50 --rcps "2.6 4.5 6.0 8.5" --gen-year "2025 2100"

Note that the output directory is assumed to already exist.
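
When the generator is driven from a script, the output directory can be created up front. The following is a minimal sketch, assuming rrapdg is on the PATH of the active environment; `build_dhw_command` and `run_dhw` are hypothetical wrappers, and the only flags used are the ones shown in the example above.

```python
import subprocess
from pathlib import Path


def build_dhw_command(cluster, input_dir, output_dir,
                      n_sims=50, rcps="2.6 4.5 6.0 8.5", gen_year="2025 2100"):
    """Create the output directory if needed and return the argument list."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)  # the CLI assumes this directory exists
    return [
        "rrapdg", "dhw", "generate",
        cluster, str(input_dir), str(out),
        "--n-sims", str(n_sims),
        "--rcps", rcps,
        "--gen-year", gen_year,
    ]


def run_dhw(cluster, input_dir, output_dir, **options):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    command = build_dhw_command(cluster, input_dir, output_dir, **options)
    subprocess.run(command, check=True)
```

For example, `run_dhw("Moore", "C:/data_package_location", "C:/temp", n_sims=10)` creates `C:/temp` if it is missing and then invokes the CLI with the remaining defaults.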

Initial Coral Cover

Initial coral cover data is downscaled from ReefMod Engine (RME) data. The current process is compatible with ReefMod or RME v1.0.x datasets or the rrap-dg data package.

(rrap-dg) $ rrapdg coral-cover downscale-icc [rrap-dg datapackage path] [target geopackage] [output path]

For example, to downscale RME data for the Moore cluster defined by a geopackage:

(rrap-dg) $ rrapdg coral-cover downscale-icc C:/example/rrapdg ./Moore.gpkg ./coral_cover.nc
(rrap-dg) $ rrapdg coral-cover downscale-icc C:/example/rme_dataset ./Moore.gpkg ./coral_cover.nc

A set of initial cover files can be created using a TOML file:

(rrap-dg) $ rrapdg coral-cover bin-edge-icc [rrap-dg datapackage path] [target geopackage] [output directory] [TOML file]

The output path is assumed to exist.

(rrap-dg) $ rrapdg coral-cover bin-edge-icc C:/example/rrapdg ./Moore.gpkg ./icc_files ./bin_edges.toml

This will create a set of netCDFs in the icc_files directory using the bin edges defined in the TOML file.

The format of the TOML file is:

name_of_file = [
	[values, for, each, size class],
	[rows, are, functional, groups],
	[cols, are size, classes]
]

Note that ReefMod represents arborescent Acropora, whereas CoralBlocks does not. Hence the first row of each table is set to 0.0.

A full example:

bin_edge_1 = [
	[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
	[5.0, 7.5, 10.0, 20.0, 40.0, 100.0, 150.0],
	[5.0, 7.5, 10.0, 20.0, 35.0, 50.0, 100.0],
	[5.0, 7.5, 10.0, 15.0, 20.0, 40.0, 50.0],
	[5.0, 7.5, 10.0, 20.0, 40.0, 50.0, 100.0],
	[5.0, 7.5, 10.0, 20.0, 40.0, 50.0, 100.0]
]

bin_edge_2 = [
	[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
	[4.0, 7.5, 10.0, 20.0, 40.0, 100.0, 150.0],
	[4.0, 7.5, 10.0, 20.0, 35.0, 50.0, 100.0],
	[4.0, 7.5, 10.0, 15.0, 20.0, 40.0, 50.0],
	[4.0, 7.5, 10.0, 20.0, 40.0, 50.0, 100.0],
	[4.0, 7.5, 10.0, 20.0, 40.0, 50.0, 100.0]
]

bin_edge_3 = [
	[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
	[5.0, 7.5, 10.0, 20.0, 40.0, 100.0, 100.0],
	[5.0, 7.5, 10.0, 20.0, 35.0, 50.0, 120.0],
	[5.0, 7.5, 10.0, 15.0, 20.0, 40.0, 60.0],
	[5.0, 7.5, 10.0, 20.0, 40.0, 50.0, 110.0],
	[5.0, 7.5, 10.0, 20.0, 40.0, 50.0, 120.0]
]

Using the above will create files named bin_edge_1, bin_edge_2 and bin_edge_3, one per table in the TOML file.

Cyclone Mortality projections

Generate Cyclone Mortality projections using data from:

  • Fabricius, Katharina E., et al. "Disturbance gradients on inshore and offshore coral reefs caused by a severe tropical cyclone." Limnology and Oceanography 53.2 (2008): 690-704.
  • ReefMod Engine data set

The mortality regression model was ported from an R script written by Dr. Vanessa Haller, intended for use with the C~Scape coral ecosystem model.

Usage:

(rrap-dg) $ rrapdg cyclones generate [rrapdg datapackage path] [reefmod engine datapackage path] [output directory path]

The output directory is assumed to already exist.

RRAP M&DS data store interface

Download data from the M&DS data store.

(rrap-dg) $ rrapdg data-store download [dataset id] [output directory]

For example, to download and save the dataset with id "102.100.100/602432" in the current directory:

(rrap-dg) $ rrapdg data-store download 102.100.100/602432 .

Semantically, the command is to download from a source to a destination.

TODO: Uploading/submitting datasets.

Wave data

TODO

Domain clusters

Assign each location in a geopackage file to a cluster using k-means clustering. The cluster each location belongs to is recorded in a new column named cluster_id. Results are written to a new geopackage file saved to the user-indicated location.

The number of clusters is determined by optimizing for a high Silhouette score with Adaptive Differential Evolution (adaptive_de_rand_1_bin_radiuslimited() in BlackBoxOptim.jl).

(rrap-dg) $ rrapdg domain cluster [geopackage path] [output geopackage path]

# Example
(rrap-dg) $ rrapdg domain cluster "C:/example/example.gpkg" "./test.gpkg"

The method reports a "Best candidate", the floor of which indicates the identified optimal number of clusters.
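
To make the objective concrete, here is a pure-Python sketch of the mean Silhouette score and of the floor step applied to the reported candidate. It is an illustration only: the actual optimisation runs through BlackBoxOptim.jl, the k-means step is not reproduced, and `mean_silhouette` and `clusters_from_candidate` are hypothetical names.

```python
import math


def mean_silhouette(points, labels):
    """Mean Silhouette score for points (coordinate tuples) and cluster labels."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("Silhouette score needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, other) for other in members) / len(members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)


def clusters_from_candidate(best_candidate: float) -> int:
    """The optimiser searches a continuous range; the floor gives the cluster count."""
    return math.floor(best_candidate)


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good = [0, 0, 1, 1]  # nearby points share a cluster
bad = [0, 1, 0, 1]   # nearby points are split apart
print(mean_silhouette(points, good) > mean_silhouette(points, bad))  # True
print(clusters_from_candidate(4.7))  # 4
```

A score close to 1 indicates compact, well-separated clusters, which is why the cluster count with the highest score is preferred.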

License

rrap-dg is distributed under the terms of the MIT license.