MarkusJx/datagen

datagen

A random data generator for producing test data. The result schema is defined in a JSON file.

The readme is still a work in progress, but you can check out the docs for more information and examples. Check out the demo to see datagen in action.

Similar projects

This project is heavily inspired by synth but features more complex references and a plugin system.

Usage

Simply grab a binary built during a workflow run or build it yourself using cargo build -p cli --release.

Docker

You can also use the Docker image ghcr.io/markusjx/datagen to run datagen in a container.

docker run -v $(pwd):/data ghcr.io/markusjx/datagen generate /data/schema.json /data/output.json

Check out the docker image documentation for more information.

Command-line interface

datagen provides two command-line interfaces: one written in Rust and one written in TypeScript.

Rust CLI

The Rust CLI is the main CLI. It is the most feature-rich of the two and also the fastest.

Installation

You can download a binary from the releases page or build it yourself using cargo build -p cli --release. The Node CLI can be installed using npm install -g @datagen/cli.

Quick start

Create a file called schema.json with the following content:

{
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "value": "John"
    },
    "age": {
      "type": "integer",
      "value": 20
    }
  }
}

Then run datagen generate schema.json to generate data.
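If you prefer to script the quick start, the following is a minimal Python sketch, not part of datagen itself. It writes the schema above to schema.json and then invokes datagen generate schema.json, but only if a datagen binary is found on the PATH. The helper names write_schema and run_datagen are hypothetical, and the sketch assumes the command prints its result to standard output when no output file is given.

```python
import json
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Optional

# The schema from the quick start, expressed as a Python dict.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "value": "John"},
        "age": {"type": "integer", "value": 20},
    },
}


def write_schema(directory: Path) -> Path:
    """Write the quick-start schema to <directory>/schema.json (hypothetical helper)."""
    path = directory / "schema.json"
    path.write_text(json.dumps(SCHEMA, indent=2), encoding="utf-8")
    return path


def run_datagen(schema_path: Path) -> Optional[str]:
    """Run `datagen generate <schema>` and return its stdout.

    Returns None when no datagen binary is on the PATH.
    Assumption: the result is printed to stdout when no output file is passed.
    """
    if shutil.which("datagen") is None:
        return None
    result = subprocess.run(
        ["datagen", "generate", str(schema_path)],
        capture_output=True,
        text=True,
        check=True,  # raise if datagen exits with a non-zero status
    )
    return result.stdout


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        schema_file = write_schema(Path(tmp))
        output = run_datagen(schema_file)
        print(output if output is not None else "datagen not found on PATH")
```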

Available generators

Generators are used to generate random data. The generators are defined in the schema file as JSON objects. The following generators are available:

  • integer: Generates random integers.
  • number: Generates random floating point numbers.
  • string: Generates random strings.
  • bool: Generates random booleans.
  • counter: Generates numbers which increment each time.
  • array: Generates random arrays.
  • object: Generates random objects.
  • reference: Used to reference (or copy) other data.
  • anyOf: Chooses random data from a list of data.
  • flatten: Flattens an array or object.
  • plugin: Generates data using plugins.
  • file: Reads random values from a JSON array inside a file.
  • include: Includes external schema files.

Schema validation

You can use datagen validate schema.json to validate a schema file. Currently, the following checks are performed:

  • Check if the schema is a valid JSON file.
  • Check if all types match the supported types.
  • Check if all arguments are valid for the given type.
  • Check if files included by the include generator exist and are valid.
  • Check if files included by the file generator exist.
  • Check if all transformers are valid.

The validation also runs before generating data using datagen generate. You can disable this behavior using the --no-validate flag, which skips every check except the first one (that the schema is a valid JSON file). Schema errors will then surface during generation instead, with less detailed error messages.

A validation error includes:

  • The path to the error.
  • The error message.
  • The invalid value, if available.
  • The underlying error, if available.
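To make the first two checks and the shape of a validation error more concrete, here is an illustrative Python sketch. It is not datagen's implementation. The names SchemaError, check_types and validate are hypothetical, and so is the "$.properties.age" path format. The sketch only descends into "properties", as seen in the quick-start schema, because the nesting rules of the other generators are not described here.

```python
import json
from dataclasses import dataclass
from typing import Any, List, Optional

# Generator names as listed under "Available generators".
SUPPORTED_TYPES = {
    "integer", "number", "string", "bool", "counter", "array", "object",
    "reference", "anyOf", "flatten", "plugin", "file", "include",
}


@dataclass
class SchemaError:
    """Hypothetical error record mirroring the four fields listed above."""
    path: str                         # the path to the error
    message: str                      # the error message
    invalid_value: Any = None         # the invalid value, if available
    underlying: Optional[str] = None  # the underlying error, if available


def check_types(node: Any, path: str = "$") -> List[SchemaError]:
    """Check that every "type" found under "properties" is a supported type."""
    errors: List[SchemaError] = []
    if not isinstance(node, dict):
        return errors
    type_name = node.get("type")
    if type_name is not None and (
        not isinstance(type_name, str) or type_name not in SUPPORTED_TYPES
    ):
        errors.append(SchemaError(f"{path}.type", "unsupported type", type_name))
    properties = node.get("properties")
    if isinstance(properties, dict):
        for key, child in properties.items():
            errors.extend(check_types(child, f"{path}.properties.{key}"))
    return errors


def validate(text: str) -> List[SchemaError]:
    """Run the JSON check first, then the type check."""
    try:
        schema = json.loads(text)
    except json.JSONDecodeError as exc:
        return [SchemaError("$", "schema is not valid JSON", underlying=str(exc))]
    return check_types(schema)
```

With this sketch, a schema that misspells a type (for example "integr") yields a single error whose path points at the offending "type" field and whose invalid value is the misspelled name. A file that is not valid JSON yields a single error carrying the parser's message as the underlying error.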

JSON schema

A JSON schema file is provided for type checking. You can find it here.