Skip to content

Thread pool scheduler for parallelizable tasks with dependencies

License

Notifications You must be signed in to change notification settings

leoll2/ParallelizationManager

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

47 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ParallelizationManager

ParallelizationManager helps you writing parallelizable programs and execute their operations on multiple threads concurrently. In other words, ParallelizationManager is a thread pool scheduler for parallelizable tasks whose dependencies can be modeled as a Directed Acyclic Graph (DAG).

Table of Contents

Install

Installation is not required at all. Simply clone this repository, compile the source files and include the header parallelizer.hpp in your application.

The repo can be downloaded with:

git clone https://www.github.com/leoll2/ParallelizationManager.git

and compiled with:

make

Nomenclature

These terms will be extensively used in the following description. Here is a clarification about their meaning in this context.

  • Activity: a self-contained sequence of instructions (that is, C++ code) which is to be executed by one thread.
  • Task: a collection of activities, arranged as a DAG by precedence; independent activities can be executed in parallel.
  • Manager: entity which decides the next activity to execute, according to a scheduling policy, and assigns it to an available worker.
  • Worker: a thread which can execute one activity at a time.

Usage

Activity

The body of an activity is declared according to the following syntax, where <Name> is the indeed the name of the activity, and <Code> contains the actual instructions to be executed.

ACTIVITY(<Name>)
{
	<Code>
}

The activity itself is then declared as an object, like this:

<Name> <ObjName>(<Arg>[,<Endpoint>]);

where <Name> corresponds to the code declaration, <ObjName> identifies this particular instance of the activity, <Arg> is a void pointer to directly pass an argument, and <Endpoint> is optional and used to mark the last activity of the task.

Arguments

If an activity needs arguments to execute, these can be retrieved with:

GET_ARG(<Parameter>, <Type>, <Port>)

where <Parameter> is the local variable to store the argument, <Type> is the data type and <Port> is the index of this argument among all those passed to the activity (must be the same specified in link_ret_to_arg().

There is also a shortcut to declare the local variable and retrieve the argument at once:

DECL_AND_GET_ARG(<Parameter>, <Type>, <Port>)

Example

For instance, this is what the activity of adding two integers looks like:

ACTIVITY(Sum)
{
	DECL_AND_GET_ARG(a, int, 1)
	DECL_AND_GET_ARG(b, int, 2)

	int res = a + b;
	
	RETURN(res)
}

Task

A Task object is declared like this:

Task <Name>(<RetValPointer>);

where <Name> is the task name, and <RetvalPointer> shall be a void pointer to the location where the final result will be allocated. A Task is initially empty when created; activities (previously declared) can be added with the member function add_activity:

<Task>.add_activity(<Activity>);

Similarly, dependencies between activities within a task can be specified with add_dependency:

<Task>.add_dependency(<Act_1>, <Act_2>); which means that activity Act_1 must be completed before Act_2 can start.

Usually, an activity depends from another because it needs to operate on data that is generated by the latter. For this purpose, the function link_ret_to_arg can be used to pipe the return value of an activity into the parameters list of each of its dependants.

<Task>.link_ret_to_arg(<Act_1>, <Act_2>, <Port>); with <Port> being the position in the parameters list where to put the funneled value. The very same port index is later used to retrieve the argument (see above).

When a task is finished, the final result can be obtained through the following macro:

RETRIEVE_RESULT(<Res>, <RetValPointer>, <ResType>) where <Res> is a variable of type <ResType> to store the result after casting <RetValPointer>.

Manager

A thread pool manager is instantiated as follows:

PManager m(<PoolSize>);

where <PoolSize> is the number of allocated worker threads.

Tasks can be assigned to the manager with add_task:

<Manager>.add_task(<Task>);

After adding tasks, the manager can be started:

<Manager>.run();

Scheduling

In order to understand the scheduling policy adopted by the manager, it's easier to start considering the case of a single task.

Single task

The manager can only execute tasks whose dependencies have already been solved (i.e., the corresponding activities have been completed). When multiple activities are ready for execution, the manager prioritizes the one which is directly required by the largest number of other activities. In other words, if 3 activities depend from A, and only 2 depend from B, the manager schedules A first. Ready activities are kept in a priority queue, and the priority is indeed the number of dependant activities.

Multiple tasks

When multiple tasks are available, their ready queues become part of a multi-level FCFS queue. The manager tries to schedule an activity from the ready queue of the first task; if no activity is available, it attempts to schedule one from the next queue, and so on. Still, within every queue, the previously described policy remains valid.

Examples

The folder test contains few working examples. After compiling with make, they can be executed individually with

bin/test<X>

where <X> is the index of the test. Otherwise, you can launch them all, one after the other, with a simple shell script:

./start.sh

Notes

  • In the Makefile, add the flag DDEBUG if you want the manager to be verbose during its activity.
  • If your program is aborted due to a memory error, it is likely that you overlooked something while defining dependencies and arguments. Please double check your code first, then feel free to open an issue request if the problem persists.

About

Thread pool scheduler for parallelizable tasks with dependencies

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published