* Channel README * Blueprint README * Architect documents * Small edit to provider README * catastrophic
Showing 13 changed files with 114 additions and 54 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
# Mephisto | ||
This is the main package directory, containing all of the core workings of Mephisto. The breakdown is as follows: | ||
|
||
- `client`: Contains interfaces for using Mephisto at a very high level. Primarily comprised of the python code for the cli and | ||
- `core`: Contains components that operate on top of the data_model layer | ||
- `data_model`: Contains the data model components as described in the architecture document, as well as the base classes for all the core abstractions. | ||
- `providers`: Contains implementations of the `CrowdProvider` abstraction | ||
- `scripts`: Contains commonly executed convenience scripts for Mephisto users | ||
- `server`: Contains implementations of the `Architect` and `Blueprint` abstractions. | ||
- `tasks`: An empty default directory to work on your own tasks | ||
- `utils`: Unorganized utility classes that are useful in scripts and other places | ||
- `webapp`: Contains the frontend that is deployed by the main client | ||
|
||
## Discussions | ||
|
||
Changes to this structure for clarity are being discussed in [#285](https://github.com/facebookresearch/Mephisto/issues/285). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# server | ||
This folder contains the abstractions for Architects and Blueprints. There are three main subfolders: | ||
|
||
- `architects`: This folder has implementations of the `Architect` abstraction. | ||
- `blueprints`: This folder has implementations of the `Blueprint` abstraction, as well as all related helper classes. | ||
- `channels`: This folder contains implementations of the `Channel` abstraction, which is required by an `Architect` to communicate with Mephisto. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,42 @@ | ||
# TODO fill in this readme | ||
Details are in this PR right now: https://github.com/facebookresearch/Mephisto/pull/68 | ||
# architects | ||
This folder contains all of the current official `Architect` implementations. | ||
|
||
`Architect`s contain the logic for deploying a server that workers will be able to access. In many cases Mephisto is run on compute clusters that aren't directly addressable, or in configurations that differ between collaborators. Mephisto should be able to run a task from start to finish regardless of the server configuration a user would like to use, and `Architect`s provide this capability. | ||
|
||
|
||
# Architect | ||
The `Architect` class is responsible for providing Mephisto with lifecycle functions for preparing, deploying, and shutting down a given server. It is also responsible for providing access to the user via a `Channel`, which defines an interface of callbacks for incoming messages and a function for outgoing messages. It should define the following things in order to operate properly: | ||
|
||
- `ArgsClass`: A dataclass that implements `ArchitectArgs`, specifying all of the configuration arguments that the `Architect` uses in order to properly initialize. | ||
- `get_channels`: A method that returns a list of initialized `Channel`s that the supervisor will need to communicate with to manage running a task. It returns a list to handle cases where an `Architect` is communicating with multiple servers in a distributed setup. | ||
- `prepare`: Prepare any files that will be used in the deploy process. Should return the location of the prepared server files. | ||
- `deploy`: Launch the server (if necessary) and deploy the prepared task files such that the server will be able to serve them. Return the server URL for this task, such that Mephisto can point workers to it. | ||
- `cleanup`: Clean up any files that were used in the deploy process and aren't needed later. | ||
- `shutdown`: Shut down the server (if necessary), or otherwise take offline the task URL that is expected to point to this Mephisto task run. | ||
- `download_file`: Save the file that is stored on the server with a given filename to the local save directory provided. Only required by `Architect`s that aren't able to pass a file through the `Channel` directly. | ||
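
The list above can be read as an interface. The following is a minimal sketch of that shape, not the actual Mephisto base class: the names `SketchArchitectArgs`, `SketchArchitect`, and `InMemoryArchitect`, the `port` field, the build path, and every method signature are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, List


@dataclass
class SketchArchitectArgs:
    """Hypothetical stand-in for an `ArgsClass`; the `port` field is illustrative."""

    port: int = 3000


class SketchArchitect(ABC):
    """Illustrative interface only; signatures are assumptions, not Mephisto's API."""

    ArgsClass = SketchArchitectArgs

    def __init__(self, args: SketchArchitectArgs) -> None:
        self.args = args

    @abstractmethod
    def get_channels(self) -> List[Any]:
        """Return the initialized channels used to talk to the server(s)."""

    @abstractmethod
    def prepare(self) -> str:
        """Collect server files and return their location."""

    @abstractmethod
    def deploy(self) -> str:
        """Launch the server if needed and return the task URL."""

    @abstractmethod
    def cleanup(self) -> None:
        """Remove deploy-time files that are not needed later."""

    @abstractmethod
    def shutdown(self) -> None:
        """Take the server, or at least the task URL, offline."""

    def download_file(self, filename: str, save_dir: str) -> None:
        # Optional: only needed when files cannot travel through the channel.
        raise NotImplementedError("This architect passes files through its channel")


class InMemoryArchitect(SketchArchitect):
    """Toy implementation that tracks state instead of running a real server."""

    def __init__(self, args: SketchArchitectArgs) -> None:
        super().__init__(args)
        self.running = False

    def get_channels(self) -> List[Any]:
        return []  # a real architect would return initialized channels here

    def prepare(self) -> str:
        return "/tmp/sketch_build"  # hypothetical build directory

    def deploy(self) -> str:
        self.running = True
        return f"http://localhost:{self.args.port}"

    def cleanup(self) -> None:
        pass  # nothing was written to disk in this sketch

    def shutdown(self) -> None:
        self.running = False
```

Keeping `download_file` as a non-abstract method mirrors the note that only some `Architect`s need it.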
|
||
## Lifecycle | ||
|
||
During initialization, Mephisto calls `assert_task_args` and expects the `ArchitectClass` to be able to pass through any arguments specified by the `ArgsClass`, raising an exception if there are any configuration problems. After this point, Mephisto will initialize the `Architect` with the validated config. | ||
|
||
Initially Mephisto will call `prepare` to give the `Architect` a chance to collect any relevant files required to run the server. It will give the `Blueprint` a chance to add additional files to this folder before the deploy. | ||
|
||
Next, Mephisto will call `deploy` and then `get_channels`. This should ensure that there is an external server, and that Mephisto has a way to communicate with it through a `Channel`. Only once both conditions are met will it publish tasks to the crowd provider. | ||
|
||
Once the task is done, or if it is cancelled or an error occurs, Mephisto will call `shutdown`, which is the signal for the `Architect` to clean up both local resources and remote resources related to this task. | ||
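
The ordering described in this section can be sketched as a small driver. This is an illustration under stated assumptions rather than Mephisto's own run loop: `RecordingArchitect`, `run_task`, and the `add_blueprint_files` and `publish` callables are hypothetical names, and the argument-validation step is omitted.

```python
from typing import Any, Callable, List


class RecordingArchitect:
    """Hypothetical architect that only records which lifecycle calls it received."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def prepare(self) -> str:
        self.calls.append("prepare")
        return "build_dir"  # illustrative location of the prepared files

    def deploy(self) -> str:
        self.calls.append("deploy")
        return "http://localhost:3000"  # illustrative task URL

    def get_channels(self) -> List[Any]:
        self.calls.append("get_channels")
        return []

    def shutdown(self) -> None:
        self.calls.append("shutdown")


def run_task(
    architect: RecordingArchitect,
    add_blueprint_files: Callable[[str], None],
    publish: Callable[[str, List[Any]], None],
) -> None:
    """Drive one task through prepare, deploy, get_channels, and shutdown."""
    try:
        build_dir = architect.prepare()
        add_blueprint_files(build_dir)  # the blueprint may add files before deploy
        url = architect.deploy()
        channels = architect.get_channels()
        # Tasks are published only after a server and its channels exist.
        publish(url, channels)
    finally:
        # Runs on completion, cancellation, or error alike.
        architect.shutdown()
```

The `try`/`finally` is the design point: `shutdown` is reached whether the task finishes or an exception interrupts it.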
|
||
# Implementations | ||
## LocalArchitect | ||
The `LocalArchitect` implementation works by running a `node` server on the local machine at the given port in a background process. It communicates over websockets with the `WebsocketChannel`, and requires a directory in which node is actively running. The particular node server is the baseline `router` implementation available in the `router/deploy` folder. | ||
|
||
## HerokuArchitect | ||
The `HerokuArchitect` implementation works by getting access to the `heroku` cli, preparing a directory of what to deploy on that server, and then pushing it along. It communicates over the `WebsocketChannel`. This also relies on the node server implementation available in the `router/deploy` folder. | ||
|
||
## MockArchitect | ||
The `MockArchitect` is an `Architect` used primarily for testing. To test the Mephisto lifecycle, you can set `should_run_server=False`, which just makes the lifecycle functions record whether they've been called. Setting `should_run_server=True` can be used to automatically test certain flows, as it launches a Tornado server through which every packet and action can be scripted. | ||
|
||
# Discussions | ||
|
||
Currently, the abstraction around `prepare` and `deploy` should be a little more rigid, defining the kinds of files that tasks should be able to deploy, where to expect to find them, etc. At the moment this API is somewhat unclear, and while this is okay with the current set of `Architect`s, future ones may not be as clear on this capability. | ||
|
||
It's unclear whether `cleanup` should be called immediately once the server is deployed (freeing space) or only after a task has been fully reviewed and archived following the review flow. The deciding factor could be whether the `Blueprint` is registered to use the review flow at all. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
# channels | ||
This folder contains the base `Channel` class, and some implementations (well, one for now). | ||
|
||
# Channel | ||
The `Channel` class acts as the abstraction layer that allows servers with different configurations to still communicate with Mephisto. It is an interface that the `Supervisor` class knows how to communicate with, and different `Architect`s may need specialized methods to surface Mephisto events to the user. `Architect` classes should be able to return a `Channel` to the `Supervisor` that will be used to communicate with any users that connect to Mephisto via that architect. | ||
|
||
The Channel class has five primary methods: | ||
- `open`: Does whatever is necessary to connect the local python `Channel` with the remote server to communicate. | ||
- `is_alive`: Should return whether or not the currently open connection is alive. Should remain `False` until after an `open` call successfully connects with the server. | ||
- `close`: Close the `Channel`, and ensure that all threads and resources used for it have been cleaned up properly. | ||
- `is_closed`: Should return `True` only if `close` has been called for this channel, or the channel has closed itself due to an error. | ||
- `send`: Is given a `Packet` object, and should pass it along to the intended recipient and return `True`. If there's a transient error, should return `False`. If there's a serious error, it should do something to set `is_alive` to `False` and also return `False`. | ||
|
||
It also takes in three callback methods: | ||
- `on_channel_open`: Should be called when the channel is first alive, telling the Supervisor that it's okay to send a registration message. | ||
- `on_catastrophic_disconnect`: Should be called when the `Channel` believes that it is no longer able to communicate with the server. | ||
- `on_message`: Should parse incoming messages from the server into `Packet` objects, and then call this callback with that object. | ||
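
As a rough illustration of the five methods and three callbacks, here is a minimal sketch with a toy in-memory implementation. It is not the Mephisto base class: `SketchChannel`, `LoopbackChannel`, the dict-based `Packet` alias, and the callback signatures (no channel identifier is passed) are all assumptions.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict

Packet = Dict[str, Any]  # stand-in for the real `Packet` class


class SketchChannel(ABC):
    """Illustrative interface; callback signatures are assumptions."""

    def __init__(
        self,
        on_channel_open: Callable[[], None],
        on_catastrophic_disconnect: Callable[[], None],
        on_message: Callable[[Packet], None],
    ) -> None:
        self.on_channel_open = on_channel_open
        self.on_catastrophic_disconnect = on_catastrophic_disconnect
        self.on_message = on_message

    @abstractmethod
    def open(self) -> None: ...

    @abstractmethod
    def is_alive(self) -> bool: ...

    @abstractmethod
    def close(self) -> None: ...

    @abstractmethod
    def is_closed(self) -> bool: ...

    @abstractmethod
    def send(self, packet: Packet) -> bool: ...


class LoopbackChannel(SketchChannel):
    """Toy channel whose 'server' echoes every sent packet straight back."""

    def __init__(self, on_channel_open, on_catastrophic_disconnect, on_message):
        super().__init__(on_channel_open, on_catastrophic_disconnect, on_message)
        self._alive = False  # stays False until `open` succeeds
        self._closed = False

    def open(self) -> None:
        self._alive = True
        self.on_channel_open()  # tell the caller it may now register

    def is_alive(self) -> bool:
        return self._alive

    def close(self) -> None:
        self._alive = False
        self._closed = True

    def is_closed(self) -> bool:
        return self._closed

    def send(self, packet: Packet) -> bool:
        if not self._alive:
            return False  # nothing to send on; the caller may retry later
        self.on_message(packet)  # the echo 'server' replies immediately
        return True
```

For example, constructing `LoopbackChannel` with three list-appending lambdas, calling `open`, and then `send({"x": 1})` delivers the same dict back through `on_message`.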
|
||
## Lifecycle | ||
|
||
The basic lifecycle of a `Channel` surfaces when a `Supervisor` is given a job with `register_job`. It reaches out to the `Architect` to get a method to connect to the users. After calling `open` it waits until the first `is_alive` call returns `True`, then begins running. The `Channel` should also call `on_channel_open` the first time it's certain that the | ||
|
||
While the server is running, the `Channel` should take incoming messages from the server, convert them to the `Packet` format, and pass them along using `on_message`. It should take outgoing `Packet`s passed to `send` and forward them to the server. | ||
|
||
Once the task completes its run, or if it's interrupted or a serious error occurs, Mephisto will call `close` on all channels. At this point the `Channel` should clean up resources. | ||
|
||
## Retriability + Failure handling | ||
|
||
Ultimately, if the `Channel` that Mephisto is communicating over dies, the Mephisto process needs to suspend the current run, shut down, and clean up. As such, we leave it up to the `Channel` implementation to determine whether the connection is still stable enough to keep running. Together, `is_alive`, the retriability of `send`, and `on_catastrophic_disconnect` give a `Channel` full freedom to signal to Mephisto that a task is no longer salvageable. | ||
|
||
Another way to visualize the flow for this would be to try to `send` a message, and upon failure, launch a thread that tries to fix the issue, then return `False` for the `send` call. Mephisto will wait for `is_alive` to be true before retrying. If the `Channel` succeeds in re-establishing a connection, then the retries will eventually go through. If the `Channel` believes that something is *really* wrong, it should call `on_catastrophic_disconnect`. | ||
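
The caller's side of this flow might look like the sketch below. It is only an illustration of the retry contract described here, not how Mephisto implements it: `send_with_retry`, its parameters, and the `FlakyChannel` test double are hypothetical.

```python
import time
from typing import Any, List


def send_with_retry(
    channel: Any,
    packet: Any,
    max_attempts: int = 5,
    poll_interval: float = 0.01,
    alive_timeout: float = 1.0,
) -> bool:
    """Retry `send` after transient failures, waiting for `is_alive` in between."""
    for _ in range(max_attempts):
        deadline = time.monotonic() + alive_timeout
        # Wait for the channel to report a usable connection before sending.
        while not channel.is_alive():
            if channel.is_closed() or time.monotonic() >= deadline:
                return False  # closed for good, or it never recovered in time
            time.sleep(poll_interval)
        if channel.is_closed():
            return False
        if channel.send(packet):
            return True
    return False


class FlakyChannel:
    """Hypothetical channel whose first `failures` sends fail transiently."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.sent: List[Any] = []
        self._closed = False

    def is_alive(self) -> bool:
        return not self._closed

    def is_closed(self) -> bool:
        return self._closed

    def close(self) -> None:
        self._closed = True

    def send(self, packet: Any) -> bool:
        if self.failures > 0:
            self.failures -= 1
            return False  # transient error: the caller is expected to retry
        self.sent.append(packet)
        return True
```

A channel that decides the connection is beyond repair would call `on_catastrophic_disconnect` instead of waiting for retries; that path is left out of this sketch.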
|
||
# WebsocketChannel | ||
|
||
The `WebsocketChannel` is a `Channel` implementation that relies on a websocket app to handle incoming and outgoing messages. This application is run inside a thread and cleaned up on closure. It is usable on all systems that can maintain stable websocket-based connections, and with all `Architect`s that can communicate over a socket.