Merge pull request #34 from fomorians/2.0
Pyoneer w/ 2.0 support
jimfleming committed Jun 27, 2019
2 parents 86e980c + 38959e6 commit ac2a00e
Showing 73 changed files with 1,698 additions and 1,522 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,6 +1,6 @@
-/pyoneer.egg-info/
 /build/
 /dist/
+/*.egg-info/
 *.pyc
 *.swp
 .DS_Store
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -3,4 +3,4 @@ repos:
rev: master
hooks:
- id: black
language_version: python3.6
language_version: python3.6
5 changes: 3 additions & 2 deletions Pipfile
@@ -5,12 +5,13 @@ name = "pypi"

[packages]
gym = "*"
tensorflow = "*"
tensorflow-probability = "*"
tensorflow = "==2.0.0-beta1"
tfp-nightly = "*"

[requires]
python_version = "3.6.5"

[dev-packages]
ipdb = "*"
pre-commit = "*"
twine = "*"
94 changes: 50 additions & 44 deletions README.md
@@ -9,34 +9,44 @@ For the top-level utilities, import like so:
import pyoneer as pynr
pynr.math.rescale(...)

For the large sub-modules, such as reinforcement learning, we recommend:
For the larger sub-modules, such as reinforcement learning, we recommend:

import pyoneer.rl as pyrl
pyrl.losses.policy_gradient_loss(...)
loss_fn = pyrl.losses.PolicyGradient(...)

In general the API tries to adhere to TensorFlow 2.0's API.
In general, the Pyoneer API tries to adhere to the TensorFlow 2.0 API.

### Examples

- [Eager Proximal Policy Optimization](https://github.com/fomorians/ppo)
- [TF 2.0 Proximal Policy Optimization](https://github.com/fomorians/ppo)

## API

### Activations ([`pynr.activations`](pyoneer/activations))

- `pynr.activations.swish`

### Debugging ([`pynr.debugging`](pyoneer/debugging))

- `pynr.debugging.Stopwatch`

### Distributions ([`pynr.distributions`](pyoneer/distributions))

- `pynr.distributions.MultiCategorical`

### Initializers ([`pynr.initializers`](pyoneer/initializers))

- `pynr.initializers.SoftplusInverse`

### Layers ([`pynr.layers`](pyoneer/layers))

- `pynr.layers.Normalizer`
- `pynr.layers.Swish`
- `pynr.layers.OneHotEncoder`
- `pynr.layers.AngleEncoder`
- `pynr.layers.DictFeaturizer`
- `pynr.layers.ListFeaturizer`
- `pynr.layers.VecFeaturizer`

### Tensor Manipulation ([`pynr.manip`](pyoneer/manip))

- `pynr.manip.flatten`
- `pynr.manip.batched_index`
- `pynr.manip.pad_or_truncate`
- `pynr.manip.shift`
@@ -62,80 +72,76 @@ In general the API tries to adhere to TensorFlow 2.0's API.
- `pynr.metrics.MAPE`
- `pynr.metrics.SMAPE`

### Neural Networks ([`pynr.nn`](pyoneer/nn))
### Moments ([`pynr.moments`](pyoneer/moments))

- `pynr.nn.swish`
- `pynr.nn.moments_from_range`
- `pynr.nn.StreamingMoments`
- `pynr.nn.ExponentialMovingMoments`
- `pynr.moments.range_moments`
- `pynr.moments.StaticMoments`
- `pynr.moments.StreamingMoments`
- `pynr.moments.ExponentialMovingMoments`

### Reinforcement Learning ([`pynr.rl`](pyoneer/rl))
### Learning Rate Schedules ([`pynr.schedules`](pyoneer/schedules))

Utilities for reinforcement learning.
- `pynr.schedules.CyclicSchedule`

#### Environments ([`pynr.rl.envs`](pyoneer/rl/envs))
### Reinforcement Learning ([`pynr.rl`](pyoneer/rl))

- `pynr.rl.envs.BatchEnv`
- `pynr.rl.envs.ProcessEnv`
Utilities for reinforcement learning.

#### Losses ([`pynr.rl.losses`](pyoneer/rl/losses))

- `pynr.rl.losses.policy_gradient_loss`
- `pynr.rl.losses.clipped_policy_gradient_loss`
- `pynr.rl.losses.policy_gradient`
- `pynr.rl.losses.policy_entropy`
- `pynr.rl.losses.clipped_policy_gradient`
- `pynr.rl.losses.PolicyGradient`
- `pynr.rl.losses.PolicyEntropy`
- `pynr.rl.losses.ClippedPolicyGradient`

#### Targets ([`pynr.rl.targets`](pyoneer/rl/targets))

- `pynr.rl.targets.discounted_rewards`
- `pynr.rl.targets.generalized_advantages`
- `pynr.rl.targets.DiscountedReturns`
- `pynr.rl.targets.GeneralizedAdvantages`

#### Strategies ([`pynr.rl.strategies`](pyoneer/rl/strategies))

- `pynr.rl.strategies.EpsilonGreedyStrategy`
- `pynr.rl.strategies.ModeStrategy`
- `pynr.rl.strategies.SampleStrategy`
- `pynr.rl.strategies.EpsilonGreedy`
- `pynr.rl.strategies.Mode`
- `pynr.rl.strategies.Sample`

### Training ([`pynr.training`](pyoneer/training))
#### Wrappers ([`pynr.rl.wrappers`](pyoneer/rl/wrappers))

- `pynr.training.CyclicSchedule`
- `pynr.training.update_target_variables`
- `pynr.rl.wrappers.ObservationCoordinates`
- `pynr.rl.wrappers.ObservationNormalization`
- `pynr.rl.wrappers.Batch`
- `pynr.rl.wrappers.BatchProcess`
- `pynr.rl.wrappers.Process`

## Installation

There are a few options of installing:

1. Install with `pipenv`:

pipenv install pyoneer
There are a few options for installation:

2. Install with `pip`:
1. (Recommended) Install with `pipenv`:

pip install pyoneer
pipenv install fomoro-pyoneer

3. Install locally for development with `pipenv`:
2. Install locally for development with `pipenv`:

git clone https://github.com/fomorians/pyoneer.git
cd pyoneer
pipenv install
pipenv shell

4. Install locally for development with `pip`:

git clone https://github.com/fomorians/pyoneer.git
cd pyoneer
pip install -e .

## Testing

There are a few options for testing:

1. Run all tests:

python -m unittest discover -p '*_test.py'
python -m unittest discover -bfp '*_test.py'

2. Run specific tests:

python -m pyoneer.math.logical_ops_test

## Contributing

File an issue following the `ISSUE_TEMPLATE`, then submit a pull request from a branch describing the feature. This will eventually get merged into `master`.
File an issue following the `ISSUE_TEMPLATE`. If the issue discussion warrants implementation, then submit a pull request from a branch describing the feature. This will eventually get merged into `master` after a few rounds of code review.
21 changes: 19 additions & 2 deletions pyoneer/__init__.py
@@ -2,11 +2,28 @@
from __future__ import division
from __future__ import print_function

from pyoneer import activations
from pyoneer import debugging
from pyoneer import distributions
from pyoneer import initializers
from pyoneer import layers
from pyoneer import manip
from pyoneer import math
from pyoneer import metrics
from pyoneer import nn
from pyoneer import moments
from pyoneer import rl
from pyoneer import training
from pyoneer import schedules

__all__ = [
"activations",
"debugging",
"distributions",
"initializers",
"layers",
"manip",
"math",
"metrics",
"moments",
"rl",
"schedules",
]
Original file line number Diff line number Diff line change
@@ -2,5 +2,6 @@
from __future__ import division
from __future__ import print_function

from pyoneer.rl.envs.batch_env_impl import BatchEnv
from pyoneer.rl.envs.process_env_impl import ProcessEnv
from pyoneer.activations.activations_impl import swish

__all__ = ["swish"]
Original file line number Diff line number Diff line change
@@ -7,7 +7,7 @@

 def swish(x):
     """
-    Compute the Swish, self-gating, activation function: `x * sigmoid(x)`.
+    Compute Swish, self-gating, activation function: `x * sigmoid(x)`.
     Args:
         x: Tensor
@@ -16,5 +16,5 @@ def swish(x):
         Tensor of same dimension as `x`.
     """
     y = x * tf.sigmoid(x)
-    y = tf.check_numerics(y, "swish")
+    y = tf.debugging.check_numerics(y, "swish")
     return y
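For readers without TensorFlow at hand, the formula in the docstring, `x * sigmoid(x)`, can be checked with a minimal scalar sketch. This is not the library's implementation: it uses only the standard library, the helper name `swish_scalar` is hypothetical, and it omits the `check_numerics` guard.

```python
import math


def swish_scalar(x: float) -> float:
    """Scalar Swish, x * sigmoid(x). Hypothetical helper, not part of pyoneer."""
    sigmoid = 1.0 / (1.0 + math.exp(-x))
    return x * sigmoid


# Evaluate at the same inputs used by the test file in this commit.
values = [swish_scalar(x) for x in (-1.0, 0.0, 1.0)]
print([round(v, 6) for v in values])
```

The rounded results should agree with the expected constants in `activations_test.py` to within the tolerance of `assertAllClose`.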
19 changes: 19 additions & 0 deletions pyoneer/activations/activations_test.py
@@ -0,0 +1,19 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

from pyoneer.activations.activations_impl import swish


class ActivationsTest(tf.test.TestCase):
    def test_swish(self):
        x = tf.constant([-1.0, 0.0, +1.0])
        actual = swish(x)
        expected = tf.constant([-0.268941, 0.0, 0.731059])
        self.assertAllClose(actual, expected)


if __name__ == "__main__":
    tf.test.main()
7 changes: 7 additions & 0 deletions pyoneer/debugging/__init__.py
@@ -0,0 +1,7 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from pyoneer.debugging.debugging_impl import Stopwatch

__all__ = ["Stopwatch"]
38 changes: 38 additions & 0 deletions pyoneer/debugging/debugging_impl.py
@@ -0,0 +1,38 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf


class Stopwatch(object):
    """
    Stopwatch for measuring how long operations take. Great for fast and easy profiling.

    Example:
        >>> x = tf.constant(1.0)
        >>> y = tf.constant(2.0)
        >>> with Stopwatch() as watch:
        >>>     z = x + y
        >>> tf.print(watch.duration)
        >>> # 0.00021505355834960938
    """

    def __init__(self):
        self.start_time = None
        self.end_time = None
        self.duration = None

    def start(self):
        self.start_time = tf.timestamp()

    def stop(self):
        self.end_time = tf.timestamp()
        self.duration = self.end_time - self.start_time

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.stop()
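The same context-manager pattern can be sketched without TensorFlow. The version below is an assumption-laden analogue, not the class added in this commit: it swaps `tf.timestamp()` for `time.perf_counter()`, so `duration` is a Python float rather than a tensor, and the class name `WallClockStopwatch` is hypothetical.

```python
import time


class WallClockStopwatch:
    """Hypothetical plain-Python analogue of the Stopwatch in this diff."""

    def __init__(self):
        self.start_time = None
        self.end_time = None
        self.duration = None

    def start(self):
        self.start_time = time.perf_counter()

    def stop(self):
        self.end_time = time.perf_counter()
        self.duration = self.end_time - self.start_time

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Returning None (falsy) lets any exception from the block propagate.
        self.stop()


with WallClockStopwatch() as watch:
    total = sum(range(100_000))

print(f"elapsed: {watch.duration:.6f}s")
```

Because `__exit__` always calls `stop()`, the duration is recorded even when the timed block raises.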
20 changes: 20 additions & 0 deletions pyoneer/debugging/debugging_test.py
@@ -0,0 +1,20 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

from pyoneer.debugging.debugging_impl import Stopwatch


class DebuggingTest(tf.test.TestCase):
    def test_stopwatch(self):
        with Stopwatch() as stopwatch:
            pass
        self.assertIsNotNone(stopwatch.start_time)
        self.assertIsNotNone(stopwatch.end_time)
        self.assertIsNotNone(stopwatch.duration)


if __name__ == "__main__":
    tf.test.main()
7 changes: 7 additions & 0 deletions pyoneer/distributions/__init__.py
@@ -0,0 +1,7 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from pyoneer.distributions.distributions_impl import MultiCategorical

__all__ = ["MultiCategorical"]
35 changes: 35 additions & 0 deletions pyoneer/distributions/distributions_impl.py
@@ -0,0 +1,35 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf


class MultiCategorical(object):
    """
    Distribution composed of multiple distributions.

    Useful for representing `gym.spaces.MultiDiscrete`.

    Args:
        distributions: list of distributions.
    """

    def __init__(self, distributions):
        self.distributions = distributions

    def log_prob(self, value):
        values = tf.split(value, len(self.distributions), axis=-1)
        log_probs = [
            dist.log_prob(val[..., 0]) for dist, val in zip(self.distributions, values)
        ]
        return tf.math.add_n(log_probs)

    def entropy(self):
        return tf.math.add_n([dist.entropy() for dist in self.distributions])

    def sample(self):
        return tf.stack([dist.sample() for dist in self.distributions], axis=-1)

    def mode(self):
        return tf.stack([dist.mode() for dist in self.distributions], axis=-1)
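The class above treats the components as independent, so the joint log-probability and entropy are sums over components, and the mode is taken per component. A minimal pure-Python sketch of that arithmetic follows; it assumes each component is a categorical given as a plain list of probabilities, and all function names are hypothetical rather than part of pyoneer or TensorFlow Probability.

```python
import math


def multi_categorical_log_prob(component_probs, value):
    """Sum of per-component log-probabilities (independence assumed)."""
    return sum(math.log(probs[v]) for probs, v in zip(component_probs, value))


def multi_categorical_entropy(component_probs):
    """Sum of per-component entropies, skipping zero-probability outcomes."""
    return sum(
        -sum(p * math.log(p) for p in probs if p > 0.0)
        for probs in component_probs
    )


def multi_categorical_mode(component_probs):
    """Most likely index of each component (first index wins ties)."""
    return [max(range(len(probs)), key=probs.__getitem__) for probs in component_probs]


# Two components, e.g. a MultiDiscrete space with sizes 2 and 3.
component_probs = [[0.5, 0.5], [0.1, 0.2, 0.7]]
value = [1, 2]

log_prob = multi_categorical_log_prob(component_probs, value)
entropy = multi_categorical_entropy(component_probs)
mode = multi_categorical_mode(component_probs)
print(log_prob, entropy, mode)
```

Here the joint probability of `value` is the product `0.5 * 0.7`, which is why the log-probabilities add.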