
minGPTF

Originally implemented in PyTorch by Andrej Karpathy: karpathy/minGPT.

A TensorFlow implementation of minGPT.

I also have another TensorFlow implementation of GPT-2; have a look at akanyaani/gpt-2-tensorflow2.0.

Setup

$ git clone https://github.com/akanyaani/minGPTF
$ cd minGPTF
$ python setup.py install

Usage

To generate text using GPT-2:

$ open generate.ipynb

Here's how you'd instantiate a GPT-2 (124M param version):

from mingptf.model import GPT

model_config = GPT.get_default_config()
model_config.model_type = 'gpt2'   # select the 124M-parameter GPT-2 configuration
model_config.vocab_size = 50257    # openai's model vocabulary
model_config.block_size = 1024     # openai's model block_size (i.e. input context length)
model = GPT(model_config)
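
Continuing from the snippet above, a greedy sampling loop might look like the following. This is a hypothetical sketch, not the code in generate.ipynb: it assumes calling the model on a (batch, sequence) tensor of token ids returns (batch, sequence, vocab) logits.

import tensorflow as tf

idx = tf.constant([[50256]])  # start from GPT-2's end-of-text token id
for _ in range(20):  # greedily sample 20 new tokens
    idx_cond = idx[:, -model_config.block_size:]      # crop to the context window
    logits = model(idx_cond)                          # assumed call signature
    next_id = tf.argmax(logits[:, -1, :], axis=-1, output_type=tf.int32)
    idx = tf.concat([idx, next_id[:, None]], axis=1)  # append and continue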

And here's how you'd train it:

from mingptf.model import GPT

model_config = GPT.get_default_config()
model_config.model_type = 'gpt-micro'
model_config.vocab_size = 50257
model_config.block_size = 128
model = GPT(model_config)

train_config = get_default_train_config()
train_config.learning_rate = 5e-4  # the model we're using is so small that we can go a bit faster
train_config.max_iters = 2000

model.configure_optimizers(train_config)
model.fit(train_data, test_data, test_freq=5)
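
The train_data and test_data above are left undefined. As a hypothetical sketch (the input format fit() expects is an assumption here), one way to build next-token (input, target) batches with tf.data:

import numpy as np
import tensorflow as tf

block_size = 128
tokens = np.random.randint(0, 50257, size=10_000)  # stand-in for a tokenized corpus

def make_dataset(ids, batch_size=32):
    ds = tf.data.Dataset.from_tensor_slices(ids)
    ds = ds.batch(block_size + 1, drop_remainder=True)  # windows of block_size + 1 tokens
    ds = ds.map(lambda w: (w[:-1], w[1:]))              # inputs and one-step-shifted targets
    return ds.shuffle(1_000).batch(batch_size).prefetch(tf.data.AUTOTUNE)

train_data = make_dataset(tokens[:8_000])
test_data = make_dataset(tokens[8_000:])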

TODO

1. TensorBoard logging.
2. Mixed precision training (see the sketch below).
3. Fine-tuning wrapper.
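
For item 2, a likely starting point is the standard Keras mixed-precision API; this is a sketch of that approach, not code from this repo:

import tensorflow as tf

# Compute in float16 while keeping variables in float32.
tf.keras.mixed_precision.set_global_policy('mixed_float16')

# Wrap the optimizer so the loss is scaled, avoiding float16 gradient underflow.
optimizer = tf.keras.mixed_precision.LossScaleOptimizer(tf.keras.optimizers.Adam(5e-4))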

References:

karpathy/minGPT

Contribution

  • Your issues and PRs are always welcome.

Author

akanyaani

License
