- Vectorization, one-hot encoding of vocabulary
- Vectorization, load word vectors from Gensim
- Design a basic working network with PyTorch nn module
- Build the forward function with biases and weights
- Use sparse matrices instead of dense matrices to combat memory issues
- Clean up the code, create a script for utility functions, and remove extra scripts
- Convert the two separate language and translation model scripts into single scripts with a parameter to specify the model type
- Contributed to the report
- Opening and reading text files, preprocessing text, and tokenization
- Building trigrams from vocabulary sentences
- Added options for GPU mode
- Added zero_grad before backpropagation
- Added log_softmax to help with vanishing gradients
- Loading data in batches
- Creating scripts for running training and testing in terminal
- Contributed to the report
- Splitting vocabulary into training and test data sets by percentage
- Vectorization, removing out-of-vocabulary words
- Vectorization, building trigrams of vectors
- Writing cross entropy functionality for calculating loss
- Building predict function in the neural network
- Creating testing script
- Breaking out training and testing initialization into separate files
- Saving models to files, with the possibility of resuming training if interrupted
- Trained and tested models for the report
- Implemented sanity checks for arguments
- Contributed to the report