iofu728/Model_retrieval

Model retrieval

A collection of machine-learning retrieval models.

VSM

VSM = Vector Space Model

This is a hand-written VSM retrieval implementation.

├── utils
│   └── utils.py         // public function
└── vsm
    ├── pre.sh           // data preprocessing shell
    └── vsm.py           // vsm py

VSM process:

  1. word alignment
  2. TF-IDF (smoothing, similarity)
  3. pairwise calculation, one pair at a time
  • VSM.vsmCalaulate()
    • Accounts for the bias introduced by smoothing
    • Each tuple (article1, article2) gets its own specific (tf-idf1, tf-idf2)
    • This approach is slow, even with two threading classes
  • VSM.vsmTest()
    • Ignores the bias introduced by smoothing
    • TF-IDF is computed once per article during preprocessing, instead of once per tuple (article1, article2)
    • This approach is much faster
    • A 3100×3100 dataset is computed in 215 s
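
The faster, per-article strategy can be sketched roughly as follows. This is a minimal illustration, not the code in `vsm.py`: the function names (`tfidf_vectors`, `cosine`, `pairwise_similarity`) are hypothetical, and the smoothed IDF formula `log((1 + N) / (1 + df)) + 1` is an assumption, since the exact smoothing used by the repository is not stated here.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Compute one sparse TF-IDF vector ({term: weight}) per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (assumed form, not necessarily the repository's).
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append({t: (c / total) * idf[t] for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pairwise_similarity(docs):
    """TF-IDF is computed once per article, then reused for every pair."""
    vecs = tfidf_vectors(docs)
    n = len(vecs)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sim[i][j] = sim[j][i] = cosine(vecs[i], vecs[j])
    return sim


if __name__ == "__main__":
    articles = [["model", "retrieval"], ["model", "retrieval"], ["spider"]]
    for row in pairwise_similarity(articles):
        print(["%.3f" % s for s in row])
```

Because the vectors are built once per article, the pairwise loop only performs sparse dot products, which is the likely reason this variant scales better than recomputing TF-IDF for every pair.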

SMN

SMN = Sequential Matching Network

Adapted, with some changes, from MarkWuNLP/MultiTurnResponseSelection.

.
├── NN
│   ├── CNN.py              // CNN functions
│   ├── Classifier.py       // classifier functions
│   ├── Optimization.py     // NN optimization functions
│   ├── RNN.py              // RNN functions
│   └── logistic_sgd.py     // SGD functions
├── SMN
│   ├── PreProcess.py       // preprocessing functions
│   ├── SMN_Last.py         // model functions
│   ├── SimAsImage.py       // CNN pooling & convolution
│   └── sampleConduct.py    // generates negative and positive samples
└── utils
    ├── constant.py         // constant parameters
    └── utils.py            // public functions

SMN process:

  1. word embedding
  2. GRU
  3. CNN
  4. GRU
  5. score
  • SMN.PreProcess.ParseMultiTurn(input_file)
    • converts samples into matrices
  • SMN.PreProcess.ParseMultiTurnTest(input_file)
    • converts test samples into matrices
  • SMN.sampleConduct.preWord2vec(input_file, out_file)
    • builds embeddings for the samples
  • SMN.sampleConduct.SampleConduct()
    • generates negative and positive samples
  • SMN.SMN_Last.run_model()
    • runs the SMN model
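
The five steps above can be illustrated with a small forward-pass sketch in NumPy. It follows the general shape of the published Sequential Matching Network, not the code in `SMN_Last.py`. Everything here is an assumption made for illustration: the function names (`gru`, `conv_pool`, `smn_score`, `init_params`), the random untrained weights, the omitted biases, the fixed utterance length, and the sigmoid used for the final score.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru(xs, Wz, Wr, Wh):
    """Minimal GRU without biases. xs: (T, d_in). Returns hidden states (T, d_h)."""
    h = np.zeros(Wz.shape[0])
    states = []
    for x in xs:
        xh = np.concatenate([x, h])
        z = sigmoid(Wz @ xh)                      # update gate
        r = sigmoid(Wr @ xh)                      # reset gate
        h_new = np.tanh(Wh @ np.concatenate([x, r * h]))
        h = (1 - z) * h + z * h_new
        states.append(h)
    return np.stack(states)


def conv_pool(img, kernels, pool=2):
    """Valid 2-D convolution + ReLU + max pooling. img: (C, H, W)."""
    f, _, k, _ = kernels.shape
    oh, ow = img.shape[1] - k + 1, img.shape[2] - k + 1
    out = np.zeros((f, oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = img[:, i:i + k, j:j + k]
            out[:, i, j] = np.tensordot(kernels, patch, axes=3)
    out = np.maximum(out, 0.0)
    ph, pw = oh // pool, ow // pool
    pooled = out[:, :ph * pool, :pw * pool]
    pooled = pooled.reshape(f, ph, pool, pw, pool).max(axis=(2, 4))
    return pooled.ravel()


def init_params(vocab=50, d=8, h=8, T=6, f=4, k=3, pool=2, q=10):
    """Random, untrained parameters (hypothetical sizes)."""
    def w(*shape):
        return rng.normal(0.0, 0.1, shape)
    side = (T - k + 1) // pool
    return {
        "emb": w(vocab, d),
        "gru1": [w(h, d + h) for _ in range(3)],
        "A": w(h, h),
        "kernels": w(f, 2, k, k),
        "W_match": w(q, f * side * side),
        "gru2": [w(h, q + h) for _ in range(3)],
        "w_out": w(h),
    }


def smn_score(context, response, p):
    """context: list of token-id arrays of length T; response: token-id array of length T."""
    r_emb = p["emb"][response]                    # 1. word embedding
    r_hid = gru(r_emb, *p["gru1"])                # 2. first GRU
    match_vectors = []
    for utt in context:
        u_emb = p["emb"][utt]
        u_hid = gru(u_emb, *p["gru1"])
        m1 = u_emb @ r_emb.T                      # word-level similarity matrix
        m2 = u_hid @ p["A"] @ r_hid.T             # sequence-level similarity matrix
        feat = conv_pool(np.stack([m1, m2]), p["kernels"])   # 3. CNN
        match_vectors.append(np.tanh(p["W_match"] @ feat))
    h_last = gru(np.stack(match_vectors), *p["gru2"])[-1]    # 4. second GRU
    return sigmoid(p["w_out"] @ h_last)           # 5. score in (0, 1)


if __name__ == "__main__":
    params = init_params()
    ctx = [np.array([1, 2, 3, 4, 5, 6]), np.array([7, 8, 9, 10, 11, 12])]
    resp = np.array([3, 4, 5, 6, 7, 8])
    print(float(smn_score(ctx, resp, params)))
```

With random weights the score carries no meaning; the sketch only shows how data flows through the stages and which tensor shapes each stage expects.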

Bert

LightGBM

DMN

bert_embedding

word2vec

wordNet

flask