Upgrade to Lucene 7.6 #514

Merged
lintool merged 26 commits into master from lucene7
Dec 18, 2018

lintool
Member

@lintool lintool commented Dec 16, 2018

After much fanfare, here's the upgrade to Lucene 7!
I'm running regressions one more time, but @Peilin-Yang you can start looking at the PR.

lintool and others added 20 commits September 18, 2018 21:14
Conflicts:
	docs/experiments-cw12.md
	docs/experiments-mb13.md
	src/main/resources/regression/cw12.yaml
	src/main/resources/regression/mb13.yaml
Conflicts:
	docs/experiments-car17.md
	docs/experiments-core17.md
	docs/experiments-cw09b.md
	docs/experiments-cw12.md
	docs/experiments-cw12b13.md
	docs/experiments-disk12.md
	docs/experiments-gov2.md
	docs/experiments-mb11.md
	docs/experiments-mb13.md
	docs/experiments-robust04.md
	docs/experiments-robust05.md
	docs/experiments-wt10g.md
	src/main/java/io/anserini/rerank/lib/AxiomReranker.java
	src/main/java/io/anserini/search/SearchCollection.java
	src/main/resources/regression/cacm.yaml
	src/main/resources/regression/car17.yaml
	src/main/resources/regression/core17.yaml
	src/main/resources/regression/cw09b.yaml
	src/main/resources/regression/cw12.yaml
	src/main/resources/regression/cw12b13.yaml
	src/main/resources/regression/disk12.yaml
	src/main/resources/regression/gov2.yaml
	src/main/resources/regression/mb11.yaml
	src/main/resources/regression/mb13.yaml
	src/main/resources/regression/robust04.yaml
	src/main/resources/regression/robust05.yaml
	src/main/resources/regression/wt10g.yaml
@lintool lintool mentioned this pull request Dec 16, 2018
new Term(WapoGenerator.WapoField.KICKER.name, "Letters to the Editor"),
new Term(WapoGenerator.WapoField.KICKER.name, "The Post's View")
Query filter = new TermInSetQuery(WapoGenerator.WapoField.KICKER.name, new BytesRef("Opinions"), new BytesRef("Letters to the Editor"), new BytesRef("The Post's View"));
// new Term(WapoGenerator.WapoField.KICKER.name, "Opinions"),
Collaborator

remove?

@Peilin-Yang
Collaborator

Thanks for the hard work!

@lintool
Member Author

lintool commented Dec 18, 2018

On damiano, I performed a final verification of regressions:

[o] nohup python src/main/python/run_regression.py --collection disk12 >& log.disk12 &
[o] nohup python src/main/python/run_regression.py --collection robust04 >& log.robust04 &
[o] nohup python src/main/python/run_regression.py --collection robust05 >& log.robust05 &
[o] nohup python src/main/python/run_regression.py --collection core17 >& log.core17 &
[o] nohup python src/main/python/run_regression.py --collection core18 >& log.core18 &

[o] nohup python src/main/python/run_regression.py --collection mb11 >& log.mb11 &
[o] nohup python src/main/python/run_regression.py --collection mb13 >& log.mb13 &

[o] nohup python src/main/python/run_regression.py --collection wt10g >& log.wt10g &
[o] nohup python src/main/python/run_regression.py --collection gov2 >& log.gov2 &
[o] nohup python src/main/python/run_regression.py --collection cw09b >& log.cw09b &
[o] nohup python src/main/python/run_regression.py --collection cw12b13 >& log.cw12b13 &
[o] nohup python src/main/python/run_regression.py --collection cw12 >& log.cw12 &

[o] nohup python src/main/python/run_regression.py --collection car17 >& log.car17 &

[o] nohup python src/main/python/jdiq2018_effectiveness/run_batch.py --collection disk12 >& jdiq2018.disk12.log &
[o] nohup python src/main/python/jdiq2018_effectiveness/run_batch.py --collection robust04 >& jdiq2018.robust04.log &
[o] nohup python src/main/python/jdiq2018_effectiveness/run_batch.py --collection robust05 >& jdiq2018.robust05.log &
[o] nohup python src/main/python/jdiq2018_effectiveness/run_batch.py --collection wt10g >& jdiq2018.wt10g.log &
[o] nohup python src/main/python/jdiq2018_effectiveness/run_batch.py --collection gov2 >& jdiq2018.gov2.log &
[o] nohup python src/main/python/jdiq2018_effectiveness/run_batch.py --collection cw09b --metrics map ndcg20 err20 >& jdiq2018.cw09b.log &
[o] nohup python src/main/python/jdiq2018_effectiveness/run_batch.py --collection cw12b13 --metrics map ndcg20 err20 >& jdiq2018.cw12b13.log &

[o] python src/main/python/run_regression.py --collection cacm --index

These indexes were constructed a little while ago:

[jimmylin@damiano Anserini]$ ls -l /scratch2/indexes/
total 764
drwxrwxr-x. 14 jimmylin jimmylin   4096 Sep 27 11:27 lucene6
drwxrwxr-x.  2 jimmylin jimmylin   4096 Oct  2 07:41 lucene-index.cacm.pos+docvectors
drwxrwxr-x.  2 jimmylin jimmylin   4096 Sep 26 00:33 lucene-index.car17.pos+docvectors+rawdocs
drwxrwxr-x.  2 jimmylin jimmylin   8192 Sep 26 07:24 lucene-index.core17.pos+docvectors+rawdocs
drwxrwxr-x.  2 jimmylin jimmylin   4096 Nov 30 15:38 lucene-index.core18.pos+docvectors+rawdocs
drwxrwxr-x.  2 jimmylin jimmylin  53248 Sep 25 21:08 lucene-index.cw09b.pos+docvectors+rawdocs
drwxrwxr-x.  2 jimmylin jimmylin  65536 Sep 25 23:20 lucene-index.cw12b13.pos+docvectors+rawdocs
drwxrwxr-x.  2 jimmylin jimmylin 278528 Sep 27 01:50 lucene-index.cw12.pos+docvectors+rawdocs
drwxrwxr-x.  2 jimmylin jimmylin   8192 Sep 26 07:09 lucene-index.disk12.pos+docvectors+rawdocs
drwxrwxr-x.  2 jimmylin jimmylin  28672 Sep 25 20:39 lucene-index.gov2.pos+docvectors+rawdocs
drwxrwxr-x.  2 jimmylin jimmylin  28672 Sep 26 07:26 lucene-index.mb11.pos+docvectors+rawdocs
drwxrwxr-x.  2 jimmylin jimmylin  12288 Sep 26 15:45 lucene-index.mb13.pos+docvectors+rawdocs
drwxrwxr-x.  2 jimmylin jimmylin   8192 Sep 26 07:09 lucene-index.robust04.pos+docvectors+rawdocs
drwxrwxr-x.  2 jimmylin jimmylin   8192 Sep 26 07:10 lucene-index.robust05.pos+docvectors+rawdocs
drwxrwxr-x.  2 jimmylin jimmylin  12288 Sep 25 18:37 lucene-index.wt10g.pos+docvectors+rawdocs

Furthermore, I reran regressions from scratch with new indexes:

[o] nohup python src/main/python/run_regression.py --collection disk12 --index >& log.disk12 &
[o] nohup python src/main/python/run_regression.py --collection robust04 --index >& log.robust04 &
[o] nohup python src/main/python/run_regression.py --collection robust05 --index >& log.robust05 &
[o] nohup python src/main/python/run_regression.py --collection core17 --index >& log.core17 &
[o] nohup python src/main/python/run_regression.py --collection core18 --index >& log.core18 &

[o] nohup python src/main/python/run_regression.py --collection mb11 --index >& log.mb11 &
[o] nohup python src/main/python/run_regression.py --collection mb13 --index >& log.mb13 &

[o] nohup python src/main/python/run_regression.py --collection wt10g --index >& log.wt10g &
[o] nohup python src/main/python/run_regression.py --collection gov2 --index >& log.gov2 &
[o] nohup python src/main/python/run_regression.py --collection cw09b --index >& log.cw09b &
[o] nohup python src/main/python/run_regression.py --collection cw12b13 --index >& log.cw12b13 &

[o] nohup python src/main/python/run_regression.py --collection car17 --index >& log.car17 &

Note: I haven't had time to build the full cw12 index yet.
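
For anyone repeating this, here is a minimal sketch of how the command lists above could be generated instead of typed by hand. Only `run_regression.py`, `--collection`, `--index`, and the `log.<collection>` naming come from the commands above; the helper names `regression_command` and `shell_line` are hypothetical and not part of Anserini. The collection list mirrors the from-scratch batch, so cw12 is left out.

```python
# Collections taken from the from-scratch run_regression.py commands above.
COLLECTIONS = [
    "disk12", "robust04", "robust05", "core17", "core18",
    "mb11", "mb13",
    "wt10g", "gov2", "cw09b", "cw12b13",
    "car17",
]


def regression_command(collection, index=False):
    """Return the argument list for one regression run (hypothetical helper)."""
    cmd = ["python", "src/main/python/run_regression.py", "--collection", collection]
    if index:
        # --index asks the script to rebuild the index before running.
        cmd.append("--index")
    return cmd


def shell_line(collection, index=False):
    """Render the backgrounded nohup line, logging to log.<collection>."""
    return "nohup {} >& log.{} &".format(
        " ".join(regression_command(collection, index)), collection
    )


if __name__ == "__main__":
    # Print one line per collection; paste into a csh/bash session to launch.
    for name in COLLECTIONS:
        print(shell_line(name, index=True))
```

Calling `shell_line(name)` without `index=True` gives the form used for the first verification pass against the existing indexes.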

Separately, @Peilin-Yang updated results for fine-tuning experiments.

@lintool lintool merged commit e71df7a into master Dec 18, 2018
@lintool lintool deleted the lucene7 branch December 18, 2018 12:45
crystina-z pushed a commit to crystina-z/anserini that referenced this pull request Oct 28, 2022