Add benchmark that measures cost of repeatedly opening the database.
ghemawat authored and cmumford committed Dec 11, 2014
1 parent 34ad72e commit 77948e7
Showing 1 changed file with 14 additions and 1 deletion.
15 changes: 14 additions & 1 deletion db/db_bench.cc
@@ -33,6 +33,7 @@
 // readmissing -- read N missing keys in random order
 // readhot -- read N times in random order from 1% section of DB
 // seekrandom -- N random seeks
+// open -- cost of opening a DB
 // crc32c -- repeated crc32c of 4K of data
 // acquireload -- load N*1000 times
 // Meta operations:
@@ -442,7 +443,11 @@ class Benchmark {
       bool fresh_db = false;
       int num_threads = FLAGS_threads;

-      if (name == Slice("fillseq")) {
+      if (name == Slice("open")) {
+        method = &Benchmark::OpenBench;
+        num_ /= 10000;
+        if (num_ < 1) num_ = 1;
+      } else if (name == Slice("fillseq")) {
         fresh_db = true;
         method = &Benchmark::WriteSeq;
       } else if (name == Slice("fillbatch")) {
@@ -702,6 +707,14 @@ class Benchmark {
     }
   }

+  void OpenBench(ThreadState* thread) {
+    for (int i = 0; i < num_; i++) {
+      delete db_;
+      Open();
+      thread->stats.FinishedSingleOp();
+    }
+  }
+
   void WriteSeq(ThreadState* thread) {
     DoWrite(thread, true);
   }
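
The change does two things: it scales the requested operation count down by a factor of 10000 (never below 1), and it runs that many close-and-reopen cycles, recording one finished operation per cycle. The standalone sketch below mirrors that logic outside of db_bench so the arithmetic is easy to check. `ScaledOpenCount` and `RunOpenLoop` are hypothetical names introduced here for illustration, and the two callbacks are stand-ins for `delete db_; Open();` and `thread->stats.FinishedSingleOp()`; nothing in it touches LevelDB itself.

```cpp
#include <cassert>
#include <functional>

// Mirrors the scaling in the diff: one open per 10000 requested operations,
// but never fewer than one.
int ScaledOpenCount(int num) {
  int scaled = num / 10000;
  if (scaled < 1) scaled = 1;
  return scaled;
}

// Hypothetical stand-in for the OpenBench loop. `reopen` plays the role of
// "delete db_; Open();" and `finished_op` the role of
// thread->stats.FinishedSingleOp(). Returns the number of cycles run.
int RunOpenLoop(int num, const std::function<void()>& reopen,
                const std::function<void()>& finished_op) {
  const int iterations = ScaledOpenCount(num);
  assert(iterations >= 1);
  for (int i = 0; i < iterations; i++) {
    reopen();
    finished_op();
  }
  return iterations;
}
```

With this scaling, a requested count of 1000000 results in 100 reopen cycles, while any count below 10000 still performs a single one. The division presumably reflects that opening a database is far more expensive than a single read or write, so far fewer iterations are needed for a stable measurement.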

2 comments on commit 77948e7

@ddnn55 ddnn55 commented on 77948e7 Oct 30, 2015

(noob question) In fact, what is the cost of opening the database? If we have ~10,000 key/value pairs, can we expect to open the DB and extract a single value in less than 100ms? 1s?

@cmumford
Contributor

The current release (1.18) compacts the database on open, so the time depends on how much data is in the log and, of course, on the speed of your hardware. There is a new manifest reuse feature (in the reuse-manifest branch), but it appears to have a bug that slightly increases the risk of corruption, so it is still experimental and will stay out of master until that is fixed.

I think 100ms is unlikely (unless the log is empty), but 1s is likely (though not guaranteed) on modern hardware.
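
Since the answer depends on log size and hardware, the most reliable way to check a 100ms or 1s budget is to time it on the target machine. The sketch below is a minimal, LevelDB-independent timing helper under that assumption: `ElapsedMillis` and `WithinBudget` are hypothetical names, and the `work` callable is a placeholder that, in a real measurement, would wrap a call to `leveldb::DB::Open` followed by a single `Get`.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Returns the wall-clock time, in milliseconds, taken by one call to `work`.
// In a real measurement, `work` would open the database and read one key.
double ElapsedMillis(const std::function<void()>& work) {
  assert(work);  // an empty std::function would throw when called
  const auto start = std::chrono::steady_clock::now();
  work();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}

// True when the measured time fits the latency budget (e.g. 100 or 1000 ms).
bool WithinBudget(double elapsed_ms, double budget_ms) {
  return elapsed_ms <= budget_ms;
}
```

A single run can be noisy, particularly when the first open has to replay and compact a large log, so repeating the measurement with both a populated and an empty log gives a more useful picture than one number.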