Skip to content

A simpl(istic) Java implementation of the Elias-Fano compression schema

License

Notifications You must be signed in to change notification settings

catenamatteo/eliasfano

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

45 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

eliasfano

A simpl(istic) Java implementation of the Elias-Fano compression schema, a technique for compressing arrays of monotonically increasing integers. The Elias-Fano compression schema permits to decompress the i-th element in the compressed data, without decompressing the whole array. Similarly, it permits to find the index of the first element in the compressed data which is greater or equal to a given value -- without decompressing the whole array.

Usage

int[] a = ...; //an array of monotonically increasing integers;
//compress the array
byte[] compressed EliasFano.compress(a, 0, a.length, compressed, 0);
//decompress the array
int[] b = new int[a.length];
int u = a[a.length - 1]; //the maximum value in a;
int L = ef.getL(u, a.length); //the number of lower bits (see references)
EliasFano.decompress(compressed, 0, a.length, L, b, 0);
//get the value of the 4-th element in the compressed data
int val = EliasFano.get(compressed, 0, a.length, L, 3);
//get the index of the first element, in the compressed data, greater or equal than 1000
int idx = EliasFano.select(compressed, 0, a.length, L, 1000);

So, I can't use it to compress non-increasing arrays, isn't it?

Sure you can! You just need to transform your array into a monotonically increasing one, by adding to the i-th value the sum of the previous values in the array. Then, you can recompute the original i-th element doing i-th minus (i-1)-th value.

int[] a1 = ...; //a generic array
//make a2 monotonically increasing
int[] a2 = new int[a1.length];
a2[0] = a1[0];
for (int i = 1; i < a1.length; i++) a2[i]=a1[i]+a2[i-1];
//compress a2
byte[] compressed = EliasFano.compress(a2, 0, a2.length, compressed, 0);
//get the original i-th value of a1, as i-th minus (i-1)-th of a2
int u = a2[a2.length-1]; //the max value in a2
int L = ef.getL(u, a2.length);
int val = ef.get(compressed, 0, a2.length, L, i)-ef.get(compressed, 0, a2.length, L, i-1);

Dependecies

  • JUnit 4

References

For a general understanding of the Elias-Fano technique, see:
Sebastiano Vigna, "The Revenge of Elias and Fano" (link)

More advanced material:
Sebastiano Vigna, "Quasi-succinct indices", WSDM'13
Giuseppe Ottaviano and Rossano Venturini, "Partitioned Elias-Fano indexes", SIGIR'14

About

A simpl(istic) Java implementation of the Elias-Fano compression schema

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages