TABLE_DISCREPANCY is an executable C++ program, using double precision arithmetic, which computes bounds on the star discrepancy of an M-dimensional pointset contained in the unit hypercube.
The pointset is stored in a file, in the TABLE format.
The star discrepancy is a commonly cited statistic for determining how uniformly a pointset is distributed over a region. For convenience, this region is usually taken as the unit hypercube; TABLE_DISCREPANCY will assume that datasets under investigation are meant to fill up the unit hypercube.
If the pointset to be investigated actually lies in some other hypercube, a simply translation and rescaling may be enough to transform the data. This will probably NOT be satisfactory if the original region is rectangular, but has sides of different length, or if the region is not rectangular.
The discrepancy measures the worst error that would be made in estimating the area of a subregion of the hypercube by simply noting the fraction of the pointset contained in the subregion. If arbitrary subregions were allowed, then it would always be possible to make this error equal to 1 (just take the region consisting of the hypercube minus the pointset.) Since any "reasonable" area can be arbitrarily well approximated by rectangles, the star discrepancy calculation uses only rectangular subregions, whose sides are aligned with coordinates directions, and one of whose corners is at the origin.
Formally, the star discrepancy of a pointset of n points is symbolized by Dn* and defined as
Dn* = supremum ( P in I* ) | ( A(P,x) / n ) - lambda ( P ) |Here, I* is the set of all M-dimensional subintervals of the form [0,p1] x [0,p2] x ... x [0,ps] where every p is between 0 and 1; P is any such subinterval; lambda(P) is the volume of the subinterval, A(P,x) is the number of points of the point set x that occur in P, and n is the number of points in x.
Clearly, the star discrepancy is measuring how badly the pointset estimates the volume of a subinterval. This worst error is somewhere between 0 (absolutely no error ever) and 1 (totally missing the volume of the unit hypercube). A value of 0.25, for instance, means that there is a subinterval in the unit hypercube for which the difference between its true and estimated volumes is 0.25. (It might have a volume of 0.80, and be estimated at 0.55, for instance, or a volume of 0.05 that is estimated at 0.30.)
The original C program is by Eric Thiemard. Changes have been made to the program so that it compiles under C++, uses a simpler datafile format, and infers the dimensionality of the data from the information in the datafile.
TABLE is a file format which is used for the input and output of TABLE_BORDER
TABLE_BORDER is a C++ program that can read a TABLE file and add zero entries corresponding to a single layer of boundary data.
TABLE_DELAUNAY is an executable C++ program that reads a file of 2d point coordinates and computes the Delaunay triangulation.
TABLE_IO is a C++ library of routines that can read or write a TABLE file.
TABLE_LATINIZE is a C++ program that can read a TABLE file and write out a "latinized" version.
TABLE_QUALITY is a C++ program that can read a TABLE file and print out measures of the quality of dispersion of the points.
TABLE_UNBORDER is a C++ program can be used to remove the border from a table file.
TABLE_VORONOI is a C++ program that can read a TABLE file describing a set of 2D points, and print out information describing the Voronoi diagram of those points.
Eric Thiemard
table_discrepancy 0.001 10 halton_02_00010.txt
You can go up one level to the C++ source codes.