VAN_DER_CORPUT_DATASET
Generate van der Corput Datasets


VAN_DER_CORPUT_DATASET is a C++ program which creates a van der Corput sequence dataset and writes it to a file.

The program is interactive, and allows the user to choose the parameters that define the sequence.

The NDIM-dimensional Halton sequence is derived from the 1-dimensional van der Corput sequence by using a different base (usually a distinct prime) for each dimension. The Hammersley sequence is derived in almost the same way.

The van der Corput sequence is often used to generate a "subrandom" sequence of points that covers the interval more evenly than pseudorandom points do.

The van der Corput sequence generates a sequence of points in [0,1] which (theoretically) never repeats. Except for SEED = 0, the elements of the van der Corput sequence are strictly between 0 and 1.

To compute a van der Corput value, an integer is written in a given base B, and its digits are then "reflected" about the radix point. This maps the integers from 1 to N into a set of numbers in [0,1], which are especially evenly distributed if N is one less than a power of the base.
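The distribution claim above can be checked with a small sketch. The function names van_der_corput and sorted_values are hypothetical and are not taken from the VAN_DER_CORPUT library; the sketch assumes a nonnegative integer and a base of at least 2.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: reflect the base-`base` digits of `seed`
// about the radix point and return the resulting value in [0,1).
double van_der_corput(unsigned int seed, unsigned int base)
{
    assert(base >= 2);
    double r = 0.0;
    double place = 1.0 / base;  // weight of the first digit after the radix point
    while (seed > 0) {
        r += (seed % base) * place;  // lowest digit becomes the leading fraction digit
        seed /= base;
        place /= base;
    }
    return r;
}

// Van der Corput values of the integers 1..n, sorted in increasing order.
std::vector<double> sorted_values(unsigned int n, unsigned int base)
{
    std::vector<double> values;
    for (unsigned int i = 1; i <= n; ++i) {
        values.push_back(van_der_corput(i, base));
    }
    std::sort(values.begin(), values.end());
    return values;
}
```

With base 2 and N = 7, which is one less than 2^3, sorted_values(7, 2) returns the evenly spaced values 1/8, 2/8, ..., 7/8.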

Hammersley suggested generating a set of N nicely distributed points in two dimensions by setting the first component of the Ith point to I/N, and the second to the van der Corput value of I in base 2.
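Hammersley's construction might be sketched as follows. The names van_der_corput and hammersley_2d are hypothetical, and the sketch assumes that the index I runs from 1 to N, which the description above does not state explicitly.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical helper: van der Corput value of `seed` in base `base`.
double van_der_corput(unsigned int seed, unsigned int base)
{
    assert(base >= 2);
    double r = 0.0;
    double place = 1.0 / base;
    while (seed > 0) {
        r += (seed % base) * place;
        seed /= base;
        place /= base;
    }
    return r;
}

// N two-dimensional Hammersley points: the I-th point is
// ( I/N, van der Corput value of I in base 2 ), for I = 1..N (assumed range).
std::vector<std::array<double, 2>> hammersley_2d(unsigned int n)
{
    std::vector<std::array<double, 2>> points;
    for (unsigned int i = 1; i <= n; ++i) {
        std::array<double, 2> p = { static_cast<double>(i) / n,
                                    van_der_corput(i, 2) };
        points.push_back(p);
    }
    return points;
}
```

Because the first component is I/N, the number of points N must be fixed before any point can be computed.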

Halton suggested that in many cases, you might not know the number of points you were generating, so Hammersley's formulation was not ideal. Instead, he suggested that to generate a nicely distributed sequence of points in M dimensions, you simply choose the first M primes, P(1:M), and then for the J-th component of the I-th point in the sequence, you compute the van der Corput value of I in base P(J).

Thus, to generate a Halton sequence in a 2 dimensional space, it is typical practice to generate a pair of van der Corput sequences, the first with prime base 2, the second with prime base 3. Similarly, by using the first K primes, a suitable sequence in K-dimensional space can be generated.
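A minimal sketch of Halton's construction, assuming the bases are the first M primes as described above. The names van_der_corput, first_primes and halton_point are hypothetical and do not come from the VAN_DER_CORPUT library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: van der Corput value of `seed` in base `base`.
double van_der_corput(unsigned int seed, unsigned int base)
{
    assert(base >= 2);
    double r = 0.0;
    double place = 1.0 / base;
    while (seed > 0) {
        r += (seed % base) * place;
        seed /= base;
        place /= base;
    }
    return r;
}

// The first m primes, found by trial division against the primes so far.
std::vector<unsigned int> first_primes(std::size_t m)
{
    std::vector<unsigned int> primes;
    for (unsigned int c = 2; primes.size() < m; ++c) {
        bool is_prime = true;
        for (unsigned int p : primes) {
            if (p * p > c) break;
            if (c % p == 0) { is_prime = false; break; }
        }
        if (is_prime) primes.push_back(c);
    }
    return primes;
}

// The I-th Halton point in M dimensions: component J is the
// van der Corput value of I in base P(J), the J-th prime.
std::vector<double> halton_point(unsigned int i, std::size_t m)
{
    std::vector<double> point;
    for (unsigned int base : first_primes(m)) {
        point.push_back(van_der_corput(i, base));
    }
    return point;
}
```

Unlike the Hammersley points, each Halton point depends only on its index I, so the sequence can be extended without knowing the total number of points in advance.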

The generation is quite simple. Given an integer SEED, the expansion of SEED in base BASE is generated. Then, essentially, the result R is generated by writing a radix point followed by the digits of the expansion of SEED, in reverse order. This fractional value is still expressed in base BASE, so it must be converted to yield a usable value.

Here is an example in base 2:
  SEED       SEED      VDC       VDC
  (decimal)  (binary)  (binary)  (decimal)

  0          0         .0        0.0
  1          1         .1        0.5
  2          10        .01       0.25
  3          11        .11       0.75
  4          100       .001      0.125
  5          101       .101      0.625
  6          110       .011      0.375
  7          111       .111      0.875
  8          1000      .0001     0.0625
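The digit reversal described above can be sketched in a few lines of C++. This is an illustration only, not the routine from the VAN_DER_CORPUT library: the name van_der_corput is hypothetical, and the sketch assumes a nonnegative SEED and a BASE of at least 2.

```cpp
#include <cassert>

// Hypothetical helper: van der Corput value of `seed` in base `base`.
double van_der_corput(unsigned int seed, unsigned int base)
{
    assert(base >= 2);
    double r = 0.0;
    double place = 1.0 / base;  // weight of the first digit after the radix point
    while (seed > 0) {
        unsigned int digit = seed % base;  // lowest remaining digit of SEED
        r += digit * place;                // it becomes the next fraction digit
        seed /= base;                      // drop that digit
        place /= base;                     // move one position to the right
    }
    return r;
}
```

Peeling the digits off from the low end and accumulating them with decreasing weights reverses the expansion and converts it to an ordinary value in a single pass. For example, van_der_corput(6, 2) reads the binary digits 110 as .011, which is 0.375.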

Related Data and Programs:

SEQUENCE_STREAK_DISPLAY is a MATLAB program that can make a "streak file" of a van der Corput sequence.

TABLE is a file format which is used for the output of VAN_DER_CORPUT_DATASET.

TABLE_DISCREPANCY is a C++ program which can read a TABLE file of points (presumed to lie in the unit hypercube) and compute bounds on the star discrepancy, a measure of dispersion.

VAN_DER_CORPUT is a C++ library of routines needed to compute the dataset. A compiled copy of this code must be available when building VAN_DER_CORPUT_DATASET.

VAN_DER_CORPUT is a dataset directory which contains datasets of van der Corput sequences.

VAN_DER_CORPUT_DATASET is available in a C++ version, a FORTRAN90 version, and a MATLAB version.

Reference:

  1. Johannes van der Corput,
    Verteilungsfunktionen I & II,
    Nederl. Akad. Wetensch. Proc.,
    Volume 38, 1935, pages 813-820, pages 1058-1066.




Last revised on 31 August 2005.