Commit d54f3cdc authored by He Guanlin

Update README.md

parent a0f843e5
......@@ -13,7 +13,7 @@ In particular, for both implementations we use a two-step summation method with
## "main.h" Configuration
The settings for the benchmark dataset, block size, etc. are adjustable in the _main.h_ file.
Our CUDA C code does NOT generate any synthetic data, so users should specify the path and filename of their benchmark dataset in the `INPUT_DATA` constant, and also give the `NbPoints`, `NbDims`, `NbClusters`. If users want to impose initial centroids, they should provide a text file and specify the corresponding path and filename in the `INPUT_INITIAL_CENTROIDS` constant.
Our k-means code does NOT generate any synthetic data, so you need to give the path and filename of your benchmark dataset in the `INPUT_DATA` constant, and also specify `NbPoints`, `NbDims`, and `NbClusters`. If you want to impose initial centroids, you need to provide a text file and specify the corresponding path and filename in the `INPUT_INITIAL_CENTROIDS` constant.
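As a rough illustration, the relevant part of _main.h_ might look like the fragment below. This is only a sketch: the paths, filenames, and numeric values are hypothetical placeholders, and the use of `#define` macros is an assumption, since the actual declarations in _main.h_ may take a different form.

```c
/* Hypothetical excerpt of main.h: every value below is a placeholder. */

/* Path and filename of the benchmark dataset (assumed to be a text file). */
#define INPUT_DATA "./data/my_dataset.txt"

/* Dataset shape: number of points, dimensions per point, and clusters. */
#define NbPoints   1000000
#define NbDims     4
#define NbClusters 4

/* Optional: text file holding the imposed initial centroids. */
#define INPUT_INITIAL_CENTROIDS "./data/my_initial_centroids.txt"
```

The three size values should match the contents of the dataset file, and `NbClusters` should not exceed `NbPoints`.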
The synthetic dataset used in our papers below is too large (about 1.8GB) to be uploaded here, so we provide the _Synthetic_Data_Generator.py_ instead. Since the generator draws random numbers, each generated dataset will have different values but will always follow the same distribution.
......