Commit 9ce48c29 authored by He Guanlin

Update README.md

parent a9813bd2
## CPU-GPU-kmeans
Optimized parallel implementations of the k-means clustering algorithm:
1. on multi-core CPU with vector units: thread parallelization using OpenMP, auto-vectorization using AVX units
2. on NVIDIA GPU: using shared memory, dynamic parallelism, and multiple streams
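As an illustration of the CPU approach listed above, here is a minimal sketch of a k-means assignment step in C with an OpenMP parallel loop. It is not the code of this repository: the function name `assign_points`, its parameters, and the row-major data layout are assumptions made for the example.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical helper (not from the repository): assign each point to its
 * nearest centroid using the squared Euclidean distance.
 *   points:    nb_points   x nb_dims values, row-major
 *   centroids: nb_clusters x nb_dims values, row-major
 *   labels:    nb_points entries, updated in place
 * Returns the number of points whose label changed. */
int assign_points(const float *points, const float *centroids, int *labels,
                  int nb_points, int nb_dims, int nb_clusters)
{
    assert(nb_points > 0 && nb_dims > 0 && nb_clusters > 0);
    int changed = 0;

    /* Points are independent, so the outer loop is shared among OpenMP
     * threads. Without OpenMP support the pragma is ignored and the loop
     * simply runs serially. */
    #pragma omp parallel for reduction(+:changed) schedule(static)
    for (int p = 0; p < nb_points; p++) {
        float best_dist = FLT_MAX;
        int best_k = 0;
        for (int k = 0; k < nb_clusters; k++) {
            float dist = 0.0f;
            /* Contiguous inner loop over dimensions, written so that a
             * compiler may try to vectorize it. */
            for (int d = 0; d < nb_dims; d++) {
                float diff = points[(size_t)p * nb_dims + d]
                           - centroids[(size_t)k * nb_dims + d];
                dist += diff * diff;
            }
            if (dist < best_dist) {
                best_dist = dist;
                best_k = k;
            }
        }
        if (labels[p] != best_k) {
            labels[p] = best_k;
            changed++;
        }
    }
    return changed;
}
```

A caller would typically alternate this step with a centroid update until the returned count drops to zero or an iteration limit is reached. With GCC, for instance, `-fopenmp` enables the pragma; whether the inner loop is actually vectorized depends on the compiler and its flags.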
......@@ -13,12 +12,16 @@ In particular, for both implementations we use a two-step summation method with
## "main.h" Configuration
The configuration settings (benchmark dataset, block size, etc.) are adjustable in the "main.h" file.
Our CUDA C code does not generate any synthetic data, so users should specify the path and filename of their benchmark dataset in the "INPUT_DATA" constant, and also set NbPoints, NbDims, and NbClusters. Users who want to impose the initial centroids should provide a text file containing the coordinates of the initial centroids and specify the corresponding path and filename in the "INPUT_INITIAL_CENTROIDS" constant.
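The settings described above might look like the following excerpt. This is only a sketch: the constant names come from the text, but the use of `#define`, the paths, and all values are assumptions chosen for illustration, not the repository defaults.

```c
/* main.h (illustrative excerpt; paths and values are hypothetical) */

/* Path and filename of the benchmark dataset */
#define INPUT_DATA              "./data/my_dataset.txt"

/* Only needed when imposing the initial centroids */
#define INPUT_INITIAL_CENTROIDS "./data/my_initial_centroids.txt"

#define NbPoints   1000000  /* number of points in INPUT_DATA (example value) */
#define NbDims     4        /* coordinates per point (example value) */
#define NbClusters 4        /* number of clusters (example value) */
```

The three sizes have to match the dataset file, since the text states that the code reads the data rather than generating it.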
The synthetic dataset used in our papers below is too large (about 1.8 GB) to be uploaded here, so we provide Synthetic_Data_Generator.py instead. Because the generator uses a random function, each generated dataset has different values but always follows the same distribution.
## Execution
If any change has been made to the code, recompile it with the "make" command before execution.
Then you can run the executable file "kmeans" with several arguments:
-t <GPU|CPU>: run computations on target GPU or on target CPU (default: GPU)
-cpu-nt <int>: number of OpenMP threads (default: 1)
-max-iters <int>: maximal number of iterations (default: 200)
......@@ -29,11 +32,9 @@ k-means on CPU: ./kmeans -t CPU -cpu-nt 20
k-means on GPU: ./kmeans
## Corresponding papers
The approaches and experiments are documented in the following two papers. The second paper is an extended version of the first paper.
The approaches and experiments are documented in the following paper.
He, G., Vialle, S., & Baboulin, M. (2021). Parallelization of the k-means algorithm in a spectral clustering chain on CPU-GPU platforms. In B. B. et al. (Ed.), Euro-Par 2020: Parallel processing workshops (Vol. 12480, LNCS, pp. 135–147). Warsaw, Poland: Springer. Available from: https://link.springer.com/chapter/10.1007/978-3-030-71593-9_11
He, G., Vialle, S., & Baboulin, M. (2021, revised & resubmitted). Parallel and accurate k-means algorithm on CPU-GPU architectures for spectral clustering. Concurrency and Computation: Practice and Experience. Wiley.
If you find any part of this project useful for your scientific research, please cite the papers mentioned above.
If you find any part of this project useful for your scientific research, please cite the paper mentioned above.