SimTK: Clustering Algorithms for Massively Parallel Architectures Including GPU Nodes: [#2083] Incorrect computing of distance in assignToClusters

Date:
2015-05-10 18:56

Priority:
3

State:
Open

Submitted by:
Andrey Alekseenko (aland)

Assigned to:
Nobody (None)

Summary:
Incorrect computing of distance in assignToClusters_KMCUDA

Detailed description
In the assignToClusters_KMCUDA kernel, the coalesced prefetch of centroid coordinates works incorrectly: it assumes that all blockDim.x threads of a block are active, but in the last block some threads are excluded by the (t<N) condition. When N is not a multiple of blockDim.x, the line campaign/trunk/dev/kmeansGPU/kmeansGPU.cu:159 is executed in the last block by only (N%blockDim.x) threads, so the cluster centroid coordinates are loaded into shared memory incompletely if D>(N%blockDim.x). The smoke test does not catch this because its dataset has D=1, so even one active thread in the block is enough to load the centroid coordinate. I attach an example of a possible patch, although it has not been tested extensively.
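To illustrate the kind of change meant here, below is a minimal sketch of a kernel in which the shared-memory load is performed by every thread of the block and only the distance computation is guarded by (t<N). It is not the CAMPAIGN source and not the attached patch: the names assignToClustersSketch and assignOnGpu, the parameter list, and the row-major layout of points and centroids are all assumptions made for the example, and CUDA error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <cfloat>
#include <vector>

// Hypothetical kernel (not the CAMPAIGN code): assigns each of N points
// (row-major, D coordinates each) to the nearest of K centroids.
// Launch with D * sizeof(float) bytes of dynamic shared memory.
__global__ void assignToClustersSketch(int N, int K, int D,
                                       const float *X, const float *CTR,
                                       int *ASSIGN)
{
    extern __shared__ float ctr[];            // one centroid, D floats
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    float best = FLT_MAX;
    int bestK = 0;

    for (int k = 0; k < K; ++k) {
        // Cooperative load placed outside the (t < N) guard, so every
        // thread of the block takes part, including threads with t >= N.
        // The stride loop also covers the case D > blockDim.x.
        for (int d = threadIdx.x; d < D; d += blockDim.x)
            ctr[d] = CTR[k * D + d];
        __syncthreads();                      // centroid k is fully staged

        if (t < N) {                          // only real points compute
            float dist = 0.0f;
            for (int d = 0; d < D; ++d) {
                float diff = X[t * D + d] - ctr[d];
                dist += diff * diff;
            }
            if (dist < best) { best = dist; bestK = k; }
        }
        __syncthreads();                      // all reads done before the next overwrite
    }
    if (t < N) ASSIGN[t] = bestK;
}

// Hypothetical host helper: copies the data to the device, runs the
// kernel, and returns the cluster index assigned to each point.
std::vector<int> assignOnGpu(const std::vector<float> &x,
                             const std::vector<float> &c,
                             int N, int K, int D, int threads)
{
    assert(x.size() == static_cast<size_t>(N) * D);
    assert(c.size() == static_cast<size_t>(K) * D);

    float *dX = nullptr, *dC = nullptr;
    int *dA = nullptr;
    cudaMalloc(&dX, x.size() * sizeof(float));
    cudaMalloc(&dC, c.size() * sizeof(float));
    cudaMalloc(&dA, N * sizeof(int));
    cudaMemcpy(dX, x.data(), x.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dC, c.data(), c.size() * sizeof(float), cudaMemcpyHostToDevice);

    int blocks = (N + threads - 1) / threads; // last block may be partly idle
    assignToClustersSketch<<<blocks, threads, D * sizeof(float)>>>(
        N, K, D, dX, dC, dA);

    std::vector<int> out(N);
    cudaMemcpy(out.data(), dA, N * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dX);
    cudaFree(dC);
    cudaFree(dA);
    return out;
}
```

A test case that exercises the reported condition needs D>(N%blockDim.x) in the last block, for example N=5 points with D=3 and 4 threads per block, which leaves a single point for the second block. With D=1, as in the smoke test, the problem cannot show up.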

Message
Date: 2015-05-10 18:59
Sender: Andrey Alekseenko
The file wasn't uploaded due to some cryptic error, so I uploaded it to pastebin: http://pastebin.com/WbWn7J6s
