Even though @elyase's answer is accepted, I think it is not the correct approach.
Yes, to use such a distance you have to refine your distance measure and so use a different library. But what is more important, the concept of the mean used in k-means does not suit a cyclic dimension. Let's consider the following example:
#current cluster X, based on centroid position Xc=24
x1=1
x2=24
#current cluster Y, based on centroid position Yc=10
y1=12
y2=13
Computing the simple arithmetic mean will place the centroids at Xc=12.5, Yc=12.5, which from the point of view of the cyclic measure is incorrect; it should be Xc=0.5, Yc=12.5. As you can see, assignment based on the cyclic distance measure is not "compatible" with the simple mean operation, and this leads to bizarre results:
- Simple k-means will result in clusters {x1,y1}, {x2,y2}
- Simple k-means + cyclic distance measure results in the degenerate super cluster {x1,x2,y1,y2}
- Correct clustering would be {x1,x2}, {y1,y2}
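To make the difference between the two means concrete, here is a minimal sketch (plain NumPy, assuming the 24-hour period used above). The circular mean is not part of the original example, just a standard way to average cyclic quantities; here it reproduces the Xc=0.5, Yc=12.5 centroids described above:

```python
import numpy as np

def circular_mean(hours, period=24.0):
    # Map the hours onto the unit circle, average the resulting unit vectors,
    # and convert the angle of the average vector back to the [0, period) range.
    angles = 2 * np.pi * np.asarray(hours, dtype=float) / period
    mean_angle = np.arctan2(np.sin(angles).mean(), np.cos(angles).mean())
    return (mean_angle * period / (2 * np.pi)) % period

x = [1, 24]   # cluster X from the example above
y = [12, 13]  # cluster Y from the example above

print(np.mean(x), circular_mean(x))  # 12.5 vs. 0.5  -> arithmetic mean is off
print(np.mean(y), circular_mean(y))  # 12.5 vs. 12.5 -> both agree here
```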
Solving this problem requires checking, for each point, whether it is better to use its "simple" value or to represent it as the shifted value x'=x-24 before averaging. Unfortunately, given n points this gives 2^n possibilities.
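For small n you could brute-force that enumeration; below is a rough sketch of the idea (the helper name best_cyclic_mean and the period argument are just illustrative), which tries both representations for every point and keeps the mean with the lowest squared deviation:

```python
from itertools import product

def best_cyclic_mean(points, period=24.0):
    best = None
    # Try all 2^n combinations of "keep x" vs. "use x - period" for each point.
    for shifts in product([0.0, -period], repeat=len(points)):
        shifted = [p + s for p, s in zip(points, shifts)]
        mean = sum(shifted) / len(shifted)
        cost = sum((p - mean) ** 2 for p in shifted)
        if best is None or cost < best[0]:
            best = (cost, mean % period)
    return best[1]

print(best_cyclic_mean([1, 24]))   # 0.5
print(best_cyclic_mean([12, 13]))  # 12.5
```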
This looks like a use case for kernelized k-means, where you actually cluster in the abstract feature space (in your case - a "tube" rolled around the time dimension) induced by the kernel (a "similarity measure" given by the inner product of some vector space).
Details of kernel k-means are given here
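As a rough illustration of the same idea (not a kernelized implementation), you can make the "tube" feature space explicit by embedding the cyclic time value as (cos, sin) coordinates on a circle and then running ordinary k-means on that embedding. The 24-hour period and the parameters below are assumptions for the example, not taken from the question:

```python
import numpy as np
from sklearn.cluster import KMeans

def embed_time(t, period=24.0):
    # Explicit feature map: each time value becomes a point on the unit circle.
    angle = 2 * np.pi * np.asarray(t, dtype=float) / period
    return np.column_stack([np.cos(angle), np.sin(angle)])

hours = np.array([1, 24, 12, 13])   # x1, x2, y1, y2 from the example above
X = embed_time(hours)               # shape (4, 2), points on the unit circle

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # x1 and x2 end up in one cluster, y1 and y2 in the other
```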