Currently, I have 6 curves shown in 6 different colors, as below. <img alt="enter image description here" class="b-lazy" data-src="https://i.stack.imgur.com/AXflS.png" data-original="https://i.stack.imgur.com/AXflS.png" src="https://etrip.eimg.top/images/2019/05/07/timg.gif" /> The 6 curves were in fact generated by 6 trials of <strong>the same experiment</strong>. That means they should ideally be the same curve, but due to noise and different trial participants they look similar rather than identical.
Now I wish to create an algorithm that is able to identify that the 6 curves are essentially the same and cluster them together into one cluster. <strong>What similarity metrics should I use?</strong>
<ol> <li>The x-axis does <strong>NOT</strong> matter at all! I simply aligned the curves for visual purposes. Thus, feel free to shift the curves left or right if doing so helps.</li> <li>"Sub-curves" that are part of the curves may appear. The "belongingness" is important and thus needs identifying as well. But again, left/right shifting is allowed.</li> </ol>
I have attempted to learn several clustering algorithms, such as DBSCAN, K-means, Fuzzy C-means, etc. But I don't see how they fit this case, because the "belongingness" needs to be detected!
<em>Any suggestions or comments are very welcome. I understand that it is hard to give an exact solution to this question; I am only hoping for some enlightening suggestions.</em>
Answer 1:
Have a look at <strong>time series similarity functions</strong>, such as dynamic time warping.
They can be used with e.g. DBSCAN but NOT with k-means (you cannot compute a reasonable "mean" for these distances; k-means is really designed for squared Euclidean distances).
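As a rough illustration of the suggestion above, here is a minimal pure-Python sketch of dynamic time warping and a pairwise distance matrix. The function names (`dtw_distance`, `pairwise_dtw`), the absolute-difference cost, and the toy curves are assumptions made for this example, not anything taken from the question's data.

```python
import math


def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) dynamic time warping.

    Uses the absolute difference as the local cost (an assumption;
    squared difference is another common choice).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] matched again to an earlier b
                cost[i][j - 1],      # b[j-1] matched again to an earlier a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def pairwise_dtw(curves):
    """Symmetric matrix of DTW distances between all pairs of curves."""
    k = len(curves)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = dtw_distance(curves[i], curves[j])
    return dist


# Hypothetical toy data: a curve, a shifted copy of it, and an unrelated ramp.
base = [math.sin(0.2 * t) for t in range(60)]
shifted = [math.sin(0.2 * (t - 5)) for t in range(60)]
other = [0.05 * t for t in range(60)]

dist = pairwise_dtw([base, shifted, other])
# The shifted copy ends up much closer to `base` than the ramp does.
print(dist[0][1] < dist[0][2])
```

A matrix like `dist` could then be handed to a clustering algorithm that accepts precomputed distances, for example scikit-learn's `DBSCAN(eps=..., metric="precomputed")`. The `eps` value would have to be tuned for the actual data.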