Longest Common Subsequence Similarity (LCSS)

The basic idea is to match two sequences while allowing some elements to remain unmatched (Sankoff and Kruskal 1983). Given a sequence C of length m and a sequence Q of length n, find a sequence Z such that Z is the longest sequence that is both a subsequence of C and a subsequence of Q. A sequence Z of length k is a subsequence of C if there exists a strictly increasing sequence of indices i_1 < i_2 < … < i_k of C such that, for all j = 1, …, k, C_{i_j} = Z_j.
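The subsequence definition can be checked directly in code. The following is a minimal sketch in Python; the function name `is_subsequence` is an illustrative choice and does not come from the original text.

```python
def is_subsequence(z, c):
    """Return True if z can be obtained from c by deleting zero or more elements."""
    it = iter(c)
    # `x in it` consumes the iterator up to and including the first match,
    # so each later element of z must be found strictly further along in c.
    # This mirrors the strictly increasing index sequence in the definition.
    return all(x in it for x in z)


print(is_subsequence("ace", "abcde"))  # order preserved
print(is_subsequence("aec", "abcde"))  # order violated
```

The first call finds the indices 1, 3, 5 of "abcde", while the second fails because "e" is consumed before "c" can be matched.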

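$$
C_{i,j} =
\begin{cases}
0, & \text{if } i = 0 \text{ or } j = 0 \\
C_{i-1,j-1} + 1, & \text{if } i, j > 0 \text{ and } Q_i = C_j \\
\max\{C_{i-1,j},\; C_{i,j-1}\}, & \text{if } i, j > 0 \text{ and } Q_i \neq C_j
\end{cases}
\tag{3}
$$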
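The recurrence in (3) can be filled in row by row with dynamic programming. The sketch below assumes Python sequences (strings or lists) and uses the hypothetical names `lcs_table` and `lcs_length`; the table is called `table` in the code to avoid confusion with the sequence C.

```python
def lcs_table(q, c):
    """Build the (len(q)+1) x (len(c)+1) table defined by recurrence (3)."""
    rows, cols = len(q), len(c)
    # Row 0 and column 0 stay at 0: the base case "i = 0 or j = 0".
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            # Python is 0-indexed, so Q_i and C_j are q[i-1] and c[j-1].
            if q[i - 1] == c[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table


def lcs_length(q, c):
    """Length of the longest common subsequence: the bottom-right table entry."""
    return lcs_table(q, c)[len(q)][len(c)]


print(lcs_length("ABCBDAB", "BDCABA"))  # prints 4 (e.g. "BCBA")
```

Filling the table takes time and memory proportional to the product of the two sequence lengths.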

Dissimilarity between C and Q

$$
\mathrm{LCSS}(C, Q) = \frac{m + n - 2l}{m + n}
\tag{4}
$$

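Here l is the length of the longest common subsequence of C and Q, and m and n are the lengths of C and Q. The value is 0 when the two sequences are identical and 1 when they share no elements.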
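As a rough illustration of the dissimilarity measure, the sketch below computes (m + n - 2l) / (m + n) in Python. The names `lcs_length_two_rows` and `lcss_dissimilarity` are hypothetical, and returning 0.0 for two empty sequences is an assumption, since the formula is undefined when m + n = 0.

```python
def lcs_length_two_rows(c, q):
    """LCS length using only two rows of the dynamic-programming table."""
    prev = [0] * (len(q) + 1)
    for x in c:
        curr = [0]  # column 0 is always 0
        for j, y in enumerate(q, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcss_dissimilarity(c, q):
    """Dissimilarity (m + n - 2l) / (m + n); 0 means identical sequences."""
    m, n = len(c), len(q)
    if m + n == 0:
        return 0.0  # assumption: two empty sequences are treated as identical
    l = lcs_length_two_rows(c, q)
    return (m + n - 2 * l) / (m + n)


print(lcss_dissimilarity("ABCBDAB", "BDCABA"))  # (7 + 6 - 2*4) / 13 = 5/13
```

For the example pair, m = 7, n = 6 and l = 4, which gives 5/13, or about 0.385. Keeping only two rows reduces memory to the length of one sequence, which is enough when just the length l is needed and not the subsequence itself.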
