algorithm - calculating Lempel-Ziv (LZ) complexity (aka sequence complexity) of a binary string -
i need calculate lz-complexity of binary string. lz-complexity number of differencet substrings encountered stream viewed begining end. example:
s = 1001111011000010
marking in different substrings sequence complexity c(s) = 6: s = 1 / 0 / 01 / 1110 / 1100 / 0010 /
can guide me find simple solution that? sure there should straight-forward implementations well-known problem, have difficulty finding them. can done done constructing suffix tree or similar. if yes, how? , should do?
anyone knows of c/c++ source code accomplish task?
thanks in advance.
to clarify construction of tree suggested in answers. tree looks this?
o / \ o o / \ / \ o o o o / / o o
below quick example of how compute lz-complexity using tree. convenience - mine; not yours - code implements fixed-sized pre-allocated tree, , prime example of why void* pointers ugly use , difficult maintain. hand code in is, , lecturer shoot in face :)
#include <stdlib.h> #include <stdio.h> int lzcomplexity(char *p_binarysequence, int p_maxtreenodes) { void **patterntree; void **currentnode; void **nextfreenode; int nodecount; int sequenceindex; int currentdigit; nodecount = 0; patterntree = malloc(sizeof(void*) * (p_maxtreenodes << 1)); currentnode = patterntree; nextfreenode = patterntree + (sizeof(void*) << 1); currentnode[0] = null; currentnode[1] = null; sequenceindex = 0; while (p_binarysequence[sequenceindex]) { currentdigit = p_binarysequence[sequenceindex] - 48; if (null == currentnode[currentdigit]) { currentnode[currentdigit] = nextfreenode; nextfreenode[0] = null; nextfreenode[1] = null; nextfreenode += (sizeof(void*) << 1); currentnode = patterntree; nodecount++; } else { currentnode = currentnode[currentdigit]; } sequenceindex++; } free(patterntree); return nodecount; } int main(int argc, char *argv[]) { printf("%u\n", lzcomplexity("10100101001011101011", 1000)); return 0; }
Comments
Post a Comment