KMP的python实现,KMPpython实现,KMP算法的精髓在于,在


KMP算法的精髓在于,在匹配不一致的时候,根据模式中前缀和后缀的重合状况,做适度的回溯。而不是像普通的字符串搜索操作的那样,完全回退。

需要回溯

普通方式:

1)

ababacababae

ababae

2)

ababacababae

ababae

KMP方式

1)

ababacababae

ababae

2)

ababacababae

ababae

不需要回溯

1)

abcefg

efg

2)

abcefg

efg

可见字符串匹配之所以需要回溯, 是因为模式(pattern)里有重叠的部分。

next数组的问题

KMP算法需要预设next数组,求next数组,可以看成自我匹配的过程。

#!/usr/bin/pythondef getNext(pattern, next):        j = 0        plen = len(pattern)        next.append(0)        for i in range(1, plen):                while ((j > 0) and (pattern[j] != pattern[i])):                        j = next[j-1]                if (pattern[i] == pattern[j]):                        j = j + 1                next.append(j)def kmp_search(text, pattern, next):        j = 0;        plen = len(pattern)        tlen = len(text)        for i in range(0, tlen):                while j > 0 and text[i] != pattern[j]:                        j = next[j-1]                if text[i] == pattern[j] :                        j = j + 1;                if j == plen :                    print "from ", i-j+1, "to ", i                    j = next[j-1]text = ("aaaaaaaaaaaaaaaaaaaab")pattern = ("aaab")next = []text1 = ("abababababababababa")pattern1 = ("ababab")next1 = []getNext(pattern, next)print nextkmp_search(text, pattern, next)print textprint patterngetNext(pattern1, next1)print next1kmp_search(text1, pattern1, next1)print text1print pattern1#该片段来自于http://byrx.net

评论关闭