Revision: 18603
Updated Code
at October 4, 2009 00:50 by freephys
Updated Code
>>> for s in KnuthMorrisPratt([4,'a',3,5,6], [5,6]): print s 3 >>> for s in KnuthMorrisPratt([4,'a',3,5,6], [5,7]): print s (nothing) Here is another KMP implementation: def seq_in_seq(seq1,seq2): ''' Return the index where seq1 appears in seq2, or -1 if seq1 is not in seq2, using the Knuth-Morris-Pratt algorithm based heavily on code by Neale Pickett <[email protected]> found at: woozle.org/~neale/src/python/kmp.py >>> seq_in_seq(range(3),range(5)) 0 >>> seq_in_seq(range(3)[-1:],range(5)) 2 >>>seq_in_seq(range(6),range(5)) -1 ''' def compute_prefix_function(p): m = len(p) pi = [0] * m k = 0 for q in xrange(1, m): while k > 0 and p[k] != p[q]: k = pi[k - 1] if p[k] == p[q]: k = k + 1 pi[q] = k return pi t,p = list(tee(seq2)[0]), list(tee(seq1)[0]) m,n = len(p),len(t) pi = compute_prefix_function(p) q = 0 for i in range(n): while q > 0 and p[q] != t[i]: q = pi[q - 1] if p[q] == t[i]: q = q + 1 if q == m: return i - m + 1 return -1
Revision: 18602
Initial Code
Initial URL
Initial Description
Initial Title
Initial Tags
Initial Language
at October 4, 2009 00:46 by freephys
Initial Code
>>> for s in KnuthMorrisPratt([4,'a',3,5,6], [5,6]): print s 3 >>> for s in KnuthMorrisPratt([4,'a',3,5,6], [5,7]): print s (nothing)
Initial URL
http://stackoverflow.com/questions/425604?sort=oldest#sort-top
Initial Description
This is a generalization of the "string contains substring" problem to (more) arbitrary types. Given an sequence (such as a list or tuple), what's the best way of determining whether another sequence is inside it? As a bonus, it should return the index of the element where the subsequence starts: Example usage (Sequence in Sequence): >>> seq_in_seq([5,6], [4,'a',3,5,6]) 3 >>> seq_in_seq([5,7], [4,'a',3,5,6]) -1 # or None, or whatever I second the Knuth-Morris-Pratt algorithm. By the way, your problem (and the KMP solution) is exactly recipe 5.13 in Python Cookbook 2nd edition. You can find the related code at http://code.activestate.com/recipes/117214/ It finds all the correct subsequences in a given sequence, and should be used as an iterator: For larger ones, look at the Aho-Corasick algorithm.
Initial Title
Determine if a Sequence is in another sequence in Python : solution 1
Initial Tags
Initial Language
Python