Given a string S, we want to find the longest substring s of S that is a palindrome, that is, whose reverse is exactly the same as s.
First insert a special character ‘#’ that does not occur in S between each pair of adjacent characters of S, in front of S and at the back of S. After that, we only need to check palindromic substrings of odd length. From here on, S denotes the padded string and N its length.
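The padding step can be sketched in a few lines of Python. The function name `pad` and the check that the separator is absent from the input are my own assumptions for illustration, not part of the original post.

```python
def pad(s, sep="#"):
    """Insert sep before the first character, between adjacent
    characters, and after the last character of s."""
    if sep in s:
        # Assumption: the separator must not occur in the input.
        raise ValueError("separator must not occur in the input")
    return sep + "".join(c + sep for c in s)


print(pad("abba"))  # even-length palindrome becomes odd-length: #a#b#b#a#
print(pad("aba"))   # odd-length palindrome stays odd-length:   #a#b#a#
```

A string of length n becomes a string of length 2n+1, and every palindrome of the original string corresponds to an odd-length palindrome of the padded string that starts and ends with the separator.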
Let P[i] be the largest integer d such that S[i-d,…,i+d] is a palindrome. We calculate all P[i] from left to right. When calculating P[i], we have to compare S[i+1] with S[i-1], S[i+2] with S[i-2] and so on. A comparison is successful if the two characters are the same; otherwise it is unsuccessful. In fact, we can skip some of these comparisons by using the previously calculated values P[j], j<i.
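Let a be an index with P[a]+a=max{ P[j]+j : j<i }, i.e. the center of the palindrome found so far that reaches furthest to the right. If P[a]+a >= i, then we have
P[i] >= min{ P[2*a-i], 2*a-i-(a-P[a]) }.
Here 2*a-i is the mirror image of i about a, and the second term, which equals P[a]+a-i, is the distance from that mirror position to the left end of the palindrome centered at a. The comparisons for P[i] can therefore start from this lower bound instead of from zero.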
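As a minimal Python sketch of the procedure described above: it assumes the input `t` is already the padded string, and the function name `radii` and the variable names are illustrative rather than taken from the original post.

```python
def radii(t):
    """Return P, where P[i] is the largest d such that
    t[i-d : i+d+1] is a palindrome."""
    n = len(t)
    p = [0] * n
    a = 0  # index attaining max{ P[j] + j : j < i }
    for i in range(1, n):
        d = 0
        if p[a] + a >= i:
            # Lower bound from the mirror position 2*a - i.
            d = min(p[2 * a - i], 2 * a - i - (a - p[a]))
        # Extend by explicit character comparisons.
        while i - d - 1 >= 0 and i + d + 1 < n and t[i - d - 1] == t[i + d + 1]:
            d += 1
        p[i] = d
        if i + d > a + p[a]:
            a = i
    return p


print(radii("#a#b#b#a#"))
```

The only difference from the naive center-expansion method is the initial value of `d`; everything else is the plain comparison loop.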
Is the algorithm linear time? The answer is yes.
First, the overall number of unsuccessful comparisons is at most N, because the calculation of each P[i] stops at its first unsuccessful comparison.
A more careful analysis shows that once S[i] has been compared successfully with some S[k] (k<i), it is never again compared successfully with any S[j] (j<i). Hence the overall number of successful comparisons is also at most N.
So the overall number of comparisons is at most 2N.
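The 2N bound can be checked empirically. The sketch below is my own instrumentation, not the author's code: it counts successful and unsuccessful character comparisons separately (the hypothetical helper `count_comparisons`) and asserts the bound on random strings. Index range checks are not counted as character comparisons.

```python
import random


def count_comparisons(t):
    """Run the algorithm on t and return (successful, unsuccessful)
    character comparison counts."""
    n = len(t)
    p = [0] * n
    a = 0
    ok = fail = 0
    for i in range(1, n):
        d = 0
        if p[a] + a >= i:
            d = min(p[2 * a - i], 2 * a - i - (a - p[a]))
        while i - d - 1 >= 0 and i + d + 1 < n:
            if t[i - d - 1] == t[i + d + 1]:
                ok += 1
                d += 1
            else:
                fail += 1
                break
        p[i] = d
        if i + d > a + p[a]:
            a = i
    return ok, fail


random.seed(0)
for _ in range(1000):
    t = "".join(random.choice("ab") for _ in range(random.randint(1, 40)))
    ok, fail = count_comparisons(t)
    assert fail <= len(t)
    assert ok + fail <= 2 * len(t)
```

A passing run is only evidence, not a proof, but it is a quick way to convince yourself that the counting argument matches what the code does.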
It turns out that this algorithm is called Manacher’s algorithm.
@stone: hi, can you please clarify what ‘a’ is in P[a] ?
Thanks
paul
a=argmax{ P[j]+j | j<i }
Hi,
Your article seems quite interesting. Would you mind explaining the idea a bit with a trivial example? I tried for hours to understand your recurrence for computing the longest palindromic substring, but failed 😦
So, your help would be highly appreciated. Thanks.
-Josh
Hi,
Here is a simple example, for string “a#b#a#a#b#a”. Hope it will help you.
At position #0: P[0]=0, the longest palindrome is “a”.
At position #1: max of {P[0]+0} is 0, so we just check the “a#b” to determine P[1]. It turns out that P[1]=0
At position #2: max of { P[0]+0, P[1]+1} is P[1]+1=1, so we just check “#b#”, “a#b#a” to determine P[2]. It turns out that P[2]=2.
At position #3: max of { P[0]+0,P[1]+1,P[2]+2} is P[2]+2=4. We know that P[3]>=Min{ P[2*2-3], 2*2-3-(2-P[2]) } = 0, so we check “b#a” to determine that P[3]=0
At position #4: max of { P[0]+0,P[1]+1,P[2]+2,P[3]+3} is P[2]+2=4. So P[4]>=Min{ P[2*2-4], 2*2-4-(2-P[2]) } = 0, we check “#a#” (successful) and “b#a#a” (unsuccessful) to determine that P[4]=1
At position #5: max of { P[0]+0,P[1]+1,P[2]+2,P[3]+3, P[4]+4} is P[4]+4=5, so P[5]>=Min{P[2*4-5], 2*4-5-(4-P[4]) }=0, we check “a#a”,…,”a#b#a#a#b#a” to determine that P[5]=5.
At position #6: max of {P[0]+0,P[1]+1,P[2]+2,P[3]+3, P[4]+4, P[5]+5} is P[5]+5=10, so P[6]>=Min{ P[2*5-6], 2*5-6-(5-P[5])} =1, we check “a#a#b” to determine that P[6]=1. (here we don’t need to check “#a#”)
At position #7: max of {P[0]+0,P[1]+1,P[2]+2,P[3]+3, P[4]+4, P[5]+5,P[6]+6} is P[5]+5=10, so P[7]>=Min{ P[2*5-7], 2*5-7-(5-P[5]) }=0, we check “a#b” to determine that P[7]=0.
At position #8: max of {P[0]+0,P[1]+1,P[2]+2,P[3]+3, P[4]+4, P[5]+5,P[6]+6,…} is P[5]+5=10, so P[8]>=Min{ P[2*5-8], 2*5-8-(5-P[5])} =2,
and we don’t need to check any substring, because the palindrome already reaches the end of the string.
……
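The walk-through above can be reproduced mechanically. The following Python sketch (the helper name `trace` is mine) prints, for each position of “a#b#a#a#b#a”, the lower bound given by the Min{…} formula (or None when max{P[j]+j} is smaller than i and no bound applies) together with the final P[i].

```python
def trace(t):
    """Return a list of (i, lower_bound_or_None, P[i]) for each position."""
    n = len(t)
    p = [0] * n
    a = 0
    rows = [(0, None, 0)]
    for i in range(1, n):
        bound = None
        if p[a] + a >= i:
            bound = min(p[2 * a - i], 2 * a - i - (a - p[a]))
        d = bound or 0
        while i - d - 1 >= 0 and i + d + 1 < n and t[i - d - 1] == t[i + d + 1]:
            d += 1
        p[i] = d
        rows.append((i, bound, d))
        if i + d > a + p[a]:
            a = i
    return rows


for i, bound, value in trace("a#b#a#a#b#a"):
    print(i, bound, value)
```

Positions 9 and 10, which the example leaves out, both get a lower bound of 0 and a final value of 0.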
I am still confused …
how you come to this conclusion ::
Assume P[a]+a=max{ P[j]+j : j<i }. If P[a]+a >= i, then we have
P[i] >= min{ P[2*a-i], 2*a-i-(a- P[a])}.
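One way to see where the inequality comes from: the palindrome centered at a maps position i to its mirror image 2*a-i, and any palindrome around 2*a-i that stays inside S[a-P[a],…,a+P[a]] is reflected onto a palindrome around i. The term 2*a-i-(a-P[a]) is how far the mirror position is from the left end of that big palindrome. The Python sketch below (helper names are my own) checks the inequality by brute force on random strings; it is a sanity check, not a proof.

```python
import random


def brute_radii(t):
    """P[i] computed by plain center expansion, with no shortcuts."""
    p = []
    for i in range(len(t)):
        d = 0
        while i - d - 1 >= 0 and i + d + 1 < len(t) and t[i - d - 1] == t[i + d + 1]:
            d += 1
        p.append(d)
    return p


def bound_holds(t):
    """Check P[i] >= min{ P[2a-i], 2a-i-(a-P[a]) } for every applicable i."""
    p = brute_radii(t)
    for i in range(1, len(t)):
        a = max(range(i), key=lambda j: p[j] + j)
        if p[a] + a >= i:
            mirror = 2 * a - i
            if p[i] < min(p[mirror], mirror - (a - p[a])):
                return False
    return True


random.seed(0)
samples = ["".join(random.choice("ab") for _ in range(random.randint(1, 30)))
           for _ in range(1000)]
print(all(bound_holds(t) for t in samples))
```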
[…] Original article: https://zhuhongcheng.wordpress.com/2009/08/02/a-simple-linear-time-algorithm-for-finding-longest-pali… The original is actually fairly clear, but it is in English, so I am writing a Chinese version here. […]
@stone: 1. Shouldn’t the string be “#a#b#a#a#b#a#” instead of “a#b#a#a#b#a”?
2. I also have the doubt raised by SG.
Thanks
[…] Finding the longest palindromic substring of a string in O(N) time (Best explanation if you can read Chinese) » A simple linear time algorithm for finding longest palindrome sub-string » Finding Palindromes » Finding the Longest Palindromic Substring in Linear Time […]
[…] http://acm.hust.edu.cn:8080/judge/problem/viewSource.action?id=140283 https://zhuhongcheng.wordpress.com/2009/08/02/a-simple-linear-time-algorithm-for-finding-longest-pali… http://www.geeksforgeeks.org/archives/19155 So I am summarizing it here for my own reference. A brief description: […]
[…] The algorithm is called Manacher Algorithm. You can find more details about the algorithm in this page […]
you are a legend :D, can’t find a better explanation than this one.
[…] http://www.felix021.com/blog/page.php?pageid=6 http://blog.csdn.net/ggggiqnypgjg/article/details/6645824 https://zhuhongcheng.wordpress.com/2009/08/02/a-simple-linear-time-algorithm-for-finding-longest-pali… […]
[…] Based on these two articles: http://blog.csdn.net/ggggiqnypgjg/article/details/6645824 https://zhuhongcheng.wordpress.com/2009/08/02/a-simple-linear-time-algorithm-for-finding-longest-pali… […]
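[…] Problem link: http://poj.org/problem?id=3974 I do not intend to introduce Manacher’s algorithm in detail here, because there are far too many introductory tutorials online. I only raise the two most important points to deepen understanding. The two core ideas of Manacher’s algorithm: 1. Insert a special character into the original string so that its length becomes odd. 2. Use symmetry to cut down the computation and reach linear time complexity. For example, if the original string is xabbba5, it can be transformed into #x#a#b#b#b#a#5#, and then this new string is processed. An array P[i] then records how far the longest palindromic substring centered at character S[i] extends to the left/right. For the computation and the use of symmetry, see this post by 华山大师兄: http://www.cnblogs.com/biyeymyhjob/archive/2012/10/04/2711527.html which is very thorough and well illustrated. For a rigorous proof that Manacher’s algorithm runs in linear time, see the following blog post: A simple linear time algorithm for finding longest palindrome sub-string […]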
[…] https://zhuhongcheng.wordpress.com/2009/08/02/a-simple-linear-time-algorithm-for-finding-longest-pal… […]
[…] A blog about Manacher’s Algorithm […]
[…] First, here is a very good article: A simple linear time algorithm for finding the longest palindrome substring […]
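[…] How to compute, in O(n) time, the longest palindrome centered at each position of a string. Here I repost a translation of a write-up of Manacher’s algorithm. Original article: https://zhuhongcheng.wordpress.com/2009/08/02/a-simple-linear-time-algorithm-for-finding-longest-pali… The original is actually fairly clear, but it is in English, so I am writing a Chinese version here.
First: everyone knows what a palindrome is. The problem this algorithm solves is how long the longest palindromic substring of a string is. In O(n) time, i.e. linear time, it computes the length of the longest palindrome centered at each character.
One clever aspect of the algorithm is that it handles odd-length and even-length palindromes uniformly, which has always been an annoying part of palindrome problems. Another nice aspect is that it fully exploits the structure of the character matches and avoids a large number of unnecessary repeated comparisons.
The algorithm roughly goes as follows. First insert a separator between every two adjacent characters; the separator must not occur in the original string. Usually ‘#’ is used. This unifies odd-length and even-length palindromes (see the example below, where all palindromes have odd length). Then an auxiliary array P records information about the longest palindrome centered at each character. P[id] describes the longest palindrome centered at str[id]: counting str[id] as the first character, this palindrome extends P[id] characters to the right.
Original string: w a a b w s w f d
New string: # w # a # a # b # w # s # w # f # d #
Auxiliary array P: 1 2 1 2 3 2 1 2 1 2 1 4 1 2 1 2 1 2 1
There is a nice property here: P[id]-1 is the length of that palindrome in the original string (that is, with the ‘#’ separators removed). If this is not entirely clear, take out some paper and draw it yourself. Of course everyone may write this differently, but I think the general idea is the same.
OK, let us continue. The key question now is how to compute the array P in O(n) time. Once P is known, the longest palindromic substring can be found with a single scan.
The algorithm scans linearly from front to back, so when we are about to compute P[i], the values P[j] before i are already known. We use mx to record the rightmost position reached by the palindromes found before i, and the variable id to record the center that achieves this mx. (Note: to prevent out-of-bounds character comparisons, I added another special character ‘$’ in front of the ‘#’-padded string, so my new string is indexed from 1.) At this point we can paste some code.
    /* str, p, n and MIN are defined elsewhere; str[0] is '$' and str ends with '\0'. */
    void pk()
    {
        int i;
        int mx = 0;   /* rightmost position reached so far (exclusive) */
        int id = 0;   /* center of the palindrome that reaches mx */
        for (i = 1; i < n; i++) {
            if (mx > i)
                p[i] = MIN(p[2 * id - i], mx - i);   /* reuse the mirror position */
            else
                p[i] = 1;
            /* extend by explicit comparisons */
            for (; str[i + p[i]] == str[i - p[i]]; p[i]++)
                ;
            if (p[i] + i > mx) {
                mx = p[i] + i;
                id = i;
            }
        }
    }
[…]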
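To illustrate the P[id]-1 property from the excerpt above, here is a hedged Python port of the quoted routine that also recovers the longest palindrome of the original string. The function name `longest_palindrome`, the end sentinel ‘^’ (standing in for the C string terminator) and the start-index formula are my own additions; both sentinels are assumed not to occur in the input.

```python
def longest_palindrome(s):
    """Return (longest palindromic substring of s, array P for the padded string),
    using the convention where P counts the center itself (P >= 1)."""
    # '$' guards the left end, '^' guards the right end (assumed absent from s).
    t = "$#" + "".join(c + "#" for c in s) + "^"
    n = len(t) - 1  # index of the '^' sentinel
    p = [0] * (n + 1)
    mx = idx = 0
    for i in range(1, n):
        p[i] = min(p[2 * idx - i], mx - i) if mx > i else 1
        while t[i + p[i]] == t[i - p[i]]:
            p[i] += 1
        if p[i] + i > mx:
            mx, idx = p[i] + i, i
    center = max(range(1, n), key=lambda i: p[i])
    length = p[center] - 1              # length in the original string
    start = (center - p[center]) // 2   # start index in the original string
    return s[start:start + length], p[1:n]


word, p = longest_palindrome("waabwswfd")
print(word)
print(p)
```

For the example string from the excerpt, the returned array matches the row “Auxiliary array P” and the recovered substring is the one centered at ‘s’.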