Labels

Wednesday, March 25, 2015

Regular Expression Matching (Advanced)

def Match(string, pattern):
        '*' star matches zero or more preceding characters
       '.' each dot matches exactly one character
       '^' prefix match, meaning that input string must start with exactly the same string following '^', e.g., '^abc' will match 'abcdefg', but 'cdefg' doesn't.
        '$' suffix match, meaning that input string must end with exactly the same string following '$', e.g., '$abc' will match 'cdefgabc', but 'abcdefg' doesn't
        without any special character: return True if pattern is in string (contains, not exactly same), and False otherwise
    # Your code here


Naive Way: This question is really hard in that it requires to match exactly part of the string. And it doesn't need to start from beginning of the string. I am assuming that ^ and $ will always appear as the first character of pattern and either of them will appear or nor of them appear.

My brute force way to match from every possible position. Use a recursive matching.

 public class Solution{  
   
      public static void main(String args[]){  
           Solution s = new Solution();  
           System.out.println(s.match("aabab",".*bab"));  
      }  
   
      public boolean match(String string, String pattern){  
           // edge case  
           if(pattern==null || pattern.length()==0) return string==null || string.length()==0;  
           if(string==null || string.length()==0) return pattern==null || pattern.length()==0;  
   
           // get suffix and prefix  
           boolean isPrefixed = false, isSuffixed = false;  
           if(pattern.charAt(0)=='^') isPrefixed = true;  
           if(pattern.charAt(0)=='$') isSuffixed = true;  
   
           // main process  
           char[] s = string.toCharArray();  
           char[] p = pattern.toCharArray();  
           if(isPrefixed) return match(s, 0, p, 1, -1);  
           else if(isSuffixed){  
                for(int i = 1;i < s.length;i++)  
                     if(match(s,i,p,1,1)) return true;  
           }else{  
                for(int i = 0;i < s.length;i++){  
                     if(i==0 && match(s,i,p,0,-1)) return true;  
                     else if(i!=0 && match(s,i,p,0,0)) return true;  
                }  
           }  
           return false;  
      }  
   
      private boolean match(char[] string, int index1, char[] pattern, int index2, int matchEnd){  
           // ending case  
           if(index2 == pattern.length){  
                if(matchEnd==1) return index1==string.length;  
                else if(matchEnd==-1) return index1 < string.length;  
                else return true;  
           }  
           if(index1 < 0 || index1 >= string.length || index2 >= pattern.length) return false;  
             
   
           int i = index1, j = index2;  
           if(string[i] == pattern[j] || pattern[j] == '.') return match(string, i+1, pattern, j+1, matchEnd);  
           else if(j+1 < pattern.length && pattern[j+1] == '*') return match(string, i, pattern, j+2, matchEnd);  
           else if(pattern[j] == '*'){  
                int u = i-2;  
                while(u < string.length && (u==i-2 || string[u] == pattern[j-1] || pattern[j-1] == '.'))  
                     if(match(string, u++, pattern, j+1, matchEnd)) return true;  
           }  
           return false;  
      }  
 }  

No comments:

Post a Comment