• Announcements

Archived

This topic is now archived and is closed to further replies.

Finding one string inside another

Recommended Posts

char string1[] = "Programmming"; char string2[] = "gram"; how would i find where string2 is inside string1? i know i could use the find() function but i need to write my own function to do it and i''m stumped!! can somebody please help? thanks NiNo

Share on other sites
Is this a do my homework for me post? Otherwise I see no reason not to use string::find () or strstr ()

Hint: Loop though each character in string1 and look for the first character in string2. If you find the first character, check the next character, etc., etc.

Share on other sites
no its not a "do my homewotrk for me question" its a "i''ve been trying to figure this out for hours and i''m stuck and its due tomorrow" question

i know i should loop through like you said but i just cant figure out how to do it right for some reason

NiNo

Share on other sites
Here is the source of Microsoft''s strstr function:

  char * __cdecl strstr ( const char * str1, const char * str2 ){ char *cp = (char *) str1; char *s1, *s2; if ( !*str2 ) return((char *)str1); while (*cp) { s1 = cp; s2 = (char *) str2; while ( *s1 && *s2 && !(*s1-*s2) ) s1++, s2++; if (!*s2) return(cp); cp++; } return(NULL);}

Share on other sites
quote:
Original post by nino
no its not a "do my homewotrk for me question" its a "i''ve been trying to figure this out for hours and i''m stuck and its due tomorrow" question

Hmmm... It''s not homework, but it''s DUE tomorrow?
Oops...

Share on other sites
Because its due tomorrow doesn''t qualify it as a homework assignment. It makes it a life and death issue

Share on other sites
There are several ways of doing this. The first approach is the one detailed by JY, and is a brute force approach. A more efficient approach would be to use the Knuth-Morris-Prath or Boyer-Moore algormithms. Both of these depends on first examining the pattern (the string you are looking for) for patterns. For natural language texts (e.g. english) BM is usually significantly more efficient than both other methods.

Here is some source-code for BM (searching ASCII-strings)
  int BM_Search (char P[], int plen, char T[], int tlen) { // Returns the index of the first occurence of P in T // Returns -1 if P is not found in T int i, j; int skip[255]; for (i = 0; i < 255; i++) { for (j = 0; j < plen; j++) skip[P[j]] = plen - j - 1; } i = plen-1; j = plen-1; while (j >= 0 && i < tlen) { if (T[i] == P[j]) { i--; j--; } else { if (plen - j > skip[T[i]]) i = i + plen - j; else i = i + skip[T[i]]; j = plen - 1; } } if (j == -1) return (i+1); else return -1;}

Without going in to too much detail on how this algorithm works it basically looks at the character in the text on which a mismatch occurs, and moves the pattern forward until the last occurence of that character in the pattern is lined up with it.
Oh, and it also checks from the end of the pattern backwards, rather than from the start of the pattern forwards.

Maybe a small example would help a little.
Assuming the text-string you are searcing in is
I REALLY LOVE DEBUGGING MY CODE
and the string you are searching for is
MY CODE

  First, the algorithm creates a skip-array that will look like thisskip[''E''] = 0skip[''D''] = 1skip[''O''] = 2skip[''C''] = 3skip['' ''] = 4skip[''Y''] = 5skip[''M''] = 6all other skips have the value 7Starting from the start we getI REALLY LOVE DEBUGGING MY CODEMY CODEStarting at the end of the pattern we get a mismatch on L (in the text).Since skip[''L''] is 7 we move the pattern 7 positions to the right, thus gettingI REALLY LOVE DEBUGGING MY CODE MY CODEAt this point we get a mismatch on '' '' (in the text).Since skip['' ''] is 4 we move the pattern 4 positions, gettingI REALLY LOVE DEBUGGING MY CODE MY CODENote that this skip moved the last occurence of '' '' in the pattern to match the '' '' we just found in the text. This is really the meat and bones of the algorithm, as whenever we find a character in the text we know we have in the pattern, we move the pattern so that that character matches in the pattern, and if we find a character in the text that isn''t in the pattern we move the pattern just past that point in the text.Anyway, that position above gives us a mismatch on ''U'' which doesn''t exist in our pattern, so again we move 7 positions gettingI REALLY LOVE DEBUGGING MY CODE MY CODEskip[''M''] is 6, so we move 6 spaces and getI REALLY LOVE DEBUGGING MY CODE MY CODEAnd we''re done.This algorithm is clearly a lot faster than a simple brute-force approach as the pattern is literally flying along often moving 7 spaces at the time without comparisons, while the brute-force approach would never move more than 1 before comparing again.

-Neophyte

- Death awaits you all with nasty, big, pointy teeth. -

Share on other sites
Anyway, the AP above was me, and boy! was that formatting ever screwed up. Sorry about that.

-Neophyte

- Death awaits you all with nasty, big, pointy teeth. -

Share on other sites
Looks like bullshit to me. The skip local array is never initialized, and the for i loop has no use.

Y.

Share on other sites
I think I would with the tried and tested method. It''s perfectly efficient enough.

Share on other sites
quote:
Original post by Ysaneya
Looks like bullshit to me. The skip local array is never initialized, and the for i loop has no use.

Y.

Doesn't matter what it looks like to you. It works.

But you're correct in saying that first i-loop is totally unnecessary. It doesn't do anything.

Edited by - Red Ant on November 15, 2001 6:02:49 AM

Share on other sites
You''re correct, that was a typo (or several) on my part. The correct loop should look like this:
  // Originally// for (i = 0; i < 255; i++) {// for (j = 0; j < plen; j++)// skip[P[j]] = plen - j - 1// }for (i = 0; i < 255; i++) skip[i] = plen;for (i = 0; i < plen; i++) skip[P[i]] = plen - i - 1;

-Neophyte

- Death awaits you all with nasty, big, pointy teeth. -

• Forum Statistics

• Total Topics
627719
• Total Posts
2978789

• 9
• 21
• 14
• 12
• 42