bad words filter?

orbitafx    120
Hello all, first post. I'm having a problem with an exercise in a book I'm reading (Programming: Principles and Practice, Stroustrup). I want to compare two vectors of words in a loop:

vector<string>words[2];
words[0]="dog";
words[1]="cat";
words[2]="human";

and

vector<string>badwords[2];
badwords[0]="hell";
badwords[1]="damn";

So I'll loop through the values; if I find a bad word I'll replace it with "bad word", and if not, I'll print the word. I tried using a nested loop, but it didn't quite work out as planned. If you want me to post the code I had, I will when I wake up.

assainator    685
Something like this?

#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> words(3); // make a vector of 3 words and fill it
    words[0] = "dog";
    words[1] = "cat";
    words[2] = "human";

    std::vector<std::string> badWords(2); // make a vector of 2 bad words and fill it
    badWords[0] = "hell";
    badWords[1] = "damn";

    for (unsigned int i = 0; i < words.size(); i++) // loop through all words in the words vector
    {
        bool foundBadWord = false; // a boolean to record whether words[i] is a bad word
        for (unsigned int j = 0; j < badWords.size(); j++) // loop through all bad words
        {
            if (words[i] == badWords[j]) // the word is bad, so set the boolean
            {
                foundBadWord = true;
            }
        }
        if (foundBadWord) // the word was a bad word, so print "bad word"
        {
            std::cout << "bad word" << std::endl;
        }
        else // otherwise, print the word itself
        {
            std::cout << words[i] << std::endl;
        }
    }
}




Hope it helps,
assainator

MadMax1992    166
All right, your problem in pseudocode:

loop through all the words
    set isabadword-flag to false
    loop through all bad words
        if word == badword
            set isabadword-flag to true
        endif
    endloop

    if isabadword-flag was set
        print "badword"
    else
        print word
    endif
endloop


Try to implement that and come back if you have any problems. I would like to see your code to see what you're doing wrong.

Edit: I got ninja'd...

mattd    1078
Don't forget to break after finding a matching bad word in your inner loop. There's no need to continue searching after setting your is-bad-word flag.
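
A minimal sketch of the inner loop with the early exit, assuming the bad words live in a std::vector<std::string>. The helper name is_bad_word is made up for illustration and is not from the book or the posts above:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns true if `word` appears in `badwords`,
// stopping at the first match instead of scanning the rest of the list.
bool is_bad_word(const std::string& word,
                 const std::vector<std::string>& badwords)
{
    bool found = false;
    for (std::size_t j = 0; j < badwords.size(); ++j)
    {
        if (word == badwords[j])
        {
            found = true;
            break; // the flag is set, so the remaining bad words don't matter
        }
    }
    return found;
}
```

The outer loop over words would then call is_bad_word(words[i], badwords) and print either the word or the replacement text.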

Also, here's a more idiomatic C++ way, taking advantage of the STL:

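#include <algorithm>
#include <iostream>

// words and badwords are the two vectors from the original post
for(std::size_t i = 0; i < words.size(); ++i)
{
    if(std::find(badwords.begin(), badwords.end(), words[i]) == badwords.end())
        std::cout << words[i] << " "; // not found in badwords, print as-is
    else
        std::cout << "[CENSORED!] ";
}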



Also, you specify the initial size of a std::vector using its constructor, like this:
vector<string> words(3);
You don't use square brackets for that.

Aaand.. you need to specify an initial size of 3 for words, not 2! :)
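
To put the two points together, here is a rough sketch that sizes a vector through its constructor and uses std::find for the lookup. The function name censor and the choice to return a new vector (rather than print directly) are assumptions made for the example, not something from the book:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns a copy of `words` in which every entry that
// also appears in `badwords` is replaced by "[CENSORED!]".
std::vector<std::string> censor(const std::vector<std::string>& words,
                                const std::vector<std::string>& badwords)
{
    // The constructor argument creates words.size() empty strings, so
    // result[i] below refers to an element that already exists.
    std::vector<std::string> result(words.size());
    for (std::size_t i = 0; i < words.size(); ++i)
    {
        // std::find returns badwords.end() when words[i] is not in the list.
        const bool is_bad =
            std::find(badwords.begin(), badwords.end(), words[i]) != badwords.end();
        result[i] = is_bad ? std::string("[CENSORED!]") : words[i];
    }
    return result;
}
```

Printing is then a plain loop over the returned vector. Without the size passed to the constructor, result would be empty and result[i] would be out of bounds, which is the same mistake as writing to words[2] in a vector of size 2.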

Zahlman    1682
When I had to do this for a real-world project, I concatenated the bad words together with '|', compiled the result as a regular expression, cached the compiled regular expression, and would then run it against whatever input (with a bit of filtering). Depending on your requirements, you might need \b or something on either side of each word in the regular expression.

OTOH, I was doing this in Python, where it's considerably easier to do something like that. :) But the point is that once compiled, a regular expression is much faster at this sort of thing, at least in the algorithmic-complexity sense.
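
For what it's worth, roughly the same idea can be sketched in C++11 with std::regex. This is not the Python code from that project; the function names are made up, and it assumes the bad-word list is non-empty and contains only letters and digits, so nothing in it needs escaping:

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: joins the bad words with '|' and wraps them in \b
// word boundaries, e.g. \b(?:hell|damn)\b, then compiles the pattern.
// Assumes badwords is non-empty and has no regex metacharacters.
std::regex make_bad_word_regex(const std::vector<std::string>& badwords)
{
    std::string pattern = "\\b(?:";
    for (std::size_t i = 0; i < badwords.size(); ++i)
    {
        if (i != 0)
            pattern += '|';
        pattern += badwords[i];
    }
    pattern += ")\\b";
    return std::regex(pattern); // default ECMAScript grammar
}

// Hypothetical helper: replaces every whole-word match with "[CENSORED!]".
std::string censor_text(const std::string& text, const std::regex& bad_word_regex)
{
    return std::regex_replace(text, bad_word_regex, "[CENSORED!]");
}
```

The std::regex object would be built once and reused for every input, which corresponds to caching the compiled expression. The \b on either side keeps "hell" from matching inside "hello".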

orbitafx    120
Thanks for the help. Now I have a few ways to do it next time.

I definitely learned something new. It looks like the find() function is the best way, but this also made me realize how useful booleans are.
