Basic question


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

6 replies to this topic

#1 noatom   Members   -  Reputation: 773


Posted 02 March 2013 - 04:36 AM

What's the easiest way to find how many times every word appears in a string?

 

For example, in: "Outside there are twelve birds,twelve cars and twelve trees".

 

Should I just search the string for the first whitespace, save the position, extract everything before it into a separate string, and delete that part from the original? After that I would search the string for that word, and every time I find it, delete it and increment an int so I know how many times I found it. Then repeat the process?

 

Is there a better way?

 

Is it possible to save every word that the user inputs via the console in a vector?

 

Like:

for every word received as input, vector.push_back(word)

 

I thought about that, but I can't figure out a way to take each word separately. I mean, when the user writes a sentence and presses Enter, all the words will be received at once...

 

 




#2 EWClay   Members   -  Reputation: 659


Posted 02 March 2013 - 05:57 AM

There is a much easier way using the standard library.

Create a stringstream from the input string and read one word at a time from it, or use an input stream from the console (it will read one word at a time unless you use getline).

Store the words in a map of <string, int>. The first time you insert a word, store 1. If the word is already in the map, increment the integer.
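
A minimal sketch of this approach, assuming the sentence is already held in a std::string. The names split_words and count_words are made up for illustration. Punctuation is not handled here, so "birds,twelve" would be read as a single word.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Split a line into whitespace-separated words (hypothetical helper).
std::vector<std::string> split_words(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> words;
    std::string word;
    while (in >> word) {        // operator>> stops at whitespace
        words.push_back(word);
    }
    return words;
}

// Count occurrences with explicit first-insert logic, as described above.
std::map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::map<std::string, int> counts;
    for (const std::string& word : words) {
        std::map<std::string, int>::iterator it = counts.find(word);
        if (it == counts.end()) {
            counts.insert(std::make_pair(word, 1));  // first time: store 1
        } else {
            ++it->second;                            // already present: increment
        }
    }
    return counts;
}
```

The vector returned by split_words also covers the "push_back every word" idea from the question: the whole line arrives at once, and the stream then hands the words over one at a time.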

#3 King Mir   Members   -  Reputation: 1916


Posted 02 March 2013 - 06:10 AM

You don't need a stringstream if your input is already coming from a stream. But otherwise, EWClay has the right of it.

cin >> string_var will read a string of characters up to the next whitespace, effectively reading a word at a time. You will need some post-processing to get rid of punctuation.
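
One possible way to deal with the punctuation, sketched here as a pre-processing step rather than post-processing: turn every punctuation character into a space before extracting words. That also separates "birds,twelve" in the original sentence, which has no space after the comma. The helper names punctuation_to_spaces and extract_words are hypothetical.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Replace every punctuation character with a space so that operator>>
// treats it as a word boundary. This is only one possible policy.
std::string punctuation_to_spaces(std::string text) {
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        // Cast to unsigned char: std::ispunct is undefined for negative values.
        if (std::ispunct(static_cast<unsigned char>(text[i]))) {
            text[i] = ' ';
        }
    }
    return text;
}

// Extract the words of a sentence, ignoring punctuation.
std::vector<std::string> extract_words(const std::string& text) {
    std::istringstream in(punctuation_to_spaces(text));
    std::vector<std::string> words;
    std::string word;
    while (in >> word) {
        words.push_back(word);
    }
    return words;
}
```

Note that this policy also splits contractions: "don't" becomes "don" and "t". If that matters, the apostrophe would need special treatment.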

#4 Alpha_ProgDes   Crossbones+   -  Reputation: 4684


Posted 02 March 2013 - 06:13 AM

Couldn't you ignore punctuation and do something like "if (input_string.contains("twelve")) then ++twelve_counter"?



#5 Brother Bob   Moderators   -  Reputation: 7796


Posted 02 March 2013 - 06:41 AM

Store the words in a map of <string, int>. The first time you insert a word, store 1. If the word is already in the map, increment the integer.

It's even easier than that, because you don't need special logic to handle the first insert. The map's [] operator constructs the value if the key is not present, and integers are value-initialized to zero, so just go ahead and increment every time. If the key doesn't exist, its value is initialized to zero before being incremented to one, and everything is fine.
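
As a minimal sketch, assuming whitespace-separated input with no punctuation handling, the whole count comes down to one line inside the read loop. The function name tally_words is made up for illustration.

```cpp
#include <map>
#include <sstream>
#include <string>

// Count how often each whitespace-separated word occurs in text.
std::map<std::string, int> tally_words(const std::string& text) {
    std::istringstream in(text);
    std::map<std::string, int> counts;
    std::string word;
    while (in >> word) {
        // operator[] inserts a value-initialized int (0) for a missing key,
        // so the first occurrence ends up as 1 without any special case.
        ++counts[word];
    }
    return counts;
}
```

Reading from the console instead would only require replacing the istringstream with std::cin in the loop condition.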



#6 TMarques   Members   -  Reputation: 189


Posted 02 March 2013 - 10:41 AM

It's not the easiest way, but a radix tree gives you a great deal of latitude in handling string searches (e.g. finding the position of each word, how many times each word appears, how many words have a given prefix, etc.).

 

If you are managing very large documents with thousands of words, a radix tree can greatly reduce the time string search operations take compared to a vector.

 

I don't know if there's a radix tree library for C, but it's worth taking a look at. It's hard to understand but very easy to implement.
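
To give a rough idea of the structure, here is a sketch of a plain, uncompressed trie in C++. It is not a true radix tree, which would additionally merge chains of single-child nodes into one edge, but it answers the same kind of queries. The class WordTrie and all of its members are hypothetical names.

```cpp
#include <map>
#include <memory>
#include <string>

// Plain trie: one character per edge, no path compression.
class WordTrie {
    struct Node {
        Node() : passing(0), ending(0) {}
        std::map<char, std::unique_ptr<Node>> children;
        int passing;  // inserted words whose path goes through this node
        int ending;   // inserted words that end exactly at this node
    };

public:
    // Record one occurrence of word.
    void insert(const std::string& word) {
        Node* node = &root_;
        ++node->passing;
        for (char ch : word) {
            std::unique_ptr<Node>& child = node->children[ch];
            if (!child) {
                child.reset(new Node());
            }
            node = child.get();
            ++node->passing;
        }
        ++node->ending;
    }

    // How many times was exactly this word inserted?
    int count(const std::string& word) const {
        const Node* node = find(word);
        return node ? node->ending : 0;
    }

    // How many inserted words (repeats included) start with prefix?
    int count_prefix(const std::string& prefix) const {
        const Node* node = find(prefix);
        return node ? node->passing : 0;
    }

private:
    // Walk the trie along key; return nullptr if the path does not exist.
    const Node* find(const std::string& key) const {
        const Node* node = &root_;
        for (char ch : key) {
            auto it = node->children.find(ch);
            if (it == node->children.end()) {
                return nullptr;
            }
            node = it->second.get();
        }
        return node;
    }

    Node root_;
};
```

For simply counting words in one sentence, a std::map is far less code. A structure like this only starts to pay off once prefix queries are needed.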


Tiago.M
Web Developer - Aspiring CG Programmer

#7 Ectara   Crossbones+   -  Reputation: 2827


Posted 03 March 2013 - 10:05 AM

Couldn't you ignore punctuation and do something like "if (input_string.contains("twelve")) then ++twelve_counter"?

Only if you know your input precisely (in which case you probably already know how many of each word there are).

Otherwise, you could easily catch words that appear in a longer, different word, resulting in erroneous counts.
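
A small sketch of that failure mode, using std::string::find, since std::string has no contains() before C++23. The function name count_substring is made up for illustration.

```cpp
#include <string>

// Count non-overlapping occurrences of needle anywhere in text,
// including occurrences inside longer words.
int count_substring(const std::string& text, const std::string& needle) {
    if (needle.empty()) {
        return 0;  // avoid looping forever on an empty search string
    }
    int count = 0;
    std::string::size_type pos = text.find(needle);
    while (pos != std::string::npos) {
        ++count;
        pos = text.find(needle, pos + needle.size());
    }
    return count;
}
```

For example, count_substring("nine cars and ninety canines", "nine") returns 3, although the word "nine" appears only once: the other two matches come from "ninety" and "canines".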


Edited by Ectara, 03 March 2013 - 10:06 AM.







