text representation for recognition

This topic is 4336 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

I am creating a program that learns to recognize text. I pretty much have everything in my AI classes working, but I'm not sure how to feed the text into the network. I heard that inputting a massive array of booleans was a bad idea, but I don't know of any other way to do it. Thanks in advance.

Just think of the 2D matrix that contains the pattern as a 1D vector, formed by reading the matrix in either row-major or column-major order. Then use the 1D vector as the input vector (so you'll have as many input nodes as there are components in the vector). Make sense?
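
To make the flattening concrete, here is a minimal sketch in Python. The 3x3 glyph and the function names are made up for illustration; a real character image would be larger, but the idea is the same.

```python
def flatten_row_major(grid):
    """Concatenate the rows of a 2D grid into one flat list."""
    return [cell for row in grid for cell in row]


def flatten_column_major(grid):
    """Concatenate the columns of a 2D grid into one flat list."""
    return [cell for column in zip(*grid) for cell in column]


# Hypothetical 3x3 boolean glyph (a rough "T"), stored as 0/1.
glyph = [
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]

row_vector = flatten_row_major(glyph)        # one input node per element
column_vector = flatten_column_major(glyph)  # same pixels, different order
```

Either ordering should work, as long as the same one is used for training and for recognition.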

Cheers,

Timkin

BTW: are you preprocessing the character images before you feed them into the network, or just putting in raw images?

So, you are saying I should just input all the booleans as a massive array of inputs? Right now I process the images to be boolean and always a set size. I haven't actually tried inputting anything yet because I was looking for some other way to process the characters... Guess I'll see how it works that way then.

Well, I feel kinda dumb. Don't know why I was thinking that wouldn't work. I had to fiddle with some of my settings to get good results, but it seems to be working fine. I scale the images down to 10x10 arrays of integers (each representing how many pixels were set in that area of the image) and then apply a touch of blur (which seems to help with generalizing). It's in an applet, so anyone interested can check it out here: http://www.alrecenk.cjb.net:81/java/show/ai/math/test8.html
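
For anyone who wants to try the same preprocessing, here is a rough Python sketch of the idea. It is not the applet's code: the function names, the assumption of a square image whose side is a multiple of 10, and the choice of a 3x3 box blur are all guesses made for illustration.

```python
def downsample_counts(image, out_size=10):
    """Count the set pixels falling into each cell of an out_size x out_size grid.

    Assumes a square boolean image whose side length is a multiple of out_size.
    """
    cell = len(image) // out_size
    counts = [[0] * out_size for _ in range(out_size)]
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if pixel:
                counts[y // cell][x // cell] += 1
    return counts


def box_blur(grid):
    """Replace each cell with the mean of its 3x3 neighbourhood (clipped at edges)."""
    size = len(grid)
    blurred = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            window = [
                grid[ny][nx]
                for ny in range(max(0, y - 1), min(size, y + 2))
                for nx in range(max(0, x - 1), min(size, x + 2))
            ]
            blurred[y][x] = sum(window) / len(window)
    return blurred


# Hypothetical 20x20 image containing a two-pixel-wide vertical bar.
image = [[1 if x in (4, 5) else 0 for x in range(20)] for y in range(20)]
counts = downsample_counts(image)  # 10x10 grid of pixel counts
blurred = box_blur(counts)         # smoothed version fed to the learner
```

The blur spreads each stroke into neighbouring cells, so a character drawn slightly off position still produces a similar input vector.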

Good to hear it's working. You might want to consider performing a principal component analysis (PCA) on the training data and then transforming each input vector into the principal-component basis. OCR tasks usually show better results on test data with this kind of preprocessing.
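
As a rough illustration of that transformation, here is a small pure-Python sketch. It finds the leading eigenvectors of the sample covariance matrix by power iteration with deflation, which is just one simple way to do it; in practice a linear algebra library would be the usual choice. All function names and the toy data are made up for the example.

```python
import math
import random


def mean_vector(samples):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(samples) for column in zip(*samples)]


def covariance(samples, mean):
    """Sample covariance matrix (divides by n - 1)."""
    centred = [[v - m for v, m in zip(s, mean)] for s in samples]
    d = len(mean)
    return [[sum(c[i] * c[j] for c in centred) / (len(samples) - 1)
             for j in range(d)] for i in range(d)]


def top_components(cov, k, iterations=200, seed=0):
    """Leading k unit eigenvectors of cov via power iteration with deflation."""
    rng = random.Random(seed)
    d = len(cov)
    cov = [row[:] for row in cov]  # work on a copy, deflation modifies it
    components = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:  # no variance left in the deflated matrix
                break
            v = [x / norm for x in w]
        cov_v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        eigenvalue = sum(a * b for a, b in zip(v, cov_v))
        components.append(v)
        # Deflate: remove the variance explained by this component.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return components


def project(sample, mean, components):
    """Coordinates of a sample in the principal-component basis."""
    centred = [v - m for v, m in zip(sample, mean)]
    return [sum(c * x for c, x in zip(comp, centred)) for comp in components]


# Toy 2D training set lying on the line y = 2x.
training = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mean = mean_vector(training)
components = top_components(covariance(training, mean), k=1)
reduced = [project(s, mean, components) for s in training]
```

The mean and components are computed from the training data only; test inputs are then projected with the same `mean` and `components` before being classified.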

Have you tested the quality of your OCR network by adding noise to the test inputs and noting the classification error rate as a function of the added noise variance? This is a good way to analyse your results and helps to quantify your network's performance.
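
A test like that could be sketched along the following lines. The nearest-template classifier here is only a stand-in so the example runs on its own; the function names, the zero-mean Gaussian noise model, and the tiny 3x3 templates are all assumptions for illustration, and any trained recogniser could be passed in as `classify` instead.

```python
import math
import random


def nearest_template(vector, templates):
    """Stand-in classifier: return the label of the closest stored template."""
    return min(
        templates,
        key=lambda label: sum((a - b) ** 2
                              for a, b in zip(vector, templates[label])),
    )


def error_rate_vs_noise(classify, test_set, variances, trials=50, seed=0):
    """Return (variance, error rate) pairs for zero-mean Gaussian input noise."""
    rng = random.Random(seed)
    results = []
    for variance in variances:
        sigma = math.sqrt(variance)
        errors = 0
        total = 0
        for _ in range(trials):
            for vector, label in test_set:
                noisy = [x + rng.gauss(0.0, sigma) for x in vector]
                if classify(noisy) != label:
                    errors += 1
                total += 1
        results.append((variance, errors / total))
    return results


# Hypothetical flattened 3x3 glyphs used as both templates and test inputs.
templates = {
    "I": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "L": [1, 0, 0, 1, 0, 0, 1, 1, 1],
}
test_set = [(vector, label) for label, vector in templates.items()]
curve = error_rate_vs_noise(
    lambda v: nearest_template(v, templates),
    test_set,
    variances=[0.0, 0.25, 1.0, 4.0],
)
```

Plotting the error rate in `curve` against the variance shows how quickly recognition degrades as the inputs get noisier.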

Cheers,

Timkin

>>You might want to consider performing a principal component analysis on
>>the training data and then transforming each input vector into the principal
>>basis space.

VERY nice idea. Finally some usage for those esoteric PCA classes :P

Btw, where do you guys get the training data for these OCR problems?

-- Mikko

Principal component analysis seems interesting. I'll look into that when I have more time. I'm actually not using a network. I've been experimenting with other types of "function approximators". I'm hoping to create something more efficient than an ANN, but it is hard to say how efficient my systems are right now, because all my test programs are so simple that they solve almost instantaneously. My AI can be trained to work with 18 symbols in about an eighth of a second. To answer uutee's question: in the applet I posted, the symbols are drawn, processed, and trained into the system at runtime.
