His description basically matches what I thought the algorithm was doing.
The first version is essentially building a histogram of how many times each word pattern appears in the file. But because the file is so large, the counters overflow and he is left with an array of garbage. He picks the highest value from the (invalid) histogram, writes it out, then discards the histogram and repeats the process for the next byte.
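If I've read him right, the loop looks something like the sketch below. The 16-bit word size and 8-bit counter width are my guesses, since he never says exactly which widths he used:

```python
def buggy_pass(data: bytes) -> list[int]:
    """Sketch of the described loop: for each byte position, histogram
    every 16-bit word from there to the end of the file using emulated
    8-bit counters, emit the most frequent word, throw the histogram
    away, and move forward one byte.  (Quadratic on purpose: it mirrors
    the description, not good practice.)"""
    out = []
    for start in range(len(data) - 1):
        hist = [0] * 65536                        # one counter per 16-bit pattern
        for i in range(start, len(data) - 1):
            word = data[i] | (data[i + 1] << 8)
            hist[word] = (hist[word] + 1) & 0xFF  # wraps past 255: the overflow
        # "Highest value" from a corrupted histogram.
        out.append(max(range(65536), key=hist.__getitem__))
    return out
```

Once any word shows up more than 255 times the count wraps around, so past a few kilobytes of input that "highest value" is effectively random.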
Considering his other forum questions were about probability and compression, I suspect this is an attempt to estimate the probability of each value as a precursor to a compression algorithm. I don't think this specific method will get very far, given how much research in the field has already gone into some amazing codecs, but you never know.
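For what it's worth, the overflow-free version of that idea is just an empirical probability table, which is a legitimate first step toward an entropy coder. A minimal sketch, again assuming 16-bit words:

```python
from collections import Counter

def word_probabilities(data: bytes) -> dict[int, float]:
    """Overflow-free version: Python ints never wrap, so the counts are
    exact, and normalizing gives an empirical probability for each
    16-bit word that an entropy coder could then use."""
    counts = Counter(data[i] | (data[i + 1] << 8)
                     for i in range(len(data) - 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}
```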
My suggestion is to get comfortable with books and courses on information theory, digital signal processing, and compression theory. With all three of those mastered, maybe this idea could be refined into something useful, just as coupling Markov chains with traditional compression techniques yielded transformative results for other kinds of data.