
Calculating information entropy


After much Googling, I wasn't able to find a good source of information on information entropy and how to calculate it from a stream of binary data. A few of the sites I read led me to believe that this would work (C#):
			// Chi-squared statistic of the byte frequencies against a uniform distribution
			float chiSq = 0;
			float expected = (float) length / freq.Length;
			for (int i = 0; i < freq.Length; ++i)
				chiSq += (freq[i] - expected) * (freq[i] - expected) / expected;
			Console.WriteLine("Length: " + length);
			Console.WriteLine("Chi-Squared: " + chiSq);
			Console.WriteLine("Entropy: " + (float) length / chiSq);


Here freq is the frequency table (usually 256 elements, one per byte value) and length is the length of the input. However, this value grows without bound as the input gets longer, so I was wondering: how is entropy in bits per byte actually calculated?

Find the probability of every symbol (occurrences / total number of symbols). The entropy is the sum, over all symbols, of the quantity Probability * -log2(Probability).
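
To make that concrete, here is a minimal sketch of that formula in C#, reusing the freq table and length count from the snippet above (the ShannonEntropy helper name is just for illustration):

		// Shannon entropy of a frequency table, in bits per symbol.
		// For a 256-entry byte frequency table this is bits per byte (0 to 8).
		static double ShannonEntropy(int[] freq, int length)
		{
			double entropy = 0;
			for (int i = 0; i < freq.Length; ++i)
			{
				if (freq[i] == 0)
					continue; // convention: 0 * log2(0) contributes nothing
				double p = (double) freq[i] / length; // probability of symbol i
				entropy -= p * Math.Log(p, 2);        // accumulates p * -log2(p)
			}
			return entropy;
		}

For a uniformly random byte stream the result approaches 8 bits per byte, while heavily skewed data scores much lower; multiplying by length gives an estimate of the total information content in bits.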
