How to read PNG images without a library?

Started by
26 comments, last by FLeBlanc 11 years, 7 months ago

The reason I don't want to use a library is because I want to know how to do it myself.


You find the specification, you spend a lot of time and effort to make something that doesn't work. Then you fix it until it works.


I'm sure I could probably write some code using the resource I provided, but I can't be certain it would work. I don't want to go through the trouble of writing code only to find that I did something wrong.
[/quote]

Tough nuts, that's how you learn.


I was hoping maybe someone had some experience in this.
[/quote]

We do. We went through the painful process so that we have the experience to know that 'use a library' is fantastic advice. If you think that experience is valuable, then go through that painful process yourself.


Definitely tough words to bite, but it's probably what I want to hear.
I started writing some code to try to read a PNG image, and when reading the signature, the first four bytes were correct, but the following four bytes were "10 204 204 204". Nothing I read said that something like that was proper, and it doesn't make sense. According to one thing that I read, the first four bytes determine if it is a PNG, then the next 4 bytes are CR-LF (or something like that). Is it safe to just skip those four bytes, or are they important in some way?
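
A hedged guess at what those bytes mean, since no code was posted: the full PNG signature is 137 80 78 71 13 10 26 10, and 204 is 0xCC, the value MSVC debug builds use to fill uninitialized stack memory. Getting 10 followed by filler would be consistent with the file having been opened in text mode on Windows, where CR LF collapses to LF and byte 26 (Ctrl-Z) is treated as end of file. Assuming that is the cause, the fix is to open the file in binary mode and compare all eight bytes. A minimal sketch (the function name `has_png_signature` is made up for illustration):

```python
# The 8-byte PNG signature as given in the PNG specification.
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])


def has_png_signature(path):
    """Return True if the file at `path` starts with the PNG signature."""
    # "rb" matters: binary mode returns the bytes exactly as stored,
    # with no newline translation and no special handling of byte 26.
    with open(path, "rb") as f:
        header = f.read(len(PNG_SIGNATURE))
    return header == PNG_SIGNATURE
```

The same idea applies in C or C++: pass "rb" to `fopen`, or `std::ios::binary` to the stream, before reading any image format.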

It bewilders me that something that seems as simple as a PNG image would be so complex. Are there any other file formats that are more simple than PNG besides the raw format?


TGA is fairly straightforward, as is BMP.
I don't suffer from insanity, I'm enjoying every minute of it.
The voices in my head may not be real, but they have some good ideas!

Nothing I read said that something like that was proper, and it doesn't make sense.


Yeah. Did I forget to mention that other programmers made mistakes when they implemented the specification and that you'll encounter situations like that?


TGA is fairly straightforward, as is BMP.


It's only 30 pages.

Problem is, since TGA used to be perceived as "simple" and since standards weren't respected or enforced, it's a bit "vague". As with most specs, simple non-conformant and non-validated implementations add to the clutter.

Windows Photo Viewer, for example, cannot properly render .tga files, causing me considerable pain when dealing with certain types of government forms provided in TGA.

BMP isn't simple either. Between the different versions, the same problems as above, multiple raster modes, and various incompatible yet BMP-labeled formats, full support isn't trivial. For the most part, Windows BMP is defined by whatever WinAPI loads.


Standards are written and defined by people, so their complexity comes from people-related issues, not technical ones.

That is true. I don't think the OP really needs 100% standards compliance though; being able to load the most common variations of the format (really, the ones your tools produce) is usually enough for games.
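
As a rough illustration of "only the common variations", here is a sketch of a TGA reader that accepts just uncompressed 24- or 32-bit true-color files and rejects everything else. The function name `load_simple_tga` is hypothetical, and the sketch assumes the whole file is already in memory as bytes. It also ignores the origin bits in the image descriptor, so the caller has to deal with row order.

```python
import struct

# 18-byte TGA header, little-endian: id length, color-map type, image type,
# color-map spec (first index, length, entry size), x origin, y origin,
# width, height, bits per pixel, image descriptor.
TGA_HEADER = struct.Struct("<BBBHHBHHHHBB")


def load_simple_tga(data):
    """Decode an uncompressed 24- or 32-bit true-color TGA from bytes.

    Returns (width, height, bits_per_pixel, pixels), with pixels left in
    the file's own BGR(A) byte order. Raises ValueError for anything else.
    """
    if len(data) < TGA_HEADER.size:
        raise ValueError("truncated TGA header")
    (id_len, cmap_type, image_type, _cmap_first, _cmap_len, _cmap_bits,
     _x0, _y0, width, height, bpp, _descriptor) = TGA_HEADER.unpack_from(data)
    # Image type 2 is uncompressed true-color; RLE (type 10), color-mapped
    # and grayscale variants are deliberately not handled here.
    if cmap_type != 0 or image_type != 2:
        raise ValueError("only uncompressed true-color TGA is handled")
    if bpp not in (24, 32):
        raise ValueError("only 24- or 32-bit pixels are handled")
    start = TGA_HEADER.size + id_len  # skip the optional image ID field
    size = width * height * (bpp // 8)
    pixels = data[start:start + size]
    if len(pixels) != size:
        raise ValueError("truncated pixel data")
    return width, height, bpp, pixels
```

Rejecting unsupported variants loudly, rather than guessing, keeps a minimal reader like this from silently producing garbage when a tool exports something unexpected.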

Quote
but I have a learning disability (ADHD), so it's extremely hard for me to concentrate enough to figure out what I'm doing.

If that is accurate, then implementing any specification or standard is beyond your capabilities.

Fie on that Realm of thought! ADHD is certainly a disadvantage, but it's by no means insurmountable. It merely increases the challenge. Sure, there is no quantifiable reason to write your own PNG reader, but the OP saw it as a challenge, and I presume expects he would derive satisfaction from the accomplishment, regardless of its impracticality. Lots of people with dyslexia still manage to read. Lots of people with dyscalculia manage to do math. And lots of people with ADHD manage to do all kinds of boring tasks.

As to the OP's question, I've not written a PNG reader, and I doubt many people on this forum have, due to existing libraries, so you're unlikely to get general advice on tricky spots. But if you have specific trouble you can't manage to get past after a reasonable effort, post your relevant code and we'll do our best to help you find and fix the bug.
10 years ago, writing a library like this was a necessity. Existing libraries were hard to come by, there was no meaningful code sharing, and reputable reference implementations were scarce or were part of reference packages, perhaps available for a large fee to standardization committees. Heck, internet access was a problem.

So people in a pinch did the next best thing. They looked up wotsit.org, quickly parsed a few fields, tweaked until it worked. Maybe posted as a tutorial or sample code somewhere. Perhaps it got reused.

Result of the era was a ton of minimal readers/parsers that barely got the job done.

Then the web exploded. Standard-compliant implementations became much more important, reuse became the norm, licenses were relaxed, patents expired. These libraries made their way into standard libraries, OSes and browsers. With that vast exposure, their correctness was suddenly being tested daily by 1 billion users. And patches made it back to source.


Regardless of how one feels about cost vs. effort, this is actual engineering at its best. Without some really specific requirements in some adequately demanding project (corporate or hobby development is not it), the case for NIH is over.

These are solved problems. C and C++ remain as clumsy as ever in integrating third-party libraries. There are probably a lot of benign warnings that will crop up. But these libraries work, in real world on real data.

What do they buy you? Trillions of hours of QA. Billions of parsed images, from valid to invalid. Millions of users. Tens of thousands of hardware configurations. Hundreds of compilers.

The cost of replicating this effort is sufficient to cover the entire economic debt of every country today. And it's available for free. Isn't it worth spending 3 hours downloading and integrating this one library?


Impressive, isn't it.
You do realize that PNG comes in many different flavors depending on bit depth, transparency, and such? Have you ever run across an API or library that claims it can read PNG, only to find that it can read just some PNGs? If you really want to learn the structure of PNGs, get several sample PNG images (8-bit, 24-bit, 32-bit), with and without transparency, open up the PNG documentation, and try to parse those images mentally using a hex editor.
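
To check what you find in the hex editor, a small script can walk the chunks and decode the IHDR header, which is where bit depth and color type live. This is only a sketch of the container layout described in the PNG specification (signature, then length/type/data/CRC chunks); it does not decompress or unfilter any pixel data, and the names `iter_chunks` and `read_ihdr` are made up for illustration.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Color type codes defined by the PNG specification.
COLOR_TYPES = {
    0: "grayscale",
    2: "truecolor",
    3: "indexed",
    4: "grayscale+alpha",
    6: "truecolor+alpha",
}


def iter_chunks(data):
    """Yield (type, body) for each chunk in a PNG held in memory as bytes."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack_from(">I4s", data, pos)
        end = pos + 12 + length
        if end > len(data):
            raise ValueError("truncated chunk")
        body = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack_from(">I", data, pos + 8 + length)
        # The CRC covers the chunk type and data, but not the length field.
        if crc != zlib.crc32(ctype + body):
            raise ValueError("bad CRC in chunk %r" % ctype)
        yield ctype, body
        if ctype == b"IEND":
            break
        pos = end


def read_ihdr(data):
    """Return the fields of the IHDR chunk as a dict."""
    for ctype, body in iter_chunks(data):
        if ctype == b"IHDR":
            (width, height, depth, color, _compression, _filter,
             interlace) = struct.unpack(">IIBBBBB", body)
            return {
                "width": width,
                "height": height,
                "bit_depth": depth,
                "color_type": COLOR_TYPES.get(color, "unknown"),
                "interlaced": bool(interlace),
            }
    raise ValueError("no IHDR chunk found")
```

Running `read_ihdr` over each sample file shows quickly which combinations of bit depth and color type your tools actually produce, and therefore which ones a home-grown reader needs to handle first.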

Maybe you are doing it for yourself, but I can almost guarantee that at the end of it, you'd feel that you'd just wasted your time as the other library did it better than your version.

I am writing an engine. It's not so hard, most of it is pretty easy for me
You just made my day, man. I still don't understand why you cannot just focus on another problem.


Problem is, since TGA used to be perceived as "simple" and since standards weren't respected or enforced, it's a bit "vague". Just like most specs, simple non-conformant and non-validated implementations add to the clutter.
I completely agree with that. I strongly advise against .tga nowadays; it's just outdated. Its RLE compression is pretty much a joke. It has been superseded by PNG as far as I understand.

Previously "Krohm"

This topic is closed to new replies.
