RisingForce

Encoding

RisingForce    108
Does anyone know what encoding is used for the character strings in the property fields on a file's Summary tab?

Basically, if you right-click a file, choose Properties, select the Summary tab, write something into one of those fields, and click Apply, what is that data encoded as? Unicode, ASCII, UTF-32, UTF-7, UTF-8? It will save me a lot of time if someone knows the answer, or knows a faster way of finding out than trial and error.

Thank you.

RisingForce    108
I don't know if I understand your question. Take a file, any file really. If you right-click it, select Properties, and go to the Summary tab, you will see fields that you can write to.

My question was what encoding is applied to that string when I press Apply. I am using Windows, so the file system is NTFS. Does that help?

iMalc    2466
Quote:
Original post by RisingForce
I don't know if I understand your question. Take a file, any file really. If you right-click it, select Properties, and go to the Summary tab, you will see fields that you can write to.

My question was what encoding is applied to that string when I press Apply. I am using Windows, so the file system is NTFS. Does that help?

The answer to rip-off's question (and indeed mine, and probably that of everyone else who had no idea what you were talking about) is Windows. Not everyone here even runs Windows, and those of us who do mostly didn't guess what you meant.
That tab does not appear for just any file. It appears only for Word documents, MP3s, and maybe a couple of other things. Most files do not have it.

Sorry, I don't have an answer to your question at the moment.

frob    44902
They are stored in an alternate data stream.

Most modern disk formats (NTFS, ext2/3/4, UDF, etc.) let you attach additional data to a file beyond the default primary stream that holds its contents, either as named streams or as extended attributes.

Windows has a few NTFS data streams that it uses for this information. You can see a little bit about how to access them in this MSDN example.

The two streams that Windows uses to fill in those tabs are the SummaryInformation and DocumentSummaryInformation streams, which have been reverse engineered a few times; Google for them.

RisingForce    108

Quote:

The two streams that Windows uses to fill in those tabs are the SummaryInformation and DocumentSummaryInformation streams, which have been reverse engineered a few times; Google for them.


I am working on a program that lets the user read and write those fields in a file, using the IPropertySetStorage and IPropertyStorage interfaces. Through code I can write whatever I want to any of those fields in ANY file, and I can also read any file. Here is the problem.

I am using Windows Forms, keep that in mind.
Here is where things get screwy.

Example:
// Write the title to the text box
String^ title = gcnew String( propRead_title.pwszVal );
textBox1->Text = title;

These interfaces use PROPVARIANT structures to hold the pointer to a field's character string. In my code that pointer is an LPWSTR (wchar_t*).
propRead_title.pwszVal is the LPWSTR.

Through debugging it has been determined that after my read operation propRead_title.pwszVal contains the correct value. IT DOES!

Basically, there are two cases that behave differently for a reason I don't understand.

Case 1 is the case where I have already written to the file's field previously.

Case 1 works, and works beautifully. Once I have written to the field through code, every later read succeeds: even after closing and reopening the program, selecting the file again, and only reading, textBox1 contains the correct string for the field. I can even edit the field through Windows in the normal way, and my program then reads the new value with no problems, WITHOUT writing first.

Case 2 is the case where I try to read the initial data in that field before the file was "altered" with a write.

Case 2 fails, and here is the weird part. At first I thought it was just junk: I got weird bold vertical bars and similar characters written to textBox1. However, when I examined it further with the debugger, I realized that dereferencing propRead_title.pwszVal does give the correct value (I can't stress that enough), but the value the debugger shows for the pointer itself is the same weird character string that showed up in textBox1. This was the value shown for both the propRead_title.pwszVal pointer and the String^ title.

This is why I thought that writing to the fields first changes the field's encoding somehow. I believe I am doing the conversion from wchar_t* to String^ correctly, because otherwise Case 1 would have failed too, right?
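One thing that may be worth ruling out here, offered as an assumption rather than a diagnosis: a PROPVARIANT is a tagged union, and its vt member says which union member is valid. If a field was originally stored as a narrow string (VT_LPSTR) and only becomes a wide string (VT_LPWSTR) after being rewritten through code, then reading pwszVal without checking vt would interpret narrow bytes as wide characters and could produce this kind of junk. The sketch below is portable C++ rather than C++/CLI so it can be compiled anywhere; PropValue and ToWide are hypothetical stand-ins, and only the two tag values are taken from the Windows headers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// VARTYPE tag values as defined in the Windows headers.
constexpr std::uint16_t kVtLpstr  = 30;  // VT_LPSTR: narrow, code-page string
constexpr std::uint16_t kVtLpwstr = 31;  // VT_LPWSTR: wide string

// Hypothetical stand-in for PROPVARIANT: a type tag plus a union of pointers.
// Only the member that matches vt holds a meaningful value.
struct PropValue {
    std::uint16_t vt;
    union {
        const char*    pszVal;   // valid when vt == kVtLpstr
        const wchar_t* pwszVal;  // valid when vt == kVtLpwstr
    };
};

// Hypothetical helper: pick the union member that vt says is valid.
std::wstring ToWide(const PropValue& pv) {
    switch (pv.vt) {
    case kVtLpwstr:
        return pv.pwszVal ? std::wstring(pv.pwszVal) : std::wstring();
    case kVtLpstr: {
        // Byte-by-byte widening is only correct for ASCII/Latin-1 text.
        // On Windows, MultiByteToWideChar with the property set's code
        // page would be the proper conversion.
        std::wstring out;
        if (pv.pszVal) {
            for (const char* p = pv.pszVal; *p != '\0'; ++p) {
                out.push_back(static_cast<wchar_t>(static_cast<unsigned char>(*p)));
            }
        }
        return out;
    }
    default:
        // Empty or some other type: there is no string to read.
        return std::wstring();
    }
}
```

In the real program, the equivalent check would be to inspect propRead_title.vt after the read and to use pwszVal only when it equals VT_LPWSTR.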

RisingForce    108
OK, here are some images showing the debugger.
A lowercase "a" is the value in the title field (see the image showing that the dereferenced pointer has the correct value).

http://img593.imageshack.us/img593/571/image1su.png
http://img839.imageshack.us/img839/6240/image2gq.png
http://img508.imageshack.us/img508/6195/image3lg.png


So... any ideas?

I am kind of stuck on this.

RisingForce    108
Never mind about this. It was apparently just junk, but deceiving junk; more tests confirmed it. But if anyone knows how to read a file's summary tab fields initially, without writing to them first, that would be great.
