ASCII to ISO 8859-1

Started by
9 comments, last by benryves 17 years, 5 months ago
Hi, I have a text file in ASCII encoding and I want to convert it to Latin-1 (ISO 8859-1). How is this done? Can it be done in a platform-independent way? Are there any libraries available for this? The file comes in this format, whatever that is: http://www.lookuptables.com/ and I want it in: http://www.asciitabell.se/
ASCII is a proper subset of latin1, therefore, by definition, no conversion needs to be done.

Mark
If no conversion were needed, I wouldn't have posted here in the first place.


There are differences between the two character sets.
Yes, there are differences but:
Quote: ASCII is a proper subset of latin1, therefore, by definition, no conversion needs to be done.

So you only need to worry about them when going from Latin-1 to ASCII.
An ASCII file must only contain ASCII characters, by definition. These will be the codes 0x20 - 0x7e inclusive, which make up the printable ASCII characters.

It might also contain control codes e.g. 0x0d, 0x0a, 0x09 (CR, LF, TAB) but that's ok too.

These characters will be exactly the same in ISO-8859-1.

If the file contains any characters OUTSIDE the above set, it is clearly NOT in ASCII. If the file's interpretation of those codes is something other than the standard ASCII values, it is not ASCII.
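
To make that check concrete, here is a minimal Python sketch that tests whether a file's bytes are all 7-bit ASCII. The function names `is_ascii` and `first_non_ascii` are made up for illustration; the only assumption is that "ASCII" means every byte is below 0x80.

```python
def is_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range 0x00-0x7F."""
    return all(b < 0x80 for b in data)


def first_non_ascii(data: bytes) -> int:
    """Return the offset of the first byte >= 0x80, or -1 if there is none."""
    for offset, b in enumerate(data):
        if b >= 0x80:
            return offset
    return -1
```

Reading the file in binary mode (`open(path, "rb").read()`) and passing the result to `is_ascii` tells you whether any conversion could be needed at all. If it returns False, the file is in some 8-bit encoding and you have to find out which one.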

Mark
http://gedcom-parse.sourceforge.net/doc/encoding.html
I have done some more reading and I found out that the file is probably in Code Page 437.

"IBM PC or MS-DOS code page 437, often abbreviated CP437 and also known as DOS-US, OEM-US or even just the OEM font [1], is the original character set of the IBM PC, circa 1981. The following is a table representing CP437 using the equivalent Unicode characters:"
http://en.wikipedia.org/wiki/Code_page_437

The extended characters above 127 need to be converted.
Example: the Swedish 'å' is decimal 134 in CP437 and needs to be converted to 229 (Latin-1).


So the problem is: I need to convert from CP437 to Latin 1 (ISO 8859-1).
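
One platform-independent way to do this is a short Python sketch that goes through Unicode: decode the bytes as CP437, then encode them as Latin-1. The helper name `cp437_to_latin1` is hypothetical; `cp437` and `latin-1` are codec names that ship with Python. Note that CP437 contains characters (box-drawing symbols, for instance) that have no Latin-1 equivalent, so the `errors` argument decides what happens to them.

```python
def cp437_to_latin1(data: bytes, errors: str = "strict") -> bytes:
    """Re-encode CP437 bytes as ISO 8859-1 (Latin-1).

    errors="strict" raises UnicodeEncodeError for characters that
    Latin-1 cannot represent; errors="replace" writes '?' instead.
    """
    text = data.decode("cp437")                      # CP437 bytes -> Unicode
    return text.encode("latin-1", errors=errors)     # Unicode -> Latin-1 bytes
```

With this, `cp437_to_latin1(bytes([134]))` gives `bytes([229])`, which is the 'å' case above, while plain ASCII bytes pass through unchanged.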
On Linux there is a tool called 'iconv', which can convert between various character sets. Maybe give it a try?
Quote: Original post by nmi
On Linux there is a tool called 'iconv', which can convert between various character sets. Maybe give it a try?


Thanks for all the help, especially this one.
Nice: iconv -f CP437 -t ISO-8859-1 D061115.txt > asd.txt

And there is a libiconv too.
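
If iconv is not available (on Windows, say), a rough equivalent of the command above can be sketched in Python. This is not iconv's implementation, only an assumed stand-in: `convert_file` is a hypothetical name, and replacing unmappable characters with '?' is a choice made here, not something iconv does by default.

```python
def convert_file(src_path: str, dst_path: str, errors: str = "replace") -> None:
    """Copy a text file, re-encoding it from CP437 to ISO 8859-1.

    newline="" disables newline translation, so CR/LF line endings
    are written out exactly as they were read.
    """
    with open(src_path, "r", encoding="cp437", newline="") as src, \
         open(dst_path, "w", encoding="latin-1", errors=errors, newline="") as dst:
        for line in src:
            dst.write(line)
```

For the file in this thread the call would be `convert_file("D061115.txt", "asd.txt")`. Processing line by line keeps memory use low for large files.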

