big text file

I have a text file of about 600 MB. WordPad freezes when trying to open it, and Notepad and OpenOffice Writer refuse to work with it at all. My computer is only a few years old and runs Win7, so I don't understand why a file this size is such a challenge, but it appears to be. Is there any way I can open it smoothly? I just need to copy a few lines.

I tried some text file splitter programs, but they run out of memory trying to load the file, which makes me wonder what their purpose is.
Notepad and WordPad aren't great at that kind of thing. They try to load the whole file into memory in one go and then perform word-wrapping; it's almost Microsoft demonstrating how not to write an editor. ;) I think TextPad may do a better job. In general it's problematic because many editors represent the file in a larger in-memory format (for example, to handle word-wrapping, syntax highlighting, formatting, etc.), and they often don't do paging. It's odd that the text splitters didn't work.

If you have no luck with that, try it yourself: open the file as a stream (or manually buffer it into a byte array), keep reading chunks until you hit both the size limit and a line break, then dump what you have to disk as a new file and repeat.
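A minimal C++ sketch of that splitting idea; the input path "big.txt" and the 50 MB chunk size are just placeholder choices:

```cpp
// Split a huge text file into pieces of roughly chunkLimit bytes,
// always cutting at a line break so no line is torn in half.
#include <fstream>
#include <string>
#include <cstddef>

int main() {
    std::ifstream in("big.txt");                      // assumed input path
    const std::size_t chunkLimit = 50 * 1024 * 1024;  // ~50 MB per piece

    std::string line;
    std::size_t written = 0;
    int part = 0;
    std::ofstream out("part0.txt");

    // Reading line by line keeps memory use constant no matter how big the file is.
    while (std::getline(in, line)) {
        out << line << '\n';
        written += line.size() + 1;
        // Past the size limit and at a line break: start the next output file.
        if (written >= chunkLimit) {
            out.close();
            out.open("part" + std::to_string(++part) + ".txt");
            written = 0;
        }
    }
}
```

The resulting pieces are small enough for Notepad or WordPad to open normally.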
For a really quick test: I've been slightly more successful with Notepad++; I think I've opened files exceeding 100 MB. In general, though, this is going to be problematic for most text editors.

Previously "Krohm"

I use Programmer's File Editor at work and frequently work with much bigger files than that. It can take a while, but it can handle multi-gig files.
If you just want to copy some lines, you may be better off opening the text file in a hex editor (hex editors are fundamentally designed to handle large files efficiently).

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

Perhaps this is a dumb question, but why on earth are you sifting through 600+ MB text files by hand? That's WAY too much for a human being to do anything useful with. If you know what information you want, the better route might be to write a short script that pulls out the part you need.

I routinely deal with huge text files, typically on the order of 100 MB to 1 GB, and tens of thousands of them at a time. There is nothing even remotely productive I can do with that information manually. I can, however, write a paragraph-long script that parses 100 GB in one whack and generates a graph of the feature I'm actually interested in. That's information I can, as a human being, actually do something with.
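As a rough illustration of that kind of extraction script, here's a minimal C++ sketch; the file name "huge.log" and the "ERROR" pattern are hypothetical stand-ins for whatever you're actually after:

```cpp
// Stream a huge file line by line and print only the lines that match.
// Memory use stays constant regardless of file size.
#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream in("huge.log");   // assumed input path
    std::string line;
    while (std::getline(in, line)) {
        if (line.find("ERROR") != std::string::npos)  // assumed pattern of interest
            std::cout << line << '\n';
    }
}
```

Redirect the output to a file and you have a few kilobytes you can actually read instead of 600 MB you can't.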
For future sanity, think about how you are generating this data and see if you can make it more manageable. If it is a log, perhaps it can be broken into a number of distinct sub-systems. You might want to rotate the log periodically too.
Load it up on a Linux box (or install some Unix tools on your Windows box). less and more have no trouble with multi-gigabyte files, and neither does Vim.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

Notepad++ has worked on 600MB files for me.

