ungetc() Behavior


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

4 replies to this topic

Poll: ungetc() Behavior (1 member(s) have cast votes)

In which order should put-back characters be read?

  1. First in, first out. (0 votes [0.00%])
  2. Last in, first out. (1 vote [100.00%])

Which functions should discard the put-back characters?

  1. fseek() (1 vote [25.00%])
  2. fsetpos() (1 vote [25.00%])
  3. rewind() (1 vote [25.00%])
  4. fflush() (0 votes [0.00%])
  5. Any write operation (1 vote [25.00%])
  6. Other (explain) (0 votes [0.00%])

Upon the put-back buffer being reset after a seek or flush, should the file position indicator be reset to where it was before the ungetc() call, or should it be left unchanged?

  1. Reset the file position indicator to where it was before the ungetc() call(s). (1 vote [100.00%])
  2. Leave it where it was after the ungetc() calls. (0 votes [0.00%])
  3. Other (explain) (0 votes [0.00%])

On streams that don't require a seek or flush between reads and writes, should a write perform the answer to Question 3?

  1. Perform the answer to Question 3. (1 vote [100.00%])
  2. Always reset the file position to where it was before the ungetc() call(s) and clear the put-back buffer. (0 votes [0.00%])
  3. Always move the file position indicator from where it last was, and discard as many put-back characters as were written. (0 votes [0.00%])
  4. Other (explain) (0 votes [0.00%])


#1 Ectara   Crossbones+   -  Reputation: 2968

Posted 02 September 2012 - 11:18 AM

As I'm sure many of you know, the C standard guarantees that at least one character can be pushed back onto the stream with the ungetc() function. However, once more than one successive ungetc() call is allowed, all bets are off as to how it works. I've finally gotten around to dealing with the frustration of adding an unget function to my I/O stream library, and it supports multiple ungets (up to 16 currently, but this can be changed by modifying a constant). I've read documentation from all sorts of C libraries, and how they handle multiple calls to ungetc() differs wildly. I couldn't even get a grasp of which method was the most popular. What follows is a description of my current implementation:

Maximum of 16 ungotten characters; can be changed through modifying a constant.
Characters that are put back are read in LIFO order.
The put-back buffer is discarded after any seeking operation (fseek(), fsetpos(), rewind()), and the file position is unchanged from where ungetc() left it. I'm considering adding fflush() to this list.
A write operation starts writing at the modified file position; for streams that need to be flushed or seeked between mode changes, this is irrelevant. It discards as many put-back characters as were written.
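
For readers following along, here is a minimal sketch of the first three points above: a fixed-capacity put-back buffer read in LIFO order and emptied by any seeking operation. The names `PUTBACK_MAX`, `struct putback`, `putback_push()`, `putback_pop()` and `putback_discard()` are hypothetical and chosen only for illustration; they are not taken from the library being discussed, and the file-position bookkeeping is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant; the post states a limit of 16 put-back characters. */
#define PUTBACK_MAX 16

/* Hypothetical put-back buffer, used as a stack. */
struct putback {
    unsigned char data[PUTBACK_MAX];
    size_t count;               /* number of characters currently put back */
};

/* Put one character back. Returns the character, or -1 if the buffer is full. */
static int putback_push(struct putback *pb, int c)
{
    assert(pb != NULL);
    if (pb->count == PUTBACK_MAX)
        return -1;
    pb->data[pb->count++] = (unsigned char)c;
    return (unsigned char)c;
}

/* Read the most recently put-back character (LIFO). Returns -1 if empty. */
static int putback_pop(struct putback *pb)
{
    assert(pb != NULL);
    if (pb->count == 0)
        return -1;
    return pb->data[--pb->count];
}

/* What a seeking operation would call: discard everything that was put back. */
static void putback_discard(struct putback *pb)
{
    assert(pb != NULL);
    pb->count = 0;
}
```

A read function would call `putback_pop()` first and fall through to the underlying stream only when it returns -1.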

If anyone could share their thoughts on what makes the most sense, I'd appreciate it.


#2 swiftcoder   Senior Moderators   -  Reputation: 9994

Posted 02 September 2012 - 06:03 PM

If anyone could share their thoughts on what makes the most sense, I'd appreciate it.

Don't allow ungetc. It breaks all the assumptions present in the underlying stream model, and doesn't add any functionality that application-level buffering couldn't implement more reliably.

That's my two cents.

Tristam MacDonald - Software Engineer @Amazon - [swiftcoding]


#3 Ectara   Crossbones+   -  Reputation: 2968

Posted 02 September 2012 - 06:43 PM


If anyone could share their thoughts on what makes the most sense, I'd appreciate it.

Don't allow ungetc. It breaks all the assumptions present in the underlying stream model, and doesn't add any functionality that application-level buffering couldn't implement more reliably.

That's my two cents.

Noted. However, the "underlying stream model" could be one of countless different stream types. On top of the many different terminal stream types, it could be any of numerous stream attachments that operate through the same interface, perform operations on the data, and pass it through to the next part of the stream. All of these terminal streams and stream attachments are aware that various things could happen before they give or get information, and I've taken precautions to ensure that ungetc(), among various other operations, doesn't break them. IMHO, allowing ungetc() would be a lifesaver at times; a stream that cannot be rewound would be allowed to look ahead, and step back if it was mistaken. Buffering the stream might not be beneficial; having to pass the buffer around from whatever had to look ahead to whatever must read it next is cumbersome, and while these streams support a tremendous array of buffer settings, the buffers must be flushed when one attempts to seek through the stream, to ensure that the data read is up to date, and that the data written is committed to the stream. Attempting to flush the buffers might entail fetching another buffer from the stream, which could require rewinding the unrewindable streams.

In short, while I can debate many design decisions of the C standard, I believe this one to be beneficial, and well thought-out.

However, if someone can name one type of stream that is readable and writable, but not rewindable, I will take this into account, because ungetc() would have consequences.

Edited by Ectara, 02 September 2012 - 06:51 PM.


#4 swiftcoder   Senior Moderators   -  Reputation: 9994

Posted 03 September 2012 - 06:12 AM

I'm a little unclear on what you are trying to do here. Are you building your own implementation of the standard library streams, or are you building a distinct streams library, modelled on the standard library?

Regardless, some philosophical ramblings re streams libraries:

- The presence of combined read/write streams always bothers me. There is no use case for these that can't be handled either with separate read and write streams, or by combining a stream with an application-side buffer.

- ungetc(), rewind() and passing a negative argument to seek() all violate the idea that a stream is intended as an in-order iteration. I'd prefer that all of these operations were disallowed, since you can either rearrange your reads to be sequential, or emulate all of the above via an application-side buffer.

Tristam MacDonald - Software Engineer @Amazon - [swiftcoding]


#5 Ectara   Crossbones+   -  Reputation: 2968

Posted 03 September 2012 - 08:30 AM

I'm a little unclear on what you are trying to do here. Are you building your own implementation of the standard library streams, or are you building a distinct streams library, modelled on the standard library?

Regardless, some philosophical ramblings re streams libraries:

- The presence of combined read/write streams always bothers me. There is no use case for these that can't be handled either with separate read and write streams, or by combining a stream with an application-side buffer.

- ungetc(), rewind() and passing a negative argument to seek() all violate the idea that a stream is intended as an in-order iteration. I'd prefer that all of these operations were disallowed, since you can either rearrange your reads to be sequential, or emulate all of the above via an application-side buffer.

It's a separate library, with the standard library as inspiration for its interface.

Well, I do have a use for read/write streams; the stream is not always one-directional, which allows for things such as my archive format. It can be set up to create new streams with file handles into the archive, so reading and writing the files in the archive relies on the ability to seek back and forth in the archive's main stream. Buffering the entire archive isn't feasible, since the files can be enormous. However, if you wanted to buffer the entire thing, there's a stream attachment for that, which allows you to either use a buffer or memory-map the file; this is transparent to the user, and the result can be seeked, written, and read like any other stream.

I prefer ungetc() to seeking backward. Again, buffering all that you might need is not always feasible. You must then keep track of the buffer's lifetime, pass it to everywhere that might need to read the file next (which is a pain), and have enough memory for what could be an 8 GB file. Also:

Buffering the stream might not be beneficial; having to pass the buffer around from whatever had to look ahead to whatever must read it next is cumbersome, and while these streams support a tremendous array of buffer settings, the buffers must be flushed when one attempts to seek through the stream, to ensure that the data read is up to date, and that the data written is committed to the stream. Attempting to flush the buffers might entail fetching another buffer from the stream, which could require rewinding the unrewindable streams.


It's all well and good when you're just reading input, such as a model file of some sort, but if you're parsing text and the stream then needs to be passed to something else that will also read from it, reading the whole stream into a buffer is hard to manage. The main benefit of ungetc(), in my eyes, is that if you have to read ahead to know whether there's still an action to be performed, you can logically step back a character, and the next thing to have the stream passed to it won't know the difference (if done right).
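
To illustrate the read-ahead-and-step-back pattern with the standard library's own ungetc(), here is a small sketch. The function `read_uint()` is a hypothetical helper written for this example, not part of any library mentioned in the thread. It only ever pushes back a single character, which is the amount the C standard guarantees, and it does not guard against overflow.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: reads a run of decimal digits from `in` into *out.
 * Returns 1 if at least one digit was read, 0 otherwise.
 * The first non-digit character is pushed back with ungetc(), so the next
 * reader of the stream sees it as if nothing had looked ahead.
 * Overflow of *out is not checked in this sketch. */
static int read_uint(FILE *in, unsigned long *out)
{
    unsigned long value = 0;
    int seen = 0;
    int c;

    assert(in != NULL && out != NULL);

    while ((c = getc(in)) != EOF && isdigit(c)) {
        value = value * 10 + (unsigned long)(c - '0');
        seen = 1;
    }
    if (c != EOF)
        ungetc(c, in);  /* step back one character after looking ahead */

    *out = value;
    return seen;
}
```

With input such as `123+45`, a caller can read 123 with `read_uint()`, hand the same `FILE *` to an operator parser that reads `+` with a plain `getc()`, and then read 45, without any of those pieces sharing a buffer.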

Again, there's an attachment that will buffer the whole file for you, and emulate a stream when reading and writing to it. So, if buffering the whole file is your thing, it is possible and easy to do with this stream model, and passing the same stream with a transparent attachment is easier than passing around a buffer and its information. However, often it isn't the best option, so ungetc() and the other operations are implemented. It's easy to tell the stream that it is unrewindable, which would make negative seeking fail, but ungetc() should still succeed.
