[Following up on the comments of Christian Rau in another thread, I've
changed my first point to correspond to what I now realize]
There are several problems with your code. Some of the most obvious
are:
The condition at the end of your do...while
has undefined behavior.
In the expression eof = fgetc(fp)) != eof
, you modify an object
(eof
), and you access it elsewhere in the expression other than to
determine the value to be stored. As far as the standard is concerned,
anything may happen, and in fact, different compilers will do different
things.
You are assigning the results of fgetc
to a char
, rather than
to an int
. The return value of fgetc
is either in the range
[0...UCHAR_MAX]
or is EOF
(which is guaranteed to be negative). In
other words, it can take on one more value than will fit in a char
.
The results of then comparing the char
with EOF
depend on whether
plain char
is signed or not. If it's not signed, it can never have a
negative value, and thus, will never be equal to EOF
. If it's signed,
then on particular character code (0xFF, or 'ÿ'
in latin-1) will be
detected as end of file. The return value of fgetc
should always be
assigned to an int
, and the conversion to char
should only be done
after the test for EOF
.
You're using the results of fscanf
without checking that the
function has succeeded. In C++, IO, be it iostream or FILE*
is not
predictive. Because of the way the interface is defined, it is
impossible to tell in advance whether you will encounter end of file.
You must try to read, and then test whether the read succeeded.
You're using fscanf
into a char[]
without limiting the input
length. This is a buffer overflow waiting to happen.
In C++, the most natural way fo writing what you're doing would be:
std::string word;
while ( anIStream >> word ) {
// ...
}
Using the older, C compatible streams, you would write:
char word[20];
while ( fscanf( fp, "%19s", word ) == 1 ) {
// ...
}
In both cases, the check for success controls the loop; in the case of
the C interface, you use a format width specifier to ensure that you
don't overrun the buffer. (And in both cases, you must define the
variables being read outside the loop, even though you will only use
them in the loop.)