|
25.3.1 Text vs Binary Modes
As discussed in 15.3.5.1 Text and Binary Files, text and binary files are
different on Windows. Lines in a Windows text file end in a
carriage return/line feed pair, but a C program reading the file in text
mode will see a single line feed.
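The difference can be observed with a small experiment. The following is a minimal sketch, not part of any library: `count_chars' is a hypothetical helper that writes a string byte-for-byte and then counts how many characters a reader sees in a given mode.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write TEXT to PATH byte-for-byte ("wb"), then
   count the characters a reader sees after reopening the file with
   READ_MODE ("r" or "rb").  The scratch file is removed afterwards.
   Returns -1 on I/O failure.  */
long
count_chars (const char *path, const char *text, const char *read_mode)
{
  FILE *fp;
  long count = 0;

  assert (path != NULL && text != NULL && read_mode != NULL);

  fp = fopen (path, "wb");
  if (fp == NULL)
    return -1;
  if (fputs (text, fp) == EOF)
    {
      fclose (fp);
      return -1;
    }
  if (fclose (fp) != 0)
    return -1;

  fp = fopen (path, read_mode);
  if (fp == NULL)
    return -1;
  while (fgetc (fp) != EOF)
    ++count;
  fclose (fp);
  remove (path);
  return count;
}
```

Calling `count_chars ("scratch.tmp", "one\r\ntwo\r\n", "rb")' reports all 10 bytes on any platform. With "r" in place of "rb" the result is still 10 on Unix, but a C library that translates line endings in text mode would report 8.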
Cygwin has several ways to hide this dichotomy, and the solution(s) you
choose will depend on how you plan to use your program. I will outline
the relative tradeoffs you make with each choice:
- mounting
-
Before installing an operating system to your hard drive, you must first
organise the disk into partitions. Under Windows, you might only
have a single partition on the disk, which would be called
`C:'(63). Provided that some
media is present, Windows allows you to access the contents of any drive
letter -- that is, you can access `A:' when there is a floppy disk in
the drive, and `F:' provided you divided your available drives into
sufficient partitions for that letter to be in use. With Unix, things
are somewhat different: hard disks are still divided into partitions
(typically several), but there is only a single filesystem mounted
under the root directory. You can use the mount command to hook a
partition (or floppy drive or CD-ROM, etc.) into a subdirectory of the
root filesystem:
|
$ mount /dev/fd0 /mnt/floppy
$ cd /mnt/floppy
|
Until the directory is unmounted, the contents of the floppy
disk will be available as part of the single Unix filesystem in the
directory, `/mnt/floppy'. This is in contrast with Windows'
multiple root directories, which can be accessed by changing filesystem
root -- to access the contents of a floppy disk:
|
C:\WINDOWS\> A:
A:\> DIR
...
|
Cygwin has a mounting facility to allow Cygwin applications to see a
single unified file system starting at the root directory, by
mounting drive letters to subdirectories. When mounting a
directory you can set a flag to determine whether the files in that
partition should be treated the same whether they are TEXT or
BINARY mode files. Mounting a file system to treat TEXT files
the same as BINARY files means that Cygwin programs can behave in
the same way as they might on Unix and treat all files as equal.
Mounting a file system to treat TEXT files properly will cause
Cygwin programs to translate between Windows CR-LF line end
sequences and Unix LF line endings, which plays havoc with
file seeking, and with the many programs which make assumptions about
the size of a char in a FILE stream. However, `binmode' is the
default method because it is the only way to interoperate between
Windows binaries and Cygwin binaries. You can get a list of which drive
letters are mounted to which directories, and the modes they are mounted
with, by running the mount command without arguments:
|
BASH.EXE-2.04$ mount
Device              Directory           Type         Flags
C:\cygwin           /                   user         binmode
C:\cygwin\bin       /usr/bin            user         binmode
C:\cygwin\lib       /usr/lib            user         binmode
D:\home             /home               user         binmode
|
As you can see, the Cygwin mount command allows you to
`mount' arbitrary Windows directories as well as simple drive letters
into the single filesystem seen by Cygwin applications.
- binmode
-
The CYGWIN environment variable holds a space separated list of
setup options which exert some minor control over the way the
`cygwin1.dll' (or `cygwinb19.dll' etc.) behaves. One such
option is the `binmode' setting; if CYGWIN contains the
`binmode' option, files which are opened through `cygwin1.dll'
without an explicit text or binary mode will default to binary mode,
which is closest to how Unix behaves.
- system calls
-
`cygwin1.dll', GNU libc and other modern C API implementations accept
extra flags for fopen and open calls to determine in which mode a file
is opened. On Unix it makes no difference, and sadly most Unix
programmers are not aware of this subtlety, so this tends to be the
first thing that needs to be fixed when porting a Unix program to
Cygwin. The best way to use these calls portably is to use the
following macros with a package's `configure.in' to be sure that the
extra arguments are available:
|
# _AB_AC_FUNC_FOPEN(b | t, USE_FOPEN_BINARY | USE_FOPEN_TEXT)
# -----------------------------------------------------------
define([_AB_AC_FUNC_FOPEN],
[AC_CACHE_CHECK([whether fopen accepts "$1" mode], [ab_cv_func_fopen_$1],
[AC_TRY_RUN([#include <stdio.h>
int
main ()
{
    FILE *fp = fopen ("conftest.bin", "w$1");
    fprintf (fp, "\n");
    fclose (fp);
    return 0;
}],
            [ab_cv_func_fopen_$1=yes],
            [ab_cv_func_fopen_$1=no],
            [ab_cv_func_fopen_$1=no])])
if test x$ab_cv_func_fopen_$1 = xyes; then
  AC_DEFINE([$2], 1,
            [Define this if we can use the "$1" mode for fopen safely.])
fi[]dnl
])# _AB_AC_FUNC_FOPEN
# AB_AC_FUNC_FOPEN_BINARY
# -----------------------
# Test whether fopen accepts a "b" in the mode string for binary file
# opening. This makes no difference on most unices, but some OSes
# convert every newline written to a file to two bytes (CR LF), and
# every CR LF read from a file is silently converted to a newline.
AC_DEFUN([AB_AC_FUNC_FOPEN_BINARY], [_AB_AC_FUNC_FOPEN(b, USE_FOPEN_BINARY)])
# AB_AC_FUNC_FOPEN_TEXT
# ---------------------
# Test whether fopen accepts a "t" in the mode string for text file
# opening. This makes no difference on most unices, but other OSes
# use it to assert that every newline written to a file writes two
# bytes (CR LF), and every CR LF read from a file is silently
# converted to a newline.
AC_DEFUN([AB_AC_FUNC_FOPEN_TEXT], [_AB_AC_FUNC_FOPEN(t, USE_FOPEN_TEXT)])
# _AB_AC_FUNC_OPEN(O_BINARY|O_TEXT)
# ---------------------------------
AC_DEFUN([_AB_AC_FUNC_OPEN],
[AC_CACHE_CHECK([whether fcntl.h defines $1], [ab_cv_header_fcntl_h_$1],
[AC_EGREP_CPP([$1],
[#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
$1
],
[ab_cv_header_fcntl_h_$1=no],
[ab_cv_header_fcntl_h_$1=yes])
if test "x$ab_cv_header_fcntl_h_$1" = xno; then
AC_EGREP_CPP([_$1],
[#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
_$1
],
[ab_cv_header_fcntl_h_$1=0],
[ab_cv_header_fcntl_h_$1=_$1])
fi])
if test "x$ab_cv_header_fcntl_h_$1" != xyes; then
AC_DEFINE_UNQUOTED([$1], [$ab_cv_header_fcntl_h_$1],
[Define this to a usable value if the system provides none])
fi[]dnl
])# _AB_AC_FUNC_OPEN
# AB_AC_FUNC_OPEN_BINARY
# ----------------------
# Test whether open accepts O_BINARY in its flags argument for binary
# file opening. This makes no difference on most unices, but some
# OSes convert every newline written to a file to two bytes (CR LF),
# and every CR LF read from a file is silently converted to a newline.
#
AC_DEFUN([AB_AC_FUNC_OPEN_BINARY], [_AB_AC_FUNC_OPEN([O_BINARY])])
# AB_AC_FUNC_OPEN_TEXT
# --------------------
# Test whether open accepts O_TEXT in its flags argument for text file
# opening. This makes no difference on most unices, but other OSes
# use it to assert that every newline written to a file writes two
# bytes (CR LF), and every CR LF read from a file is silently
# converted to a newline.
#
AC_DEFUN([AB_AC_FUNC_OPEN_TEXT], [_AB_AC_FUNC_OPEN([O_TEXT])])
|
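For the open family, the effect of the `_AB_AC_FUNC_OPEN' macros is that O_BINARY and O_TEXT can always be named in source code, with a value of 0 standing in where the system has no such flag. The following is a minimal sketch of that idea without a configure script: the fallback definitions are written by hand here, and `binary_round_trip' is a hypothetical helper, not part of any library.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Hand-written equivalent of what the configure check would define:
   use the underscore spelling if that is all <fcntl.h> offers, and
   otherwise a value of 0, which has no effect when ORed into the
   flags.  */
#ifndef O_BINARY
#  ifdef _O_BINARY
#    define O_BINARY _O_BINARY
#  else
#    define O_BINARY 0
#  endif
#endif

/* Hypothetical helper: store LEN bytes from BUF at PATH without
   newline translation, then return the number of bytes a binary-mode
   reader gets back, or -1 on failure.  The file is removed again.  */
long
binary_round_trip (const char *path, const char *buf, size_t len)
{
  char chunk[64];
  long total = 0;
  ssize_t n;
  int fd;

  assert (path != NULL && buf != NULL);

  fd = open (path, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
  if (fd < 0)
    return -1;
  if (write (fd, buf, len) != (ssize_t) len)
    {
      close (fd);
      return -1;
    }
  close (fd);

  fd = open (path, O_RDONLY | O_BINARY);
  if (fd < 0)
    return -1;
  while ((n = read (fd, chunk, sizeof chunk)) > 0)
    total += (long) n;
  close (fd);
  unlink (path);
  return (n < 0) ? -1 : total;
}
```

Because the file is both written and read with O_BINARY, `binary_round_trip ("scratch.tmp", "a\nb\n", 4)' returns 4 whether or not the platform distinguishes text files from binary ones.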
Add the following preprocessor code to a common header file that will be
included by any sources that use fopen calls:
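A minimal sketch of such a header fragment follows. The file name `common.h' and the include guard are hypothetical, and the condition that decides when to redirect fopen is an assumption chosen to match the `#if' used inside rpl_fopen.

```c
/* common.h -- hypothetical project-wide header (the name is an
   assumption made for this sketch).  */
#ifndef COMMON_H
#define COMMON_H 1

#if HAVE_CONFIG_H
#  include <config.h>
#endif

#include <stdio.h>

/* Declared unconditionally so that the replacement and its callers
   always agree on the prototype.  */
extern FILE *rpl_fopen (const char *pathname, const char *mode);

/* Redirect fopen only when the C library rejects at least one of the
   "b" and "t" mode flags (assumed condition).  */
#if !(USE_FOPEN_BINARY && USE_FOPEN_TEXT)
#  undef fopen
#  define fopen rpl_fopen
#endif

#endif /* !COMMON_H */
```

The source file that defines rpl_fopen must call the real fopen, so it should either not include this header or `#undef fopen' after including it; otherwise the replacement would call itself.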
Save the following function to a file, and link that into your program
so that in combination with the preprocessor magic above, you can always
specify text or binary mode to open and fopen, and let
this code take care of removing the flags on machines which do not
support them:
|
#if HAVE_CONFIG_H
# include <config.h>
#endif
#include <stdio.h>
/* Use the system size_t if it has one, or fall back to config.h */
#if STDC_HEADERS || HAVE_STDDEF_H
# include <stddef.h>
#endif
#if HAVE_SYS_TYPES_H
# include <sys/types.h>
#endif
/* One of the following headers will have prototypes for malloc
and free on most systems. If not, we don't add explicit
prototypes which may generate a compiler warning in some
cases -- explicit prototypes would certainly cause
compilation to fail with a type clash on some platforms. */
#if STDC_HEADERS || HAVE_STDLIB_H
# include <stdlib.h>
#endif
#if HAVE_MEMORY_H
# include <memory.h>
#endif
#if HAVE_STRING_H
# include <string.h>
#else
# if HAVE_STRINGS_H
# include <strings.h>
# endif /* !HAVE_STRINGS_H */
#endif /* !HAVE_STRING_H */
#if ! HAVE_STRCHR
/* BSD based systems have index() instead of strchr() */
#  if HAVE_INDEX
#    define strchr index
#  else /* ! HAVE_INDEX */
/* Very old C libraries have neither index() nor strchr() */
#    define strchr rpl_strchr
static inline const char *strchr (const char *str, int ch);

static inline const char *
strchr (const char *str, int ch)
{
    const char *p = str;

    while (p && *p && *p != (char) ch)
      {
        ++p;
      }

    /* Guard against a NULL str before dereferencing. */
    return (p && *p == (char) ch) ? p : 0;
}
#  endif /* HAVE_INDEX */
#endif /* HAVE_STRCHR */
/* BSD based systems have bcopy() instead of strcpy() */
#if ! HAVE_STRCPY
# define strcpy(dest, src) bcopy(src, dest, strlen(src) + 1)
#endif
/* Very old C libraries have no strdup(). */
#if ! HAVE_STRDUP
# define strdup(str) strcpy(malloc(strlen(str) + 1), str)
#endif
FILE *
rpl_fopen (const char *pathname, const char *mode)
{
    FILE *result = NULL;
    const char *p = mode;

    /* Scan to the end of mode until we find 'b' or 't'. */
    while (*p && *p != 'b' && *p != 't')
      {
        ++p;
      }

    if (!*p)
      {
        fprintf(stderr,
                "*WARNING* rpl_fopen called without mode 'b' or 't'\n");
      }

#if USE_FOPEN_BINARY && USE_FOPEN_TEXT
    result = fopen(pathname, mode);
#else
    {
        /* By default strip both flags; at most one of the two
           overrides below can apply in this branch. */
        char ignore[3] = "bt";
        char *newmode = strdup(mode);
        char *q = newmode;

        if (!newmode)
          {
            return NULL;
          }
        p = mode;

#  if USE_FOPEN_TEXT
        /* 't' is accepted, so strip only 'b'. */
        strcpy(ignore, "b");
#  endif
#  if USE_FOPEN_BINARY
        /* 'b' is accepted, so strip only 't'. */
        strcpy(ignore, "t");
#  endif

        /* Copy characters from mode to newmode missing out
           b and/or t.  The test on *p comes first because strchr
           would otherwise match the terminating NUL of ignore. */
        while (*p)
          {
            if (!strchr(ignore, *p))
              {
                *q++ = *p;
              }
            ++p;
          }
        *q = '\0';

        result = fopen(pathname, newmode);
        free(newmode);
    }
#endif /* USE_FOPEN_BINARY && USE_FOPEN_TEXT */

    return result;
}
|
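The flag-removal step at the heart of rpl_fopen can be studied in isolation. The following is a minimal sketch under the assumption of a standard C library; `strip_mode_flags' is a hypothetical helper introduced only for illustration, and it writes into a caller-supplied buffer rather than allocating memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy MODE into OUT (capacity OUT_SIZE bytes),
   dropping every character that also appears in IGNORE.  Returns OUT,
   or NULL if OUT is too small to hold the result.  */
char *
strip_mode_flags (const char *mode, const char *ignore,
                  char *out, size_t out_size)
{
  size_t used = 0;

  assert (mode != NULL && ignore != NULL && out != NULL);

  for (; *mode != '\0'; ++mode)
    {
      /* The loop condition guarantees *mode is not NUL here, so
         strchr cannot match the terminator of IGNORE.  */
      if (strchr (ignore, *mode) != NULL)
        continue;
      if (used + 1 >= out_size)
        return NULL;
      out[used++] = *mode;
    }

  if (used >= out_size)
    return NULL;
  out[used] = '\0';
  return out;
}
```

With an eight byte buffer, stripping "bt" from "rb" leaves "r", stripping "b" from "w+b" leaves "w+", and stripping "b" from "rt" leaves the string unchanged. Testing for the end of the mode string before calling strchr matters, because strchr treats the terminating NUL as part of the string it searches.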
The correct operation of the file above relies on several things having
been checked by the configure script, so you will also need to
ensure that the following macros are present in your `configure.in'
before you use this code:
|
# configure.in -- Process this file with autoconf to produce configure
AC_INIT(rpl_fopen.c)
AC_PROG_CC
AC_HEADER_STDC
AC_CHECK_HEADERS(string.h strings.h, break)
AC_CHECK_HEADERS(stdlib.h stddef.h sys/types.h memory.h)
AC_C_CONST
AC_TYPE_SIZE_T
AC_CHECK_FUNCS(strchr index strcpy strdup)
AB_AC_FUNC_FOPEN_BINARY
AB_AC_FUNC_FOPEN_TEXT
|
|