* On 2005.06.20, in <200506201517.IAA23114@xxxxxx.xxx>,
*	"Dave Vandervies" <dj3vande@xxxxxx.xxx> wrote:
> > > > > line\n\r
> > feeds\m
> > are^m
> > hard&return;

True.

> out=fopen("foo","w");
> fputs("Nope, line\n",out);
> fputs("feeds are\n",out);
> fputs("actually\n",out);
> fputs("really easy\n",out);
> fclose(out);

True.  The problem isn't in I/O, it's in protocol.  Everyone has a new
and improved way of indicating logical line breaks within their own
cross-platform specification.  Traditionally, the Intarweb uses MS-DOS
line breaks, \r\n, for maximum naive portability, while some specific
platforms use either \r or \n solo.  Each endpoint needs to be able to
recognize what it's receiving and match what it's sending.

I'm on a development team for an application -- a network listener with
a bunch of arbitrary purpose behind it -- where, mysteriously, for
reasons undiscovered, someone got \r\n backwards.  It issues line breaks
as \n\r.  This is fine if you're a raw terminal device, and it doesn't
really matter, but if you're a client application, this might matter.
And, in fact, the client I use most often doesn't recognize \n\r as a
line break; it recognizes it as two schizophrenic line breaks, so I get
everything in doublespace.  This has caused me some amount of
teeth-grinding.  I've had to turn vegetarian.

> Any system that runs general-purpose programs has a C I/O library that
> knows exactly how to do line feeds for that system, and most non-C
> languages either have C at the back-end anyways or can easily be
> coerced to use the C library for I/O.

So, the trouble is it's not the host system, it's the interchange.  What
about data representations where a logical newline is zero-width
whitespace, used exclusively to prettify presentation of metadata?  The
C library doesn't have a special XML mode, or a special LDAP mode, or a
special Joe's L33t RDBMS mode -- nor should it.
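
To make the \n\r complaint concrete: a client that wants to survive a
confused peer can treat \r\n, \n\r, a bare \r, and a bare \n each as one
logical line break.  What follows is only a minimal sketch in C, not
code from any real client.  The names split_lines and line_cb are made
up for illustration, and it assumes the peer never means two separate
breaks when it sends a CR right next to an LF.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: receives one logical line (not NUL-terminated). */
typedef void (*line_cb)(const char *line, size_t len, void *ctx);

/*
 * Hypothetical helper: walks buf[0..len) and reports each logical line.
 * "\r\n", "\n\r", "\r" and "\n" each count as ONE line break.
 * Assumption: a CR adjacent to an LF is always a single two-byte break.
 * emit may be NULL if the caller only wants the line count.
 * Returns the number of logical lines found.
 */
static size_t split_lines(const char *buf, size_t len, line_cb emit, void *ctx)
{
    size_t start = 0, count = 0, i = 0;

    assert(buf != NULL || len == 0);

    while (i < len) {
        char c = buf[i];
        if (c == '\r' || c == '\n') {
            if (emit)
                emit(buf + start, i - start, ctx);
            count++;
            /* Swallow the partner byte of a two-byte break (CR LF or LF CR). */
            if (i + 1 < len && buf[i + 1] != c &&
                (buf[i + 1] == '\r' || buf[i + 1] == '\n'))
                i++;
            i++;
            start = i;
        } else {
            i++;
        }
    }
    if (start < len) {          /* trailing text with no terminator */
        if (emit)
            emit(buf + start, len - start, ctx);
        count++;
    }
    return count;
}
```

Fed the six bytes "a\n\rb\n\r", this reports two lines rather than
four, which is the doublespacing fix.  Two identical terminators in a
row, as in "\n\n", still count as two breaks, so real blank lines
survive.
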
At some point you just have to accept that your application needs to
have a brain, and also to use it.

Personally -- and I'll admit that I'm speaking as a UNIX developer here
-- I wish C didn't differentiate text and binary, not because they're
the same, but because there's more than just text and binary in that big
bad world, and it's not the C library's job to know the difference.
It's just an illusion to think this alone is going to save your ass.

> The hard part is finding a cluestick big enough for all of the people
> who think all the world's a unix system and bypass the C stdio library

Yeah, that's the Mac, right there.  (Not really.)

> The OP indicates that apparently not even all unix systems are unix in
> this respect anymore...

Who said anything about UNIX systems?  Maybe it's iTunes for Windows,
with Cygwin providing her %EDITOR% of choice.

Newlines are hard, and it's not UNIX's fault.

-- 
 -D.    dgc@xxxxxxxx.xxx    NSIT    University of Chicago
Generated at 00:00 on 28 Jun 2005 by mariachi 0.52