Some long-term perspectives and multi-vendor gripes, and some
general soap-boxing
My career has had some interesting side-effects on my life.
In the beginning, computers could only understand upper case, and
as a result I got in the habit of printing in upper case.
I nearly always print, and seldom in lower case.
My handwriting
is difficult both to read and to write! In high school I took a typing course,
one of the few classes I still use every day.
But the keypunches of
the era forced me to abandon using the right shift key, because
a keypunch only had a left shift key, and a right un-shift key.
The Cyber consoles had re-arranged the digits, destroying my already-minimal
ability to touch type digits.
Over the years, various attitudes have existed between the technical
staff and management. In the early 1970s, management was itself fairly
technical, and a good rapport existed between the employers and employees.
The assistant directors would do programming and hardware development!
As time went on, and management became more detached from the work being
done by the staff, the relationship became somewhat antagonistic at times,
because management was no longer as aware of what the "workers" actually did.
An example from February 1995 of this degradation came from the division
director, in a note announcing a videotaped panel offering its visions
of future computing:
-
"I have a video of a CAUSE Current Issues Forum. The title is "Seeing Higher
Education in the Year 2020." I have viewed it. It lasts 78 minutes. A bit
tedious in the middle, it does address many issues relative to information
technology and its influence on higher education. Furthermore, the
participants are "real people" like Presidents, Provosts and Faculty, not
computer techies."
Aside from the ridiculous assertion that "real people," or anybody, can
predict the state of computing 25 years into the future, please note that the
technical staff are not real people. (Was this an attempt at humor?
If so, it didn't come across that way, because it hit too close to
the attitudes we already perceived.)
For another example of the rift between management and workers, refer to
the story earlier on write-protecting a 9-track tape by slapping red
stickers all over the write-ring slot. In this case it is a knowledge
rift, demonstrating the lack of technical expertise of some managers.
If they can't understand technical work, how can they manage it?
Surprisingly, it can be done (I've seen it), but it doesn't seem to happen
very often.
Maybe we are lucky, but to my knowledge we've had only a few examples
of "bad" employees. The best (?) example was an operator from either
the late Philco or early Sigma era (I think it was early 1970).
At the time, instructors would
mark a student's grade for a class on a computer card. The cards were
then punched by our keypunch operators (and verified), then fed into
the system for sorting and final grading. An operator apparently was
not satisfied with his grade, and managed to find his card and alter
it by pasting a piece of tape over the hole and re-punching it with
an "A". The instructor didn't notice this when he posted the grades
on his door (he got a sheet from Registration & Records, presumably cut
off the SSNs, and taped it to his door) but another student
saw the grade and complained. The instructor insisted our operator
had not been given an "A" but there it was on the roster. Eventually
the card was pulled, the tape was spotted and the operator dismissed.
In another case an operator was using the Cyber console "H" display to
examine the user validation file. I thought this was rather clever,
and fairly harmless because the passwords were not stored in clear text, but
management decided to dump him.
A long-standing problem in computing is the repeated failure of vendors
to "get it right". In this example, tape drives and drivers. In spite
of the complete dominance of 1/2-inch tape for over 20 years, each time we
changed vendors I found problems and shortcomings. With the Sigma, I've
already described the fiasco of the poor design of their 7-track drive
that triggered operating system crashes. They also made the common blunder
of disallowing arbitrarily large block sizes. When we got the Cybers,
I ran a test with Sigma-written tapes containing every block size from
1 byte to 32K bytes and discovered that NOS would mis-report block sizes
that were a certain multiple of 512 CM words. On our VAX 11/785, the
TU80 tape drive would mis-report some small block sizes up to about 14
bytes, a problem for which DEC had a known fix that they never turned
into an ECO. They felt it was only worth installing on drives if the
customer complained, even though the drive was on a service contract. VMS
cannot read arbitrarily large blocks. It also claims to ignore all
blocks smaller than 14 bytes, which is a lie (no complaint there, except that
the documentation is wrong). However, it will refuse to write blocks
smaller than 14 bytes, with the result that we can read old NOS tapes
but cannot copy them! VMS' BACKUP utility loves
to write its own labels instead of letting the operating system do it
(very bad news), and on OpenVMS Alpha 6.1 it botched the labels in some
way on continuation reels so that VMS would not recognize the result as an ANSI
tape set. (This was fixed in VMS 6.2).
I've never seen a vendor that included a useful tape copying utility
in the operating system. All utilities were special-purpose, such as
being useful for labeled tapes only, or unlabeled tapes with fixed
records. The big problem in tape copying is knowing when to stop.
If the tape is ANSI labeled and the utility knows this, it's well defined.
But if the tape is unlabeled, of unknown format and just a bunch of blocks
and tape marks, the only convention worth mentioning as a de-facto
standard is to stop when two tape marks in a row are found. Most
utilities are not that smart.
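To make the convention concrete, here is a minimal sketch in C of the "stop
at two consecutive tape marks" rule, assuming a Unix-style raw tape driver
where read() returns zero bytes at a tape mark; the device name and maximum
block size are just placeholders, and a real copy utility would also have to
preserve block sizes and tape marks on the output:

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Sketch only: read an unlabeled tape of unknown format and stop at the
     * de-facto end-of-data signal, two tape marks in a row.  Assumes a
     * Unix-style driver where read() returns 0 at a tape mark. */
    int main(void)
    {
        const char *dev = "/dev/nrst0";        /* hypothetical no-rewind device */
        char buf[65536];                       /* accept blocks up to 64K       */
        int fd = open(dev, O_RDONLY);
        if (fd < 0) { perror(dev); return 1; }

        int consecutive_marks = 0;
        for (;;) {
            ssize_t n = read(fd, buf, sizeof buf);
            if (n < 0) { perror("read"); break; }
            if (n == 0) {                      /* a tape mark                   */
                if (++consecutive_marks == 2)  /* two in a row: assume end      */
                    break;
                continue;
            }
            consecutive_marks = 0;             /* a data block resets the count */
            printf("block of %zd bytes\n", n);
            /* ...here the block (and any single preceding tape mark) would be
             * written to the output tape... */
        }
        close(fd);
        return 0;
    }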
One of the very worst tapes I ever saw, in terms of format, was the source
tape for Versatec software for the Sigma. It was unlabeled, and actually
began with a double tape mark (almost a universal signal that no more
valid information follows). Each file was separated by a double tape mark,
except the last file was simply followed by blank tape instead of even
a single tape mark.
There is also the issue of multi-reel tape sets. If ANSI labeled, things
are pretty clean except for determining what the "next" tape is if you
don't already know in advance (NOS had a fix for this, that worked most
of the time). But there are no standards for unlabeled tape sets, and
whenever I hear of such, I get upset. For example, NOS could be
told to switch reels on unlabeled tape sets in one of three ways:
- When the reflective marker is found during a write, abandon the
write, switch reels, then write the block again.
- When the reflective marker is found during a write, complete the
write, switch reels.
- When the reflective marker is found during a write, complete the
write, notify the program, and keep writing until a write-eof is
requested by the program. Then switch reels.
Of course, if you didn't know how the tape was written, you probably could
not read it correctly. (This applied to NOS "Stranger" tapes, not ANSI
or "Internal" tapes).
Another big blunder the industry made, aside from starting out with octal
computers instead of hexadecimal, was to adopt a 7-bit communication code
for storage of 8-bit bytes. This led to all sorts of problems, such as
CDC claiming they supported "8 bit ASCII." Lots of vendors played tricks
with the 8th bit, such as leaving it zero, leaving it 1, using 0 or 1
interchangeably, or using it to define an incompatible second set of 128
characters (e.g. MS/DOS, DEC Multinational). Then again, IBM can't even
standardize its own 8-bit EBCDIC code. IBM will define a code of X'nn'
as, say, a vertical bar (shift-Y on a keypunch), except that on some
particular model of printer with some specific print chain it will print
instead as some other character. Look at an IBM code chart and you'll
find things like three different underscore characters, and several other
codes that have different graphics depending on the equipment. Finally,
there's the issue of non-translatable characters between ASCII and EBCDIC,
such as EBCDIC's cent sign.
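A tiny C illustration of how those 8th-bit tricks mangle data; the byte values
are just examples (0xE9 is 'é' in DEC Multinational and Latin-1), and the point
is that two pieces of software with different conventions silently disagree
about the same byte:

    #include <stdio.h>

    /* Sketch of the "8th-bit tricks" described above.  0xE9 is 'é' in the
     * DEC Multinational / Latin-1 arrangement; software that treats the top
     * bit as a parity or flag bit and strips it silently turns it into 'i'. */
    int main(void)
    {
        unsigned char sent     = 0xE9;          /* intended as Latin-1 'é'       */
        unsigned char stripped = sent & 0x7F;   /* the "8th bit is mine" school  */
        unsigned char flagged  = 'A' | 0x80;    /* 7-bit 'A' with the top bit    */
                                                /* forced on by some other gear  */
        printf("0x%02X stripped to 0x%02X ('%c')\n", sent, stripped, stripped);
        printf("'A' with the 8th bit set becomes 0x%02X\n", flagged);
        return 0;
    }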
Speaking of ASCII, a related snafu is the "tab" character. Its use was
originally to save space in files, but it critically yet silently assumes
particular tab settings. On the Sigma, terminals (e.g. the TTY 33) had no real
tabs, so it was all done in software. The default mode was that when a
user hit the tab key, the tty driver would figure out how many spaces were
required to get to the next tab stop, and would send that many spaces to
both the terminal and the calling application program. Thus, tabs were
never actually stored within the file (I think they called this "space
insertion mode"). Now, in 1995, the problem still persists. Many PC
applications will store tabs in the file, but if you carry this file to
VMS (or Unix?) it will not display correctly because the tab settings are
different (and there's no way for the file to declare its assumptions).
Even within VMS itself, lines will display differently in some editors
(e.g. EDT in line mode vs EDT in screen mode) because the tabs are not
handled correctly. Moral: For Pete's sake, don't use tabs!
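Despite the moral, here is a minimal C sketch of the kind of "space insertion"
the Sigma tty driver did: expand each tab to enough spaces to reach the next
stop. The 8-column interval is exactly the silent assumption that causes all
the grief; run the same text through a different interval and the columns no
longer line up:

    #include <stdio.h>

    /* Expand tabs to spaces, advancing to the next multiple of 'tabstop'.
     * A sketch of "space insertion mode"; the stop interval is the
     * assumption that files full of literal tabs quietly depend on. */
    static void expand_tabs(const char *in, FILE *out, int tabstop)
    {
        int col = 0;
        for (; *in; in++) {
            if (*in == '\t') {
                int spaces = tabstop - (col % tabstop);
                while (spaces-- > 0) { fputc(' ', out); col++; }
            } else {
                fputc(*in, out);
                col = (*in == '\n') ? 0 : col + 1;
            }
        }
    }

    int main(void)
    {
        expand_tabs("name\tvalue\none\t1\n", stdout, 8);  /* lines up at 8...    */
        expand_tabs("name\tvalue\none\t1\n", stdout, 4);  /* ...differently at 4 */
        return 0;
    }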
Vendors keep releasing editors that can't edit an arbitrarily-large file.
In many cases, the limits were seen as "reasonable" at the time the software
was written, but customers and/or storage advances quickly overcame them.
On the Sigma, the editor would handle 16K lines maximum, if I remember correctly. This
was an interesting editor because the files were keyed (ISAM) with the
key being the line number (CP-V keyed files stored the key completely
separately from the data, unlike most other implementations) but the key
was too short to handle large files. But it did mean that no "work" file
was needed, nor huge gobs of memory. On the Cybers, XEDIT "almost" did
it right but in a few places used 17-bit "B" registers for the line number
and thus was limited to 128K lines. A popular patch among NOS customers
was to fix XEDIT to use the 60-bit "X" registers instead, removing that limit.
Later, CDC's Full Screen Editor (FSE) could handle huge files but took
a very long time to read them in and build a work file first. In VMS,
EDT will break if the work file is over 64K blocks, and TPU will break if
the file is larger than the user's working set limit. Most DOS and PC
editors will also fail if the file being edited exceeds 640K or some other
memory limit.
I often hit this using the Solaris vi editor with either "Tmp file too large"
or "Out of register space (ugh)" or "Line too long".
Many vendors/applications can't seem to cope with empty files. On one of
the early Sigma operating systems (BTM?) the SORT utility considered a
file of zero length to be an error. While an empty input *may* indicate a
problem, that depends on what's going on, and the utility certainly can't make
that decision. Several sites had to beat up on XDS to convince them that
sorting an empty file made perfectly good sense. Just give me an output
file with all zero records sorted in order; in other words, create an
empty output file if the input is empty. Let the programmer decide whether
it's a problem! This came to mind just today (1995), 25 years later, when I
noticed that the MXRN news reader aborts if its input news.rc file has zero records!
One of the amusing fiascos to watch in the long term, is the constant
flip-flopping on the issue of aborting jobs/processes. In the early days
if a job ran away, you rebooted the system. The OS folks soon gave us
a way to abort jobs, either from the operator console or, eventually,
from a user terminal. Almost as quickly the users demanded a way to
kill-proof their jobs, either completely or to at least allow the program
to "catch" such exceptions and do whatever was necessary (which in some
cases was to continue on with the task in progress). Next, the OS folks
gave us a way to *really* kill a process as well as a way to merely ask
for death. A good example is VMS, with the often-misunderstood
distinction between control-C (please die) and control-Y (I demand that
you die). Of course, the users then demanded a way to catch control-Y
and do with it what they wished. Unix offers a whole assortment of
kill signals (SIGKILL being the uncatchable one), and NOS would
unconditionally kill a process that had already been killed once.
In VMS, a privileged process can set a bit that says "don't kill me
ever, no matter what" and I recently had such a process (Distributed
Transaction Manager) hang and prevent the system from shutting down
properly. On a VMSNET newsgroup, somebody was asking how to kill
a kill-proof process. It's been a constant battle between
the "I must not be killed" and the "I must be able to kill" factions.
Of course neither is inherently right or wrong. But it has been interesting
to watch things swing back and forth, with more and more layers of kill
and no-kill. And of course the PC makers have re-discovered the whole
issue from scratch rather than pay attention to experience.
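For the Unix flavor of this, here is a small C sketch of the
catchable-versus-uncatchable split: SIGINT and SIGTERM ("please die") can be
caught so the program decides what death means, while SIGKILL (kill -9, "I
demand that you die") cannot be caught, blocked, or ignored at all:

    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static volatile sig_atomic_t asked_to_die = 0;

    /* Handler for the polite requests; note the request and clean up later. */
    static void on_polite_request(int sig)
    {
        asked_to_die = sig;
    }

    int main(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_polite_request;
        sigaction(SIGINT,  &sa, NULL);   /* ^C: "please die"                  */
        sigaction(SIGTERM, &sa, NULL);   /* kill(1) default: still catchable  */
        /* Trying the same with SIGKILL simply fails: kill -9 is final.       */

        while (!asked_to_die)
            pause();                     /* wait for a signal                 */

        printf("caught signal %d, cleaning up and exiting\n", (int)asked_to_die);
        return 0;
    }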
Mainframes are dead. Small systems will take over. I've heard this since
the PDP-11 was introduced, and I still don't believe it. Certainly the
small desk-top system has made a large impact, especially if dropped from
a height. But despite 15+ years of being told the dinosaurs of computing
are outmoded, they continue to live. Just today (March 7 1995) I read that
in 1994, IBM's sales of MVS systems grew by 40%. Pretty active for a corpse.
As I said, small systems have their place. That place is growing, and of
course today's small system has more storage and compute power than the
mainframes of a few years ago. But there are tasks that simply cannot
be done on a household appliance, such as payroll for a University or
registration/records. At least not yet. My latest chuckle comes from comparing
the tentative plans for a PC file server for students: a multi-processor
system with gobs of memory, a very fast tape drive for backups,
and many gigabytes of disk. Much better than the klunky old mainframe, with
its dual processors, gobs of memory, IBM 3480-style tape drives, and 27
gigabytes of disk, right? Even when management tries to make their own
predictions come true, "mainframe" use continues to climb.
Yet another common mistake, or maybe mistaken attitude, is the "normal
error." An application or system yields an error message, but the vendor
or provider says this is normal. Baloney. Aside from being a nuisance,
the Normal Error tends to produce the "wolf" effect, desensitizing the
viewer from real errors. It's unprofessional and tacky. My favorite
example was our Digital 7620 Alpha system. If you entered TEST at the
console prompt it would run a bunch of self-diagnostics, then exit. But
upon exit the console software would either function correctly, hang, or
explode with a "catastrophic failure" (their phrase, not mine). Because
the diagnostics themselves completed normally, this error was of
little concern to Digital. In other words, a "normal catastrophic
failure." Sheesh.
Hey, I'm starting to sound like Andy Rooney.
A lack of standard terminal behavior also plagued the industry, though
with Windows taking over, this is less important.
Example: some terminals wrap long lines, some truncate long lines, many
can be set one way or another. Some programs expect a truncating terminal,
some expect a wrapping terminal. Some even set the terminal the way they
expect, then leave it set that way when they exit. Why is this important?
Example: In VMS Mail, users here frequently typed messages without ever
hitting the Return key, because the terminal/emulator they were using happened
to wrap long lines, so the user thought WYSIWYG and kept typing. But in fact
VMS considered it a single long line, because the wrapping was done by the
terminal, not by VMS or Mail, and when the user hit character 512 the line
got truncated with an error message. This confusion could be prevented
by not wrapping.
Vendors often give you ways of setting something like a process attribute,
but no way of examining it. I remember in NOS it was trivial to set a
User Job Name (UJN) but darn near impossible to determine its current
setting. There are several things in VMS that you can SET but not SHOW.
Before we started getting Digital-compatible terminals, my experience in
this particular area was with terminals that either wrapped or didn't.
Then when we got real DEC-compatible ones, I was further annoyed to find
that VT-terminals would position the cursor at column 80 after you typed
the 79th character, and would then *leave it there* on top of the 80th
after you typed *that* (so how do you distinguish a 79-character line
from an 80-character line that ends with a blank?).
This is an interesting feature, but it complicated
the issue of supporting multiple terminal types on NOS.
For many years, we simply *had* to standardize on terminals that were
compatible with the original Lear-Siegler ADM-1, because the AMS systems
had been programmed to operate with those terminals. This excluded the
Digital VT series, much to the chagrin of almost everybody else. When
WHECN was formed, they also adopted the AMS systems as a standard, and
WHECN repeatedly forced the issue of a compatible terminal type for both
them and UW, even after UW shut down the last AMS. An excellent example
of the tail wagging the dog. Only when we acquired the VAXcluster, and
PCs became plentiful with Kermit's VT-100 emulator, and UW/WHECN divorced
each other, could we finally settle on a single standard terminal type.
In the area of serious computing, UW has seldom been a contender
(depending on whether you consider a Cyber 760 a serious number cruncher,
which it was just barely in 1979 when we got it). Various hand-waving
has been done by directors, generally claiming that time on supercomputers
was freely available at other sites, which is/was not the case. One
well-placed source claims that Cray Research offered UW one of their
systems, but it was rejected because of "too many strings attached."
Another sign of this is UW's Institute for Scientific Computing.
I contend the mere existence of this group shows that our
Division isn't serving a broad enough segment of UW's population, but
others have disagreed. As desktop computing power has increased, this
has almost become a non-issue. We seem infatuated with networking,
e-mail, and word processing, but there are still those who use computers
for computing.
A second example of not meeting campus needs, at least in a specialized
area, is the attempted startup of a separate organization called GRIP.
This was the brainchild of the VP for research, who wanted a service
organization dedicated to computer graphics. But this group was supposed
to be "self sufficient" though I never got a precise definition of what
this meant. The implication, as above, is that the central computing
division wasn't doing its job in terms of serving the campus. Unfortunately
the VP in question "quit" (i.e. got canned) early in the project, which
then quickly turned into a graphics lab for a few researchers rather
than a campus-wide service center.
I sometimes wonder how management views me, especially my zeal for
detail (for which I was once criticized on an annual evaluation).
Recently I discovered an account belonging to a retiree who had died.
This brought up the question of how to dispose of the user's files,
if for example a department or relative requested access, so it was
passed on to management for a proper pronouncement. I could almost hear
them saying "Only Kirkpatrick would worry about dead users."
Management, while meaning well, has occasionally gotten technical jargon
fouled up. This at least provides us with some humor. One person is
always worried about "increasing response time" instead of decreasing
(improving) it. Recently they got excited about a computerized University
information kiosk, with its "touch tone screen." One manager tried to
figure out how big a program could be, by dividing the page file size
by the number of authorized users.
Of course we often hear about "Local Area LANs."
We've repeatedly terminated specialized services because they were only
used by a small number of users. While I can understand the strength of
this argument, especially if dictated by economics, it worries me. I once
told a director that we could save a *lot* of money by not offering any
services at all, if money is the issue. But a University needs to cater
to more than just the demands of the majority. For example, back in 1970
there were a few hundred people using our computers to do computing; in 1995
there are over ten thousand using them for e-mail and composing useless
memos, and a few hundred people still using computers to do computing; should
we therefore tell the latter to get lost because they are in the minority?
The last thing we need to do is strive for mediocrity.
Time: Nobody seems to understand time, and the distinctions between local time,
Universal Coordinated Time (UTC), and International Atomic Time (TAI).
People get dumbfounded when you mention that UTC can have leap seconds added or
subtracted up to twice a year; I understand the POSIX official stance is that
this simply doesn't happen (no doubt because they could not agree on exactly
how a POSIX-compliant system should handle leap seconds). Leap seconds are such
a mysterious pain that some have argued they shouldn't happen at all, or they
should only happen once each decade (if we can't handle it twice a year and
get it right, what makes them think it will be handled correctly once a decade?).
At issue are applications and operating systems whose coders have not
considered time correctly and write things like "make xyz happen at 23:59:59"
without realizing that 23:59:59 might not actually happen, because a negative
leap second would make UTC jump from 23:59:58 straight to 00:00:00, and other
similar mistakes. If you want something to happen one second before the date
change, you should be able to code it that way without assuming that 23:59:59
is that time. And don't forget that with a positive leap second, the time can
progress from 23:59:59 to 23:59:60 to 00:00:00. Then there are the darn
legislators who in 2005 changed
when daylight saving time starts and ends, effective in 2007. How many automated
systems will malfunction because the old timezone rules were hard-coded?
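As a small C sketch of the "one second before the date change" point: compute
the moment rather than hard-coding 23:59:59. This assumes ordinary POSIX-style
time_t arithmetic, which itself pretends leap seconds don't exist, but at least
the intent is stated instead of a wall-clock value that might be skipped or
repeated:

    #include <stdio.h>
    #include <time.h>

    /* Schedule-style sketch: find "one second before the next date change"
     * by computing tomorrow's midnight with mktime() and subtracting one
     * second, instead of testing for a literal 23:59:59 on the wall clock. */
    int main(void)
    {
        time_t now = time(NULL);
        struct tm t = *localtime(&now);

        t.tm_mday += 1;                 /* tomorrow ...                       */
        t.tm_hour = 0;                  /* ... at 00:00:00 local time         */
        t.tm_min  = 0;
        t.tm_sec  = 0;
        t.tm_isdst = -1;                /* let the library sort out DST       */

        time_t midnight = mktime(&t);   /* normalized by the C library        */
        time_t target   = midnight - 1; /* one second before the date change  */

        printf("now:    %s", ctime(&now));
        printf("target: %s", ctime(&target));
        return 0;
    }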
Spam: Everybody has sure-fire ideas on how to stop spam, but most of them
simply don't make a lot of sense. For example, the greet-pause feature of
sendmail will reject connections from senders/servers if they attempt to
send email before the receiving system sends an intentionally-delayed
greeting. The assumption is that spammers use impatient software, and that
non-spammers use polite RFC-compliant software. Of course this is not true,
but out of the thousands of connections we reject, there is no way to know
how many would have been spam and how many were not. The same is true for
IP-address blacklisting; UW is not a major spammer yet we seem to land on
black lists from time to time and in spite of being falsely targeted, we
gleefully turned on our own feature to reject email from black-listed sites.
There was some sort of idea that involved having everybody put in a special
DNS entry so that others would know you're OK but this never seems to have
gotten very far because, of course, not everybody bought into the idea and
spammers can run their own DNS servers. And so on.
Disappearing error messages: Idiots who write error messages on the screen
then clear the screen before you can read them. For this reason I'm generally
opposed to clearing a screen for any reason. Of course here I'm talking
about terminal emulators for the most part, but the same thing has been
observed, for example, in Solaris installation windows, where I once had
the install complete and clear the screen in spite of unrecoverable
CD-ROM read errors, removing the evidence that anything had gone wrong, let
alone what it was that went wrong. The same thing can happen on Sun systems
when OBP detects a problem then clears the screen when the windowing
software starts; if this is before syslogd starts, you may not even know
that anything is amiss.
Metric: When the metric system was invented, all sorts of new screw sizes
came along with it. Not only did these differ from the old "English" sizes, they
even decided to invert pitch, going from threads-per-unit-length to
length-per-thread. Duh. But then they came up with standard sizes that
are *almost* identical to the older sizes, so that you can't even decide
at times whether you're holding a metric screw or a standard screw. So what
does this have to do with computing? A lot if you just mangled a metric
threaded hole in a rack because you put a 1/4-inch screw into it. Or if you
have one of those old stupid Sun SBUS cover plates with screws that might
be 2-56 or they might be 2.5mm. Of course the bloody French expected the
planet would go metric just because they are the French, and most of the
planet did except the dumb old US. Sigh. Time for me to take a stress pill.
"The fact that [the sample random number generators] were given in Pascal
should not be taken as an indication that they are intended for toy
applications. A Fortran adaptation is given here." -- F. James,
Computer Physics Communications, Volume 60, page 337.
– End