[LEAPSECS] Schedule for success
Rob Seaman
seaman at noao.edu
Sat Dec 20 14:31:31 EST 2008
Scroll down for my response. Context seemed important.
On Dec 20, 2008, at 11:33 AM, M. Warner Losh wrote:
> In message: <D754EF5C-767A-4FF0-AC64-6E9543AAA62A at noao.edu>
> Rob Seaman <seaman at noao.edu> writes:
> : Poul-Henning Kamp wrote:
> :
> : > Steve Allen writes:
> : >
> : >> Please identify the operations which need one second predictability
> : >> over a time span of six months.
> : >
> : > Wrong question.
> : >
> : > Try: Please identify computer communications where it is not
> : > guaranteed that all involved computers will have their software
> : > updated every six months.
> :
> : Meant as a bon mot, I guess? Seems to emphasize Steve's point in any
> : event.
> :
> : However, you've actually identified a potential mechanism for
> : distributing scheduling data of all sorts, including for leap
> : seconds. Instead of building computer hardware, operating systems and
> : applications that pretend the relentless update cycle doesn't exist,
> : build such systems to expect scheduled updates to software and key
> : data structures. Leap seconds are just one from a large class of
> : non-static information that needs to be widely shared in common for
> : infrastructure to work.
>
> Sadly this is well divorced from reality. People can and do build
> systems that have a long shelf life. It is routine in certain sectors
> to buy 10 of something and put 8 into the field. The other 2 are
> spares and sit on the shelf for a long period of time. The software
> is rarely updated on systems like this (why should it be, they are
> simple and bug-free enough to run for years). These systems are
> expected to run for 10 years with < .001% (so called 5 9's) downtime.
> Upgrades make that nearly impossible to meet.
>
> When one fails, another one gets swapped in. Otherwise the system is
> up all the time. To force an upgrade every 6 months would force a
> down time, which is unacceptable. It would also, in many cases, force
> someone to physically go to the location where the systems are running
> to do the upgrade since many of these systems aren't on public
> networks (and the private ones are oversubscribed with their current
> data loads, no room for extra software updates).
You are arguing with Poul-Henning, not with me.
Just a reminder that not only are astronomers big-time customers for
*both* high precision interval timescales *and* high accuracy earth
orientation timescales, but we *also* have some of the more bizarre
requirements for high reliability, remotely located, long shelf life
systems such as you describe.
Again - why are engineering best practices regarded as an annoyance?
If the real world places a requirement on a system (like - SI seconds
are not civil seconds), then a better solution will result from
actually designing this into the system up front. Perhaps this will
require an update every 6 months. Perhaps it can be longer than
that. Perhaps autonomous, adaptive scheduling can be built in.
Trying to pretend the requirement doesn't exist is naive decision-
making at its worst. One can't claim that systems resulting from such
a decision-making process have been engineered in any real sense at all.
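As a minimal sketch of what designing the requirement in up front might look like, assume a leap second table that carries its own expiry date, so that a system can tell when its knowledge has gone stale instead of silently guessing. The names LeapTable, expires and tai_minus_utc are hypothetical, and the expiry date shown is illustrative only. The two offsets are the published TAI-UTC values.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LeapTable:
    """Hypothetical leap second table that knows how long it can be trusted."""

    # (effective date, TAI-UTC in seconds from that date on), in ascending order
    entries: tuple
    # assumption: last date through which the table is known to be complete
    expires: date

    def tai_minus_utc(self, day: date):
        """Return (offset, trusted); trusted is False for dates past expiry."""
        offset = None
        for effective, value in self.entries:
            if day >= effective:
                offset = value
        if offset is None:
            raise ValueError("date precedes the first table entry")
        return offset, day <= self.expires


TABLE = LeapTable(
    entries=((date(2006, 1, 1), 33), (date(2009, 1, 1), 34)),
    expires=date(2009, 6, 30),  # illustrative expiry, not an official value
)

offset, trusted = TABLE.tai_minus_utc(date(2008, 12, 20))
print(offset, trusted)
```

Whether a stale table triggers an update, a degraded mode, or nothing at all is then an explicit design decision for each system, rather than an accident.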
> The non-regularity of leap seconds makes this very hard to do. Even
> with a GPS receiver in hand, it can be hard to start cold, and there's
> no way to startup reliably if you've been off as little as one year.
> These systems routinely exchange data with timestamps, some of which
> is historical. Without leapsecond knowledge, you get degraded
> performance. Systems that are off for a year have no clue when the
> last leapsecond(s) were, unless there weren't any in that time. This
> can and does lead to degraded performance in some cases.
What are those cases? Use cases drive requirements. If we weren't
perpetually trying to squelch the inappropriate rush to a decision of
the ITU, we could focus on the actual system engineering issues.
> Except, of course, in this case good system engineering is that these
> systems will run, unattended (and unnetworked), for years doing the
> job they need to do. To force them all to upgrade just because of
> leap seconds is silly.
You are describing another requirement for untended operation.
Requirements aren't a zero-sum game with one winning over another.
Rather, every requirement in a coherent set has to be honored in a
conforming solution. I said the exact opposite of demanding that they
update. However, if, as PHK says, many systems have to update
frequently, those that do could use this as an opportunity to update
broad categories of data tables - not just leap seconds. The small
number that can't use this commercial update regime will still
benefit from following a coherent plan.
Truly untended, unnetworked systems likely have no requirement to use
UTC in the first place. An instrumentality of mankind that is
disjoint from the rest of civilization may need timekeeping, but
likely doesn't need civil timekeeping.
> You've constantly pooh-poohed the notion that people that have actually
> written and deployed dozens of these systems know what they are
> talking about. You all but call such people morons for not following
> good system engineering practices. Yet, you show a surprising
> ignorance of how things actually work and of system engineering
> practices demanded by customers.
Customers pay for solutions. Solutions require problems to be
defined. That definition benefits from engineering. A customer might
attempt to specify an engineering process. (Although it seems more
likely that they would specify some widely known best practice
engineering process, rather than some ad hoc abrogation of same.) A
vendor or engineering firm should take the customer's specifications
into account when preparing a proposal. A wise engineering firm won't
accept a job with unrealistic strings attached.
I don't believe I've called anybody a moron. I apologize if so.
Otherwise, I am not responsible for the inferences others draw from what
truly seem to me to be unremarkable statements regarding the
engineering of systems. Naive solutions are bad. Proper engineering
is the best way to avoid naive solutions. Why is this controversial?
It is my college logic professor you are arguing with if the assertion
is that one should defer to experts simply because they are experts.
However, folks do understand perhaps that the astronomers contributing
to this list are predominately members of the astronomical software
community and are experts themselves?
I'm not sure what "these systems" refers to, but timekeeping has often
been a key element of systems I've designed and deployed. I'm well
aware of the mistakes I've made over the years. My goal is to keep
the ITU from making one big whopper of a mistake revising this general
standard. Unless I sit on a design review for a specific system, I
don't have an opinion about the process design choices for that system.
> The root cause of all of this is the irregularity of the scheduling of
> leap seconds. If they were on a schedule, known years in advance,
> then these systems could be built.
And several of us from the astronomy software side have expressed
interest in exploring options for lengthening the schedule. My
reading of the current UTC standard is that this might even be
completely acceptable with no ITU or government level changes.
Wouldn't this be even easier and quicker to implement than the goal of
eradicating leap seconds, a goal that cannot (apparently) be reached
before 2019?
> Imagine if you have to find out from the pope every year if this year
> was going to be a leap year or not? There's all kinds of problems
> *THAT* would cause, and nobody would debate it....
And yet that is exactly the case with time zones and daylight saving
adjustments. Any government anywhere can change these far more
visible policies at any time. The sliding time zone solution would
magnify this reality at a quadratically increasing cadence worldwide.
Rob