20071230

Happy New Year! It's about time...

Time for a New Year's resolution! Establish an NTP server at every site I work on, and point as many devices at it as I can. (I'll do this to save everyone some time.)

Consider taking a look around your installation. How long would it take to check the uptimes of the devices, and also check their on-board real-time clocks? How many do you think are showing the current time to within +/- 10 minutes? For your network gear, how many are stamping log entries with the "from startup" uptime instead of a date/time notation?

Now, consider that in the time it would take to EVALUATE your clocks and timestamp settings, you could probably FIX THEM, not just record the results! If you plan to go and touch all the gear anyway, why not save some time, and SET the clocks, instead of checking them?

But when was the last time anyone bothered to do this in your installation? Maybe the clocks on your devices only get set when each device is installed. (If you ship devices between data centers, across timezones, do you make a point of changing the device clocks when you receive them into your installation?) Let's face it, these clocks are cheap, and they ALL have some drift in them. Even if the drift is minimal, the error is cumulative, and can be significant across 3 or more years. You compound the drift when you don't start from a stable reference. (What time reference does everyone use when they set device clocks? Probably their own wristwatch?)
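To put a rough number on "cumulative", here is a back-of-the-envelope sketch. The drift rates are assumptions I picked for illustration (parts per million, i.e. microseconds gained or lost per second), not measurements of any particular device, and the function name is made up for this example.

```python
SECONDS_PER_DAY = 86_400


def cumulative_drift_seconds(drift_ppm: float, days: float) -> float:
    """Seconds of error built up by a clock drifting at a constant rate.

    drift_ppm is an assumed, constant drift rate in parts per million;
    real clocks also wander with temperature and age.
    """
    return drift_ppm * 1e-6 * days * SECONDS_PER_DAY


if __name__ == "__main__":
    # Assumed drift rates, from a decent clock to a poor one.
    for ppm in (5, 20, 100):
        for years in (1, 3):
            error = cumulative_drift_seconds(ppm, years * 365)
            print(f"{ppm:>3} ppm over {years} year(s): {error / 60:6.1f} minutes")
```

At an assumed 20 ppm, a clock that was set once and left alone for three years would be off by roughly half an hour, and that is before counting whatever error it started with.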

In the same way that you need to spend some money to make more money, you need to spend time to make more time! NTP is a great way to set clocks, and keep them close to the current time. (Even a daily time sync will all but eliminate the effect of drift on the cheap clocks.) Most devices have a no-cost client available, and many servers and some network gear have a no-cost NTP server option available. If you have a complicated network, with multiple customers or DMZs, you can invest in an appliance that uses GPS or the cellular phone service to provide a drop-in time source on an isolated network. You just need to take some time to set up a server, and then set your devices to be clients.
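For the curious, the simplest form of the client side (SNTP) is small enough to sketch in a few lines. This is only an illustration of the exchange, not a replacement for a real client: it ignores network round-trip delay and never touches the system clock. The function names are my own, and the server name in the usage note is a placeholder.

```python
import socket
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_DELTA = 2_208_988_800


def build_request() -> bytes:
    """Build a 48-byte SNTP client request.

    First byte 0x1B = leap indicator 0, version 3, mode 3 (client);
    every other field is left at zero.
    """
    return b"\x1b" + 47 * b"\x00"


def transmit_time(reply: bytes) -> float:
    """Return the server's transmit timestamp as Unix time (seconds)."""
    if len(reply) < 48:
        raise ValueError("reply too short to be an NTP packet")
    # Bytes 40-47: 32-bit seconds and 32-bit fraction, network byte order.
    seconds, fraction = struct.unpack("!II", reply[40:48])
    return seconds - NTP_UNIX_DELTA + fraction / 2**32


def query(server: str, timeout: float = 2.0) -> float:
    """Ask one server for the time over UDP port 123 (no delay correction)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, 123))
        reply, _ = sock.recvfrom(512)
    return transmit_time(reply)
```

Calling something like `query("ntp.example.internal")` (a made-up hostname; substitute your own server) and comparing the result with `time.time()` gives a quick feel for how far off a host is. The actual clock discipline is best left to the NTP client that ships with the device.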

Technically, all of this work (and invested time setting up NTP) doesn't save time, per se. What it will do is reduce the time you spend tracking down problems. Time math is hard. Correlating log files between devices with unsynchronized clocks, each with its own offset, takes a LOT of time. But when you have an "event" in your installation, and you need to find the Root Cause, you need to compare a bunch of logs to figure out what happened first, then second, and so on.
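Here is a small sketch of the chore that unsynchronized clocks force on you: before you can line up the logs, you have to work out how far off each device's clock was, and correct every timestamp by hand (or by script). The device names, messages, and offsets below are invented for illustration.

```python
from datetime import datetime, timedelta


def merge_timeline(logs, offsets):
    """Merge per-device logs into one timeline on a common reference clock.

    logs:    {device: [(timestamp, message), ...]} as each device recorded them
    offsets: {device: timedelta}, how far AHEAD of the reference each
             device's clock runs (negative if it runs behind)
    """
    events = []
    for device, entries in logs.items():
        skew = offsets.get(device, timedelta(0))
        for stamp, message in entries:
            # Subtract the skew to convert device time to reference time.
            events.append((stamp - skew, device, message))
    return sorted(events)


if __name__ == "__main__":
    # Hypothetical event: the router clock is 7 minutes fast,
    # the firewall clock is 3 minutes slow.
    logs = {
        "router": [(datetime(2007, 12, 30, 10, 7, 5), "BGP peer lost")],
        "firewall": [(datetime(2007, 12, 30, 9, 58, 30), "interface down")],
    }
    offsets = {
        "router": timedelta(minutes=7),
        "firewall": timedelta(minutes=-3),
    }
    for stamp, device, message in merge_timeline(logs, offsets):
        print(stamp.isoformat(), device, message)
```

Going by the raw timestamps, the firewall event appears to come first. After correction, the router event leads by well over a minute, which changes the story about what caused what. With every device synced to the same NTP server, the offsets are effectively zero and this whole step goes away.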

If you take the time to set up an NTP server, you make it quick and easy to 'set and forget' the clocks on new devices coming into your installation. When you take the time to point all of your existing devices to the NTP server, you'll save time comparing the logs on all future 'events'. It's hard to guess how much time, or how soon you will recover that time, but it WILL happen. Just give it time.

-Zonker-

Some NTP appliance resources I've found to be interesting:
VMware NTP appliance VM info
Symmetricom S200
J-Time (Meinberg USA) Lantime M300
Brandywine Communications NTV-100RG

20071228

Saving time, and saving soles...

Whether you wear sports shoes, cowboy boots, or Birkenstocks, the life of your shoes' soles is determined partly by how much walking you do in them. I'm a big fan of recreational walking, but I'd like to minimize the running around I need to do at the shop.

In the event of an emergency, getting on a serial console saves me from pushing a cart around, waiting for elevators, and carrying a laptop (and all the extra adapters, cables and power packs), just so I can get on the console of various devices to check configurations.

Yes, if the network is working, I can check a lot of the configurations across the network. But sometimes interfaces die. Sometimes a typographic error will change an interface or network setting. Sometimes a cable gets unplugged accidentally. That's when your normal access breaks, and leaves you scrambling, unless you have a console management network in place, and some strategically placed console servers. (Better still, you should have something logging all of those remote consoles, but that's a topic for another post.)

Today I watched a friend cutting over to a new network over the holidays. Progress was being made, but it was slower than planned, and the days were starting to run out. He had developers from the new network gear vendor helping to debug why the gear wasn't interoperating. While they all had their laptops, I watched as they wheeled a cart around the buildings, and up and down elevators, trying to check configurations because the network wasn't yet stable.

What was missing was a stable, simple console server deployment, with 8-16 ports in each of the main and intermediate frame rooms (MDF and IDFs). There was already fiber between the rooms. The console net could have been simple, stable, and independent of the main network. And it would have allowed them to be logged into many consoles at one time. They could have been watching errors and events on many devices as they tried tuning various settings.

Sure, this costs a bit of money to set up. But the price per port is low for simple, reliable gear. Consider the time for a couple of contractors, and three developers, working over the holidays, trying to debug a problem. (I guess if you're doing that work as an hourly worker, it's not so bad...but if you're the person in finance trying to close out the end-of-year books, and the network is being flaky, I imagine that your perspective about how soon the network should be stable would be different.)

I've written elsewhere about my portable Emergency Kit (a small hub, console server, adapters, cables, and canned telnet configurations to the console server). Today, I'm lobbying for you to consider a simple permanent configuration that supports your current devices, with a bit of room to grow. Sooner or later you may find yourself adding more gear in preparation for a cutover, with extra devices in every wiring closet, and you'll save yourself time and steps if you have some extra ports ready in each location.

Remote offices/sites deserve this consideration as well. I know many places that install a modem on the core router at their remote sites. But, what happens when the router relies on a TACACS or RADIUS server at that location, and the problem is not the router? You can dial in, get a prompt, but you can't authenticate, so you can't get into the router, or the authentication server. If you had a small console server there, and the modem allowed you into the console server, you would have a better chance of getting the access you need. (Even if you only see errors, you'd understand how to resolve the problems later...maybe the authentication server is trying to log your attempt by resolving to a DNS host that it cannot reach (since the network to the main office is down)? You know you need a DNS host in that office, or at least some static entries in the local hosts table for that host.)

Trust me about this...adding console servers isn't going to make you lazy. It WILL save you time, which you can spend doing some of the other tasks on your plate. It is well worth the investment of time and money to set it up.

-Zonker-