Sunday, August 21, 2016

The Death of Veracity

Your vocabulary word for today, students, is "veracity". You may notice that Wikipedia re-directs it to the page for "honesty". But this post is about a computer, not about moral integrity.

How we came to name the machine "veracity" is a sentimental saga for another season. The soggy sadness of today is the fallout from Veracity (the machine) having silently succumbed to an as-yet unknown hardware failure.

In recent weeks I've gotten a number of alarms of the "you need to take backups" variety. We seem to have had a rash of machine failures. Veracity's demise is the second outage this week alone. (The other was "the day the WiFi died", which I'll describe in a separate post.)

This is a disaster recovery story, a D/R tale with a happy ending.

Veracity and Virtualization

The good news is that the systems hosted on Veracity appear to be intact. One of them, Zechariah by name, is our primary IPv6 gateway server. It is up again after its system disk and root disk were copied to another virtualization host. Whatever failed on Veracity, thankfully the disks were okay.
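
For flavor, the recovery pattern amounts to something like the following. (A sketch only: the hostnames, paths, and file names here are invented, and your storage layout will differ.)

    # Get the guest's disk images onto a surviving KVM host, however
    # they actually travel (here, over the network):
    rsync -av rescue:/var/lib/libvirt/images/zechariah-*.img \
        /var/lib/libvirt/images/
    # Define the guest from a saved domain description and boot it:
    virsh define zechariah.xml
    virsh start zechariah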

The other guest, Jeremiah, followed soon after. It acts as our "main" server (email, files, print, DNS, DHCP). But I had gotten lax about backups. The D/R plan for Jeremiah was that if it failed, we'd switch the IP address for "main" over from Jeremiah to Nehemiah. While Nehemiah lived, it held regular backups of Jeremiah's content, and we did switch between the two once or twice in those days.

This method of using a similar-but-not-identical system for failover goes back to before we had virtual machines on our little network. Where physical systems are involved, the historical plan for D/R is to have another system with the same or better capability standing ready to pick up the load. I was introduced to virtualization in 1982, but pervasive PC-think prevented me from applying tried-and-true V12N methods to personal systems. Bummer.

It began to dawn on me that we don't need a secondary server for a virtual system. All we really need is a copy of that system, a clone. Call it replication. Then when disaster strikes, bring up the clone on a designated alternate hypervisor: no moving around of IP addresses, no quirks from the subtle differences between the recovery system and the real system. A copy of a virtual machine is a perfect substitute because it's not actually a substitute. They're more identical than twins. 
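
A minimal sketch of what that replication might look like, run periodically from cron. Everything here (guest name, image path, standby host) is hypothetical:

    #!/bin/sh
    # Keep a bootable copy of the guest's disk image, plus its domain
    # definition, on a designated alternate hypervisor. Best taken
    # while the guest is quiesced or from a snapshot.
    GUEST=jeremiah
    IMAGE=/var/lib/libvirt/images/${GUEST}-root.img
    STANDBY=standby-hv.example.net

    rsync -a --inplace "$IMAGE" "${STANDBY}:${IMAGE}"
    virsh dumpxml "$GUEST" | ssh "$STANDBY" "cat > /tmp/${GUEST}.xml"
    # When disaster strikes: 'virsh define' and 'virsh start' the
    # clone on the standby. No IP addresses move anywhere.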

Replication and Recovery

Zechariah and Jeremiah are in better shape now than they were before the mishap. The host they were moved to runs KVM; previously they were Xen guests. I'm not complaining about Xen, but the change forced me to make some adjustments that had been put off, things that needed doing anyway. The two guests were already configured to share the O/S (another fun rabbit trail, maybe another blog post): they share a common system disk image, now corrected for KVM, while each has its own root filesystem. Once the KVM changes were done for one, the other instantly got the same benefit.
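
Schematically, the shared-disk arrangement looks like this under libvirt. (Illustrative only; the names and paths are made up.)

    # Each guest attaches the common system image read-only plus its
    # own writable root image:
    virt-install --import --name jeremiah --ram 1024 \
        --disk path=/vmstore/common-sys.img,readonly=on \
        --disk path=/vmstore/jeremiah-root.img \
        --network bridge=br0 --noautoconsole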

I had more success recovering Zechariah and Jeremiah than with this blog post. (Don't get me started about how the Blogger app likes to lose draft updates.) 

NORD to the Rescue

As it happened, I had a newer kernel for NORD than the one in the SUSE release Jeremiah and Zechariah run. As it also happened, I already had a KVM-bootable system disk. So I copied NORD's kernel to the SUSE system disk and re-stamped the bootstrap. Generally, one simply brings in the kernel and its related modules. As long as they are at least as new as, and within the same generation as, what they replace, the userland parts should have few problems. Works.
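
A rough sketch of the graft, assuming a loopback-mountable system disk. (The image name, mount point, and kernel version string are all invented.)

    mount -o loop suse-sysdisk.img /mnt/sys
    cp /boot/vmlinuz-4.4.14-nord /mnt/sys/boot/
    cp -a /lib/modules/4.4.14-nord /mnt/sys/lib/modules/
    # "Re-stamping the bootstrap" means updating the loader's config
    # and/or boot record; the exact command depends on the loader
    # (GRUB, extlinux, zipl, ...).
    umount /mnt/sys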

Note: this is a good reason to sym-link /lib/modules to /boot/modules and commit to a replaceable /boot volume. It's a feature of NORD but trivial with any Linux distro.
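
One way to set that up, on a quiet system (a sketch, not gospel):

    mkdir -p /boot/modules
    cp -a /lib/modules/. /boot/modules/
    rm -rf /lib/modules
    ln -s /boot/modules /lib/modules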

KVM performance sucks relative to that of Xen. Any time you can use para-virtualization (instead of full hardware simulation) you're going to see better performance. Xen was originally para-virt-only and continues to strongly support para-virtualization. But we're using KVM for the sake of manageability. The guests can run with no knowledge of the hypervisor. (We can always switch to para-virt later, selectively per device.) And these guests aren't doing heavy multi-media work. Performance is sufficient. Presence is paramount.
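
When the time comes, the per-device switch is just an edit to the domain definition. (A comment-sketch; the guest also needs virtio drivers.)

    # Flip a disk (or NIC) to para-virtualized I/O: in the domain XML,
    # the disk target changes from bus='ide' (device hdX) to
    # bus='virtio' (device vdX). Edit, then restart the guest:
    virsh edit jeremiah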

You can see Zechariah for yourself. (You'll need IPv6 to reach it.) The web content is presently unchanged, demonstrating the effect of a replicated virtual machine. An update with part of this story will be forthcoming. Jeremiah's connectivity is more controlled, not generally reachable from outside.

-- R; <><



Friday, July 1, 2016

misdirected redirection

 ... or HTTPS everywhere getting us nowhere


I'm a long-time believer in crypto. (Look me up; I'm in the Web-of-Trust.) I've done SSL stack development (in assembler). My day job is helping customers integrate field-level encryption. And I look forward to a safer, more secure, heavily crypto-laden internet.
But I'm concerned about the rush to HTTPS everywhere.


Okay, "getting us nowhere" is an exaggeration, but it makes for a decent sound-alike to English speakers.


To be specific, there are cases where plain HTTP makes more sense. Sometimes it's just for performance's sake. Don't presume that every object must be fetched via SSL/TLS. Some content is actually not a security risk. I could ask "Why would you burden not-at-risk clear traffic with encryption?". But I won't ask. To ask would be rhetorical, and that would be ASSuming things that might not be correct.


And then there's automation. For better or worse, our world runs on automation, much of which is not encrypted. Not now anyway. Much not now; some not ever. And unencrypted operation is not inherently evil.


Rule number one: don't break stuff.


Instead of rhetorically asking or assuming, I'll tell you: think about what should and should not be protected with TLS/SSL. Don't blindly bloat our treasured traffic with careless crypto. Think. Be selective about which sessions and services actually need the extra work. (And it will be extra work, and it won't be your burden alone. Choices you make affect other people, always.)


Okay, "don't break stuff" is trumped by security (and by bugs). But hear my point that getting it right is hard work. Blanket solutions aren't solutions, and solving even the most urgent problems by wanton breakage is to follow one problem with another.


It's as if someone (make that plural, many someones) asked (rhetorically), "Why would you not encrypt everything?". The question ASSumes that there's no good reason to have cleartext on the net. But there are good reasons. I'll cite only one, because I'm tired, presently annoyed, and generally cranky of late. First, though, here's a classic of the genre: "Why would you ever need more than 640K?".


Important note: That's not how the quote goes and Bill Gates never actually said such a thing. (Someone did say something once to John Sculley about floppies being all Apple would ever need in response to a question about Macintosh networking. Wanna guess who?) Please hear my second point that there is no one-size-fits-all for software, or for hardware, or for clothing.

My case is the chicken-and-egg situation of trying to build OpenSSL from source. One must download the source before one can build it. How does that happen? Easy: use 'curl' or 'wget' or some similar tool, point it at openssl.org, and get the tarball. Explode the downloaded tarball and follow the standard recipe. But if you don't have SSL working, then you can't use HTTPS.
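
The recipe, roughly. (The version number here is illustrative; check the site for the current one.)

    wget http://www.openssl.org/source/openssl-1.0.2h.tar.gz
    tar xzf openssl-1.0.2h.tar.gz
    cd openssl-1.0.2h
    ./config --prefix=/usr/local
    make && make test && make install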

So far my case doesn't sound like a problem. Here's the problem. Some bright "HTTPS everywhere" aficionado decided that nobody should be using HTTP. When you hit openssl.org via HTTP you (now) get re-directed to HTTPS. If you're downloading OpenSSL for the sake of implementing OpenSSL ... and you don't already have some kind of SSL or TLS ... this is a difficult situation.

They broke stuff. They broke my stuff. Now it's personal.

Building systems from source is important. Or I don't know, maybe it's not important. (Seems like it's important to some people, but they're getting hard to find.) I build from source for several reasons: partly because I'm a control freak, partly because I'm a tinkerer, and partly because I don't trust systems built by other people. Trust ... it's always about trust. (Systems built by other people: I do use them, but kind of like Google. I use Google but I don't trust them. Been saying that for several years now. And look, here I am blogging on a Google property.)

The automation in question actually has SSL. The problem is that it has an embryonic infrastructure with an empty PKI trust store. This is not to say that it doesn't have a solid trust chain. It just doesn't (at the point of fetching OpenSSL source) have a cache of root certificates for the World Wide Web. So when we hit a site like openssl.org (via HTTPS) the server certificate fails to verify. (Plain HTTP is fine, was fine, until mister "my solution works for everyone" did the re-direct re-design on their site.)
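
There are bootstrap workarounds, none of them pretty. (A sketch; the CA file path is hypothetical, and if you skip verification you'd better check the tarball's published hash afterward.)

    # Skip certificate verification for this one fetch:
    wget --no-check-certificate https://www.openssl.org/source/openssl-1.0.2h.tar.gz
    # ... or hand the tool the needed CA certificate out-of-band:
    curl --cacert /bootstrap/web-ca.pem -O \
        https://www.openssl.org/source/openssl-1.0.2h.tar.gz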

Gimme back HTTP!

It's the re-direction that's the problem.
I said HTTP because I meant HTTP. The protocol has 301 (moved permanently) and 302 (temporary redirect). Oy vey, another great feature now fallen victim to abuse by ill-conceived implementation. (There's a long and growing list of those.) The files didn't move. (Would be nice if some people used 301 to replace 404, ya think?)
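
You can watch the re-direction happen for yourself; -I asks for headers only:

    # A 301 or 302 response carries a Location: header naming the
    # new URL; well-behaved clients follow it.
    curl -sI http://www.openssl.org/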

This is the second time in recent months that I've run up against someone having disabled a perfectly reasonable function because they knew better than the rest of us. I guess we gotta kill stupidity one bad idea at a time.

Let's encrypt. Let's encrypt widely. Let's encrypt carefully.


-- R; <><



Tuesday, June 28, 2016

Off-site Backup

I've been pressed for time. Scheduling one thing to finish when another is due to start has become essential. On this particular evening, I was hungry, but needed to walk the dog. So I checked the time required for my pizza: 18 to 21 minutes in the conventional oven. Perfect! I could let this cook while exercising the canine.

As I left the house, it occurred to me that I was leaving the hot oven unattended: wife was visiting family, son was on campus, daughter was at work. The risk was small, but it's the kind of risk that we avoid. I was reminded of what happened to Bdale Garbee a year or three ago.

The Garbee family lost everything. Their home was consumed by the same Colorado fire the rest of us heard about on the nightly news. See the video. Bdale's account is enlightening.

For me, this was just a ten-minute mental exercise, but here's how it went. All that was most precious was off-premises: family in distributed locations, dog with me. What remained, except for a few heirlooms, was replaceable. But the data? What if we happened to lose the data? Maybe we depend too much on computers to hold our "data".

Until maybe three months ago, I did have off-site backup. Probably seems like overkill for a residential "data center", but I'm a hobbyist. We hobbyists do things for fun that others do only for pay. Dad was more than happy to let me park a surplus desktop-turned-server at his house. A little Linux, a shiny new SATA drive, some SixXS to avoid NAT, and a touch of rsync. Voilà! Instant off-site backup.
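
The flavor of the rig, not the rig itself. (Hostname and paths are invented; the box was reachable over IPv6 courtesy of SixXS, so no NAT gymnastics.)

    #!/bin/sh
    # Nightly, from cron: push the filesystems that matter to the
    # off-site box over SSH.
    OFFSITE=offsite.example.net
    for FS in /home /etc /srv ; do
        rsync -az --delete "${FS}/" "${OFFSITE}:/backup${FS}/"
    done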

That was when he had a house. Now he's in an apartment. The facility doesn't provide wired internet; everything is WiFi. And the hack I had rigged to piggy-back that server off of his Windows box began to fail. (First to go was an old 8-port hub in the Rube Goldberg scheme I concocted.) I retrieved the machine several weeks before my lonesome pizza and puppy party.

Surplus hardware is great! You get extended life (from an investment someone made, if not you yourself) and you get low-cost service for all kinds of things. Here I had Fedora with LVM and a decent-sized platter stack. It was more than just remote storage; it was also a remote point-of-presence. (That helps for those when-not-if times when something funky is happening with the web. And Netflix can just chill, because the bandwidth is way too low for regional masking.)

The point of the post: think about a surplus box built to your own specifications as a means to have your own off-site backup or similar service.

But my off-site machine, and its spinning rust, was back home now. Any catastrophe that might wipe out my primary systems would do just as much to my spare. Scary!

When pooch and I got back from our walk, the house was not in flames. There was no smoke nor out-of-control cookery. Instead I was greeted by the aroma of a nearly finished Red Baron Supreme with thin crust. Yesss!! But I am reminded that I need to get serious about the remote box (or maybe two?) and re-deploy real soon now.

-- R; <><