16 November, 2015

A DDoS Attack Can Really Ruin Your Day

I hope I don't have to go through too many more days like this past Saturday, 14-Nov.  The insidious thing about DDoS attacks is that there is little that can be done about them, beyond a few things.  Ordinarily (with a plain DoS attack) you identify a host or netblock which is sending the nuisance traffic and you add some rules in your router to drop those packets.  But add that extra "D", and it's like being surrounded by a swarm of bees: you can't possibly swat them all dead.

The few things which can be done (which I know about) are:
  • direct traffic somewhere else.  There are companies which specialize in this sort of mitigation, and sometimes sink/absorb multiple gigabits per second of spurious traffic.  Although I've never approached anyone who does this, I'm guessing that doesn't come cheap.
  • insert some sort of rules to limit the rate at which these packets are handled.  This still diminishes service, but hopefully it's not shut down completely.  At least the processing is occurring at the packet level, and it's not an application wasting time handling meaningless connections.  Also the kernel doesn't waste time maintaining socket state and such.
  • coordinate with your upstream Internet provider to block or reduce the rate of ingress of these packets.  This is more useful than acting on one's own due to the limits of your upstream link...i.e., the attackers are simply filling your pipe with useless bits.  Considering my link is what you might call residential-class service, this is impractical.  (Besides, shhhhhhhhhhh!  their ToS say I'm not supposed to be running a server.  But let's think about this a second.  If they were truly serious about this, all they would have to do is block the well-known proto/port used, just like they do with outbound TCP/25.)
  • shut off service entirely, and hope that the attacker botnet eventually "loses interest" in your host, or gets shut down by others' actions.
This last is the strategy I used yesterday.  I know it's much less than ideal and is not sustainable.  From what I can tell in the logs, it began about 0130 local (US Eastern) time, and lasted for around 15 hours or so.  I was merrily puttering around (I think Web browsing) when I noticed the Internet NIC activity LED on my router was flashing an awful lot, accompanied by a lot of disk seeking sound (which would have been syslogd(8) writing out maillog).

The first thing I did of course was log onto the (Linux) router and do a tcpdump of the "WIC" (wide area network (WAN) interface card), and discovered a lot of TCP/25 traffic.  So I pulled up the maillog, and discovered a lot of instances of "host such-and-such did not issue MAIL/EXPN/VRFY/ETRN" and messages about refusing connections because the configured number of (MTA) children had been reached (which happens to be 7 in my case).  By far my biggest concern was that the attackers would send something which trips up my MTA and makes it spam other people, possibly getting me knocked off my ISP.  So I did what I usually do: looked at a representative sample of recent maillog entries and added some iptables(8) rules to DROP traffic coming from the hosts (or maybe netblocks) which were causing the "did not issue MAIL/EXPN/VRFY/ETRN" entries to be generated (basically, connecting and doing a whole lot of nothing, just tying up a socket and pushing me over my child process limit).

There came a point where I realized I needed more than just a line or three of rules, so I decided to grep(1) for "did not issue" in the logs, take the latest 150 such entries with tail(1), extract the IP addresses with sed(1), and use sort(1) to get a unique set of them.  As I recall, I came up with 30 or so addresses, and used a for loop to append DROPs of them to the PREROUTING chain in the mangle table.  (The idea is to cut off traffic as soon as the router takes packets in, hence a PREROUTING chain.)  Unfortunately, it became apparent that a handful of addresses wasn't going to be effective, because the log messages kept on a-comin'.  So I decided maybe a little PTR record and whois(1) investigation (to block entire netblocks instead of individual addresses) was in order.  A disturbing trend started to emerge.  The IP addresses purportedly belonged mostly to Russia and other Eastern Bloc countries, which, at least to me, are notorious for being the origin of a lot of DDoS attacks and spam.
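
A sketch of that pipeline and loop, with the log path and the sed expression as illustrative stand-ins rather than exactly what I typed (this assumes the peer's address appears in square brackets in each log line):

# pull the unique offender addresses out of the last 150 "did not issue" entries
grep 'did not issue' /var/log/maillog | tail -n 150 \
    | sed -e 's/.*\[\([0-9.]*\)\].*/\1/' | sort -u > /tmp/offenders

# drop their traffic as early as possible, in the mangle table's PREROUTING chain
for a in $(cat /tmp/offenders); do
    iptables -t mangle -A PREROUTING -s "$a" -j DROP
done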


I really did not want to shut off services entirely, but I saw little choice.  I did not want my MTA compromised to be yet another source of spam.  I put in an iptables(8) rule which dropped all TCP/25 traffic arriving over the WIC.  I noticed this did not stop the popping of messages into maillog, and realized the attackers were also attempting connections to TCP/587 and TCP/465, so I put in rules for those too.  For the moment, IPv6 is, in a sense, only in its infancy (for being a twenty or so year-old infant!), so there was no apparent reason to add any ip6tables(8) rules.  And in fact, email still flowed in over IPv6, most notably from Google (thanks for being a leader in the adoption of IPv6!).
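
Condensed, the shutoff amounted to something like this (the interface name is a placeholder for my actual WAN interface):

# refuse all inbound mail submission arriving over the WAN interface
iptables -t mangle -A PREROUTING -i eth1 -p tcp -m multiport --dports 25,587,465 -j DROP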

It was at this point I was very glad that (according to Wikipedia) Dana Valerie Lank, D. Green, Paul Vixie, Meng Weng Wong, and so many others collaborated on initiating the SPF standard.  I began to think, from whom was it particularly critical that I receive email?  I could go to those domains' SPF records and insert ACCEPT rules for those addresses or netblocks.  Google was already "covered," as noted above about IPv6.  Oddly enough, the first domain I thought of was aol.com, because my volleyball team captain might want to tell me something about the upcoming match on Monday.  The SPF for them looked gnarly, with some include: directives.  I settled for looking at Received: headers in previous emails from Matt and identifying the netblock AOL had been using previously (turns out it could be summarized with a single /24).  Next I thought of Discover, then of First Niagara, then Verizon (in case they informed me of impending ejection from their network for violation of their ToS).  I also thought that although it's not all that critical, I receive an awful lot of email from nyalert.gov, especially considering we had that extended rain and wind storm, and the extension of small craft advisories and so on.  All in all, I made exceptions for a handful of mailers.
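
The mechanics are simple enough; sketched with a documentation netblock standing in for AOL's real one, it's a lookup followed by poking an ACCEPT hole ahead of the blanket DROPs:

# see which netblocks a domain says it sends mail from (SPF lives in TXT records)
dig +short txt aol.com

# then let that sender through ahead of the SMTP DROP rules (netblock shown is a placeholder)
iptables -t mangle -I PREROUTING 1 -i eth1 -p tcp --dport 25 -s 198.51.100.0/24 -j ACCEPT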

Then to evaluate the extent of the problem, I used watch(1) to list the mangle PREROUTING table every 60 seconds, to see the packet counts on the DROP rules.  I'd say they averaged deltas of around 20, or one attempt every three seconds, and the peak I saw once was 51, or nearly an attempt per second.  I know as DDoS attacks go, this is extremely mild.  If it were a seriously large botnet controlled by a determined entity, they could likely saturate my 25 Mbit/s link, making doing anything Internet-related extremely challenging or impossible.  Always in the back of my mind was that this was unsustainable in the long run; I hoped that the botnet was dumb enough to "think" that if not even ICMPs for connection refusal were coming back, it had achieved one possible objective, which was knocking my entire MTA host offline.
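
That monitoring was more or less just:

# list the mangle PREROUTING rules with packet/byte counters, refreshed every 60 seconds
watch -n 60 'iptables -t mangle -L PREROUTING -n -v'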

I then contemplated the work which would be involved in logging onto various Web sites (power, gas, DSLReports/BroadbandReports, First Niagara, Amazon, Newegg, and on and on) and updating my email address to a GMail one.  I also started searching for email hosting providers who would basically handle this entire variety of mess on my behalf, and forward email for me, whereby I could block all IP addresses except for those of said provider.  Or maybe I could rejuvenate fetchmail(1) or similar to go get my email from them over IMAP and reinject it locally, as I used to do decades ago with my dialup ISP's email.  To my amazement at the low prices, it looks as if, for example, ZoneEdit will handle email forwarding for only on the order of $1/month/domain (so $3 or maybe $4 in my case, because there is philipps.us, joe.philipps.us, philippsfamily.org, and philipps-family.org).  This is in contrast (as far as I know) to Google's $5/month/user, and I have a ton of custom addresses (which might count as separate "users").  (Basically, it's one of the perks of having your own domain and your own email server.  Every single sender gets a separate address, and if they turn into a not-so-savory sender, their specific address ceases to work.)  The search terms needed some tweaking, because trying things like "mail exchangers" (thinking in terms of MX records) turned up lots of hits for Microsoft Exchange hosting.

A friend of mine runs a small Internet services business and has some Linux VPSes which I can leverage.  He already lets me run DNS and a little "snorkel" where I can send email through a GRE tunnel, and not appear to be sending email from a "residential-class IP address."  So I called him up and thankfully he answered right away.  I got his permission to use one of the IPv4 addresses (normally used for Apache/mod_ssl) for inbound email, in an attempt to see whether these attackers are more interested in my machine (the specific IP(v4) address) or my domain(s).  If I add an additional address to accept email, and the attacks do not migrate to that address, I then know that it's far more likely that this botnet came across my address at random or by scanning the Verizon address space.  So, I picked an address on one VPS, added a NAT rule to hit my end of the GRE tunnel, had to debug a routing table issue, redid the MTA configuration to listen on the tunnel's address, all in all about an hour and a half's worth of work to implement and do preliminary testing.
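
Conceptually, the forwarding rule on the VPS end is along these lines (both addresses here are placeholders, not the real ones):

# on the VPS: steer inbound SMTP aimed at the spare public address through the GRE tunnel,
# toward my end of it, where the MTA now also listens
iptables -t nat -A PREROUTING -d 198.51.100.80 -p tcp --dport 25 -j DNAT --to-destination 10.99.99.2:25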

It was at this time I realized in my watch window that the packet counts were no longer increasing, even over several samples (minutes).  Even after I added the A record to supplement the one for my Verizon address, I noticed there was basically no activity in maillog.  So as best as I can tell, it was as I suspected: the botnet was really likely only interested in my specific address/host.  And thankfully it "saw" the lack of any TCP response as an indicator that the host had gone offline, and ceased directing any resources to the attack on me.  I hate to give these worms any ideas, but they could also try other things, like ICMP echo, to determine whether a host is still alive.  Then again, if your sole objective is compromising an MTA, maybe that doesn't matter.

Eventually, I inserted a rule to accept TCP/25 traffic again, thinking that if attacks resumed, I could easily kick that rule out and spring the shutoff trap again.  Or even better, I could replace it with a rule including the limit match, so only something like 10 or 15 connection attempts could be made per minute.  At least the MTA would not be wasting time/CPU on these useless connections, and the kernel would not have to keep track of state beyond the token bucket filter.  I almost hit the panic button when I saw some more "did not issue" messages, as well as a little later a notice of refusing connections because of the child limit.  But I reasoned, just wait a few minutes, and see if it's persistent.  However, I had a lot of anxiety that it was not over yet, and that it was the domain and not the host the attackers wanted compromised, because some of that unwanted activity came through the newly set up IP address.

In retrospect, I question whether I want to continue doing this myself.

  • It's against the ISP ToS, and they could blast me off their service at any time.  I'd likely have to pay their early termination fee and I'd have to go slinking back to Time Warner (and their slow upstream speed).
  • What happens the next time this occurs?  Will it be more persistent?
  • What if a future attack is against the domain, and any IP address which I use gets similarly bombarded?
  • What if next time it's not around an attempt per second but instead hundreds or more attempts per second?
  • I should have traced traffic on my GRE "snorkel" tunnel to see if they managed to compromise my MTA, and were actually sending email surreptitiously.
At least the experience has uncovered flaws in my router's configuration, and I'll be more ready to switch to using the VPS to receive email.  And I'll have some ideas about hosting companies if I want to migrate operations there instead.

UPDATE Mon., 16-Nov: Some more folks are at it again.  What I've decided to do for now is to put in some iptables(8) rules which use the limit match to allow only a certain number of connections per minute.  So far this has been fairly effective at keeping the chaos to a minimum, but it's not ideal.  I really think I'm going to have to look at an email hosting service, at least until I can put up a more robust email server, or maybe permanently.
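
The rate-limiting pair of rules amounts to something like this (interface name and numbers are representative, not my exact values):

# let a modest trickle of new SMTP connections through, and drop the rest at the door
iptables -t mangle -A PREROUTING -i eth1 -p tcp --dport 25 --syn -m limit --limit 15/minute --limit-burst 5 -j ACCEPT
iptables -t mangle -A PREROUTING -i eth1 -p tcp --dport 25 --syn -j DROP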


Direct all comments to Google+, preferably under the post about this blog entry.

English is a difficult enough language to interpret correctly when its rules are followed, let alone when the speaker or writer chooses not to follow those rules.

"Jeopardy!" replies and randomcaps really suck!

Please join one of the fastest growing social networks, Google+!

16 August, 2015

It's Been Interesting Times This Morning With Some Severe Thunderstorms

15-Aug-2015:

(dated here at the top in case I don't finish this by midnight...all the "this morning" or similar references refer to the 15th.)

The supposedly Chinese saying goes, "may you live in interesting times." This morning I would say qualifies, although not in the curse sense.  It was just a weird slice of life with some drama for me.  As I and Garrison Keillor have said many times, it could be worse, and I got away comparatively really well...this time.

What an early morning.  I had slept for a couple of hours, when for unknown reasons I awoke to heavy rain and a thunderstorm at about 1:45 EDT or so.  A few minutes after listening to this, there was a REALLY CLOSE lightning strike, no discernible delay between the really bright flash and a sound which seemed like it would be loud enough to at least crack if not shatter a window (but all mine seem OK).  Thinking more of this was going to continue, making it next to impossible to get back to sleep with all this thunder going on, and given that I like watching lightning, I decided to stay up.  It was quite a light show.

As I was wandering over near my kitchen sink, I heard the sound of water pouring at an incredible rate.  I knew what this was; I have experienced times before when water was coming into the drain tiling so fast that it made that definite sound of water pouring into other water, such as pouring water out of a pail into a swimming pool.  However, this was different.  It seemed more intense, maybe much more intense, than I had previously experienced.  That underscored just how hard it was raining outside, and that it had been doing that for long enough to soak through 2 to 3 meters of earth.

It occurred to me this could be quite a problem.  One day a few years ago, I was standing in my basement at my workbench during a thunderstorm when we also had a close lightning strike.  This was a bit further away back then because there was a perceptible, although short, delay between flash and thunder boom.  But close to the time of the flash, I also heard the sound of clicking, which I quickly deduced was the sound of (more than one) circuit breaker tripping.  I was also quite concerned that day because I smelled that same smell when you've built up quite a bit of friction between blade and wood when sawing, that smell of wood slightly roasting.  The thought was, "something got hot enough to do that to one of my floor joists?  ouch!"  I never did find any evidence of electrical fault or roasted wood.

So that previous experience got me to thinking, what if the same thing happened in this really close lightning strike, but this time to the breaker protecting/controlling my sump pump?  I went into my basement post-haste to look.  I could hear the sump pump motor humming away, so no, at least that breaker was OK.  Nonetheless, I went to the load center to look at all the breaker levers.  The RCD (GFCI) breaker for the garage was tripped, so I reset that.  All the others looked to be in the ON position, so I was fortunate there.

I thought I had fixed the concrete block mortar so that water would not leak in.  I was proven wrong this morning.  The Susan Lane Basement Creek was flowing again, as usual from south wall to sump at the north end.  When I had first noticed it this morning, it was not even to my workshop wall yet, but almost.  Judging how water was falling out of the sky to beat the band, I figured it had to make it to the sump, eventually.  Ever since the first incident after moving in, I have been careful not to leave anything on the basement floor which did not have the tolerance for a couple of millimeters of water...so plastic bases on computers, glides on the legs of wooden tables, and so on.  The poor workshop wall still suffers, but oh, well; this sort of thing hasn't happened in a long, long time so it's not really worth trying to mitigate.

But that seemed the least of worries at the moment.  My attention turned to the sump, and the heretofore unseen volume per unit time entering through the drain tile outlets.  It had to be on the order of a liter per second.  After a pumping out cycle, it had to be somewhere around only 10 or at most 15 seconds to fill it to the pump trip point again.  And during the pumping cycle, it really seemed like the pumping rate was not all that much faster than the filling rate.  I began to think flooding was a real possibility, so I started "pulling for" the pump.  Then it occurred to me, I would be in some exceptionally deep doo-doo if, due to the thunderstorm, we lost mains power.  That would mean the sump would overflow, and who knows if the floor drain about 4 or 5 meters to the east could handle the requisite rate?  I also started contemplating what, if anything, was subject to water damage between those two points (turns out, not a lot).  I had to have been nervously standing there for at least 20 minutes.  I sort of assured myself after that much observation that it was rather unlikely the flow rate would increase significantly enough to overwhelm the pump.  That was somewhat diminished by the thought that even if the rain stopped immediately, there were still many, many liters remaining in the soil which would make their way into my sump.

As I was ascending the staircase out of the basement, I noticed (thanks to my number of electroluminescent night lights, which make much of my house glow a dull green) that the power was interrupted for a few hundred milliseconds.  As I passed through the kitchen, I noticed the microwave had held its time, but the oven had not.  That's weird; usually it's either so brief they both hold or long enough that they both lose it.  It's very rare indeed that one holds but the other loses.  I knew that since at least one lost it, it was likely at least one, probably several, of my computers had just been forcibly rebooted.  I only have so much capacity in my UPS, so I have decided only the barest of essentials...which includes the router, the managed switch, the video monitor, and little else...will be connected to it, not everything.

Sure enough, as I got to the systems room, the screen was showing the RAM test for sal9000 just ending, and beginning to execute the Adaptec SCSI host adapter POST.  I watched it to see that it wouldn't hang on, say, a USB HDD (I have yet to figure out why a USB drive left plugged in, even one with GRUB installed onto it, freezes POST).  So at least this one is alive and well.  Next my attention turned to my MythTV, because the boot was hanging at the point where sal9000 was supposed to NFS mount some shares from it.  Ruh-roh.

That was a whole lot less hopeful and more worrisome.  That is the one system I access as a general purpose computer the least, so I have it set up not to have screen blanking.  One of its purposes is to be able to switch to it to warm up the CRT (this one takes on the order of 15 seconds to show a picture).  So it was quite disconcerting when the monitor would not lock onto a picture, and its LED went amber indicating power saving mode.  Tap, tap on some keys to try to wake it up; there was no response.  OK, fair enough, I'll just activate an ACPI shutdown by pressing the power button.   Ummm...ummmm....ummmm.....that does nothing.  OK, so it's very much not as preferred, but I'll hold in the power button to make the ATX PSU shut down hard.  Ummmmm....how long have I been holding it?   1...2...3...4...5...6...7...uhhhhh, shouldn't it have hard shut down by now?  Isn't that supposed to be 5 seconds?  At this point, I'm thinking my poor MythTV has had it, I'll probably never see anything through it again, my (yearly) SchedulesDirect subscription renewal in June will have gone mostly for naught.  Hmmm....what to do...I went around to the back, yanked the power cord from the PSU, waited for the Ethernet link lights to go dark, and reinserted it.  I went back around the desk to the front, and with much hope pressed the power button.  Yay!  The power LED came on.  The CRT warmed up, and I saw RAM being tested/counted.  512 MB OK; this is good.  It booted...until the part with the filesystem check (fsck).  Well, I knew this was going to take the better part of 10 minutes or so; this is "normal" PATA, on a 1GHz Pentium /// machine, 400 or so gigabytes-worth.  So I turned my attention to the rest of the network.

Rootin, being on the UPS, seemed just fine.  In fact, it looked to be getting out to both the IPv4 and IPv6 Internet just fine.  By that time, the ONT had plenty of time to reboot, or had not been taken down at all.

The next thing to cross my mind was the rain gauge in my back yard.  Had we had so much rain it had overflowed?  It was roughly 0330 hrs by now.  I put on some boots and a raincoat and went out to look.  Even in the dim light of night I could see about a third of my yard and half of southern neighbor Megan's yard had water visibly ponding.  The rain gauge capacity is 110 mm, give or take.  It was at 96.2, of course using the scientific method of estimating the last digit between the markings.

Not too long after that, I saw what looked like a truck from our volunteer fire department (not an engine, something a lot more regular and rectangular) go by on Huth Road, the street which connects to my dead end street.  I thought that was kind of odd, and wondered what they were doing.  The best I could guess was they were looking around the neighborhood for a fire started by the lightning strike.  A few minutes later, I watched them slowly go down my street too, southbound.  I went out to my street to try to get a better look at what they were doing; they were maybe 175 m away by that time (my street I believe is about 400 m long).  It's then I got the idea that I wanted to see my sump discharge, as I knew it would likely be doing so about every 10 to 15 seconds.

I noticed then that the grate cap was nowhere in sight.  I walked down the street a little ways, figuring it floated away (it's made of plastic).  Darn.  I should probably go to Home Depot later today for a replacement, I thought.  Aw, heck, it's got to be somewhere on the street or close to it, right?  So I went back into my house and got a flashlight.  As I returned to the outside, I noticed Fire Rescue 7 had turned around, and now was going, much faster this time, north.  Whatever they were  looking for, either they found it, or decided it wasn't on Susan Lane (or Hemenway).

I kept looking.  I thought it was unlikely for the cap to disappear.  It wasn't small enough to fit down the storm drains.  It didn't lodge against the tires of the cars parked in the street.  Eventually, about 10 minutes later, I found it...maybe 120 m "downstream," and a meter and a half away from the side of the street (we don't have curbs).  I thought, score! don't have to go to Home Depot anymore!

When I got back to the house, after I put the cap back on, I thought, maybe the NWS would be interested in how much rain I measured.  I checked the latest chart of the last 72 hours of measurements, and they had recorded roughly 60 mm since midnight.  Even though they are on the order of only 5 km away, there still could be significant differences in things like rainfall.  So I Tweeted to them (note: this requires JavaScript to be allowed from Twitter to view completely):

Apparently, they liked this very much:


I was displeased that they "imperialized" it instead of leaving it in SI units, but what the fsck, I realize they're publishing to a primarily US audience, and the US doesn't deal with SI very well at all.

Things seemed to be calming down.  The downpour had reduced to just drizzle, my sump pump was undoubtedly keeping up with the (reduced) inflow, the Internet was accessible, and all my systems seemed to have come away from this unharmed...or had they?

Every once in a while, I'll go down my list of systems in the ConnectBot SSH/Telnet client app on my Nexus 7 and log into each one, just so the date/time last accessed will say less than a day.  Maybe I should have just put the tablet down and gone to sleep.  But I hit the "sal9000 con" entry, one that establishes a "telnet" session to my Chase IOLAN serial terminal server on TCP port 5003.  (The port number I chose is a holdover from me working with Bay Networks' Annex 3 terminal servers, whose port numbers are not configurable; they're 5000 + serial port number.)  This in turn connects to serial port 3, which is connected to the ttyS0 port on sal9000.  And...it wouldn't even handshake.  I tried the rootin entry (port 2/5002).  Similarly, no dice.  Sigh...all was not well with the computing world.  So I went and traced the power cord, yanked the 120VAC from the IOLAN PSU, replugged it, saw the LAN lights going nuts as it TFTP downloaded updated firmware from rootin, and again a few seconds later with the updated configuration.  (The firmware already in the unit would for the most part operate just fine, but I wanted the latest available in use.  Similarly, it has nonvolatile storage for the config, but I figured what the heck, it's good to have a live backup on a server too.)  So eventually, sometime after 0400, yes, things as far as I could tell were right with the computing and networking world.

I stayed up longer, hitting refresh on the 72 hour history page, to get the 3:54 precip figure.  It wasn't posted until about 4:15.  It added some millimeters, but the total still didn't get up to my measurement by a fair margin.  I had a snack and some orange drink, and finally settled down to "Coast to Coast AM" with David Schrader on sleep timer.


Direct all comments to Google+, preferably under the post about this blog entry.

English is a difficult enough language to interpret correctly when its rules are followed, let alone when the speaker or writer chooses not to follow those rules.

"Jeopardy!" replies and randomcaps really suck!

Please join one of the fastest growing social networks, Google+!

03 June, 2015

Another (Couple of) Days in the IT Grind

I do indeed realize there are far worse things that could have happened to me, but the past couple of days have not been good.  I am a technologist, and as such, I get very uneasy and tense when the technology I own fails.

It started out during experimentation with installing Xubuntu on a Pentium II (P-II) 450 machine.  What I had completed earlier this week was to take apart a failed ATX-style power supply, unsolder all its output cables, take apart a Dell power supply (which has proprietary wiring of its 20P20C plug), desolder the proprietary plug, and solder in the standard 20P19C plug.  I don't care if I blow up this P-II-450 system, because it is one of the lowliest of capable systems I have here at home, and also a bit wonky at times.  So it was the natural target for testing of my power supply unit (PSU) cable transplant job.

It turns out that the wiring job went well: no magic smoke or sparks were released from either the PSU or the computer.  As just mentioned, it is a bit of a funky system, and with the transplant PSU, it seemed to want to boot off optical disk OK but not hard disk (HDD).  With another supply I have, it didn't seem to want to boot off optical (it got to a certain point in the Xubuntu 12.04 disc and rebooted), but the HDD seemed to operate, albeit with a bootloader installation error which I was trying to remedy (hence I needed both drives to operate well).  For whatever oddball reasons, a brand new PSU, less than one hour power-on time, finally operated the computer, HDD, and CD OK.  (Since then, the other PSUs seem to work too; I don't know what changed other than all the unplugging/plugging.)

The first weirdness was this ancient Intel mainboard complaining about not being able to read the SPD (serial presence detect) data of the RAM I put into it (it had 2 x 128M DIMMs, which I was replacing w/ 2 x 256M + 1 x 128M).  So I puttered around with Google and Intel's legacy support site, and managed to make up a BIOS update floppy.  After flashing, the SPD error did not go away, and (probably because of that) it will no longer "quick boot" (skip the RAM test), PLUS I haven't found a keystroke which will bypass the RAM test.  It checks somewhere around 10-20 MB per second, so it takes the better part of a minute before it will boot anything (CD or HDD).

After getting a working Xubuntu 12.04 set up, I tried doing the in-place 14.04 upgrade.  That worked kind of OK, except the X server would not run (the log showed it was segfaulting, SIGSEGV).  MmmmmKay, maybe something did not work well during the upgrade, so let's try a straight 14.04 installation (which had to be done with the minimal CD, because the full disc is more than one CD-R, so it must be USB or DVD).  This implies installing almost everything over the Internet.  This computer is so old it doesn't boot from USB, and I don't really have a spare DVD drive, so over the Internet it was.  Unfortunately, it had the same result: the Xorg server would not stay running.

On one reboot while trying some fixes, the boot just froze.  I found out this was due to not getting a DHCP address (network initialization).  So I arose from my basement to find my Internet router (a P-II 350) locked solid.  That fortunately rebooted OK.  That prompted me to get a rudimentary daemon going to drive a watchdog timer card I had installed a few months ago after my previous router went splat.

After getting home from my volleyball match last night, I found myself able to log onto the router, but off the Internet.  I rebooted, and I was back online.  I may have been able to get away with ifdown eth3 && ifup eth3, but I didn't think of it at the time.  I also reinstated the command to send an email when booting was complete.

I awoke this morning to see that sometime after 3am it had been rebooted, no doubt due to the watchdog timer card tagging the reset line.  In retrospect, this is when the system gets really busy reindexing all pathnames for mlocate.  I have since adjusted my daemon to call nice(-19) to give it the highest userland priority.
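
The daemon itself is barely more than a loop; a stripped-down sketch, assuming the card's driver exposes the usual /dev/watchdog node (the interval is also just an assumption), looks like this:

#!/bin/sh
# keep this loop getting scheduled even when the box is thrashing (e.g. mlocate reindexing)
renice -n -19 -p $$

# pat the watchdog comfortably inside the card's timeout
while true; do
    echo >/dev/watchdog
    sleep 10
done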

I had been watching the latest EEVBlog Fundamentals Friday on BJTs and MOSFETs when I tried to leave a comment.  And YouTube's little "loading the comments" icon would not disappear and show me the comments (and the blank for another one).  I found out the router was routing with whatever it had in RAM, but it was spewing oodles of disk access errors on the console.  Presumably it needed something on disk in order to complete DNS recursion or something.  I couldn't even log onto the router.  I just had to "let it ride."  It immediately made me very nervous, because so much I have relies on the functioning of that router: some public DNS zones, email, Google Voice VOIP, routing between my VLANs, DHCP, Hurricane Electric's 6in4 tunnel/radvd, and on and on.  The worst of it is that static IPv4 addressing is horrendously expensive (Verizon (for FiOS) charges $20/month more than a DHCP account), and while TWC's leases are a week, Verizon's are a short TWO HOURS.  So let's just say, there are a whole lot of little headaches strewn throughout the Internet which require attention when my IPv4 address changes.  So being inaccessible more than 2 hours could add insult to injury.

Needless to say, instead of looking forward to some YouTube watching and Google+ reading, the character of the day immediately changed radically.  It was "beat the clock" time, with finding a replacement computer to use for the router, installing sufficient RAM and HDD in it, and restoring something bootable from backups.  There was no easy way to see if dhclient was continuing to renew the lease for "my" IPv4 address (as it would be trying to write a renew notice to the syslog, which would likely be failing badly).  My nerves were frazzled, my hands shaking.  I kept on thinking, got to keep the right attitude, stay as calm as possible under the circumstances, just work the problem one step at a time as each step arises.

Thinking that I might have to replace the whole computer, I screwed a spare 20GB HDD into the computer.  Later in the process, I thought it better to at least try removing the current HDD and substituting one rewritten from backup (I thought, great, wasted time in getting back online).  So I booted an Ubuntu 12.04 Rescue Remix CD, laid out partitions, formatted, mounted them up into one neat tree under /mnt/rootin ("rootin" is the router's name), used rsync to copy from the backup USB disk onto this new disk (which took about 30 minutes), and ran grub-install to make the disk bootable.  On reboot, the kernel panicked because it could not find init.  Reading back in the kernel messages a little further, the root filesystem could not be mounted because that particular kernel could not handle the inode size chosen by the mke2fs on the Rescue Remix.  ARGHH!!  That was the better part of an hour basically wasted.

So I dug out the CD which I used to build the router initially, booting from it into rescue mode.  I used its mke2fs all over again (wiping out my restore).  Rebooted to the Rescue Remix, rsync, grub-install, reboot.  This time, it worked OK, at least in single user mode.  Things were looking up for the first time in two hours or so.
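
For future me, the working sequence boils down to roughly this (device names and paths are illustrative; the critical part was getting 128-byte inodes, either by using the old install CD's mke2fs as I did, or, I believe, by forcing them with -I 128 from a newer one):

# make a root filesystem the old kernel can actually mount
mke2fs -j -I 128 /dev/sda1

# mount the new root and the USB backup disk, then copy everything over
mount /dev/sda1 /mnt/rootin
mount /dev/sdb1 /mnt/backup
rsync -aH /mnt/backup/rootin/ /mnt/rootin/

# make the new disk bootable
grub-install --root-directory=/mnt/rootin /dev/sda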

To add to my already frazzled nerves during this, when trying to switch from one computer to another with my KVM switch, my CRT would only display black.  Humf.  Suspecting this was the KVM switch's fault because it had been operating so long, I switched it off then on...no effect.  For positions where I expected power-saving mode, the monitor's LED was orange, for those where I expected a picture, green, but still no picture on the tube.  Indeed I do have another CRT in the basement, but it's a 19" whereas the one which was wonky this morning is a 22", so quite a visual advantage.  Thankfully it wasn't totally dead, I powercycled the monitor, and it was better.

I decided I would give it one more try though.  I tried Ctrl-Alt-Del on the router's console.  That did not go well.  It immediately started spewing more disk access errors on the console (could not write this, bad sector read that, a technician's horror show).  As this kernel has neither APM nor ACPI support, hitting the power button brought it down HARD.  When turning it on, I expected POST would not even recognize the disk.  Surprisingly though, it booted OK.

But here are the things I'm thinking about as this incident winds down.  One, I wish I did not get so worked up about these technical failures.  For me, email would stop (but it would presumably queue up at its sources), and a bunch of conveniences would be inaccessible.  I can't seem to put it into perspective.  Wouldn't it be a lot more terrible if I were in parts of TX (floods) or Syria (ISIL)?  Two, at least now I have a disk which I can fairly easily put into the existing router should its disk decide to go splat for good (such as won't even pass POST).  Three, at least I have a fairly complete checklist for IPv4 address changes, I just have to calm down and execute each step.  Four, I have some new practical recovery experience for my environment.  In theory, that SHOULD help calm my nerves...but I can't seem to shake that feeling of dread when things go wrong.  I know it's no fun being in the middle of being down, but I wish I could calm down when this sort of thing happens.  Heck...I get nervous at the thought of just rebooting ANY of the computers in my environment.  I would guess what tweaks me the most is not knowing what the effort will be to restore normal operation.

I think what I really need is a complete migration plan as much away from in-home solutions as I can manage.  That way when stuff fails at home, there is less to lose.  But that's going to cost at least some money, for example for a VPS somewhere.  Sigh.


Direct all comments to Google+, preferably under the post about this blog entry.

English is a difficult enough language to interpret correctly when its rules are followed, let alone when the speaker or writer chooses not to follow those rules.

"Jeopardy!" replies and randomcaps really suck!

Please join one of the fastest growing social networks, Google+!

13 May, 2015

This is Why I Hate Being in IT Sometimes

I had multiple "WHAT THE??!!..." moments today.  I noticed that one of my internal hosts was attempting to connect to my router (which runs Sendmail) via its ULA (unique local address) and getting "connection refused."  All right...that should be a fairly easy optimization: just add another DAEMON_OPTIONS stanza to sendmail.mc, redo the M4 (with make(1)), restart the daemon, and Bob's your uncle.  WELLLLLLL....not quite.
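
For reference, the kind of stanza I mean, with a made-up ULA and daemon label (and assuming the usual /etc/mail layout for the rebuild):

# added to sendmail.mc (the address and the Name label are placeholders):
#   DAEMON_OPTIONS(`Port=smtp, Name=ulaMTA, Family=inet6, Addr=fd00:1234:5678::1')dnl
# then regenerate sendmail.cf...
cd /etc/mail && make
# ...and restart the Sendmail daemon however your init system does that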

After doing that, I got the following failure on startup:
NOQUEUE: SYSERR(root): opendaemonsocket: daemon mainv6MTA: cannot bind: Address already in use
Huh?  NOW WHAT?????!!!!  OKOKOK...so this should be simple, just back out the change. After all, I am using RCS for my sendmail.mc file, I should just check out (co) the revision of the file prior to adding the additional address to which to bind, run "make" for a new sendmail.cf, restart Sendmail.  NoNoNo...some days in IT, it's never that easy.  I still got the same error about the address being in use.  I used "netstat -tlnp" (TCP listeners, no DNS lookup (numeric output), and show the process associated with that socket) and saw there was nothing listening on TCP port 25...another "WHAT THE...!!!!" moment.

Believe me, this is one of the worst IT positions to be in, when backing out a change still results in a failure.  I even started going to my backups to fish out the previous sendmail.cf.  But then I thought, no, that's no help, it should be the same as that generated by the co of sendmail.mc; that's very unlikely to help any, so no need to keep plugging in the USB HDD.  But this is where the better IT people, hopefully myself counted among them, get the job done.  Just to get "back on the air," I went into sendmail.cf directly, put a hashmark before the DaemonOptions line with address family inet6, and restarted Sendmail.  Phew!  At least that worked; the daemon stayed running.

"OKOKOK...so what do I know?" I naturally asked myself.  The error is "address in use."  For Sendmail, what causes an address to be in use?  Well, that'd be any DaemonOptions lines in sendmail.cf, which are generated from DAEMON_OPTIONS lines in sendmail.mc.  So, next step, find all the non-commented-out DAEMON_OPTIONS lines in sendmail.mc (with grep) and go through them one by one to see if the same address shows up for more than one line.  Well...there was only one line, quite on purpose, whose daemon label is "mainv6MTA" (remember, from the error message), and that is for mail.philipps.us.  OKey dokey, so what is the address of mail.philipps.us?
host mail.philipps.us
(returns just my DHCP IPv4 address. Ummmm...)
host -t aaaa mail.philipps.us
mail.philipps.us has no AAAA record
This of course triggers another "WHAT THE...!!!!" moment.  How the frak did my AAAA record get deleted??  It turns out the actual reason was relatively simple.

At some time between when Sendmail had been started last and today, I did indeed manage to remove the AAAA record for "mail.philipps.us."  And now I know why.  As I have recently (well, back in 2015-Mar) changed ISPs, and therefore the IPv4 address I was using, I rejiggered the philipps.us zone file so that a.) I could make updates of the zone programmatic when the IPv4 address changes, and b.) the plain text zone file, with comments and such, is preserved (which precludes using a dynamic zone and updating with nsupdate).  I implemented this well after changing address space, so Sendmail continued to run just fine.  The implementation I chose was to $INCLUDE a separate zone file piece which is generated out of the DHCP client scripting using a template file (I get an address to use via DHCP).  Sure, there was a comment in the zone file that "mail" had been moved to an $INCLUDE file, but what I failed to realize at the time was that right below what I had commented out was an IPv6 address for "mail.philipps.us"!  For compactness and less typing, the owner name had been omitted on that line, as is common in zone files.

mail    300    IN    A    192.0.2.112
        300    IN    AAAA 2001:DB8:2001:2010::112

became
;;; moved to $INCLUDE file
;;; mail    300    IN    A    192.0.2.112
            300    IN    AAAA 2001:DB8:2001:2010::112

So while doing this refactoring, I removed the A line, because it would be "covered" in the $INCLUDEd file.  But of course, this "continuation" line took on the name of whatever was before the now-commented-out A record.  So I had effectively obliterated the AAAA record for mail.philipps.us.  Ouch.
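
The fix, of course, is just to give the surviving AAAA line its owner name back (or move it into the $INCLUDE file along with the A record):

;;; moved to $INCLUDE file
;;; mail    300    IN    A    192.0.2.112
mail        300    IN    AAAA 2001:DB8:2001:2010::112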

Let's review though.  The error was "cannot bind: Address already in use."  It is, unfortunately, one of those bogus, red herring errors.  For this particular build of Sendmail, instead of reporting that the inet6 lookup came back without any AAAA record, it (probably) used whatever garbage was in the IP address variable it uses.  Chances are that was initialized to all binary zeroes, which is the wildcard address.  I think, at least on the kernel implementation I have on my router/Sendmail host, IPv4 addresses count as in use under that wildcard, by matching the ::ffff:w.x.y.z addresses to which I had already bound...hence "address already in use."

I really hate it when things don't work like they're supposed to work.  But indeed this time it was due to my own doing, just that it was a few weeks ago.  This is the sort of thing that really drives me batty about being in IT though, when reverting doesn't work.


Please direct all comments to Google+, preferably under the post about this blog entry.

English is a difficult enough language to interpret correctly when its rules are followed, let alone when the speaker or writer chooses not to follow those rules.

"Jeopardy!" replies and randomcaps really suck!

Please join one of the fastest growing social networks, Google+!

16 April, 2015

When Police Officer Reality Meets Public Perception



A lot is being made of this video, especially on local talk radio (WBEN).  Tom Bauerle in particular has explained in the monologue portion of today's show that there are reasons such maneuvers are done.

First, for an emergency vehicle, these things (surprisingly) are legal.  (I thought that police were subject to the same laws as the citizenry, but this is obviously an exception.)  The stated reason is that seconds may sometimes count, that parking someplace else in order to go into a restaurant for something to eat may introduce additional, arguably unnecessary, delays in responding.  Explained this way, I think many if not most people would be OK with this.

But secondly, perception is reality.  This is such a short clip and without much context, so it is tough to judge all of what's going on here.  What Mr. Bauerle can't seem to get past (at least before I turned him off) is this simple adage.  It is perceived that this officer is overstepping his authority, or at the very least unduly asserting his authority (as in, maybe holding to the letter of the law, but not the spirit).  Ordinary, non-emergency folks a lot of times will do similar things, and think that turning on their hazard flashers somehow excuses the double parking.  Mr. Bauerle calls this tone snarky; I see (hear) it more as incredulity.  There was absolutely NO other place where these officers could have parked?  It just seems unlikely.  But again, this short video lacks that context.  Considering how ordinarily the Buffalo PD are quite helpful and professional, I would guess it WAS the case that no other space was available.  Plus, I'm willing to give them the benefit of the doubt in general.

But here's the thing...I know Tom has undergone police training, and seems thoroughly versed in V&T law, but he can't seem to back away from that today and view this like an average citizen, and see this as a (minor) abuse of authority.  It's fairly obvious from this video (and how it has spread) that most folks are unaware of this exception in the law and the reasons for it.  This is especially surprising to me because he is an ardent supporter of Amendments IV and V of the U.S. Constitution.  In fact, he has recently spoken out about how the Erie County Sheriff's Office has used "stingray" devices to monitor cell phone activity, which to him (and me) is overstepping the bounds of Amendment IV.  In principle, the perception is similar, officers of the law doing something they see as ordinary, crime-fighting procedures, but the citizens don't think of it that way.


Direct all comments to Google+, preferably under the post about this blog entry.

English is a difficult enough language to interpret correctly when its rules are followed, let alone when the speaker or writer chooses not to follow those rules.

"Jeopardy!" replies and randomcaps really suck!

Please join one of the fastest growing social networks, Google+!

13 March, 2015

Starting the Migration from Time Warner High Speed Internet to Verizon FiOS

(By the way...in the following, "TWC" means "Time Warner Cable," if you didn't already think of those letters that way.)

I think the scarier parts are done with as I post this.  I was going to handle Verizon by simply creating another VLAN, but I discovered my particular router platform (a Dell OptiPlex GX1 350MHz Pentium II with some "ancient" NICs) did not support dhclient with VLANs.  Whether this is a problem in the core kernel, the VLAN code, the NIC drivers, or dhclient, it's tough to say.  Using tcpdump, I do see a DHCP server reply...so why this doesn't get back properly to dhclient, I don't know, and I don't much care.  I solved it with hardware, by putting in another NIC.

Why scary?  The rational mind thinks, powering down a computer, partially disassembling it (OptiPlex GX1s have a riser card with the PCI slots (and ISA slots!)), adding the requisite hardware, reassembling it, and turning it back on should be normal, ordinary, and just work as expected.  But of course the experienced person knows cables (e.g., IDE) get tugged, components get bumped (e.g. RAM modules), hard disks every once in a blue moon refuse to spin up or otherwise refuse to come online, and so on.  This is why using a VLAN interface (e.g. eth2.105, for physical port eth2, VLAN 105...which is the ASCII code for "i", symbolic of "Internet"...I already have a 73/"I" VLAN for TWC) was preferable, because of very few if any hardware unknowns.
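
For the record, the sub-interface approach I was hoping would work is only a couple of commands (shown here in both the old vconfig style and the iproute2 style, purely as a sketch):

# make sure 802.1Q support is loaded, then create the tagged sub-interface for VLAN 105 on eth2
modprobe 8021q
vconfig add eth2 105
# ...or, equivalently: ip link add link eth2 name eth2.105 type vlan id 105

# bring it up and ask for a lease on it
ip link set eth2.105 up
dhclient eth2.105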

Complicating that was the fact that my systems room (a fancy name for what was no doubt intended by the house architect as a bedroom) does not have a whole lot of space, so the "home" for my capable but power-hungry desktop (a Dell Precision 670 which was given to me) is on top of the router computer.  That one would be expensive to replace if it got munged by moving it.  The video board in it (a PCIe nVidia card) is a bit finicky, and that was my worst worry; the same concerns as twiddling with the router, of HDDs having had their last useful power-on cycle, mainboards getting flexed by the movement just enough to make them fail or become unreliable, etc.  But as it turns out, I'm typing just fine on the '670 to write this post.

I know, I know...I probably sound like Boober Fraggle.  But this is an unfortunate attitude which comes after years and years of experience with stuff failing for no apparent reason at all.  In fact, my very first Linux router was a Dell Dimension XPS P75.  One day, I did something with it, and it had to be powered down.  It would not power back on.  So thankfully that was as easy as taking the parts out of it and jamming them into a Dimension XPS P133.  Not all was bad with that: I actually ended up with a faster router that day, so it could sustain more throughput (which was to be thrown its way a couple of years later, going from 640/90 Kbps Verizon ADSL to 10/0.5 Mbps Adelphia Power Link).

A certain amount of anxiety can be lessened by thinking of these things and devising, as best as you can, fallback positions or workarounds.  For example, if putting the new NIC in somehow causes an electrical conflict (e.g., IRQ lines), chances are fairly good that if you just remove this NIC, things will go back to the way they were and you can figure out some other strategy (e.g., try a different NIC, or a different manufacturer's NIC).  But of course, nothing like that is absolute.  You may find out inserting that NIC caused a freeze, reset, kernel panic, or whatever, but in the process of removing it, you may have bumped the RAM, and now even with the NIC gone you're still down.  Maybe you just go to the basement and get another computer, and try transplanting the NIC and hard disk.  If the HDD fails, that's a whole new many hour kettle of fish, involving setting up a new one, restoring from backup, and such.  To lessen stress/anxiety, it helps to have a copied and offline tested HDD, but that itself is a lot of work which may be totally unnecessary.  (Yes, I have about a dozen old, old computers in my basement which serve terribly these days as a desktop but function just great as something like a router, or as a source of spare/repair parts.)

You might ask, "why the NIC?"  That's because with most ISPs which use DHCP, the address you obtain is tied to the MAC address of the NIC which requested an IPv4 address.  Not all NICs or NIC drivers support setting an arbitrary MAC address, especially older ones made before people thought of even doing such things.  And if the MAC address changes, the IP address will change, and there will be a whole slew of other things scattered throughout the Internet which must be updated...DNS entries, tunnel endpoints which are administered/work by IP address, ACLs for various bits of software on the router, iptables(8) rules, DNS secondaries which must fetch my masters from a new address, and so on.  So if things go as planned, none of that additional work (and accompanying stress) has to be done to be functional again.  Hopefully if it's the computer dying, the HDD's and NIC's new home will "just work."
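
When a NIC/driver does support it, carrying the old MAC address over to a replacement card is only a few commands (the MAC shown is a placeholder), which is exactly why driver support matters so much here:

# clone the old NIC's MAC onto the replacement interface before asking for a lease
ip link set dev eth3 down
ip link set dev eth3 address 00:16:3e:12:34:56
ip link set dev eth3 up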

Mind you, at some point, the IP address changes will have to happen, it's just that again, there is the fallback position of "TWC still works with the current IPv4 address," and the IP address transition doesn't have to occur ASAP in order to have certain Internet stuff working...such as email.

The first go of it caused quite some tension in me because booting would hang at the eth1 (TWC) initialization.   Oh, great, what now???  Was the new NIC interfering somehow, such as stealing an IRQ?  Was it changing the device names so that eth1 was no longer TWC?  Actually, it was a whole lot simpler.  I thought I knew the color code of the Ethernet cables I used...yellow for Internet (WAN), other (in this case blue) for LAN.  But as I'm staring into space contemplating what was wrong, my gaze was towards my switch.  It is mounted (screwed) to the side of my desk, with two rows of ports running vertically.  The top four ports are natively in VLAN 73.  And as I'm thinking of how to diagnose what's going on, it dawned on me there is a blue cable in port 1/4 and a light gray cable in 1/1.  Notice none of those is yellow.  Ahhhh....LAN is plugged into WAN, and WAN is plugged into LAN.  So trying to do DHCPDISCOVER or DHCPREQUEST on a network segment which has no (other) DHCP server is not going to work out well at all.  (I say "other" because for that segment, the router itself is the DHCP server.)

So right now the new NIC is in, and in fact it's doing DHCP with Verizon instead of using Verizon's supplied Actiontec router.  Everything "critical" seems to be working: email, Google Voice, Web surfing, YouTube, etc.  For some oddball reason, my Sipura SPA2000 wouldn't register with PBXes.org; I don't use it that much so I decided to drop that for now as it's not really critical.  I have briefly changed the default route to point out the new FiOS NIC, and it tested at good speeds.  I discovered that sometimes being dually homed like this causes some minor difficulties, such as packets somehow sneaking out with the wrong source address, so NAT rules need to be added on each Internet interface to SNAT from the wrong address to the right one.  At least for residential class Internet service, reverse path filtering is in full force, and therefore asymmetric routing simply will not work.  I cannot send a packet to Verizon with a TWC source address, nor vice versa.  If this were perhaps a business account, I might be able to do that sort of thing "out of the box," or at least be able to correspond with network engineers at both ISPs to ask them to allow source addresses which I'm using from the "other" ISP.
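
The SNAT band-aid looks roughly like this (interface names and addresses are placeholders for my real ones):

# make sure nothing leaves the TWC interface with the FiOS source address, and vice versa
iptables -t nat -A POSTROUTING -o eth1 -j SNAT --to-source 203.0.113.10   # TWC-assigned address
iptables -t nat -A POSTROUTING -o eth3 -j SNAT --to-source 198.51.100.20  # FiOS-assigned address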

UPDATE ON 15-Mar-2015: I left the SPA2000 ATA unplugged overnight.  I basically think this was the SPA not sending stuff for many contiguous hours, and PBXes.org not sending any traffic responding to what the SPA was trying to do.  Therefore Linux aged out the NAT entries after PBXes stopped sending stuff.  So it will now function.  I also turned off autoprovisioning via TFTP and connected it for a few minutes directly to TWC.  Next I'll try turning autoprovisioning back on and see if it still will make/receive calls; I suspect that won't be detrimental, but we'll see.

I also might add I had my first "production" use of FiOS this morning.  I had another Linux box set up as a router, and manually pointed my Nexus 7's default route to it so it would mainly use FiOS instead of TWC (I say "mainly" because other things like DNS will still use the old network path).  Since the installation on Wednesday, really all I have done is speed tests at www.speedtest.net and DSLReports.  But today my Titanium Backup backup weighed in at 272 MB, mainly because several large apps got updated (Google+, Angry Birds Rio, and some more).  This would likely have taken the better part of an hour at the 1 Mbps speed of TWC to sync to Google Drive.  But with 25 Mbps symmetrical FiOS, it was only about 8 minutes.  The speed was almost comparable to the rsync(1) of those files I do to a computer on the LAN!

Well, one of these days, I will finish the transition to FiOS.  The whole goal was to get better Internet at less of a monthly price, and to disconnect TWC Internet (leaving just basic TV).  We'll see how long this takes.


Direct all comments to Google+, preferably under the post about this blog entry.

English is a difficult enough language to interpret correctly when its rules are followed, let alone when the speaker or writer chooses not to follow those rules.

"Jeopardy!" replies and randomcaps really suck!

Please join one of the fastest growing social networks, Google+!

25 February, 2015

'Net Neutrality MythBusting

Lately I've been hearing an awful lot of  misinformation being bandied about in regards to 'Net Neutrality.  I hope to bust wide open and dispel a few of these.  Of course, I'm willing to listen to evidence which shows I am misinformed, but I have been involved professionally in tech for over 20 years, and have read up on certain sectors of tech outside of job-related work.

Just one small note before I begin: I have tried several times to get Blogger/Blogspot to make hyperlinked footnotes for me (using the HTML override editor), without much success.  If I leave the fragment identifier alone and leave Blogger's "correction" of my hyperlink to the same document, clicking on it tries to bring the reader/surfer to an edit page for this post (and of course no one but myself can edit, so it errors).  If I remove everything but the fragment identifier as many Web pages on the footnotes topic suggest, for some reason my browser won't follow them.  I apologize for the inconvenience of the reader needing to scroll to near the end of the post in order to read the footnotes.

Myth: Netflix (or any other streaming content provider) is being throttled (or blocked) to make incumbents' video services seem attractive in comparison to streaming.


Truth: No, they are not being throttled per se.  So passing some sort of neutrality regulation or law likely won't help much, if at all.


When people talk about "network throttling" (also called "traffic shaping"), they mean classifying network traffic, either by protocol (such as TCP plus a port number) or by source or destination IP address, and programming one or more intervening routers to pass only a limited number of packets per unit of time (e.g., packets amounting to 200 kilobytes per second).  The source (e.g., Netflix) compensates for this apparent lack of bandwidth by slowing down the rate at which it transmits those packets.  Thus the viewing quality is downgraded by switching to a more highly compressed version of the video, and there might also be stutters and stops for rebuffering.
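
To make "shaping" concrete, here is a minimal sketch of what it looks like on a Linux router using tc(8), assuming eth0 faces the customer and 192.0.2.0/24 (a documentation prefix) stands in for the content provider's address space; this is my own illustration, not anything an ISP has published:

    # Cap traffic from the matched source range at roughly 200 kilobytes
    # per second (1600 kbit/s); everything else passes unshaped.
    tc qdisc add dev eth0 root handle 1: htb
    tc class add dev eth0 parent 1: classid 1:10 htb rate 1600kbit
    tc filter add dev eth0 parent 1: protocol ip u32 match ip src 192.0.2.0/24 flowid 1:10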

While a certain amount of this may be being done by some ISPs, current thinking is that it is unlikely.  What is more likely happening is that the incumbent ISPs are unwilling to increase interconnect capacity.  If the current interconnects between Netflix's ISP(s) and the incumbents are run up to capacity, naturally no more streams of the same bitrate can be passed through.  Everything must be slowed down.

The analogy of the Internet to pipes/plumbing is not too bad.  If some pipe is engineered to deliver 600 liters per minute at 500 kilopascals, you can't just increase the pressure to try to stuff more water per minute through; the pipe will burst and no longer be usable.  One solution would be to install a similar-capacity pipe in parallel with the original, thus increasing the effective cross-sectional area of pipe, which would be analogous to installing another (fiber) link between routers.  The other way of course is to increase the pipe diameter, which would correspond to using existing media such as fiber in a different way (such as using more simultaneous laser wavelengths or something).

As implied, incumbents such as Comcast, Time Warner (TWC), Cox, etc. have economic disincentives to upgrade their peering capacities with services which compete with their video services.  So all they have to do is let existing links saturate with traffic, and it appears to the consumer like their streaming provider is being throttled.  Those consumers are of course in no position to be able to investigate whether their streaming service (e.g. Netflix) is being traffic shaped or if peering points are being allowed to saturate.  Only their incumbent's network engineers would have the access to the equipment and know-how to be able to answer that question accurately.

I personally cannot envision regulation which would be able to reliably discriminate between administrative traffic shaping and interconnect saturation.  In my libertarian opinion, any attempt to do so would amount to a government takeover of Internet service, because it would mandate how ISPs must allocate resources to run their core business.  With government takeover would come government control, and this could quickly be used to suppress essential freedoms such as those expressed in Amendment I.

Myth: The passage of 'Net Neutrality would lead to a government takeover of the Internet where connection to the 'Net will require a license.


Truth: Take off the tin foil hat; that is simply preposterous.  There is no one (in government) I know of who is even proposing such a thing: not FCC commissioners, not legislators; no one.


This one seems to be a favorite fear mongering tactic of Glenn Beck.  Oh, believe me, I like Glenn (plus Pat and "Stu"), and agree with a lot of their views.  But this one is so far out in Yankee Stadium left field that they'd be in the Atlantic Ocean.  I do understand roughly how this conclusion could be reached.

The FCC required licensing of radio stations for very sound technical reasons.  You simply cannot rationally allow anyone to transmit on any frequency with any amount of power they can afford.  Glenn has said that at one time radio stations were akin to that, with people broadcasting to their neighborhoods.  That very quickly becomes untenable, with the next person claiming the freedom to do as they please with the available spectrum, and trying to overpower their competition.  It's simply not within the laws of radio physics to be able to operate that way.  There needs to be some central authority to say: you, you over there, you may transmit on this frequency with this modulation, this bandwidth (basically, modulating frequency or FM deviation), at this power, and you can be reasonably guaranteed you will have clear transmission of your content to this geographic region.

A weak analogy would be vehicular traffic.  You simply cannot drive wherever you want, whenever you want, at whatever speed you want.  You must drive on either your own property or established rights of way (roads, basically), yield to other traffic pursuant to law, and at speeds deemed safe by authorities (guided by engineers who designed the roadways).  You also must prove competency with these precepts, plus physical ability, thus requiring licensure.

I didn't think I would hear it from radio veteran Rush Limbaugh, but as I was writing this (over a few days), I also heard him propose that the FCC might require licenses for Web sites to come online, comparing it to radio stations needing licensing.  His thesis is that radio stations must prove to the FCC that they're serving the community in which they're broadcasting.  That's true.  Why would the FCC grant exclusive use of a channel[1] to a station which wasn't broadcasting content in which the community was interested?  It cannot allocate that channel to more than one transmitting entity.

Even reclassifying ISPs under Title II will do no such thing.  Telephone companies have been regulated under Title II for a very long time, and have you ever heard of ANYONE suggesting needing a license to connect to the phone network??  NO, NEVER.  Yes, it is true, for a long, long time, the Bell System insisted it was technically necessary to have only their own equipment connected to the network.  This was modified eventually to type acceptance/certification.[2]  Still, it's an open, nonprejudicial process.  If for some oddball reason the FCC suddenly decided to synthesize some objection to a device being attached to phone lines in some sort of effort to interfere in a company's commerce, the manufacturer could sue in court and plead their case that their device does indeed meet all technical requirements, and the FCC is merely being arbitrary.  But the point is, this is not licensing individual users, only the manufacturers, and purely for technical reasons.


Myth: The phone company (or companies) has/have not made any substantial improvements (except maybe Touch-Tone) in almost a century (since the Bell System).


Truth: There have been plenty of innovations during the "reign of," and since, the Bell System.


Time was that trunking between cities had a one-to-one correspondence between cable pairs and conversations.  Back in the 1920s or so, if I had wanted to talk from here in Cheektowaga, NY to my sister in Pasadena, TX, a pair of wires along the complete path, from phone switch to phone switch, would have had to be reserved for the duration of the conversation.

The first innovation was frequency division multiplexing (FDM).  Phone conversations are limited to a response of approximately 300 Hz to 3000 Hz, which is quite adequate for speech.  What you could do is carry several on modulated carriers, much like radio does, but closer to the audio spectrum, for example 10 kHz, 20 kHz, 30 kHz, etc. (not sure of the exact frequencies used).

Starting in the 1960s, after solid state computer technology was well developed, interconnections (trunking) between switching centers were upgraded from analog to digital.  This is how the modern public switched telephone network (PSTN) works.  Instead of transmitting voice or FDM voice on the links, the voice is digitized (analog to digital conversion, or ADC) and sent down the trunk one sample at a time, with several conversations interleaved digitally (time division multiplexing, or TDM).
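
The arithmetic behind digital trunking is tidy; these are the classic North American T-carrier numbers, run through bc(1) purely for illustration:

    # One voice channel: 8000 samples per second, 8 bits per sample
    echo "8000 * 8" | bc             # 64000 bit/s, a DS0
    # A T1 trunk: 24 such channels plus 1 framing bit per frame
    echo "(24 * 8 + 1) * 8000" | bc  # 1544000 bit/s, i.e. 1.544 Mbps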

Digitizing the phone system also had the benefit of getting rid of the mechanical switching.  It used to be that when you dialled the phone, while the dial was returning to its "home" position, it would send a pulse down the line for each number it passed.  So for example a 1 sent one pulse, 2 sent two pulses, and so on, with the exception of 0 which sent ten pulses.  Every pulse would physically advance (step) a rotary switch, and the timeout between digits would pass the pulsing along to the next rotary switch. See Strowger switch and crossbar switch.  With the advent of TDM, each subscriber could have a timeslot in their local switch, and each current conversation between switches (over trunks) could have a timeslot on the trunk.  Instead of physically carrying voice around a switch, it was a matter of transferring data instead.

And then of course, there was the aforementioned Touch-Tone (a trademark for the dual-tone multi-frequency (DTMF) system).  This meant that instead of pulses stepping a mechanical switch of some kind, the subscriber's telephone would generate carefully engineered pairs of tones which would be detected by the central office (CO) switch, which would then use a computer program to collect those digits and figure out and allocate a route to the destination subscriber.

Of course, then there was the Bell System breakup in the 1980s.  The effect of this was increased competition and therefore falling prices.  I would think anyone old enough (I was born in 1965) remembers when it used to be quite expensive to call long distance, even as short a distance as here to Niagara Falls.  Instead of a local call where someone could talk for as long as they'd like for one flat fee per month, it was charged on a per-minute basis, and the per-minute price varied based on load/demand, with daytime, evening, and overnight rates.  So every once in a while, one would stay up late (past 11PM local time) to call the relatives who moved away (in my case, siblings who went to Texas, and a grandmother who lived in Closter, New Jersey).  These days, due to a couple of decades of competition, long distance is included with most (although not all) flat rate priced telephone service.

Let's also not forget the development of mobile telephony.  At first, it was only the very rich who had phones installed in their vehicles.  Then a few decades later there were the much more affordable pack phones, which I used to sell in the late eighties when I worked at Radio Shack.  Then after that, there were the famous bricks.  Then the mobile phone system itself was digitized, with CDMA and GSM.  Then some of the CDMA and GSM bandwidth was dedicated to carrying arbitrary data, not just voice.  And these days, we have the next evolution in mobile communications, LTE.

So don't try to tell me there hasn't been innovation and improvement in decades.  To a certain extent, even with the lack of competition before the 1980s breakup, improvements were slowly but surely made.  Don't get me wrong, money was a large motivating factor (e.g., we can't charge for our long distance service if we don't have the capacity; people just won't buy it, so we have to innovate how we carry those conversations around in order to sell the service).  But the key is that competition since the breakup greatly accelerated innovation, because it was no longer a matter of a monopoly getting around to it whenever the company felt like it; they have to innovate faster than the competition or they'll lose customers.

Myth: Netflix (or other streaming content providers) are being extorted into paying more in order to have a "fast lane" to their customers by the likes of Comcast.


Truth: Settlement-free peering does not apply; Netflix or their ISP is a customer and not a peer.  Nor should they be afforded free colocation.  'Net Neutrality should have no effect whatsoever on these private business relationships.


There is a concept of "settlement-free peering" in the ISP business.  At least the current business model of ISPs is to charge to move octets from their customers to either another one of their customers or to one of their peers which services the customer on the other end of the communication.  This is important and key; they charge for moving packets.  Let's take two well-known ISPs to illustrate this, Verizon and Comcast.  Hypothetically, Verizon could charge Comcast for all bytes delivered from Verizon's customers to Comcast, and for all bytes delivered from Comcast to Verizon's customers.  But the same could equally be said of Comcast; Comcast could charge for all bytes moved to/from Verizon to/from Comcast's customers.  It would be kind of silly to simply pass money back and forth, though, so it makes more economic and accounting sense to say: if you accept roughly the same amount of bytes as you give us, we'll call it a wash, and we'll consider each other peers because of it.

Poor Cogent.  They at least at one time had the unenviable position of being the ISP to Netflix.  At the time Netflix was just starting up, Cogent likely had all sorts of customers which would be sucking down bytes from the Internet, roughly in proportion to the bytes being sucked out of their customers over to their peers.  But at some point Netflix started becoming a larger and larger proportion of Cogent's outbound traffic, eventually dwarfing the inbound quantity.  The tenet upon which settlement-free peering rests, that you give as much as you get, increasingly no longer applied.  So at least according to generally accepted practice, Cogent (or indeed any other 2nd or lower tier ISP) became a customer to their "peers" instead of a true peer.  Rightfully so, at least according to common practice, Cogent, and I suppose later Netflix themselves, began to have to pay for moving those bytes because they were no longer really peers.

Now what do you suppose 'Net Neutrality would do if it mandates that there is to be no such "paying extra for a fast lane to the Internet?"  I contend that would essentially abolish settlement-free peering, and regardless of relative traffic, every ISP will have to charge every other ISP with which they peer for traffic.  This will incrementally increase the costs of doing business, because now there is that much more accounts payable and accounts receivable, with the same potential hassles of nonpayment/late payment and other such miscellaneous wrangling.  And ISPs aren't going to "take it lying down," they'll just increase rates to their customers to cover these additional costs.

OK, you say, you don't have to deliver the content through ever more saturated wide-area network (WAN) links; simply establish a content distribution network (CDN), whereby you plop down servers with your content in the network operations centers (NOCs) of all the major ISPs.  Putting up and maintaining WAN connections is more costly than providing some ports on an Ethernet switch, plus that Ethernet switch is going to be tons faster than any WAN link, so it's a win for the ISPs and the ISPs' customers, and consequently for the content provider's customers too.  This could also be called a colocation (colo) arrangement.  The content provider (e.g., Netflix) transmits the necessary content to its CDN servers, potentially over the WAN, once, and that content is accessed thousands or maybe even millions of times, thus in effect multiplying the "capacity" of that WAN link.

But the truth is, although providing Ethernet ports is lots cheaper, it certainly isn't free; the ISP's engineers must install such a switch, upgrade it as necessary, help troubleshoot problems which arise, and so on...definitely not even near zero costs.  Plus there is power, cooling/environmentals, isolation (wouldn't want a fault in the CDN server to take down the ISP's network), security both for network and for physical access (repairs to the CDN servers when necessary), and all the various and sundry things which must be done for a colocation.  Again, this is not just some minor expense which could be classified as miscellaneous overhead.  They are real and substantial costs.

So no matter how you slice it, Netflix and their ilk must pay the ISPs.  If they choose, they can be a customer with ordinary, conventional WAN links, or they could also opt for colo.  Either way, they owe.  It's not extortion in the least.  They pay for the services they receive, one way or another.  They might try to frame their plight some other way, but anyone can research what's happening and draw conclusions independently.  I don't think they're being extorted into paying "extra" at all.  As long as they're being charged the same as any other ISP customer to move the same approximate quantity of bytes, or the same as any other colo customer, it's totally fair.

I have to ask, why is it that the only company I've heard of or read of making a "stink" like this is Netflix?  Why not Google because of YouTube?  Why not Hulu?  Why not Amazon for their Instant Video product?  I suspect Netflix is looking for a P.R. angle, and wants to lobby the public to get angry at their ISPs over the business practices outlined in this section.

Myth: 'Net Neutrality is not needed because of competition.  If you don't like your ISP, you can just switch.  Problem solved; the other ISP should have more favorable policies or prices because they're competing for customers.


Truth: It is true that no new regulation is needed for what I'll term the Internet core.  Competition there is fine.  The last kilometer (or last mile if you must), or the edge, is the problem area however.


The reclassification by the FCC of ISPs under Title II of the Communications Act of 1934 from an information service to a telecommunications service on its surface makes at least some sense.  When the FCC proposed some 'Net Neutrality rules, ISPs took them to court (and won), saying the regulations could not be binding because the FCC lacked authority.  However, if ISPs can be successfully reclassified as common carriers under Title II, that would give the FCC (or at least the federal government, maybe the ICC) the authority to impose such regulations.  It would also absolve ISPs of certain liabilities due to their common carrier status.  (As an example, if two thieves coordinate their efforts via phone calls, the phone company is not an accessory to the theft simply because their network was used to facilitate it; some other complicity would need to be demonstrated.)  But there are a couple of disturbing things about this.

First is the track record of the federal government.  As this essay from the Cato Institute (PDF) relates:

The telephone monopoly, however, has been anything but natural. Overlooked in the textbooks is the extent to which federal and state governmental actions throughout this century helped build the AT&T or “Bell system” monopoly.  As Robert Crandall (1991: 41) noted, “Despite the popular belief that the telephone network is a natural monopoly, the AT&T monopoly survived until the 1980s not because of its naturalness but because of overt government policy.”
And a little further on, there is this other notable passage:

After seventeen years of monopoly, the United States had a limited telephone system of 270,000 phones concentrated in the centers of the cities, with service generally unavailable in the outlying areas. After thirteen years of competition, the United States had an extensive system of six million telephones, almost evenly divided between Bell and the independents, with service available practically anywhere in the country.
The referenced monopoly is the one created by Bell's patents in the mid-to-late 19th Century.  I'm not against patents; inventors and innovators need the assurance that their work, for some limited amount of time, will provide a return on the investment made.  (The current patent system is in need of overhaul though.  Some of its applications, even those granted, are ridiculous.)  Once again, it is proven that competition improves service and lowers prices.  The government, though, seems to have been in the Bell System's pocket, thus squeezing out competitors and strengthening their monopoly.  The AT&T folks convinced regulators that their size and expertise made them ideal to establish universal phone service, and that needing to accommodate competitors' systems would only slow them down and was duplicated effort; that having one, unified system would be best.  They falsely argued telephone service was a natural monopoly, despite evidence quite to the contrary in the late 19th and early 20th Centuries.

Second, the FCC claims it will exercise discretion in which parts of Title II it will bring to bear on regulation.  Ummmm....sure....if you like your ISP, you can keep your ISP.  These new regulations will save most folks $2500 per year on their ISP service fees.  There's nothing to stop the FCC from using every last sentence of it.  Title II has all sorts of rate-setting language in it.  Much as I don't want to spend more than I really have to, I don't trust the government to turn the right knobs in the right ways to lower my ISP bill.  Look at what happened to banking regulation under Dodd-Frank.  Sure, banks complied with all the provisions, but because that cut into their profits, they simply began charging new fees, or raising others not covered by the new regulations.

I guess it also goes without saying that anytime government inserts itself to control something else, there is ample potential for abuse of that power, such that, yet again, essential liberties are curtailed.

Yes, on occasion, regulation has served some public good.  If it weren't for the 1980s divestiture, we'd probably still be paying high, per-minute long distance rates, and have differing peak and off-peak rates.  But the government has propped up the telco monopoly for decades, and has attempted to compensate for its missteps.  For example, the Telecommunications Act of 1996 amended the 1934 Act in several ways; probably the most apropos to this discussion is that it forced telcos to share their infrastructure (the "unbundled network elements") so that competing companies could lease CO space for switches and lease the lines to customers...thus creating competitive local exchange carriers (CLECs).

On the flip side of that though, just as video providers who also have roles as ISPs have an economic disincentive to carry Internet video streams unencumbered, so too the ILECs would seem to have a disincentive to help CLECs.  If some CLEC customer's loop needs repair, from an operational/managerial standpoint, why should it come before their own customers' repair needs?  I can't say for sure, but it did not seem like Verizon was trying its best to help Northpoint when I had Telocity DSL.  Sure, I would have a few weeks or a few days of good service, but then I would have trouble maintaining sync for many minutes or sometimes hours.  Wouldn't you know?  After Telocity and Northpoint went out of business, and I transitioned to Verizon Online DSL, things were pretty much rock solid.  (OK, maybe the comparison isn't quite exactly equal.  Telocity was 768K SDSL, whereas Verizon was 640/90 G.dmt ADSL.)

One thing I distinctly heard Pat Gray and Steve "Stu" Burguiere say when Glenn was out is that if you don't like your ISP, you can just switch.  The reality is though that in the same price range, people usually have only two "last kilometer" choices, the telephone company (telco) or the cable TV company (cableco).  And for some, it's down to one choice.  I wouldn't count cellular Internet access because no one offers a flat-rate, so-called unlimited plan.  The industry norm is around 5 GB for $100/month, which is more than I'm paying ($58) for 15/1 Mbps but unlimited.  Someone could chew through 5 GB in only a few hours of HD streaming video viewing, with huge overage fees after that.  Satellite is most decidedly not competition either.  It's at least twice as expensive for the same bitrates, is limited by a fair access policy (you're throttled after so many minutes of full usage), and has latency which makes several Internet applications (VOIP for example) totally impractical to impossible to use.  There are a few wireless providers (WISPs), but again their price points are very high compared to DOCSIS, DSL, or fiber, and they are mostly only for rural areas underserved by either a cableco or telco.  No other technology is widely available.  Put very simply, there are one, maybe two economically viable choices; an oligopoly.  Sorry, Pat and Stu, you're way off base here.  It's not like drugs, where I can go to the pharmacy at K-Mart, or at Walmart, or at Walgreens, or at RiteAid, or at CVS, or at Tops, or at Wegman's, or...  That has a sufficient number of competitors to really be called "competition."

Competition (lack of an oligopoly) is essential to improving service and lowering prices.  Look at what happened in the AT&T Mobility/T-Mobile USA case.  Four competitors for mobile phone service (AT&T Mobility, Verizon Wireless, T-Mobile USA, and Sprint) is barely enough competition, if that.  Going down to three would have made the competition landscape worse; that's why the merger was denied (and is why the Comcast/Time Warner merger should not be allowed either).  Everywhere Google Fiber goes, the incumbents lower prices to try to retain customers.

So would forcing telcos and cablecos to share their infrastructure help foster competition?  I'm unsure.  It sounds good, because it's unlikely people would tolerate the many additional sets of cables on their utility poles which would be necessary to provide alternate services.  It'd be really ugly.  Plus the incumbents have a several-decade head start.  Laying new cable is expensive and will require lots and lots of time.  So maybe cable and central office sharing, like is done with phone service, might help.

One thing the FCC (or some other part of the federal government) could do is to annul and void any local laws which limit or even outlaw competition.  Such is the case with some municipalities wanting to string their own fiber optic networks.  The whole reason they even consider it is, the telcos and cablecos refuse to provide the level of service the community seeks.  But no, instead of upping their game and meeting the challenge, once again the telcos and cablecos use their lobbyists to enact laws making muni broadband illegal.

Of course, this isn't really 'Net Neutrality.  The real issue is a need for increases in local competition.

Myth: If it's written into 'Net Neutrality that extra charges for a "fast lane" will be disallowed, my Internet bill will be kept lower.


Truth: Practically any way it's worded, it won't preclude the possibility of either speeding up or slowing down some class of traffic.


I'll bet dollars to donuts some "propeller head" is going to look at 'Net Neutrality regulations, see that no throttling (or its inverse, preferential packet treatment) is allowed, and that charging extra for a "fast lane" will not be allowed, then sue their ISP claiming that because they only have the 15 Mbps tier and not the 100 Mbps tier, the ISP is illegally charging extra for the 100 Mbps service.  Still others will see a slowdown in their service because of some poor bandwidth management by their ISP, and sue because they think they're being throttled.

If you've never heard of it, let me try to explain "committed information rate" (or CIR).  That is the rate of information transfer your ISP will guarantee you, let's say 3 Mbps.  This means that no matter what, the ISP guarantees that connections through your link adding up to that 3 Mbps will stay at that rate.  That implies that, in order to meet the guarantee, 3 Mbps must be reserved exclusively for that customer on each peering connection the ISP has.  The customer's connection is often capable of much faster than that, say 100 Mbps, so it can burst up to the technology's (or the provisioned) limit.  So basically, any rate between the CIR and the provisioned rate is best effort.  As you can imagine, dedicating bandwidth on multiple WAN links is an expensive proposition, as it means that at times there is bandwidth on the link just sitting idle, not being used at all.  Therefore, a customer requesting a CIR is charged commensurately.
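
If it helps, here's a rough sketch of the CIR-versus-burst idea in Linux HTB terms, with a made-up interface name and made-up numbers; "rate" plays the part of the CIR and "ceil" the best-effort burst ceiling:

    # Guarantee 3 Mbit/s to the class, but allow bursting to 100 Mbit/s
    # when spare capacity exists.
    tc qdisc add dev eth3 root handle 1: htb default 1
    tc class add dev eth3 parent 1: classid 1:1 htb rate 3mbit ceil 100mbit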

CIR is typically only engineered for businesses because that CIR may be critical to providing their service (e.g., an HD video stream may require some CIR to work at all, and HD video streaming is the company's product/service).  For what most folks have, there is no CIR, and it is all best-effort.  It says as much in the terms of service.  This leads to a commonly accepted ISP business practice called oversubscription.  It simply means that the ISP counts on not every one of their customers utilizing their provisioned rate simultaneously.  A typical ratio in the dialup Internet days was 10 to 1 (meaning for every 10 customers they had dialing in at 56Kbps, they would have 56Kbps capacity to their upstream Internet provider); I'm not sure what common practice is for residential broadband.
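
To put some (entirely made-up) numbers on it, say an ISP has 1000 residential customers, each provisioned at 50 Mbps:

    # Upstream capacity needed with no oversubscription, in Mbit/s:
    echo "1000 * 50" | bc        # 50000
    # Upstream capacity if you assume a 20:1 oversubscription ratio:
    echo "1000 * 50 / 20" | bc   # 2500

The 20:1 figure is purely illustrative; as I said, I don't know what the common practice is for residential broadband.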

So, let's apply this to 'Net Neutrality.  Some residential "propeller head" type is going to see their connection sold as 50 Mbps only going 30 due to heavy use at the time, and is going to start shouting "throttling!" to the ISP/FCC/courts.  Of course, if they wanted a CIR of 50 Mbps, it ain't gonna be no $80/month because theoretically it's not going to be shared with their neighbors on the same CMTS or DSLAM, it's going to be a guaranteed 50 Mbps.  The cost to provide most service is kept really low by oversubscription and no CIR.  If customers think they are consistently getting under their advertised rate, their existing remedy is something like suing for false advertising or fraud.

So there is no need for extra "you can't charge extra for an Internet 'fast lane,' or throttle my connections" regulation.  Attempting it is just going to push everybody's monthly rate through the roof, because the ISPs will be forced to abandon the oversubscription, best-effort model for a CIR model, lest they be accused of throttling or providing preferential packet treatment.


Myth: Since all TV is just bits these days, 'Net Neutrality could mandate that cable TV providers who are also ISPs (so, virtually all of them in the US) will be forced to reallocate video bandwidth to Internet servicing, lest they be accused of not treating all bits as equal.  As a result, your TV will start "buffering" like your Netflix, Amazon Instant Video, iTunes, etc. does.


Truth: This is also just utter nonsense, and is proposed by those who are totally ignorant of how cable TV infrastructure is electrically engineered.  Bandwidth used for video delivery cannot be arbitrarily reallocated to DOCSIS or vice-versa.  It's just technically impossible.  Both TVs and cable modems would have to be totally reengineered to have that capability.


This is the latest uninformed rambling by Mark Cuban, parroted by Glenn Beck.  Bandwidth and frequencies are allocated ahead of time.  Some channels will be dedicated to video delivery, even if it is SDV.  Some bandwidth will be dedicated to telephony if the MSO decides to offer that service.  It has to be; otherwise stuff like E911 could never work.  That bandwidth must be available, and it can't be reallocated for something else.  And finally, it is just not technically possible to take bits being transmitted as video QAM and have them magically start being used by DOCSIS modems.  DOCSIS will also have channels dedicated to it.  Even if it were physically feasible to start taking video QAM bandwidth and using it for DOCSIS instead, more CMTSes would have to be deployed to do so; it can't just magically start to happen.

Besides, there is very little "buffering" in a video QAM stream.  The receiving equipment simply is not engineered to do much of it.  It is engineered to assume that the stream is (fairly) reliable and continuous, so only a few tens of milliseconds of video is ever inside the TV waiting to make its way to the screen.  This is a far cry from the multiple seconds of buffering typically done by an Apple TV, Roku, Chromecast, Amazon Fire TV, smart TV, etc.  The worst that will happen if a stream is interrupted is picture macroblocking and sound dropout; there will never be any "buffering" or "rebuffering."  It was just totally ignorant of Glenn to even say that.

Imposition of 'Net Neutrality will have precisely ZERO effect on delivering cable TV video streams.




Some footnotes:

[1] A channel is a range of frequencies.  When a transmitter transmits, it is said to be transmitting on some frequency, say WBEN-AM at 930 kHz.  The physics reality is that WBEN-AM is actually occupying 925 kHz to 935 kHz, which assumes their modulating frequencies are held to 0-5 kHz.  The very process of modulating the carrier causes frequencies in that range to be emitted by the transmitter.  (In reality, the FCC and the CRTC probably will not allocate 920 or 940 kHz in the same market, and will allow WBEN-AM to modulate with better fidelity than 0 to 5 kHz.)  As another example, WBFO-FM is said to transmit on 88.7 MHz.  But as FM implies, the frequency is varied +/- 75 kHz, thus really occupying 88.625 to 88.775 MHz.  An additional 25 kHz "guard band" on either "side" is used for spacing (to reduce interference), thus making FM broadcast channels 25 + 75 + 75 + 25 = 200 kHz wide.

[2] Meaning any manufacturer of equipment could submit their products to the FCC for testing to meet certain technical requirements, such as sufficient isolation of power from the phone line so that it won't zap the phone company's equipment or technicians.  It also must not emit any interfering radio waves.



Direct all comments to Google+, preferably under the post about this blog entry.

English is a difficult enough language to interpret correctly when its rules are followed, let alone when the speaker or writer chooses not to follow those rules.

"Jeopardy!" replies and randomcaps really suck!

Please join one of the fastest growing social networks, Google+!