Fixing Loopy Networks Using Low-tech Methods

Download in .ogv format

Comments

Smug h8ers

wbchaney's picture

Man! There's a bunch of smug Linuxers on this website! Why don't you all take what Shawn is sharing here and ADD to it instead of trying to squash him? At least he's trying to educate the community! Not everybody can be the Linux Gods you portray yourselves to be (which is probably a good thing)!

Shawn, keep up the good work with these Linux Minutes. I enjoy and appreciate them, whether they're strictly Linux-related or something more on the Linux periphery, like this tip. Not 100% of a Sys Admin's job is at the console!

Yep

Darklurker's picture

Been there too. It's normally a PITA to fix, especially if you've got dumb switches and hundreds of drops. STP in the smart switches helps, but can do weird things in other ways.
But you're right, at the end of the day, it happens because somebody plugged two switches together, and until you find that problem, life sucks.

What makes life suck even worse is when you realize it was YOU who plugged the two switches together in a closet with 300 drops in it. hehe. Yeah, I got a plaque for that one. ;)

HP Switches

Anonymous's picture

You could have solved that problem in about 5 seconds from your desk. HP even has a pretty GUI, but being Linux Journal, you can ssh in there as well. :)

Scaling the troubleshooting

Daniele's picture

In a big environment where you have more than 6-10 downlinks to other switches, I usually use the "bisection method" as a trouble-finding algorithm (http://en.wikipedia.org/wiki/Bisection_method).
For example, disconnect the first half of the downlinks on the main switch. If the problem persists, reconnect them (so that what little network service remains can flow again), then disconnect the first half of the second half, and so on, recursively. If the problem disappears, reconnect the disconnected cables half at a time, again recursively, until the problem returns.
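The halving procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: exactly one downlink is causing the loop, and `storm_persists` is a hypothetical callback standing in for the manual step of unplugging a set of cables, watching the network, and plugging them back in. The function name and port labels are made up for the example.

```python
from typing import Callable, List, Sequence, Set


def find_looping_downlink(
    ports: Sequence[str],
    storm_persists: Callable[[Set[str]], bool],
) -> str:
    """Bisect the downlinks to find the single one that feeds the loop.

    storm_persists(unplugged) is a hypothetical stand-in for the manual
    check: unplug those ports, see whether the network is still flooded,
    then plug them back in. Assumes exactly one faulty downlink.
    """
    if not ports:
        raise ValueError("need at least one downlink to test")

    candidates: List[str] = list(ports)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if storm_persists(set(half)):
            # Still flooded with this half unplugged, so the culprit
            # is among the candidates that stayed connected.
            candidates = candidates[len(half):]
        else:
            # The storm stopped, so the culprit is in the unplugged half.
            candidates = half
    return candidates[0]


# Simulated closet with 16 downlinks, where "port11" leads to the loop.
checks = 0


def simulated_check(unplugged: Set[str]) -> bool:
    global checks
    checks += 1
    return "port11" not in unplugged


downlinks = [f"port{i}" for i in range(1, 17)]
culprit = find_looping_downlink(downlinks, simulated_check)
```

With 16 downlinks the simulated run needs four checks instead of up to sixteen, which is the whole appeal of the method when a closet has hundreds of drops.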

Since I see in the clip that you use a specific vendor for your main switch, I can say that with the same switches configured to run Multiple Spanning Tree (MSTP), we had good results detecting and automatically blocking loops down to the first level of unmanaged, non-STP switches (with fast intervention times, since MSTP is faster than STP).

Keep in mind that with STP enabled on your network, loops are less likely to block the entire network but harder to notice, since you need good monitoring tools to query your network equipment about the STP state of each port.

Of course I know about STP

Shawn Powers's picture

The problem isn't with my main switches, it's with the cheapo desktop switches in the classroom. We are a school, and don't have enough network drops into the rooms, so we have to use desktop switches in each room to allow more computer connections.

The cheap switches don't have STP, and so cause a network breakdown. My "good" switches don't see a network loop, they just see millions of multicast packets, and dutifully try to send them to every port in the district.

In a perfect world, I'd have enough drops for each room -- but it's a school, and we're lucky to have computers at all. :(

Shawn Powers is an Associate Editor for Linux Journal. You might find him chatting on the IRC channel, or Twitter

Loopbacks in schools.

Venomfang's picture

Been there done that.

I do tech work for a private school and have the same issues. Money is tight, and when rooms get built or added on they cheap out on the network infrastructure (cabling and switches).

Also, it doesn't matter if you use STP or not; you still have to deal with the network loopback in that classroom because someone plugged their cat5 cable into 2 ports on the wall.

It really becomes a pain.

Old...

Aaron C. de Bruyn's picture

Damn. Back when I was in school, we had to worry about someone taking the 50 ohm resistor off the end of a BNC cable...

Old-fashioned troubleshooting today? Imagine that.

Technoslick's picture

Bravo. So gratifying to see a demonstration of old-fashioned deductive reasoning used in a real-world situation, where hands-off, hi-tech equipment isn't being utilized. School techs in my rural county also have to deal with inadequate equipment supported by inadequate budgets. The situation isn't going to change soon, if ever. Any of them would embrace your MacGyver way of troubleshooting in keeping their inadequate computer systems running.

loops?

Gareth's picture

Firstly, I would just like to say thank you for your tech tips, which I find intuitive and an excellent resource for the Linux community!

However, with regard to this article in particular, I would not call it "fixing a loopy network"; I would call it delaying the inevitable.

As a previous poster has mentioned, you should be using Spanning Tree Protocol (STP) if you have three or more switches interconnected. In a correctly configured network this situation should never occur.

Out of curiosity how often are you having to do this?

looping?

Anonymous's picture

In a large scale environment, you should be using spanning-tree capable switches in order to prevent L2 loops from bringing down your system.

On the plus side, that means you can have a hot-standby cable (for example, via an alternate path) between your switches, and STP will make sure that only one is in use at any time.
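To picture what "only one is in use" means, here is a toy Python sketch. It is not real STP (no bridge IDs, path costs, or BPDUs): it simply walks the switch graph breadth-first from an assumed root switch and treats every link that is not needed to reach a switch as blocked. The switch names and the `blocked_links` helper are invented for the illustration.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Link = Tuple[str, str]


def blocked_links(links: List[Link], root: str) -> List[Link]:
    """Return the redundant links a loop-free tree would leave unused.

    Toy model only: a breadth-first walk from `root` keeps the first
    link that reaches each switch and reports every other link.
    """
    neighbors: Dict[str, List[Tuple[str, int]]] = {}
    for idx, (a, b) in enumerate(links):
        neighbors.setdefault(a, []).append((b, idx))
        neighbors.setdefault(b, []).append((a, idx))

    visited: Set[str] = {root}
    forwarding: Set[int] = set()
    queue = deque([root])
    while queue:
        switch = queue.popleft()
        for other, idx in neighbors.get(switch, []):
            if other not in visited:
                visited.add(other)
                forwarding.add(idx)  # this link joins the tree
                queue.append(other)

    return [link for idx, link in enumerate(links) if idx not in forwarding]


# Two cables between the core and a closet switch, plus one classroom uplink.
cabling = [("core", "closet"), ("core", "closet"), ("closet", "room12")]
standby = blocked_links(cabling, root="core")
```

In this example the second core-to-closet cable ends up as the standby link; a cable plugged into two ports of the same switch would be reported the same way.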

I feel so sorry for ur

Net Admin's picture

I feel so sorry for ur employer.
