An Example of Troubleshooting in Arch Linux

Posted by jason on Aug. 13, 2011, 3:13 p.m.
Tags: arch config linux sysadmin

I recently had a rather obscure problem with my Intel 5100 wireless card in Arch Linux. I doubt you will run into the exact same problem--this post is meant to illustrate the thought process (or lack thereof) behind troubleshooting a random issue in Arch Linux.

The Problem

I went about a month without upgrading Arch on my laptop. Last night, I finally did it. Among the key updates was the move from kernel 2.6 to 3.0. After the update and a reboot, my wireless was not working--there was no wlan0 device at all.

The Investigation

It's no secret that Google is my friend. I started broadly, searching for "arch linux network broken after update", and found an Arch forum thread called Lost network connection after update.

Although this thread didn't solve my problem, the issue was similar to mine, and I did glean some important information. In post #1, I learned that "ls /sys/class/net/" shows the network interfaces that currently exist--if an interface isn't there, don't expect it to be anywhere. In post #6, I learned that my issue, like the original poster's, was probably a driver issue, and that a look at "lspci, dmesg.log and error.log" might be useful (duh!).
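The sysfs check from post #1 is easy to wrap in a tiny helper. This is only a sketch: the function name iface_exists is made up for illustration, and the optional second argument exists purely so the check can be pointed at a directory other than /sys/class/net.

```shell
# Succeed if the named network interface has an entry in sysfs.
# $1 = interface name, $2 = directory to look in (defaults to /sys/class/net).
iface_exists() {
    [ -e "${2:-/sys/class/net}/$1" ]
}

# Usage:
#   iface_exists wlan0 && echo "wlan0 is present" || echo "wlan0 is missing"
```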

Now, I have something to go on. I already know what "lspci" will give me:

02:00.0 Network controller: Intel Corporation WiFi Link 5100
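In hindsight, "lspci -k" would have been more telling than plain "lspci", since it also prints a "Kernel driver in use:" line for every device that has a driver bound to it. The helper below is a rough sketch that boils that output down to the network devices. The name net_drivers is hypothetical, and the parsing assumes the usual layout of one unindented line per device followed by indented detail lines.

```shell
# Read `lspci -k` output on stdin and print each network device together
# with the kernel driver bound to it, or "(none)" if no driver is attached.
net_drivers() {
    awk '
        /^[^ \t]/ {                               # unindented line: a new device starts
            if (in_net) print dev ": " drv        # flush the previous network device
            in_net = /Network controller|Ethernet controller/
            dev = $0
            drv = "(none)"
        }
        in_net && /Kernel driver in use:/ {       # indented detail line for that device
            sub(/^.*Kernel driver in use:[ \t]*/, "")
            drv = $0
        }
        END { if (in_net) print dev ": " drv }
    '
}

# Usage:
#   lspci -k | net_drivers
```

A wireless card that shows up with "(none)" points at a module that failed to load rather than at missing hardware.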

Frustratingly, dmesg (and everything.log) didn't really help me, though I may not have been looking for the right things:

dmesg | grep -i fatal
dmesg | grep -i error
dmesg | grep -i 5100
dmesg | grep -i intel

These commands gave me information that got me nowhere. I went back to Google, searched for "arch linux 3", and found an Arch thread called Wireless doesn't work after upgrade to Linux 3. (I think Google takes into account your previous search: how else could this be the #2 search result for such general search terms?) Although the poster has a Broadcom card, the first post makes me realize that my problem probably has to do with loading the proper modules for my wireless card. Looking at my original install notes, I see I've put "During the install, make sure you install iwl5000-ucode." Let's make sure that that package is there:

pacman -Qs iwl5000-ucode

Nothing.

pacman -Ss iwl5000-ucode

Again, nothing. Perhaps I need to do something to account for this package not being available any more? Google: "arch linux no more iwl5000". I end up on the Arch wiki wireless setup page, which talks about the iwl5000 driver: "[the] iwl5000-series chipsets (including 5100BG, 5100ABG, 5100AGN, 5300AGN and 5350AGN) module has been supported since kernel 2.6.27, by the intree driver iwlagn". That would explain why there is no "iwl5000-ucode" package (in hindsight, I'm pretty sure that this package never existed and that my wife put some crack in my Raisin Bran this morning).

Edit: I think the package used to be "iwlwifi-5000-ucode", though I'm not 100% sure on that.

I now know that the iwlagn module needs to be loaded for my wireless card to work. Let's see if it is:

[root@laptop ~]$ lsmod | grep -i iwlagn
[root@laptop ~]$

Nothing there. Let's modprobe it, then:

[root@laptop ~]$ modprobe iwlagn
FATAL: Error inserting iwlagn (/lib/modules/3.0-ARCH/kernel/drivers/net/wireless/iwlwifi/iwlagn.ko.gz): Unknown symbol in module, or unknown parameter (see dmesg)
[root@laptop ~]$ 

Wonderful! I think I've found where the problem is. Let's check dmesg and get our next Google search term:

[root@laptop ~]$ dmesg
...
[ 3245.440995] iwlagn: Unknown parameter `11n_disable50'
[root@laptop ~]$ 
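When more than one option is wrong, it can help to pull the rejected names out of the kernel log in one go instead of reading it by eye. A minimal sketch, assuming the message format shown above; the function name rejected_params is made up for illustration.

```shell
# Read kernel log text on stdin and print the name of every module
# parameter the kernel rejected with an "Unknown parameter" message.
rejected_params() {
    # The single "." skips the opening quote character around the name.
    sed -n 's/.*Unknown parameter .\([A-Za-z0-9_]*\).*/\1/p'
}

# Usage:
#   dmesg | rejected_params
```

The other half of the picture is "modinfo -p iwlagn", which lists the parameters the installed module does accept.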

Back to Google to see if anyone else has the same issue: "arch linux iwlagn unknown parameter 11n_disable50". This search brings me to an Arch bug report about iwlagn not loading. The first comment says to try "depmod -a" and try again. I do it and still get the same error. The last comment is the jackpot:

hmm, wait a minute... this is a configuration error on your side: [...]

The Solution

I have enough knowledge to know that if it's a modprobe configuration error on my side, it's in /etc/modprobe.d. I go into /etc/modprobe.d and find a file called "options.conf" with the following:

options iwlagn 11n_disable=0 11n_disable50=0 swcrypto=1 swcrypto50=1

I have no clue when or why I did this (again, blame the wife for lacing my breakfast cereal with hard drugs). I remove the 11n_disable50 option and try to modprobe again, but now I get the same kind of error for the swcrypto50 parameter. The fix is easy: I take out the swcrypto50 option as well, run "modprobe iwlagn" again--and boom! No output, and no news is good news! Now I run "ls /sys/class/net/" (remember what I learned from the very first page I landed on?) and I see that wlan0 is there. Problem solved.
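If you don't remember which file under /etc/modprobe.d sets options for a module, grep can find it. The helper below is a sketch rather than a polished tool: find_module_options is a made-up name, and the optional second argument is only there so it can be tried on a copy of the directory.

```shell
# Print every "options <module> ..." line found under a modprobe
# configuration directory, prefixed with its file name and line number.
# $1 = module name, $2 = directory to search (defaults to /etc/modprobe.d).
find_module_options() {
    module=$1
    dir=${2:-/etc/modprobe.d}
    grep -rnHE "^[[:space:]]*options[[:space:]]+$module([[:space:]]|\$)" "$dir"
}

# Usage:
#   find_module_options iwlagn
```

In my case, once the two stale parameters were gone, the line left in options.conf was "options iwlagn 11n_disable=0 swcrypto=1".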


1 comment

Oskar #1

I had the exact same problem, thanks for helping me fix it!

I added the 11n_disable* flags because the Intel drivers don't support n-networks very well. In my case they were both enabled though.