1 [[!meta title="Diskless Debian Etch"]]
3 Caveat: I have much more Linux experience than I had when I wrote this
4 HOWTO. If I had to do it again, I'd probably use [[Gentoo]] instead
5 of [[Debian]], because I find it easier to write custom packages for
6 Gentoo. Anyhow, here's my original HOWTO, preserved for posterity.
12 This HOWTO details the procedure I used to set up the abax cluster for
13 NFS-rooted network booting. The system is useful in this case because
14 it centralizes the installation in the head node (server), which makes
maintaining, upgrading, or altering the computational nodes (clients) easier.
18 This procedure follows mainly Tim Brom's [Microwulf configuration
19 notes][microwulf] with two major differences.
21 * Microwulf uses Ubuntu (gutsy?), and I'm using Debian etch.
* Microwulf has a separate partition for each client's root, populated
  with an independent installation from CD. I'm using a single
  partition for all of my clients, with the base system created using
  `debootstrap`.
27 For guidance in my deviations, I'm indebted to Bart Trojanowski's
28 [pxeboot and nfsroot notes][debian-nfsroot] and Falko Timme's notes on
29 [kernel compilation in Debian Etch][kernel].
35 Our cluster has one server with eight clients. The server has two
network cards, `eth0` and `eth1`. `eth1` is connected to the outside
37 world (WAN). All of the clients have one network card, `eth0`. All
38 of the `eth0`s are connected together through a gigabit switch (LAN).
44 Throughout this HOWTO, I will use `#` as the prompt for root, `$` as
the prompt for an unprivileged user, and `chroot#` as the prompt for
46 a root in a `chroot`ed environment. File contents will be listed with
47 the full path in the text introducing the listing. For example,
52 All files are complete with the exception of lines containing `…`, in
53 which case the meaning of the example should be clear from the
63 Boot the server with the Debian installation kernel following one of
64 the options in the [Debian installation guide][install]. I netbooted
65 my server from one of the client nodes following this procedure to set
66 up the DHCP and TFTP servers on the client and untarring
67 [netboot.tar.gz][] in my `tftpboot` directory. After netbooting from
68 a client, don't forget to take that client down so you won't have
69 DHCP conflicts once you set up a DHCP server on your server.
71 Install Debian in whatever manner seems most appropriate to you. I
72 partitioned my 160 GB drive manually according to
74 <table style="margin-left: auto; margin-right: auto">
75 <tr><th>Mount point</th><th>Type</th><th>Size</th></tr>
76 <tr><td>`/` </td><td>ext3</td><td>280 MB</td></tr>
77 <tr><td>`/usr` </td><td>ext3</td><td>20 GB</td></tr>
78 <tr><td>`/var` </td><td>ext3</td><td>20 GB</td></tr>
79 <tr><td>`/swap` </td><td>swap</td><td>1 GB</td></tr>
80 <tr><td>`/tmp` </td><td>ext3</td><td>5 GB</td></tr>
81 <tr><td>`/diskless`</td><td>ext3</td><td>20 GB</td></tr>
82 <tr><td>`/home` </td><td>ext3</td><td>93.7 GB</td></tr>
85 I went with a highly partitioned drive to ease mounting, since I will
86 be sharing some partitions with my clients. To understand why
87 partitioning is useful, see the [Partition HOWTO][partition].
89 You can install whichever packages you like, but I went with just the
90 standard set (no Desktop, Server, etc.). You can adjust your
91 installation later with any of (not an exhaustive list)
93 * [`tasksel`][tasksel], command line, coarse-grained package control.
94 * `apt-get`, command line, fine-grained package control.
* `aptitude`, curses frontend for APT.
* `synaptic`, GTK+ frontend for APT.
97 * `dpkg`, command line, package-management without dependency checking.
99 The base install is pretty bare, but I don't need a full blown
100 desktop, so I flesh out my system with:
102 # apt-get install xserver-xorg fluxbox fluxconf iceweasel xterm xpdf
    # apt-get install build-essential emacs21-nox
105 which gives me a bare-bones graphical system (fire it up with
106 `startx`) and a bunch of critical build tools (`make`, `gcc`, etc.).
109 Configuring networking
110 ----------------------
We need to set up our server so that `eth1` assumes its appropriate static IP on the WAN, and `eth0` assumes its appropriate static IP on the LAN.
113 We achieve this by changing the default `/etc/network/interfaces` to
115 # This file describes the network interfaces available on your system
116 # and how to activate them. For more information, see interfaces(5).
118 # The loopback network interface
120 iface lo inet loopback
122 allow-hotplug eth0 # start on boot & when plugged in
123 iface eth0 inet static # static LAN interface
124 address 192.168.2.100
125 netmask 255.255.255.0
126 broadcast 192.168.2.255
128 allow-hotplug eth1 # start on boot & when plugged in
129 #iface eth1 inet dhcp # WAN DHCP interface (not used)
130 iface eth1 inet static # WAN static interface
131 address XXX.XXX.YYY.YYY
132 netmask 255.255.128.0
133 broadcast XXX.XXX.127.255
134 gateway XXX.XXX.ZZZ.ZZZ
136 where I've censored our external IPs for privacy. The netmask selects
137 which addresses belong to which networks. The way we've set it up,
138 all 192.168.2.xxx messages will be routed out `eth0`, and everything
else will go through `eth1` to its gateway. See the [Net-HOWTO][] for details.
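
The routing decision described above comes down to a bitwise AND of each address with the netmask. The following is a minimal sketch of that arithmetic in plain shell, assuming the LAN settings from the `interfaces` listing. The `ip_to_int` and `route_for` helpers are made up for illustration and are not part of any Debian tool.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Print the interface a packet for address $1 would leave by,
# assuming the LAN address and netmask configured for eth0 above.
route_for() {
    dest=$(ip_to_int "$1")
    lan=$(ip_to_int 192.168.2.100)
    mask=$(ip_to_int 255.255.255.0)
    if [ $(( dest & mask )) -eq $(( lan & mask )) ]; then
        echo eth0      # same network as the LAN interface
    else
        echo eth1      # everything else goes out via the WAN gateway
    fi
}

route_for 192.168.2.101   # prints eth0
route_for 192.168.3.7     # prints eth1
```

The kernel does this comparison itself when it consults its routing table; the script only makes the rule visible.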
149 The clients will boot remotely using the Pre eXecution Environment
150 (PXE). The boot procedure is
153 2. Client BIOS comes up, detects attached devices, and looks for a
154 DHCP server for advice on network booting.
155 3. DHCP server gives client an IP address, domain name, host name, the
156 IP address of the TFTP server, and the location of the bootloader
158 4. Client gets bootloader from TFTP server.
159 5. BIOS hands over control to bootloader.
160 6. Bootloader gets kernel and initial ramdisk from TFTP server.
7. Bootloader hands over control to kernel.
162 8. Kernel starts up the system, mounting root via NFS.
163 9. … after this point, it's just like a normal boot process.
165 We can see that we need to set up DHCP, TFTP, and NFS servers (not
166 necessarily on the same server, but they are in our case).
170 The [pxe][] bootloader can be obtained with
172 # apt-get install syslinux
174 which installs it to `/usr/lib/syslinux/pxelinux.0` along with a
175 manual and some other `syslinux` tools.
179 Install a server with
181 # apt-get install dhcp
183 Configure the server with `/etc/dhcpd.conf`
185 allow bootp; # maybe?
186 allow booting;# maybe?
187 option domain-name "your.domain.com";
188 option domain-name-servers XXX.XXX.XXX.XXX,YYY.YYY.YYY.YYY;
190 subnet 192.168.2.0 netmask 255.255.255.0 {
191 range 192.168.2.150 192.168.2.200; # non-static IP range
192 option broadcast-address 192.168.2.255;
193 option routers 192.168.2.100; # Gateway server
194 next-server 192.168.2.100; # TFTP server
195 filename "pxelinux.0"; # bootloader
198 hardware ethernet ZZ:ZZ:ZZ:ZZ:ZZ:ZZ;
199 fixed-address 192.168.2.101;
200 option root-path "192.168.2.100:/diskless/n1";
201 option host-name "n1";
203 … more hosts for other client nodes …
206 This assigns the client a static hostname, domain name, and IP address
according to its ethernet address (aka MAC address). It also tells
208 all the clients to ask the TFTP server on 192.168.2.100 for the
209 bootloader `pxelinux.0`. For extra fun, it tells the clients to send
210 packets to the router at 192.168.2.100 if they can't figure out where
211 they should go, and to use particular DNS servers to resolve domain
212 names to IP addresses. This gives them access to the outside WAN. I
213 don't know yet if the booting options are necessary, since I don't
216 We also need to ensure that the DHCP server only binds to `eth0`, since starting a DHCP server on your WAN will make you unpopular with your ISP. You should have the following `/etc/default/dhcp`:
220 Once the DHCP server is configured, you can start it with
222 # /etc/init.d/dhcp restart
224 Check that the server is actually up with
228 and if it is not, look for error messages in
230 # grep -i dhcp /var/log/syslog
235 There are several TFTP server packages. We use `atftpd` here, but
236 `tftp-hpa` is also popular. Install `atftpd` with
238 # apt-get install atftpd xinetd
240 where `xinetd` is a super-server (replacing `inetd`, see `man xinetd`
241 for details). Configure `atftpd` with `/etc/xinetd.d/atftpd`
250 server = /usr/sbin/in.tftpd
251 server_args = --tftpd-timeout 300 --retry-timeout 5 --bind-address 192.168.2.100 --mcast-port 1758 --mcast-addr 239.239.239.0-255 --mcast-ttl 1 --maxthread 100 --logfile /var/log/atftpd.log --verbose=10 /diskless/tftpboot
254 Note that the `server_args` should all be on a single, long line,
255 since I haven't been able to discover if `xinetd` recognizes escaped
256 endlines yet. This configuration tells `xinetd` to provide TFTP
257 services by running `in.tftpd` (the daemon form of `atftpd`) as user
258 `nobody`. Most of the options we pass to `in.tftpd` involve
259 multicasting, which I believe is only used for MTFTP (which
260 `pxelinux.0` doesn't use). `--logfile /var/log/atftpd.log
261 --verbose=10` logs lots of detail to `/var/log/atftpd.log` if it
262 exists. You can create it with
264 # touch /var/log/atftpd.log
    # chown nobody:nogroup /var/log/atftpd.log
267 The most important argument is `/diskless/tftpboot`, which specifies
268 the root of the TFTP-served filesystem (feel free to pick another
269 location if you would like). This is where we'll put all the files
that the TFTP server will be serving. It needs to be read/writable by
271 `nobody`, so create it with
276 (TODO: possibly set the sticky bit, remove writable?)
278 Finally, we need to restart the `xinetd` server so it notices the new
281 # /etc/init.d/xinetd restart
283 Check that the `xinetd` server is up with
285 # ps -e | grep xinetd
287 and look for error messages with
    # grep -i xinetd /var/log/syslog
Just having `xinetd` up cleanly doesn't prove that `atftpd` is working,
though; it just shows that the `atftpd` configuration file wasn't too
293 bungled. To actually test `atftpd` we need to wait until the
294 [Synthesis Section][sec.synthesis] when we actually have files to
299 Install the NFS utilities on the server with
301 # apt-get install nfs-common nfs-kernel-server
303 We go with the kernel server because we want fast NFS, since we'll be
304 doing a lot of it. Set the NFS server up to export the root file
systems and the users' home directories with `/etc/exports`:
307 /diskless/n1 192.168.2.0/24(rw,no_root_squash,sync,no_subtree_check)
308 … other node root exports …
309 /diskless 192.168.2.0/24(rw,no_root_squash,sync,no_subtree_check) # unnecessary
310 /home 192.168.2.0/24(rw,no_root_squash,sync,no_subtree_check)
311 /usr 192.168.2.0/24(rw,no_root_squash,sync,no_subtree_check)
313 Then let the NFS server know we've changed the `exports` file with
315 # exportfs -av # TODO: -r?
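
With eight clients, the per-node root exports elided in the listing above all follow the same pattern. As a sketch, assuming the nodes are named `n1` through `n8` and all share the options shown for `n1`, a small loop can print the lines for review before anything touches `/etc/exports`. The `node_exports` name is hypothetical.

```shell
#!/bin/sh
# Print one root export line per client node n1..nN.
# Assumes every node uses the same LAN subnet and options as n1.
node_exports() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        printf '/diskless/n%d 192.168.2.0/24(rw,no_root_squash,sync,no_subtree_check)\n' "$i"
        i=$((i + 1))
    done
}

node_exports 8
```

Once the output looks right it can be redirected into `/etc/exports` (taking care not to duplicate an existing `n1` line) and activated with `exportfs -av` as above.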
317 Test that the NFS server is working properly by `ssh`ing onto one of
318 the clients and running
320 client# mkdir /mnt/n1
321 client# mount 192.168.2.100:/diskless/n1 /mnt/n1
    … some reasonable contents …
324 client# umount /mnt/n1
325 client# rmdir /mnt/n1
331 The only client setup that actually happens on the client is changing
332 the BIOS boot order to preferentially boot from the network. Consult
333 your motherboard manual for how to accomplish this. It should be
334 simple once you get into the BIOS menu, which you generally do by
335 pressing `del`, `F2`, `F12`, or some such early in your boot process.
336 *Everything else happens on the server*.
340 We want to install a basic Debian setup on our clients. Since each
client doesn't have its own, private partition, we need to install
342 Debian using `debootstrap`.
344 # apt-get install debootstrap
346 # debootstrap --verbose --resolve-deps etch /diskless/n1
347 # chroot /diskless/n1
348 chroot# tasksel install standard
349 chroot# dpkg-reconfigure locales
350 chroot# apt-get install kernel-image-2.6-686 openssh-server nfs-client
TODO: what gets installed with standard?
See `/usr/share/tasksel/debian-tasks.desc` for a list of possible
tasks and [the Debian docs][internals] for details on how a full
installation from CD or netboot works.
357 We can also add a few utilities so we can work in our `chroot`ed environment
359 chroot# apt-get install emacs21-nox
361 ### Configuring `/etc`
363 The client will be getting its hostnames from the DHCP server, so
366 # rm /diskless/n1/etc/hostname
We also need to set up the `fstab` to mount `/home` and `/usr` from the
369 server. In `/diskless/n1/etc/fstab`:
371 # /etc/fstab: static file system information.
373 # <file system> <mount point> <type> <options> <dump> <pass>
374 # automatically mount nfs root and proc through other means
375 192.168.2.100:/home /home nfs defaults,nolock 0 0
376 192.168.2.100:/usr /usr nfs defaults,nolock 0 0
    # we're diskless so we don't need to mount the hard disk sda :)
378 #/dev/sda1 / ext3 defaults,errors=remount-ro 0 1
379 /dev/scd0 /media/cdrom0 udf,iso9660 user,noauto 0 0
380 /dev/fd0 /media/floppy0 auto rw,user,noauto 0 0
385 ### Kernel and initial ramdisk
387 The kernel version number shows up often in this section. You can
388 determine your kernel version number (in my case 2.6.18-6-686) with
389 `uname -r`. Because kernel versions change fairly frequently, I'll
390 use `KERNEL_VERSION` to denote the kernel version string.
392 Your kernel must be compiled with NFS root support if it's going to have an [NFS root][nfsroot].
393 You can determine whether your kernel supports NFS roots with
    # grep 'ROOT_NFS' /diskless/n1/boot/config-KERNEL_VERSION
397 I didn't have it in my default debian etch 2.6.18-6-686 kernel, so I
398 had to recompile my kernel (see the [Kernel Appendix][app.kernel] and
399 [Falko's notes][kernel]). My compiled kernel had a version string
402 Most kernels boot using an [initial ramdisk][initrd] (a compressed
root filesystem that lives in RAM). This ramdisk contains the
404 necessary programs and scripts for booting the kernel. We need to
405 create a ramdisk that can handle an NFS root, so `chroot` into your
406 client filesystem and install some tools
408 chroot# apt-get install initramfs-tools
410 Configure future ramdisks for NFS mounting with
411 `/etc/initramfs-tools/initramfs.conf`:
413 # Configuration file for mkinitramfs(8). See initramfs.conf(5).
415 BOOT=nfs # was BOOT=local
418 Compile a new `initrd` with
420 chroot# update-initramfs -u
422 If you compiled your own kernel as in [Kernel Appendix][app.kernel]
423 after setting up `initramfs.conf`, an appropriate ramdisk should have
424 been created automatically.
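
Because a ramdisk built with `BOOT=local` will not mount an NFS root, it can be worth confirming the setting before rebuilding. This is only a sketch: the `initramfs_boot_mode` helper is hypothetical, and it assumes `initramfs.conf` uses plain `BOOT=value` shell assignments as in the listing above.

```shell
#!/bin/sh
# Print the value of the last uncommented BOOT= line in the given file.
initramfs_boot_mode() {
    sed -n 's/^[[:space:]]*BOOT=\([^[:space:]#]*\).*/\1/p' "$1" | tail -n 1
}

# Demonstration on a scratch copy rather than the real config file.
demo=$(mktemp)
printf '#BOOT=local\nBOOT=nfs # was BOOT=local\n' > "$demo"
initramfs_boot_mode "$demo"   # prints nfs
rm -f "$demo"
```

For the client tree in this HOWTO the file to check would be `/diskless/n1/etc/initramfs-tools/initramfs.conf`.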
426 You can examine the contents of your ramdisk with
428 $ cp /diskless/n1/boot/initrd.img-2.6.18-6-686 initrd.img.gz
429 $ gunzip initrd.img.gz
432 $ cpio -i --make-directories < ../initrd.img
435 <a name="synthesis" />
440 To configure PXE, we need to bring `pxelinux.0` into our new
443 # cp /usr/lib/syslinux/pxelinux.0 /diskless/tftpboot/
445 We also need to bring in our kernel image and initial ramdisk
447 # cd /diskless/tftpboot
448 # ln -s /diskless/n1/boot/initrd.img-2.6.18-custom
449 # ln -s /diskless/n1/boot/vmlinuz-2.6.18-custom
451 `atftpd` handles the symbolic links, but if your TFTP server doesn't,
452 you'll have to copy the image and ramdisk over instead.
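
Since the kernel and ramdisk are symbolic links, a typo in a target path leaves a dangling link that only shows up as a failed transfer later. A quick check like the following can catch that first. It is a sketch with a hypothetical `check_tftp_root` helper, and the file names passed to it are whatever you placed in your own TFTP root.

```shell
#!/bin/sh
# check_tftp_root ROOT FILE...
# Report any boot file missing under ROOT; pxelinux.0 is always checked.
# Returns 0 if everything is present, 1 otherwise.
check_tftp_root() {
    root=$1
    shift
    missing=0
    for f in pxelinux.0 "$@"; do
        # -e follows symbolic links, so a dangling link counts as missing
        if [ ! -e "$root/$f" ]; then
            echo "missing: $root/$f" >&2
            missing=1
        fi
    done
    return $missing
}

# Example call for the layout used in this HOWTO:
#   check_tftp_root /diskless/tftpboot vmlinuz-2.6.18-custom initrd.img-2.6.18-custom
```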
454 At this point you should test your TFTP server with test transfers.
455 Install the atftp client
457 # apt-get install atftp
459 And attempt to transfer the important files.
461 $ atftp 192.168.2.100
463 Connected: 192.168.2.100 port 69
479 tftp> get initrd.img-2.6.18-custom
480 tftp> get vmlinuz-2.6.18-custom
484 -rw-r--r-- 1 sysadmin sysadmin 4297523 2008-05-30 09:27 initrd.img-2.6.18-custom
485 -rw-r--r-- 1 sysadmin sysadmin 13480 2008-05-30 09:26 pxelinux.0
486 -rw-r--r-- 1 sysadmin sysadmin 1423661 2008-05-30 09:27 vmlinuz-2.6.18-custom
489 If this doesn't work, look for errors in `/var/log/syslog` and
490 `/var/log/atftpd.log` and double check your typing in the `atftpd`
493 The last stage is to configure the `pxelinux.0` bootloader. Create a
configuration directory in `tftpboot` with
496 # mkdir /diskless/tftpboot/pxelinux.cfg
When each client loads `pxelinux.0` during the boot, it looks for a
configuration file in `pxelinux.cfg`. The loader runs through a
sequence of possible config file names, as described in
`pxelinux.doc`. We'll have different root directories for each of our
nodes, so we need a separate config for each of them. In order to
make our configs machine-specific, we'll use the ethernet (MAC)
address file-name scheme. That is, for a machine with MAC address
AA:BB:CC:DD:EE:FF, we make the file
`pxelinux.cfg/01-aa-bb-cc-dd-ee-ff`. TODO: base config on IP address.
507 In `/diskless/tftpboot/pxelinux.cfg/01-aa-bb-cc-dd-ee-ff`:
512 kernel vmlinuz-2.6.18-custom
    append root=/dev/nfs initrd=initrd.img-2.6.18-custom nfsroot=192.168.2.100:/diskless/n1,tcp ip=dhcp rw
516 Note that the `append`ed args should all be on a single, long line,
517 since I haven't been able to discover if `pxelinux` recognizes escaped
518 endlines yet. This file is basically like a `grub` or `lilo` config
519 file, and you can get fancy with a whole menu, but since this is a
520 cluster and not a computer lab, we don't need to worry about that.
521 Note that this file was only for our first node (`n1`). You have to
make copies for each of your nodes, with the appropriate file names and `nfsroot` paths.
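
Deriving the per-node file names by hand is error-prone, so here is a small sketch that computes them. `pxe_mac_name` follows the MAC-based scheme described above. `pxe_ip_name` covers the IP-based alternative mentioned in the TODO, on the understanding (from the pxelinux documentation, not tested on this cluster) that the loader also looks for the client's IP address written as eight upper-case hexadecimal digits. Both function names are hypothetical.

```shell
#!/bin/sh
# MAC-based config name: "01-" (ethernet ARP type) plus the MAC address
# in lower case with dashes instead of colons.
pxe_mac_name() {
    printf '01-%s\n' "$(echo "$1" | tr 'A-F' 'a-f' | tr ':' '-')"
}

# IP-based config name: the four octets as upper-case hex.
pxe_ip_name() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

pxe_mac_name AA:BB:CC:DD:EE:FF   # prints 01-aa-bb-cc-dd-ee-ff
pxe_ip_name 192.168.2.101        # prints C0A80265
```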
525 The kernel options are fairly self explanatory except for the `tcp`
526 for the `nfsroot` option, which says the client should mount the root
527 directory using TCP based NFS. Traditional NFS uses UDP, which is
528 faster, but possibly less reliable for large files (like our kernel
529 and initrd). However I'm having trouble tracking down a reliable
530 source for this. For now, consider the `tcp` a voodoo incantation to
531 be attempted if the NFS booting isn't working.
533 You're done! Plug a monitor into one of the clients and power her up.
534 Everything should boot smoothly off the server, without touching the
541 To add a new client node `nX` to the cluster, we need to do the
542 following (which can be combined into an `add-client` script). First,
543 we need to create a root directory for the new client
548 Now we need to export that directory
550 # echo '/diskless/nX 192.168.2.0/24(rw,no_root_squash,sync,no_subtree_check)' >> /etc/exports
553 Finally, we need to set up the booting and DHCP options
    # cd /diskless/tftpboot/pxelinux.cfg
556 # sed 's/\/diskless\/n1/\/diskless\/nX/' 01-xx-xx-xx-xx-xx-xx > 01-yy-yy-yy-yy-yy-yy
558 hardware ethernet YY:YY:YY:YY:YY:YY;
559 fixed-address 192.168.2.10X;
560 option root-path "192.168.2.100:/diskless/nX/";
561 option host-name "nX";
562 }' >> /etc/dhcpd.conf
563 # /etc/init.d/dhcp restart
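
As a starting point for the `add-client` script mentioned above, the steps could be wrapped in one shell function. This is a sketch under several assumptions that are not in the original procedure: the new root is created by cloning `n1` with `cp -a`, the function takes a `PREFIX` argument so it can be tried against a scratch directory first, and the name of `n1`'s pxelinux config is passed in as `TEMPLATE`. It deliberately does not run `exportfs` or restart any service.

```shell
#!/bin/sh
# add_client PREFIX N MAC TEMPLATE
#   PREFIX    "" for the live server, or a scratch directory for a dry run
#   N         node number, the X in nX
#   MAC       client MAC address, e.g. AA:BB:CC:DD:EE:02
#   TEMPLATE  name of n1's existing config file in pxelinux.cfg
add_client() {
    prefix=$1 n=$2 mac=$3 template=$4
    cfgdir="$prefix/diskless/tftpboot/pxelinux.cfg"
    cfg="01-$(echo "$mac" | tr 'A-F' 'a-f' | tr ':' '-')"

    # Root directory. Cloning n1 is an assumption of this sketch.
    cp -a "$prefix/diskless/n1" "$prefix/diskless/n$n" || return 1

    # NFS export for the new root.
    echo "/diskless/n$n 192.168.2.0/24(rw,no_root_squash,sync,no_subtree_check)" \
        >> "$prefix/etc/exports" || return 1

    # Bootloader config: n1's config with the nfsroot path rewritten.
    sed "s|/diskless/n1|/diskless/n$n|" "$cfgdir/$template" \
        > "$cfgdir/$cfg" || return 1

    # DHCP host entry, following the 192.168.2.10X addressing used above.
    cat >> "$prefix/etc/dhcpd.conf" <<EOF
host n$n {
  hardware ethernet $mac;
  fixed-address 192.168.2.$((100 + n));
  option root-path "192.168.2.100:/diskless/n$n";
  option host-name "n$n";
}
EOF
}
```

On the live server this would be called as something like `add_client "" 2 AA:BB:CC:DD:EE:02 01-aa-bb-cc-dd-ee-01` (the MAC addresses are placeholders), followed by `exportfs -av` and `/etc/init.d/dhcp restart`.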
569 <a name="app.kernel" />
574 See [Falko's notes][kernel] for an excellent introduction, and the
[NFS-root mini-HOWTO][nfsroot-mini] for NFS root particulars.
577 First, grab a bunch of useful compilation tools
579 chroot# apt-get install wget bzip2 kernel-package
580 chroot# apt-get install libncurses5-dev fakeroot build-essential initramfs-tools
582 Some of these (e.g. `wget`) should already be installed, but apt-get
583 will realize this, so don't worry about it. Configure `initramfs` for
584 building NFS root-capable initial ramdisks by setting up
585 `/etc/initramfs-tools/initramfs.conf` as explained in the [Kernel
Section][sec.kernel]. For NFS root, your kernel needs the following
591 → Networking support (`NET [=y]`)
593 → TCP/IP networking (`INET [=y]`)
594 → IP: kernel level autoconfiguration (`IP_PNP =y`)
595 `ROOT_NFS` (`NET && NFS_FS=y && IP_PNP`)
597 → Network File Systems
I also used the built-in NFS client instead of the module.
600 Here is a `diff` of the original debian etch conf vs. mine:
602 $ diff /diskless/n1/boot/config-2.6.18-6-686 .config
604 < # Sun Feb 10 22:04:18 2008
606 > # Thu May 29 23:59:47 2008
608 < # CONFIG_IP_PNP is not set
611 > CONFIG_IP_PNP_DHCP=y
612 > CONFIG_IP_PNP_BOOTP=y
613 > CONFIG_IP_PNP_RARP=y
624 < CONFIG_NFS_ACL_SUPPORT=m
626 > CONFIG_NFS_ACL_SUPPORT=y
629 < CONFIG_SUNRPC_GSS=m
630 < CONFIG_RPCSEC_GSS_KRB5=m
633 > CONFIG_SUNRPC_GSS=y
634 > CONFIG_RPCSEC_GSS_KRB5=y
636 < CONFIG_CRYPTO_DES=m
638 > CONFIG_CRYPTO_DES=y
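
Before spending time on a compile, it may help to confirm that the options are really set to `y` rather than `m`, since `ROOT_NFS` needs the NFS client built in. The following sketch greps a kernel config file for a list of symbols. The `require_builtin` helper is hypothetical, and the symbol list in the example call is taken from the requirements and `diff` above.

```shell
#!/bin/sh
# require_builtin CONFIG_FILE SYMBOL...
# Return 0 if every listed symbol is built in (=y) in CONFIG_FILE,
# otherwise report the offenders on stderr and return 1.
require_builtin() {
    config=$1
    shift
    status=0
    for opt in "$@"; do
        if ! grep -q "^$opt=y\$" "$config"; then
            echo "not built in: $opt" >&2
            status=1
        fi
    done
    return $status
}

# Example call from the kernel source directory:
#   require_builtin .config CONFIG_IP_PNP CONFIG_IP_PNP_DHCP CONFIG_NFS_FS CONFIG_ROOT_NFS
```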
640 Compile your shiny, new kernel with
642 chroot# make-kpkg clean
643 chroot# fakeroot make-kpkg --initrd --append-to-version=-custom kernel_image kernel_headers
645 The new kernel packages are in the `src` directory
650 Install the packages with
652 chroot# dpkg -i linux-image-2.6.18-custom_2.6.18-custom-10.00.Custom_i386.deb
658 ### No network devices available
662 IP-Config: No network devices available.
664 messages during the boot (*after* the kernel is successfully loaded!).
665 According to [this post][no-net], the problem is due to a missing
668 So I figured out what card I had:
672 03:03.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet Controller
673 03:04.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet Controller
676 The [ethernet HOWTO][ethernet] claimed that the `e1000` drivers were
677 required for Intel gigabit cards, and indeed I had the e1000 module
loaded on my server:
685 I reconfigured my kernel with (old vs new):
687 diff .config_mod_e1000 .config
689 < # Linux kernel version: 2.6.18-custom
690 < # Fri May 30 00:13:47 2008
692 > # Linux kernel version: 2.6.18
693 > # Fri May 30 22:21:29 2008
699 After which I recompiled and reinstalled the kernel as in the [Kernel
700 Appendix][app.kernel].
702 ### Waiting for `/usr/`
704 On booting a client, I noticed a `Waiting for /usr/: FAILED` message
705 just before entering runlevel 2. I attribute the error to a faulty
boot order on the client not mounting its fstab filesystems before
707 trying to run something in `/usr/`. There don't seem to be any
708 serious side effects though, since the wait times out, and by the time
709 I can log in to the node, `/usr/` is mounted as it should be.
712 [microwulf]: http://www.calvin.edu/~adams/research/microwulf/sys/microwulf_notes.pdf
713 [debian-nfsroot]: http://www.jukie.net/~bart/blog/nfsroot-on-debian
714 [kernel]: http://www.howtoforge.com/kernel_compilation_debian_etch
715 [install]: http://www.debian.org/releases/stable/installmanual
716 [netboot.tar.gz]: http://http.us.debian.org/debian/dists/etch/main/installer-i386/current/images/netboot/netboot.tar.gz
717 [partition]: http://www.tldp.org/HOWTO/text/Partition
718 [tasksel]: http://www.debian.org/releases/stable/i386/apds03.html.en
719 [net-howto]: http://www.faqs.org/docs/Linux-HOWTO/Net-HOWTO.html
720 [pxe]: http://syslinux.zytor.com/pxe.php
721 [internals]: http://d-i.alioth.debian.org/doc/internals/
722 [nfsroot]: http://www.kernel.org/doc/Documentation/filesystems/nfsroot.txt
723 [initrd]: http://www.ibm.com/developerworks/linux/library/l-initrd.html
724 [nfsroot-mini]: http://www.tldp.org/HOWTO/text/NFS-Root
725 [no-net]: http://www.linuxquestions.org/questions/linux-networking-3/ip-config-no-network-device-available-591273/
726 [ethernet]: http://tldp.org/HOWTO/Ethernet-HOWTO-4.html#ss4.24
728 [sec.kernel]: #kernel
729 [sec.synthesis]: #synthesis
730 [app.kernel]: #app.kernel