Useful kernel and driver performance tweaks for your Linux server

Posted on November 20, 2013. Filed under: Performance

This article covers some kernel and driver tweaks that are interesting and useful. We use several of these in production with excellent results, but you should proceed with caution and do research prior to trying anything listed below.

Tickless System

The tickless kernel feature allows for on-demand timer interrupts. This means that during idle periods, fewer timer interrupts will fire, which should lead to power savings, cooler running systems, and fewer useless context switches.

Kernel option: CONFIG_NO_HZ=y

Timer Frequency

You can select the rate at which timer interrupts in the kernel will fire. When a timer interrupt fires on a CPU, the process running on that CPU is interrupted while the timer interrupt is handled. Reducing the rate at which the timer fires allows for fewer interruptions of your running processes. This option is particularly useful for servers with multiple CPUs where processes are not running interactively.

Kernel options: CONFIG_HZ_100=y and CONFIG_HZ=100

Connector

The connector module is a kernel module which reports process events such as fork, exec, and exit to userland. This is extremely useful for process monitoring. You can build a simple system (or use an existing one like god) to watch mission-critical processes. If a process dies due to a signal (like SIGSEGV or SIGBUS) or exits unexpectedly, you’ll get an asynchronous notification from the kernel. The process can then be restarted by your monitor, keeping downtime to a minimum when unexpected events occur.

Kernel options: CONFIG_CONNECTOR=y and CONFIG_PROC_EVENTS=y

TCP segmentation offload (TSO)

A popular feature among newer NICs is TCP segmentation offload (TSO). This feature allows the kernel to offload the work of dividing large packets into smaller packets to the NIC. This frees up the CPU to do more useful work and reduces the amount of overhead that the CPU passes along the bus. If your NIC supports this feature, you can enable it with ethtool:

[joe@timetobleed]% sudo ethtool -K eth1 tso on

Let’s quickly verify that this worked:

[joe@timetobleed]% sudo ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: on
large receive offload: off

[joe@timetobleed]% dmesg | tail -1
[892528.450378] 0000:04:00.1: eth1: TSO is Enabled

Intel I/OAT DMA Engine

This kernel option enables the Intel I/OAT DMA engine that is present in recent Xeon CPUs. This option increases network throughput as the DMA engine allows the kernel to offload network data copying from the CPU to the DMA engine. This frees up the CPU to do more useful work.

Check to see if it’s enabled:

[joe@timetobleed]% dmesg | grep ioat
ioatdma 0000:00:08.0: setting latency timer to 64
ioatdma 0000:00:08.0: Intel(R) I/OAT DMA Engine found, 4 channels, device version 0x12, driver version 3.64
ioatdma 0000:00:08.0: irq 56 for MSI/MSI-X

There’s also a sysfs interface where you can get some statistics about the DMA engine. Check the directories under /sys/class/dma/.

Kernel options: CONFIG_DMADEVICES=y and CONFIG_INTEL_IOATDMA=y and CONFIG_DMA_ENGINE=y and CONFIG_NET_DMA=y and CONFIG_ASYNC_TX_DMA=y

Direct Cache Access (DCA)

Intel’s I/OAT also includes a feature called Direct Cache Access (DCA). DCA allows a driver to warm a CPU cache. A few NICs support DCA; the most popular (to my knowledge) is the Intel 10GbE driver (ixgbe). Refer to your NIC driver documentation to see if your NIC supports DCA. To enable DCA, a switch in the BIOS must be flipped. Some vendors supply machines that support DCA but don’t expose a switch for it. If that is the case, see my last blog post for how to enable DCA manually.

You can check if DCA is enabled:

[joe@timetobleed]% dmesg | grep dca
dca service started, version 1.8

If DCA is possible on your system but disabled you’ll see:

ioatdma 0000:00:08.0: DCA is disabled in BIOS

Which means you’ll need to enable it in the BIOS or manually.

Kernel option: CONFIG_DCA=y

NAPI

The “New API” (NAPI) is a rework of the packet processing code in the kernel to improve performance for high speed networking. NAPI provides two major features [1]:

Interrupt mitigation: High-speed networking can create thousands of interrupts per second, all of which tell the system something it already knew: it has lots of packets to process. NAPI allows drivers to run with (some) interrupts disabled during times of high traffic, with a corresponding decrease in system load.

Packet throttling: When the system is overwhelmed and must drop packets, it’s better if those packets are disposed of before much effort goes into processing them. NAPI-compliant drivers can often cause packets to be dropped in the network adaptor itself, before the kernel sees them at all.

Many recent NIC drivers automatically support NAPI, so you don’t need to do anything. Some drivers need you to explicitly specify NAPI in the kernel config or on the command line when compiling the driver. If you are unsure, check your driver documentation. A good place to look for docs is in your kernel source under Documentation, available on the web here: http://lxr.linux.no/linux+v2.6.30/Documentation/networking/, but be sure to select the correct kernel version first!

For older e1000 drivers (newer drivers need nothing):

make CFLAGS_EXTRA=-DE1000_NAPI install

Throttle NIC Interrupts

Some drivers allow the user to specify the rate at which the NIC will generate interrupts. The e1000e driver allows you to pass a command line option, InterruptThrottleRate, when loading the module with insmod. For the e1000e there are two dynamic interrupt throttle mechanisms, specified on the command line as 1 (dynamic) and 3 (dynamic conservative). The adaptive algorithm classifies traffic into different classes and adjusts the interrupt rate appropriately. The difference between dynamic and dynamic conservative is the rate for the “Lowest Latency” traffic class: dynamic (1) has a much more aggressive interrupt rate for this traffic class.

As always, check your driver documentation for more information.

When loading the module: insmod e1000e.o InterruptThrottleRate=1

Process and IRQ affinity

Linux allows the user to specify the CPUs to which processes and interrupt handlers are bound.

  • Processes: You can use taskset to specify which CPUs a process can run on
  • Interrupt Handlers: The interrupt map can be found in /proc/interrupts, and the affinity for each interrupt can be set via the smp_affinity file in each interrupt’s directory under /proc/irq/

This is useful because you can pin the interrupt handlers for your NICs to specific CPUs so that when a shared resource is touched (a lock in the network stack) and loaded to a CPU cache, the next time the handler runs, it will be put on the same CPU avoiding costly cache invalidations that can occur if the handler is put on a different CPU.

Moreover, reports [2] show that up to a 24% improvement can be had if processes and the IRQs for the NICs those processes get data from are pinned to the same CPUs. Doing this ensures that the data loaded into the CPU cache by the interrupt handler can be used (without invalidation) by the process; extremely high cache locality is achieved.

oprofile

oprofile is a system-wide profiler that can profile both kernel and application level code. There is a kernel driver for oprofile which collects data from the x86’s Model Specific Registers (MSRs) to give very detailed information about the performance of running code. oprofile can also annotate source code with performance information to make fixing bottlenecks easy. See oprofile’s homepage for more information.

Kernel options: CONFIG_OPROFILE=y and CONFIG_HAVE_OPROFILE=y

epoll

epoll(7) is useful for applications which must watch for events on large numbers of file descriptors. The epoll interface is designed to scale easily to large numbers of file descriptors. epoll is already enabled in most recent kernels, but some strange distributions (which will remain nameless) have this feature disabled.

Kernel option: CONFIG_EPOLL=y

Conclusion

  • There are a lot of useful levers that can be pulled when trying to squeeze every last bit of performance out of your system.
  • It is extremely important to read and understand your hardware documentation if you hope to achieve the maximum throughput your system can achieve.
  • You can find documentation for your kernel online at the Linux LXR. Make sure to select the correct kernel version, because docs change as the source changes!

Thanks for reading and don’t forget to subscribe (via RSS or e-mail) and follow me on twitter.

References

  1. http://www.linuxfoundation.org/en/Net:NAPI
  2. http://software.intel.com/en-us/articles/improved-linux-smp-scaling-user-directed-processor-affinity/
  3. http://timetobleed.com/useful-kernel-and-driver-performance-tweaks-for-your-linux-server/  (Source of this file)

Make ipmitool working for ESXi 5.1

Posted on September 17, 2013. Filed under: Linux, Programming

ESXi 5.1 has no built-in ipmitool command and only limited IPMI capability, so we can build a static ipmitool binary for ESXi 5.1.

1. Log in to ESXi 5.1 and check its libc.so.6 version by:
~# /lib/libc.so.6

Compiled by GNU CC version 4.1.2 20070626 (Red Hat 4.1.2-14).
Compiled on a Linux 2.6.9 system on 2012-03-21.

This means the libc.so.6 was built on Linux with kernel 2.6.9 and gcc 4.1.2, so we’d better set up a Linux environment with the same kernel and gcc version, such as CentOS 4.8.

2. Log in to the Linux environment and build a statically linked ipmitool binary.
2.1 Download the ipmitool source code from SourceForge.
2.2 Compile, adding the “-static” linker option. The resulting binary was not fully statically linked, but it worked; omitting the -static flag caused ipmitool to throw compile errors, so I left it in.
~/src/ipmitool-1.8.11$ ./configure CFLAGS=-m32 LDFLAGS=-static

~/src/ipmitool-1.8.11/src/$ ldd ipmitool
linux-gate.so.1 => (0x009f0000)
libm.so.6 => /lib/libm.so.6 (0x00c80000)
libc.so.6 => /lib/libc.so.6 (0x00ac4000)
/lib/ld-linux.so.2 (0x00a9e000)

2.3 Copy the built binary to the ESXi 5.1 server; it works fine.

References:
http://ipmitool.sourceforge.net/
http://blog.rchapman.org/post/17480234232/configuring-bmc-drac-from-esxi-or-linux


Linux Page Faults

Posted on September 16, 2013. Filed under: Linux, Programming

Page Fault

Four Linux commands to view page fault stats: ps, top, sar, time

  1. A major fault occurs when disk access is required. For example, when you start an application such as Firefox, the kernel first searches physical memory and the CPU cache; if the data is not there, it issues a major page fault and reads the page from disk.
  2. A minor fault occurs when a page can be supplied without disk access, for example on the first write to a newly allocated page.



Cross compiling environment setup for ARM Architecture pidora OS

Posted on September 2, 2013. Filed under: C/C++, Linux

In this article, I’m using Raspberry Pi hardware, Pidora as the 32-bit target OS, Scientific Linux 6.1 64-bit as the host OS, and crosstool-ng as the cross compiling tool. I will give a brief introduction to the steps for building binaries and shared/static libraries, including builds that depend on third party libraries.

Question: Why do we need to set up a cross compiling environment?
The ARM architecture is not powerful enough to build binaries as quickly as we want, so we set up a toolchain build environment on a powerful host OS with an Intel or AMD CPU to build binaries for the target OS running on an ARM CPU.

    1. After installing Pidora on the Raspberry Pi, collect the following information from it:

kernel version by uname -a

gcc version by gcc --version

glibc version by run /lib/libc.so.6 directly

    2. Download and install crosstool-ng on the host OS

login as root to the host OS, and then:
wget http://crosstool-ng.org/download/crosstool-ng/crosstool-ng-1.18.0.tar.bz2
tar xjvf crosstool-ng-1.18.0.tar.bz2
yum install gperf.x86_64 texinfo.x86_64 gmp-devel.x86_64

./configure --prefix=/usr/local/
make
make install

unset LD_LIBRARY_PATH (or “export -n LD_LIBRARY_PATH”; crosstool-ng may pick up this environment variable, and keeping the host value set can break the target compile)
and add the destination address to PATH:
export PATH=/usr/local/x-tools/crosstool-ng/bin:$PATH (then you can use “ct-ng”)
mkdir ~/pi-staging
cd ~/pi-staging
ct-ng menuconfig

In the configuration, select the relevant settings for the host and target OS, such as:
In the target options, set the target architecture to “ARM”, “Endianness set” to “Little endian”, and “Bitness” to “32-bit”

      • enable “EXPERIMENTAL” features in “Paths and misc”, and leave the prefix directory as “${HOME}/x-tools/${CT_TARGET}”
      • in “Operating System”, set “Target OS” to “linux”, and “Linux Kernel” to the ARM Pidora kernel version
      • Set “binutils” version to “2.21.1a”
      • Set “C Compiler” version to what ARM Pidora is using, and enable “C++” if needed
      • Set “glibc” or “eglibc” or “uClibc” version to what ARM Pidora is using (get this information from /lib/libc.so.6)

Then:
ct-ng build

This builds the basic environment of the target OS and may take 1 to 2 hours. After the build, you can find the binaries under “~/x-tools/arm-unknown-linux-gnueabi/bin/”; we need to add this directory to PATH. The following binaries are provided:
arm-unknown-linux-gnueabi-gcc
arm-unknown-linux-gnueabi-arm

    3. After building crosstool-ng, there is a “sysroot” directory under “x-tools/arm-unknown-linux-gnueabi/arm-unknown-linux-gnueabi/sysroot”; this is described in crosstool-ng’s documentation (“crosstool-ng/docs/”). When we build a big project with multiple libraries (static or dynamic) and binaries, the project may depend on more third party libraries, so the build environment must support deploying third party libraries. The straightforward way is to include the source code of all third party libraries in the project, but that costs a lot of debugging and build time; the simple way is to set up a sysroot (like a chroot) and “install”/unpack the third party libraries into the sysroot directory
        1. We’re not going to use the built-in sysroot directory; we will set up a new sysroot directory, so that multiple projects can each use their own sysroot and screwing one up won’t affect the other projects
        2. Run the following command to create a new sysroot:

      cp -r ~/x-tools/arm-unknown-linux-gnueabi/arm-unknown-linux-gnueabi/sysroot ~/my-sysroot

        3. Whenever you need a third party library to build a project, you can download the rpm from http://downloads.raspberrypi.org/pidora/releases/18/packages/armv6hl/os/Packages

      For example, if I need libcurl-devel for my project, I can download the rpm and unpack the package into my-sysroot:
      cd ~/my-sysroot/
      rpm2cpio libcurl-devel-7.27.0-6.fc18.armv6hl.rpm | cpio -idmv

    4. Start cross compiling your project

Lots of our projects use autoconf or automake, so we may use “./configure” and “make” during project compilation.
Set CC to the cross-compiler for the target you configured the library for; it is important to use this same CC value when running configure, like this: ‘CC=target-gcc configure target’. Set BUILD_CC to the compiler to use for programs run on the build system as part of compiling the library. You may need to set AR to a cross-compiling version of ar if the native tools are not configured to work with object files for the target you configured for. For example:

CC=arm-unknown-linux-gnueabi-gcc AR=arm-unknown-linux-gnueabi-ar ./configure

4.1 Once the Makefile is generated, check that it is using the gcc and ar versions you want,
and if there is an architecture parameter in the Makefile, check that it uses “armv6” instead of “i386” or “x86_64”

4.2 If header files are needed, check that CFLAGS points to the header files in your ~/my-sysroot directory, not the host system header files

4.3 Some binaries or libraries need third party libraries to link, so add the lib directory of your sysroot to CFLAGS and LDFLAGS as well, like:

CFLAGS+=-B/root/my-sysroot/usr/lib

Sometimes the make process cannot find a library even when you give the path where it exists; then you may need to add the following to the Makefile:
LDFLAGS+=-Wl,-rpath,/root/my-sysroot/lib

  5. Troubleshooting
    5.1 If you want to see the build process, add “-v” to “CFLAGS”

    5.2 cannot find crti.o: No such file or directory

    Add the sysroot lib directory “-B/root/my-sysroot/usr/lib” to “CFLAGS”

    5.3 arm-unknown-linux-gnueabi/bin/ld: cannot find /lib/libpthread.so.0
    arm-unknown-linux-gnueabi/bin/ld: skipping incompatible /usr/lib/libpthread_nonshared.a when searching for /usr/lib/libpthread_nonshared.a
    arm-unknown-linux-gnueabi/bin/ld: cannot find /usr/lib/libpthread_nonshared.a

    This requires editing files in my-sysroot to remove the absolute paths:
    5.3.1 cd ~/my-sysroot/usr/lib/
    vi libpthread.so

    change “GROUP ( /lib/libpthread.so.0 /usr/lib/libpthread_nonshared.a )”
    to:
    “GROUP ( libpthread.so.0 libpthread_nonshared.a )”

    5.3.2 This solution applies to the libc.so file for cross-compiling as well

    5.4 arm-unknown-linux-gnueabi/bin/ld: cannot find -lboost_regex
    Download and install boost-static from Pidora into the sysroot, which includes the static boost libraries.
    And adding the sysroot lib directory to LDFLAGS such as:
    LDFLAGS+=-Wl,-rpath,/root/my-sysroot/lib

    This method applies to libpthread.so.0, librt.so.1, and libresolv.so.2 “not found” issues as well

    5.5 arm-unknown-linux-gnueabi/bin/ld: warning: ld-linux.so.3, needed by my-sysroot//usr/lib/libz.so, not found (try using -rpath or -rpath-link)

    This is because ld-linux.so.3 in your sysroot points to the host OS one; we need to relink it to the target sysroot lib file:
    Make sure you unpack the Pidora glibc library into your sysroot (rpm2cpio glibc-2.16-28.fc18.a6.armv6hl.rpm | cpio -idmv),
    then:
    cd ~/my-sysroot/lib/
    ls -l ld-linux.so.3
    rm ld-linux.so.3
    ln -s ld-linux-armhf.so.3 ld-linux.so.3

    5.6 Weird link problems
    5.6.1 Sometimes, even if you provide all the static or dynamic library directories from your sysroot for the binary or library to link against, it still fails; try switching the order of the third party libraries in LDFLAGS, such as changing “-lz -lcrypto” to “-lcrypto -lz”
    5.6.2 In function `boost::iostreams::detail::bzip2_base::compress(int)': undefined reference to `BZ2_bzCompress'
    In this case, your Makefile may be trying to link static libraries into your binary; try linking against the dynamic library instead of the static one.

  • References:

http://crosstool-ng.org/
http://jecxjo.motd.org/code/blosxom.cgi/devel/cross_compiler_environment.html
http://fedoraproject.org/wiki/Raspberry_Pi
http://downloads.raspberrypi.org/pidora/releases/18/packages/armv6hl/os/Packages
http://linux.die.net/man/1/yumdownloader
http://smdaudhilbe.wordpress.com/2013/04/26/steps-to-create-cross-compiling-toolchain-using-crosstool-ng/
http://briolidz.wordpress.com/2012/02/07/building-embedded-arm-systems-with-crosstool-ng/
http://www.kitware.com/blog/home/post/426
http://stackoverflow.com/questions/6329887/compiling-problems-cannot-find-crt1-o

http://www.raspberrypi.org/phpBB3/viewtopic.php?f=33&t=37658


Linux performance related urls

Posted on August 28, 2013. Filed under: Linux

1. Red Hat RHEL 6 Performance Tuning Guide

2. Xen Network Throughput and Performance Guide


Major and Minor Numbers

Posted on January 18, 2010. Filed under: Linux, Services

One of the basic features of the Linux kernel is that it abstracts the handling of devices. All hardware devices look like regular files; they can be opened, closed, read and written using the same, standard, system calls that are used to manipulate files. Every device in the system is represented by a file. For block (disk) and character devices, these device files are created by the mknod command and they describe the device using major and minor device numbers. Network devices are also represented by device special files but they are created by Linux as it finds and initializes the network controllers in the system.

To UNIX, everything is a file. To write to the hard disk, you write to a file. To read from the keyboard is to read from a file. To store backups on a tape device is to write to a file. Even to read from memory is to read from a file. If the file from which you are trying to read or to which you are trying to write is a “normal” file, the process is fairly easy to understand: the file is opened and you read or write data. If, however, the device you want to access is a special device file (also referred to as a device node), a fair bit of work needs to be done before the read or write operation can begin.

One key aspect of understanding device files lies in the fact that different devices behave and react differently. There are no keys on a hard disk and no sectors on a keyboard, though you can read from both. The system, therefore, needs a mechanism whereby it can distinguish between the various types of devices and behave accordingly.

To access a device accordingly, the operating system must be told what to do. Obviously, the manner in which the kernel accesses a hard disk will be different from the way it accesses a terminal. Both can be read from and written to, but that’s about where the similarities end. To access each of these totally different kinds of devices, the kernel needs to know that they are, in fact, different.

Inside the kernel are functions for each of the devices the kernel is going to access. All the routines for a specific device are jointly referred to as the device driver. Each device on the system has its own device driver. Within each device driver are the functions that are used to access the device. For devices such as a hard disk or terminal, the system needs to be able to (among other things) open the device, write to the device, read from the device, and close the device. Therefore, the respective drivers will contain the routines needed to open, write to, read from, and close (among other things) those devices.

The kernel needs to be told how to access the device. Not only does the kernel need to be told what kind of device is being accessed but also any special information, such as the partition number if it’s a hard disk or density if it’s a floppy, for example. This is accomplished by the major number and minor number of that device.

All devices controlled by the same device driver have a common major device number. The minor device numbers are used to distinguish between different devices and their controllers. Linux maps the device special file passed in system calls (say to mount a file system on a block device) to the device’s device driver using the major device number and a number of system tables, for example the character device table, chrdevs. The major number is actually the offset into the kernel’s device driver table, which tells the kernel what kind of device it is (whether it is a hard disk or a serial terminal). The minor number tells the kernel special characteristics of the device to be accessed. For example, the second hard disk has a different minor number than the first. The COM1 port has a different minor number than the COM2 port, each partition on the primary IDE disk has a different minor device number, and so forth. So, for example, /dev/hda2, the second partition of the primary IDE disk has a major number of 3 and a minor number of 2.

It is through this table that the routines are accessed that, in turn, access the physical hardware. Once the kernel has determined what kind of device to which it is talking, it determines the specific device, the specific location, or other characteristics of the device by means of the minor number.

The major number for the hd (IDE) driver is hard-coded at 3. The minor numbers have the format

(<unit>*64)+<part>

where <unit> is the IDE drive number on the first controller, either 0 or 1, which is then multiplied by 64. That means that all hd devices on the first IDE drive have a minor number less than 64. <part> is the partition number, which can be anything from 1 to 20. Which minor numbers you will be able to access will depend on how many partitions you have and what kind they are (extended, logical, etc.). The minor number of the device node that represents the whole disk is 0. This has the node name hda, whereas the other device nodes have a name equal to their minor number (i.e., /dev/hda6 has a minor number 6).

If you were to have a second IDE drive on the first controller, the unit number would be 1. Therefore, all of the minor numbers would be 64 or greater. The minor number of the device node representing the whole disk is 64. This has the node name hdb, whereas the other device nodes have a name number equal to their minor number minus 64 (i.e., /dev/hdb6 has a minor number of 70).

If you have more than one IDE controller, the principle is the same. The only difference is that the major number is 22.

For SCSI devices, the scheme is a little different. When you have a SCSI host adapter, you can have up to seven hard disks. Therefore, we need a different way to refer to the partitions. In general, the format of the device nodes is

sd<drive><partition>

where sd refers to the SCSI disk driver, <drive> is a letter for the physical drive, and <partition> is the partition number. As with the hd devices, a name without a partition number refers to the entire disk; for example, the device sda refers to the whole first disk.

The major number for all SCSI drives is 8. The minor number is based on the drive number, which is multiplied by 16 instead of 64, like the IDE drives. The partition number is then added to this number to give the minor.

The partition numbers are not as simple to figure out. Partition 0 is for the whole disk (i.e., sda). The four DOS primary partitions are numbered 1 to 4. Then the extended partitions are numbered 5 to 8. We then add 16 for each drive. For example:

brw-rw----   1 root     disk       8,  22 Sep 12  1994
/dev/sdb6

Because b is after the sd, we know that this is on the second drive. Subtracting 16 from the minor, we get 6, which matches the partition number. Because it is between 4 and 8, we know that this is on an extended partition. This is the second partition on the first extended partition.

The floppy devices have an even more peculiar way of assigning minor numbers. The major number is fairly easy: it’s 2. Because the names are a little easier to figure out, let’s start with them. As you might guess, the device names all begin with fd. The general format is

fd<drive><density><capacity>

where <drive> is the drive number (0 for A:, 1 for B:), <density> is the density (d-double, h-high), and <capacity> is the capacity of the drive (360Kb, 1440Kb). You can also tell the size of the drive by the density letter: a lowercase letter indicates a 5.25″ drive and an uppercase letter indicates a 3.5″ drive. For example, a low-density 5.25″ drive with a capacity of 360Kb would look like

fd0d360

If your second drive was a high-density 3.5″ drive with a capacity of 1440Kb, the device would look like

fd1H1440

What the minor numbers represent is a fairly complicated matter. In the fd(4) man-page there is an explanatory table, but it is not obvious from the table why specific minor numbers go with each device. The problem is that there is no logical progression as with the hard disks. For example, there was never a 3.5″ drive with a capacity of 1200Kb, nor has there been a 5.25″ drive with a capacity of 1.44Mb. So you will never find a device with H1200 or h1440. To figure out the device names, the best thing is to look at the man-page.

The terminal devices come in a few forms. The first is the system console, which are the devices tty0-tty?. You can have up to 64 virtual terminals on your system console, although most systems that I have seen are limited to five or six. All console terminals have a major number of 4. As we discussed earlier, you can reach the low numbered ones with ALT-Fn, where n is the number of the function key. So ALT-F4 gets you to the fourth virtual console. Both the minor number and the tty number are based on the function key, so /dev/tty4 has a minor number of 4 and you get to it with ALT-F4. (Check the console(4) man-page to see how to use and get to the other virtual terminals.)

Serial devices can also have terminals hooked up to them. These terminals normally use the devices /dev/ttySn, where n is the number of the serial port (0, 1, 2, etc.). These also have a major number of 4, but their minor numbers start at 64. (That’s why you can only have 63 virtual consoles.) The minor number is this base of 64 plus the serial port number. Therefore, the minor number of the third serial port would be 64+3=67.

Related to these devices are the modem control devices, which are used to access modems. These have the same minor numbers but have a major number of 5. The names are also based on the device number. Therefore, the modem device attached to the third serial port has a minor number of 64+3=67 and its name is cua3.

Another device with a major number of 5 is /dev/tty, which has a minor number of 0. This is a special device, referred to as the “controlling terminal.” This is the terminal device for the currently running process. Therefore, no matter where you are and no matter what you are running, sending something to /dev/tty will always appear on your screen.

The pseudo-terminals (those that you use with network connections or X) actually come in pairs. The “slave” is the device at which you type and has a name like ttyp?, where ? is the tty number. The device that the process sees is the “master” and has a name like ptyn, where n is the device number. These also have a major number of 4. However, the master devices all have minor numbers based on 128. Therefore, pty0 has a minor number of 128 and pty9 has a minor number of 137 (128+9). The slave devices have minor numbers based on 192, so the slave device ttyp0 has a minor number of 192. Note that the tty numbers do not increase numerically after 9 but use the letters a-f.

Other oddities with the device numbering and naming scheme are the memory devices. These have a major number of 1. For example, the device to access physical memory is /dev/mem with a minor number of 1. The kernel memory device is /dev/kmem, and it has a minor number of 2. The device used to access IO ports is /dev/port and it has a minor number of 4.

What about minor number 3? This is for device /dev/null, which is nothing. If you direct output to this device, it goes into nothing, or just disappears. Often the error output of a command is directed here, as the errors generated are uninteresting. If you redirect from /dev/null, you get nothing as well. Often I do something like this:

cat /dev/null > file_name

This device is also used a lot in shell scripts where you do not want to see any output or error messages. You can then redirect standard out or standard error to /dev/null and the messages disappear. For details on standard out and error, see the section on redirection.

If file_name doesn’t exist yet, it is created with a length of zero. If it does exist, the file is truncated to 0 bytes.

The device /dev/zero has a major number of 1 and a minor number of 5. It behaves similarly to /dev/null in that redirecting output to this device makes the data disappear. However, if you direct input from this device, you get an unending stream of zeroes: not the character “0”, which has an ASCII value of 48, but an ASCII NUL (value 0).

Are those all the devices? Unfortunately not. However, I hope this has given you a start on how device names and minor numbers are configured. The file <linux/major.h> contains a list of the currently used (at least the well-known) major numbers. A nonstandard package might add a major number of its own, but up to this point packages have been fairly good about not stomping on the existing major numbers.

As far as the minor numbers go, check the various man pages. If there is a man page for a specific device, it will probably be listed under the name of the driver, which is often the first few letters of the device name. For example, the parallel (printer) devices are lp?, so check out man lp.

The best overview of all the major and minor numbers is in the /usr/src/linux/Documentation directory; the file devices.txt there is considered the “authoritative” source for this information.

References:

http://www.linux-tutorial.info/modules.php?name=MContent&pageid=94

http://www.linux-tutorial.info/modules.php?name=MContent&obj=glossary&term=man-page

http://www.kernel.org/pub/linux/docs/device-list/devices.txt

http://publib.boulder.ibm.com/infocenter/dsichelp/ds8000ic/index.jsp?topic=/com.ibm.storage.ssic.help.doc/f2c_linuxdevnaming_2hsag8.html

http://docsrv.sco.com/HDK_concepts/ddT_majmin.html

http://burks.brighton.ac.uk/burks/linux/rute/node18.htm

http://www.makelinux.info/ldd3/chp-3-sect-2.shtml

http://www.thegeekstuff.com/2009/06/how-to-identify-major-and-minor-number-in-linux/


Linux Network Setup/Monitor commands

Posted on November 30, 2009. Filed under: Linux | Tags: , , , , |

1. Check device status:

lspci (see also: dmidecode, lsusb, lshw)

reference: http://ubuntuforums.org/showthread.php?t=320995

2. Check the network link status:

mii-tool (must be run as root)

ip link

References:

http://www.linuxforums.org/forum/redhat-fedora-linux-help/129167-how-detect-hardware-add-new-one.html
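If mii-tool is not installed, the kernel exposes the same per-interface state under /sys/class/net; a small sketch (interface names will vary by machine):

```shell
#!/bin/bash
# Print each network interface with its operational state (up/down/unknown).
for dev in /sys/class/net/*; do
    printf '%-12s %s\n' "$(basename "$dev")" "$(cat "$dev/operstate")"
done
```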

3. Change the NIC interface naming/order:

nameif

(hwinfo)

/lib/udev/rename_netiface <old> <new>

References:

http://www.centos.org/modules/newbb/viewtopic.php?topic_id=10434

http://www.cyberciti.biz/faq/rhel-fedora-centos-howto-swap-ethernet-aliases/

http://www.linuxforums.org/forum/suse-linux-help/92280-how-change-ethx-ethy.html
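On udev-based systems you can also pin an interface name to a MAC address with a rules file; a sketch (the file name, MAC address, and interface name below are illustrative, so check your distribution's conventions):

```
# /etc/udev/rules.d/70-persistent-net.rules (illustrative)
# Match the NIC by MAC address and force its interface name.
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="00:11:22:33:44:55", NAME="eth0"
```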

4. iftop

iftop is a command-line system monitor tool that produces a frequently updated list of network connections. It is a good console bandwidth visualization tool that shows active connections, where they are going to and from, and how much of your precious bandwidth they are using. By default, the connections are ordered by bandwidth usage, with only the “top” bandwidth consumers shown.

Reference:

http://en.wikipedia.org/wiki/Iftop

5. iptraf

IPTraf is a console-based network statistics utility for Linux. It gathers a variety of figures such as TCP connection packet and byte counts, interface statistics and activity indicators, TCP/UDP traffic breakdowns, and LAN station packet and byte counts.

http://iptraf.seul.org/


BASH Shell: For Loop File Names With Spaces

Posted on November 20, 2009. Filed under: Linux, Mac, Shell | Tags: , , , , , , |

A BASH for loop works nicely under UNIX / Linux / Windows and OS X while working on a set of files. However, if you try to run a for loop over file names with spaces in them, you are going to have problems. The for loop uses the $IFS variable to determine what the field separators are, and by default $IFS includes the space character. There are multiple solutions to this problem.

Set $IFS variable

Try it as follows:

#!/bin/bash
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
for f in *
do
  echo "$f"
done
IFS=$SAVEIFS

OR

#!/bin/bash
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
# set me
FILES=/data/*
for f in $FILES
do
  echo "$f"
done
# restore $IFS
IFS=$SAVEIFS
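Note that a quoted glob loop already handles spaces without touching $IFS at all, because each glob match expands as a single word:

```shell
#!/bin/bash
# Each glob match is one word regardless of embedded spaces,
# so "$f" stays intact without any $IFS juggling.
for f in ./*; do
  printf '%s\n' "$f"
done
```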

More examples using $IFS and while loop

Now you know that you can set IFS whenever the field delimiters are not whitespace. For example, a while loop can read all the fields of the /etc/passwd file:

....
while IFS=: read userName passWord userID groupID geCos homeDir userShell
do
      echo "$userName -> $homeDir"
done < /etc/passwd

Using old good find command to process file names

To process the output of find with a command, try as follows:

find . -print0 | while IFS= read -r -d '' file
do
  echo "$file"
done

Try to copy files to /tmp with spaces in a filename using find command and shell pipes:

find . -print0 | while IFS= read -r -d '' file; do cp -v "$file" /tmp; done
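If the loop body is a single command, find can also run it directly with -exec, which sidesteps the shell's word splitting entirely:

```shell
# -exec passes each found path as one argument, so spaces are safe.
find . -type f -exec cp -v {} /tmp \;

# With GNU cp, -t lets find batch many files into one cp invocation.
find . -type f -exec cp -vt /tmp {} +
```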

Processing filenames using an array

Sometimes you need to read a file into an array, one array element per line. The following script reads file names into an array so you can process each file in a for loop. This is useful for complex tasks:

#!/bin/bash
DIR="$1"

# failsafe - fall back to current directory
[ "$DIR" == "" ] && DIR="." 

# save and change IFS
OLDIFS=$IFS
IFS=$'\n' 

# read all file name into an array
fileArray=($(find "$DIR" -type f))

# restore it
IFS=$OLDIFS

# get length of an array
tLen=${#fileArray[@]}

# use for loop read all filenames
for (( i=0; i<tLen; i++ ))
do
  echo "${fileArray[$i]}"
done
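On bash 4 and later, mapfile (also spelled readarray) reads the lines into an array without any $IFS juggling; a minimal sketch:

```shell
#!/bin/bash
DIR="${1:-.}"

# Read one file name per line into the array; -t strips the trailing newlines.
mapfile -t fileArray < <(find "$DIR" -type f)

for f in "${fileArray[@]}"; do
  printf '%s\n' "$f"
done
```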

Playing mp3s with spaces in file names

Place following code in your ~/.bashrc file:

mp3(){
	local o=$IFS
	IFS=$(echo -en "\n\b")
	/usr/bin/beep-media-player "$(cat "$@")" &
	IFS=$o
}

Keep list of all mp3s in a text file such as follows (~/eng.mp3.txt):

/nas/english/Adriano Celentano - Susanna.mp3
/nas/english/Nick Cave & Kylie Minogue - Where The Wild Roses Grow.mp3
/nas/english/Roberta Flack - Killing Me Softly With This Song.mp3
/nas/english/The Beatles - Girl.mp3
/nas/english/John Lennon - Stand By Me.mp3
/nas/english/The Seatbelts, Cowboy Bebop - 01-Tank.mp3

To play just type:
$ mp3 eng.mp3.txt

Another example about IFS:

#!/bin/bash

IFS=$'\n'
set -- $(cat my.file)

# Now the lines are stored in $1, $2, $3, …

echo $1
echo $2
echo $3
echo $4
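Made runnable (the file name and its contents here are illustrative), the same idea looks like this:

```shell
#!/bin/bash
# Sample input: one record per line, some containing spaces.
printf 'alpha\nbeta two\ngamma\n' > /tmp/my.file

IFS=$'\n'                  # split the $(...) expansion on newlines only
set -f                     # also disable globbing in case a line holds * or ?
set -- $(cat /tmp/my.file)
set +f
unset IFS

echo "$2"                  # prints: beta two
```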

Reference:

http://www.cyberciti.biz/tips/handling-filenames-with-spaces-in-bash.html


Useful linux commands: null,zero,full,lsof

Posted on November 1, 2009. Filed under: Linux | Tags: , , , |

Linux / Unix Command: null

NAME

null, zero – data sink

DESCRIPTION

Data written on a null or zero special file is discarded.

Reads from the null special file always return end of file, whereas reads from zero always return NUL ('\0') characters.

null and zero are typically created by:

mknod -m 666 /dev/null c 1 3
mknod -m 666 /dev/zero c 1 5
chown root:mem /dev/null /dev/zero
chmod 666 /dev/null /dev/zero

References:

http://linux.about.com/library/cmd/blcmdl4_null.htm

DESCRIPTION
File /dev/full has major device number 1 and minor device number 7.

Writes to the /dev/full device will fail with an ENOSPC error. This can be used to test how a program handles disk-full errors.

Reads from the /dev/full device will return NUL ('\0') characters.

Seeks on /dev/full will always succeed.
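A quick demonstration from the shell (requires a Linux /dev/full):

```shell
# Writes fail with ENOSPC ("No space left on device")...
echo hello > /dev/full
echo "exit status: $?"                 # non-zero

# ...while reads behave like /dev/zero and return NUL bytes.
head -c 4 /dev/full | od -An -tx1
```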

CONFIGURING
If your system does not have /dev/full created already, it can be created with the following commands:

mknod -m 666 /dev/full c 1 7
chown root:root /dev/full

Reference:

man full

lsof

lsof is a command meaning “list open files”, which is used in many Unix-like systems to report a list of all open files and the processes that opened them. This open-source utility was developed and supported by Vic Abell, the retired Associate Director of the Purdue University Computing Center. It works on and supports several UNIX flavors.

Open files in the system include disk files, pipes, network sockets and devices opened by all processes. One use for this command is when a disk cannot be unmounted because (unspecified) files are in use. The listing of open files can be consulted (suitably filtered if necessary) to identify the process that is using the files.

# lsof /var
COMMAND     PID     USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
syslogd     350     root    5w  VREG  222,5        0 440818 /var/adm/messages
syslogd     350     root    6w  VREG  222,5   339098   6248 /var/log/syslog
cron        353     root  cwd   VDIR  222,5      512 254550 /var -- atjobs

To view the Port associated with a daemon :

 # lsof -i -n -P | grep sendmail
 sendmail  31649    root    4u  IPv4 521738       TCP *:25 (LISTEN)

From the above we can see that “sendmail” is listening on its standard port of “25”.

  • -i Lists IP sockets.
  • -n Do not resolve hostnames (no DNS).
  • -P Do not resolve port names (list port number instead of its name).

Reference:

http://en.wikipedia.org/wiki/Lsof

man lsof


Useful Linux commands: Screen, ttyload, mytop, mtop

Posted on October 23, 2009. Filed under: Linux, MySQL, Shell | Tags: , , , , , |

1. Screen

Screen is a full-screen window manager that multiplexes a physical terminal between several processes (typically interactive shells). The same way tabbed browsing revolutionized the web experience, GNU Screen can do the same for your experience on the command line. Instead of opening up several terminal instances on your desktop or using those ugly GNOME/KDE-based tabs, Screen can do it better and simpler. Not only that, with GNU Screen you can share sessions with others and detach/attach terminal sessions. It is a great tool for people who have to share working environments between work and home.

References:

http://www.cyberciti.biz/tips/how-to-use-screen-command-under-linux.html

http://linux.die.net/man/1/screen

http://en.wikipedia.org/wiki/GNU_Screen

2. ttyload

ttyload is a little *NIX utility meant to give a color-coded graph of load averages over time.

Reference:

http://www.daveltd.com/src/util/ttyload/

3. mtop/mkill

mtop (MySQL top) monitors a MySQL server showing the queries which are taking the most amount of time to complete. Features include ‘zooming’ in on a process to show the complete query, ‘explaining’ the query optimizer information for a query and ‘killing’ queries. In addition, server performance statistics, configuration information, and tuning tips are provided.

mkill (MySQL kill) monitors a MySQL server for long running queries and kills them after a specified time interval. Queries can be selected based on regexes on the user, host, command, database, state and query.

Reference:

http://mtop.sourceforge.net/

http://astguiclient.sourceforge.net/vicidial.html

http://www.voip-info.org/wiki/view/VICIDIAL+Dialer

4. mytop

mytop is a console-based (non-gui) tool for monitoring the threads and overall performance of a MySQL 3.22.x, 3.23.x, and 4.x server. It runs on most Unix systems (including Mac OS X) which have Perl, DBI, and Term::ReadKey installed. And with Term::ANSIColor installed you even get color.

Reference:

http://jeremy.zawodny.com/mysql/mytop/


Useful Linux tools: sipsak,Balance,eAccelerator,ploticus

Posted on October 20, 2009. Filed under: Linux, Shell | Tags: , , , , , |

1. sipsak

sipsak is a small command line tool for developers and administrators of Session Initiation Protocol (SIP) applications. It can be used for some simple tests on SIP applications and devices.

Reference:

http://sipsak.org/

2.Balance

Balance is a simple but powerful generic TCP proxy with round-robin load balancing and failover mechanisms. Its behaviour can be controlled at runtime using a simple command line syntax.

Reference:

http://www.inlab.de/balance.html

3.eAccelerator

eAccelerator is a free open-source PHP accelerator & optimizer. It increases the performance of PHP scripts by caching them in their compiled state, so that the overhead of compiling is almost completely eliminated. It also optimizes scripts to speed up their execution. eAccelerator typically reduces server load and increases the speed of your PHP code by 1-10 times.

Reference:

http://eaccelerator.net/

4.ploticus

A free, GPL, non-interactive software package for producing plots, charts, and graphics from data. It was developed in a Unix/C environment and runs on various Unix, Linux, and win32 systems. Ploticus is good for automated or just-in-time graph generation, handles date and time data nicely, and has basic statistical capabilities. It allows significant user control over colors, styles, options, and details.

Reference:

http://ploticus.sourceforge.net/doc/welcome.html


Useful linux commands: readelf, ldd, od

Posted on June 8, 2007. Filed under: Uncategorized | Tags: , , , , , |

readelf

readelf displays information about the contents of one or more ELF-format object files; the options control what particular information to display. Both 32-bit and 64-bit ELF files are supported, as are archives containing ELF files. This program performs a similar function to objdump, but it goes into more detail and exists independently of the BFD library, so if there is a bug in BFD, readelf will not be affected.

BFD (Binary File Descriptor) is a library that allows applications to use the same routines to operate on object files regardless of the object file format (http://www.gnu.org/).

http://linux.about.com/library/cmd/blcmdl1_readelf.htm

ldd

ldd prints the shared libraries required by each program or shared library specified on the command line.

http://linux.about.com/library/cmd/blcmdl1_ldd.htm

http://linux.byexamples.com/archives/409/how-to-list-shared-library-dependencies-used-by-an-application/

od

od dumps files in octal and other formats. It writes an unambiguous representation, octal bytes by default, of a FILE to standard output. With more than one FILE argument, it concatenates them in the listed order to form the input; with no FILE, or when FILE is -, it reads standard input.
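For example, piping a few known bytes through od makes the format easy to read (-An suppresses the offset column, -tx1 selects single-byte hex):

```shell
# 'A' is 0x41, 'B' is 0x42, 'C' is 0x43.
printf 'ABC' | od -An -tx1     # prints  41 42 43
printf 'ABC' | od -c           # character view of the same bytes
```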

http://linux.die.net/man/1/od

Others:

nm, objdump, objcopy, size, strings

reference:

http://www.tenouk.com/Module111linuxutility1.html

http://www.tenouk.com/Module000linuxnm3.html

http://www.tenouk.com/Module111.html


Some multithreading tutorial URLs

Posted on June 6, 2007. Filed under: C/C++, Linux, Programming, Windows | Tags: , , , , , |

C

POSIX Threads Tutorial

http://math.arizona.edu/~swig/documentation/pthreads/

POSIX thread (pthread) libraries

http://www.yolinux.com/TUTORIALS/LinuxTutorialPosixThreads.html

C++

Multithreading Tutorial(Mutex, deadlock, livelock, scoped lock, Read/Write lock, Monitor Object, Active Object, boost) — Good!

http://www.paulbridger.com/multithreading_tutorial/

Multithreading Tutorial (mainly on Win32)

http://www.codeproject.com/KB/threads/MultithreadingTutorial.aspx

C++/CLI Threading

Part I:

http://www.drdobbs.com/windows/184402018

Part II:

http://www.drdobbs.com/windows/184402029

Threads in C++

http://www.linuxselfhelp.com/HOWTO/C++Programming-HOWTO-18.html
