Which Checksum Tool on Linux is Faster?

It is common practice to calculate checksums for files to check their integrity. For large files, the checksum computation is slow. Now I am wondering why it is so slow and whether choosing another tool would be better. In this post, I try three common tools, md5sum, sha1sum and crc32, to compute checksums on ...
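As a rough illustration of this kind of comparison, here is a minimal sketch in Python. It is not the measurement from the post: it times Python's `hashlib` and `zlib` implementations of MD5, SHA-1 and CRC-32 on an in-memory buffer rather than running the md5sum, sha1sum and crc32 command-line tools, so disk I/O is excluded. The function names `checksums` and `time_algorithms` and the 16 MiB sample size are assumptions made for this example.

```python
import hashlib
import time
import zlib


def checksums(data: bytes) -> dict:
    """Return the MD5, SHA-1 and CRC-32 checksums of data as hex strings."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        # zlib.crc32 returns an integer; format it as 8 hex digits.
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
    }


def time_algorithms(data: bytes) -> dict:
    """Return the wall-clock seconds each algorithm needs to process data."""
    algorithms = {
        "md5": lambda d: hashlib.md5(d).digest(),
        "sha1": lambda d: hashlib.sha1(d).digest(),
        "crc32": zlib.crc32,
    }
    timings = {}
    for name, func in algorithms.items():
        start = time.perf_counter()
        func(data)
        timings[name] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # 16 MiB of zero bytes: an arbitrary sample size chosen for this sketch.
    sample = bytes(16 * 1024 * 1024)
    for name, seconds in time_algorithms(sample).items():
        print(f"{name}: {seconds:.4f} s")
```

The timings depend on the machine and the Python build, so they only hint at the relative cost of the three algorithms. Timing the real tools on a large file would also include the time spent reading from disk.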

How to Rotate Videos from iPhone in Linux

The iPhone is great for taking videos. However, one headache is that the video may be rotated by 90 degrees if you play it with non-Apple software such as MPlayer on Linux or Windows. This tutorial shows how to rotate a video taken with an iPhone or from other sources by 90 degrees on Linux. The tool we ...

Hadoop Installation Tutorial (Hadoop 2.x)

Hadoop 2, or YARN, is the new version of Hadoop. It adds the YARN resource manager in addition to the HDFS and MapReduce components. Hadoop MapReduce is a programming model and software framework for writing applications; it is an open-source variant of the MapReduce model initially designed and implemented by Google for processing and generating large data ...

Extending Mounted Ext4 File System on LVM in Linux

LVM is a great tool to manage hard disks on Linux: you can abstract the hard drives away and manage logical volumes from volume groups, you can dynamically add or remove hard drives while the file systems on the logical volumes need not be backed up and recovered, and you may create many snapshots of the ...

How to Find Out Failed Disks’ SATA Ports in Linux

The Linux disk names (e.g. sda1, hdb3, etc.) are not reliable: they may change if there are hardware changes, such as adding or removing a disk. Additionally, the order of the Linux device names is not always the same as the order of the SATA ports. For example, the disk connected to SATA port 0 (first ...

Script: Checking Alive Servers from a Server List

With a list of servers, it is common that one or more are down or have crashed. Lots of cluster management tools can detect whether servers are alive. However, this can easily be done with ping in a Bash script. I summarize the script that I used and share it here: check-alive-server.sh. Usage: usage: ./check-alive-server.sh file Each ...
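The general idea of probing each server in a list with ping can be sketched as follows. This is not the check-alive-server.sh script itself but a small Python illustration under stated assumptions: the list file holds one host name per line, blank lines and lines starting with `#` are skipped, and the Linux iputils `ping` is available (where `-c` is the packet count and `-W` the reply timeout in seconds). The names `ping_once`, `parse_server_list` and `check_servers` are hypothetical.

```python
import subprocess


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Return True if host answers a single ICMP echo request.

    Assumes the Linux iputils ping: -c is the packet count and
    -W is the time to wait for a reply, in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def parse_server_list(text: str) -> list:
    """Extract host names from text, skipping blank lines and # comments.

    The one-host-per-line format is an assumption of this sketch.
    """
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            hosts.append(line)
    return hosts


def check_servers(hosts, probe=ping_once) -> dict:
    """Map each host to True (it replied) or False (no reply)."""
    return {host: probe(host) for host in hosts}
```

To use it, read the list file, pass its contents to `parse_server_list`, and hand the result to `check_servers`. The `probe` parameter exists so that the ping call can be replaced, for example to try the logic without network access.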

How to Set Default Entry in Grub2 and Grub

Linux booting is usually controlled by Grub or the new Grub2. Setting the default boot entry is a frequent operation. Here, we introduce how to set the default entry in Grub2 and Grub. Setting the default booting entry in grub2 Note1: With some versions of grub2, the grub2-set-default method and the script below may not ...

Setting up Stable Xen Dom0 with Fedora: Xen 3.4.3 with Xenified Linux Kernel 2.6.32.13 in Fedora 12

This is the latest stable and recommended Xen Dom0 solution on Fedora 12. No serious bugs have been found so far, and we will fix bugs ourselves if any appear. It also works on Fedora 14. It should not be hard to use this solution on other versions of Fedora or other ...

Setting Up Xen Dom0 on Fedora: Xen 3.4.1 with Linux Kernel 2.6.29 on Fedora 12

Please refer to for the latest stable Xen Dom0 solution. This post gives a detailed tutorial for setting up Xen 3.4.1 dom0 on top of Fedora 12 with kernel 2.6.29. Hardware: Dom0 hardware platform: Motherboard: INTEL S5500BC S5500 Quad Core Xeon Server Board CPU: 2 x Intel Quad Core Xeon E5520 2.26G ...

How to Set Up Socks Proxy Using SSH Tunnel

We can set up a SOCKS proxy on top of an SSH tunnel. Besides the common proxy functions, such as web browsing, a proxy on top of an SSH tunnel also secures the traffic between the browser and the proxy server (the SSH server). In this post, we introduce and explain how to set up a ...

Automatically Backing Up Xen File-backed DomU

This post introduces a script for backing up file-backed Xen DomUs. The script can be adapted to similar platforms. In our cluster, virtual machines are stored under /lhome/xen/. The virtual machine with id vmid is stored in directory vmvmid. The raw image disk file name can also be derived from vmid. Some more details ...

ALSA Problem of Fedora 11 on Compaq Presario CQ35-240TX

When I got my new Compaq Presario CQ35-240TX, of course, the first thing was to install Fedora ;) But unfortunately, after installation there was no sound! It seems there is something wrong with the driver configuration. Here is a solution to this: Add these two lines at the end of /etc/modprobe.d/dist.conf options snd-hda-intel model=hp-m4 enable=1 ...

PUMA: A MapReduce Benchmark Suite

MapReduce is a well-known programming model designed for generating and processing large data. There are various MapReduce implementations. One of the most widely known and used may be Hadoop. Benchmarking MapReduce frameworks is becoming important. Faraz Ahmad et al. developed a benchmark suite: PUMA MapReduce Benchmark. During our work on MapReduce, we developed a benchmark suite ...

Hadoop Installation Tutorial (Hadoop 1.x)

Update: If you are new to Hadoop and trying to install it, please check the newer version: Hadoop Installation Tutorial (Hadoop 2.x). Hadoop mainly consists of two parts: Hadoop MapReduce and HDFS. Hadoop MapReduce is a programming model and software framework for writing applications; it is an open-source variant of MapReduce that was initially designed ...

A Simple Sort Benchmark on Hadoop

After installing Hadoop, we usually run some benchmark programs to test whether the system works well. In the Hadoop installation tutorial, we show a very simple example that greps strings from a simple set of files. In this post, we introduce the Sort program for testing and benchmarking Hadoop. The Sort program is also ...

Setting Up Standalone (Local) Hadoop

Hadoop is designed to run on hundreds to thousands of computers inside a cluster. However, by default Hadoop is configured to run in a non-distributed mode as a single Java process. This is especially useful for debugging, since distributed debugging is really a nightmare. This post introduces how to set up a standalone Hadoop environment.