How to find out all files with replication factor 1 in HDFS? The hdfs dfsadmin -report output shows there are blocks with replication factor 1:
    Missing blocks (with replication factor 1): 7
How to find them? You can run hdfs fsck to list all files with their replication counts and grep those with replication factor
Read more
Author: Eric Ma
Eric is a systems guy. Eric is interested in building high-performance and scalable distributed systems and related technologies. The views or opinions expressed here are solely Eric's own and do not necessarily represent those of any third parties.
How to disable PHP short open tags?
Some of my pages have XML tags like ‘<?xml ?>’, which PHP treats as short open tags ‘<? ?>’. How to disable PHP short open tags? In PHP, to disable short open tags, set the directive short_open_tag = Off in your php.ini. Reference: short_open_tag in php.ini
How to print all fields after a certain field with awk on Linux?
How to print all fields after a certain field with awk on Linux? Say, I want to print out all fields after field $3:
    a b c d e f
    a b b b
    a a c d
should be transformed to
    d e f
    b
    d
You may use a for loop in awk
Read more
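For comparison, here is a minimal Python sketch of the same transformation. It is not the awk loop the post refers to; the function name fields_after and its parameter n are made up for illustration, and fields are assumed to be separated by whitespace.

```python
def fields_after(line, n=3):
    """Return the whitespace-separated fields that follow field n (1-based)."""
    return " ".join(line.split()[n:])


# The three sample lines from the question above.
sample = ["a b c d e f", "a b b b", "a a c d"]
for row in sample:
    print(fields_after(row))
```

A line with n fields or fewer yields an empty string, so short lines become blank lines in the output.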
How to add a prefix string at the beginning of each line in Bash shell script on Linux?
How to add a prefix string at the beginning of each line in Bash shell script on Linux? For example, assume we have a file a.txt:
    line 1
    line 2
I want to have,
    pre line 1
    pre line 2
You may use sed, the stream editor for filtering and transforming text:
    sed -e 's/^/pre
Read more
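The same idea can be sketched in Python, as an alternative to the sed command rather than a copy of it. The helper name add_prefix is hypothetical, and the default prefix "pre " (with a trailing space) is assumed from the desired output shown above.

```python
def add_prefix(text, prefix="pre "):
    """Prepend prefix to every line of text, keeping the original line endings."""
    return "".join(prefix + line for line in text.splitlines(keepends=True))


# Contents of a.txt from the example above.
print(add_prefix("line 1\nline 2\n"), end="")
```

Because the line endings are kept as they are, the result can be written straight back to a file.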
How to get the highest temperature from all sensors in a server on Linux?
It is useful to monitor a server node’s temperature. Among all the sensors’ temperatures, the highest one may be the most important. How to get the highest temperature from all sensors in a server on Linux? You can use this command to get the highest temperature from all sensors in a server on
Read more
How to make CentOS Linux load a module automatically at boot time?
How to make CentOS Linux load a module, say ixgbe, automatically at boot time? I am using CentOS 7. You can create a text file <some name>.conf in /etc/modules-load.d/ and list the modules to be loaded there, one per line. The systemd-modules-load.service service will read these files and load the modules. Check more
Read more
How to detect whether a file is being written by any other process in Linux?
How to detect whether a file is being written by any other process in Linux? Before a program opens a file to process it, it wants to ensure no other processes are writing to it. Here, we are sure that after the files are written and closed, they will not be written any more. Hence, one-time
Read more
How to count the number of reads in each chromosome in a bam file?
How to count the number of reads in each chromosome in a bam file? The bam file is already sorted by the chromosome names. If the bam file is indexed, you may quickly get this information from the index:
    samtools idxstats in.bam | awk '{print $1" "$3}'
If the bam file is not indexed, you
Read more
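If the idxstats output needs further processing, it can be parsed in Python as well. The sketch below assumes the usual tab-separated idxstats layout (reference name, sequence length, mapped reads, unmapped reads) and keeps the same two columns as the awk command, $1 and $3. The function name and the sample numbers are invented for illustration and are not real data.

```python
def reads_per_chromosome(idxstats_text):
    """Map each reference name to its mapped-read count (columns 1 and 3)."""
    counts = {}
    for line in idxstats_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _length, mapped, _unmapped = line.split("\t")
        counts[name] = int(mapped)
    return counts


# Made-up text in the shape of `samtools idxstats in.bam` output.
sample = "chr1\t248956422\t1200\t3\nchr2\t242193529\t950\t0\n*\t0\t0\t17\n"
for name, mapped in reads_per_chromosome(sample).items():
    print(name, mapped)
```

The trailing "*" row collects reads that are not placed on any reference, so its mapped count is 0.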
WPS’ wpp program reports “libbz2.so.1.0: cannot open shared object file” on CentOS 7
WPS’ wpp program reports “libbz2.so.1.0: cannot open shared object file” on CentOS 7 as follows:
    $ wpp
    /opt/kingsoft/wps-office/office6/wpp: error while loading shared libraries: libbz2.so.1.0: cannot open shared object file: No such file or directory
The reason: wpp tries to dynamically link ‘libbz2.so.1.0’
    $ ldd /opt/kingsoft/wps-office/office6/wpp | grep libbz2
    libbz2.so.1.0 => not found
    libbz2.so.1 =>
Read more
How to write a SQL statement to replace strings in a column in a MySQL database table?
How to replace strings in a column in a MySQL database table? For example, I would like to replace http://www.systutorials.com with https://www.systutorials.com in a content column of a table post. You may use a SQL statement as follows.
    update post set content = replace(content, 'http://www.systutorials.com', 'https://www.systutorials.com') where content like '%http://www.systutorials.com%'
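To try the statement without touching a real database, here is a small Python sketch that runs the same UPDATE against an in-memory SQLite table. SQLite stands in for MySQL here as an assumption (both provide a replace() string function); the helper name replace_in_content and the two sample rows are made up.

```python
import sqlite3

OLD = "http://www.systutorials.com"
NEW = "https://www.systutorials.com"


def replace_in_content(conn, old, new):
    """Rewrite old to new in post.content and return the number of rows changed."""
    cur = conn.execute(
        "UPDATE post SET content = replace(content, ?, ?) WHERE content LIKE ?",
        (old, new, "%" + old + "%"),
    )
    conn.commit()
    return cur.rowcount


# Throwaway table with one matching row and one non-matching row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (content TEXT)")
conn.executemany(
    "INSERT INTO post VALUES (?)",
    [("see http://www.systutorials.com/a",), ("no link here",)],
)
changed = replace_in_content(conn, OLD, NEW)
rows = [r[0] for r in conn.execute("SELECT content FROM post ORDER BY rowid")]
print(changed, rows)
```

The WHERE clause limits the update to rows that actually contain the old string, so running it a second time changes nothing.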
How to use encfs in Android?
Is encfs available in an Android phone? You may try Encdroid, a piece of free software released under the GNU General Public License. It is an Android application that can access EncFS volumes on cloud storage or internal/USB storage devices. Google Play Store Link: https://play.google.com/store/apps/details?id=org.mrpdaemon.android.encdroid&hl=en Source code: https://github.com/mrpdaemon/encdroid
How to split a gzip file to several small ones on Linux?
I have a very large (e.g. 100GB) .gz file and would like to split it into smaller files of about 8GB each for storage/copying. How to split a gzip file to several small ones on Linux? You may use the tool split to split a file by size. An example to split a large.tgz to at most
Read more
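Where the split tool is not available, the same size-based splitting can be sketched in Python. This is an illustrative alternative, not the command from the post; the function name split_file, the numeric part suffixes, and the 1 MiB copy buffer are all assumptions.

```python
import os
import tempfile


def split_file(path, chunk_size, buffer_size=1024 * 1024):
    """Split path into path.000, path.001, ... of at most chunk_size bytes each."""
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            data = src.read(min(buffer_size, chunk_size))
            if not data:
                break  # nothing left, so no empty trailing part is created
            part = "%s.%03d" % (path, index)
            written = 0
            with open(part, "wb") as dst:
                while data:
                    dst.write(data)
                    written += len(data)
                    remaining = chunk_size - written
                    if remaining == 0:
                        break
                    data = src.read(min(buffer_size, remaining))
            parts.append(part)
            index += 1
    return parts


# Tiny demo: a 10-byte file split into parts of at most 4 bytes.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "large.bin")
with open(demo_path, "wb") as f:
    f.write(b"0123456789")
demo_parts = split_file(demo_path, 4)
print([os.path.getsize(p) for p in demo_parts])
```

Concatenating the parts in order restores the original file byte for byte, so a gzip file split this way must be reassembled before it is decompressed.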
How to estimate the memory usage of HDFS NameNode for an HDFS cluster?
HDFS stores the metadata of files and blocks in the memory of the NameNode. How to estimate the memory usage of HDFS NameNode for an HDFS cluster? Each file and each block has around 150 bytes of metadata on the NameNode. So you may do the calculation based on this. For example, assume block size is
Read more
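The rule of thumb above can be turned into a small calculator. The sketch below only applies the roughly-150-bytes-per-file-and-per-block figure from the text; the function name and the example workload (one million files of 256 MiB with 128 MiB blocks) are hypothetical, and directories and other NameNode overhead are ignored.

```python
BYTES_PER_OBJECT = 150  # rough metadata size per file and per block, as quoted above
MIB = 1024 * 1024


def estimate_namenode_bytes(num_files, file_size, block_size):
    """Estimate NameNode metadata memory for num_files files of equal size."""
    blocks_per_file = -(-file_size // block_size)  # integer ceiling division
    return num_files * (1 + blocks_per_file) * BYTES_PER_OBJECT


# Hypothetical workload: 1,000,000 files of 256 MiB each, 128 MiB block size.
estimate = estimate_namenode_bytes(1_000_000, 256 * MIB, 128 * MIB)
print(estimate, "bytes, about", round(estimate / MIB), "MiB")
```

Each file contributes one file object plus one object per block, which is why many small files cost more NameNode memory than the same data stored in a few large files.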
Is Samba sync or async for writes?
Whether a file system or a network file system writes data synchronously or asynchronously affects data integrity. Is Samba sync or async for writes? In summary, Samba writes are async by default, but the behavior is configurable. Here is a great summary by Eric Roseme. Samba defaults to asynchronous writes. smbd writes
Read more
How to get the mtime of a file on Linux?
How to get the mtime of a file on Linux from the file’s path? You can use stat to get the file status including the mtime:
    %y time of last modification, human-readable
    %Y time of last modification, seconds since Epoch
As an example,
    $ stat -c %y ./file
    2017-06-26 13:33:06.764042064 +0800
    $ stat -c %Y
Read more
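The same two values can also be read from Python with os.stat. This is a minimal sketch rather than a replacement for stat; the helper names are made up, and the demo sets an arbitrary timestamp on a temporary file so that the result is predictable.

```python
import os
import tempfile
from datetime import datetime


def mtime_epoch(path):
    """Return the time of last modification as whole seconds since the Epoch."""
    return int(os.stat(path).st_mtime)


def mtime_human(path):
    """Return the time of last modification as local time, YYYY-MM-DD HH:MM:SS."""
    stamp = datetime.fromtimestamp(os.stat(path).st_mtime)
    return stamp.strftime("%Y-%m-%d %H:%M:%S")


# Demo on a temporary file whose mtime is set to an arbitrary value.
fd, demo_file = tempfile.mkstemp()
os.close(fd)
os.utime(demo_file, (1500000000, 1500000000))
print(mtime_epoch(demo_file), mtime_human(demo_file))
```

mtime_epoch corresponds to the %Y format and mtime_human to a shortened form of %y, without the sub-second part and the time zone offset.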
How to embed a Map image from Google Maps in a website?
Google Maps has APIs. But the requirement is that the map image should be static and uploaded to the website only. No request to or dependency on Google’s website should be needed, so that the website can run without Internet access. Is it possible or allowed (like a screenshot of Google Maps)? If not, any suggestions
Read more
Printing a Line to STDERR and STDOUT in Python
In Python, how to print a string as a line to STDOUT, that is, the string followed by a newline character? And similarly, how to print the line to STDERR? In Python, to print a string str with a newline to STDOUT:
    print(str)
In Python, to print a line to STDERR:
    import sys
Read more
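In Python 3, both cases can go through the built-in print function and its file parameter. The sketch below is one way to wrap that; the helper name print_line is made up for illustration.

```python
import io
import sys


def print_line(line, stream=None):
    """Write line plus a newline to stream; print uses STDOUT when stream is None."""
    print(line, file=stream)


print_line("to stdout")
print_line("to stderr", stream=sys.stderr)

# Any file-like object works as the stream, for example an in-memory buffer.
buffer = io.StringIO()
print_line("captured", stream=buffer)
```

Passing the stream as a parameter keeps the function easy to test, since output can be captured in a buffer instead of going to the terminal.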
How to Debug a Bash script?
How to debug a Bash script if it has some bugs? Common techniques like printing variables out for checking apply to Bash too. For Bash, I also use 2 Bash-specific techniques.
Use errexit option
Run the script with the -e option like bash -e your-script.sh or add set -o errexit to the beginning of the script.
Read more
How to find the disk where root / is on in Bash on Linux?
Question: how to find the disk where Linux’s root (/) is on in Bash? The root may be on an LVM volume or on a raw disk. 2 cases:
One example:
    # df -hT | grep /$
    /dev/sda4 ext4 48G 32G 14G 71% /
For another example:
    # df -hT | grep /$
    /dev/mapper/fedora-root ext4
Read more
How to set Zimbra web service’s hostname and port?
On a Zimbra server, how to set Zimbra web service’s hostname and port? Set Zimbra Web service’s host and port (to mail.domain.com:80 as an example) for a mail domain domain.com:
    zmprov md domain.com zimbraPublicServiceHostname mail.domain.com
    zmprov md domain.com zimbraPublicServicePort 80
Reference: https://wiki.zimbra.com/wiki/When_using_a_proxy,_the_’change_password’_box_doesn’t_load