On one HDFS cluster, hdfs dfsadmin -report reports: Under replicated blocks: 139016; Blocks with corrupt replicas: 9; Missing blocks: 0. The under-replicated blocks will be re-replicated automatically after some time. How should the missing blocks and blocks with corrupt replicas be handled in HDFS? Understanding these blocks A block is "with corrupt replicas" in HDFS …
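As a rough sketch of the kind of commands involved (standard HDFS admin commands with placeholder paths, not steps quoted from the post), corrupt replicas and the files they belong to can be located with hdfs fsck:

# List files that currently have blocks with corrupt replicas.
hdfs fsck / -list-corruptfileblocks
# Show block-level detail for a suspect file; /path/to/file is a placeholder.
hdfs fsck /path/to/file -files -blocks -locations
# Re-check the cluster-wide block counts afterwards.
hdfs dfsadmin -report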
Category: Systems
Tutorials and posts on computing, storage and networked systems, and more.
A Beginners’ Guide to x86-64 Instruction Encoding
The encoding of x86 and x86-64 instructions is well documented in Intel's and AMD's manuals. However, those manuals are not easy for beginners who are just starting to learn x86-64 instruction encoding. In this post, I will give a list of useful manuals for understanding and studying x86-64 instruction encoding, a brief introduction …
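A quick way to study encodings alongside the manuals (a hedged sketch using the GNU binutils toolchain on an x86-64 host; nothing here is prescribed by the post itself) is to assemble a single instruction and dump the bytes the assembler emits:

# Assemble one instruction and inspect its machine-code bytes.
printf '.intel_syntax noprefix\nadd rax, rcx\n' > add.s
as add.s -o add.o
objdump -d -M intel add.o   # should show 48 01 c8: REX.W prefix, opcode 01, ModRM c8 for "add rax, rcx"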
How to force a metadata checkpointing in HDFS
Metadata checkpointing in HDFS is done by the Secondary NameNode, which periodically merges the fsimage and the edits log files to keep the edits log size within a limit. For various reasons, the checkpointing by the Secondary NameNode may fail. As one example, the HDFS SecondaryNameNode log shows errors as follows. 2017-08-06 10:54:14,488 …
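For context, one common way to trigger a checkpoint manually (a sketch built from standard HDFS admin commands; the post itself may describe a different procedure) is to save the namespace from the NameNode while it is in safe mode:

# Enter safe mode, persist the current namespace (merging fsimage and edits), then leave.
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace
hdfs dfsadmin -safemode leave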
How to Debug the Linux Kernel with Less Effort
Introduction In general, if we want to debug the Linux kernel, there are lots of tools such as Linux perf, kprobes, BCC, ktap, etc., and we can also write kernel modules, proc subsystems, or system calls for specific debugging aims. However, if we have to instrument the kernel to achieve our goals, usually we would not …
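As a small taste of the non-intrusive tools listed above (a hedged perf example; the probed function name is only an illustration and varies across kernel versions):

# Create a dynamic probe on a kernel function and record hits system-wide for 5 seconds.
perf probe --add do_sys_open
perf record -e probe:do_sys_open -aR sleep 5
perf report
perf probe --del do_sys_open   # clean up the probe afterwards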
QEMU/KVM Network Mechanisms
Introduction As we know, network subsystems are important in computer systems since they are I/O systems and need to be optimized with many algorithms and techniques. This article introduces how the QEMU/KVM [2] network part works. To keep everything simple and easy to understand, we will begin with several examples and then understand …
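For readers who want something concrete to experiment with while reading, here is a hedged sketch of a QEMU command line that uses a tap backend with a virtio-net device (the disk image and tap interface names are placeholders):

# Boot a guest with a virtio-net NIC backed by an existing tap0 device.
qemu-system-x86_64 -enable-kvm -m 2048 \
  -drive file=guest.img,format=qcow2 \
  -netdev tap,id=net0,ifname=tap0,script=no,downscript=no \
  -device virtio-net-pci,netdev=net0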
I/O Microscopy: Tasks’ Disk I/O Information with High Accuracy
Abstract Most popular task monitoring tools (such as top, iotop, proc, etc.) can only get tasks' disk I/O information, such as a task's I/O utilization percentage, once per second due to the kernel timer/tick frequency and the high time cost of the system interfaces. This article presents I/O Microscopy, a new way to get tasks' disk I/O information with high accuracy.
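For comparison with the per-second granularity mentioned above, the kernel's cumulative per-task I/O counters can be sampled directly from procfs (a minimal sketch; the PID and the one-second interval are placeholders):

# Sample a task's cumulative disk I/O counters once per second.
PID=1234   # placeholder PID
while true; do
  grep -E 'read_bytes|write_bytes' /proc/$PID/io
  sleep 1
done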
How Does the Linux Kernel Collect Task Stats Data
Motivation Recently, I found it hard to know the percentage of time that a process spends waiting for synchronous I/O (e.g., read). One way is to use the taskstats API provided by the Linux kernel [1]. However, with this approach, precision may be a problem. With this problem in mind, I dug into the Linux …
How to Upload Large Files to Amazon S3 with AWS CLI
Amazon S3 is a widely used public cloud storage system. S3 allows an object/file to be up to 5 TB, which is enough for most applications. The AWS Management Console provides a Web-based interface for users to upload and manage files in S3 buckets. However, uploading a large file that is hundreds of GB is not …
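As a quick illustration (a hedged sketch of standard AWS CLI usage; the bucket name and tuning values are placeholders, and the post may recommend different settings), the high-level aws s3 cp command performs multipart uploads automatically, and its chunk size and concurrency can be tuned:

# Tune multipart behavior for large objects (values are illustrative).
aws configure set default.s3.multipart_chunksize 64MB
aws configure set default.s3.max_concurrent_requests 10
# Upload a large file; the CLI splits it into parts behind the scenes.
aws s3 cp ./big-backup.tar s3://my-bucket/big-backup.tar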
Hadoop Installation Tutorial (Hadoop 2.x)
Hadoop 2, or YARN, is the new version of Hadoop. It adds the YARN resource manager in addition to the HDFS and MapReduce components. Hadoop MapReduce is a programming model and software framework for writing applications; it is an open-source variant of MapReduce, which was designed and implemented by Google initially for processing and generating large data …
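To give a feel for the final steps such a tutorial typically ends with (a hedged sketch, assuming HDFS and YARN are already configured and Hadoop's bin and sbin directories are on PATH):

# Format the NameNode once, then start the HDFS and YARN daemons.
hdfs namenode -format
start-dfs.sh
start-yarn.sh
jps   # should list NameNode, DataNode, ResourceManager, NodeManager, etc.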
Big Data Benchmark from AMPLab of UC Berkeley
Benchmarks are important for understanding the performance of different systems and for comparing them quantitatively and qualitatively. Many analytic frameworks, such as Hive, Impala, and Shark, have been designed and implemented in recent years and have become fundamental software for processing big data. How to benchmark these big data analytic systems is an interesting problem. The Big Data Benchmark The …
Data Consistency Models of Public Cloud Storage Services: Amazon S3, Google Cloud Storage and Windows Azure Storage
Public cloud storage services like Amazon S3, Google Cloud Storage, and Windows Azure Storage replicate data to ensure high availability. On the other hand, with data being replicated, the storage services exhibit certain data consistency models. Different cloud service providers employ different data consistency models nowadays. In this post, we survey the data …
Software Engineering Advice from Building Large-Scale Distributed Systems by Jeff Dean
Software Engineering Advice from Building Large-Scale Distributed Systems by Jeff Dean. You can download the slides from Software Engineering Advice from Building Large-Scale Distributed Systems by Jeff Dean. These slides contain the "Numbers Everyone Should Know," which everyone working on systems should be familiar with. Numbers Everyone Should Know L1 cache reference 0.5 ns Branch …
Hadoop MapReduce Tutorials
Here is a list of tutorials for learning how to write MapReduce programs on Hadoop, the open-source MapReduce implementation with HDFS. MapReduce Tutorials The official tutorial on the Hadoop MapReduce framework: http://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html. Yahoo! Hadoop Tutorial A comprehensive tutorial on Hadoop from the Yahoo! Developer Network: http://developer.yahoo.com/hadoop/tutorial/. More about MapReduce To better understand the design behind MapReduce, it …
Storage Architecture and Challenges by Andrew Fikes at Google Faculty Summit 2010
Storage Architecture and Challenges at the Faculty Summit, July 29, 2010, by Andrew Fikes, Principal Engineer. Download the PDF (from archive.org). These slides introduce some of Google's storage systems with insights and discussion of the problems.
Designs, Lessons and Advice from Building Large Distributed Systems
Designs, Lessons and Advice from Building Large Distributed Systems by Jeff Dean. Everyone who is interested in large distributed systems should read it: PDF of Designs, Lessons and Advice from Building Large Distributed Systems by Jeff Dean.
PUMA: A MapReduce Benchmark Suite
MapReduce is a well-known programming model designed for generating and processing large data. There are various MapReduce implementations; one widely known and used implementation is Hadoop. Benchmarking MapReduce frameworks has become important. Faraz Ahmad et al. developed a benchmark suite: the PUMA MapReduce Benchmark. During our work on MapReduce, we developed a benchmark suite …
Large-scale Data Storage and Processing System in Datacenters
Research on cloud computing has made great progress, and many excellent large-scale systems have been designed in recent years. I compiled a list of some large-scale data storage and processing systems in datacenters as follows. Storage systems Google File System (GFS): http://research.google.com/archive/gfs.html HDFS implementation: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html Colossus (GFS2): Colossus: Successor to the Google File System (GFS) …
Microsoft's Cosmos Service
Cosmos is "Microsoft's internal data storage/query system for analyzing enormous amounts (as in petabytes) of data". No paper or technical report about Cosmos has been published yet. I compiled a list of information about Cosmos on the Web as follows. What is Microsoft's Cosmos service? by Yaron Y. Goland. Microsoft Cosmos: Petabytes perfectly processed perfunctorily by Seth …
Colossus: Successor to the Google File System (GFS)
Colossus is the successor to the Google File System (GFS), as mentioned in the paper on Spanner at OSDI 2012. Colossus is also used by Spanner to store its tablets. The information about Colossus is slim compared with GFS, which is described in the paper published at SOSP 2003. There is still some information about Colossus …
Hadoop Installation Tutorial (Hadoop 1.x)
Update: if you are new to Hadoop and trying to install it, please check the newer version: Hadoop Installation Tutorial (Hadoop 2.x). Hadoop mainly consists of two parts: Hadoop MapReduce and HDFS. Hadoop MapReduce is a programming model and software framework for writing applications; it is an open-source variant of MapReduce that was initially designed …