How to change a running HDFS cluster’s replication factor?
I have a running HDFS cluster storing lots of files. I want to change its default replication factor.
How to change it? What will happen after it is changed?
For example, if I change it from 2 to 3, will HDFS automatically re-replicate the existing data blocks?
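First, the replication factor is decided by the client.
Second, the replication factor is a per-file setting.
Hence, changing the default replication factor in the configuration (the dfs.replication property) only affects the client that reads it, and it takes effect only for new files that client creates.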
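To illustrate that the client picks the replication factor for each file it writes, here is a minimal sketch that builds an upload command overriding dfs.replication for a single upload. The helper name `build_put_command` and the example paths are hypothetical; the sketch assumes the `hdfs` command-line client is installed and accepts the generic `-D property=value` option. It only builds and prints the command, it does not run it.

```python
import shlex


def build_put_command(local_path, hdfs_path, replication):
    """Build an `hdfs dfs -put` command that overrides dfs.replication.

    The override applies only to files written by this one invocation;
    files already stored in HDFS keep their current replication factor.
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return [
        "hdfs", "dfs",
        "-D", f"dfs.replication={replication}",  # client-side override
        "-put", local_path, hdfs_path,
    ]


# Hypothetical paths, shown only to make the resulting command visible.
print(shlex.join(build_put_command("data.csv", "/user/alice/data.csv", 3)))
```

The same idea applies when dfs.replication is set in the client's hdfs-site.xml: the value is read when the file is created and stored with that file.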
Existing files keep their old replication factor and are not re-replicated automatically. For them, you need to change the replication factor manually:
https://www.systutorials.com/qa/1225/how-to-change-number-of-replications-of-certain-files-in-hdfs
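As a rough sketch of the manual step, the helper below wraps the `hdfs dfs -setrep` shell command, which changes the replication factor of an existing file (or of all files under a directory). The function names `build_setrep_command` and `set_replication` and the path `/data` are hypothetical; the sketch assumes the `hdfs` client is on the PATH and that the user may modify the target files. Once the new factor is set, the NameNode schedules the additional replicas; with `-w` the command waits until replication completes, which can take a long time on large directories.

```python
import subprocess


def build_setrep_command(path, replication, wait=True):
    """Build an `hdfs dfs -setrep` command for an existing file or directory."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    cmd = ["hdfs", "dfs", "-setrep"]
    if wait:
        cmd.append("-w")  # block until the target replication is reached
    cmd += [str(replication), path]
    return cmd


def set_replication(path, replication, wait=True):
    """Run the command; raises CalledProcessError if hdfs exits non-zero."""
    subprocess.run(build_setrep_command(path, replication, wait), check=True)


# Example (not executed here): raise everything under /data from 2 to 3 replicas.
# set_replication("/data", 3)
```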