How to sort a file by hexadecimal numbers on Linux using sort command?

The sort command has a -n option that sorts lines numerically. However, -n only understands decimal numbers, so it does not sort hexadecimal numbers correctly.

For example, this file:

400000000	__crt0
400000039	__newr0
400001B14	get_my_task_id
400001C14	get_new_task_id
400001582	input_char
40000166E	input_q
400001A5D	input_q_exit
400002002	main
4000000DB	output_char
400001134	output_char_str
40000100C	output_id
40000018F	output_q
400000614	output_q_digits
400000B7E	output_q_hex
400000D3E	output_q_hex_j1
400000E3D	output_q_hex_j2
4000002FB	output_q_j1
400000444	output_q_j2
40000131F	output_str
400001385	output_str_j1
200000020	reg1
200000028	reg2
200000030	reg3
200000038	reg4
20400001000	status
400001E0C	t1
400001E70	t2
400001D14	task_id_to_ec_range

is sorted by sort -n to:

400000B7E	output_q_hex
400000D3E	output_q_hex_j1
400000E3D	output_q_hex_j2
400001A5D	input_q_exit
400001B14	get_my_task_id
400001C14	get_new_task_id
400001D14	task_id_to_ec_range
400001E0C	t1
400001E70	t2
4000000DB	output_char
4000002FB	output_q_j1
40000018F	output_q
40000100C	output_id
40000131F	output_str
40000166E	input_q
200000020	reg1
200000028	reg2
200000030	reg3
200000038	reg4
400000000	__crt0
400000039	__newr0
400000444	output_q_j2
400000614	output_q_digits
400001134	output_char_str
400001385	output_str_j1
400001582	input_char
400002002	main
20400001000	status

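This order is wrong: 200000020 is smaller than 40000166E, yet it is listed after it. sort -n stops reading each number at the first character that is not a decimal digit, so 40000166E is compared as 40000166.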
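The problem can be reproduced with just two lines from the example. This is a minimal sketch; the variable name result is only used for illustration.

```shell
# 0x200000020 is smaller than 0x40000166E, but sort -n compares only the
# leading decimal digits: 200000020 versus 40000166 (it stops at the "E").
result=$(printf '200000020\treg1\n40000166E\tinput_q\n' | sort -n)
printf '%s\n' "$result"
```

The line starting with 40000166E is printed first, although its hexadecimal value is the larger one.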

How to sort a file by hexadecimal numbers on Linux using sort command?

Combining sort with perl, paste and cut produces the correct result (assuming the file is file.txt).

perl -lpe '$_=hex' file.txt |
paste -d" " - file.txt |
sort -n |
cut -d" " -f 2-

The first line converts the hexadecimal number at the start of each line to a decimal number. The second line joins each converted decimal number to its original line, separated by a space. sort -n then orders the lines by the decimal number, and cut removes that first decimal field again. The result:

200000020	reg1
200000028	reg2
200000030	reg3
200000038	reg4
400000000	__crt0
400000039	__newr0
4000000DB	output_char
40000018F	output_q
4000002FB	output_q_j1
400000444	output_q_j2
400000614	output_q_digits
400000B7E	output_q_hex
400000D3E	output_q_hex_j1
400000E3D	output_q_hex_j2
40000100C	output_id
400001134	output_char_str
40000131F	output_str
400001385	output_str_j1
400001582	input_char
40000166E	input_q
400001A5D	input_q_exit
400001B14	get_my_task_id
400001C14	get_new_task_id
400001D14	task_id_to_ec_range
400001E0C	t1
400001E70	t2
400002002	main
20400001000	status

Eric Ma

Eric is a systems guy. Eric is interested in building high-performance and scalable distributed systems and related technologies. The views or opinions expressed here are solely Eric's own and do not necessarily represent those of any third parties.
