docs/multi-thread-compression.txt uses parameter names with underscores instead of dashes. Wrong since day one. docs/rdma.txt, tests/qemu-iotests/181, and tests/qtest/test-hmp.c are wrong the same way since commit cbde7be900d2 (v6.0.0). Hard to see, as test-hmp doesn't check whether the commands work, and iotest 181 appears to be unaffected. Fixes: 263170e679df (docs: Add a doc about multiple thread compression) Fixes: cbde7be900d2 (migrate: remove QMP/HMP commands for speed, downtime and cache size) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
		
			
				
	
	
		
			150 lines
		
	
	
		
			5.7 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			150 lines
		
	
	
		
			5.7 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
Use multiple thread (de)compression in live migration
 | 
						|
=====================================================
 | 
						|
Copyright (C) 2015 Intel Corporation
 | 
						|
Author: Liang Li <liang.z.li@intel.com>
 | 
						|
 | 
						|
This work is licensed under the terms of the GNU GPLv2 or later. See
 | 
						|
the COPYING file in the top-level directory.
 | 
						|
 | 
						|
Contents:
 | 
						|
=========
 | 
						|
* Introduction
 | 
						|
* When to use
 | 
						|
* Performance
 | 
						|
* Usage
 | 
						|
* TODO
 | 
						|
 | 
						|
Introduction
 | 
						|
============
 | 
						|
Instead of sending the guest memory directly, this solution will
 | 
						|
compress the RAM page before sending; after receiving, the data will
 | 
						|
be decompressed. Using compression in live migration can help
 | 
						|
to reduce the data transferred about 60%, this is very useful when the
 | 
						|
bandwidth is limited, and the total migration time can also be reduced
 | 
						|
about 70% in a typical case. In addition to this, the VM downtime can be
 | 
						|
reduced about 50%. The benefit depends on data's compressibility in VM.
 | 
						|
 | 
						|
The process of compression will consume additional CPU cycles, and the
 | 
						|
extra CPU cycles will increase the migration time. On the other hand,
 | 
						|
the amount of data transferred will decrease; this factor can reduce
 | 
						|
the total migration time. If the process of the compression is quick
 | 
						|
enough, then the total migration time can be reduced, and multiple
 | 
						|
thread compression can be used to accelerate the compression process.
 | 
						|
 | 
						|
The decompression speed of Zlib is at least 4 times as quick as
 | 
						|
compression, if the source and destination CPU have equal speed,
 | 
						|
keeping the compression thread count 4 times the decompression
 | 
						|
thread count can avoid resource waste.
 | 
						|
 | 
						|
Compression level can be used to control the compression speed and the
 | 
						|
compression ratio. High compression ratio will take more time, level 0
 | 
						|
stands for no compression, level 1 stands for the best compression
 | 
						|
speed, and level 9 stands for the best compression ratio. Users can
 | 
						|
select a level number between 0 and 9.
 | 
						|
 | 
						|
 | 
						|
When to use the multiple thread compression in live migration
 | 
						|
=============================================================
 | 
						|
Compression of data will consume extra CPU cycles; so in a system with
 | 
						|
high overhead of CPU, avoid using this feature. When the network
 | 
						|
bandwidth is very limited and the CPU resource is adequate, use of
 | 
						|
multiple thread compression will be very helpful. If both the CPU and
 | 
						|
the network bandwidth are adequate, use of multiple thread compression
 | 
						|
can still help to reduce the migration time.
 | 
						|
 | 
						|
Performance
 | 
						|
===========
 | 
						|
Test environment:
 | 
						|
 | 
						|
CPU: Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz
 | 
						|
Socket Count: 2
 | 
						|
RAM: 128G
 | 
						|
NIC: Intel I350 (10/100/1000Mbps)
 | 
						|
Host OS: CentOS 7 64-bit
 | 
						|
Guest OS: RHEL 6.5 64-bit
 | 
						|
Parameter: qemu-system-x86_64 -accel kvm -smp 4 -m 4096
 | 
						|
 /share/ia32e_rhel6u5.qcow -monitor stdio
 | 
						|
 | 
						|
There is no additional application is running on the guest when doing
 | 
						|
the test.
 | 
						|
 | 
						|
 | 
						|
Speed limit: 1000Gb/s
 | 
						|
---------------------------------------------------------------
 | 
						|
                    | original  | compress thread: 8
 | 
						|
                    |   way     | decompress thread: 2
 | 
						|
                    |           | compression level: 1
 | 
						|
---------------------------------------------------------------
 | 
						|
total time(msec):   |   3333    |  1833
 | 
						|
---------------------------------------------------------------
 | 
						|
downtime(msec):     |    100    |   27
 | 
						|
---------------------------------------------------------------
 | 
						|
transferred ram(kB):|  363536   | 107819
 | 
						|
---------------------------------------------------------------
 | 
						|
throughput(mbps):   |  893.73   | 482.22
 | 
						|
---------------------------------------------------------------
 | 
						|
total ram(kB):      |  4211524  | 4211524
 | 
						|
---------------------------------------------------------------
 | 
						|
 | 
						|
There is an application running on the guest which write random numbers
 | 
						|
to RAM block areas periodically.
 | 
						|
 | 
						|
Speed limit: 1000Gb/s
 | 
						|
---------------------------------------------------------------
 | 
						|
                    | original  | compress thread: 8
 | 
						|
                    |   way     | decompress thread: 2
 | 
						|
                    |           | compression level: 1
 | 
						|
---------------------------------------------------------------
 | 
						|
total time(msec):   |   37369   | 15989
 | 
						|
---------------------------------------------------------------
 | 
						|
downtime(msec):     |    337    |  173
 | 
						|
---------------------------------------------------------------
 | 
						|
transferred ram(kB):|  4274143  | 1699824
 | 
						|
---------------------------------------------------------------
 | 
						|
throughput(mbps):   |  936.99   | 870.95
 | 
						|
---------------------------------------------------------------
 | 
						|
total ram(kB):      |  4211524  | 4211524
 | 
						|
---------------------------------------------------------------
 | 
						|
 | 
						|
Usage
 | 
						|
=====
 | 
						|
1. Verify both the source and destination QEMU are able
 | 
						|
to support the multiple thread compression migration:
 | 
						|
    {qemu} info migrate_capabilities
 | 
						|
    {qemu} ... compress: off ...
 | 
						|
 | 
						|
2. Activate compression on the source:
 | 
						|
    {qemu} migrate_set_capability compress on
 | 
						|
 | 
						|
3. Set the compression thread count on source:
 | 
						|
    {qemu} migrate_set_parameter compress-threads 12
 | 
						|
 | 
						|
4. Set the compression level on the source:
 | 
						|
    {qemu} migrate_set_parameter compress-level 1
 | 
						|
 | 
						|
5. Set the decompression thread count on destination:
 | 
						|
    {qemu} migrate_set_parameter decompress-threads 3
 | 
						|
 | 
						|
6. Start outgoing migration:
 | 
						|
    {qemu} migrate -d tcp:destination.host:4444
 | 
						|
    {qemu} info migrate
 | 
						|
    Capabilities: ... compress: on
 | 
						|
    ...
 | 
						|
 | 
						|
The following are the default settings:
 | 
						|
    compress: off
 | 
						|
    compress-threads: 8
 | 
						|
    decompress-threads: 2
 | 
						|
    compress-level: 1 (which means best speed)
 | 
						|
 | 
						|
So, only the first two steps are required to use the multiple
 | 
						|
thread compression in migration. You can do more if the default
 | 
						|
settings are not appropriate.
 | 
						|
 | 
						|
TODO
 | 
						|
====
 | 
						|
Some faster (de)compression method such as LZ4 and Quicklz can help
 | 
						|
to reduce the CPU consumption when doing (de)compression. If using
 | 
						|
these faster (de)compression method, less (de)compression threads
 | 
						|
are needed when doing the migration.
 |