Ceph is a popular storage backend for OpenStack. It lets you use the same storage for images, ephemeral disks and volumes. Each Ceph image is also thinly provisioned (by default in blocks of 4 MB). Until recently, once storage had been allocated, the filesystem in the virtual machine had no way to release it again. In recent Ceph and OpenStack releases it is possible to use fstrim inside the virtual machine to release disk blocks. fstrim was not supported in older operating systems such as Ubuntu 14.04, but Ubuntu 16.04 and CentOS 7 both support it.
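fstrim can only release blocks if the virtual disk advertises discard support to the guest. As a rough sketch, the check below reads discard_max_bytes from sysfs, where a non-zero value means the kernel can pass discards down to the device. The function name supports_discard and the SYSFS_ROOT override are my own additions for illustration, not part of any tool:

```shell
# Report whether a block device advertises discard (TRIM/UNMAP) support.
# A non-zero queue/discard_max_bytes in sysfs means discards can be issued.
# SYSFS_ROOT is a hypothetical override, only there to make the function
# easy to try against a fake directory tree.
supports_discard() {
    dev="$1"
    f="${SYSFS_ROOT:-/sys}/block/$dev/queue/discard_max_bytes"
    if [ ! -r "$f" ]; then
        echo "$dev: unknown"
        return 2
    fi
    if [ "$(cat "$f")" -gt 0 ]; then
        echo "$dev: discard supported"
    else
        echo "$dev: discard not supported"
        return 1
    fi
}

# Usage inside the VM (device name taken from the df output in this post):
#   supports_discard sda
```

If this reports that discard is not supported, fstrim will most likely fail or release nothing, whatever the guest operating system.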
Let's try it:
In the virtual machine, 5.4G is used. Before running df I ran sync and fstrim /:
vm$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        50G  5.4G   45G  11% /
The real usage of the VM disk (diff against the stock CentOS 7 image):
$ rbd diff vms/ae445f4c-38e7-446d-9edd-618533632469_disk | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'
7467.23 MB
This is already quite a lot more than df reports. The virtual machine has been in use for some time, so probably not all deleted data has been released by the filesystem.
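Since the same rbd diff pipeline is repeated throughout this post, it can be wrapped in a small helper. This is only a sketch: the name rbd_diff_mb is made up here, and it assumes the plain-text rbd diff output with the extent length in bytes in the second column. Non-numeric lines such as a header are skipped:

```shell
# Sum the extent lengths (second column, in bytes) of `rbd diff` output
# read from stdin and print the total in MB with two decimals.
# Lines whose second field is not a plain number (e.g. a header) are ignored.
rbd_diff_mb() {
    awk '$2 ~ /^[0-9]+$/ { sum += $2 }
         END { printf "%.2f MB\n", sum / 1024 / 1024 }'
}

# Usage on a Ceph client (image name as used in this post):
#   rbd diff vms/ae445f4c-38e7-446d-9edd-618533632469_disk | rbd_diff_mb
```

Reading from stdin keeps the helper independent of the rbd command itself, so it can also be tried on saved output.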
Add a file of 1G and sync it to disk:
vm$ dd if=/dev/zero of=test.img bs=4M count=250
250+0 records in
250+0 records out
1048576000 bytes (1.0 GB) copied, 1.71766 s, 610 MB/s
vm$ sync
vm$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        50G  6.4G   44G  13% /
The real usage increased by 824 MB. This is not the full 1G, but close:
$ rbd diff vms/ae445f4c-38e7-446d-9edd-618533632469_disk | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'
8291.23 MB
When we delete the file, only a few MB are released from Ceph storage:
vm$ rm -f test.img
vm$ sync
vm$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        50G  5.4G   45G  11% /
Ceph usage:
$ rbd diff vms/ae445f4c-38e7-446d-9edd-618533632469_disk | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'
8283.23 MB
After fstrim:
vm$ fstrim /
Ceph usage:
$ rbd diff vms/ae445f4c-38e7-446d-9edd-618533632469_disk | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'
7107.23 MB
This is even less than before the 1G test file was added. This is probably due to additional filesystem blocks being released.
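To make the changes between the steps easier to see, the four rbd diff totals from this post can be fed through a small awk filter that prints each value with its difference from the previous step. The function name usage_deltas and the step labels are only illustrative:

```shell
# Read "label value_in_MB" lines from stdin and print each value together
# with its change relative to the previous line.
usage_deltas() {
    awk '{
        if (NR > 1) printf "%-14s %8.2f %+9.2f\n", $1, $2, $2 - prev
        else        printf "%-14s %8.2f\n", $1, $2
        prev = $2
    }'
}

# Totals copied from the rbd diff output in this post.
usage_deltas <<'EOF'
baseline 7467.23
after_dd 8291.23
after_rm 8283.23
after_fstrim 7107.23
EOF
```

With these figures the changes are +824 MB after dd, -8 MB after rm and -1176 MB after fstrim, which leaves the image 360 MB below the starting point.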