Benchmark
This page lists tools for measuring disk speed and running other kinds of disk tests.
LBA
Logical Block Address
Sectors
http://www.ibm.com/developerworks/library/l-4kb-sector-disks/
- Sector: a disk is divided into sectors.
- Traditionally each sector was 512 bytes, but since 2010 disks with 4096-byte sectors have become widespread.
- To maintain compatibility, each physical sector (4096 bytes) is split into 8 logical sectors (512 bytes each).
- To determine the physical and logical sector sizes:
sudo cat /sys/block/sda/queue/physical_block_size
4096
sudo cat /sys/block/sda/queue/logical_block_size
512
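To check every disk at once instead of one `cat` per file, the two sysfs files above can be read in a loop. This is a minimal sketch: the function name `sector_sizes` and its optional directory argument are my own (the argument only exists so the function can be pointed at a test directory).

```shell
# Print "device logical physical" for every block device under a sysfs root.
# The root defaults to /sys/block; any other directory is an assumption for testing.
sector_sizes() {
    local root="${1:-/sys/block}" dev q
    for dev in "$root"/*; do
        q="$dev/queue"
        # Skip entries that do not expose both size files.
        [ -r "$q/logical_block_size" ] && [ -r "$q/physical_block_size" ] || continue
        printf '%s %s %s\n' "${dev##*/}" \
            "$(cat "$q/logical_block_size")" "$(cat "$q/physical_block_size")"
    done
}

sector_sizes
```

A line such as `sda 512 4096` indicates a disk with 4096-byte physical sectors that exposes 512-byte logical sectors.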
Another method:
sudo fdisk -l | grep -E "Disk|Sector" | grep -v "identifier"
Output:
Disk /dev/md0 doesn't contain a valid partition table
Disk /dev/md1 doesn't contain a valid partition table
Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
Disk /dev/md0: 983.2 GB, 983214915584 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
Disk /dev/md1: 16.8 GB, 16844193792 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
Here /dev/sda and /dev/sdb have 4096-byte physical sectors. Older disks, such as /dev/sdc in this output, use 512-byte sectors.
Alignment
http://www.ibm.com/developerworks/library/l-4kb-sector-disks/
https://ata.wiki.kernel.org/index.php/ATA_4_KiB_sector_issues
http://people.redhat.com/msnitzer/docs/io-limits.txt
TODO: verify whether unaligned partitions really impact performance
- Misalignment only hurts performance (and "sudo fdisk -l" reports it) when the physical sector size differs from the logical one, for example on a Western Digital Advanced Format disk.
- Each partition has to start at a sector number divisible by 8 (the ratio between 4096-byte physical and 512-byte logical sectors).
- The penalty affects write operations: a misaligned write straddles physical sectors and forces a read-modify-write. Reads are largely unaffected.
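The divisible-by-8 rule can be checked mechanically. The sketch below assumes the start sector is given in logical sectors; the function name `is_aligned` and its default sizes (512 and 4096) are illustrative choices, not part of any tool.

```shell
# is_aligned START [LOGICAL_BYTES] [PHYSICAL_BYTES]
# Succeeds when START (counted in logical sectors) falls on a physical sector boundary.
is_aligned() {
    local start="$1" logical="${2:-512}" physical="${3:-4096}"
    [ $(( (start * logical) % physical )) -eq 0 ]
}

# 63 was the traditional first-partition start; 2048 is the usual modern one.
for s in 63 2048; do
    if is_aligned "$s"; then echo "$s aligned"; else echo "$s misaligned"; fi
done
```

The start sector of an existing partition can be read from sysfs, for example `cat /sys/block/sda/sda1/start`.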
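With fdisk, partitions can be aligned by passing the following geometry when partitioning (224 heads x 56 sectors per track gives cylinders of 12544 sectors, a multiple of 8). Recent fdisk versions align partitions to 1 MiB boundaries by default, so this should only be needed with older versions: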
sudo fdisk -H 224 -S 56 /dev/sda
Calculating IOPS
http://www.techrepublic.com/blog/the-enterprise-cloud/calculate-iops-in-a-storage-array/2182/#.
IOPS = Input/Output operations Per Second
1. Get the disk model
1.1. Find the path:
sudo find /sys/ -type f -name "model"
Output:
/sys/devices/pci0000:00/0000:00:1d.7/usb2/2-3/2-3:1.0/host6/target6:0:0/6:0:0:0/model
/sys/devices/pci0000:00/0000:00:1f.2/ata3/host2/target2:0:0/2:0:0:0/model
/sys/devices/pci0000:00/0000:00:1f.2/ata4/host3/target3:0:0/3:0:0:0/model
In this case I have two ATA disks (the first path belongs to a USB device).
1.2. Show the contents of that path:
cat /sys/devices/pci0000:00/0000:00:1f.2/ata3/host2/target2:0:0/2:0:0:0/model
Output:
ST1000DM003-1CH1
2. Copy/paste that string into a browser and search for it. In my case:
[[http://www.seagate.com/staticfiles/support/docs/manual/desktop/Barracuda%207200.14/100686584.pdf|Desktop HDD Seagate ST1000DM003 1TB]]
3. Take the following values from the document found in step 2:
Category | Units | Value | Comment |
---|---|---|---|
r.p.m. | r.p.m. | 7200 | Measured in revolutions per minute (RPM), most disks you'll consider for enterprise storage rotate at speeds of 7,200, 10,000 or 15,000 RPM with the latter two being the most common. A higher rotational speed is associated with a higher performing disk. This value is not used directly in calculations, but it is highly important. The other three values depend heavily on the rotational speed, so I've included it for completeness. |
Average seek, read | ms | 8.5 | The time (in ms) it takes for the hard drive's read/write head to position itself over the track being read or written. There are both read and write seek times; take the average of the two values. |
Average seek, write | ms | 9.5 | Same as above. |
Average latency | ms | 4.16 | The time it takes for the sector of the disk being accessed to rotate into position under a read/write head. |
IOPS = 1/(((Average seek, read / 1000) + (Average seek, write / 1000))/2 + (Average latency / 1000))
In my case:
IOPS = 1/(((8.5 / 1000) + (9.5 / 1000))/2 + (4.16 / 1000))
IOPS = 1/((0.0085 + 0.0095)/2 + 0.00416)
IOPS = 1/(0.009 + 0.00416)
IOPS = 1/(0.01316)
IOPS = 75.99
That is, the disk can perform about 76 read or write operations per second.
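The same calculation can be scripted so it is easy to repeat for other disks. A minimal sketch using awk for the floating-point arithmetic; the function name `disk_iops` is made up for this example.

```shell
# disk_iops SEEK_READ_MS SEEK_WRITE_MS LATENCY_MS
# IOPS = 1 / (average of the two seek times + average latency), with times in seconds.
disk_iops() {
    awk -v r="$1" -v w="$2" -v l="$3" \
        'BEGIN { printf "%.2f\n", 1 / (((r + w) / 2 + l) / 1000) }'
}

disk_iops 8.5 9.5 4.16   # prints 75.99
```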
IOPS penalty in RAID
The following table shows how many I/O operations a RAID performs for each logical operation, depending on the RAID level and the type of operation.
For example, writing to a RAID 1 takes 2 I/O operations. The IOPS figure calculated in the previous step therefore has to be divided by 2 for writes.
RAID level | Read | Write |
---|---|---|
RAID 0 | 1 | 1 |
RAID 1 (and 10) | 1 | 2 |
RAID 5 | 1 | 4 |
RAID 6 | 1 | 6 |
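Applying the write column of the table to a raw IOPS figure can be sketched as follows. The function name `raid_write_iops` is hypothetical, and the penalties are hard-coded from the table above.

```shell
# raid_write_iops RAW_IOPS RAID_LEVEL
# Divides the raw IOPS by the write penalty of the given RAID level.
raid_write_iops() {
    local penalty
    case "$2" in
        0)    penalty=1 ;;
        1|10) penalty=2 ;;
        5)    penalty=4 ;;
        6)    penalty=6 ;;
        *)    echo "unsupported RAID level: $2" >&2; return 1 ;;
    esac
    awk -v iops="$1" -v p="$penalty" 'BEGIN { printf "%.2f\n", iops / p }'
}

raid_write_iops 76 1   # prints 38.00
```

Read operations carry a penalty of 1 at every level in the table, so the raw figure applies to them unchanged.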
Performance tests
This section lists tools that run tests performing different read/write (I/O) operations and report the results.
They should help show how the system behaves for each type of operation and under which conditions.
iozone
- Installation:
sudo aptitude install iozone3
- Run a test and write the output as XLS:
iozone -a -b output.xls
- Run only the write (-i 0) and read (-i 1) tests. The output goes to an Excel file (do NOT change the extension):
iozone -a -z -i 0 -i 1 -b output.xls
- Run a series of tests:
for i in {1..5}; do iozone -a > "$(hostname)_iozone_$i.out"; done
CPU | RAM | Test | Time (seconds) |
---|---|---|---|
- | - | - | - |
- Graphs:
www.iozone.org/src/current/report.pl
- It needs as input one or more files created like this:
iozone -a > config1.out
And it is used like this:
perl iozone_visualizer.pl config1.out config2.out
www.iozone.org/src/stable/iozone_visualizer.pl
Same as the previous one. Does NOT work.
http://code.google.com/p/iozone-results-comparator/source/checkout
iozone-results-comparator
It requires installing the following packages:
sudo apt-get install gfortran libopenblas-dev liblapack-dev
And then:
sudo pip install scipy
Which also fails for me with:
error: Command "x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/tmp/pip_build_root/scipy/scipy/special -I/usr/lib/pymodules/python2.7/numpy/core/include -I/usr/include/python2.7 -I/tmp/pip_build_root/scipy/scipy/special/c_misc -I/usr/lib/pymodules/python2.7/numpy/core/include -c scipy/special/c_misc/gammaincinv.c -o build/temp.linux-x86_64-2.7/scipy/special/c_misc/gammaincinv.o" failed with exit status 1
----------------------------------------
Cleaning up...
Command /usr/bin/python -c "import setuptools;__file__='/tmp/pip_build_root/scipy/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-kSp3gV-record/install-record.txt --single-version-externally-managed failed with error code 1 in /tmp/pip_build_root/scipy
Traceback (most recent call last):
  File "/usr/bin/pip", line 9, in <module>
    load_entry_point('pip==1.4.1', 'console_scripts', 'pip')()
  File "/usr/lib/python2.7/dist-packages/pip/__init__.py", line 148, in main
    return command.main(args[1:], options)
  File "/usr/lib/python2.7/dist-packages/pip/basecommand.py", line 169, in main
    text = '\n'.join(complete_log)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 53: ordinal not in range(128)
I am looking for another option.
bonnie++
- Installation:
sudo aptitude install bonnie
- Run a test:
sudo bonnie++ -u usuario -m template-1 -d sandbox/ -x 2 -q >> 01.out
sudo bonnie++ -u usuario -n 1000 -m template_1 -d sandbox/ -q >> 04.out
u="user"; d="directory"; for i in {1..5}; do sudo bonnie++ -u "$u" -m "$(hostname)" -d "$d" -q >> "$(hostname)_$i.out"; done
- Output:
1.97,1.97,template-1,1,1389464918,1G,,648,97,120930,14,58219,7,2949,92,217747,14,475.7,8,16,,,,,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,16555us,89077us,243ms,7611us,68846us,179ms,19243us,535us,490us,150us,47us,437us
RAM | No. of files (-n) | File size (MB) (-s) | No. of tests showing +++++ |
---|---|---|---|
512 | - | - | 12 |
512 | 2048 | 1024 | ??? |
Important: if many "+++" appear, those results are being ignored because the test is not considered significant (it finished too quickly to give a meaningful figure). Increase the number of files with the -n parameter.
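To fill in the last column of the table without counting by hand, the CSV lines produced with -q can be scanned for fields made up only of "+" characters. This is a rough sketch: the function name `count_unmeasured` is my own, and it counts both the "+++++" and the "+++" fields, which is how the sample output above adds up to 12.

```shell
# count_unmeasured: read bonnie++ CSV lines on stdin and print, for each line,
# how many fields consist only of "+" characters.
count_unmeasured() {
    awk -F, '{ n = 0; for (i = 1; i <= NF; i++) if ($i ~ /^[+]+$/) n++; print n }'
}

# Shortened, made-up CSV line used only to illustrate the call:
printf '%s\n' 'host,1G,648,+++++,+++,120930,+++++' | count_unmeasured   # prints 3
```

For a real results file the call would be `count_unmeasured < 01.out`.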
- Graphs
https://gist.github.com/npinto/1182653 Works well.
Graph field | bonnie++ output field | Explanation |
---|---|---|
blk out | Sequential Output / Block / K/sec | |
blk rew | Sequential Output / Rewrite / K/sec | |
blk in | Sequential Input / Block / K/sec | |
seeks | Random / seeks / /sec | |
create operations | Sequential Create / Create / /sec | |
delete | - | |
rnd crt | - | |
rnd del | - | |
Profiling
Tools that "listen" to the disk operations performed on the system and report the results. This lets us determine what kind of access patterns the services running on a server generate, so that we can make better optimization decisions.
iostat
http://sebastien.godard.pagesperso-orange.fr/
- Install on Debian:
sudo aptitude install sysstat
- Determine the average block size the system uses on a given disk (avgrq-sz field):
iostat -x /dev/sda
avg-cpu: %user %nice %system %iowait %steal %idle
2.07 0.00 0.51 1.42 0.00 96.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 1.36 4.00 8.62 3.42 237.55 125.16 60.27 0.25 20.60 13.77 37.80 4.49 5.40
Field | Explanation |
---|---|
%user | Show the percentage of CPU utilization that occurred while executing at the user level (application). |
%nice | Show the percentage of CPU utilization that occurred while executing at the user level with nice priority. |
%system | Show the percentage of CPU utilization that occurred while executing at the system level (kernel). |
%iowait | Show the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request. |
%steal | Show the percentage of time spent in involuntary wait by the virtual CPU or CPUs while the hypervisor was servicing another virtual processor. |
%idle | Show the percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request. |
Device: | This column gives the device (or partition) name as listed in the /dev directory. |
tps | Indicate the number of transfers per second that were issued to the device. A transfer is an I/O request to the device. Multiple logical requests can be combined into a single I/O request to the device. A transfer is of indeterminate size. |
Blk_read/s (kB_read/s, MB_read/s) | Indicate the amount of data read from the device expressed in a number of blocks (kilobytes, megabytes) per second. Blocks are equivalent to sectors and therefore have a size of 512 bytes. |
Blk_wrtn/s (kB_wrtn/s, MB_wrtn/s) | Indicate the amount of data written to the device expressed in a number of blocks (kilobytes, megabytes) per second. |
Blk_read (kB_read, MB_read) | The total number of blocks (kilobytes, megabytes) read. |
Blk_wrtn (kB_wrtn, MB_wrtn) | The total number of blocks (kilobytes, megabytes) written. |
rrqm/s | The number of read requests merged per second that were queued to the device. |
wrqm/s | The number of write requests merged per second that were queued to the device. |
r/s | The number (after merges) of read requests completed per second for the device. |
w/s | The number (after merges) of write requests completed per second for the device. |
rsec/s (rkB/s, rMB/s) | The number of sectors (kilobytes, megabytes) read from the device per second. |
wsec/s (wkB/s, wMB/s) | The number of sectors (kilobytes, megabytes) written to the device per second. |
avgrq-sz | The average size (in sectors) of the requests that were issued to the device. |
avgqu-sz | The average queue length of the requests that were issued to the device. |
await | The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them. |
r_await | The average time (in milliseconds) for read requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them. |
w_await | The average time (in milliseconds) for write requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them. |
svctm | The average service time (in milliseconds) for I/O requests that were issued to the device. Warning! Do not trust this field any more. This field will be removed in a future sysstat version. |
%util | Percentage of CPU time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100% for devices serving requests serially. But for devices serving requests in parallel, such as RAID arrays and modern SSDs, this number does not reflect their performance limits. |
avgrq-sz: The average size (in sectors) of the requests that were issued to the device. In this case it is 60.27.
Some results from production servers:
Service | avgrq-sz |
---|---|
MySQL replication | 52.72 (root) 111.86 (mysql) |
Empty, running performance tests | 368.45 |
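Since avgrq-sz is expressed in sectors, and the iostat field descriptions above give 512 bytes per sector, it is often easier to read after converting it to KiB. A small sketch of that conversion; the function name `sectors_to_kib` is hypothetical.

```shell
# sectors_to_kib SECTORS [SECTOR_BYTES]
# Converts an avgrq-sz value (in sectors, 512 bytes each by default) to KiB.
sectors_to_kib() {
    awk -v s="$1" -v b="${2:-512}" 'BEGIN { printf "%.1f\n", s * b / 1024 }'
}

sectors_to_kib 60.27   # prints 30.1
```

So the 60.27 sectors measured above correspond to an average request of roughly 30 KiB.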
collectl
TODO
Low-level disk configuration/commands
- Set read ahead cache:
blockdev --setra $1 /dev/md*
- Clear disk caches:
sync
echo 3 > /proc/sys/vm/drop_caches
- Enable/disable NCQ:
echo 31 > /sys/block/sd#/device/queue_depth   # enable NCQ
echo 1 > /sys/block/sd#/device/queue_depth    # disable NCQ
- Get the physical sector size (in bytes), here for the disk /dev/vda:
sudo cat /sys/block/vda/queue/physical_block_size