To compile OpenSSL with pkcs11 engines, you need to apply a special patch, which can be found at Miscellaneous OpenSSL Contributions. This patch is maintained by Jan Pechanec, whose blog has more information about it.
The latest contribution is for OpenSSL 0.9.8j, but at the time of writing, OpenSSL was at 0.9.8p. I spent about an hour porting his patch to the latest release. You'll at least need to change the shebang.
Using the Solaris Cool Tools version of gcc (GCC4SS) version 4.3.3, I can use additional niagara2 optimizations that are not available with the OS-bundled gcc.
My compile script ends up looking like (minus my environment variables):
gunzip -c openssl-${OPENSSL_VER}.tar.gz | tar xfvp -
#Change to the build directory
cd openssl-${OPENSSL_VER}
# apply pkcs11 patch
gpatch -p1 < ../pkcs11_engine-0.9.8p.2009-11-19/pkcs11_engine-0.9.8p
# fix the solaris optimizations to use niagara2
cp Configure Configure.old
nawk '/solaris64-sparcv9-gcc/ { gsub(/-mcpu=ultrasparc/, "-mcpu=niagara2") } { print }' Configure.old > Configure
# note the --pk11-libname parameter added by the patch
./Configure --prefix=${OPENSSL_DIR} --pk11-libname=/usr/lib/sparcv9/libpkcs11.so threads shared solaris64-sparcv9-gcc -R${GCCRT_DIR}/lib/sparcv9 -L${GCCRT_DIR}/lib/sparcv9
make
make install
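Before overwriting Configure, it can be worth trying the rewrite on a scratch input. The following is only a sketch: the two sample lines are a simplified, hypothetical stand-in for the real Configure entries, the function name retarget_cpu is made up for illustration, and plain awk is used where the script above uses nawk.

```shell
#!/bin/sh
# retarget_cpu: read Configure-style text on stdin, switch -mcpu to niagara2
# on the solaris64-sparcv9-gcc line only, and print every line exactly once.
retarget_cpu() {
    awk '/solaris64-sparcv9-gcc/ { gsub(/-mcpu=ultrasparc/, "-mcpu=niagara2") }
         { print }'
}

# Simplified, hypothetical sample of two target entries (not the real file).
patched=$(retarget_cpu <<'EOF'
"solaris-sparcv9-gcc","gcc:-m32 -mcpu=ultrasparc -O3",
"solaris64-sparcv9-gcc","gcc:-m64 -mcpu=ultrasparc -O3",
EOF
)

printf '%s\n' "$patched"
```

Only the 64-bit gcc target should come out with -mcpu=niagara2; the 32-bit entry keeps -mcpu=ultrasparc.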
OpenSSL pkcs11
This blog is intended to document my experiences attempting to configure and use OpenSSL with the Solaris SCF pkcs11 engine.
Thursday, December 23, 2010
OpenSSL pkcs11 engine performance
The obvious way to measure performance is to use the openssl speed subprogram.
Simply running the following command
# /usr/sfw/bin/openssl speed -engine pkcs11 rsa
will measure RSA signs and verifies at various key sizes:
                   sign    verify   sign/s verify/s
rsa  512 bits   0.0000s   0.0000s  25429.8  30112.9
rsa 1024 bits   0.0000s   0.0000s  23423.1  28794.3
rsa 2048 bits   0.0000s   0.0000s  21155.0  27410.4
rsa 4096 bits   0.7073s   0.0190s      1.4     52.5
Wow! ... looks fast!
Or is it?
Let's try the same on a Windows Intel box:
                   sign    verify   sign/s verify/s
rsa  512 bits 0.000286s 0.000026s   3500.8  38864.3
rsa 1024 bits 0.001467s 0.000079s    681.6  12578.7
rsa 2048 bits 0.009515s 0.000280s    105.1   3576.4
rsa 4096 bits 0.068150s 0.001098s     14.7    911.0
Or on Solaris x86 under VMware:
                   sign    verify   sign/s verify/s
rsa  512 bits   0.0008s   0.0001s   1208.7  14074.3
rsa 1024 bits   0.0041s   0.0002s    245.0   4995.9
rsa 2048 bits   0.0243s   0.0007s     41.1   1505.2
rsa 4096 bits   0.1485s   0.0020s      6.7    492.9
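But apparently we're not really comparing apples to apples. By default, the speed subcommand bases its rates on user CPU time, and time spent waiting on the crypto hardware presumably is not charged to the process as user CPU time, so the pkcs11 numbers above come out inflated. Somewhere I read that we have to use the "-elapsed" flag of the speed subcommand, which measures wall-clock time instead. If we don't, the comparisons are not fair.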
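To see how much the timing basis can matter, here is a small worked example. The counts and times are made-up round numbers, not measurements: they assume a run in which 4332 signatures complete in 10 seconds of wall-clock time while the openssl process is charged only 0.2 seconds of user CPU. The helper name ops_per_second is also just for illustration.

```shell
#!/bin/sh
# ops_per_second COUNT SECONDS: print operations per second, one decimal place.
ops_per_second() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.1f\n", n / t }'
}

# Hypothetical figures for one test run (assumptions, not data):
signs=4332      # signatures completed
user_cpu=0.2    # seconds of user CPU charged to the process
elapsed=10      # seconds of wall-clock time

echo "by user CPU time: $(ops_per_second "$signs" "$user_cpu") sign/s"
echo "by elapsed time:  $(ops_per_second "$signs" "$elapsed") sign/s"
```

The same amount of work yields 21660.0 sign/s when divided by CPU time and 433.2 sign/s when divided by elapsed time, which is the kind of gap to expect whenever most of the work happens off the CPU. Rerunning the real test with -elapsed gives: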
# /usr/sfw/bin/openssl speed -engine pkcs11 -elapsed rsa
                   sign    verify   sign/s verify/s
rsa  512 bits   0.0003s   0.0002s   3103.7   5225.4
rsa 1024 bits   0.0007s   0.0003s   1482.0   3053.3
rsa 2048 bits   0.0023s   0.0008s    433.2   1286.6
rsa 4096 bits   0.7047s   0.0184s      1.4     54.2
Versus the Windows Intel machine:
                   sign    verify   sign/s verify/s
rsa  512 bits 0.000287s 0.000026s   3481.1  38185.0
rsa 1024 bits 0.001482s 0.000081s    674.9  12400.2
rsa 2048 bits 0.009558s 0.000281s    104.6   3557.0
rsa 4096 bits 0.069450s 0.001092s     14.4    916.1
Versus the Solaris VMware x86 machine:
                   sign    verify   sign/s verify/s
rsa  512 bits   0.0009s   0.0001s   1141.4  13736.3
rsa 1024 bits   0.0042s   0.0002s    238.3   4629.1
rsa 2048 bits   0.0260s   0.0007s     38.5   1480.9
rsa 4096 bits   0.1576s   0.0022s      6.3    456.1
Well, the performance doesn't look so good now!
The trick is that the pkcs11 version uses far less CPU. Try the -multi option to run several speed tests in parallel and compare your CPU usage with top. The pkcs11-enabled version will barely touch the CPUs, whereas a non-pkcs11 version will pin them.
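To put the elapsed-time sign/s figures side by side, a short script can divide one machine's rates by another's. This is just a sketch: the numbers are copied from the SPARC pkcs11 and Windows Intel elapsed tables above, and the function name sign_ratio is made up for illustration.

```shell
#!/bin/sh
# Elapsed-time sign/s per key size, copied from the tables above.
sparc_pkcs11="512 3103.7
1024 1482.0
2048 433.2
4096 1.4"
windows_intel="512 3481.1
1024 674.9
2048 104.6
4096 14.4"

# sign_ratio A B: for each key size in A, print A's sign/s divided by B's.
sign_ratio() {
    printf '%s\n---\n%s\n' "$1" "$2" | awk '
        $1 == "---" { second = 1; next }                  # separator between sets
        !second     { a[$1] = $2; order[++n] = $1; next } # first data set
                    { b[$1] = $2 }                        # second data set
        END {
            for (i = 1; i <= n; i++) {
                k = order[i]
                printf "rsa %4d bits: %.2fx\n", k, a[k] / b[k]
            }
        }'
}

sign_ratio "$sparc_pkcs11" "$windows_intel"
```

By these figures the SPARC box signs about 2.2 times faster at 1024 bits and about 4.1 times faster at 2048 bits, is slightly slower at 512 bits, and manages only about a tenth of the Intel rate at 4096 bits. The 4096-bit sign time is nearly identical with and without -elapsed, which suggests (my guess, not something verified here) that this key size is handled in software rather than on the accelerator.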
Verifying OpenSSL is using pkcs11
It's quite possible the version of OpenSSL you have or have compiled does not have the pkcs11 engine enabled. To show what engines you have:
# /usr/sfw/bin/openssl engine -c -t
(pkcs11) PKCS #11 engine support
[RSA, DSA, DH, RAND, DES-CBC, DES-EDE3-CBC, DES-ECB, DES-EDE3, RC4, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-ECB, AES-192-ECB, AES-256-ECB, AES-128-CTR, AES-192-CTR, AES-256-CTR, MD5, SHA1]
[ available ]
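If you want to script this check, something like the following could work. It is a sketch under assumptions: the helper name has_available_engine is made up, it relies only on the output layout shown above (an engine line starting with the id in parentheses, followed by an "[ available ]" line), and the sample input has its capability list shortened.

```shell
#!/bin/sh
# has_available_engine NAME
# Reads `openssl engine -c -t` style output on stdin and succeeds only if
# engine NAME is listed and its block contains "[ available ]".
has_available_engine() {
    awk -v id="($1)" '
        index($0, id) == 1            { in_block = 1; next }  # wanted engine starts
        /^\(/                         { in_block = 0 }        # another engine starts
        in_block && /\[ available \]/ { ok = 1 }
        END { exit (ok ? 0 : 1) }'
}

# Sample input: the output shown above, capability list shortened.
if has_available_engine pkcs11 <<'EOF'
(pkcs11) PKCS #11 engine support
 [RSA, DSA, DH, RAND]
     [ available ]
EOF
then
    echo "pkcs11 engine is available"
else
    echo "pkcs11 engine is missing or unavailable"
fi

# On a real system the input would come from the command itself:
#   /usr/sfw/bin/openssl engine -c -t | has_available_engine pkcs11
```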
But this doesn't guarantee that you are actually using the engine. You can check whether the hardware is really being used with kstat, passing "-n ncp0" or "-n n2cp0", for example:
# /usr/bin/kstat -n ncp0 -s rsaprivate
module: ncp                             instance: 0
name:   ncp0                            class:    misc
        rsaprivate                      6781417
To watch its use:
# while true; do kstat -n ncp0 | grep rsaprivate | nawk '{ print $2 }'; sleep 1; done
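The loop prints the raw cumulative counter, so activity shows up only as the number creeping upwards. A small filter can turn those samples into per-interval differences. This is a sketch: the name counter_deltas and the three sample values are made up for illustration, and plain awk is used in place of nawk.

```shell
#!/bin/sh
# counter_deltas: read one cumulative counter value per line on stdin and
# print how much it grew since the previous sample.
counter_deltas() {
    awk 'NR > 1 { print $1 - prev } { prev = $1 }'
}

# Three made-up samples, as if taken one second apart (assumed values).
printf '%s\n' 6781417 6781417 6784520 | counter_deltas

# On the real machine, the watch loop could be piped into the filter:
#   while true; do kstat -n ncp0 | grep rsaprivate | nawk '{ print $2 }'; sleep 1; done | counter_deltas
```

For the sample values this prints 0 and then 3103, meaning the hardware was idle in the first interval and handled 3103 RSA private-key operations in the second.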
So to verify that OpenSSL is really using the hardware, run an openssl speed test while the watch loop above is running. The rsaprivate counter should climb for as long as the test runs:
# /usr/sfw/bin/openssl speed -engine pkcs11 -elapsed rsa
engine "pkcs11" set.
You have chosen to measure elapsed time instead of user CPU time.
OpenSSL 0.9.7d 17 Mar 2004 (+ security fixes for: CVE-2005-2969 CVE-2006-2937 CVE-2006-2940 CVE-2006-3738 CVE-2006-4339 CVE-2006-4343 CVE-2007-5135 CVE-2008-5077 CVE-2009-0590)
built on: date not available
options:bn(64,32) md2(int) rc4(ptr,char) des(ptr,risc1,16,long) aes(partial) blowfish(ptr)
compiler: information not available
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: ftime
                   sign    verify   sign/s verify/s
rsa  512 bits   0.0003s   0.0002s   3103.7   5225.4
rsa 1024 bits   0.0007s   0.0003s   1482.0   3053.3
rsa 2048 bits   0.0023s   0.0008s    433.2   1286.6
rsa 4096 bits   0.7047s   0.0184s      1.4     54.2
Note that openssl does not use all of the possible SCF pkcs11 functions.
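The check described above (read the counter, run a workload, read the counter again) can also be wrapped into a single pass/fail helper. The sketch below is an assumption-heavy illustration: counter_increased, read_rsaprivate and run_speed_test are made-up names, and the kstat -p parseable output form is used instead of the grep pipeline shown earlier. Other activity on the machine can bump the counter too, so a pass is a hint rather than proof.

```shell
#!/bin/sh
# counter_increased READ_CMD WORK_CMD
# Runs READ_CMD (must print one integer), then WORK_CMD, then READ_CMD again.
# Succeeds only if the second reading is larger than the first.
counter_increased() {
    before=$("$1") || return 2
    "$2" >/dev/null 2>&1
    after=$("$1") || return 2
    echo "before=$before after=$after"
    [ "$after" -gt "$before" ]
}

# Assumed wrappers for the commands used in this post. Nothing runs until
# counter_increased is called with them.
read_rsaprivate() { kstat -p -n ncp0 -s rsaprivate | awk '{ print $NF }'; }
run_speed_test() { /usr/sfw/bin/openssl speed -engine pkcs11 -elapsed rsa; }

# Usage on the T1/T2 machine:
#   counter_increased read_rsaprivate run_speed_test && echo "hardware was used"
```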
Overview
Some Solaris 10 machines support hardware-accelerated SSL. These include those with the UltraSPARC T1 and UltraSPARC T2 chips, which are sometimes referred to as niagara1 and niagara2.
The official documentation for this starts at Using the UltraSPARC cryptographic accelerators.
Default installations of Solaris on these machines will include a version of OpenSSL that works with the hardware. This is the one in /usr/sfw/bin.
To compile Apache against this one, add the configure option "--with-ssl=/usr/sfw".
Asking this binary for its version shows something like:
# /usr/sfw/bin/openssl version
OpenSSL 0.9.7d 17 Mar 2004 (+ security fixes for: CVE-2005-2969 CVE-2006-2937 CVE-2006-2940 CVE-2006-3738 CVE-2006-4339 CVE-2006-4343 CVE-2007-5135 CVE-2008-5077 CVE-2009-0590)
So Solaris starts from OpenSSL 0.9.7d and adds fixes for various CVEs on top.
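If you need to check which fixes a given machine carries, the version string can be split into the base version and the list of CVEs. This is a sketch: base_version and list_cves are made-up helper names, and the string is the one shown above, hard-coded so the example runs anywhere.

```shell
#!/bin/sh
# Version string as printed above (hard-coded for the example).
version='OpenSSL 0.9.7d 17 Mar 2004 (+ security fixes for: CVE-2005-2969 CVE-2006-2937 CVE-2006-2940 CVE-2006-3738 CVE-2006-4339 CVE-2006-4343 CVE-2007-5135 CVE-2008-5077 CVE-2009-0590)'

# base_version STRING: the second word is the upstream release.
base_version() { printf '%s\n' "$1" | awk '{ print $2 }'; }

# list_cves STRING: one CVE id per line (split on spaces and ")").
list_cves() { printf '%s\n' "$1" | tr ' )' '\n\n' | grep '^CVE-'; }

echo "base version: $(base_version "$version")"
echo "patched CVEs: $(list_cves "$version" | wc -l | tr -d ' ')"

# On a real system:
#   version=$(/usr/sfw/bin/openssl version)
```

For the string above this reports base version 0.9.7d with 9 CVE fixes applied.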
Next -> how do we know if we're using the crypto accelerators?