Thursday, December 23, 2010

Compiling OpenSSL for pkcs11

To compile OpenSSL with pkcs11 engines, you need to apply a special patch which can be found at Miscellaneous OpenSSL Contributions. This patch is maintained by Jan Pechanec, whose blog has more information about it.

The latest contribution is for OpenSSL 0.9.8j, but at the time of writing OpenSSL was at 0.9.8p. I spent about an hour porting his patch to the latest release. You'll at least need to change the shebang.

Using the Solaris Cool Tools version of gcc (GCC4SS) version 4.3.3, I can use additional niagara2 optimizations that are not available with the OS-bundled gcc.

My compile script ends up looking like (minus my environment variables):

gunzip -c openssl-${OPENSSL_VER}.tar.gz | tar xfvp -
#Change to the build directory
cd openssl-${OPENSSL_VER}
# apply pkcs11 patch
gpatch -p1 < ../pkcs11_engine-0.9.8p.2009-11-19/pkcs11_engine-0.9.8p
# fix the solaris optimizations to use niagara2
cp Configure Configure.old
nawk '/solaris64-sparcv9-gcc/ { gsub(/-mcpu=ultrasparc/, "-mcpu=niagara2") } { print }' Configure.old > Configure
# note the --pk11-libname parameter added by the patch
./Configure --prefix=${OPENSSL_DIR} --pk11-libname=/usr/lib/sparcv9/libpkcs11.so threads shared solaris64-sparcv9-gcc -R${GCCRT_DIR}/lib/sparcv9 -L${GCCRT_DIR}/lib/sparcv9
make
make install

OpenSSL pkcs11 engine performance

The obvious way to measure performance is to use the openssl speed subprogram.

Simply performing the following command

# /usr/sfw/bin/openssl speed -engine pkcs11 rsa

will measure the rsa signs and verifies at various key sizes:

                  sign    verify    sign/s verify/s
rsa  512 bits   0.0000s   0.0000s  25429.8  30112.9
rsa 1024 bits   0.0000s   0.0000s  23423.1  28794.3
rsa 2048 bits   0.0000s   0.0000s  21155.0  27410.4
rsa 4096 bits   0.7073s   0.0190s      1.4     52.5

Wow! ... looks fast! 

or is it?

Let's try the same on a Windows Intel box:

                  sign    verify    sign/s verify/s
rsa  512 bits 0.000286s 0.000026s   3500.8  38864.3
rsa 1024 bits 0.001467s 0.000079s    681.6  12578.7
rsa 2048 bits 0.009515s 0.000280s    105.1   3576.4
rsa 4096 bits 0.068150s 0.001098s     14.7    911.0

Or a VMware Solaris x86:

                  sign    verify    sign/s verify/s
rsa  512 bits   0.0008s   0.0001s   1208.7  14074.3
rsa 1024 bits   0.0041s   0.0002s    245.0   4995.9
rsa 2048 bits   0.0243s   0.0007s     41.1   1505.2
rsa 4096 bits   0.1485s   0.0020s      6.7    492.9

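But apparently we're not really comparing apples to apples. Somewhere I read that we have to use the "-elapsed" flag of the speed subcommand. Without it, speed reports user CPU time rather than wall-clock time, so time spent waiting on the crypto hardware is not counted and the comparisons are not fair.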

# /usr/sfw/bin/openssl speed -engine pkcs11 -elapsed rsa

                  sign    verify    sign/s verify/s
rsa  512 bits   0.0003s   0.0002s   3103.7   5225.4
rsa 1024 bits   0.0007s   0.0003s   1482.0   3053.3
rsa 2048 bits   0.0023s   0.0008s    433.2   1286.6
rsa 4096 bits   0.7047s   0.0184s      1.4     54.2

Versus the Windows Intel machine:

                  sign    verify    sign/s verify/s
rsa  512 bits 0.000287s 0.000026s   3481.1  38185.0
rsa 1024 bits 0.001482s 0.000081s    674.9  12400.2
rsa 2048 bits 0.009558s 0.000281s    104.6   3557.0
rsa 4096 bits 0.069450s 0.001092s     14.4    916.1

Versus the Solaris VMware x86 machine:

                  sign    verify    sign/s verify/s
rsa  512 bits   0.0009s   0.0001s   1141.4  13736.3
rsa 1024 bits   0.0042s   0.0002s    238.3   4629.1
rsa 2048 bits   0.0260s   0.0007s     38.5   1480.9
rsa 4096 bits   0.1576s   0.0022s      6.3    456.1

Well, the performance doesn't look so good now!
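To put a number on the comparison, the sign/s columns of two runs can be divided row by row. The following is only a minimal sketch and was not used for the measurements above: the function name sign_ratio and the file names in the usage line are made up for illustration, and it assumes each file holds the saved output of one "openssl speed ... rsa" run. On Solaris 10, substitute nawk for awk.

```shell
# Hypothetical helper: compare the sign/s column of two saved
# "openssl speed ... rsa" outputs, row by row.
sign_ratio() {
    # $1 = first run's output file, $2 = second run's output file
    awk '
        # First file: remember sign/s (second-to-last field) per key size.
        $1 == "rsa" && NR == FNR { first[$2] = $(NF - 1); next }
        # Second file: print first/second for key sizes seen in both files.
        $1 == "rsa" && NR != FNR && ($2 in first) && $(NF - 1) > 0 {
            printf "rsa %4d bits  %.2f\n", $2, first[$2] / $(NF - 1)
        }
    ' "$1" "$2"
}

# Usage (file names are placeholders):
#   sign_ratio t2-elapsed.txt intel-elapsed.txt
```

A value below 1 means the first run signs more slowly than the second at that key size.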

The trick is that the pkcs11 version uses a lot less CPU. Try using the option -multi to run multiple speed tests at once and compare your CPU usage with top. The pkcs11-enabled version will barely use the CPUs whereas a non-pkcs11 version will pin the CPU.
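A rough sketch of that experiment follows. The job count of 8 and the choice of rsa1024 are arbitrary picks for illustration, the function names are hypothetical, and it assumes that leaving out -engine makes the same binary fall back to its software RSA code.

```shell
OPENSSL=/usr/sfw/bin/openssl
JOBS=8    # assumption: number of parallel benchmarks, tune to your thread count

speed_args() {
    # $1 = "pkcs11" to route RSA through the engine, anything else for software
    if [ "$1" = "pkcs11" ]; then
        echo "speed -engine pkcs11 -elapsed -multi $JOBS rsa1024"
    else
        echo "speed -elapsed -multi $JOBS rsa1024"
    fi
}

run_speed() {
    # The expansion is left unquoted on purpose so it splits into arguments.
    "$OPENSSL" $(speed_args "$1")
}

# Usage: start one of these, then watch top in a second terminal.
#   run_speed pkcs11
#   run_speed software
```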

Verifying OpenSSL is using pkcs11

It's quite possible the version of OpenSSL you have or have compiled does not have the pkcs11 engine enabled. To show what engines you have:

# /usr/sfw/bin/openssl engine -c -t

(pkcs11) PKCS #11 engine support
 [RSA, DSA, DH, RAND, DES-CBC, DES-EDE3-CBC, DES-ECB, DES-EDE3, RC4, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-ECB, AES-192-ECB, AES-256-ECB, AES-128-CTR, AES-192-CTR, AES-256-CTR, MD5, SHA1]
     [ available ]
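If you want to script this check, something along these lines might do. It is a sketch that assumes the output keeps the layout shown above (an engine line starting with "(pkcs11)" followed by an "[ available ]" line); the function name is made up. On Solaris 10, substitute nawk for awk.

```shell
# Hypothetical helper: succeeds if "openssl engine -c -t" output on stdin
# lists the pkcs11 engine and reports it as available.
has_pkcs11_engine() {
    awk '
        /^\(pkcs11\)/ { in_engine = 1; next }   # start of the pkcs11 entry
        /^\(/         { in_engine = 0 }         # start of any other engine
        in_engine && /\[ *available *\]/ { found = 1 }
        END { exit !found }
    '
}

# Usage:
#   /usr/sfw/bin/openssl engine -c -t | has_pkcs11_engine && echo "pkcs11 available"
```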

But this doesn't guarantee that you are actually using the engine. You can check whether the hardware is being used with kstat, passing the "-n ncp0" option or the "-n n2cp0" option. For example:

# /usr/bin/kstat -n ncp0 -s rsaprivate

module: ncp                             instance: 0
name:   ncp0                            class:    misc
        rsaprivate                      6781417

To watch its use:

# while true; do kstat -n ncp0 | grep rsaprivate | nawk '{ print $2 }'; sleep 1; done
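The raw counter is hard to read while it scrolls past, so a variation that prints the change per second may be handier. This is a sketch: the function names are invented, it relies on kstat's -p (parseable) output, and it needs a shell with $(...) and $((...)) such as ksh or bash rather than the old /bin/sh. On Solaris 10, substitute nawk for awk.

```shell
# Hypothetical helpers for turning the rsaprivate counter into a rate.
kstat_value() {
    # Reads "kstat -p" lines ("module:instance:name:statistic<TAB>value")
    # on stdin and prints only the value column.
    awk '{ print $NF }'
}

counter_delta() {
    # $1 = earlier reading, $2 = later reading
    echo $(( $2 - $1 ))
}

watch_rsaprivate() {
    # Prints roughly how many RSA private-key operations the hardware
    # performed in each one-second interval, until interrupted.
    prev=$(/usr/bin/kstat -p -n ncp0 -s rsaprivate | kstat_value)
    while sleep 1; do
        cur=$(/usr/bin/kstat -p -n ncp0 -s rsaprivate | kstat_value)
        echo "rsaprivate/s: $(counter_delta "$prev" "$cur")"
        prev=$cur
    done
}
```

A steady stream of zeros while a benchmark is running would suggest the work is not reaching the hardware.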

To verify that OpenSSL is really using the hardware, run an openssl speed test while the watch loop above is running:

# /usr/sfw/bin/openssl speed -engine pkcs11 -elapsed rsa

engine "pkcs11" set.
You have chosen to measure elapsed time instead of user CPU time.
OpenSSL 0.9.7d 17 Mar 2004 (+ security fixes for: CVE-2005-2969 CVE-2006-2937 CVE-2006-2940 CVE-2006-3738 CVE-2006-4339 CVE-2006-4343 CVE-2007-5135 CVE-2008-5077 CVE-2009-0590)
built on: date not available
options:bn(64,32) md2(int) rc4(ptr,char) des(ptr,risc1,16,long) aes(partial) blowfish(ptr)
compiler: information not available
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: ftime
                  sign    verify    sign/s verify/s
rsa  512 bits   0.0003s   0.0002s   3103.7   5225.4
rsa 1024 bits   0.0007s   0.0003s   1482.0   3053.3
rsa 2048 bits   0.0023s   0.0008s    433.2   1286.6
rsa 4096 bits   0.7047s   0.0184s      1.4     54.2

Note that openssl does not use all of the pkcs11 functions that the Solaris Cryptographic Framework (SCF) offers.

Overview

Some Solaris 10 machines support hardware SSL. These include those with the UltraSPARC T1 and UltraSPARC T2 chips, which are sometimes referred to as niagara1 and niagara2.

The official documentation for this starts at Using the UltraSPARC cryptographic accelerators.

Default installations of Solaris on these machines will include a version of OpenSSL that works with the hardware. This is the one in /usr/sfw/bin.

To compile Apache against this one, add the configure option "--with-ssl=/usr/sfw".

Asking it for its version shows something like:

# /usr/sfw/bin/openssl version
OpenSSL 0.9.7d 17 Mar 2004 (+ security fixes for: CVE-2005-2969 CVE-2006-2937 CVE-2006-2940 CVE-2006-3738 CVE-2006-4339 CVE-2006-4343 CVE-2007-5135 CVE-2008-5077 CVE-2009-0590)

So Solaris starts with the 0.9.7d version and patches various CVEs.
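Since the fixes are only visible in that version string, a small filter can answer "is CVE X patched here?". This is a sketch with made-up function names, and it assumes the string keeps the "CVE-YYYY-NNNN" tokens separated by spaces as shown above.

```shell
# Hypothetical helpers: read "openssl version" output on stdin.
list_cve_fixes() {
    # Turn spaces and ")" into newlines (octal 012), keep the CVE tokens.
    tr ' )' '\012\012' | grep '^CVE-'
}

has_cve_fix() {
    # $1 = CVE id, e.g. CVE-2009-0590; exit status 0 if it is listed
    list_cve_fixes | grep "^$1\$" > /dev/null
}

# Usage:
#   /usr/sfw/bin/openssl version | list_cve_fixes
#   /usr/sfw/bin/openssl version | has_cve_fix CVE-2009-0590 && echo patched
```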

Next -> how do we know if we're using the crypto accelerators?