Author Archives: BenBE

About BenBE

Since I joined the Software Assessment Team at CAcert in mid-2012 I have been helping with the software. I'm interested in cryptography and regularly break things to rebuild them better. Apart from crypto I work on Open Source Software projects like GeSHi.

New Software: The Heart of Gold

Today’s post is the last part of the blog series about our “New Software”.
1. Rewriting the software driving our site
2. Modernising the Web Frontend
3. The Heart of Gold
This part concludes our small blog series on the “New Software”. It provides more details on the signer in particular, and on some aspects of certificate issuance not covered before.

The Heart of Gold

In the last two posts we talked about why we are rewriting our software and what the main changes on the user side will be. In today’s post we’re going to look behind the scenes to see how the signer, “the heart of the system”, works.

While looking at the existing Perl code the development team decided that a rewrite and refactoring was worthwhile for the signer too. This decision was made after analysis showed that the current code would most likely not be easily adjustable to satisfy our future requirements. Even measured against the most basic requirements laid out for the new software, the code failed in several respects. Thus, instead of heavily rewriting things, it is less work to ignore the existing code base and start anew. Going that way also allowed the team to freely choose the language best suited to their needs. So C++ was chosen as the language for the signer.

One aspect of immediate interest is the new root structure, which is currently under development in a separate project called “NRE” (New Root and Escrow). This structure requires the signer to work with sub roots to sign the certificates. The primary root will only be used to sign the intermediate and sub (“profile”) roots. There will be a set of sub roots for the different kinds of certificates that CAcert is offering: one sub root for email, signing and login client certificates, one for code signing certificates, one for server certificates, one for organisation client certificates, one for organisation server certificates and so on. This also removes the differentiation between different “Classes”, as they have never fit the CAcert model perfectly (see also http://serverfault.com/questions/365846/ssl-certificate-class-2-vs-class-3-vs-class-4 ). Instead the use of these certificates will be defined by certificate profiles which are selected in the front end and passed on to the signer.

To avoid the problem of large CRL files that we are facing with the existing roots, the idea is to have each sub root sign a set of further sub roots, each of which is used only for a shorter time span of e.g. three to six months, so that the CRL files stay at a reasonable size below 1 MiB. This adds complexity for the signer, as it has to choose the correct sub root for the current task automatically. Managing the different CRLs would also be hard with something similar to the current signing software.
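
The automatic choice of a time-limited sub root could look roughly like the following Python sketch. The profile names, the quarterly rotation and the naming scheme (such as `server_2015_1`) are assumptions made for illustration; the actual layout is defined by the NRE project.

```python
from datetime import date

# Hypothetical profile names; the real set is defined by the NRE project.
PROFILES = ("client", "codesign", "server", "org_client", "org_server")


def pick_sub_ca(profile: str, today: date, months_per_ca: int = 3) -> str:
    """Return an illustrative code name for the sub CA valid on `today`."""
    if profile not in PROFILES:
        raise ValueError(f"unknown certificate profile: {profile}")
    # Count rotation periods within the year, starting at 1.
    period = (today.month - 1) // months_per_ca + 1
    return f"{profile}_{today.year}_{period}"
```

Because every sub CA only signs for a few months, each one also only ever has to list the revocations of that period in its CRL.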

Apart from the certificates used for signing there is the technology that handles all the cryptography. One aim here is to be able to switch the crypto backend to whatever we think is the best one at the time, i.e. OpenSSL, GnuTLS, wolfSSL or whatever provides the security guarantees we require. This will allow CAcert to react quickly if severe problems are found in a crypto backend. Currently only OpenSSL is implemented, but the interactions with OpenSSL have been kept together so that implementing another crypto backend is feasible.

Being able to swap out parts of the signer software for other libraries provides us with more flexibility at compile time and lets us reuse the code for testing. Together with unit tests that are run under Jenkins, there is a good chance of spotting problems in the source code before they hit the live system. The continuous integration with Jenkins also lets us check that current changes won’t break existing behaviour unless we actually want that behaviour changed.

Furthermore a new protocol for the communication between the signing system and the database has been developed. First of all, all communication is wrapped in TLS to have an encrypted and authenticated connection in which both sides identify themselves to each other. Inside this secure channel a record-based protocol – similar to what is used to flash firmware to microcontrollers and EPROMs – is used to transmit all information required for the signing task. Building on the information received in those records the signer rebuilds the signing request and verifies that the information (and its proofs of freshness) is correct.

After a few further checks the signer hands this data to the crypto backend. This way all user entered data (common name, subject alternative names, organisation name, etc) stays UTF-8 encoded during the whole process. Also, due to the integrated library, there is no need to re-encode user input in specific formats for an external tool to process the signing task.

The protocol also contains several safety checks to make sure that only verifiable and properly embedded information is sent to and processed on the signer. One of them is a “proof” protocol to transmit facts about the state of the data affecting the signing task. On the other hand the protocol has been kept scalable so that it can be easily extended for future needs, e.g. if additional information or commands need to be processed.

For issuing a new certificate the signer client fetches all necessary information from the database. It then opens a connection to the signer and transmits all data for the certificate, starting with the CSR or SPKAC request. When all data is transmitted the client asks the signer to sign the current certificate (“sign”). The signer checks the request and signs it, but instead of returning the resulting certificate it answers with a log of the current signing session (“setLog”). When the client confirms that the log has been saved (this could be implemented with a hash over the log being transmitted back in the “logSaved” record) the signer transmits the new certificate (“respondCertificate”) and a code name of the sub CA that was actually used to sign this request (“signingCA”). Now the signer client writes the certificate, its serial, issuance and expiry date and possibly more data back to the database and closes this signing session.
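
The order of records in such a signing session can be modelled as a small state machine. The Python sketch below is a toy model of the signer side only: the class and method names are made up, the "certificate" is a placeholder without any cryptography, and the hash-based confirmation follows the "could be implemented" idea mentioned above rather than a confirmed design.

```python
import hashlib


class SigningSession:
    """Toy model of the signer side of one signing session (no real crypto)."""

    def __init__(self, sub_ca: str):
        self.sub_ca = sub_ca
        self.state = "collecting"
        self.log = b""
        self.certificate = None

    def sign(self, csr: bytes) -> bytes:
        """Handle "sign": hold the certificate back and answer with the log."""
        if self.state != "collecting":
            raise RuntimeError("unexpected 'sign' in state " + self.state)
        self.certificate = b"CERT:" + csr  # placeholder for the signed result
        self.log = f"signed {len(csr)} byte request with {self.sub_ca}".encode()
        self.state = "awaiting_log_saved"
        return self.log  # would be sent as "setLog"

    def log_saved(self, log_digest: str):
        """Handle "logSaved": release the certificate once the hash matches."""
        if self.state != "awaiting_log_saved":
            raise RuntimeError("unexpected 'logSaved' in state " + self.state)
        if log_digest != hashlib.sha256(self.log).hexdigest():
            raise ValueError("client stored a different log")
        self.state = "done"
        # Would be sent as "respondCertificate" and "signingCA".
        return self.certificate, self.sub_ca
```

The point of the ordering is that the certificate never leaves the signer before the client has proven that the log of its creation is stored.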

For revoking certificates the signer client fetches serials belonging to the same sub CA (the certificate that actually signed the certificates to be revoked) and sends them (with “addSerial”) over a fresh connection to the signer, together with the information which CA it is (as argument to “revoke”). The signer fetches the current date and time, marks all given certificates as revoked as of that instant, re-signs the CRL and returns the exact revocation date and time, the X509_Algorithm structure used to re-sign the CRL, the actual signature and the “lastUpdate” and “nextUpdate” fields from the CRL (as argument to “revoked”). That data is ASN.1 encoded so it can easily be split afterwards. The client updates its local CRL accordingly with the given dates and the given signature and verifies that the signature is valid again. If the updated CRL validates, the revocation process has completed successfully and both CRLs (on the signer and on the client) are in sync again. If the CRL validation fails, the client requests the full CRL to be transmitted in order to get the CRLs back in sync. This is not expected to happen in normal production service.

The internal code names for the various parts of the new software are based on characters from Michael Ende’s novel Momo. Based on the story we are calling our web front end Gigi, as – like in the story – it leads your way. Cassiopeia on the other hand is a wise, trusty turtle with a hard shell to protect its secrets and thus the perfect name for our signer.

If you are interested in our work you can find the source code of the new software as well as other CAcert projects mirrored on . To help with development of our software please contact the development mailing list cacert-devel@list.cacert.org.
There will be a live presentation of the new software, as well as the possibility to discuss the ideas with its core developers, at the CAcert booth at CLT2015.

New Software: Modernising the Web Frontend

With today’s post we continue the blog series about our “New Software”.
1. Rewriting the software driving our site
2. Modernising the Web Frontend
3. The Heart of Gold
As outlined in the previous post there were quite a few things to decide upfront, which doesn’t make actual implementation any easier as we will see today when going into more details with the web front end. Beware of the details though as this (and the next part) are going to be much more technical.

Modernising the Web Frontend

With the last post, New Software: Rewriting the software driving our site, you got a glimpse of why we are rewriting our software. Today we’re going to elaborate a bit more on the front-end part and provide more in-depth information on the technical changes and news to expect. The main aim of the rewrite is to strengthen security and to improve the maintainability of the source code. On the other hand we aim to avoid exposing our users to too many changes.

By choosing Java as the language for the front end we are taking the opportunity to develop a modular design based on object-oriented programming. This should make the source code more transparent, something the old PHP source was lacking. In addition to these basic aspects the use of Java allows for easier packaging of the final application as well as for reworking the code when there is need to do so. This is aided by the several refactoring and code management tools available for Java, as the general tooling support is quite good here. One example is the nice integration of static analysis tools such as FindBugs, which automatically find issues like null-pointer dereferences, misused APIs, thread safety issues, …

As part of a continuous integration process, unit tests are introduced to test all new features directly while the source code is written. Our tool for this is Jenkins. The unit tests do not only cover the newly added features; the whole source code is tested as well, so that side effects can be discovered at an early stage of development. Right now there are about 250 test cases running, with many more still to come.

Another important change is the separation of the source logic from the HTML output. While PHP encourages you to intermix HTML templates and executable source, it threatens to bite you whenever you are not cautious enough. To make it harder to be careless we decided to reduce the power our templates have: they are reduced to simply printing values plus very simple conditions and loops. An additional safety measure in the template engine is the implicit escaping applied to every string that is printed – if you need to output preprocessed HTML in the template you have to state so explicitly in the template.
The HTML output is based on HTML5 and formatted with CSS3 using LESS to support responsive design patterns and accessibility. Additionally we completely switched to use Unicode, having UTF-8 as our standard encoding.
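
The idea of implicit escaping in the template engine can be illustrated with a few lines of Python. This is only a sketch of the principle: the `${name}` placeholder syntax and the `Raw` marker class are invented for this example and are not the syntax of the real template engine.

```python
import html
import re


class Raw(str):
    """Marks a string as preprocessed HTML that must not be escaped again."""


def render(template: str, **values) -> str:
    """Fill ${name} placeholders, escaping every value unless it is Raw."""
    def substitute(match):
        value = values[match.group(1)]
        if isinstance(value, Raw):
            return value  # the caller explicitly opted out of escaping
        return html.escape(str(value))
    return re.sub(r"\$\{(\w+)\}", substitute, template)
```

Escaping is the default here, so forgetting something leads to doubly escaped output rather than to injected markup.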

There is also a change in the URL structure. The new URL concept is based on REST and can easily be scaled to a RESTful API in the future. No more ?id=42 or other meaningless numeric IDs for the various actions.

To avoid SQL injections we opted to use implicit escaping for all data sent to or retrieved from the database; similar to the implicit escaping used with all inputs to the template engine. This way we can process raw data internally and have the necessary conversions and escaping operations whenever the context of the information changes allowing us to concentrate on the actual processing and logic instead of caring about the security implications.
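
One common way to get this property is to never splice values into the SQL text and to pass them as bound parameters instead. The sketch below shows the principle with Python's built-in `sqlite3` module; the `users` table and the function name are hypothetical, and the post does not say which mechanism the new software uses internally.

```python
import sqlite3


def find_user_id(db: sqlite3.Connection, email: str):
    """Look up a user by email; the value is bound, never spliced into SQL."""
    row = db.execute("SELECT id FROM users WHERE email = ?", (email,)).fetchone()
    return row[0] if row else None
```

A value such as `' OR '1'='1` is then simply compared as a string and matches no row.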

Looking at security we invested some work in reworking the way authentication is handled. The old system supported password-based authentication and client certificate login using X.509, but there was no way to require a combination of multiple factors for a successful login to an account. This lack of multi-factor authentication caused some trouble in the past. To avoid this and to provide more flexibility for our users we want to change the authentication to allow any combination of login methods, as long as at least one “strong” authentication (like password login, secure token, client certificate) is provided, optionally complemented by a “multi-factor” authenticator (additional factors may be older TOTPs, single access tokens, ping tokens).
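
Read as a rule, the login policy above could be expressed like this. The method identifiers and the per-account `required_additional` setting are assumptions for the sake of the example, not names from the actual code.

```python
# Hypothetical identifiers for the login methods named above.
STRONG = {"password", "secure_token", "client_certificate"}
ADDITIONAL = {"totp", "single_access_token", "ping_token"}


def login_allowed(presented: set, required_additional: int = 0) -> bool:
    """Require one strong method plus the extra factors the account asks for."""
    strong = presented & STRONG
    extra = presented & ADDITIONAL
    return len(strong) >= 1 and len(extra) >= required_additional
```

An account that sets `required_additional` to 1 can then no longer be entered with a password alone.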

To raise the security level of the password storage we will use SCrypt with SHA2-256 to store passwords in a safe manner. This ensures an attacker can’t easily parallelise brute-forcing or attacking the password hashes, as SCrypt guarantees through its design that an attacker needs a certain amount of memory and computational power to calculate the hash value. A switch to SCrypt-SHA2-512 or other even more secure methods later on is already provided for in the source code and can be activated transparently upon the next successful password login of the user.
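
For illustration, this is what scrypt-based password storage looks like with the `hashlib.scrypt` function from Python's standard library (available when Python is built against OpenSSL 1.1 or later). The cost parameters `n`, `r` and `p`, the salt length and the function names are example choices, not the values used by the new software.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes = None):
    """Derive a 32 byte scrypt hash; returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                            n=2**14, r=8, p=1, dklen=32)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

With these parameters each hash computation needs about 16 MiB of memory, which is what makes massively parallel guessing expensive.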

While discussing the authentication of the user let’s take a short break to look at some other security features the new system will implement. As one of the most critical parts of transmitting any information on a secure channel is getting the channel secured, we’ll be activating HSTS, also known as HTTP Strict Transport Security. Using HSTS as a baseline we ensure that everyone visiting the site to access sensitive information benefits from protection against eavesdroppers on future visits. Even if this brings no security on its own, except for mitigating SSL stripping attacks, it sets the stage for the second mechanism: HPKP. Using Public Key Pinning we not only make browsers that once had a good, encrypted channel enforce encryption on subsequent visits (as HSTS does), but in addition tell them which keys are valid for such a good connection. Like HSTS, HPKP suffers from being inline to the connection and thus vulnerable to a determined attacker. With DNS-based Authentication of Named Entities (DANE for short) we introduce a second vector of verification a browser can use when determining acceptable security certificates. And even with these mechanisms deployed there is still a lot of room for determining whether the connection is secure; some of this will surely follow in a later step.
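
To make the two headers more tangible, the following sketch builds them in Python. A pin is the Base64-encoded SHA-256 hash of a key's DER-encoded SubjectPublicKeyInfo; the function names, the one-year `max_age` and the use of `includeSubDomains` are example choices, not the configuration of the new system.

```python
import base64
import hashlib


def spki_pin(spki_der: bytes) -> str:
    """Base64 of the SHA-256 hash over a DER-encoded SubjectPublicKeyInfo."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def transport_security_headers(spki_ders, max_age: int = 31536000) -> dict:
    """Build HSTS and HPKP response headers for the given public keys."""
    pins = "; ".join(f'pin-sha256="{spki_pin(der)}"' for der in spki_ders)
    return {
        "Strict-Transport-Security": f"max-age={max_age}; includeSubDomains",
        "Public-Key-Pins": f"{pins}; max-age={max_age}; includeSubDomains",
    }
```

The list of keys should contain at least one backup key that is not in use yet, so that a lost or compromised key does not lock visitors out.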

But apart from ensuring that the user’s browser talks to the correct server it is equally important to secure the display of information against mistakes our systems might make. One of the options we chose to implement is restricting ourselves using CSP (Content Security Policy) and a separation of our content delivery hosts. Basically, to prevent an attacker from executing any unauthorised JavaScript, we split the delivery of scripts off to a subdomain which only carries static script files, while all content that might be under the control of the attacker (e.g. the field containing the user name) is delivered on a strictly no-JavaScript subdomain. Thus even if XSS were possible, this would harden us against its ill side-effects. To additionally avoid being framed (no pun intended) we will include various HTTP headers like X-Frame-Options to explicitly prohibit loading within a frame on the attacker’s website. We also ask the user’s browser to handle files in the way we specify, which among other things disables MIME type sniffing. Having these security mechanisms in place raises the bar for an attacker quite a lot and, on the other side, allows us to self-test the application, as CSP allows for reporting violations of its rule set. Explicitly triggering such reports can even test the browser and thus determine the protection level the application can expect from the client (e.g. if images which are included for the sole purpose of triggering a report are loaded rather than ignored, we know that the client does not properly implement CSP).
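
A set of response headers along these lines could be assembled as in the Python sketch below. The policy shown is a minimal example under stated assumptions: the host name and report path passed in are placeholders, and the exact directives of the real policy are not given in this post.

```python
def content_host_headers(script_host: str, report_uri: str) -> dict:
    """Headers for pages whose scripts may only come from `script_host`."""
    csp = "; ".join([
        "default-src 'none'",            # deny everything not listed below
        f"script-src {script_host}",     # scripts only from the static host
        "style-src 'self'",
        "img-src 'self'",
        f"report-uri {report_uri}",      # browsers report violations here
    ])
    return {
        "Content-Security-Policy": csp,
        "X-Frame-Options": "DENY",            # never render inside a frame
        "X-Content-Type-Options": "nosniff",  # no MIME type sniffing
    }
```

For example, `content_host_headers("https://static.example.org", "/csp-report")` allows scripts from the static host only and asks browsers to report everything else.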

Another change will be introduced for the registration of domain names. In the future at least two ownership checks are required to activate a new domain in our system. There will be a choice between the long-known email pings, verification with a DNS TXT entry, verification by reading HTTP content from a specially addressed file, and verification with CAcert server certificates securely running on internet services like HTTP, SMTP or XMPP. The way those tests are designed ensures that someone just observing traffic might notice the checks taking place, but won’t gain any information from such observations. All identifiers in these tests are chosen randomly, so simply searching for a particular filename or file content with an internet search engine won’t give you any means to impersonate that domain.

In addition to the checks during the registration of a domain there will be continuous checks to verify that a domain is still active. If fewer than two tests succeed, the account owner will be informed that there is a verification problem which should be solved within a grace period, before all certificates containing that domain are revoked automatically.
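
The two building blocks of these domain checks, unguessable identifiers and the "at least two checks" rule, can be sketched in a few lines of Python. The function names, the token length and the keys of the result dictionary are illustrative assumptions.

```python
import secrets


def new_challenge_token() -> str:
    """Random identifier for a DNS TXT value or a verification file name."""
    return secrets.token_urlsafe(32)


def domain_verified(results: dict, minimum: int = 2) -> bool:
    """True if at least `minimum` of the configured checks succeeded."""
    return sum(1 for ok in results.values() if ok) >= minimum
```

A call like `domain_verified({"email": True, "dns": True, "http": False})` passes, while a domain with only one working check would enter the grace period.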

All mails sent from the new software will be digitally signed using S/MIME to ensure that these messages are really generated by the CAcert systems and are not spam. This has been planned for the existing code for a long time, but we did not manage to introduce it there due to the complexity and security issues of realising this function in PHP.

One last but no less interesting feature to mention is the rework of the certificate issuing process in the front end. While in the old software the process was to get all data from the database and check it against the values given in the CSR, this is turned around in the new software: first the given CSR is checked, then the data gained from the CSR is checked against the records in the database to pre-fill the certificate request form. By changing the way issuing works we were able to simplify the interface while allowing for new features to come.

After the discussion of why we chose our coding framework and today’s post about the ideas for the new front end, the next post will have a closer look at the details of the new signer.

New Software: Rewriting the software driving our site

In today’s post we start a blog series about our “New Software”. It will consist of three parts:
1. Rewriting the software driving our site
2. Modernising the Web Frontend
3. The Heart of Gold

As some of the details outlined are still work in progress, things may still change. In general, though, they are presented to give you a grasp of what the software team has been working on for the last year, despite being rather silent otherwise. Feedback on these plans is very welcome.

Rewriting the software driving our site

Over the past few months the software team, along with other groups like NRE (New Root and Escrow) and the Policy Group, has been working on rewriting from scratch the software that the whole CAcert website is running on, and this work is still ongoing. The step had been planned for quite some time already, but due to its complexity and the routine work required to keep the current software running it only became possible recently. In this blog post we’d like to describe some of the decisions that were made in this process, as some of them are commonly questioned and somewhat controversial.

When we sat down to plan the new software one of the main reasons to do so was avoiding the pitfalls we are having with the old software from a maintainability point of view. The old software was written back in 2003 and has only seen patches here and there ever since. If you look closely at the current source you’ll notice about ten different coding styles interleaved with each other. Combined with the quirks of PHP this makes for quite a mess that easily becomes unmaintainable once you want to implement new features or try to update the software runtime.

Another important aspect when planning the software was redundancy. As the last year has shown, you want to mistrust your SSL implementation and every other piece of software you use as much as is practical for your purpose. Yet you still want to protect yourself from those errors if there is a way to do so.

Discussions in the software team meetings were quite heated when the initial thoughts were shared and the basic decisions made. One of those decisions was to drop PHP in the new system due to its quirks and inconsistent API. When surveying other options for a replacement we also took into account that we did not want to have both the front-end and the back-end written in the same language, to keep the redundancy we currently have (PHP in the front-end, Perl in the back-end). The list of possible languages contained PHP, Perl, Python, Java, C++, C. Other languages like Groovy, Ruby, Erlang, Haskell and many more were looked at but were discarded, either due to a lack of software assessors confident enough to write security-related code with them or because of weak typing […]. After heavy discussion we settled for Java in the front-end, which was not undisputed in our team either. Thus when selecting Java we knew of the issues that its runtime has and decided to go this way for mainly three reasons:

  • Given the number of CVEs it’s not worse than PHP, even if many people think it is. When surveying the options we explicitly looked into the security track record and counted how many of the issues would have affected us if we restricted the feature set we use to some extent. By doing so and leaving out several features like JSPs, JRMI, and many other advanced features we do not need, we found it is possible to cut back the attack surface to an acceptable measure.
  • There is really good tooling support for software design, quality control and testing as well as for refactoring. Given the right tools you can prove that your software still works the same before and after a change was made – and all you need to do is write a few rules saying how you get from A to B.
  • There are lots of people who know the language well enough to write code with it that runs without falling over at every occasion – something which needs years of experience with languages like PHP and C++. Comparing these aspects to Python we weren’t as confident about the tooling support and were in addition lacking people in our team who could have done the software reviews. Thus, much as we’d love to use something provable like Haskell, there’s no point in doing so if nobody can assess whether a given patch is correct.

A similar situation arose with the signer (currently this part is called CommModule). Although this part basically “works” it’s always a hassle if anything needs to be changed, as only few people in our team are confident enough to write Perl. Thus Perl was excluded for the signer too. Instead we settled for C++, which might be counterintuitive considering the common prejudices regarding its memory management. While that is similar to the broken one from C, C++ has a lot of abstractions that actually make life easier. By wrapping the cryptography interfaces of the library used you can keep memory management at a safe distance. The use of C++ also enables us to freely choose among all the (broken) SSL/TLS implementations available, something most other languages would not offer without writing some kind of wrapper.

From the cryptography point of view we discussed several available implementations but ruled out one thing right at the start: we will NOT implement any cryptography ourselves on the signer. Thus the cryptography should come from an existing implementation like OpenSSL, LibreSSL, GnuTLS, libNSS, NaCl, CyaSSL/wolfSSL, PolarSSL/mbed TLS. As several implementations have had differing issues in recent times we decided to keep this part of the new system flexible, so that the implementation can be switched without much effort should problems arise. Our favoured backends here were GnuTLS with OpenSSL/LibreSSL as fallbacks, well knowing that each has its quite distinct set of issues.

With the question of programming language answered let’s head over to the software design decisions. One problem of the old system was its inherent lack of any meaningful documentation, something which made the old software basically unfit for any meaningful audit and hell to maintain. Additionally the old software has a structure which makes writing sensible tests a nightmare (if possible at all). To work around these issues in the new software we plan to maintain documentation inline as part of the code, to make updating it alongside changes to the code as easy as possible. Also most parts of the new software are covered by a set of unit and integration tests, which simplifies the life of our testers: in recent times the test instruction for the old software read “Complete Retest” more than once, as there was a lack of integration tests.

Given all these details on the old software and our plans for the future, we hope this gives a short introduction to what the software team is working on and why we react quite allergically to people asking when things will be “in their browsers”. It’s not that we are slacking off, but that there is far too much work to do to communicate every tiny change. If you are interested in more detail on the new software I’d recommend reading the follow-up parts of this little series, which will detail both the features and the architecture of the web front-end and will have a closer look at the new signer.

STARTTLS support for email ping connections

Based on some support requests recently, mainly from users of the privacy-concerned provider mailbox.org, we decided to include support for STARTTLS into the first phase of normal email pings. When registering a new email address for your account a ping email is sent in two steps, the first of which is performed synchronously when the request is placed (checking the server’s existence), while the actual sending uses a mail server at CAcert to handle delivery and retransmission.
The change was realised in two parts, as the support requests showed two distinct issues with the way mails were sent. The first issue (fixed in bug 1318) was about the order in which the receiving servers for a domain were tried. This sometimes led to situations where mails from CAcert were marked as spam, as the first server tried by our website software accidentally was the spam-trap of that domain. To avoid this the software now respects the priority given in the MX records and shuffles records of equal priority in random order, as allowed by the RFC.
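
The ordering rule for MX records can be sketched in Python as follows. The function name and the `(preference, host)` input format are assumptions for this example; the website software itself is written in PHP.

```python
import random
from itertools import groupby


def order_mx(records, rng=random):
    """Order (preference, host) pairs: lowest preference first, ties shuffled."""
    ordered = []
    by_preference = sorted(records, key=lambda record: record[0])
    for _, group in groupby(by_preference, key=lambda record: record[0]):
        tied = [host for _, host in group]
        rng.shuffle(tied)  # equal preference: try in random order
        ordered.extend(tied)
    return ordered
```

A low-priority spam trap listed with a high preference value therefore always ends up last in the list of servers to try.
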
Once the order in which the servers should be tried has been decided, the second change comes into play, which is explained further in bug 1288 of our issue tracker. The changes in the second part focus on the connection content when talking to a foreign MTA. For this the code implementing the dialog phase has been reworked to look for STARTTLS in the feature list returned for the EHLO command (previously only a simple HELO was sent) and to establish an opportunistic layer of encryption with the other side. For simplicity we will use STARTTLS in this phase whenever it is advertised, and thus fail the connection when no TLS session can be established.
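
Expressed in Python, the reworked dialog phase follows the pattern below. The function name is made up, and `smtp` is expected to be a connected `smtplib.SMTP` object (or anything offering the same `ehlo`, `has_extn` and `starttls` methods); the real implementation lives in the PHP code of the website.

```python
def open_ping_session(smtp) -> bool:
    """Greet with EHLO and upgrade to TLS if the server advertises STARTTLS.

    Returns True when the session is encrypted. If STARTTLS is advertised
    but the upgrade fails, the exception raised by starttls() propagates,
    so the connection attempt fails instead of falling back to plain text.
    """
    smtp.ehlo()
    if not smtp.has_extn("starttls"):
        return False
    smtp.starttls()
    smtp.ehlo()  # the feature list must be requested again after the upgrade
    return True
```

It could be called as `open_ping_session(smtplib.SMTP("mx.example.org", 25, timeout=10))`, where the host name is a placeholder.
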
We hope that this change lifts the delays some of our users experience when registering a new domain of certain providers. Please note, however, that most MTAs use anti-spam measures regardless of encryption, and thus a manual retry after some (usually 5) minutes might still be necessary.

 

All-time record on new users per year

As of a few days ago, more new users have registered with CAcert in 2014 than in any previous year. Currently we are at about 31625 new users and counting, beating the record of 31542 users established in 2006 for the whole year in just about 11 months. With a rate of about 100 new users every day we have a faster-growing user base than ever. Given this support by our members, and contrary to all prophecies of doom, CAcert is far from dead; instead it shows the continuing need for an open and free certificate authority operating for its users instead of for profit.