With today’s post we continue the blog series about our “New Software”.
1. Rewriting the software driving our site
2. Modernising the Web Frontend
3. The Heart of Gold
As outlined in the previous post there were quite a few things to decide upfront, which doesn’t make actual implementation any easier as we will see today when going into more details with the web front end. Beware of the details though as this (and the next part) are going to be much more technical.
Modernising the Web Frontend
With the last post New Software: Rewriting the software driving our site you got a glimpse of why we are rewriting our software. Today we’re going to elaborate a bit more on the front-end part and will provide more in-depth information on the technical changes and news to expect. The main aim of the rewrite is to strengthen the security and to improve maintainability of the source code. On the other hand we aim to avoid showing too much changes to our users.
With choosing Java as the language for the front end we are taking the advantage to develop a modular design based on Object Orientated Programming. This should help to get a better transparency of the source code, which the old PHP source both was lacking. In addition to these basic aspects the use of Java allows for easier packaging of the final application as well as for reworking the code when there is need to do so. This can be aided with several refactoring and code management tools available for Java as the general tooling support is quite good here. One such example is the nice integration of static analysis using tools such as FindBugs to automatically find issues that a static analysis of the code can find, e.g. null-pointer dereferences, misused APIs, thread safety issues, …
As a continuous integration process units test are introduced to test all new features directly while writing the source code. Our tool for this is Jenkins. The unit tests do not only cover the new applied features, the whole source code is tested as well, so that side effects can be discovered in an early stage of the development. Right now there are about 250 test cases running, with many more still to come.
Another important change is the separation from the source logic and the HTML output. While PHP encourages you to intermix HTML templates and executable source, it threatens you to bite you whenever you are not cautious enough. To make it harder to be careless we decided to reduce the power our templates have: The templates are reduced to simply printing values and very easy conditions and loops. An additional safety measure with the template engine is the implicit escaping that is caused for every string that should be printed – if you need to output preprocessed HTML to the template you have to explicitly state so in the template.
The HTML output is based on HTML5 and formatted with CSS3 using LESS to support responsive design patterns and accessibility. Additionally we completely switched to use Unicode, having UTF-8 as our standard encoding.
There is also a change in the URL structure. The new URL concept is based on REST and can easily be scaled to a RESTful API in the future. No more ?id=42 or other meaningless numeric IDs for the various actions.
To avoid SQL injections we opted to use implicit escaping for all data sent to or retrieved from the database; similar to the implicit escaping used with all inputs to the template engine. This way we can process raw data internally and have the necessary conversions and escaping operations whenever the context of the information changes allowing us to concentrate on the actual processing and logic instead of caring about the security implications.
Looking at security we invested some work in reworking the way authentication is handled. While the old system only supported password-based authentication and client certificate login using X.509 there was no chance of combining multiple factors as a requirement for successful login to an account. This lack of multi-factor authentication caused some trouble in the past. To avoid this and to provide more flexibility for our users we want to change the authentication to allow for any combination of login methods, as long as at least one “strong” authentication (like password login, secure token, client certificate) in addition to an optional “multi-factor” authenticator is provided (additional factors may be older TOTPs, single access tokens, ping tokens).
To raise the security level of the password storage we will use with SHA2-256 to store passwords in a save manner. This ensures an attacker can’t easily parallelise brute-forcing or attacking the password hashes as SCrypt guarantees through its design that an attacker needs a certain memory and amount of computational power to calculate the hash value. A switch to SCrypt-SHA2-512 or other even more secure methods later on is already laid out in the source code and can be activated transparently upon next successful password login of the user.
While discussing the authentication of the user let’s take a short break to look at some other security features the new system will implement. As one of the most critical parts with transmitting any information on a secure channel is getting the channel secured we’ll be activating HSTS or also known as HTTP Strict Transport Security. Using HSTS as a baseline we ensure that everyone visiting the site to access sensitive information can benefit of protection from eavesdroppers for future visits. Even if this brings no security on its own – except for mitigating SSL stripping attacks, it sets up the stage for the second mechanism: HPKP. Using Public Key Pinning we not only enforce browsers who had a once good, encrypted channel, to enforce encryption on subsequent visits (as HSTS does), but in addition also tell what valid keys for such a good connection are. Like with HSTS also HPKP suffers the issue of being inline to the connection and thus vulnerable to a determined attacker. With DNS-based Authentication of Named Entities (or for short: DANE) we introduce a second vector for verification a browser can use when determining acceptable security certificates. And even with these mechanisms deployed there is still a lot of room for determining if the connection is secure, some of which will sure follow in a later step.
But apart from ensuring the browser of the user talks to the correct server it is equally important to secure the display of information against mistakes our systems might be doing. One of the options we choose to implement was restricting ourselves using CSP (Content Security Policy) and a separation of our content delivery hosts. Basically to avoid an attacker executing any JavaScript not authorized we split execution of scripts to a subdomain which only carries static script files, while all content that might be under control of the attacker (e.g. the field containing the user name) is delivered on a strictly No-JavaScript subdomain. Thus even XSS was possible it would harden ourselves ill side-effects. To additionally avoid being framed (no pun intended) we will be including various HTTP headers like X-Frame-Options to explicitly prohibit loading within a frame on the attackers website. Also we ask the user’s browser to handle files in the way we specify which, among others, includes disabling of MIME type sniffing. With these security mechanisms in place it raises the bar for an attacker quite a lot and – on the other side – allows us self-testing the application as CSP allows for reporting violations on its rule set. Explicitly triggering such reports can even test the browser and thus determine the protection level the application can expect from the client (e.g. if images which are included for the sole purpose of triggering a report are loaded rather than ignored we know that the client does not properly implement CSP).
Another change will be introduced for the registration of domain names. In the future there are at least 2 ownership checks required to activate a new domain with our system. There will be a choice between the long known email pings, the verification with a DNS-TXT entry, verification with reading HTTP-content from a special addressed file and the verification with CAcert server certificates securely running on internet based services like HTTP, SMTP or XMPP. The way those tests were designed ensures that someone just observing traffic might notice the checks taking place, but won’t gain any information from such observations. All identifiers in these tests are chosen randomly thus simply searching for a particular filename or content of a file with an internet search engine won’t provide you any means to impersonate that domain.
In addition to the checks during the registration of a domain there will be continuous checks to check a domain is still active. If less then two tests succeed the account owner will be informed that there is a verification problem that should be solved within a grace period before all certificates containing that domain will be revoked automatically.
All posts send from the new software will be digitally signed using S/MIME to ensure that these messages are really generated by the CAcert systems and are not SPAM. This has been planned for a long time for the existing code but we did not manage to introduce it there due to complexity and security when realising this function in PHP.
One last but not least interesting feature to be mentioned is the rework of the certificate issuing process in the front end. While in the old software the process was to get all data from the database and check it against the values given in the CSR, this is turned around for the new software: Now first the given CSR is checked and the data gained from the CSR is checked against the records in the database to pre-fill the certificate request form. Changing the way the issuing works we were able to simplify the interface while allowing for new features to come.
After the discussion on why we choose our coding framework and today’s post about the ideas for the new front end the next post will have a closer look at the details of the ideas of the new signer.