Tag Archives: Signing server

Signature server back in operation

The server that signs certificates issued by CAcert on demand has two hard disks, each mirroring the other. When a malfunction occurs, no remote maintenance is possible, because the machine is intentionally not connected to the network. Only a serial cable carries requests and responses between it and the rest of our infrastructure, and no login is possible over that link.

Since 2 August, however, all certificate signing requests had been left pending without being processed. The Critical Infrastructure team therefore intervened on site on 21 August. A problem in the processing of one of the certificates was causing the blockage. The problem has been resolved, but its exact cause remains to be diagnosed. It is part of a series of failures of a kind we had never seen before.

In light of the two other incidents earlier this year related to the file system of our signature server, we needed to increase its resilience. So on 21 August, the Critical Infrastructure team installed a second signature server in the rack as a passive backup to the first. The presence of dedicated serial links to each machine will make it possible in future to switch very quickly to the second signature server in the event of a new problem. In any case, the two servers remain isolated from the network as before.
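The primary/standby arrangement described above can be illustrated with a minimal sketch. It is not CAcert's actual software: the class `SimulatedSerialLink`, the function `sign_with_failover`, the signer names and the reply format are all hypothetical, and the post does not say whether the switch is made by hand or automatically (the sketch assumes an automatic fallback).

```python
class SignerUnavailable(Exception):
    """Raised when a signer does not answer over its serial link."""


class SimulatedSerialLink:
    """Stand-in for a dedicated serial link to one offline signer (hypothetical)."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def exchange(self, request: bytes) -> bytes:
        # A real link would write the request and wait for a reply with a timeout.
        if not self.healthy:
            raise SignerUnavailable(self.name)
        return b"signed-by-" + self.name.encode() + b":" + request


def sign_with_failover(request, primary, standby):
    """Try the primary signer first; fall back to the passive standby."""
    try:
        return primary.exchange(request)
    except SignerUnavailable:
        return standby.exchange(request)


# Simulate a failed primary: the request is answered by the standby instead.
primary = SimulatedSerialLink("signer-1", healthy=False)
standby = SimulatedSerialLink("signer-2")
reply = sign_with_failover(b"csr-0001", primary, standby)
print(reply)
```

Because each signer has its own link, the caller only has to choose which link to use; neither machine needs a network connection for the switch.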

We apologise to our members for the inconvenience, and encourage those living in or near the Netherlands to consider working with our Critical Infrastructure team, which would increase our ability to respond quickly.

At the same time, we hope that yesterday’s intervention marks the end of this long and exceptional series.

New signer proves itself in use

EN: Signer is running again

DE: Signer ist wieder in Betrieb

FR: Signataire fonctionne à nouveau

ES: Firmante vuelve a funcionar

IT: Firmatario è di nuovo in funzione

The signer has been running again since yesterday, Friday, at around 13:00 CEST. While doing other work, we watched the processing for about another hour. By around 0:30 CEST, all outstanding certificate requests (~3000) had been processed.

Things didn’t quite go as planned in June. As soon as something cannot be done remotely (there is no remote access to critical systems for security reasons), someone who is authorised to do so has to go to the data centre in the Netherlands, despite Corona, quarantine, floods, overtime at the company and whatever else comes up. The journey takes maybe two hours, then two hours home again, with the actual work in between. All of this has to happen during the opening hours of the data centre, in your free time, and paying for your own train ticket or petrol. It’s not always easy to reconcile all that. On Friday afternoon, however, the time had come, and the signer has now been running smoothly again for over a day.

As can be seen from the Critical Team’s plan published yesterday, preliminary work is already underway to make the system redundant throughout and even more robust, so that users should no longer notice failures. After all, no one wants such failures! We are very sorry that you had to wait so long. At the same time, we thank the small core team who have sacrificed nights and weekends over the last five weeks to get the technology back up and running for the CAcert community!

Datacenter-Visit on 2021-07-16 *UPDATE*

The activation of the signer machine was successful; all pending certificates have been processed in the last few hours.

Short version: A visit to the datacenter is planned to bring the signer back into operation (and to do some other maintenance there).

Long version:

Unfortunately, it was not possible to get the signer working again during the last visit because of a hardware issue with the hard drive.

The attempt to get the server running on the (pre-)created backup drive failed, too …

Therefore we used the past weeks (when various business and personal reasons made it impossible to visit the datacenter) to rebuild a test environment on spare hardware and to train ourselves.

We should now be able to carry out the steps necessary to bring the signer machine back into operation.

In the background we’re currently adjusting our processes to make it easier to visit the datacenter outside office hours (as every trip to the datacenter takes several hours in addition to the time we spend working on the servers).

In future we plan to set up an additional configuration that can take over in case of a failure in the datacenter, but this will still take time. However, the exact procedure needs to be worked out, as the machines are not to be connected to the internet but need to communicate (e.g. for CRL creation, certificate serial numbers etc.).

CAcert signing server service restored

[29.08.2013 – 18:30] The operation of the CAcert signing server has been restored. It was down from 28.08.2013 13:30 CEST until 29.08.2013 18:30 CEST.
Having replaced a repeatedly failing primary disk drive, we expect no further service outages in the near future. All pending signing and revocation requests were picked up and processed automatically after the service was restored this evening.
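As a quick check of the outage length implied by the two timestamps above, the following sketch computes the difference. Both times are given in CEST, so naive datetimes are sufficient; the variable names are illustrative only.

```python
from datetime import datetime

FMT = "%d.%m.%Y %H:%M"  # day.month.year format used in the announcement

down = datetime.strptime("28.08.2013 13:30", FMT)
up = datetime.strptime("29.08.2013 18:30", FMT)

# Both timestamps are in the same time zone, so plain subtraction is valid.
outage_hours = (up - down).total_seconds() / 3600
print(outage_hours)  # 29.0
```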

CAcert signing server temporarily out of service

[29.08.2013 – 09:00] The CAcert signing server is temporarily out of service. As far as we can tell, the problem started on 28.08.2013 around 13:30 CEST, and is likely to be similar to the problem we saw earlier this week.
Two CAcert critical system administrators will visit the hosting centre this afternoon in order to fix the problem if possible. We hope that the signing service will be back online around 17:30 CEST.
Currently pending signing and revocation requests will automatically be processed after the service is resumed.
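The behaviour described here, where requests wait while the signer is down and are processed once it returns, can be sketched as a simple first-in, first-out queue. This is only an illustration under assumed names (`SigningQueue`, `submit`, `set_online` and the request strings are hypothetical), not CAcert's implementation.

```python
from collections import deque


class SigningQueue:
    """Holds signing and revocation requests until the signer is reachable."""

    def __init__(self):
        self.pending = deque()
        self.processed = []
        self.signer_online = False

    def submit(self, request):
        # Requests are always accepted; they simply wait if the signer is down.
        self.pending.append(request)
        self._drain()

    def set_online(self, online):
        self.signer_online = online
        self._drain()

    def _drain(self):
        # Process waiting requests in arrival order while the signer is up.
        while self.signer_online and self.pending:
            self.processed.append(self.pending.popleft())


queue = SigningQueue()
queue.submit("sign:alice")
queue.submit("revoke:bob")
waiting_during_outage = len(queue.pending)

queue.set_online(True)  # service restored: the backlog is worked off
print(waiting_during_outage, queue.processed)
```

Nothing is lost during the outage in this model; the cost to users is only the delay until the signer is back.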