Author Archives: dirk astrath

Visit at the Datacenter on 2024-02-02

Today we visited the datacenter again to return the newly installed backup machine webdb2 and verify some settings on the signer machines.

While we were onsite, we updated nearly all critical machines (including our main firewall), which made our services unavailable for a few minutes.

After all updates were done, we ran some tests, including issuing Class1 and Class3 certificates (client and server). A minor issue with the crl-server (the rsync service was not running) was corrected remotely afterwards.

Both signers are now communicating with their webdb-servers. In the next days we’ll set up an automatic backup of webdb1 to webdb2 so that webdb2 can fully replace webdb1 in case of a failure that cannot be corrected remotely.

Client certificate login temporarily not possible

Today we were informed about an issue with client certificates issued on or after December 20, 2023 when they are used to log in to www.cacert.org.

We immediately switched off the login to www.CAcert.org for investigation.

We have just activated the Username/Password-Login again, but will keep the certificate login closed until the issue is resolved.

We will post an update with more details and plan to activate certificate login again as soon as the issue is fixed.
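As an illustration of the date condition mentioned above, here is a minimal sketch of how a certificate's issue date could be compared against the cutoff. It is not the check used on www.cacert.org. The function name is hypothetical, and the cutoff is assumed to mean midnight UTC on December 20, 2023, since the post only gives the date.

```python
import ssl
from datetime import datetime, timezone

# Assumption: the post names only the date, so midnight UTC is used as the cutoff.
CUTOFF = datetime(2023, 12, 20, tzinfo=timezone.utc)


def issued_on_or_after_cutoff(not_before: str, cutoff: datetime = CUTOFF) -> bool:
    """Return True if the certificate's notBefore time is on or after the cutoff.

    `not_before` uses the OpenSSL text form, e.g. "Dec 20 00:00:00 2023 GMT".
    (Hypothetical helper for illustration only.)
    """
    # ssl.cert_time_to_seconds converts the OpenSSL time string to epoch seconds.
    issued = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_before), tz=timezone.utc)
    return issued >= cutoff


if __name__ == "__main__":
    for stamp in ("Dec 19 23:59:59 2023 GMT", "Dec 20 00:00:00 2023 GMT"):
        print(stamp, issued_on_or_after_cutoff(stamp))
```

The notBefore string of a given certificate can be read with standard tools such as `openssl x509 -noout -startdate`.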

CAcert Services mostly running again

On Wednesday another visit to the datacenter took place, during which we installed the updated webdb1-machine in the rack.

There are still some minor issues left (e.g. language selection for the main website, automatic mails). These features will be activated again remotely within the next days.

This time, the critical team's available time was taken up by investigations (e.g. what caused the outage, and why the internal routines and the RAID did not work), by non-CAcert-related matters (we all have families and jobs, which are time-consuming as well), and by an outage of the usable internet connection on the critical team's side.

Speaking of which: If you live in or near the Netherlands and want to lend us a hand with infrastructure and (possibly) the critical team, feel free to contact us via support.

Critical servers upgrade project

A faulty connection cable between www.cacert.org and the signer made it necessary to travel to the datacenter this weekend instead of later this year as planned, so we were able to finish this part earlier than expected: we completed the last steps of moving CAcert to more modern hardware and software on the critical servers.

This project was started “somehow” in May 2020, when the signer power board broke just before the Corona lockdown took place. The old signer was replaced by the same model at that visit. Since then we have had several outages, mainly caused by broken hardware, sometimes noticed by our members, sometimes only visible in our internal monitoring.

Today the last of the old servers (our signer) was powered down, as it was replaced by two modern machines running a more recent Debian release but keeping the old signer code.

The complete hardware replacement project reduced the power consumption of all CAcert servers by more than 60%.

But that’s not all: We plan to move our signer environment to new software written in Go, but here we need YOUR help in testing and reviewing the code. Feel free to contact support@cacert.org to get in touch with our experts.

Upcoming changes during Pentecost

+++ Update +++ www.cacert.org is now running on a new server, first tests were successful. Some fine-tuning still needs to be done afterwards +++ update +++

During the long weekend around Pentecost (“Pfingsten” as it is called here in Germany) we’re planning the next step in replacing some hardware at the datacenter.

The main reason for the visit to the datacenter on Monday is to plug the serial connection between our webserver and signer into the new machine.

As our main website will move to a new server, which was installed in the datacenter during the last visit, there will be an interruption of service while doing the final copy and reconfiguration of the firewall (hopefully not longer than one hour).

While we’re at the datacenter we’re adding two SSD drives to infra02. During the activation of the host system on these SSDs, the services running on infra02 (like blog, wiki etc.) may be inaccessible and/or slower than usual.

After all services are moved (remotely/afterwards) from the HDDs to SSDs everything should be active again … and most likely faster.

At a later visit (planned in July) the old sun1-server and old infra02-HDDs will be removed from the rack.

The final step of the hardware upgrade/replacement in the critical environment will be the replacement of the old signer machine(s) by new servers and HSM modules. For this step the software team as well as the development team need some assistance in reviewing and testing, especially of the code (written in Go). Feel free to contact us via support@.c.o, the mailing lists or comments on this blog entry.

Behind the scenes …

… we’ve just activated our own OCSP-resolver on our new arm64-servers.

This sounds a little unspectacular, but it’s a big milestone in replacing hardware and software within our environment, as the old OCSP-resolver software could not be ported to a recent Debian and arm64 environment.

All other critical services (like Nameserver and CRL-Serving) were already moved successfully to our new power-saving machines (two Raspberry Pi 4) in the last weeks/months. OCSP needed some development and testing.

The virtual machines in the old environment are now stopped. Within the next days the (power-consuming) sun3-server will get its final shutdown, and it will be removed from the CAcert rack during the next visit to the datacenter.

Our main website and signer-software will still be kept running on dedicated servers.

Upcoming Changes for www.cacert.org

Today we switched the connection to our main website as a preparation for a “bigger” change. Unfortunately this (temporary) change is not IPv6-capable, so only IPv4 is working currently.

Over the weekend we plan to move www.cacert.org to another server with a more recent environment and add a second firewall to our rack. During this server transition you may face some issues while using www.cacert.org; after the weekend the services should be back to normal.

Early next week we’ll enable IPv6 again for our main website (maybe by using a new IPv6-Address, but that’s not yet decided).

All other services (like blog/wiki/bugs/…) should remain active as usual as there is currently no planned update.
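To see from the outside whether a hostname currently resolves over IPv4 only or over both protocols, a small check along the following lines can help. This is a sketch only: the function names are hypothetical, and the result depends on the resolver and network of the machine running it.

```python
import socket


def summarize_families(addrinfo):
    """Report which IP families appear in a getaddrinfo() result list."""
    families = {entry[0] for entry in addrinfo}
    return {
        "ipv4": socket.AF_INET in families,
        "ipv6": socket.AF_INET6 in families,
    }


def resolved_families(host, port=443):
    """Resolve `host` and report whether IPv4 and/or IPv6 addresses came back.

    Requires working DNS resolution; raises socket.gaierror otherwise.
    """
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return summarize_families(infos)


if __name__ == "__main__":
    print(resolved_families("www.cacert.org"))
```

Keeping the summary step separate from the lookup makes it possible to test the logic without network access.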

Nameserver-Changes for CAcert.org -update-

Update: The nameserver transition is now finished; new DNSSEC records are set and active. KSK and ZSK were replaced by a CSK.

In the ongoing process to update hard- and software we’re moving our main domain cacert.org to another master-nameserver-machine (with different nameserver-software) within our rack …

As we’re using DNSSEC to secure our domains, we need to update the KSK and ZSK keys for our domains during this process, too.

Therefore you may face some DNSSEC-errors or issues in resolving cacert.org-domains within the next days, but this should resolve itself within some hours/days.

As soon as the nameserver move is finished, I’ll update this post.

Todo: Give ns1.cacert.org the “old” nameserver-address again (after next hardware-change onsite) so secondary-nameserver ns3.cacert.org can get back to work. ns3 is currently not listed at our registrar, so not active for CAcert-Domains.
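For readers who want to check which kind of key a resolver returns for a zone, the flags field of a DNSKEY record gives a first hint. The sketch below is illustrative only: the function name is hypothetical, and it relies on the two flag bits defined for DNSKEY records (value 256 for the Zone Key bit, value 1 for the Secure Entry Point bit). The flags alone cannot tell a KSK from a CSK; that a CSK is commonly published with flags 257 is an assumption here, not something stated in this post.

```python
ZONE_KEY_FLAG = 0x0100  # value 256: the key may be used to sign zone data
SEP_FLAG = 0x0001       # value 1: Secure Entry Point, usually set on KSK/CSK


def describe_dnskey_flags(flags: int) -> str:
    """Classify a DNSKEY flags value (hypothetical helper for illustration)."""
    if not flags & ZONE_KEY_FLAG:
        return "not a zone key"
    if flags & SEP_FLAG:
        # 257: a KSK in a split-key setup, or a CSK when it is the only key
        return "zone key with SEP set (KSK or CSK)"
    # 256: a ZSK in a split-key setup
    return "zone key without SEP (ZSK)"


if __name__ == "__main__":
    for value in (256, 257):
        print(value, describe_dnskey_flags(value))
```

The flags value is the first field of each record shown by `dig DNSKEY cacert.org`.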

(Upcoming) work at the Datacenter

Update #1:

Moving www.cacert.org to new hardware was not successful due to some firewall settings, so we decided to keep the old server active.

During the next days/weeks we’ll change some firewall settings remotely, so short downtimes may occur before we try to activate the new server during the next visit in some weeks.

Original note:

During the next visit at the datacenter on Friday we’re doing some hardware-changes within our rack, especially for our main website www.cacert.org.

As a preparation we will disable most of the services on www.cacert.org on Tuesday evening. The site will be fully operational again after the new server is up and running (most likely during Friday morning).

All other subdomains like blog/wiki/… will only have a short outage while we install a new firewall.

— this post will be updated after returning back from the datacenter —

Datacenter-Visit on 2021-07-16 *UPDATE*

The activation of the signer machine was successful; all pending certificates were processed in the last hours.

Short version: There is a visit at the datacenter planned to enable the signer again (and do some other maintenance there).

Long version:

Unfortunately it was not possible to get the signer back to work during the last visit due to a hardware issue with the hard drive.

Getting the server running on the (pre-)created backup drive failed, too …

Therefore we took the time during the last weeks (when it was not possible to visit the datacenter due to different business and personal reasons) to rebuild a test-environment on spare hardware and to train ourselves.

We should now be able to take the necessary steps to bring the signer machine back into operation.

In the background we’re currently adjusting our processes to make it easier to visit the datacenter outside office hours (as every trip to the datacenter takes several hours in addition to the time we spend working on the servers).

In the future we plan to set up an additional configuration that can take over in case of a failure in the datacenter, but this will still take time. However, the exact procedure needs to be worked out, as the machines are not to be connected to the internet but need to communicate (e.g. for CRL creation, certificate serial numbers etc.).