Have you ever wondered how much effort it would take to adopt SCS standards in your OpenStack cloud?
We wanted to know this too, and as part of our work in the SCS standards team, we evaluated
the process of making a vanilla OpenStack cluster SCS-compliant.
In this blog post, we want to share our findings and the process we went through.
Rest assured – it is actually quite easy to adopt SCS standards!
Where we started from
Our focus in this evaluation was on OpenStack clusters and therefore the IaaS standards, because for the IaaS layer we already
had a reference SCS Compatible IaaS scope at the time we started (in the future, a similar evaluation and blog post for the KaaS layer
is planned).
For the purpose of our evaluation, we set up two OpenStack clusters with Yaook
("Yet Another OpenStack On Kubernetes"): a virtualized test setup in our OpenStack cloud – i.e., OpenStack in OpenStack – and a
bare-metal production setup.
Yaook is a lifecycle management tool for OpenStack which leverages a Kubernetes cluster (provided by
Yaook K8s) to deploy and manage OpenStack components by means of the
Yaook Operator.
For the bare-metal production deployment we additionally used Yaook Bare Metal to deploy
and manage server hardware, including rollout and configuration of operating systems, networks and disks.
This test setup is represented in the following visualization provided in the Yaook documentation:
At the time of writing, a vanilla Yaook deployment is not SCS-compliant and, as such, it is the ideal playground for our evaluation.
Even better: the lessons we learned while adopting IaaS standards in these deployments can be easily transferred to other OpenStack
deployments which do not use Yaook.
Required standards
As explained above, the main effort leading to this post was focused on the IaaS standards, mainly because
it was clearer which standards needed to be fulfilled for the SCS Compatible IaaS scope.
In the SCS standardization framework, a scope groups multiple SCS standards for a certain layer (e.g., IaaS) to provide
a common ground for certification of a cloud service provider (CSP).
These scopes can have multiple versions, which move forward with updated and/or new standards. Older scopes are deprecated
after some time, which gives CSPs time to update but also keeps the technology moving forward.
While all stable standards theoretically need to be complied with, some of them don't have tests
yet and/or are not featured in a scope, and are therefore not checked by most CSPs yet.
In particular, we focused on SCS Compatible IaaS v4, which is the effective scope at the time of writing.
The SCS documentation website lists the currently effective standards for this
scope which are relevant for our OpenStack deployment:
- SCS-0100-v3 Flavor Naming Standard
- SCS-0101-v1 Entropy Standard
- SCS-0102-v1 Image Metadata Standard
- SCS-0103-v1 Standard Flavors and Properties Standard
- SCS-0104-v1 Standard Images Standard
- SCS-0110-v1 SSD Flavors Decision Record (enhances SCS-0100-v3)
Achieving compliance
In theory
After figuring out the relevant standard documents, the information required to achieve compliance needed to be extracted.
The focus here was on analyzing the document for the keywords mentioned in SCS-0001-v1 Sovereign Cloud Standards.
This provided all expected values, configurations and pre-configurable setups relevant for the OpenStack cluster.
Besides the standard documents, another source of information was the tests provided for them. For example, SCS-0104-v1 Standard Images
doesn't itself contain a set of standard images, since they could change over time due to new requirements and SCS didn't
want to change the standard document every time. Instead, the test provides a YAML file with the required
images as well as additional meta information; the exact schema of this file is defined in the standard.
Without going too much in-depth (since most of this can be found in the standards themselves), the following points need
to be achieved in order to provide an SCS-compliant OpenStack cluster:
- Flavors must follow the naming schema defined by SCS-0100-v3 Flavor Naming if they start with SCS-. This naming schema
also requires the underlying assignments (like core count, RAM, etc.) to be aligned with it.
- To fulfill SCS-0101-v1, VMs must provide enough entropy for cryptographic operations.
- Images should be labeled with plain "Distribution Version" names and provide relevant metadata, so-called properties.
These properties are defined by SCS-0102-v1 Image Metadata; some of them are mandatory.
- SCS-0103-v1 defines a list of mandatory and recommended flavors, which also follow the flavor naming scheme.
It requires additional properties, so-called extra specs, to be defined in order to indicate an SCS flavor.
SCS-0110-v1 adds to this, since it requires two additional flavors with local SSDs or NVMe drives as mandatory flavors.
- SCS-0104-v1 defines a YAML file containing a list of mandatory and recommended images as well as metadata like their sources.
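To make the flavor naming point more tangible, here is a minimal sketch of how a basic SCS flavor name can be decomposed. It assumes a simplified subset of the schema (vCPU count and CPU type, RAM, optional disk size and disk type); the full grammar in SCS-0100-v3 has further optional parts that this sketch ignores, and the function name parse_scs_flavor is our own invention for illustration. The units in the output labels (GiB for RAM, GB for disk) reflect our reading of the standard.

```shell
#!/bin/sh
# Simplified parser for SCS flavor names such as SCS-4V-16-100s.
# Assumption: only SCS-<vcpus><cpu type>-<RAM>[-<disk>[<disk type>]] is handled;
# SCS-0100-v3 defines more optional parts, which are NOT covered here.
parse_scs_flavor() {
  re='^SCS-([0-9]+)([LVTC])-([0-9]+)(-([0-9]+)([nhsp]?))?$'
  if printf '%s\n' "$1" | grep -Eq "$re"; then
    # Groups: 1 = vCPUs, 2 = CPU type, 3 = RAM, 5 = disk size, 6 = disk type
    printf '%s\n' "$1" |
      sed -E "s/$re/vcpus=\1 cpu_type=\2 ram_gib=\3 disk_gb=\5 disk_type=\6/"
  else
    echo "not a simple SCS flavor name: $1" >&2
    return 1
  fi
}

parse_scs_flavor "SCS-4V-16-100s"
parse_scs_flavor "SCS-2V-4-20s"
```

For SCS-4V-16-100s this yields 4 vCPUs of type V, 16 GiB of RAM and a 100 GB disk of type s (local SSD). A flavor carrying this name has to be configured with exactly these resources.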
In practice
Luckily, most of these standards can be easily adopted with the help of the OSISM tools openstack-flavor-manager and
openstack-image-manager, which both offer options to create SCS-compliant flavors and images with the correct names
and relevant meta information. openstack-flavor-manager can do this fully automatically, whereas openstack-image-manager requires
a file containing the necessary information; this is nonetheless easier than doing the work manually, since only one
file needs to be maintained and kept up to date with the standards.
First of all, install the tools:
pip install openstack-image-manager openstack-flavor-manager
Assuming that you have admin credentials for your OpenStack cloud (e.g., via ~/.config/openstack/clouds.yaml),
installing all necessary images and flavors requires only a few commands:
openstack-image-manager --cloud $CLOUD_NAME --images images.yaml
openstack-flavor-manager --cloud $CLOUD_NAME --recommended
Of course, you'll need to replace $CLOUD_NAME with your actual cloud name from your clouds.yaml.
The images.yaml contains all images with their metadata, sources and properties to be installed by openstack-image-manager.
We provide an example images.yaml that you can use.
Making all flavors compliant requires a bit more work if only a subset of your compute hosts have local SSD storage.
Local SSD storage is required by the two mandatory flavors SCS-4V-16-100s and SCS-2V-4-20s.
If this is the case, one needs to configure the nova scheduler to support host aggregates and group the compute hosts
with SSDs into an aggregate as described in the OpenStack docs.
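As a rough sketch, creating such an aggregate with the OpenStack CLI could look like the following. The aggregate name ssd-hosts and the host names compute-01 and compute-02 are placeholders, and the run helper with its DRY_RUN switch is our own addition so that the commands are only printed by default. To our understanding, the nova scheduler additionally needs the AggregateInstanceExtraSpecsFilter in its enabled filters; how you set that depends on your deployment tooling.

```shell
#!/bin/sh
# Sketch: group SSD-backed compute hosts into a host aggregate.
# Assumptions: "ssd-hosts" and the host names are placeholders;
# with DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_ssd_aggregate() {
  aggregate="$1"
  shift
  # Create the aggregate and tag it with the property the flavors will match on
  run openstack aggregate create --property ssd=true "$aggregate"
  # Add every remaining argument as a compute host
  for host in "$@"; do
    run openstack aggregate add host "$aggregate" "$host"
  done
}

create_ssd_aggregate ssd-hosts compute-01 compute-02
```

Run it with DRY_RUN=0 once the printed commands look right for your environment.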
Assuming that you have created such an aggregate with the property ssd=true, you can bind the two SSD flavors
to it as follows:
openstack flavor set --property aggregate_instance_extra_specs:ssd=true SCS-2V-4-20s
openstack flavor set --property aggregate_instance_extra_specs:ssd=true SCS-4V-16-100s
Without this additional configuration, workload might be scheduled to non-SSD-capable hosts.
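Once everything is in place, a quick sanity check can help. The following sketch reports which expected flavor names are absent from a flavor listing; the helper name missing_flavors is ours, and only the two SSD flavors mentioned above are used as examples (the full list of mandatory flavors is in SCS-0103-v1).

```shell
#!/bin/sh
# Sketch: report which expected flavor names are absent from a listing.
# Existing flavor names are read from stdin (one per line),
# expected names are passed as arguments.
missing_flavors() {
  existing="$(cat)"
  status=0
  for want in "$@"; do
    if ! printf '%s\n' "$existing" | grep -Fxq -- "$want"; then
      echo "missing: $want"
      status=1
    fi
  done
  return "$status"
}

# Usage against a real cloud (requires the openstack CLI and credentials):
#   openstack --os-cloud "$CLOUD_NAME" flavor list --all -f value -c Name \
#     | missing_flavors SCS-2V-4-20s SCS-4V-16-100s
```

The function exits with a non-zero status if at least one flavor is missing, so it can also be used in a CI job or a cron check.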
Lastly, the entropy standard doesn't require any work as long as you only provide VM images with Linux kernel version 5.18 or newer.
This is the case for all the standard images mentioned in SCS-0104-v1 (which we have already configured above), and as such,
VMs spawned from them will have enough entropy.
Otherwise, you'll need to ensure sufficient entropy is available by some other means, e.g., with special CPU instructions such as
RDSEED/RDRAND.
The details are out of scope for this blog post, but the SCS-0101-v1 standard
lists some possible implementation approaches as a starting point.
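If you are unsure about a particular image, a small check inside a running VM can give a first impression. The sketch below is not the official conformance test: it only compares the kernel version against 5.18, prints the kernel's entropy counter and looks for the RDRAND/RDSEED CPU flags. The function name kernel_at_least_5_18 is ours.

```shell
#!/bin/sh
# Sketch: rough entropy-related checks to run inside a Linux VM.
# Succeeds if the given (or running) kernel version is 5.18 or newer.
kernel_at_least_5_18() {
  version="${1:-$(uname -r)}"
  major="${version%%.*}"        # text before the first dot
  rest="${version#*.}"          # text after the first dot
  minor="${rest%%[!0-9]*}"      # leading digits of the remainder
  [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "${minor:-0}" -ge 18 ]; }
}

if kernel_at_least_5_18; then
  echo "kernel $(uname -r): 5.18 or newer"
else
  echo "kernel $(uname -r): older than 5.18, check entropy sources"
fi

if [ -r /proc/sys/kernel/random/entropy_avail ]; then
  echo "entropy_avail: $(cat /proc/sys/kernel/random/entropy_avail)"
fi

if grep -qwE 'rdrand|rdseed' /proc/cpuinfo 2>/dev/null; then
  echo "CPU exposes RDRAND/RDSEED to the VM"
fi
```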
But at what cost?
A relevant question for us was how much it costs to adopt all these standards in an OpenStack cluster.
Counting only the work time after the cluster setup and excluding debugging (which would be minimal on a second
or third attempt), we estimated around 4-6 person-hours for a minimal, freshly installed cluster. Doing this multiple
times could reduce the effort even further, to something like 1-3 person-hours.
It is important to mention that we had nearly ideal circumstances: no hardware was missing, and we did not face
the additional costs associated with adopting an older cluster which already contained data.
If an older cluster needs to be adapted to the standards, it would be necessary to add metadata to existing images, possibly
change their names and (if desired) change flavor names to the SCS naming schema. A better option is to add the SCS- flavors
as additional flavors, which avoids breaking users of the previously existing flavors. This would require significantly more time;
we estimate around 0.2-1 person-hours per image or flavor. If this needed to be done frequently,
some form of automation would be recommended, but this would also incur some upfront person-hour cost.
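To get a feeling for the range, the per-item estimate can be multiplied out. The counts in the following sketch (10 images, 15 flavors) are purely hypothetical, and the helper name estimate_hours is ours.

```shell
#!/bin/sh
# Sketch: effort range for adapting existing images and flavors,
# based on the 0.2-1 person-hours per item estimated in this post.
estimate_hours() {
  # $1 = number of images, $2 = number of flavors
  awk -v n="$(( $1 + $2 ))" \
    'BEGIN { printf "%.1f-%.1f person-hours\n", n * 0.2, n * 1.0 }'
}

# Hypothetical cluster with 10 images and 15 flavors
estimate_hours 10 15
```

With these example numbers the range is 5 to 25 person-hours, which shows how quickly manual adaptation adds up compared with a fresh cluster.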
Additional costs could come up if the cluster has no SSDs. This would require a hardware upgrade, incurring
costs for hardware (120-200€ per terabyte at the time of writing this blog post), server downtime and person-hours. The actual costs here are hard to estimate
and will probably vary from case to case.
Nonetheless, it is worth mentioning that SCS compliance should be easily achievable for most OpenStack clusters
without too much overhead in adoption costs. This could obviously change in the future with the arrival of
additional standards. Also, OpenStack setups that diverge significantly from the vanilla upstream configuration may require additional effort to become compliant.
Conclusion and outlook
In summary, it was pretty straightforward to adopt the current SCS standards of the IaaS layer and now we have a prime
example that you can achieve this even if you do not use the SCS reference implementation.
Overall, using Yaook for our evaluation allowed us to quickly set up an OpenStack cluster and focus on the adoption of
SCS standards in this cluster, although the nested virtualized test setup was a bit tricky to handle.
As a result of our efforts, we were able to collect valuable feedback which we used to improve general configuration,
our conformance test scripts and some of the involved tooling.
In the end, this setup was moved to another physical location and will provide the first SCS-compliant cluster of
Cloud&Heat Technologies GmbH built with Yaook.
Of course, work doesn't end here: in the future, we will need to keep up with the evolving standards.
And of course, we won't stop at the IaaS layer: stay tuned for another blog post for the KaaS layer!