Security and Compliance in the IBM Cloud – Posture Management

Moving workloads to a cloud provider fundamentally changes the way security is handled for most organizations. The transition from being responsible for security across the entire stack in an on-premises data center to the shared responsibility model of a cloud environment is an area where security and operations teams need to pay close attention. A cloud service should have the lines of responsibility documented and client responsibilities clearly articulated so that there is no misconception. Many security breaches occur because an organization misconfigures a cloud service while assuming the cloud provider is responsible for all of the security.

As an organization, how do you know if your cloud services are properly configured and where your risks are? With how accessible cloud services can be, not all cloud assets may be properly secured or tracked. This is where Cloud Security Posture Management (CSPM) tools come in. These tools provide security teams visibility by monitoring cloud environments to ensure that the deployed services or infrastructure do not have misconfigurations. This allows security teams to quickly act and remediate security issues in cloud service configurations instead of the misconfiguration going undetected until an attacker finds it. 

All the major cloud providers offer some form of security and compliance detection for their own cloud, and security vendors have CSPM products that work across cloud environments. In IBM Cloud, the Security and Compliance Center provides visibility into IBM Cloud services as well as some visibility into other cloud providers’ services. The service focuses on Posture Management, Configuration Governance, and Security Insights from other tools in the cloud. Let’s take a look at the posture management features.

The basic posture management functionality of the Security and Compliance Center comes from the IBM acquisition of Spanugo in 2020. A key part of the service is defining a profile and a scope and attaching them to a scheduled scan. A profile is a collection of security controls called goals, and a scope is a collection of resources such as a resource group. There are numerous predefined security profiles such as ‘IBM Cloud Best Practices’ and CIS Benchmarks. Custom profiles can also be created.

Scans can be scheduled to run a profile’s controls against a defined scope as often as needed. These tools are meant to enable continuous security monitoring, so I would recommend at least a daily scan of the environment to ensure that any misconfigured services are quickly detected. The scan results populate the dashboard with a posture score and show which resources violate the specified controls.

From the scan results, I can see that the VPC Security Groups and ACLs are configured to allow connections to ports 22 (SSH) and 3389 (RDP) from any source. Additionally, one of the Virtual Server Instances has a floating (public) IP address. This is against best practice, and the combination of these misconfigurations could allow remote attackers to access my virtual server.
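To make that kind of finding concrete, here is a minimal Python sketch of the check itself. It is not how the Security and Compliance Center implements its goals; the `InboundRule` shape and the `find_exposed_ports` helper are hypothetical names I made up for illustration, and a real check would read the rules from the cloud provider's API.

```python
from dataclasses import dataclass

# Remote administration ports called out in the scan results.
SENSITIVE_PORTS = {22: "SSH", 3389: "RDP"}
# CIDRs that mean "any source" for IPv4 and IPv6.
ANY_SOURCE = {"0.0.0.0/0", "::/0"}


@dataclass(frozen=True)
class InboundRule:
    """Hypothetical, simplified view of an inbound security group or ACL rule."""
    source_cidr: str
    port_min: int
    port_max: int


def find_exposed_ports(rules):
    """Return the sorted sensitive ports that are reachable from any source."""
    exposed = set()
    for rule in rules:
        if rule.source_cidr not in ANY_SOURCE:
            continue  # rule is restricted to a specific source range
        for port in SENSITIVE_PORTS:
            if rule.port_min <= port <= rule.port_max:
                exposed.add(port)
    return sorted(exposed)


rules = [
    InboundRule("0.0.0.0/0", 22, 22),        # SSH open to the world: flagged
    InboundRule("10.0.0.0/8", 3389, 3389),   # RDP from a private range only: not flagged
    InboundRule("0.0.0.0/0", 80, 443),       # web traffic: not a sensitive port here
]
for port in find_exposed_ports(rules):
    print(f"Port {port} ({SENSITIVE_PORTS[port]}) is open to any source")
```

A posture management tool runs many checks like this one on a schedule, which is what makes the daily scan worthwhile.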

Leveraging the Security and Compliance Center will help security teams ensure that they have a strong security posture and that their deployments in IBM Cloud are configured to best practices, helping them avoid costly security breaches.

Deploying highly available web servers in a VPC on IBM Cloud

I frequently get questions on how to go about setting up a VPC in IBM Cloud, so I decided to make a video about it.

In the video below, I step through how to create a VPC, how to deploy three Virtual Server Instances across multiple Availability Zones to run Apache, and lastly how to expose those web servers to the internet using a VPC load balancer.

I’ll be following this up with videos on how to expand this VPC with other services and connect to a Classic VMware environment.

IBM Cloud – Classic vs VPC Networks

In an earlier post, I wrote about how I needed to create a Squid proxy server to get access to the internet from a server in my IBM Cloud Classic private network. What I want to do in this post is dive a bit deeper into the Classic network architecture and how it compares to Virtual Private Cloud (VPC).

To recap, IBM Cloud Classic has separate private and public networks within a customer account. Servers that get deployed are placed into VLANs on the private network. They can optionally also be attached to a VLAN on the public network. Having a public VLAN on the server means that the server also gets an internet routable IP address. From a security standpoint, security groups or a network perimeter firewall should then be used as a layer of protection against connections from the internet.

I typically recommend that clients not use the public VLAN on the servers. While it could be secured as mentioned above, I (and most security teams that I deal with) do not feel comfortable with an internet routable IP on the servers directly. If an admin makes a mistake such as removing the security group or removing a public VLAN from a firewall, that server becomes exposed.

My recommendation is to have the servers connected to the private VLANs only and have all traffic to or from the internet be NATted by the firewall which is connected to the public VLAN. More complex environments may have multiple firewalls.  

Having separate networks allows for servers to be physically disconnected from the internet. This has its benefits from a security point of view but does take some time to set up if internet connectivity is needed since the cloud administrator would need to configure the firewall device for NAT. The firewall can also become a bottleneck depending on throughput requirements because it is deployed on dedicated hardware. More firewalls may be needed as the environment scales out.   

There are a few use cases for why you would put the public VLAN directly on the server:  

  • The server is in a network DMZ  
  • An application does not work well with NAT  
  • Extra bandwidth for bandwidth pooling  

The first two bullets are straightforward, while the last one is more of a billing consideration. IBM Cloud Classic networking includes a specific free internet egress allotment when servers are deployed with public interfaces. This can range from 250 GB to 20 TB per server. These allotments can be pooled and shared by all the servers in the region. Many customers never see internet egress charges since their usage falls within the free allotment.
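To illustrate why pooling matters, here is a small Python sketch of the arithmetic. The allotment and usage figures are made up for the example, the function names are my own, and actual billing rules may differ, so treat it as an illustration of the idea rather than a billing calculator.

```python
def unpooled_overage_gb(allotments_gb, usage_gb):
    """Overage when each server can only draw on its own allotment."""
    return sum(max(0, used - allotted)
               for allotted, used in zip(allotments_gb, usage_gb))


def pooled_overage_gb(allotments_gb, usage_gb):
    """Overage when all allotments are combined into one shared pool."""
    return max(0, sum(usage_gb) - sum(allotments_gb))


# Hypothetical example: three servers with 250 GB each, one busy and two quiet.
allotments = [250, 250, 250]
usage = [600, 50, 20]

print(unpooled_overage_gb(allotments, usage))  # the busy server exceeds its own allotment
print(pooled_overage_gb(allotments, usage))    # total usage fits inside the 750 GB pool
```

In this example the quiet servers contribute allotment they would never use themselves, which is the reason a customer might attach public interfaces purely for pooling.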

As mentioned, one of the main challenges with Classic networking is getting it set up in the first place. For most customers with steady-state workloads, it is a one-time setup. For customers that frequently build and tear down environments, further configuration is often needed. For example, if a new VLAN is created in the cloud to isolate new servers for a specific project, configuration needs to be added to the firewall to protect that VLAN.

Another challenge in Classic networking is that it automatically assigns IP subnets to the customer account from the 10.0.0.0/8 address space. This does not work for most enterprise customers, who usually need to use their own address ranges. To enable custom addresses, an overlay network has to be created and configured on the firewall.

This is where VPC networking comes in. VPC allows customers to create their cloud environment on top of the IBM Cloud network. Where Classic networking is built using physical appliances, VPC uses logical components.   

For example, if I were to deploy a Virtual Server Instance (VSI) in a VPC and needed to have outbound internet access, I would not have to deploy a physical firewall device to perform NAT like in Classic. I could activate the public gateway in the VPC to perform NAT. It can be activated in seconds at the click of a button or using automation tools such as Terraform. This is important because it allows customers to be more agile and set up environments quicker than they could with Classic.  

There is also no restriction on private addresses that can be used; customers define the subnets that they want the servers to be provisioned on without workarounds such as using an overlay network like in Classic.  
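When you get to choose your own subnets, it is worth checking them against ranges already in use elsewhere. The sketch below uses Python's standard `ipaddress` module for that. It assumes the concern is overlap with existing ranges such as on-premises networks, which is my assumption rather than something the platform enforces, and the CIDRs are example values only.

```python
import ipaddress


def find_overlaps(planned_cidrs, existing_cidrs):
    """Return (planned, existing) pairs of CIDRs whose address ranges overlap."""
    overlaps = []
    for planned in planned_cidrs:
        planned_net = ipaddress.ip_network(planned)
        for existing in existing_cidrs:
            if planned_net.overlaps(ipaddress.ip_network(existing)):
                overlaps.append((planned, existing))
    return overlaps


# Example values only: subnets planned for a VPC versus ranges already in use.
planned = ["10.141.20.0/24", "192.168.10.0/24"]
in_use = ["10.0.0.0/8"]

for planned_cidr, existing_cidr in find_overlaps(planned, in_use):
    print(f"{planned_cidr} overlaps with {existing_cidr}")
```

Running a check like this before provisioning is cheaper than re-addressing servers once the cloud and on-premises networks are connected.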

Overall, VPC is a significant improvement for cloud networking over Classic, but how would you decide which to use? For a new deployment, I would recommend deploying in VPC. But that may not be possible. VPC in its current form does not have complete feature parity with Classic in the services it supports. As of this writing, VSIs work in both Classic and VPC. Bare Metal servers and VMware solutions sit in Classic only. The Kubernetes Service clusters can sit in both, but only by using VSI worker nodes in VPC. Eventually, I expect all these services to be available in VPC.

VPC is also still being rolled out to regions worldwide. Today it is targeted at Multi-Zone Regions (MZRs). These are the main cloud regions, each with multiple Availability Zones (AZs) in a geographic region, and they get new services first. Single Zone Regions (SZRs, with one AZ in a region) are Classic only today. So, depending on geographic or data residency requirements, deploying in an SZR with Classic may be a requirement.

For enterprises that start with a Classic environment and want to have new deployments in VPC, it is possible to connect Classic and VPC networks together using the Transit Gateway service. This would be a common pattern as new modernized workloads run in VPC on the Kubernetes Service, while still needing to access data and legacy applications running in Classic.  

In a future post, I will show how to create a VPC and set up connectivity between it and a VMware cluster running in a Classic environment.   

Setting up a Squid Proxy on CentOS 8 in IBM Cloud

Squid is an open-source proxy server that can support a wide variety of protocols such as HTTP and HTTPS. One of its uses is as a cache in front of large websites to accelerate the delivery of content. In my use case I used it as a forward proxy for outgoing HTTP requests to the public internet.

IBM Cloud Classic has a physically separated network for private traffic and public internet traffic. This physical separation allows clients to securely deploy workloads solely onto the private network with no ability for access to come from the internet. Typically in these scenarios, all network traffic would come through client Direct Links into the private network.

I recently needed to set up an HTTP proxy for a server that was on a private VLAN on the IBM Cloud Classic network. The server I had deployed had no access to the internet but needed to make an HTTP REST call to an endpoint on the internet to activate a software license. Since this was going to be a temporary requirement, I decided to set up a Squid proxy server on a CentOS Linux virtual server instance (VSI) in order to provide internet access.

This VSI would have interfaces on both the private and public networks, allowing it to receive traffic from my server on the local cloud network and make requests to the internet. A few moments later, when the VSI had finished deploying, I could see the public and private IPs of my Squid VSI.

I then set up Security Groups to block incoming traffic from the public interface to the VSI as a security precaution.

Setting up a basic Squid proxy on CentOS 8 for my use case is straightforward. Once connected to the VSI with SSH, run the following command as root or with sudo:

dnf install squid

Once Squid is installed, edit the /etc/squid/squid.conf configuration file. In this configuration file, there will be several default networks already set under the ACL for localnet. These can be commented out. Since I wanted the proxy only usable by my server, I added its IP address in specifically.

acl localnet src 10.141.20.100/32

Once done, save and quit the file. Then restart the Squid service with the command:

systemctl restart squid

Squid uses the default port of 3128. This port can be changed in the configuration file but it is not required to do so.

Using the private address of the Squid VSI and port 3128, I configured the proxy settings of the application on my server. It was able to make outgoing requests to the internet and activate the application license. Once I was done with the proxy I deleted the VSI, cutting off any public network access for my server. And because this is public cloud, I only paid a penny for that VSI for the hour I used it.
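If the application were a script rather than a packaged product, pointing it at the proxy could look something like the Python sketch below. My application had its own proxy settings, so this is only an illustration using the standard library. The address 10.141.20.50 is a made-up private IP for the Squid VSI, and the helper names are hypothetical.

```python
import urllib.request

SQUID_PORT = 3128  # Squid's default listening port


def proxy_settings(proxy_host, proxy_port=SQUID_PORT):
    """Build the scheme-to-proxy mapping for a forward proxy."""
    proxy_url = f"http://{proxy_host}:{proxy_port}"
    # HTTPS requests are tunnelled through the same proxy.
    return {"http": proxy_url, "https": proxy_url}


def build_proxy_opener(proxy_host, proxy_port=SQUID_PORT):
    """Return an opener that sends HTTP and HTTPS requests via the proxy."""
    handler = urllib.request.ProxyHandler(proxy_settings(proxy_host, proxy_port))
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url, proxy_host, timeout=10):
    """Fetch a URL through the proxy and return the HTTP status code."""
    opener = build_proxy_opener(proxy_host)
    with opener.open(url, timeout=timeout) as response:
        return response.status


# Hypothetical private address of the Squid VSI; replace with your own.
print(proxy_settings("10.141.20.50"))
```

Calling `fetch_via_proxy` from the private server is also a quick way to confirm that the ACL in squid.conf allows that server's address.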

In a follow-up post, I will show you a more in-depth walkthrough on deploying a three-tier application with VSIs.

Does vSphere have a place in your public cloud strategy?

Now I know what some of you may be thinking: if you are going to move workloads to a public cloud, why would you need vSphere anymore? Today I will give you a few reasons to consider VMware vSphere as part of your cloud strategy.

Server virtualization may not be the hot technology that everyone is talking about these days. VM hypervisors are a commodity today. Containers, serverless, and cloud-native services are the future for greenfield or modernized workloads. And yet the 800-pound gorilla in this space, VMware’s vSphere, is a robust, time-tested platform that forms the cornerstone on which most organizations run their x86 workloads today. It is not uncommon when I talk to clients to discover that they are running hundreds or even thousands of VMware virtual machines in their data centers.

With more and more enterprises looking to either shift or deploy new workloads on a public cloud, VMware’s place in the data center may not be as key in the future as it once was. Even if your workloads are not modernized and still run as virtual machines (and today that is still most enterprises), these can be run as virtual machines in a public cloud without VMware. All the major public clouds have a cloud-native VM capability that allows customers to run their Windows and Linux virtual machines without managing or licensing a hypervisor. So if you can deploy your virtual machine workloads on an IBM Cloud VSI or AWS EC2 instance, where is the place for VMware in your cloud strategy? For me, there are three key reasons.

Firstly, migrating workloads onto a public cloud, or getting them back out, is not always straightforward. Workloads should be portable. For enterprise clients with complex legacy workloads, having to convert those workloads to whichever hypervisor format the cloud provider uses can complicate the migration and increase the risk of something not working or performing correctly. Using VMware vSphere on a public cloud provider reduces that risk since the virtual machines stay in the same format as they were. VMware also has tools like HCX that can stretch the on-premises network to the cloud, simplifying the migration for workloads that have complex dependencies. Keeping the virtual machines in VMware format also has a secondary benefit of making it easy to get them back out of the cloud provider without conversion or export.

Secondly, running VMware in the cloud reduces the amount of operational process and tooling change required for Day 2 operations. Running cloud-native services requires a change in organizational processes and a certain level of cloud maturity. While the level of control that clients get over VMware-based solutions in a public cloud will differ depending on the cloud provider, I can say that IBM Cloud for VMware Solutions allows clients to have root access and control of vCenter. This level of access enables clients to bring and continue to use the tools and processes that they are already using.

Lastly, running VMware on a public cloud allows organizations to take advantage of some of the benefits of the public cloud, such as an OpEx model and on-demand resource scalability with a minimal commitment, for workloads that may not scale out as well as they scale up. When running VMware on a public cloud, you are generally paying per host in the cluster. This model allows you to scale down the number of hosts in the vSphere cluster and run denser during off-peak times of the year, and then scale up quickly during peak months by adding hosts. In an on-premises environment I would have to size my cluster for peak usage. Being able to scale out a cluster and scale up the legacy workload within it during those peak times brings some of those cloud benefits to otherwise traditional workloads that don’t work well with an application scale-out model. This type of cluster scale-out approach also works well for disaster recovery use cases.
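A quick back-of-the-envelope comparison shows the effect of paying per host. The Python sketch below counts host-months for a cluster that follows demand versus one sized for peak all year. The host counts are invented for the example, and it deliberately ignores minimum commitments and any price difference between on-premises and cloud hosts.

```python
def elastic_host_months(hosts_by_month):
    """Host-months consumed when the cluster is resized every month."""
    return sum(hosts_by_month)


def peak_sized_host_months(hosts_by_month):
    """Host-months consumed when the cluster is sized for peak all year."""
    return max(hosts_by_month) * len(hosts_by_month)


# Hypothetical year: 4 hosts for nine off-peak months, 6 hosts for three peak months.
hosts_by_month = [4] * 9 + [6] * 3

elastic = elastic_host_months(hosts_by_month)
peak_sized = peak_sized_host_months(hosts_by_month)
print(f"Elastic cluster: {elastic} host-months")
print(f"Peak-sized cluster: {peak_sized} host-months")
print(f"Difference: {peak_sized - elastic} host-months")
```

The more seasonal the workload, the larger that gap becomes, which is where the per-host model pays off.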

Now I have provided several reasons why you should consider VMware vSphere as part of your cloud strategy, but this does not mean it should be the end goal. Just because you may move your workloads into a public cloud with VMware does not mean that is where the journey ends. Running VMware on a public cloud should be a stepping stone to getting onto that public cloud while reducing risk and simplifying operations. Once in the public cloud, start looking at parts of these workloads and begin modernizing where it makes sense with containers or cloud-native services. This becomes much simpler to do when the workloads and dependencies are all in the cloud together.

Now that I have given you some reasons to consider vSphere as part of your public cloud strategy, let’s see how this could actually work. In a follow-up post, I will show how some of the concepts I talked about can be put into practice. Stay tuned.