Published: Jun 01, 2021
Microsoft’s Build conference was on last week and it gave us lots of AI, Machine Learning, and every other type of “machines do it better” cloud tech announcements. But there were also a few infrastructure announcements during the month. I’ve become the “Azure networking” person at my job, so it’s no surprise (to me) that I’m going to dive into some networking topics this month.
Microsoft announced several improvements to Azure VPN Gateway. There are some improvements to point-to-site VPN, but I’m more excited about the new site-to-site features.
When I set up an Azure “landing zone” for a new customer, 90% of the attempts at first usage go the same way. The customer tries to log into an Azure virtual machine over the site-to-site connection. For some reason, the connection fails. And the customer always says “there’s something wrong with Azure”. There isn’t – my deployments are repeats of previous template deployments with valid configurations. But we end up going through all the same validations to prove our point – the on-prem firewall is blocking external connections to TCP 22/3389. It sure would help if we could do a packet capture at the “edge” of Azure to prove that the packets never reach Azure. Well – now you can … at least with site-to-site VPN. A new feature on the VPN VNet Gateway allows you to create a packet capture from the flows going across the site-to-site VPN. I just wish that this feature was also added to ExpressRoute.
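Once you have a capture, the question to answer is simple: did any RDP/SSH traffic for the VM ever show up at the gateway? Here is a minimal Python sketch of that check. It assumes you have already parsed the capture file with a tool of your choice into simple records; the record layout (`src`, `dst`, `proto`, `dst_port`) and the function name are my own invention for illustration, not anything Azure produces.

```python
# Hypothetical flow records, assumed to be extracted from the gateway
# packet capture by whatever parsing tool you prefer.
MGMT_PORTS = {22, 3389}  # SSH and RDP


def management_flows(flows, vm_ip):
    """Return the flows that target the VM on an SSH/RDP TCP port."""
    return [
        flow
        for flow in flows
        if flow["dst"] == vm_ip
        and flow["proto"] == "tcp"
        and flow["dst_port"] in MGMT_PORTS
    ]


if __name__ == "__main__":
    sample = [
        {"src": "192.168.1.10", "dst": "10.0.1.4", "proto": "tcp", "dst_port": 443},
        {"src": "192.168.1.10", "dst": "10.0.1.4", "proto": "icmp", "dst_port": None},
    ]
    hits = management_flows(sample, "10.0.1.4")
    if not hits:
        print("No SSH/RDP packets reached the gateway for this VM")
```

If the list comes back empty while the customer is actively trying to connect, the packets are being dropped before they get to Azure, and the conversation moves to the on-prem firewall.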
Another cause for issues is not receiving all the valid BGP routes from the customer’s on-premises network. In ExpressRoute deployments, you have been able to see the propagated routes from the peering connection. But there was no such feature for site-to-site VPN – until now. Now you can tell the on-prem network admins if they have failed to propagate routes to Azure, and that might explain why Azure networked resources cannot reach/respond to on-premises servers/clients.
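The comparison itself is easy to automate once you have the list of routes the gateway has learned. The following is a rough Python sketch, assuming you have copied the learned prefixes out of the gateway as plain CIDR strings and that you know which on-prem prefixes the customer was supposed to advertise. The function name and the sample prefixes are hypothetical.

```python
import ipaddress


def missing_prefixes(expected, learned):
    """Return the expected on-prem prefixes not covered by any learned route."""
    learned_nets = [ipaddress.ip_network(p) for p in learned]
    missing = []
    for prefix in expected:
        net = ipaddress.ip_network(prefix)
        # A prefix counts as covered if it equals, or sits inside,
        # a learned route of the same IP version.
        covered = any(
            net.version == route.version and net.subnet_of(route)
            for route in learned_nets
        )
        if not covered:
            missing.append(prefix)
    return missing


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    expected = ["10.10.0.0/16", "10.20.5.0/24", "192.168.50.0/24"]
    learned = ["10.10.0.0/16", "10.20.0.0/16"]
    print(missing_prefixes(expected, learned))  # ['192.168.50.0/24']
```

Anything in the returned list is a prefix to take back to the on-prem network admins.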
A much-wanted feature, support to use Azure Bastion over a virtual network peering connection, just became generally available. Before/without Azure Bastion, we needed to do some gymnastics to securely sign in to Azure virtual machines – RDP/SSH should not be open to the Internet!
Bastion is a simple-to-setup alternative to legacy approaches to enable secure sign-in. I’ve been using the preview quite a bit – it made it easier for consulting colleagues, customers, and external vendors to sign into their machines.
The new peering design option allows you to deploy a single Azure Bastion in a central virtual network – not necessarily the hub. This money-saving feature means that you don’t need to deploy Bastion into spoke virtual networks … in theory. An operator can log into the Azure Portal, select their subscription with the workload (and the one with Azure Bastion or this solution won’t work), and hit Connect > Bastion to sign in to the VM through the Azure Portal.
I am working with a typical customer right now. They have lots of legacy systems that drive their operations, and those systems are being migrated into Microsoft Azure. Those systems are installed and maintained by external vendors. The best way to give these vendors access is to give them read access to the workload resources and admin rights to the guest OS of the VMs. No open ports are required and they are forced to sign in securely to the Azure Portal with limited access and visibility.
If you implement Azure Bastion with virtual network peering then you lose one thing – the limited visibility. The configuration requires that the operator has read rights to the virtual network that Bastion is deployed into. That network will have connections to lots of spoke virtual networks and that means that the operator can see the peering connections which provide a map of a large portion of the network – security officers will not like that! We’ve opted to deploy Bastion into each spoke where there is an external vendor requirement – and that’s most of the spokes and it is driving up costs.
The other issue with this approach is that Bastion is deployed as a network resource, not as an abstracted platform resource. If you use Azure Virtual WAN, then you don’t have an accessible VNet in the hub – and that means you need to deploy another centrally peered virtual network for Bastion, increasing complexity where Virtual WAN was removing complexity!
There are other issues with Bastion that are widely discussed and that Microsoft has heard about, but I really hope a “v2.0” release pushes Bastion down the stack so that it becomes something that just works without networking, and we don’t have some of the complexity or security issues that are there today.
My first job out of college was in enterprise monitoring. I’ve worked with all sorts of monitoring tech – stuff an old employer developed for customers, CA Unicenter, BMC Patrol – and I moved onto the Microsoft Operations Manager (later rebranded to System Center Operations Manager) during the beta for the 2005 release. What I liked about MOM/SCOM was how the management was done. Management Packs provided a bundle of monitoring for the infrastructure and the workload. You could get free packs from Microsoft, third party packs, and even create your own packs.
Along comes Microsoft Azure and monitoring is … different. System Center isn’t really an option – the last I looked, the licensing costs for System Center with The Cloud were not favourable. So that leaves you looking at other options. There are a few third party names out there, but their costs are crazy. And then you settle on what Azure offers. And wow, is that a crazy mix. VMs do this, apps do that, there’s an agent for this, an agent for that, and Azure Monitor/Azure Monitor Logs/Log Analytics/Application Insights. It’s a hodgepodge mess of stuff. Microsoft appears to understand that – we’re seeing a move to simplicity via consolidation, with functionality moving to Log Analytics (data storage and some alerting) and Azure Monitor (visualisation and some alerting). But there is a big gap.
Azure monitoring is great at monitoring resources. I have full visibility over resources, their health and performance. Azure can also be great at monitoring bespoke applications built using the platform – as long as you do the work. But how many of your workloads are ready for PaaS? In reality, most of your workloads will never go to PaaS. So how do you monitor Windows/Linux services? How do you capture events from text-based log files? How do you monitor Citrix worker pools? I don’t have an answer for you. I sure hope that the teams behind Azure Monitor/Log Analytics (and there is a historic tie back to SCOM) learn from SCOM and how good it was at monitoring the guest OS and workload.