Games Vendors Play with Exchange Hardware Configurations
Planning a Hardware Refresh for Exchange?
Although the success of Office 365 means the market for on-premises servers is declining, many organizations want to stay master of their own destiny and run applications like Exchange or SharePoint in-house. One of the advantages of Microsoft’s approach is that they still produce new versions of these applications and enable hybrid connectivity between the cloud and on-premises environments. We do not know how long this will continue, but it seems reasonable to expect at least one more round of Exchange and SharePoint versions in the future.
If you decide to stay on-premises, you might also decide to refresh your hardware platform, perhaps alongside a move to Windows Server 2016. So you go into the market and begin a search for suitable configurations. If you are looking for Exchange servers, you might find your way to Microsoft’s ESRP page and conclude that anything listed there as an “Exchange reviewed solution” is a good bet. But it’s not, and here’s why.
The Race to Impress
Hardware vendors like having the fastest solution, or one that supports more users, mailboxes, or I/Os than their competitors. It is natural to want to be the best, fastest, or biggest, and it is part of the game vendors play to attract and win customers. Microsoft helps vendors understand applications like Exchange so that customers can buy configurations with confidence that the software will run well. But a program like ESRP is no guarantee that a tested configuration will be viable in production.
Think about a Formula 1 car. It can complete a 200-mile race at high speed but you would never use such a car for your daily commute. It is too high-tech, costly, dangerous, and consumes too many resources.
The same is true for some of the configurations tested by vendors. They meet the bar set by Microsoft because they satisfy the requirements set down for testing, but they often do not satisfy criteria such as long-term robustness and reliability. The most egregious examples are in the solutions listed for 50,000 or more mailboxes. One favorite example is the IBM entry proposing a 2-DAG design (24 servers) with 2 copies of each database for 120,000 users.
The bottom line is that no one, especially no one from the development team, reviews the solutions listed in ESRP to ask the hard question whether these solutions will work for customers. If this happened, I suspect that the configurations would be very different.
One game that vendors play is to downsize the tested mailboxes. Today, people want large mailboxes (just look at the 100 GB quotas offered in some Office 365 plans), so testing with 1 GB mailboxes is not representative of what a system must cope with in production.
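Some rough arithmetic shows how much the test mailbox size skews a design. The quota and mailbox figures below are illustrative assumptions for the sake of the sketch, not numbers taken from any vendor submission:

```python
# Rough capacity sketch: how test mailbox size skews a storage design.
# All figures are illustrative assumptions, not vendor test data.

def total_storage_tb(mailboxes, quota_gb, copies):
    """Raw capacity needed for mailbox data across all database copies."""
    return mailboxes * quota_gb * copies / 1024

# A 50,000-mailbox design tested with 1 GB mailboxes and 4 database copies...
tested = total_storage_tb(50_000, 1, 4)
# ...versus the 100 GB quotas users increasingly expect today.
realistic = total_storage_tb(50_000, 100, 4)

print(f"Tested with 1 GB mailboxes:   {tested:,.0f} TB")
print(f"Sized for 100 GB mailboxes:   {realistic:,.0f} TB")
```

The gap between the two numbers is exactly the factor by which the quota was shrunk, which is why a configuration validated with 1 GB mailboxes says little about production viability.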
Reduced Database Replication
Another game is to reduce the number of database copies. Microsoft’s preferred architecture for Exchange 2016 features four database copies, one of which is a lagged copy. This is the formula proven inside Exchange Online where Microsoft does not use backups and depends on Native Data Protection to keep mailbox data safe.
If you run a small Exchange environment, you might not want to take on the cost of so many database copies (possibly because you do not have four servers in a DAG), but at the high-end, suspicions are always raised when a vendor proposes to use fewer than four copies. Reducing database copies reduces the load on servers and storage but it does not help to create a realistic test.
Vendors can and will argue that unique features in their platforms reduce the need for so many database copies. However, I always have doubts about lessening the ability of an application to use its inbuilt logic to protect its data, even on virtualized systems, especially when the number of host servers is less than the number of mailbox servers deployed in a DAG.
The golden rule is that the more copies of a database are available to Exchange, the smaller the risk of losing data through logical or physical corruption. I should show my colors here and say that I still prefer cold, hard, physical servers for Exchange. The same is true within Exchange Online, where Microsoft runs no virtual mailbox servers.
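The intuition behind that golden rule can be sketched with a toy availability model. The per-copy failure probability here is an arbitrary illustrative figure, and real failures are not fully independent, so treat this as a sketch of the trend rather than a sizing tool:

```python
# Toy model: chance that every copy of a database is lost simultaneously.
# Assumes independent copy failures, which real deployments only approximate;
# the 1% per-copy failure probability is an arbitrary illustrative figure.

def p_all_copies_lost(copies, p_copy_failure=0.01):
    return p_copy_failure ** copies

for copies in (2, 3, 4):
    print(f"{copies} copies -> {p_all_copies_lost(copies):.0e} chance of total loss")
```

Even with generous caveats, each extra copy cuts the total-loss risk by orders of magnitude, which is why a vendor dropping from four copies to two should raise eyebrows.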
Expensive, Fast Storage
Using high-end technology like flash storage is another way to generate great test results, especially in driving the Jetstress utility to new heights. Jetstress is not representative of real-life activity. The program exists to generate an artificial I/O workload to test storage to be certain that a configuration can support Exchange. It is possible to generate too many I/Os with Jetstress if that is what pleases you.
I wonder if a justification ever exists to use expensive high-end storage with Exchange. Fifteen years ago, Exchange 2003 was a real pig when it came to storage and consumed SAN resources with gusto. Since then the Exchange database team has driven the I/O profile steadily down to allow Exchange to run well with low-cost (slow) JBOD. Of course, Microsoft has their own good reasons for focusing on I/O because they have the small matter of > 100,000 mailbox servers to run inside Exchange Online. But that does not take away from the point that low-end storage combined with enough database copies is a highly efficient and cost-effective choice for Exchange storage designs.
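To illustrate why slow JBOD can suffice, here is a crude IOPS sizing sketch. The per-mailbox I/O figure and the per-spindle throughput below are assumptions chosen for illustration, not Microsoft guidance:

```python
import math

# Crude IOPS sizing sketch. The per-mailbox I/O profile and per-spindle
# figure below are illustrative assumptions only, not official guidance.

IOPS_PER_MAILBOX = 0.1    # assumed I/O profile for a modern Exchange mailbox
IOPS_PER_7200_DISK = 75   # assumed sustainable random IOPS for a 7.2K JBOD disk

def disks_needed(mailboxes, iops_per_mailbox=IOPS_PER_MAILBOX,
                 iops_per_disk=IOPS_PER_7200_DISK):
    """Spindles required to satisfy the aggregate mailbox I/O load."""
    return math.ceil(mailboxes * iops_per_mailbox / iops_per_disk)

print(disks_needed(50_000))  # a modest spindle count even for 50,000 mailboxes
```

Under these assumptions, even a large mailbox population needs surprisingly few slow disks to meet its I/O load, leaving capacity, not performance, as the real sizing constraint.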
Sources of Wisdom at Ignite
If you are looking for some solid advice about a hardware refresh for Exchange on-premises servers, go to the Ignite 2017 session “Design your Exchange infrastructure right (or consider moving to Office 365)” featuring Boris Lokhvitsky and Robert Gillies. Other interesting Ignite sessions are “Inside Exchange Online” by Matt Gossage (who was heavily involved in the crusade to reduce disk I/O) and “Insights on Exchange storage, high availability, and data protection” by Lin Chen. Finally, if you have any doubt about the preferred architecture, go to “The epic Exchange preferred architecture debate” where the shy and retiring Ross Smith IV will be joined by Lin Chen and Mike Cooper to discuss questions such as what type of storage solution to use.
You do not need to attend Ignite to learn from these sessions. Microsoft will post recordings afterwards in the Technical Community.
Practicality is Everything
The fact that a vendor manages to coax a configuration through Microsoft’s test criteria is mildly interesting. The solution becomes a lot more interesting, and a practical candidate for deployment into production, when it is based on reality. I regret that some vendors continue to focus on achieving test results that they can trumpet for marketing purposes instead of running tests that might be useful.
No one should be surprised that hardware vendors want to make their solutions look as good as they can. This has been the case since the first computer systems rolled out. But it is important to make sure that any solution you buy can work in a robust, reliable, and secure manner over its predicted lifespan. For Exchange 2016, that could be the next ten years…
Follow Tony on Twitter @12Knocksinna.
Want to know more about how to manage Office 365? Find what you need to know in “Office 365 for IT Pros”, the most comprehensive eBook covering all aspects of Office 365. Available in PDF and EPUB formats (suitable for iBooks) or for Amazon Kindle.