Microsoft Open Sources ESE, the Extensible Storage Engine
In a surprise development, Microsoft has released the source code for the Extensible Storage Engine (ESE) on GitHub. Fans of the non-SQL database engine, which has powered every version of Exchange since the initial 4.0 release twenty-five years ago, now have the chance to peruse the ESE code. Although Microsoft isn’t accepting suggestions to improve the code for now, they say that they’ll accept contributions in the future.
ESE (aka “Jet Blue”) is most closely linked with Exchange, but it’s featured in many other Microsoft products, including Active Directory. It’s also used by Windows 10 PCs today. A search of my PC revealed sixteen .EDB files, including Spartan.edb, apparently used by the Edge browser for backups.
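To repeat that kind of search on your own PC, a minimal Python sketch along the following lines should work. The function name `find_edb_files` and the choice of starting folder are illustrative assumptions, not anything taken from Microsoft tooling, and the count will differ from machine to machine.

```python
import os
import sys
from pathlib import Path


def find_edb_files(root):
    """Return a sorted list of paths under root whose names end in .edb (any case)."""
    matches = []
    # os.walk silently skips directories it cannot read (onerror defaults to None),
    # which is convenient when scanning protected system folders.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".edb"):
                matches.append(Path(dirpath) / name)
    return sorted(matches)


if __name__ == "__main__":
    # Hypothetical usage: pass a starting folder, or default to the current one.
    start = sys.argv[1] if len(sys.argv) > 1 else "."
    found = find_edb_files(start)
    for path in found:
        print(path)
    print(f"{len(found)} .edb file(s) found under {start}")
```

Scanning a whole system drive this way can take several minutes, and files in folders the current account cannot read will not be listed.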
The Evolution of ESE
Exchange 4.0 used ESE for its sole 16 GB mailbox database. The original implementation of SharePoint (code name “Tahoe” or SharePoint Portal Server 2001) used ESE too. SharePoint later moved to SQL. Around the Exchange 2003 timeframe, an engineering effort (project “Kodiak”) also considered moving Exchange to SQL, but that work concluded that ESE was a better option for the kind of transactions a mail server processes. Microsoft then invested in driving down the I/O profile of Exchange from the heavyweight demands of Exchange 2003 to the point where it could run on JBOD. Reducing the cost of storage was an important influence over the economics of Microsoft’s cloud.
Today, Exchange 2019 and Exchange Online servers run a very different and much-developed ESE capable of supporting stretched Database Availability Groups (DAGs) with database copies deployed over up to sixteen servers. Hundreds of millions of Office 365 and Outlook.com users depend on ESE to get work done. In addition to user, group, shared, room, and other mailbox types, Exchange Online storage is the backbone of the Microsoft 365 substrate, which stores “digital twins” of information gathered from multiple workloads to make services like Search work. All in all, the ESE developers have been busy!
In poking around the code, my favorite piece is ESEUTIL (or EDBUTIL as it was known until Exchange 2000). At one time, this utility program (Figure 1) was an essential part of an Exchange administrator’s toolkit. Defragmenting a database with ESEUTIL /d was the only way to shrink a database and recover disk space. For instance, on page 200 in my Exchange 4.0 book, I report that I was able to shrink a 500 MB database to 284 MB. Being able to shrink a database by almost half was a big thing in the days of small disks. More importantly, running ESEUTIL /p could rescue a database after some physical corruption occurred, usually because of a bad storage driver or hardware failure.
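As a quick check on the “almost half” figure, the arithmetic for the 500 MB to 284 MB example can be sketched in Python. The function name `shrink_percent` is purely illustrative.

```python
def shrink_percent(before_mb, after_mb):
    """Return the percentage of space recovered by a defragmentation pass."""
    if before_mb <= 0:
        raise ValueError("before_mb must be positive")
    return (before_mb - after_mb) / before_mb * 100


if __name__ == "__main__":
    # Figures quoted in the text: a 500 MB database shrunk to 284 MB.
    recovered_mb = 500 - 284
    print(f"Recovered {recovered_mb} MB, {shrink_percent(500, 284):.1f}% of the original size")
```

That works out to 216 MB recovered, or roughly 43% of the original file.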
In the eyes of some, ESEUTIL was the panacea for any Exchange problem. Of course, it wasn’t. No utility program can be that powerful, but that didn’t stop some people from recommending the periodic use of ESEUTIL to defragment databases or cure other ills. Over-enthusiastic use made ESEUTIL the colonic irrigator for Exchange.
Fortunately, drivers got better, hardware improved, and the ESE developers fixed bugs. From Exchange 2003 onwards, there was little reason to run ESEUTIL unless recommended by Microsoft Support. And once log shipping and the DAG became the norm, ESEUTIL became an esoteric antique, pulled out only as a last-gasp effort to save a database if all else failed.
The Joy of Old Code
None of this takes away from the thrill of being able to peruse the ESEUTIL code. It’s like viewing an old friend, like the progress bar (Figure 2), watched so often in the 1990s as ESEUTIL processed databases (always too slowly).
The only bad thing is that Microsoft has removed comments made by developers from the code. I bet that developers inserted some choice remarks as they struggled to patch up broken databases. Microsoft says: “We will be pushing enhanced and cleaned up comments as we are able to review them.” In other words, the comments will return only after Microsoft removes all the rude words and the interesting remarks which might illuminate some of the challenges encountered and met by developers over the years. Still, we have the code and that’s the important thing.
More to Come
Apart from the ESE code, Microsoft has published API documentation to help developers understand how ESE databases work. Additional insight into the background and function of the database engine is available from the ESE wiki page. The only thing missing is the set of build files needed to assemble the ESE code into something that will run on a server, which Microsoft has yet to publish.
Now that ESE is available on GitHub, who knows what code Microsoft might release next? Is SQL on the way? Whatever comes, it will be interesting and valuable.