In two decades in business, ShowingTime has never had a series of outages like the recent ones. I wanted to reach out to all MLS partners and express my personal disappointment in how we have served you recently. In the last six weeks we’ve had a series of issues that revealed the need for improvements in some of our disaster avoidance and disaster recovery protocols. In response, we have hired an external IT audit company to review our systems and processes. We are pouring all available internal and external resources into resolving the problems and looking for any other weaknesses in our infrastructure.
The external technical audit is still underway, but so far we have identified three different issues, all of which involved third-party software or vendors:
1) The nationwide internet outage of August 30th affected ShowingTime the same way it affected most third-party vendors. ShowingTime uses three internet service providers, and the CenturyLink outage impacted all three. We are investigating whether any vendor or method could have let us avoid that, but it appears to have affected nearly everyone to varying degrees.
2) On August 20th and September 8th, we were affected by an Oracle database bug that reports database corruption when there isn’t any. The bug triggers data corruption protocols in the Oracle code and requires us to restore from the last full backup, then re-apply minute-by-minute changes from the logs up to the present. Oracle engineers originally suggested that the cause was a hardware failure in computer memory, which we replaced, but after the second instance they identified a bug in the database software, for which a patch is being tested. ShowingTime maintains both primary and standby databases (as well as two parallel sets of backups), but because the bug reported corruption on the primary and that report replicated to the standby, it hit both and forced us to go to our 3rd and 4th lines of defense, which took longer to recover from. In addition to fixing the bug, we have added further live replication so that, should we ever need to go to our 3rd and 4th lines of defense again (which had never happened before), the downtime will be on the order of 15 minutes rather than hours. Oracle is the highest-end (and most expensive) of the major database technologies, and we believe that they will get this right, so that we can continue getting this right for you.
3) The third issue was with RabbitMQ, a third-party software messaging layer that we added to improve performance and capacity. It worked efficiently for months, increasing system speed and scalability, but it too ran into a bug, which caused it to begin crashing repeatedly and taking systems down. The vendor identified the bug and provided a patch that resolved the problem, and there have been no further issues with it in the last month.
The positive side is that ShowingTime has not shown a stability problem in our code or in our hardware design. We scale well in advance of need and try never to exceed 35% utilization at the highest peaks. Your data was also never at risk, thanks to our multiple geographically distributed backups. We were actually down longer than turned out to be necessary because we were trying to be 100% sure that the data was perfect. These issues DID show a weakness in our dependency on some third-party vendors for our IT infrastructure. That dependency is unavoidable, but we must increase our testing of third-party dependencies. The issues also revealed some contingencies for which we were not adequately prepared to move at the speed required. I also believe that our long period of extraordinary systems stability made us a little slower in resolving some serious issues, as it had been years since dramatic intervention was needed.
We are operating in full candor here, as we owe this to you, our partners. We appreciate your patience and understanding, and we assure you that issues like those you have seen in recent weeks will not continue. We truly value our friendship and relationships with all of you and want to restore the level of trust that you have had in us (many of you for well over a decade).
Please do let us know what more we can do to serve you better.