Courtesy of IESO, Ontario
The business case for business service management (BSM) at Ontario’s Independent Electricity System Operator, commonly referred to as IESO, started out as a proposal for solving a traditional IT management problem. Yet in the process of defining the problem and evaluating solutions, the IESO discovered a way to simultaneously enlist the endorsement of business users by incorporating the supervision, control and management of the power grid and its energy market systems into the IT project.
Efficient control is imperative when it comes to energy. Businesses and consumers in Ontario use more than 152,000,000 megawatt hours of electricity per year and the IESO is the not-for-profit corporate entity providing Reliability Coordinator and Balancing Authority services as system and market operator for the Province of Ontario. IESO dispatches generation in a competitive electricity market to balance the demand and generation. IESO deploys power systems elements to maintain a reliable power grid. IESO manages a competitive electricity market through demand forecasting and the operation of market systems for generators, traders, suppliers and consumers to buy and sell the energy required to meet that demand.
This orchestration of power trading and grid reliability is no small feat. The IESO’s responsibility includes harmonizing supply and demand across more than 20 different power generation companies, five transmission companies and 91 utilities – in all, serving an estimated 13 million people in the Province. Ensuring that there is enough energy to meet that demand is an ongoing and highly complex process, requiring the close coordination of people, process and technology.
To this end, the energy industry, including the IESO, has become increasingly reliant on technology, which has provided both benefits and drawbacks. Automation has delivered new efficiencies, but as energy and utility industry management platforms have evolved from proprietary host systems to complex distributed technologies, managing technology has also proven a greater challenge. All of this calls for an integrated technology network – one with a service-based approach.
Setting aside the specifics of the IESO’s IT environment and considering the case purely from an IT operations perspective, the IESO’s technology challenges are not unique to the energy industry. Across any vertical market, IT departments tend to struggle with aligning themselves with their businesses. Veterans on all sides will often observe that IT and the business rarely speak the same language – and more specifically, neither group truly understands the impact of IT in the context of business.
The symptom can be traced to the way IT operations has traditionally managed the IT infrastructure. For the last twenty years, IT has managed the infrastructure in the same manner in which it was originally defined – in silos. That is to say, IT has been managed as individual elements, for example, as servers and network components. In addition, IT-centric metrics have often been applied to measure success or failure – a typical benchmark is availability – whether a server is “up” or operational 99.999 percent of the time. Yet such metrics leave a central question unanswered – what is the business impact of that server’s availability?
Few end-users care whether or not a server, a switch or a router is up or down; rather users care whether the business service is working – services that incorporate IT and business processes, such as e-mail, the ERP system or financial management applications. In the IESO’s case, business users were most concerned with the availability of systems deploying a reliable power system or facilitating energy trading – not with underlying components that make up these systems. End users at the IESO desired an improved method of IT management – service management, or more specifically, business service management.
Sea of Red
For many in IT operations, Business Service Management is the holy grail of IT management. It may also seem a vague or idealistic goal. This may be because IT operations are drowning in a sea of red – a direct result of component-based IT management – and a reference to the red, yellow or green color code of traditional IT management tools. Generally, when alerts and events are raised by management systems – be they network, systems or application performance tools – the alerts are presented on a component basis as opposed to in the context of business impact. [Figure 1]
A common shortcoming of component-based management is an alert that reports a server or network router outage without any information about how it affects applications or services. If there is redundancy in the network, as there often is, the loss of a single router may not in fact be severely disruptive to the service. By contrast, if the outage affects a server on which multiple critical applications depend, it would be very serious. Yet both alerts are commonly issued at approximately the same time. Traditional IT management tools give operations no way to distinguish between the two or to prioritize based on impact. More than likely, alerts are addressed in the order in which they are received.
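The router-versus-server contrast above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any vendor's product: the `Component` class, the `SERVICE_CRITICALITY` weights and the component names are all hypothetical, and a real service model would carry far richer dependency data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """A failed infrastructure element named in an alert (hypothetical model)."""
    name: str
    redundant_peers: int        # healthy peers able to take over the load
    services: tuple[str, ...]   # business services that depend on this element


# Assumed business weights for illustration; higher means more critical.
SERVICE_CRITICALITY = {"energy-trading": 10, "email": 3}


def impact_score(component, criticality=SERVICE_CRITICALITY):
    """Score an outage by the business services it actually interrupts."""
    if component.redundant_peers > 0:
        return 0  # a peer carries the load, so no service is interrupted
    return sum(criticality.get(service, 1) for service in component.services)


def prioritize(alerts):
    """Order alerts by impact; sorted() is stable, so ties keep arrival order."""
    return sorted(alerts, key=impact_score, reverse=True)


# The router alert arrives first, but the server outage is handled first.
router = Component("edge-router-1", redundant_peers=1, services=("energy-trading",))
server = Component("app-server-7", redundant_peers=0, services=("energy-trading", "email"))
queue = prioritize([router, server])
```

Here the redundant router scores zero while the non-redundant server scores the sum of the services it hosts, so the queue is reordered by business impact rather than by arrival time.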
Compounding this problem is the growth of technology. As technology has evolved and matured, so has the number of components, systems, and infrastructure devices that comprise any given business system. The increasing complexity of these underlying components, while providing the end-user with more value and capabilities, has created an even more challenging environment. The volume of alerts has simply grown beyond the human ability to manage. As such, unless IT operations can understand the impact of an outage on the service involved, the frequency and duration of outages, and their adverse business effects, are likely to increase.
Figure 1: Courtesy of Managed Objects
Silos of Data
If priority and impact alone weren’t complicated enough, the mix of heterogeneous technology components from various vendors within their infrastructures has trapped IT organizations into using a number of existing IT network, systems, and applications management tools to monitor the health of their IT environment. Because roles and responsibilities are commonly segmented by functional discipline – that is, network experts, application specialists, database champions and so on – each specialist is focused on managing their own part of the IT infrastructure. These operational silos, in combination with their distinctive IT management tools, create silos of data. [Figure 2]
Figure 2: Courtesy of Managed Objects
To be valuable, this information must be manually consolidated and correlated in order to achieve end-to-end management and understand the impact on performance. It’s a labor-intensive process that consumes countless hours of precious IT resources and slows IT problem resolution.
Translation to Impact
Business Service Management is a fundamental shift in the way technology is managed. Instead of managing technology as individual components, BSM dynamically links these components to the services delivered to the business. With strong integration capabilities, BSM translates event data – that is, data about the status of an individual component – into impact. In other words, BSM is a platform of information that illustrates the impact of IT with respect to the business.
In complex environments it’s not uncommon for enterprises to have thousands, even hundreds of thousands, of components in their infrastructure – each of which is capable of incurring downtime and generating an event alert. The ability to translate event data into impact is inherently valuable and requires the integration of data from existing IT management tools onto a single pane of glass. In other words, vendor-agnostic BSM solutions enable users to leverage their existing investment in application, network and system management tools – extending their value – and simultaneously understand the impact on the business.
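As a rough sketch of what translating event data into impact involves, the Python below normalizes events from two monitoring sources onto one severity scale and then rolls them up to the services that depend on the affected components. Everything named here is an assumption made for illustration: the source labels, the severity vocabularies in `SEVERITY_MAPS`, the component names and the `SERVICE_MODEL` mapping are hypothetical and do not reflect the formats of any real management tool.

```python
from dataclasses import dataclass

# Hypothetical severity vocabularies for two monitoring sources, mapped
# onto one common scale: 0 = normal ... 3 = critical.
SEVERITY_MAPS = {
    "network_tool": {"critical": 3, "major": 2, "minor": 1, "normal": 0},
    "systems_tool": {"FATAL": 3, "WARNING": 2, "HARMLESS": 0},
}

# Hypothetical service model: which components each business service needs.
SERVICE_MODEL = {
    "market-system": ["db-01", "web-02"],
    "email": ["mail-01"],
}


@dataclass(frozen=True)
class Event:
    source: str
    component: str
    severity: int


def normalize(source, component, raw_severity):
    """Translate a tool-specific severity label onto the common scale."""
    return Event(source, component, SEVERITY_MAPS[source][raw_severity])


def service_status(events, model=SERVICE_MODEL):
    """Roll component events up to services using a simple worst-of rule."""
    worst = {}
    for event in events:
        worst[event.component] = max(worst.get(event.component, 0), event.severity)
    return {
        service: max((worst.get(c, 0) for c in components), default=0)
        for service, components in model.items()
    }


events = [
    normalize("network_tool", "web-02", "major"),
    normalize("systems_tool", "db-01", "FATAL"),
]
status = service_status(events)
```

The worst-of rule is the simplest possible roll-up and deliberately ignores redundancy; it is used here only to show component events from separate tools landing in one service-level view.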
As with many large organizations, the IESO’s IT environment was as complex as it was critical. The IESO has a heterogeneous IT shop that counts a combination of Compaq and Sun machines and three different monitoring tools: HP OpenView, IBM Tivoli and Microsoft MOM. As such, it wasn’t unusual for IT operations to have four or five different screens in front of them, all showing slightly different views of the enterprise. In this way, the IESO’s IT operators often found themselves inundated with alarms that were devoid of context, with no business-driven way to prioritize them. The justification for a BSM initiative therefore seemed simple and straightforward, and the IESO’s IT department began to research and develop a BSM project it called CAMS (Central Alarm Management System).
IT Finds a Conductive Partner
IT can be a lonely department when lobbying for technology spending. By some market-watcher estimates, up to 60 percent of a large enterprise’s IT budget goes towards IT operations. In other words, more than half of a typical IT budget goes towards keeping things running. The other 40 percent is spent on human resources, and this often leaves little in the way of strategic IT investments. However, in formulating a business case to justify the implementation of a BSM project, the IESO’s IT operations knew their business users were facing similar challenges. Business users – that is, staff monitoring the flow and the buying and selling of electricity – were also reliant on technology management tools of a different sort. In addition to multiple market systems that facilitate Ontario’s electricity marketplace, much in the way a financial services firm might support online trading, the IESO’s system operator uses a SCADA/EMS (Supervisory Control and Data Acquisition/Energy Management System) to manage the electricity transmission grid.
In principle, SCADA/EMS works in a fashion similar to IT management systems. For example, a sharp rise or drop in voltage on a transmission grid can easily damage electrical equipment and appliances. Much in the way an IT management system trips an alarm if a server goes down, voltage, current or other operational fluctuations can cause SCADA/EMS alarms. IT operations knew that the control room operators were also considering ways to enhance their existing alarm functions through the consolidation of the various systems. The IESO’s IT department and the control room operators had been looking for an opportunity to consolidate both electricity and IT management systems into a single operations management system project.
Preventing cascading events
The IESO had another vested interest in integrating electricity grid and market systems management into the CAMS project: providing operators with even more information than they already had to assist them in the prevention of cascading events. As part of the North American Electric Reliability Corporation’s (NERC) investigation of the blackout of August 2003, which was initiated in Ohio, regulators found that failures of the system monitoring and control functions over the electricity grid were contributing factors to the blackout. Such failures caused operators to either delay or altogether miss corrective measures for which the company managing that portion of the grid was responsible. Consequently, the events cascaded and rapidly spread across the region. NERC later assessed that the operator’s monitoring system did not meet NERC requirements.
Although benefits are difficult to precisely quantify, there is little doubt that a power grid failure – especially a preventable failure – has a large and perhaps unnecessary financial effect. The Anderson Economic Group estimated the economic impact of the August 14, 2003 blackout was US$5,000,000,000. The Toronto Dominion Bank estimated the cost to the Ontario economy to be CAN$550,000,000.
Even momentary interruptions to the power supply could prove costly. A brief power interruption on April 16, 2005 in Pittsburgh, Pa. that forced a plant shutdown had an impact on the plant owner’s earnings in the range of $20 million to $25 million after tax.
Even though IESO monitoring tools had always met NERC requirements, the IESO wanted to enhance its existing grid and market monitoring tools. Likewise, users were looking to prioritize and manage alarms with role-based alarm views.
CAMS, along with the proper business logic, would augment the IESO’s system operators’ ability to operate a reliable and competitive electricity market. IT now had a business partner and a good case for BSM: creating a system that would also be used by business users meant the IESO had a project supported by more than just IT.
The Business Case
The business case for the IESO’s CAMS centered on providing a flexible central alarm system with the capability to consolidate, correlate and provide service-based management of alarms across a range of technological components on a single pane of glass. This included applications and components from traditional IT infrastructure, control room alarms from the SCADA/EMS, and the sensitive, customer-facing market systems that facilitate the energy trading environment.
The IESO considered alternative courses of action for solving this challenge: a) upgrade existing management systems, b) build a custom system from scratch, or c) select a BSM software vendor with the ability to integrate all the disparate systems onto a single console.
Upgrading the three existing monitoring systems (HP OpenView, IBM Tivoli and Microsoft MOM) would not provide the IESO with a centralized, consolidated view of the system. By the same token, building a custom system from scratch would not only be expensive to build and maintain, but it would also carry inherent project risk and take a great deal of time to complete.
In a research study by Fujitsu Consulting, Managed Objects had been determined to be one of the industry leaders. Further, the company had demonstrated that it had successfully implemented its product in several industries including energy, financial services, government and telecommunications. As a final test to ensure capability, the IESO instituted a three-month proof-of-concept before finally selecting Managed Objects as the vendor.
Reaping the Benefits
Today, the IESO has successfully tapped BSM to develop an integrated utility network. It has modeled five services that are critical to the business, including IT operations, SCADA/EMS and several market systems. This, in turn, has led to more efficient and effective management of technology and has better aligned the IESO’s IT operations with the business operations. In the end, this has greatly simplified the management of technology governing energy transmission and market trading systems.
Specifically, the IESO identified wholly positive results along four critical dimensions:
• Seamless integration was a pivotal achievement since the IESO already had substantial investments in existing and federated IT management tools. In addition, the integration of non-traditional IT management systems such as the SCADA/EMS required the use of special communication standards to enable these separate systems to interoperate. The vendor-neutral approach to integration enabled the IESO to meet these requirements and to avoid investments in additional tools that can often trap enterprises into dependence on a single vendor.
• Protected investments meant that the IESO was able to continue to leverage data arising from its existing investment in IT management tools including HP OpenView, IBM Tivoli and Microsoft MOM.
• Service modeling was an important accomplishment and the basis for prioritizing alarms. Mapping thousands of underlying infrastructure components to services or applications could prove an onerous task. Managed Objects BSM provided automated modeling capabilities to facilitate this process.
• Role-based analytical views were also a key result since user communities can now have visualization requirements tailored to their specific roles at the IESO. For example, the control room operator does not need to see event alarms stemming from HP OpenView but would want to be aware of information arising from its SCADA/EMS systems. Other role-focused views include single-sign-on views tailored specifically for those monitoring one of the five critical services such as the market systems.
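The role-based views described in the last item above amount to filtering one consolidated alarm stream by who is looking at it. A minimal Python sketch follows, assuming a hypothetical `ROLE_SOURCES` table and alarm records represented as plain dictionaries; the role names and source labels are invented for illustration and are not taken from the CAMS implementation.

```python
# Hypothetical mapping from operator role to the alarm sources it should see.
ROLE_SOURCES = {
    "control_room_operator": {"scada_ems"},
    "it_operator": {"hp_openview", "ibm_tivoli", "microsoft_mom"},
}


def view_for(role, alarms, role_sources=ROLE_SOURCES):
    """Return only the alarms whose source is relevant to the given role."""
    allowed = role_sources.get(role, set())  # unknown roles see nothing
    return [alarm for alarm in alarms if alarm["source"] in allowed]


alarms = [
    {"source": "hp_openview", "message": "server unreachable"},
    {"source": "scada_ems", "message": "voltage out of range"},
]
control_room_view = view_for("control_room_operator", alarms)
```

With this table the control room operator sees only the SCADA/EMS alarm, while an IT operator would see only the alarm raised by the IT monitoring source.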
The energy industry is likely to grow only more reliant on technology to manage its services, and as such the line between IT and the business will continue to blur. The ability to find and marry synergies between the business side of the IESO and IT operations has given rise to an efficient new paradigm in the form of an integrated utility network.
About the Authors
Michele Hudnall is a former META Group analyst and is currently director of service management for Managed Objects.
Wang Chiu is a Senior Engineer and CAMS Project Manager for the Independent Electricity System Operator, Ontario.