I have recently changed roles in IBM, but for the past five years I have been working as a consultant in Software Services for WebSphere, specializing in messaging security. Typically, discussions around security focus on some of the more dramatic threats: from the disgruntled employee with an axe to grind, to profit-driven organized crime, to state-sponsored cyber-attackers with political agendas. Although these incidents are scary, many of my clients did not find them compelling reasons to invest in security. If the enterprise had no high-value financial transactions, transmitted minimal personally identifiable information and had no compliance drivers, the dramatic security breaches reported in the media did not tend to resonate deeply. Many of my clients simply did not feel the threats applied to them. But that does not mean that there is no risk.
I thought it would be useful to describe some of the more routine security threats that are not motivated by malice or profit. These will apply to everyone and hopefully inspire some readers to reconsider their security – or lack thereof.
One of my customers with a large messaging network has a central team that manages all of the queue managers. They have a change control process that requires developers to submit requests for new queues or channels, and the administrators make sure that these adhere to standards and then build the requested objects. Turnaround time is one business day. Security on the network is of a variety I like to call "security for honest people." Specifically, the developers and users are assigned to specific client channels which are restricted from administrative actions, but nothing stops them from connecting to a different channel.
But the change control process incurs a delay in handling requests and this creates an incentive to bypass it, especially for projects under chronic time pressure. Incentives will always affect behavior so naturally it was just a matter of time before developers found they could simply point their WMQ Explorer at the administrator's client channel and have free run of the queue manager. As long as they stuck to the published standards the administrators turned a blind eye and everyone was happy. Or they were, right up to the moment that someone advertised a new queue instance in Production.
The intent was that a new program would make requests to an existing service. However, the developer used the name of the service queue as his reply-to queue. Because of the lack of communication, the administrators thought this new application was a new instance of the service provider rather than a client. Only after the new queue was defined in production and started receiving one third of all service requests was the mistake found. Because only about a third of requests failed, it took a while to track this down.
If you have gone to the trouble of creating standards and change management processes, proper security can ensure that they are followed.
When I mention security, most people immediately think of things related to intrusion prevention. Although that is a very important component of security, my working definition includes intrusion detection and recovery. For example, I like to monitor against a configuration baseline and then report any exceptions. This is especially important when the messaging network is managed as shared infrastructure where many applications depend on common components. The next example resulted in a major outage that lasted most of a week.
In this particular case my client implemented a new service which required persistent messages. Although testing went well, once promoted to production every once in a while a transaction would disappear. Eventually it was discovered that some of the requesting programs had failed to specify persistence on the request messages and were inheriting the value from the queue. Now and then one of these messages was lost on a channel with NPMSPEED(FAST). Rather than fixing the programs, it was decided to change the queue definition to specify default persistence. In addition to changing the intended service queue, the administrator also changed DEFPSIST in the system default model queue, reasoning that if the requests were persistent, the replies must be as well.
Of course, SYSTEM.DEFAULT.MODEL.QUEUE is a shared resource and the change affected many programs using that queue. Some programs explicitly specified non-persistence and were not impacted. Others specified the "persistence as queue default" option and inherited the change. The result was some pretty strange behavior. Since a temporary dynamic queue cannot accept persistent messages, some programs replying to local queues failed. In other cases replies were sent to remote dynamic queues. In these cases the responding program succeeded but the channel had no place to put the message, so it ended up in the dead-letter queue. Once the application timed out waiting for the response the dynamic queue disappeared, so it was not immediately obvious what the problem was. After nearly a week the root cause was identified and the changes, including the new service, were backed out.
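To make the failure mode concrete, here is a minimal Python sketch of the persistence rules described above. It is only a model of the behavior, not MQ code: the names `Persistence`, `PutRejected`, `resolve` and `put_reply` are hypothetical and exist purely for illustration.

```python
from enum import Enum


class Persistence(Enum):
    NOT_PERSISTENT = 0
    PERSISTENT = 1
    AS_QUEUE_DEFAULT = 2


class PutRejected(Exception):
    """Raised when the modeled queue refuses the message."""


def resolve(requested: Persistence, queue_defpsist: bool) -> Persistence:
    """Work out the effective persistence of a message.

    A program that asks for the queue default inherits whatever the
    queue definition says; an explicit choice is left alone.
    """
    if requested is Persistence.AS_QUEUE_DEFAULT:
        return Persistence.PERSISTENT if queue_defpsist else Persistence.NOT_PERSISTENT
    return requested


def put_reply(requested: Persistence, queue_defpsist: bool,
              temporary_dynamic: bool) -> Persistence:
    """Model a put to a reply queue created from the model queue."""
    effective = resolve(requested, queue_defpsist)
    # A temporary dynamic queue cannot accept persistent messages.
    if temporary_dynamic and effective is Persistence.PERSISTENT:
        raise PutRejected("temporary dynamic queue cannot hold persistent messages")
    return effective


# Before the change: the queue default is non-persistent, so the reply is accepted.
put_reply(Persistence.AS_QUEUE_DEFAULT, queue_defpsist=False, temporary_dynamic=True)

# After the change: the same program now fails without any code change of its own.
try:
    put_reply(Persistence.AS_QUEUE_DEFAULT, queue_defpsist=True, temporary_dynamic=True)
except PutRejected as exc:
    print(f"reply rejected: {exc}")
```

Programs that set persistence explicitly take the early return in `resolve` and never notice the change, which is why only some applications broke.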
The same monitoring that detects malicious intrusion will also detect and report accidental changes to the configuration baseline.
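As a rough illustration of monitoring against a configuration baseline, the following Python sketch compares a saved snapshot of object attributes with the current values and reports every exception. It assumes the attributes have already been exported into plain dictionaries by some other means; the function name `baseline_exceptions` and the sample values are hypothetical.

```python
from typing import Dict, List

# Object name -> {attribute name: value}. How these snapshots are
# collected is outside the scope of this sketch.
Config = Dict[str, Dict[str, str]]


def baseline_exceptions(baseline: Config, current: Config) -> List[str]:
    """Return one report line for every deviation from the baseline."""
    report = []
    for name in sorted(set(baseline) | set(current)):
        if name not in current:
            report.append(f"{name}: missing (defined in baseline)")
        elif name not in baseline:
            report.append(f"{name}: unexpected object (not in baseline)")
        else:
            expected, actual = baseline[name], current[name]
            for attr in sorted(set(expected) | set(actual)):
                if expected.get(attr) != actual.get(attr):
                    report.append(
                        f"{name}: {attr} is {actual.get(attr)!r}, "
                        f"baseline says {expected.get(attr)!r}"
                    )
    return report


# Illustrative data only: the model queue after the DEFPSIST change.
baseline = {"SYSTEM.DEFAULT.MODEL.QUEUE": {"DEFPSIST": "NO", "DEFTYPE": "TEMPDYN"}}
current = {"SYSTEM.DEFAULT.MODEL.QUEUE": {"DEFPSIST": "YES", "DEFTYPE": "TEMPDYN"}}

for line in baseline_exceptions(baseline, current):
    print(line)
```

Run on a schedule, a check like this would have flagged the model queue change on the day it was made rather than after a week of diagnosis.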
My last example is of a client who had been experiencing problems in their production network. In an effort to diagnose the issues, they took the extreme measure of stopping the production queue managers, backing them up and restoring the backups in the QA environment. They then proceeded to run a day's worth of production traffic through the system in an attempt to recreate and diagnose the problem.
Unfortunately, the channel definitions in the QA environment still pointed to production nodes upstream and downstream and the SSL certificates which were copied were accepted by the production nodes. In the course of recreating a day's traffic, all the transactions were routed to the actual production system and all the orders propagated to external business partners in duplicate. Some of the business partners executed the duplicate orders, causing a cascade that extended far outside the enterprise. The monetary and reputational cost was significant. The cost to prevent a recurrence was to reconfigure the existing security exits to look at IP address ranges as well as certificate names.
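The logic of checking IP address ranges as well as certificate names can be sketched in a few lines. This is not the client's exit code (real channel security exits are not written this way); it is a Python model of the decision only, and the distinguished names, address ranges, and the names `ALLOWED_PARTNERS` and `connection_allowed` are all made up for illustration.

```python
import ipaddress
from fnmatch import fnmatchcase

# Hypothetical rules: each certificate name pattern is paired with the
# networks that a holder of that certificate may connect from.
ALLOWED_PARTNERS = [
    ("CN=QMPROD*,O=Example Corp", ["10.1.0.0/16"]),
    ("CN=QMQA*,O=Example Corp", ["10.2.0.0/16"]),
]


def connection_allowed(cert_dn: str, remote_ip: str) -> bool:
    """Accept a connection only if the certificate name AND the source address match."""
    addr = ipaddress.ip_address(remote_ip)
    for dn_pattern, networks in ALLOWED_PARTNERS:
        if fnmatchcase(cert_dn, dn_pattern):
            # The first matching name pattern decides; the address must
            # fall inside one of the networks listed for that pattern.
            return any(addr in ipaddress.ip_network(net) for net in networks)
    # Unknown certificate names are refused outright.
    return False


# A production certificate presented from a production address is accepted.
print(connection_allowed("CN=QMPROD01,O=Example Corp", "10.1.5.17"))

# The same certificate, copied to a QA host, is refused.
print(connection_allowed("CN=QMPROD01,O=Example Corp", "10.2.5.17"))
```

Pairing the two checks means a copied certificate is no longer enough on its own: the restored queue manager in QA would have been turned away by the production nodes.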
The security which protects you from malicious intruders also protects you from accidental breaches.
After reviewing the impact of these breaches and cost to prevent them, not one of my customers has ever decided that the loss was acceptable. In every case the financial and reputational impact was at least an order of magnitude greater than the remediation implemented to prevent future occurrences.
It seems that every day there's a new breach reported in the news involving millions of lost passwords, credit card numbers, government ID numbers or other sensitive information. But routine and accidental breaches are almost never reported and in my experience they represent the vast majority of incidents. In all of the examples I've cited my client assumed that their security was "good enough" or that they were simply not an attractive target so "it won't happen here." For those who believe their security is good enough, I say trust is good but verify and enforce is better. For those who believe it won't happen here I would point out that you do not have to be an attractive target to be the victim of a well-intentioned mistake. The most feared word in security is "oops!"
Don't wait for a media-reported breach in your industry sector. Or an audit finding. Or an accidental breach. Secure your middleware network today.
