Logs, Trace and Dumps
Effective troubleshooting usually begins with a thorough examination of the log files and any applicable product tracing. Sometimes the standard error logging indicates what the problem is, but the root cause, and therefore the resolution, may not be obvious.
Take, for example, an out-of-memory error in WebSphere Application Server. This could be due to an application memory leak, an inefficient algorithm, a JVM with too little heap allocated, a bottleneck at a connectivity point such as a database connection, a product defect, inefficient garbage collection, or any combination of these.
To decide which action will resolve the issue, you must examine the available information more closely. For WebSphere Application Server, this may include any of the following, depending on the problem being evaluated.
- Standard Out and Standard Error
These logs contain a combination of WebSphere Application Server JEE container information and errors and application information and errors. Beyond showing the symptoms, these logs give further indications of the problem area.
- Monitoring Reports (PMI data)
Any application, custom or commercial, that collects PMI statistics from the run-time JVM can prove useful in diagnosing problems, especially performance-related ones. This includes the native PMI interface in WebSphere Application Server version 5.x and above.
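As a rough illustration of a first pass over the Standard Out and Standard Error logs, the sketch below counts error and warning entries by message ID. It assumes the basic log layout of "[timestamp] threadId shortName eventType message"; the sample lines, the regular expressions, and the function name summarize are made up for illustration and may need adjusting for your log format.

```python
import re
from collections import Counter

# Assumed layout of one log entry (an assumption, adjust as needed):
# [timestamp] threadId shortName eventType message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<thread>[0-9a-fA-F]{8})\s+"
    r"(?P<logger>\S+)\s+(?P<level>\S)\s+(?P<msg>.*)$"
)
# Message IDs such as WSVR0001I or J2CA0045E at the start of the message text.
MSG_ID_RE = re.compile(r"^([A-Z0-9]{4,5}\d{4}[A-Z]):")


def summarize(lines, levels=("E", "W")):
    """Count entries whose event type is in `levels`, keyed by (level, message ID)."""
    counts = Counter()
    for line in lines:
        entry = LINE_RE.match(line)
        if not entry or entry.group("level") not in levels:
            continue  # continuation lines (stack traces) and other levels are skipped
        msg_id = MSG_ID_RE.match(entry.group("msg"))
        key = msg_id.group(1) if msg_id else "(no message id)"
        counts[(entry.group("level"), key)] += 1
    return counts


# Illustrative entries only, not taken from a real server.
sample = [
    "[10/15/12 10:23:45:123 EDT] 0000000a WsServerImpl  A   WSVR0001I: Server server1 open for e-business",
    "[10/15/12 10:24:01:456 EDT] 0000001f SystemErr     R   java.lang.OutOfMemoryError",
    "[10/15/12 10:24:02:001 EDT] 00000020 ConnectionMan E   J2CA0045E: Connection not available",
    "[10/15/12 10:24:03:002 EDT] 00000021 ConnectionMan E   J2CA0045E: Connection not available",
]

for (level, msg_id), count in summarize(sample).most_common():
    print(level, msg_id, count)
```

Grouping by message ID is only one possible design choice; it makes a recurring error stand out quickly, which helps decide where to look next before turning to tracing and dumps.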
When you need to go beyond the information provided above, turn to tracing and dumps for analysis.
- Tracing at a component level
Each JEE container service area and associated JEE component area has a trace string associated with it. Most of the time when debugging an issue, you will need to gather this information to receive support from IBM. However, you can also conduct your own troubleshooting to determine whether tuning or application refactoring will resolve the issue. Refer to this article for more information: www-01.ibm.com/support/docview.wss?uid=s...
- Thread Dumps
A thread dump shows all live threads at the moment the thread dump is taken. Therefore, it is imperative that you take the thread dump while reproducing the problem or, if possible, while the problem is occurring. Check the InfoCenter associated with your product for instructions. An example is provided in the URL below.
- Core Dumps
Core dumps are thread dumps and more. They provide information on locks, loaded classes, some memory information, etc.
Core dumps are taken either through the wsadmin scripting interface or through process signaling (kill -3).
- Heap Dumps
Below are some useful links with additional information:
WAS 7 InfoCenter - Heap Dump
tinyurl.com/8967nq5
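The process signaling approach mentioned under Core Dumps (kill -3) can be sketched in a few lines of Python. This is a minimal, Unix-only sketch: the function name request_thread_dump is hypothetical, and what the JVM writes in response, and where, depends on the JVM and its configuration.

```python
import os
import signal


def request_thread_dump(pid: int) -> None:
    """Send SIGQUIT (signal 3, the same as `kill -3 <pid>`) to a process.

    The caller must be allowed to signal the target process. This function
    only delivers the signal; producing the dump is up to the JVM.
    """
    if pid <= 0:
        # Non-positive values address process groups in os.kill, so reject
        # them to avoid signaling more than the one intended process.
        raise ValueError("pid must be a positive process id")
    os.kill(pid, signal.SIGQUIT)
```

For example, request_thread_dump(12345) would signal the application server JVM if 12345 were its process id (a made-up value here). Calling it several times, a short interval apart, while the problem is occurring gives a series of snapshots to compare.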
Summary
In Part #1 of this series, we discussed the importance of having the fundamental knowledge required to troubleshoot effectively, and we mentioned the ISA tool. Both are essential, combined with the information in this post, for troubleshooting WebSphere Application Server. Look for more posts on troubleshooting other WebSphere products from this author in the near future.


