Troubleshooting a Metric Explosion

Document ID:  TEC1167837
Last Modified Date:  08/03/2017

Products

  • CA Application Performance Management

Releases

  • CA Application Performance Management:Release:10.0
  • CA Application Performance Management:Release:9.1 SP2
  • CA Application Performance Management:Release:9.1.4
  • CA Application Performance Management:Release:9.1.1
  • CA Application Performance Management:Release:9.1.5
  • CA Application Performance Management:Release:9.1.6
  • CA Application Performance Management:Release:9.1.7
  • CA Application Performance Management:Release:CA APM 9.5
  • CA Application Performance Management:Release:CA APM 9.6
  • CA Application Performance Management:Release:CA APM 9.7

Components

  • APM AGENTS:APMAGT
  • WILY CEM:APMCEM
  • INTROSCOPE:APMISP
Introduction:

 To troubleshoot a metric explosion, follow the instructions below.

 

 

Environment:
All supported APM Releases.
Instructions:

 

Tip 1 . Collect information on how the system is performing, including the number of metrics. The following supportability metrics give a good picture of system performance.

EM | Smartstor | Metadata | Metrics with Data 
EM | Internal | Number of Connection Tickets 
EM | Internal | Number of Virtual Metrics 
EM | Tasks | Harvest Duration 
EM | Tasks | Smartstor Duration 
EM | GC Heap | Bytes in Use 
EM | GC Heap | GC Duration 
EM | Smartstor | Metadata | converting spool to data
EM | Smartstor | Metadata | metric num of metrics handled
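
A metric explosion usually shows up as a sudden jump in a counter such as "EM | Smartstor | Metadata | Metrics with Data". The following is a minimal sketch, not a CA tool, of how exported readings could be scanned for such jumps. The function name, the sample readings, and the 1.5x growth threshold are all hypothetical; pick a threshold that suits your own environment.

```python
from typing import List, Tuple


def find_growth_spikes(
    samples: List[Tuple[str, int]], max_ratio: float = 1.5
) -> List[Tuple[str, int, int]]:
    """Return (timestamp, previous, current) for each reading that grew too fast.

    samples:   (timestamp, value) pairs in chronological order.
    max_ratio: hypothetical threshold, not a documented CA limit.
    """
    spikes = []
    for (_, prev), (ts, cur) in zip(samples, samples[1:]):
        # Skip zero baselines to avoid dividing by zero.
        if prev > 0 and cur / prev > max_ratio:
            spikes.append((ts, prev, cur))
    return spikes


if __name__ == "__main__":
    # Invented readings, for illustration only.
    readings = [
        ("10:00", 200_000),
        ("10:15", 205_000),
        ("10:30", 410_000),  # doubled within one interval
        ("10:45", 415_000),
    ]
    for ts, prev, cur in find_growth_spikes(readings):
        print(f"{ts}: metrics with data jumped from {prev} to {cur}")
```

Comparing each reading with the previous one, rather than against a fixed ceiling, keeps the check independent of the size of the installation.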

Tip 2 . Always collect Agent logs (Autoprobe as well) and profile files.

Tip 3 . If possible, disable the following traces on the Agent side, along with any other tracing you do not need.

# TurnOn: SocketTracing 
# TurnOn: UDPTracing 
# TurnOn: FileSystemTracing 
# TurnOn: ThreadTracing

Note that the documentation incorrectly states that SocketTracing is disabled when you use the Typical setting.

Socket tracing is enabled by default with both the Typical and Full tracing options.
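
The listing in this tip shows the directives in their disabled form, with a leading "#". As a rough sketch, assuming the directive file is plain text with one directive per line, the active "TurnOn:" lines for these four tracer groups could be commented out as follows. The function name is hypothetical, and "SomeOtherTracing" in the usage example is a made-up group used only to show that unrelated directives are left alone.

```python
from typing import Iterable

# Tracer groups named in this tip.
TRACE_GROUPS = ("SocketTracing", "UDPTracing", "FileSystemTracing", "ThreadTracing")


def disable_trace_groups(text: str, groups: Iterable[str] = TRACE_GROUPS) -> str:
    """Prefix active 'TurnOn: <group>' lines with '# '; leave all other lines as-is."""
    wanted = set(groups)
    out = []
    for line in text.splitlines(keepends=True):
        stripped = line.strip()
        # Already-commented lines start with '#', so they never match here.
        if stripped.startswith("TurnOn:"):
            group = stripped[len("TurnOn:"):].strip()
            if group in wanted:
                line = "# " + line
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    sample = (
        "TurnOn: SocketTracing\n"
        "TurnOn: SomeOtherTracing\n"   # hypothetical group, stays active
        "# TurnOn: UDPTracing\n"       # already disabled, stays unchanged
    )
    print(disable_trace_groups(sample), end="")
```

Back up the original file before writing any changes, and restart the agent as your environment requires.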

Tip 4 . The overall number of applications should not exceed 1500 and there should be no more than 5 per agent.

The recommended range is 1000 to 1200 applications.
If turning on the baseline feature drives the CPU to its maximum, you may need a more powerful server.
The EM has a parameter called baselinefrequency. Raising it from the default of 60 seconds might help reduce the CPU load. The example below uses 120000, which suggests the value is given in milliseconds.
Use this property only if you think turning on baselines affected performance.

Add it to the lax.nl.java.option.additional entry in the lax file.

Example:

lax.nl.java.option.additional= -Dintroscope.enterprisemanager.baselinefrequency=120000

If possible, disable baselines db for testing purposes.
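
The application limits given at the start of this tip (1500 overall, 5 per agent) can be checked against an inventory of agents. The sketch below assumes you can produce a mapping of agent name to application names yourself; the function name and the mapping format are hypothetical, and the overall total is counted as distinct application names, which is an interpretation rather than something the sizing guidance spells out.

```python
from typing import Dict, List

MAX_TOTAL_APPS = 1500    # overall ceiling from this tip
MAX_APPS_PER_AGENT = 5   # per-agent ceiling from this tip


def check_application_limits(apps_by_agent: Dict[str, List[str]]) -> List[str]:
    """Return human-readable warnings; an empty list means within limits."""
    warnings = []
    for agent, apps in sorted(apps_by_agent.items()):
        count = len(set(apps))
        if count > MAX_APPS_PER_AGENT:
            warnings.append(f"{agent}: {count} applications (limit {MAX_APPS_PER_AGENT})")
    # Assumption: the overall figure counts each application name once.
    total = len({app for apps in apps_by_agent.values() for app in apps})
    if total > MAX_TOTAL_APPS:
        warnings.append(f"total: {total} applications (limit {MAX_TOTAL_APPS})")
    return warnings


if __name__ == "__main__":
    # Invented inventory, for illustration only.
    inventory = {
        "agent-1": [f"app{i}" for i in range(6)],
        "agent-2": ["app0"],
    }
    for warning in check_application_limits(inventory):
        print(warning)
```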

Tip 5 . Set clamps for metrics and SQL metrics on the agent side to limit the number of metrics each agent sends and to prevent a metric leak.

Our sizing guide recommends 15,000 metrics per agent. 
introscope.agent.metricClamp=15000
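
To confirm the clamp across many agents, the profile text can be scanned for the property. This is a minimal sketch that assumes the profile uses plain key=value lines with "#" comments; the function names are hypothetical. It checks only introscope.agent.metricClamp, because this document does not name the property for the SQL metric clamp.

```python
from typing import Optional

CLAMP_KEY = "introscope.agent.metricClamp"
RECOMMENDED_CLAMP = 15000  # per-agent figure quoted from the sizing guide


def read_metric_clamp(profile_text: str) -> Optional[int]:
    """Return the configured clamp, or None if it is absent or commented out."""
    value = None
    for raw in profile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == CLAMP_KEY:
            value = int(val.strip())  # a later definition overrides an earlier one
    return value


def clamp_status(profile_text: str) -> str:
    """Summarize the clamp setting relative to the recommended value."""
    clamp = read_metric_clamp(profile_text)
    if clamp is None:
        return "not set"
    if clamp > RECOMMENDED_CLAMP:
        return f"above recommendation ({clamp} > {RECOMMENDED_CLAMP})"
    return f"ok ({clamp})"


if __name__ == "__main__":
    print(clamp_status("introscope.agent.metricClamp=15000\n"))
```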

Tip 6 . Verify that the EM's Smartstor db is on a dedicated disk and that the following property is set as shown.

introscope.enterprisemanager.smartstor.dedicatedcontroller=true

This is a CA Technologies requirement.

Tip 7 . Make sure the EM has 8 or more CPU cores. This is the recommended configuration.

The Introscope Sizing Guide notes the following:

  • Two quad-core 64-bit Intel Xeon 5570 processors at 2.8 GHz or higher
  • Enterprise Managers need a minimum of four CPU cores to perform key operations. CA Technologies recommends eight or more CPU cores.
  • Enterprise Managers perform a large number of calculations that involve floating-point math. Therefore, Enterprise Managers benefit from the x86 chip design for Xeon or Opteron chips more than the RISC chip design used for SPARC and Power5 chips. For example, a Power5-based server has the same CPU and disk requirements per Enterprise Manager as a Xeon- or Opteron-based server. However, the Power5-based server has 20 percent lower capacity.
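
As a quick sanity check on an EM host, the core counts quoted above (a minimum of four, eight or more recommended) can be compared with what the operating system reports. This is only a sketch with hypothetical names. Note that os.cpu_count() returns logical CPUs, so on hosts with hyper-threading it may overstate the number of physical cores.

```python
import os

MIN_CORES = 4          # minimum quoted from the sizing guide
RECOMMENDED_CORES = 8  # recommendation quoted from the sizing guide


def classify_core_count(cores: int) -> str:
    """Compare a core count with the minimum and recommended figures."""
    if cores < MIN_CORES:
        return "below minimum"
    if cores < RECOMMENDED_CORES:
        return "meets minimum, below recommendation"
    return "meets recommendation"


if __name__ == "__main__":
    # os.cpu_count() may return None when the count cannot be determined.
    detected = os.cpu_count()
    if detected is None:
        print("CPU count could not be determined")
    else:
        print(f"{detected} logical CPUs: {classify_core_count(detected)}")
```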
