Figure 1 – PerfCounterInfo
Figure 2 - PerfInterval
Performance Provider
1. A "performance provider" (PerfProviderSummary) is any managed object that generates utilization or other performance metrics.
2. Performance providers include managed entities, such as hosts, virtual machines, compute resources, resource pools, datastores, and networks. Performance providers also include physical or virtual devices associated with these objects, such as virtual host-bus adapters and network-interface controllers (NICs).
3. Each performance provider (the instrumented device or entity) has its own set of counters that provide metadata about its available metrics. Each counter has a unique key, referred to as the counterId.
Performance Counter
4. Counters are organized by groups of finite system resources, such as memory, CPU, disk, and so on.
5. The PerfCounterInfo data object, shown in Figure 1, represents a performance counter.
Name | Type | Description |
groupInfo | ElementDescription | The group of the performance counter, with its label and summary details. |
key | int | A system-generated number that uniquely identifies the counter in the context of the system. This is the performance counter ID. |
level | int | 1 to 4; the default is 1. The higher the setting, the more data vCenter collects (see the level descriptions below). Applies only to vCenter, not to ESX hosts. |
nameInfo | ElementDescription | The counter name. |
perDeviceLevel | int (since 4.1) | The level at which per-device metrics for this counter are collected. Always greater than or equal to level. |
rollupType | PerfSummaryType | One of average, latest, maximum, minimum, none, summation. |
statsType | PerfStatsType | One of absolute, delta, or rate. |
unitInfo | ElementDescription | The unit for values of the performance counter. |
6. The performance counter can be represented by the following dotted string notation:
[group].[counter].[rollupType]
For example: disk.usage.average
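To make the dotted notation concrete, here is a minimal Python sketch that splits a counter name into its three parts and joins them back. The CounterName type and both helper functions are hypothetical names made up for this illustration; they are not part of the vSphere SDK.

```python
from typing import NamedTuple


class CounterName(NamedTuple):
    """The three parts of the [group].[counter].[rollupType] notation."""
    group: str
    counter: str
    rollup_type: str


def parse_counter_name(dotted: str) -> CounterName:
    """Split a dotted counter string into group, counter, and rollup type."""
    parts = dotted.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected group.counter.rollupType, got {dotted!r}")
    return CounterName(*parts)


def format_counter_name(name: CounterName) -> str:
    """Join the three parts back into the dotted notation."""
    return ".".join(name)
```

For instance, parse_counter_name("disk.usage.average") yields the group "disk", the counter "usage", and the rollup type "average".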
7. The four levels and the counters they include are:
a. Level 1: includes basic metrics: average usage for CPU, memory, disk, and network; system uptime; system heartbeat; and DRS metrics. It does not include statistics for devices.
b. Level 2: includes all counters with rollup types of average, summation, and latest for CPU, memory, disk, and network, plus system uptime, system heartbeat, and DRS metrics. It does not include device statistics either.
c. Level 3: includes all metrics (including device metrics) for all counter groups, except those with maximum and minimum rollup types.
d. Level 4: includes all metrics supported by VirtualCenter, including those with maximum and minimum rollup types.
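One way to read the levels is as a threshold: a counter is collected when its own level is no higher than the configured statistics level. The Python sketch below illustrates that reading. The Counter class, the helper function, and the keys and levels of the sample counters are assumptions chosen for illustration; they are not values read from a real vCenter.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Counter:
    key: int      # hypothetical counter ID
    name: str     # dotted notation: group.counter.rollupType
    level: int    # minimum statistics level at which it is collected


def collected_at(counters: List[Counter], stats_level: int) -> List[Counter]:
    """Return the counters whose level is at or below the configured level."""
    if not 1 <= stats_level <= 4:
        raise ValueError("statistics level must be between 1 and 4")
    return [c for c in counters if c.level <= stats_level]


# Illustrative sample data only; keys and levels are assumed.
SAMPLE_COUNTERS = [
    Counter(key=1, name="cpu.usage.average", level=1),
    Counter(key=2, name="mem.active.average", level=2),
    Counter(key=3, name="cpu.usage.maximum", level=4),
]
```

With this sample data, level 1 keeps only cpu.usage.average, while level 4 keeps all three counters.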
Performance Metric
8. cpu.usage.average is the performance counter for average CPU utilization. When the counter is collected on a specific instance, such as CPU No. 1 of a host, it forms a performance metric. A performance metric is represented by the PerfMetricId data object, which consists of two parts:
a. counterId: The integer that identifies the performance counter.
b. instance: The name of the instance, such as “vmnic1” or “vmhba0:0:0”.
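The pairing of a counter with an instance can be sketched in a few lines of Python. The MetricId class below is a stand-in for PerfMetricId, not the SDK type, and the counter keys in COUNTER_KEYS are hypothetical: real keys are system-generated and must be looked up on the server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricId:
    """Stand-in for PerfMetricId: a counter bound to one instance."""
    counter_id: int   # integer key of the performance counter
    instance: str     # instance name, e.g. "vmnic1"


# Hypothetical name-to-key lookup; real keys come from the server.
COUNTER_KEYS = {
    "cpu.usage.average": 2,
    "net.usage.average": 143,
}


def metric_for(counter_name: str, instance: str) -> MetricId:
    """Build a metric ID from a dotted counter name and an instance name."""
    return MetricId(counter_id=COUNTER_KEYS[counter_name], instance=instance)
```

The same counter collected on two NICs gives two distinct metrics: metric_for("net.usage.average", "vmnic0") and metric_for("net.usage.average", "vmnic1") share a counter ID but differ in instance.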
Performance Interval
9. The sampling period of an interval has to be longer than the provider's sampling interval, which is reported as refreshRate in the PerfProviderSummary data object returned by the queryPerfProviderSummary() method and is normally 20 seconds. For VirtualCenter Server systems, instances of the PerfInterval data object are referred to as “historical intervals” because they control how data collected from the ESX systems is aggregated and stored in the database.
10. By default, vCenter collects at level 1 and retains sampled statistics as follows:
· 5-minute samples for the past day
· 30-minute samples for the past week
· 2-hour samples for the past month
· 1-day samples for the past year
11. The properties of PerfInterval are described below:
Name | Type | Description |
enabled | boolean | If disabled, vCenter does not collect performance data for that interval or any longer interval. For example, disabling the "Past Month" interval disables both the "Past Month" and "Past Year" intervals. The system then aggregates and retains performance data using the "Past Day" and "Past Week" intervals only. |
key | int | The ID of the interval. |
length | int | The number of seconds that statistics for this interval are kept on the system. |
level | int | 1 to 4. The higher the level, the more data is collected. |
name | string | The name of the historical interval, for example “Past Day” or “Past Week”. |
samplingPeriod | int | The number of seconds between samples for this interval. The real-time samplingPeriod is 20 seconds. |
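The cascading effect of disabling an interval can be sketched as follows. This is only an illustration of the behavior described for the enabled flag, assuming the flags are listed from the shortest interval to the longest; effective_enabled is a hypothetical helper, not an SDK call.

```python
from typing import List


def effective_enabled(flags: List[bool]) -> List[bool]:
    """Given enabled flags ordered from shortest to longest interval,
    return which intervals actually collect data: an interval is active
    only if it and every shorter interval before it are enabled."""
    result = []
    chain_ok = True
    for flag in flags:
        chain_ok = chain_ok and flag
        result.append(chain_ok)
    return result
```

For the order Past Day, Past Week, Past Month, Past Year, the input [True, True, False, True] gives [True, True, False, False]: disabling Past Month also switches off Past Year.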
Default properties for the four built-in historical intervals include:
KEY | SAMPLINGPERIOD | LENGTH | NAME | LEVEL | ENABLED |
1 | 300 | 86400 | Past day | 1 | true |
2 | 1800 | 604800 | Past week | 1 | true |
3 | 7200 | 2592000 | Past month | 1 | true |
4 | 86400 | 31536000 | Past year | 1 | true |
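Dividing each interval's length by its sampling period gives the number of samples retained per metric. The Python sketch below does that arithmetic for the four built-in intervals. It takes the Past year sampling period as 86400 seconds (one day), consistent with the 1-day samples listed earlier; the Interval type and samples_retained helper are hypothetical names for this illustration.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    key: int
    sampling_period: int  # seconds between samples
    length: int           # seconds the samples are kept
    name: str


# Default historical intervals; the Past year sampling period is
# assumed to be 86400 seconds (one day).
DEFAULT_INTERVALS = [
    Interval(1, 300, 86400, "Past day"),
    Interval(2, 1800, 604800, "Past week"),
    Interval(3, 7200, 2592000, "Past month"),
    Interval(4, 86400, 31536000, "Past year"),
]


def samples_retained(interval: Interval) -> int:
    """Number of samples kept per metric for one interval."""
    return interval.length // interval.sampling_period
```

This works out to 288 samples for the past day, 336 for the past week, 360 for the past month, and 365 for the past year.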
12. In general, avoid changing the historical intervals. If you must change anything, change only their levels.
Real-time vs. historical performance statistics:
a. Real-time stats are collected at a 20-second sampling interval and kept for 1 hour.
b. These real-time samples are then processed to generate historical performance stats.
c. ESX maintains historical stats only at the 5-minute interval, kept for one day. vCenter aggregates further and maintains stats for longer durations.
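As a rough illustration of how real-time samples relate to the 5-minute historical interval, the Python sketch below averages consecutive 20-second samples into 5-minute buckets. It is a simplification under stated assumptions: it handles only the average rollup type, drops any trailing partial bucket, and the function name is hypothetical. The actual aggregation is performed by ESX and vCenter and is not reproduced here.

```python
from typing import List

REALTIME_PERIOD = 20        # seconds between real-time samples
REALTIME_RETENTION = 3600   # real-time samples are kept for 1 hour
HISTORICAL_PERIOD = 300     # first historical interval: 5 minutes


def rollup_average(samples: List[float],
                   source_period: int = REALTIME_PERIOD,
                   target_period: int = HISTORICAL_PERIOD) -> List[float]:
    """Average consecutive samples into buckets of target_period seconds.
    Only complete buckets are emitted; a trailing partial bucket is dropped."""
    if target_period % source_period != 0:
        raise ValueError("target period must be a multiple of source period")
    per_bucket = target_period // source_period   # 15 for 20 s -> 300 s
    full = len(samples) - len(samples) % per_bucket
    return [
        sum(samples[i:i + per_bucket]) / per_bucket
        for i in range(0, full, per_bucket)
    ]
```

One hour of real-time data is 3600 / 20 = 180 samples, which rolls up into twelve 5-minute values of 15 samples each.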
Important references:
1. vSphere SDK 5.0 reference on Performance Manager: http://vijava.sourceforge.net/vSphereAPIDoc/ver5/ReferenceGuide/vim.PerformanceManager.html
2. A decent introduction to the Performance Manager APIs: http://www.doublecloud.org/2010/03/fundamentals-of-vsphere-performance-management/