Alerts and Events

Alerts

An alert is a notification generated by the system, based on a condition (or set of conditions). An alert can be set for a specific service.

The alert condition determines when a notification is raised. Depending on the requirement, the user can set a single condition or multiple conditions, and can choose any of the application alert conditions listed below.

Application Alerts

  1. Average CPU Utilization
    Alerts are raised when the average CPU utilization of the application servers over the configured duration (in minutes) is above, below, or equal to a specified threshold. This alert information typically helps to detect changes in load or traffic so that the infrastructure can be scaled accordingly.
  2. Sum Network In
    Alerts whenever the total size of the requests reaches a threshold over the configured duration (in minutes). This alert information helps to detect if someone is trying to upload a large amount of data, or if too much data is coming in over a short time. This alert may also flag an attack.
  3. Sum Network Out
    Alerts whenever the total size of the responses reaches a threshold over the configured duration (in minutes). This alert information helps to detect if someone is trying to download a large amount of data, or if too much data is being served over a short period of time. This alert may also flag data theft.
  4. App Server Errors (Count)
    Alerts the user when the count of error responses from application servers reaches a threshold number over the configured duration (in minutes). This information helps the user to debug the servers for errors. This alert may also indicate scanning activity that precedes an attack.
  5. WAF Events (Count)
    Alerts are raised based on the number of events generated by the Web Application Firewall (WAF) under the policies applied. This alert helps the user to check for an attack or for false positives, after which the user can block the traffic or fine-tune the policies.
  6. App Server Monitoring
    Alerts the user whether or not an application server is responding to out-of-band health monitoring. This alert helps the user to check the health of the application server and fix any issues.
  7. App Server Error Percentage
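    Alerts the user based on the portion of traffic that results in errors. Alerts raised on absolute error counts may not always make sense, because the same count can be negligible under heavy traffic and serious under light traffic. This alert helps the user to troubleshoot the server when errors become disproportionately high.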
  8. App Server Connection Errors (Count)
    Alerts the user if the count of application server connection failures, such as failed TCP connections, reaches or crosses a specified threshold. This alert information helps the user to troubleshoot the application server for health issues or to scale the infrastructure.
  9. App Server Connection Error Percentage
    Alerts the user if Application server connection error reaches or crosses a specified value in percentage.
  10. App Server Latency
    Alerts the user if the average response time of any application server reaches a threshold in milliseconds. This alert information helps the user to troubleshoot the application server for response time or to scale the infrastructure.
  11. App Server Pending Requests (Count)
    Alerts the user if requests waiting in the queue to be accepted are piling up. This alert information helps the user to troubleshoot the application server or to scale the infrastructure.
  12. Average Request Processing Time (ms)
    Alerts the user if the in-latency of the Lightning ADC reaches or crosses a specified value in ms.
  13. Average Response Processing Time (ms)
    Alerts the user if the out-latency of the Lightning ADC reaches or crosses a specified value in ms.
  14. Average Processing Time (ms)
    Alerts the user if in and out latency of the Lightning ADC reaches or crosses a specified value in ms.
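
As a rough illustration of how a windowed threshold condition like those above can be evaluated, the following Python sketch averages a metric over the configured duration and compares it with a threshold. It also shows why an error percentage can be more telling than an absolute error count. The function names, comparison keywords, and sample numbers are assumptions made for this example; it is not the controller's actual implementation.

```python
from statistics import mean


def condition_met(samples, comparison, threshold):
    """Check a windowed condition on per-minute samples.

    samples: metric values, one per minute of the configured duration.
    comparison: "above", "below" or "equal" (hypothetical keywords).
    """
    average = mean(samples)
    if comparison == "above":
        return average > threshold
    if comparison == "below":
        return average < threshold
    if comparison == "equal":
        return average == threshold
    raise ValueError(f"unknown comparison: {comparison}")


def error_percentage(error_count, request_count):
    """Return the share of requests that ended in an error, in percent."""
    if request_count == 0:
        return 0.0
    return 100.0 * error_count / request_count


# Average CPU over a 3-minute window compared with an 80% threshold.
cpu_alert = condition_met([70.0, 85.0, 91.0], "above", 80.0)

# The same 50 errors mean very different things at different traffic levels.
busy_site = error_percentage(50, 100_000)
quiet_site = error_percentage(50, 200)
```

Here the same error count is a small fraction of a percent on the busy site but a quarter of all requests on the quiet one, which is the case the error percentage alert is meant to catch.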

Cluster Alerts

  1. Cluster CPU Utilization
    Alerts are raised when the average CPU utilization of all the Lightning ADC instances in the cluster over the configured duration (in minutes) is above or below a specified threshold. This alert information typically helps to detect changes in load or traffic so that the Lightning ADC infrastructure can be scaled accordingly.
  2. Instance CPU Utilization
    Alerts whenever the CPU utilization of any one of the Lightning ADC instances in the cluster over the configured duration (in minutes) is above or below a specified threshold.
  3. Scale up/down of Cluster
    Alerts whenever a Lightning ADC joins or leaves the cluster. This alert information is helpful to update user managed DNS whenever there is a Lightning ADC infrastructure change.

Creating a New Application Alert

To create a new Application Alert, follow the below steps:

  1. Click Tenant > Tenant Name > Edit Configuration > Alerts.

    _images/image_alerts_NEW1.png

    Select Application Alerts from the Type drop-down list.

  2. Click on NEW ALERT.

    _images/image_alerts_NEW2.png
  3. Provide the Name, set the Condition (check the box next to multiple conditions to set more than one condition), set the duration in minutes, and set the frequency of alert checks before any alert is raised.

    _images/image_alerts_NEW3.png
  4. Check the box Send Email for an email to be sent to the registered account with alert notification. Also, check the box Webhook and enter the URL in the format https:// to which the alert related information is posted.

Creating a New Cluster Alert

To create a new Cluster Alert, follow the below steps:

  1. Click Tenant > Tenant Name > Edit Configuration > Alerts.

    _images/image_cluster_alerts_NEW1.png

    Select Cluster Alerts from the Type drop-down list.

  2. Click on NEW ALERT.

    _images/image_cluster_alerts_NEW2.png
  3. Provide the Name, set the Condition, set the duration in minutes, set the frequency of alert checks before any alert is raised.

    _images/image_cluster_alerts_NEW13.png
  4. Check the box Send Email for an email to be sent to the registered account with alert notification. Also, check the box Webhook and enter the URL in the format https:// to which the alert related information is posted.

    1. Cluster scale alert example and description of fields:

    {"type":"alert","type":"alert","timestamp":1515401519418,"policyId":"default-cluster-scale-alert","instancePublicAddresses":["54.70.243.175"],"clusterType":"Lightning ADC - Manual","currClusterSize":0,"instancePrivateAddresses":["172.31.36.162"],"clusterId":"ibbl5a5dp6","serverId":"0D24896C-F216-11E7-A35D-F5C5BBBEA5E7","instanceId":"i-0f8105236392f0191","tenantName":"Abhijit","scaleEvent":"Down","instanceLaunchTimestamp":1515401515174,"clusterName":"dey-manual-cluster","clusterPrivateAddresses":[],"prevClusterSize":1,"clusterPublicAddresses":[],"providerName":"root","initiatedBy":"email@a10networks.net"}

    • type
      The type of JSON data being posted. It is “alert” in this case.
    • timestamp
      A UNIX epoch time indicating the time at which the alert is raised.
    • policyId
      The name of corresponding alert policy.
    • instancePublicAddresses
      The public addresses of the instance for which the scale event occurred.
    • clusterType
      The type of cluster for which the alert is raised. Currently the possible values are “Lightning ADC - Manual” and “Lightning ADC - Auto”.
    • currClusterSize
      The size of the cluster after the scale event occurred.
    • instancePrivateAddresses
      The private addresses of the instance for which the scale event occurred.
    • clusterId
      Unique 10 byte ID of the Cluster for a given tenant.
    • serverId
      Globally unique ID of the Lightning ADC for which the scale event occurred.
    • instanceId
      The instance ID (as allocated by the cloud in which the ADC is launched) of the Lightning ADC for which the scale event occurred.
    • tenantName
      The tenant that hosts the cluster.
    • scaleEvent
      The type of scale event that occurred. Possible values are “Up” and “Down”.
    • instanceLaunchTimestamp
      A UNIX epoch time indicating the time at which the instance was launched.
    • clusterName
      Name of the cluster for which the alert is raised.
    • clusterPrivateAddresses
      List of private IP addresses of all the instances in the cluster.
    • prevClusterSize
      Size of the cluster before the scale event occurred.
    • clusterPublicAddresses
      List of public IP addresses of all the instances in the cluster.
    • providerName
      The name of the provider that the tenant belongs to.
    • initiatedBy
      The ID of the user that initiated the scale down. If it is initiated by the controller, the value is “controller”.
    _images/scale-up-alert.png
    _images/scale-down-alert.png
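
A receiving script might read the fields described above along the following lines. This is a minimal Python sketch: the function name and the trimmed sample are illustrative, and it assumes the timestamps are in milliseconds, which the 13-digit sample values suggest but the field description does not state.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the sample cluster scale alert shown above.
SAMPLE = """
{"type": "alert", "timestamp": 1515401519418,
 "policyId": "default-cluster-scale-alert",
 "instancePublicAddresses": ["54.70.243.175"],
 "currClusterSize": 0, "prevClusterSize": 1,
 "scaleEvent": "Down", "clusterName": "dey-manual-cluster"}
"""


def summarize_scale_alert(body):
    """Pick out the fields most scripts need from a scale alert."""
    alert = json.loads(body)
    # Assumption: epoch values are milliseconds, so divide by 1000.
    raised_at = datetime.fromtimestamp(alert["timestamp"] / 1000, tz=timezone.utc)
    return {
        "cluster": alert["clusterName"],
        "event": alert["scaleEvent"],
        "size_change": alert["currClusterSize"] - alert["prevClusterSize"],
        "instance_public_ips": alert["instancePublicAddresses"],
        "raised_at": raised_at,
    }


summary = summarize_scale_alert(SAMPLE)
```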
    2. Instance CPU Utilization and Cluster CPU Utilization alert examples and description of fields:

    {"type":"alert","type":"alert","timestamp":1515062852901,"policyId":"default-instance-cpu-util-alert","privateAddresses":["10.142.0.4"],"clusterType":"Lightning ADC - Auto","publicAddresses":["35.185.121.58"],"valueUnit":"ABSOLUTE","clusterId":"k4q87835ai","serverId":"901059019666384389","operator":"GT","durationInMins":2,"condition":"'Instance CPU Utilization' GT 5","tenantName":"Madhusudhan","currentMetricValues":{"InstanceCPUUtilization":100.0},"metric":"InstanceCPUUtilization","clusterName":"madhugcpa1","function":"AVG","updateFrequencyInMins":1,"value":5,"providerName":"root"}

    {"type":"alert","type":"alert","timestamp":1515585325491,"user":"system","policyId":"default-cluster-cpu-util-alert","clusterType":"Lightning ADC - Manual","valueUnit":"ABSOLUTE","clusterId":"mc47ag1l2i","operator":"GT","durationInMins":3,"condition":"'Average Cluster CPU Utilization' GT 5","tenantName":"gumpu","currentMetricValues":{"ClusterCPUUtilization":12.0},"metric":"ClusterCPUUtilization","clusterName":"docker","function":"AVG","updateFrequencyInMins":1,"value":5,"providerName":"root"}

    • type
      The type of JSON data being posted. It is “alert” in this case.
    • timestamp
      A UNIX epoch time indicating the time at which the alert is raised.
    • policyId
      The name of corresponding alert policy.
    • privateAddresses
      The private addresses of the instance for which the alert is raised.
    • clusterType
      The type of cluster for which the alert is raised. Currently the possible values are “Lightning ADC - Manual” and “Lightning ADC - Auto”.
    • publicAddresses
      The public addresses of the instance for which the alert is raised.
    • valueUnit
      The unit in which the metric is configured for the policy. In this case it is ABSOLUTE.
    • clusterId
      Unique 10 byte ID of the Cluster for a given tenant.
    • serverId
      Globally unique ID of the Lightning ADC for which the alert is raised.
    • operator
      Operator in the policy that caused this alert to be raised. Currently “GT” (Greater Than) and “LT” (Less Than) are supported.
    • durationInMins
      The duration for which the metric needs to be considered for computing.
    • condition
      Human readable condition string that caused the alert to be raised.
    • tenantName
      The tenant that hosts the cluster.
    • currentMetricValues
      List of configured metric values recorded at the time the alert is raised.
    • metric
      The name of the configured metric that caused the alert.
    • clusterName
      Name of the cluster for which the alert is raised.
    • function
      The function used to compute the metric across data points. The value is “AVG” for InstanceCPUUtilization.
    • updateFrequencyInMins
      The interval at which the metric is updated. This is internal to the system and can be ignored.
    • value
      The configured value for the metric.
    • providerName
      The name of the provider that the tenant belongs to.
    _images/CPU-utilization-instance.png
    _images/CPU-utilization-cluster.png
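
The CPU utilization fields can be read in a similar way. The Python sketch below is only illustrative: the function name and the trimmed sample are assumptions, and it relies on the metric field naming the key inside currentMetricValues, as it does in both samples above.

```python
import json

# Trimmed copy of the sample Cluster CPU Utilization alert shown above.
SAMPLE = """
{"type": "alert", "policyId": "default-cluster-cpu-util-alert",
 "operator": "GT", "durationInMins": 3,
 "condition": "'Average Cluster CPU Utilization' GT 5",
 "currentMetricValues": {"ClusterCPUUtilization": 12.0},
 "metric": "ClusterCPUUtilization", "clusterName": "docker",
 "function": "AVG", "value": 5}
"""


def describe_cpu_alert(body):
    """Compare the recorded metric value with the configured one."""
    alert = json.loads(body)
    metric = alert["metric"]
    current = alert["currentMetricValues"][metric]
    threshold = alert["value"]
    # Only the two documented operators are handled here.
    comparisons = {"GT": current > threshold, "LT": current < threshold}
    return {
        "cluster": alert["clusterName"],
        "metric": metric,
        "current": current,
        "threshold": threshold,
        "breached": comparisons[alert["operator"]],
    }


report = describe_cpu_alert(SAMPLE)
```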

Cluster Scale Up/Down Alerts to Trigger DNS Updates

Harmony Controller can send a notification whenever a Lightning ADC is added to or removed from a cluster. Notifications can be sent through email or posted to a webhook URL. The webhook option allows you to automate DNS updates whenever the Lightning ADC cluster scales up or down.

  1. Select Cluster Alerts from the Type drop-down list and click on NEW ALERT.

  2. Select ScaleUp/Down of Cluster as a condition.

  3. (Optional) Check the box Send Email for an email to be sent to the registered account with alert notification.

  4. Check the box Webhook and enter the URL, in the format https://, to which the scale up/down alerts are posted.

    A script can run at the endpoint URL (provided as the webhook) and wait for Scale Up/Down alerts from Harmony Controller. On receiving an alert, the script can add the IP address of the scaled Lightning ADC to the application domain name or remove it, depending on the alert type.
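
One possible shape for such a script is sketched below in Python using only the standard library. Several things here are assumptions rather than documented behavior: that the alert arrives as an HTTP POST with a JSON body, that TLS for the https:// URL is terminated in front of this listener (for example by a reverse proxy), and that the public instance addresses are the ones to publish. The update_dns function is a hypothetical placeholder for whatever API your DNS provider offers.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def plan_dns_change(alert):
    """Return (action, addresses) for a cluster scale alert, else None."""
    if not isinstance(alert, dict) or alert.get("type") != "alert":
        return None
    addresses = alert.get("instancePublicAddresses", [])
    if alert.get("scaleEvent") == "Up":
        return "add", addresses
    if alert.get("scaleEvent") == "Down":
        return "remove", addresses
    return None  # not a scale alert


def update_dns(action, addresses):
    """Hypothetical placeholder: call your DNS provider's API here."""
    print(f"would {action} {addresses} for the application domain name")


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            alert = json.loads(self.rfile.read(length))
        except ValueError:
            self.send_response(400)
            self.end_headers()
            return
        plan = plan_dns_change(alert)
        if plan is not None:
            update_dns(*plan)
        self.send_response(200)
        self.end_headers()


def serve(port=8080):
    """Listen for alert posts until interrupted."""
    HTTPServer(("", port), AlertHandler).serve_forever()
```

Calling serve() starts the listener. Keeping the decision in plan_dns_change, separate from the HTTP handling, makes it easy to check against the sample payloads before connecting a real DNS provider.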

Cluster CPU Alerts to Trigger Scaling

Harmony Controller can send a notification if the CPU utilization is higher or lower than the configured thresholds. Notifications can be sent through email or posted to a webhook URL. The webhook option allows you to automate the scaling up and down of a manual cluster.

  1. Select Cluster Alerts from the Type drop-down list and click on NEW ALERT.

  2. Provide the Name, set the duration in minutes, set the frequency of alert checks before any alert is raised.

  3. Select Cluster CPU Utilization as a condition.

  4. (Optional) Check the box Send Email for an email to be sent to the registered account with alert notification.

  5. Check the box Webhook and enter the URL, in the format https://, to which the CPU utilization alerts are posted.

    A script can run at the endpoint URL (provided as the webhook) and wait for Cluster CPU Utilization alerts from Harmony Controller. On receiving an alert, the script can launch a new Lightning ADC instance or terminate an existing one, based on the CPU utilization of the cluster. Alternatively, you can manually launch or terminate a Lightning ADC instance on receiving an alert email.
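
The decision such a script makes could look roughly like the Python sketch below. It assumes one alert policy with the GT operator for high utilization and another with LT for low utilization. The size limits and the current_size argument are hypothetical, since the CPU alert payload does not carry the cluster size, and the actual launch or terminate call depends on your cloud provider.

```python
def decide_scaling(alert, current_size, min_size=1, max_size=4):
    """Return "launch", "terminate" or "none" for a cluster CPU alert.

    current_size, min_size and max_size are illustrative guard rails
    supplied by the caller; they are not fields of the alert.
    """
    if alert.get("metric") != "ClusterCPUUtilization":
        return "none"
    operator = alert.get("operator")
    if operator == "GT" and current_size < max_size:
        return "launch"  # utilization above the configured value
    if operator == "LT" and current_size > min_size:
        return "terminate"  # utilization below the configured value
    return "none"


high_cpu = {"metric": "ClusterCPUUtilization", "operator": "GT", "value": 5}
action = decide_scaling(high_cpu, current_size=2)
```

Bounding the cluster size in the script is a design choice that keeps repeated alerts from launching or terminating instances without limit.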

Events

Events are created to analyze traffic data around a particular occurrence. The user can create an event to analyze the impact on traffic for the complete Application or only for the services offered, as shown below.

User Defined Events

User defined events provide detailed information about the events created by the user for scopes such as services, applications, hosts, smart flows, and so on. They also show the time at which each event was created or modified, the scope of the event, and its properties.

Blue/Green Events

A Blue/Green event is based on blue-green deployment, a technique that reduces downtime and risk by running two identical production environments called Blue and Green. At any time, only one of the environments is live, and the live environment serves all production traffic. The Blue/Green option on the Events screen shows the Blue/Green event details, the type and status of the Blue/Green event, and the time at which the event occurred. The figure below shows the information displayed on the Events page.

_images/image_events_NEW1.png

Creating a New Event

To create a new Event, follow the below steps:

  1. Click Configuration > Events, then click NEW EVENT.

    _images/image_events_NEW1.png
  2. To create a new event for an Application, select the radio button next to Application in Scope.

    _images/image_events_NEW3.png
  3. Click NEW EVENT. Provide the Name of the event, choose the scope of the event, provide the description and the time at which the event is created, and click Save. The user can create an event for either an Application or a Service.

    _images/image_events_NEW4.png

To view the Analytics data of the Events, go to Analytics > Events.

Note

Follow the same steps as above to create a new Blue/Green Event.