    DGE Management

6.1 Configuring Data Gathering Elements (DGEs)

NetVigil uses a distributed, tiered architecture where the data collection and storage is handled by the DGE component. Each DGE polls data from the network devices, servers and applications and performs real-time aggregation and storage of this performance data in a local relational database. The DGE also triggers actions and notifications when it detects that the threshold conditions are exceeded or crossed.

6.1.1 Adding a New DGE

If you would like to expand your NetVigil system to monitor additional devices in remote geographical or logical locations, you can install the software on another physical machine and integrate it seamlessly into your existing setup. You can add multiple DGEs in the same location for load balancing or increasing monitoring capacity, and also distribute DGEs in multiple locations as needed.

You will need to pick a location name so that all devices for this location will be automatically assigned to this new DGE (you can pick any name, e.g., New York or datacenter3 or finance).

6.1.2 Changing DGE Database Type

By default, the DGE database is set to MySQL (licensed and shipped with NetVigil). However, you can use Oracle version 8.1 or higher instead.

NetVigil includes the appropriate Oracle JDBC drivers.

  1. Create Database Objects.
    1. cd NETVIGIL_HOME/database/schema/oracle
    2. Log into Oracle SQL*Plus.
    3. Run the following scripts at the SQL prompt:
       @TableScript.sql
       @AutoIncrementScript.sql
       @NullValueTriggerScript.sql

If the auto increment script does not run successfully, execute each statement in the script individually at the SQL prompt.

  2. Update the NetVigil configuration file. Edit netvigil/etc/netvigil.xml and replace the statement:
    <dge vendor="mysql" ... 

with:

<dge vendor="jdbc:oracle:thin" port="1521"
user="insert_username_here"
password="insert_password_here"
name="insert_database_name_here"
driver="oracle.jdbc.driver.OracleDriver"
minConnections="4" maxConnections="10"
debugging="false"
url="jdbc:oracle:thin:$USER/$PASSWORD@$DEVICE:$PORT:$DATABASE"/> 
Note: Remember to place correct parameter values within the quotation marks.
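
The url attribute above is a template with $-style placeholders. As a rough illustration of what the expanded connection string looks like, the sketch below fills the template with made-up values. It assumes that $USER, $PASSWORD, $PORT and $DATABASE correspond to the user, password, port and name attributes, and that $DEVICE is the database host; the manual does not state this mapping, and NetVigil performs the real substitution itself.

```python
from string import Template

# The url attribute from the <dge> element, copied verbatim.
URL_TEMPLATE = Template("jdbc:oracle:thin:$USER/$PASSWORD@$DEVICE:$PORT:$DATABASE")

# All values below are hypothetical placeholders, not product defaults.
expanded = URL_TEMPLATE.substitute(
    USER="nv_user",
    PASSWORD="secret",
    DEVICE="dbhost.example.com",  # assumed to be the database host
    PORT="1521",
    DATABASE="NVDB",
)
print(expanded)
```

The result has the user/password@host:port:database shape that the Oracle thin driver accepts, which is a quick way to check that the configured values fit together.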

6.1.3 Changing DGE Aggregation Scheme

You can change the DGE data aggregation scheme by updating NETVIGIL_HOME/etc/aggregation.xml. However, you MUST change this prior to the installation process and not after installation. When you change the aggregation scheme, all existing performance data will be deleted and fresh databases for the data aggregation will be used.

Increasing the length of time for which data is retained directly increases the size of the DGE database.

6.1.4 Disk Space Requirements for DGE Aggregation

Note: A DGE Disk Space Requirements calculator is available at http://support.fidelia.com/resources/dbsize/

The DGE database stores three main data types:

  1. Aggregated performance data
  2. Event data (threshold violations)
  3. Syslog and Trap text messages

WARNING: The largest component is typically the aggregated performance data. You can change the DGE data aggregation scheme by updating NETVIGIL_HOME/etc/aggregation.xml, but you MUST do so before installation and not after, because the aggregation scheme is loaded into the configuration database on initialization. Changing the scheme deletes all existing performance data and creates fresh databases for the aggregated data. To modify the aggregation scheme after installation, contact support@fidelia.com.

Each aggregated data value is 30 bytes in size (including the size of its index). For the default aggregation scheme:

5 minute samples for 1 day = 60/5*24 = 288 samples
15 minute samples for 7 days = 60/15*24*7 = 672 samples
60 minute samples for 90 days = 60/60*24*90 = 2160 samples
1 day samples for 3 years = 1*365*3 = 1095 samples
TOTAL size per test = (288+672+2160+1095) * 30 bytes = 126 KB per test

For 10,000 tests, the DGE database size = 10,000 * 126 KB = 1.26 GB
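
The arithmetic above can be reproduced with a short script, which is handy when evaluating a different scheme. This is only a sketch: the function and variable names are illustrative, and it uses decimal units (1 KB = 1000 bytes) to match the figures in the text. Only the 30-byte value size and the default scheme come from this section.

```python
BYTES_PER_VALUE = 30           # size of one aggregated value, index included
MINUTES_PER_DAY = 24 * 60

# (sample interval in minutes, retention in days) for the default scheme
DEFAULT_SCHEME = [
    (5, 1),                    # 5 minute samples for 1 day
    (15, 7),                   # 15 minute samples for 7 days
    (60, 90),                  # 60 minute samples for 90 days
    (MINUTES_PER_DAY, 365 * 3) # 1 day samples for 3 years
]

def samples_per_test(scheme):
    """Number of stored samples for a single test under the given scheme."""
    return sum(days * MINUTES_PER_DAY // interval for interval, days in scheme)

def database_bytes(scheme, tests):
    """Approximate size of the aggregated performance data in bytes."""
    return samples_per_test(scheme) * BYTES_PER_VALUE * tests

per_test = database_bytes(DEFAULT_SCHEME, 1)
total = database_bytes(DEFAULT_SCHEME, 10_000)
print(f"{per_test / 1000:.0f} KB per test, {total / 1e9:.2f} GB for 10,000 tests")
```

Replacing the tuples in DEFAULT_SCHEME with another set of intervals and retention periods gives an estimate for that scheme.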

The database sizes for 10,000 tests under some alternate aggregation schemes are listed in the table below:

Database Size for Specific Aggregation Schemes

Aggregation Scheme                                                               DB Size for 10,000 tests
5 min for 1 day, 15 min for 1 week, 1 hour for 3 months, 1 day for 3 years       1.3 GB
5 min for 1 day, 15 min for 1 week, 1 hour for 1 month, 1 day for 2 years        0.75 GB
5 min for 1 day, 15 min for 1 month, 1 hour for 3 months, 1 day for 2 years      1.8 GB
5 min for 1 day, 15 min for 1 week, 1 hour for 6 months, 1 day for 2 years       1.9 GB
5 min for 30 days, 30 min for 3 months, 2 hours for 6 months, 1 day for 3 years  4.8 GB

Oracle also requires space for transaction logs. The transaction log size must be set to a minimum of 32MB.

6.2 DGE Management Using the Web Interface

In order for NetVigil to provide network management analysis for your organization, the monitoring infrastructure must first be established by the superuser.

6.2.1 Locations and Hosts in the Web Application

Once the hardware for the DGE monitoring devices has been installed and configured (see the NetVigil Install & Upgrade Guide), the superuser can establish the DGE locations and define the DGE hosts for use by the web application.

[Figure: DGE Locations page]
[Figure: DGE Management page]
[Figure: Create a New Data Gathering Element page]

6.2.2 Setting DGE Device Limits

For multiple DGEs in a single geographical location, NetVigil uses a load balancing mechanism based on configurable device limits to ensure that DGE hosts are not overloaded. There are two limits, soft and hard, which determine whether a DGE has the capacity to take on a newly provisioned device. When a device is created, it is provisioned to a DGE using the following algorithm:

  1. Find a DGE that services the location of the device.
  2. If there is only one DGE in that location, select that DGE.
     ELSE if there are many DGEs where the user already has devices, choose the one that is the least loaded.
     ELSE if there is no DGE on which the user already has a device, pick the least loaded DGE that services the new device's location.
  3. If the total number of tests on the selected DGE exceeds the hard limit, select another DGE.
  4. If the total number of tests on the selected DGE exceeds the soft limit, add the device only if the user already has other tests on this DGE.
     ELSE select another DGE.
  5. If no DGE with available capacity can be found for the selected location, log an error and do not provision the device.
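
The provisioning steps above can be sketched in a few lines of Python. This is an interpretation and not NetVigil's actual implementation: the Dge class and select_dge function are hypothetical, "select another DGE" is read as "try the next candidate in preference order", and a single users set stands in for both "the user already has devices here" and "the user already has tests here".

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Dge:
    name: str
    location: str
    soft_limit: int
    hard_limit: int
    test_count: int = 0
    users: set = field(default_factory=set)  # users already present on this DGE

def select_dge(dges, location, user) -> Optional[Dge]:
    """Return the DGE a new device should be provisioned to, or None."""
    candidates = [d for d in dges if d.location == location]
    # DGEs where the user already has devices come first; within each
    # group the least loaded DGE is tried first.
    ordered = sorted(candidates, key=lambda d: (user not in d.users, d.test_count))
    for dge in ordered:
        if dge.test_count > dge.hard_limit:
            continue  # over the hard limit: never accept
        if dge.test_count > dge.soft_limit and user not in dge.users:
            continue  # over the soft limit: existing users only
        return dge
    return None  # caller logs an error and does not provision the device

dges = [
    Dge("dge-a", "New York", soft_limit=100, hard_limit=200, test_count=150, users={"alice"}),
    Dge("dge-b", "New York", soft_limit=100, hard_limit=200, test_count=50),
]
for user in ("alice", "bob"):
    chosen = select_dge(dges, "New York", user)
    print(user, "->", chosen.name if chosen else "no capacity, device not provisioned")
```

In this example the existing user stays on the DGE that is above its soft limit, while the new user is placed on the less loaded DGE.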

6.3 Using the DGE Controller

6.3.1 Monitoring DGE operation/capacity

The DGE component keeps track of the different types of monitors that are running, the number of objects processed and the number of items in various queues waiting to be processed. To view this information, telnet to port 7655 (the default, or use the port that you have configured) on the server where the DGE component is running:

% telnet my_dge 7655 
Trying n.n.n.n...
Connected to my_dge
Escape character is '^]'. 
NetVigil device monitor 
password: *****
<<welcome>> 

Once logged in, you can use the status command to view the health of each monitor, as well as the number of times they have performed a health check of configured elements:

controller> status
<<begin>> 
Monitor[sql] - com.fidelia.emerald.monitor.SqlQueryMonitor
		Number of passes: 0
		Work Units processed: 0
		Thread Status: alive 
Monitor[radius] - com.fidelia.emerald.monitor.RadiusMonitor
		Number of passes: 993
		Work Units processed: 993
		Thread Status: alive 
Monitor[ldap] - com.fidelia.emerald.monitor.LdapMonitor
		Number of passes: 0
		Work Units processed: 0
		Thread Status: alive 
[additional status lines removed]
<<end>> 

On a healthy DGE, Thread Status for all the monitors should indicate alive and the number of passes and number of work units processed should be increasing, provided there are one or more tests of that particular type configured (and not suspended) in the system.
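
If the status output is captured to text, the health check described above is easy to automate. The following sketch parses the Monitor[...] blocks shown earlier and lists monitors whose thread is not reported as alive. The function names are hypothetical, and the parsing relies only on the layout visible in the sample output, which may differ between versions.

```python
import re

SAMPLE = """\
Monitor[sql] - com.fidelia.emerald.monitor.SqlQueryMonitor
        Number of passes: 0
        Work Units processed: 0
        Thread Status: alive
Monitor[radius] - com.fidelia.emerald.monitor.RadiusMonitor
        Number of passes: 993
        Work Units processed: 993
        Thread Status: alive
"""

HEADER = re.compile(r"^Monitor\[(\w+)\] - (\S+)")
FIELD = re.compile(r"^\s+([^:]+):\s*(.+?)\s*$")

def parse_monitors(text):
    """Map each monitor name to a dict of its reported fields."""
    monitors = {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = monitors.setdefault(header.group(1), {"class": header.group(2)})
        elif line[:1].isspace() and current is not None:
            found = FIELD.match(line)
            if found:
                current[found.group(1)] = found.group(2)
        else:
            current = None  # any other unindented line ends the block
    return monitors

def dead_monitors(monitors):
    """Names of monitors whose Thread Status is anything other than alive."""
    return [name for name, info in monitors.items()
            if info.get("Thread Status") != "alive"]

monitors = parse_monitors(SAMPLE)
print(dead_monitors(monitors))
print({name: int(info["Number of passes"]) for name, info in monitors.items()})
```

Comparing the pass counts from two captures taken some minutes apart shows whether each monitor with configured tests is still making progress.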

The DGE status server also provides important information regarding capacity planning. The Schedule Queue section of the status command output indicates how many tests are waiting to be performed:

MonitorServer
		Schedule Queue [Monitor[sql]] Size: 0
		Schedule Queue [Monitor[ldap]] Size: 0
		Schedule Queue [Monitor[radius]] Size: 0
		Schedule Queue [Monitor[port]] Size: 0
		Schedule Queue [Monitor[ntp]] Size: 0
		Schedule Queue [Monitor[poet]] Size: 0
		Schedule Queue [Monitor[ping]] Size: 0
		Schedule Queue [Monitor[snmp]] Size: 2
		Schedule Queue [Monitor[dns]] Size: 0
		Schedule Queue [Monitor[external]] Size: 0
		Result Queue Size: 0
		Aggregation Writer Queue Size: 0
		Result Writer Queue Size: 0
		Event Writer Queue Size: 0 

In the event of a network outage, the queues may grow large, depending on the network topology and the reachability of each device. Once the outage has been resolved, the queues should start to shrink. However, if a queue continues to grow under normal operating conditions, new tests are being added to the queue before existing tests can be performed, and your DGE capacity has been exceeded. At this point you need to add another DGE at the same location, move some tests/devices to a different DGE (at the same location or a different one), reduce the frequency of the tests, or suspend some tests until capacity on the DGE can be increased.
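
One way to spot a queue that keeps growing is to capture the status output periodically and compare the sizes. The sketch below extracts the queue sizes from captured text and flags a series of readings that rises every time. The function names are hypothetical, and the requirement of at least three consecutive increases is an arbitrary assumption, not a documented threshold.

```python
import re

QUEUE_SAMPLE = """\
MonitorServer
        Schedule Queue [Monitor[ping]] Size: 0
        Schedule Queue [Monitor[snmp]] Size: 2
        Result Queue Size: 0
        Event Writer Queue Size: 0
"""

QUEUE_LINE = re.compile(r"^\s*(.+?)\s+Size:\s*(\d+)\s*$")

def parse_queue_sizes(text):
    """Map each queue label to its reported size."""
    sizes = {}
    for line in text.splitlines():
        found = QUEUE_LINE.match(line)
        if found:
            sizes[found.group(1)] = int(found.group(2))
    return sizes

def steadily_growing(readings, min_readings=3):
    """True if there are enough readings and each one exceeds the previous."""
    return (len(readings) >= min_readings
            and all(later > earlier for earlier, later in zip(readings, readings[1:])))

sizes = parse_queue_sizes(QUEUE_SAMPLE)
print(sizes)
# Successive readings of one queue, collected by the caller over time.
print(steadily_growing([2, 40, 95, 180]), steadily_growing([0, 2, 0, 1]))
```

A rising series during a known outage is expected; the same pattern under normal conditions is the capacity warning described above.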

When you are finished, use the quit command to log out of the DGE status server:

controller> quit
<<bye>>
Connection closed by foreign host. 

Fidelia Technology, Inc.