Purpose

This article explains how to troubleshoot and identify high memory issues on the Versa Operating System (VOS).

 

 

Troubleshooting High Memory

Investigating high memory usage requires in-depth analysis of multiple components in the Versa Operating System.

 

 

High memory usage investigations generally fall into two major categories:

 

 

  1. High memory spikes reported at random time intervals, causing performance issues.
  2. Persistent high memory usage that has shown a gradual increase since the last service start.

 

 


Case-1: Follow the steps below when memory usage shows sudden spikes at random time intervals:

 

  • Obtain the timestamps of the last 2-4 memory spike intervals and collect the /var/log/versa/debug/mem-high/tshoot.log file from the VOS device.
  • Collect the device resource usage details (total number of active sessions, interface bandwidth usage, memory usage graph, CPU usage, total NAT sessions, etc.) from the CLI and from Analytics, and verify whether any additional activities were being performed during that time.
  • Additional activities could be scripts running in the backend or extra traffic arriving on the box that creates a greater number of sessions and consumes more memory. Do check the counters/number of NAT sessions.
  • Monitor the poller and worker thread utilization and ensure the box is not getting oversubscribed. Verify the platform hardware specs and the sessions/traffic load it supports at any given point.
  • Compare the "top -H" output from tshoot.log across 2-4 iterations and isolate which process is consuming the most memory during the reported time frame.
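
As an example, below is a minimal shell sketch for locating the periodic "top" snapshots inside tshoot.log so that two iterations can be compared side by side. It assumes the log stores plain top output with its standard "top - HH:MM:SS" header lines; the line ranges below are placeholders, and the log path should be adjusted per the note that follows.

#Find where each top iteration starts (line number and timestamp)
grep -n '^top - ' /var/log/versa/debug/mem-high/tshoot.log

#Cut out two iterations (use the line numbers found above) and compare them
sed -n '1200,1400p' /var/log/versa/debug/mem-high/tshoot.log > /tmp/top_iter1.txt
sed -n '5200,5400p' /var/log/versa/debug/mem-high/tshoot.log > /tmp/top_iter2.txt
diff /tmp/top_iter1.txt /tmp/top_iter2.txt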

 


 

Note: On VOS 21.X releases, refer to /var/log/versa/debug/mem-high/tshoot.log. On older VOS releases, obtain /var/log/versa/tshoot.log.


 

Case-2: Follow the detailed steps below if you suspect that memory is being held up and has been growing gradually for some time.

 

 

Step-1: Check the system details using the commands "vsh details" and "free -h" from the shell. These commands show the hardware type, VOS software details, memory consumption (free and used), and SPack and OSS Pack details on the system.

"free -h" output shows valuable information about memory usage and the space available in RAM. It also shows the used and available swap space and the buffers used by the Linux kernel. Swap space is utilized when the system does not have enough RAM available; swap is additional memory space that resides on disk and is not part of the physical RAM itself.
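
For reference, below is a generic example of "free -h" output from a Linux host; the values are illustrative only and the exact column layout varies with the procps version.

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           5.7G        3.6G        1.1G        1.5M        1.0G        1.8G
Swap:          8.0G          0B        8.0G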

 
Step-2: Check the "show alarms" output to see whether there are any high memory threshold breach alarms. From the alarms you can infer whether the memory spike occurs at specific intervals or whether memory is being held and not released.
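
For example (a minimal sketch; the exact alarm text and the availability of pipe filters such as "match" may vary by VOS release):

admin@cpe2-cli> show alarms
admin@cpe2-cli> show alarms | match memory        <filter to memory-related alarms, if supported on your release>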

 

 

Step-3: Verify whether the memory usage is expected or not. Check the hardware model, total built-in memory, total traffic, and session load on the box, and make sure it is not a case of oversubscription.

 

 

Use the following command to check the CPU/memory load and the active session count on the box.

 


#show device clients


admin@cpe2-cli> show device clients

CLIENT  VSN  CPU   MEM   MAX       ACTIVE    FAILED
ID      ID   LOAD  LOAD  SESSIONS  SESSIONS  SESSIONS
-----------------------------------------------------
15      0    38    79    500000    40        0


 


This command provides the total counters of sessions created, closed, and currently active. A low session count does not necessarily mean low traffic; a few FAT flows could be carrying a huge amount of traffic, so correlate the active session count with the actual traffic statistics. Further correlation can also be done via Analytics, which shows the top consumers.

 



 

 

#show interfaces port statistics brief

 


This command shows the traffic statistics in bps, pps, and percentage utilization.


Step-4: Log in to Analytics and check the memory utilization graph for the last week or month. This gives an indication of when memory utilization started increasing and whether it dropped at some interval.


If memory utilization has been increasing steadily for some time and is not getting released, then the underlying cause needs deeper investigation, as it could be a memory leak in the system. Memory leaks occur when the system is unable to free a used memory block once it is no longer needed. In such instances, another iteration of the same process occupies another block of memory that also does not get relinquished. This cycle repeats and eventually leads to high memory usage in the system, which is technically referred to as a memory leak.
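
To confirm which process is holding the memory, you can sample its resident memory at a fixed interval and watch the trend; a steadily growing value that never drops back is a strong hint of a leak in that process. Below is a minimal shell sketch (versa-vsmd is used only as an example process name):

#Log the RSS (in kB) of versa-vsmd every 10 minutes
while true; do
    echo "$(date '+%F %T')  $(ps -o rss= -p "$(pidof versa-vsmd)")" >> /tmp/vsmd_rss.log
    sleep 600
done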

 

 

Check when the memory utilization started growing and look for possible triggers or changes made on the appliance. Check for events such as an upgrade, a network configuration change, a new service or security profile being added (IPS/IDS, URL filtering, SSL inspection, etc.), or network instability starting during that time frame. We have seen memory leak issues where there is a lot of P2MP neighbor churn, IKE flaps, CGNAT flaps, etc.

 

 

Once it is identified that there is a substantial, unusual increase in memory, follow the steps below to root-cause it.

 

 

Step-5: Get into the appliance shell and obtain /var/log/versa/tshoot.log on older releases or /var/log/versa/debug/mem-high/tshoot.log on the latest VOS releases.

 

 

Start the investigation with the tshoot.log file obtained from either location and check the "top -H" output.

 



top - 21:15:16 up 40 days, 23:33,  1 user,  load average: 4.70, 4.38, 4.35
Threads: 329 total,   5 running, 324 sleeping,   0 stopped,   0 zombie
%Cpu(s): 20.7 us,  1.9 sy,  0.0 ni, 60.0 id,  0.5 wa,  0.0 hi,  0.2 si, 16.7 st
KiB Mem:   5964680 total,  4853592 used,  1111088 free,   176892 buffers
KiB Swap:  8384508 total,      124 used,  8384384 free.   850332 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
21781 root      20   0 3942172 900700 163920 S  1.7 15.1   1060:47 versa-vsmd
22108 root      20   0 3942172 900700 163920 S  0.0 15.1   0:00.00 eal-intr-thread
22109 root      20   0 3942172 900700 163920 R 35.8 15.1  19534:58 worker-0
22110 root      20   0 3942172 900700 163920 R 30.9 15.1  16735:09 worker-1
22111 root      20   0 3942172 900700 163920 S 32.5 15.1  16682:34 worker-2
22112 root      20   0 3942172 900700 163920 S 30.9 15.1  16713:24 worker-3
22113 root      20   0 3942172 900700 163920 R 30.2 15.1  16839:41 worker-4
22114 root      20   0 3942172 900700 163920 R 31.9 15.1  16687:02 worker-5
22115 root      20   0 3942172 900700 163920 S 16.3 15.1   9306:44 poller-0
22260 root      20   0 3942172 900700 163920 S  3.3 15.1   1765:11 vunet-timer
22281 root      20   0 3942172 900700 163920 S  1.3 15.1 883:48.57 ctrl-data-0
22284 root      20   0 3942172 900700 163920 S  3.0 15.1   1706:20 ipsec-control
22285 root      20   0 3942172 900700 163920 S  0.0 15.1   2:37.62 macsec-control
22289 root      20   0 3942172 900700 163920 S  0.3 15.1 110:31.81 lcore-watchdog
22339 root      20   0 3942172 900700 163920 S  0.0 15.1   1:11.56 kni-handle-requ
20348 root      20   0  877988 184456   6396 S  0.0  3.1 196:32.03 confd
20349 root      20   0  877988 184456   6396 S  0.0  3.1   0:00.07 confd
20350 root      20   0  877988 184456   6396 S  0.0  3.1   0:00.07 confd
20351 root      20   0  877988 184456   6396 S  0.0  3.1   0:00.14 confd
20352 root      20   0  877988 184456   6396 S  0.0  3.1   0:03.22 confd
20353 root      20   0  877988 184456   6396 S  0.0  3.1   0:10.56 confd
20354 root      20   0  877988 184456   6396 S  0.0  3.1   0:00.04 confd
20355 root      20   0  877988 184456   6396 S  0.0  3.1   0:00.08 confd
20356 root      20   0  877988 184456   6396 S  0.0  3.1   0:02.89 confd
20357 root      20   0  877988 184456   6396 S  0.0  3.1   0:00.06 confd
20358 root      20   0  877988 184456   6396 S  0.0  3.1   0:07.55 confd



The top output is very useful and provides the actual memory usage (resident memory) for each process. The top command gives statistics for available memory, used memory, and the buffer and cache memory in use. If you are taking the top -H output from the shell, you can sort the output by memory usage (Shift+M); by default, processes are sorted by CPU usage.
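
If you need an already memory-sorted snapshot non-interactively (for example, to attach to a case), a batch-mode variant can be used. This is a sketch; the -o sort option requires a reasonably recent procps top.

#One-shot top snapshot sorted by memory usage instead of CPU
top -b -n 1 -o %MEM | head -n 30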

 

 

You can also dump the output of /proc/meminfo, which shows detailed memory counters, and collect "sudo cat /proc/vmallocinfo".

 


$ cat /proc/meminfo
MemTotal:        5964680 kB
MemFree:         1113440 kB
MemAvailable:    1898280 kB
Buffers:          176892 kB
Cached:           850772 kB
SwapCached:          124 kB
Active:          2203948 kB
Inactive:         284024 kB
Active(anon):    1297588 kB
Inactive(anon):   235296 kB
Active(file):     906360 kB
Inactive(file):    48728 kB
Unevictable:       71428 kB
Mlocked:           71428 kB
SwapTotal:       8384508 kB
SwapFree:        8384384 kB
Dirty:               124 kB
Writeback:             0 kB
AnonPages:       1531684 kB
Mapped:           282636 kB
Shmem:              1492 kB
Slab:             103108 kB
SReclaimable:      67284 kB
SUnreclaim:        35824 kB
KernelStack:        5312 kB
PageTables:        16116 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    10318272 kB
Committed_AS:    1963340 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       2
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
DirectMap4k:      131060 kB
DirectMap2M:     3915776 kB
DirectMap1G:     2097152 kB
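
For a slow leak it also helps to snapshot /proc/meminfo periodically so that counters such as AnonPages and Slab can be compared later; a minimal sketch:

#Collect timestamped /proc/meminfo snapshots every 30 minutes for later comparison
while true; do
    date '+%F %T' >> /tmp/meminfo_history.log
    cat /proc/meminfo >> /tmp/meminfo_history.log
    sleep 1800
done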



 

You can obtain the process status (ps) command output, which lists the currently running processes as read from the /proc file system.

 


[admin@cpe1: ~] $ ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head
  PID  PPID CMD                         %MEM %CPU
21781 21774 /opt/versa/bin/versa-vsmd - 15.0  228
20348     1 /opt/versa/confd/lib/confd/  3.0  0.3
21918     1 /opt/versa/bin/redis-server  1.8  1.1
21158 21154 /opt/versa/bin/versa-vmod -  1.7  0.0
21244     1 /opt/versa/bin/versa-rtd -N  1.6  0.0
20858     1 /usr/bin/python3 /opt/versa  1.0  0.0
20699 20697 /opt/versa/bin/versa-dnsd -  0.9  0.0
20862 20826 /opt/versa/bin/versa-acctmg  0.9  1.2
20810     1 /usr/bin/nodejs /opt/versa/  0.8  0.0



Note down the name of the process that is consuming a high amount of memory.

Step-6: Once you have identified the process consuming the most memory, for example versa-vsmd, log in to the vsmd VTY and check the memory allocation:

 


Execute the following vmalloc command:


vsm-vcsn0> show vsm statistics vmalloc

ID     ID String                         alloc    free   in_use  fail  nmismatch    used-bytes 
------ --------------------------------- ------   ------ ------- ----- ----------   ---------- 
1740   VMEM_ID_MCAST_FWD_STATS           1        0      1       0     0            1.2 KB
1743   VMEM_ID_MCAST_OIFLIST_HTABLE      7        0      7       0     0            784.0 KB
1745   VMEM_ID_VS_UTILS_PROFILER         6        0      6       0     0            30.0 KB
1755   VMEM_ID_MON_DATASTORE             3        0      3       0     0            416 B
1757   VMEM_ID_MON_DAC                   1        0      1       0     0            56.0 KB
1758   VMEM_ID_MON_CRC                   2        0      2       0     0            1.0 MB
1759   VMEM_ID_MON_ARC                   2        0      2       0     0            112.0 KB
1807   VMEM_ID_PRNG_CTXT                 6        0      6       0     0            192 B
1808   VMEM_ID_FDT                       7        0      7       0     0            770.0 KB
1812   VMEM_ID_MACSEC_CFG_PATH_GEN       8        0      8       0     0            1.5 KB
1821   VMEM_ID_MACSEC_STATS              7        0      7       0     0            1.9 KB
1823   VMEM_ID_DOT1X_LIB                 126      12     114     0     0            21.6 KB
1825   VMEM_ID_MACSEC_AUTH_DETAIL_TBL    4        0      4       0     0            640.2 KB
1848   VMEM_ID_TWAMP_MOD_HTABLE          7        0      7       0     0            12.2 KB
1873   VMEM_ID_VSMD_DLP_CFG_COOKIE       2        0      2       0     0            20.0 KB
1882   VMEM_ID_VSMD_DLP_TENANT_MAX       3        0      3       0     0            4.8 KB
1885   VMEM_ID_VSMD_DLP_RTE_RING         2        0      2       0     0            5.0 KB
1893   VMEM_ID_DYNAMIC_SCALE_GLOB_DATA   2        0      2       0     0            96 B
1897   VMEM_ID_VSMD_MDM_CFG_COOKIE       2        0      2       0     0            20.0 KB
1906   VMEM_ID_VSMD_MDM_TENANT_MAX       3        0      3       0     0            4.5 KB
1911   VMEM_ID_VSMD_MDM_STATS_THRAED     1        0      1       0     0            112 B
1914   VMEM_ID_HS_WRAPPER_CONTROL        1        0      1       0     0            320 B
----------------------------------------------------------------------------------------------
Total                               332479579 331953710 525869   0     0            595.5 MB  



The vmalloc command is very useful for seeing the memory allocations made by vsmd. Check the alloc and in_use columns to verify whether there is a leak under a specific process/function. There are cases where the vmalloc usage does not yield any clue; in that case, obtain the total memory allocation for versa-vsmd from the "top -H" output by observing the resident memory, correlate it with the total used memory under the vmalloc stats, and check the difference.



If the vmalloc memory usage accounting is much lower than the resident memory taken from "top -H", then there are memory blocks being allocated through direct malloc calls that are not accounted for under the vsmd vmalloc stats. In such cases we may have to resort to jemalloc caller stats, which need to be enabled explicitly. Engineering cognizance is required.
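
A quick way to quantify the gap described above is to compare the resident memory of versa-vsmd against the vmalloc total; a rough sketch:

#Resident memory of versa-vsmd in kB (the same value top -H reports as RES)
ps -o rss= -p "$(pidof versa-vsmd)"

#Compare this with the "Total ... used-bytes" line of "show vsm statistics vmalloc";
#a large unexplained difference points to allocations made outside the vmalloc accounting.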



vsm-vcsn0>  show vparse memory vdetect

ID ID String                               alloc free in_use  fail nmismatch  used-bytes 
-- --------------------------------------- ------------------ --------------- ---------- 
0  VDETECT_MEM_ID_LIB_GEN                  1     0    1       0    0          32 B
1  VDETECT_MEM_ID_SURICATA_APP_LAYER       27    5    22      0    0          7.9 KB
2  VDETECT_MEM_ID_MAIL_PARSER_SMTP_ANAMOLY 193   162  31      0    0          25.8 KB
3  VDETECT_MEM_ID_MAIL_PARSER_GEN          1     0    1       0    0          16 B
4  VDETECT_MEM_ID_DCERPC_PARSER_GEN        1     0    1       0    0          16 B
5  VDETECT_MEM_ID_FTP_PARSER_GEN           1     0    1       0    0          8 B
6  VDETECT_MEM_ID_JSNORM_ENGINE_GEN        18    0    18      0    0          163.9 KB
7  VDETECT_MEM_ID_LIBNORM_GEN              60    0    60      0    0          23.2 MB
8  VDETECT_MEM_ID_IPREP_GEN                3     0    3       0    0          1.4 KB
9  VDETECT_MEM_ID_URLF_VCATFEED_FEEDS      1     0    1       0    0          1.5 MB
10 VDETECT_MEM_ID_URLF_VCATFEED_MD5INFO    1     0    1       0    0          10.0 MB
11 VDETECT_MEM_ID_URLF_VCATFEED_INDEX      1     0    1       0    0          320.0 KB
12 VDETECT_MEM_ID_BC_URLF_CONFIG_VALUES    1     0    1       0    0          6.0 KB
13 VDETECT_MEM_ID_BC_URLF_SDK_STATUS       1     0    1       0    0          320 B
14 VDETECT_MEM_ID_BC_URLF_LOOKUP_SOURCE    1     0    1       0    0          32 B
15 VDETECT_MEM_ID_BC_URLF_TLD_HASH_TABLE   1     0    1       0    0          10.0 KB
16 VDETECT_MEM_ID_IPS_FILTERS              780   0    780     0    0          17.4 KB
17 VDETECT_MEM_ID_AV_GEN                   2     1    1       0    0          96 B
18 VDETECT_MEM_ID_FILEMAGIC_GEN            63    0    63      0    0          2.0 KB
19 VDETECT_MEM_ID_DEVID_GEN                49176 0    49176   0    0          3.0 MB
21 VDETECT_MEM_ID_HTP_CFG                  8     0    8       0    0          8.8 KB
22 VDETECT_MEM_ID_HTP_LIST                 24    0    24      0    0          1.5 KB
24 VDETECT_MEM_ID_HTP_HOOK                 25    0    25      0    0          400 B
25 VDETECT_MEM_ID_CPP_LIB_GEN              10    0    10      0    0          608 B
----------------------------------------------------------------------------- ----------
Total                                    50400  168   50232   0    0          38.3 MB   

 

 


vsm-vcsn0> show idp mem non-zero    <<<< Collect the IDP stats which are mostly for IDS/IPS.

Step-7: Check the thrm event stats to see whether any queue build-up is seen. There can be cases where one of the cores is being hogged and memory increases due to the queue build-up.



vsm-vcsn0> show vsm statistics thrm evstats

---------------------------------------------------------------------------------------------------------------
  S-tindex |  D-tindex | EV-posted  | EV-processed | EV-proc-fail | EV-po-invalid| EV-po-alloc  | EV-po-write  |
---------------------------------------------------------------------------------------------------------------
Thread group: 1
---------------------------------------------------------------------------------------------------------------
    3943(1)|     3948(0)|            0|            0|            0|            0|            0|            0|
    3943(1)|     3949(0)|            0|            0|            0|            0|            0|            0|
    3943(1)|     3907(2)|   2349577952|   2016057686|            0|            0|            0|            0|   <<<<<
    3943(1)|     4155(3)|            0|            0|            0|            0|            0|            0|
    3944(1)|     3948(0)|            0|            0|            0|            0|            0|            0|
    3944(1)|     3949(0)|            0|            0|            0|            0|            0|            0|
    3944(1)|     3907(2)|     76294088|     35576529|            0|            0|            0|            0|   <<<<<
    3944(1)|     4155(3)|            0|            0|            0|            0|            0|            0|
    3945(1)|     3948(0)|            0|            0|            0|            0|            0|            0|
    3945(1)|     3949(0)|            0|            0|            0|            0|            0|            0|
    3945(1)|     3907(2)|     74415257|     35138969|            0|            0|            0|            0|   <<<<<
    3945(1)|     4155(3)|            0|            0|            0|            0|            0|            0|
    3946(1)|     3948(0)|            0|            0|            0|            0|            0|            0|
    3946(1)|     3949(0)|            0|            0|            0|            0|            0|            0|
    3946(1)|     3907(2)|     77491717|     34925201|            0|            0|            0|            0|
    3946(1)|     4155(3)|            0|            0|            0|            0|            0|            0|
    3947(1)|     3948(0)|            0|            0|            0|            0|            0|            0|
    3947(1)|     3949(0)|            0|            0|            0|            0|            0|            0|
    3947(1)|     3907(2)|    102054976|     34582805|            0|            0|            0|            0|
    3947(1)|     4155(3)|            0|            0|            0|            0|            0|            0|
---------------------------------------------------------------------------------------------------------------



Step-8: We have seen a couple of cases where a memory increase is caused by a build-up of unused dirty pages. In VOS releases before 21.2, the default jemalloc library uses a 10-second decay time to free unused dirty pages back to the kernel.


If you are on an older build, you can try setting the dirty decay timer to 2000 ms, which results in faster decay of unused dirty pages and therefore less physical memory being used. This helps improve the performance of the system.


You can execute the following command from the shell to change the dirty decay timer:



%sudo   ln -s 'dirty_decay_ms:2000'  /etc/malloc.conf
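
You can verify the setting afterwards; jemalloc reads its option string from the target of the /etc/malloc.conf symlink when a process starts, so the new value typically takes effect only after the service restarts, and it should then appear under the run-time options of "show vsm statistics jemalloc" as in the output below. A minimal verification sketch:

#The symlink target itself is the jemalloc option string
ls -l /etc/malloc.conf
readlink /etc/malloc.conf          <should print dirty_decay_ms:2000>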



vsm-vcsn0> show vsm statistics jemalloc

___ Begin jemalloc statistics ___
                                 Version: "5.2.0-0-gb0b3e49a54ec29e32636f4577d9d5a896d67fd20"
Build-time option settings
  config.cache_oblivious: true
  config.debug: false
  config.fill: false
  config.lazy_lock: false
  config.malloc_conf: ""
  config.prof: false
  config.prof_libgcc: false
  config.prof_libunwind: false
  config.stats: true
  config.utrace: false
  config.xmalloc: false
Run-time option settings
  opt.abort: false
  opt.abort_conf: false
  opt.retain: true
  opt.dss: "secondary"
  opt.narenas: 12
  opt.percpu_arena: "disabled"
  opt.oversize_threshold: 8388608
  opt.metadata_thp: "disabled"
  opt.dirty_decay_ms: 2000 (arenas.dirty_decay_ms: 2000)    <<<<<<<<<<<<<<<
  opt.muzzy_decay_ms: 0 (arenas.muzzy_decay_ms: 0)




Analyzing Tech-support:

 

There are cases where we may not get a direct indication from the vmalloc stats of what is consuming the memory. In such instances we need to analyze the device logs in more depth to understand the trigger causing the high memory usage.

 

 

  1. Analyze the logs and check whether there is any network instability (interface flaps, protocol flaps, IKE/IPsec flaps, P2MP neighbor churn, CGNAT churn, etc.) present on the device or in the network.

 



#Check if you see a lot of NBR:CHANGE events from a specific site; the log shows the branch-id as well.

cat /var/log/versa/versa-infmgr.log | grep -i "P2MP:NBR:CHANGE"   

 


2020-01-10 11:40:08.155 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1653, rtt_index 11, tenant_id 1

2020-01-10 11:40:08.383 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1653, rtt_index 11, tenant_id 1

2020-01-10 11:40:08.704 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1647, rtt_index 11, tenant_id 1

2020-01-10 11:40:08.721 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1647, rtt_index 11, tenant_id 1

2020-01-10 11:40:08.768 INFO infmgr_p2mp_neighbour_change:2596 P2MP:NBR:CHANGE remote network-id 76, branch-id 1653, rtt_index 11, tenant_id 1
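
To see which branches are churning the most, the same log can be aggregated per branch-id (a sketch based on the log format shown above):

#Count P2MP neighbor changes per branch-id, busiest first
grep -i "P2MP:NBR:CHANGE" /var/log/versa/versa-infmgr.log | grep -o 'branch-id [0-9]*' | sort | uniq -c | sort -rn | head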

 

 

 

#Verify whether there is a lot of NAT binding churn:

 

link-id 1 (vni-0/2.0), , seq 1 bindings(INT-WAN-Transport-VR:172.16.255.1 -> 178.219.184.46:4661),

link-id 1 (vni-0/2.0), , seq 1 bindings(INT-WAN-Transport-VR:172.16.255.1 -> 178.219.184.46:4661),

link-id 1 (vni-0/2.0), , seq 1 bindings(INT-WAN-Transport-VR:172.16.255.1 -> 178.219.184.46:29461),

link-id 1 (vni-0/2.0), , seq 1 bindings(INT-WAN-Transport-VR:172.16.255.1 -> 178.219.184.46:29461),

link-id 1 (vni-0/2.0), , seq 1 bindings(INT-WAN-Transport-VR:172.16.255.1 -> 178.219.184.46:27748),

link-id 1 (vni-0/2.0), , seq 1 bindings(INT-WAN-Transport-VR:172.16.255.1 -> 178.219.184.46:27748),

 

 


#Get this from the branches/sites isolated in the step above and verify whether the issue is with the paired-id || NAT change || bug (EIM changing every 1-5 min) || link flaps

 


The paired-site location-id should be the same on both boxes and globally unique. It should not overlap with the site-id of any other CPE/Controller; the only exception is using the site-id of one of the CPEs in the HA pair as the location-id on both.

 


cat /var/log/versa/versa-infmgr.log     

cat /var/log/versa/versa-service.log

vsh connect vsmd

show cgnat ei-mappings  <local-tnt-id>     >> Check if the mapping is changing; run at an interval of 5 to 10 minutes (bug 42251)

show cgnat ei-filters   <local-tnt-id>

 

 

  2. Check the device state/configuration and compare it with the date when the device started showing the memory increase. Look for any new changes that could have triggered the usage.
  3. Check whether any upgrade (VOS, SPack, OSS Pack) was performed on this device.
  4. Check whether any new security module configuration, such as SSL inspection/IDS/IPS, was added.

 

 

 

 

The following commands can be collected from the VOS CLI and shell.

 

 

From the VOS CLI:

 

show interfaces port statistics brief

show interfaces port statistics detail

show system details

show device clients 

show orgs org Tenant-1 sessions summary            <Change the tenant name>

show configuration system service-options | details

show system load-stats

show statistics internal memusage                  <Make use of unhide full>


 

From shell: 

 

top -H (Shift +M)

free -h

lscpu

vsh details

cat /proc/net/dev

cat /proc/meminfo

ps -o pid,user,%mem,command ax | sort -b -k3 -r

ps axo %mem,rss,pid,euser,cmd | sort -nr | head -n 10

ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head

ps -o rss,sz,vsz  `pidof versa-vsmd`

sudo cat /proc/(pid of the process)/status       < sudo cat /proc/21781/status>

sudo cat /proc/vmallocinfo

sudo cat /proc/`pid of the process`/maps         < sudo cat /proc/21781/maps>

sudo cat /proc/`pid of the process`/smaps        < sudo cat /proc/21781/smaps>

sudo pmap -XX `pidof versa-vsmd`                 < sudo pmap -XX 21781>
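
As a cross-check on the outputs above, the per-mapping Rss values in smaps can be summed; the total should roughly match the RES value that top reports for the process. A minimal sketch, using versa-vsmd as the example:

#Sum of resident memory across all mappings of versa-vsmd, in kB
sudo awk '/^Rss:/ {sum += $2} END {print sum " kB"}' /proc/`pidof versa-vsmd`/smaps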

 

 

From the VSMD VTY: (collect these commands if versa-vsmd is the top consumer)

 

 

Use the "vsh connect vsmd" shell command to enter the vsmd VTY prompt.

 

 

show vsm statistics vmalloc

show vsm statistics rtemalloc summary

show vsm statistics rtemalloc segments

show vsm statistics rtemalloc memzones

show vsm statistics mbuf

show vsm statistics jemalloc | between CPU Merged

show vsm statistics jemalloc

show vsm statistics vmalloc sort

show vsm cpu info

show vparse memory vdetect

show idp mem non-zero

show vsf nfp module stats brief

show vsm statistics dropped

show vsm statistics datapath

show vsm statistics thrm evstats

show vsm statistics thrm detail



From INF-MGR:

show stats vmalloc sort

show stats vsm



From Vmod:

show vmalloc statistics sort





From Analytics:


Collect the Memory-Usage/CPU-Usage/Access-Circuits-usage/Top-Consumer graphs from analytics for correlation.