
Core Memory

Looking to start a discussion on Core memory usage and memory troubleshooting in general.

1) The first problem we are seeing is that the Core will often be at 0 free memory with a HUGE amount in standby. Yes, I know low free memory is "normal" in current server OSes, and that standby memory is available to be used by another application if it needs it. But that is certainly not my experience, and definitely not when RR is the one with all the memory in standby.

I have a Core that has 125 GB of memory: 25 GB is in use across all processes, 100 GB is in standby, and 0 is free.
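If it helps to quantify that split outside of Task Manager, here is a minimal PowerShell sketch using the built-in Memory performance counters (counter names assume an English-language OS; they are localized on other installs):

    # Break down where physical memory actually sits.
    $counters = '\Memory\Available Bytes',
                '\Memory\Free & Zero Page List Bytes',
                '\Memory\Standby Cache Core Bytes',
                '\Memory\Standby Cache Normal Priority Bytes',
                '\Memory\Standby Cache Reserve Bytes'
    (Get-Counter $counters).CounterSamples |
        Select-Object Path, @{ n = 'GB'; e = { [math]::Round($_.CookedValue / 1GB, 1) } }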

I see the technote below about write caching, but there are a few issues:

https://support.quest.com/rapid-recovery/kb/119686/high-paged-pool-ram-utilization-on-systems-running-appassure-or-rapid-recovery

a) The Core is running Server 2012 R2, so it should not be having this issue.

b) The technote gives no indication of how to confirm whether you are having this issue.

c) Without a way to confirm whether I am having this issue, the technote may not even help.

https://www.quest.com/community/products/rapid-recovery/f/forum/21016/core-memory-usage-in-hyper-v-vms-making-vms-unresponsive#

 

2) RAMMap's File Summary will often show a HUGE amount (100+ GB) of memory in standby.

Is this normal? Does it indicate a problem with the write cache (or anything else)?

Why does this memory only show up in RAMMap's File Summary, and why does it point to the dfs.records file of our repository?

Why does it not show up as standby memory allocated to Core.Service.exe (or any process) in Task Manager / Resource Monitor? That is not how processes are supposed to behave.
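For what it's worth, here is a rough sketch of the gap I mean, comparing what Task Manager charges to processes against the system-wide standby list (counter names again assume an English-language OS):

    # Sum the working sets Task Manager attributes to processes, then compare
    # that to the standby list, which is not charged to any individual process.
    $ws = (Get-Process | Measure-Object WorkingSet64 -Sum).Sum
    $standby = ((Get-Counter '\Memory\Standby Cache Core Bytes',
                             '\Memory\Standby Cache Normal Priority Bytes',
                             '\Memory\Standby Cache Reserve Bytes').CounterSamples |
                Measure-Object CookedValue -Sum).Sum
    '{0:N1} GB in process working sets, {1:N1} GB on the standby list' -f ($ws / 1GB), ($standby / 1GB)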

  • Here are a few quick thoughts:

    1. What is the bytes per sector of the disk your repository is stored on? An easy way to get that is to run "fsutil fsinfo ntfsinfo d:" at an admin command prompt, where d: is the drive letter you have assigned to the disk the repository is on. If you have more than one extent, check all of the disks. Is the Bytes Per Sector also 512? (A trimmed sample of the output follows this list.)

    2. I'm trying to remember off the top of my head, but I believe that unless the bytes per sector of the repository and the bytes per sector of the disk match, we can't disable write caching.

    3. Matching the Bytes Per Sector of the repository and the disk greatly improves performance in the testing I have seen. So in the rebuild of your Cores, it may not have been write caching that improved performance so much as matching those values. Did you run the two Cores you rebuilt with write caching enabled for any period of time?
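    For reference, here is a trimmed sample of that fsutil output. The values are just illustrative; a 512e drive will typically report 512 logical / 4096 physical:

        C:\> fsutil fsinfo ntfsinfo d:
        ...
        Bytes Per Sector  :               512
        Bytes Per Physical Sector :       4096
        Bytes Per Cluster :               65536
        ...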

    I look forward to hearing back from you, and I appreciate the input; feedback like yours helps us identify more ways to improve performance.

  • Hi Tim,

    I now believe you were correct. Matching up block sizes definitely makes a large difference in transfer/backup speeds. 

    I am still having issues with this particular Core on Server 2012. Backup speeds are slower than I would expect and slower than what I'm seeing on my other Cores, two of which use the same storage with similar, reliable speeds.

    The issues I have now:
    1. Windows reports a large amount of memory "in use": 222,499 MB of 262,144 MB (roughly 217 GB of 256 GB) as I look at it right now, with 39,859 MB in standby and 0 MB free.

    2. For reference, the healthy Core is at 127,461 MB of 262,144 MB in use (roughly 124 GB of 256 GB), with 2,955 MB in standby and 131,094 MB free.

    I've drastically reduced the number of machines on this problem Core to no avail; the healthy Core now has 95 protected machines vs. 48 on the problem Core. I think the number of machines has little impact on memory usage, but I could be wrong.

    I've changed the dedupe cache (unfortunately) from 128 GB to 64 GB with little to no effect. The only thing left I can think of is the size of the machines and the jobs running on the bad Core: I have a 30 TB file server trying to take a base image, and six 6 TB Exchange servers that come with huge 7+ TB rollups for each machine. (Managing rollups for servers of this size is a whole other discussion I need to have with you guys, lol.)

    I'm down to 3 maximum concurrent transfers and 1 rollup. I can get decent speeds with this setup, but in-use memory is still HUGE, which I feel is slowing the transfer speeds. Plus, no other machines can back up because the 7 most important servers are always backing up and rolling up.

    We've considered adding RAM to this server, but it's expensive as ***, so today I decided to just build a new 2016 VM with 128 GB or so of RAM and put the 6 Exchange servers on it alone.

    Just wanted to share my experiences with you guys again; thanks for all the feedback and information in this post. If there's anything that could help, or that I'm missing, that would be great too. Cya!

  • Thanks for the response and for continuing the conversation. It's always helpful when someone shares their experiences.

    Since a copy of the dedupe cache lives in active RAM, changing the dedupe cache from 128 GB to 64 GB should have immediately freed up 64 GB of RAM. Obviously that RAM could then be consumed by other functions and disappear over time, but the change should have been noticeable.
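    One quick way to sanity-check that is to watch the Core service's memory before and after the cache change. A minimal sketch, assuming the process is the Core.Service.exe mentioned earlier in this thread:

        # Snapshot the Core service's memory footprint; 'Core.Service' is the
        # process name referenced earlier in this thread.
        Get-Process Core.Service |
            Select-Object Name,
                @{ n = 'WorkingSetGB'; e = { [math]::Round($_.WorkingSet64 / 1GB, 1) } },
                @{ n = 'PrivateGB';    e = { [math]::Round($_.PrivateMemorySize64 / 1GB, 1) } }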

    Number of agents matters if you are doing lots of post-backup processing jobs. For instance, if you are running mountability and checksum checks on Exchange and attachability checks on SQL, you are going to use more resources than if you are just backing up machines with no checks. You are definitely right that large agents (multiple TBs) are far more impactful than lots of smaller agents. A base image of a 30 TB server is going to take a LONG time; there's no way around that, since you're moving 30 TB of data. Six 6 TB Exchange servers are also going to be significantly impactful. Are you doing mountability jobs on those servers? I'd bet that this is also part of that high memory usage.

    I'm curious to see how things change now that you've moved the 6 Exchange servers to their own Core. That would have been my recommendation as well: pull the heavier-load machines out of the current Core and put them on their own Core so that it can focus solely on them and not interfere with other machines.