
vmstorage GC overhead grows on long running instance #5379

Closed
waldoweng opened this issue Nov 23, 2023 · 4 comments
Labels
enhancement, question

Comments

waldoweng (Contributor) commented Nov 23, 2023

Describe the bug

vmstorage relies on the lib/workingsetcache module to cache a lot of useful data: TSIDcache, MetricNamecache, etc.
lib/workingsetcache operates mainly in two modes: split mode and whole mode.
lib/workingsetcache uses fastcache internally. In whole mode, the internal fastcache never gets reset; instead it is cleaned up by the fastcache.bucket.cleanLocked() function, which regularly calls delete():

// cleanLocked drops index entries whose data has already been overwritten
// in the chunks ring buffer; the caller must hold b.mu.
func (b *bucket) cleanLocked() {
	bGen := b.gen & ((1 << genSizeBits) - 1)
	bIdx := b.idx
	bm := b.m
	for k, v := range bm {
		gen := v >> bucketSizeBits
		idx := v & ((1 << bucketSizeBits) - 1)
		// Keep entries from the current or previous generation that still
		// point at live data in chunks; delete everything else.
		if (gen+1 == bGen || gen == maxGen && bGen == 1) && idx >= bIdx || gen == bGen && idx < bIdx {
			continue
		}
		delete(bm, k)
	}
}

But this won't be enough, since Go maps never shrink:

In the official standard Go runtime implementation, the backing array of a map will never shrink, even if all entries are deleted from the map. This is a form of memory wasting. But in practice, this is seldom a problem and actually often good for program performances.
https://go101.org/optimizations/6-map.html
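
For illustration (not part of the original report), a minimal standalone sketch showing that deleting every entry from a large map does not return the backing memory to the heap; the sizes and names are arbitrary:

package main

import (
	"fmt"
	"runtime"
)

// heapAllocMiB forces a GC and reports the live heap size in MiB.
func heapAllocMiB() uint64 {
	runtime.GC()
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return ms.HeapAlloc >> 20
}

func main() {
	m := make(map[uint64]uint64)
	fmt.Println("empty map:       ", heapAllocMiB(), "MiB")

	for i := uint64(0); i < 10_000_000; i++ {
		m[i] = i
	}
	fmt.Println("10M entries:     ", heapAllocMiB(), "MiB")

	for k := range m {
		delete(m, k)
	}
	// The map is empty again, but its backing buckets are still allocated,
	// so the live heap stays roughly where it was before the deletions.
	fmt.Println("after delete all:", heapAllocMiB(), "MiB")
	runtime.KeepAlive(m)
}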

type bucket struct {
	mu sync.RWMutex

	// chunks is a ring buffer with encoded (k, v) pairs.
	// It consists of 64KB chunks.
	chunks [][]byte

	// m maps hash(k) to idx of (k, v) pair in chunks.
	m map[uint64]uint64

	// idx points to chunks for writing the next (k, v) pair.
	idx uint64

	// gen is the generation of chunks.
	gen uint64

	getCalls    uint64
	setCalls    uint64
	misses      uint64
	collisions  uint64
	corruptions uint64
}

The memory occupied by, and the pointers referenced through, fastcache.bucket.m never go away, resulting in a very gradual memory leak and a steady increase in GC overhead.

To Reproduce

  1. Insert many new series to make the TSIDcache switch from split mode to whole mode
  2. Run for a very long time (in our case, 2 months)

Version

All cluster versions

Logs

No response

Screenshots

[screenshot: lots of in-use objects]
[screenshot: GC overhead of 6%]

Used command-line flags

No response

Additional information

No response

waldoweng added the bug label Nov 23, 2023
valyala (Collaborator) commented Nov 23, 2023

@waldoweng, thanks for filing the detailed bug report!

It is unexpected that the high number of items in the bucket.m map of type map[uint64]uint64 may lead to increased GC overhead, since this map doesn't contain pointers and Go has an optimization for such maps, according to https://go101.org/optimizations/6-map.html :

If the key type and element type of a map both don't contain pointers, then in the scan phase of a GC cycle, the garbage collector will not scan the entries of the map. This could save much time.
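
To make that optimization concrete, here is a rough standalone sketch (an illustration added here, not from the issue) comparing a forced GC with a large pointer-free map live against one with a map whose values are pointers; the second collection takes noticeably longer because every entry must be scanned:

package main

import (
	"fmt"
	"runtime"
	"time"
)

// timeGC forces a full collection and returns how long the call took.
func timeGC() time.Duration {
	start := time.Now()
	runtime.GC()
	return time.Since(start)
}

func main() {
	const n = 5_000_000

	// Pointer-free map: the GC does not scan its entries during the mark phase.
	noPtr := make(map[uint64]uint64, n)
	for i := uint64(0); i < n; i++ {
		noPtr[i] = i
	}
	fmt.Println("GC with map[uint64]uint64 live: ", timeGC())
	runtime.KeepAlive(noPtr)
	noPtr = nil
	runtime.GC()

	// Map holding pointers: every entry must be scanned by the GC.
	withPtr := make(map[uint64]*uint64, n)
	for i := uint64(0); i < n; i++ {
		v := i
		withPtr[i] = &v
	}
	fmt.Println("GC with map[uint64]*uint64 live:", timeGC())
	runtime.KeepAlive(withPtr)
}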

I see the following solutions if this is the real issue (it is probably triggered by the periodic deletion of items from the map):

  1. Periodically copy the bucket.m items into a newly allocated map, in order to release the memory still held for superfluous items which were already deleted:

    m := make(map[uint64]uint64, len(b.m))
    for k, v := range b.m {
        m[k] = v
    }
    b.m = m
  2. Use a custom map[string]uint64 with a limited size for mapping the entry key to the offset of the entry in bucket.chunks. This implementation may use an ordinary []uint64 slice containing bucket.chunks offsets and hash/maphash.Bytes as an efficient hash function for mapping the entry key to an index in the slice. This should reduce the number of Go objects to 1 per such a map (see the sketch after this list).
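
A minimal sketch of what such a limited-size index could look like, assuming a fixed slot count and accepting that hash collisions simply overwrite older slots (tolerable for a cache); the type and function names below are illustrative, not fastcache's actual implementation:

package main

import (
	"fmt"
	"hash/maphash"
)

// indexSlots is an assumed fixed capacity: one slot per hashed key.
const indexSlots = 1 << 16

// chunkIndex maps hashed entry keys to encoded offsets into a chunks ring
// buffer using a single flat slice, so the whole index is one Go object
// and never accumulates deleted-entry overhead the way a built-in map does.
type chunkIndex struct {
	seed  maphash.Seed
	slots []uint64
}

func newChunkIndex() *chunkIndex {
	return &chunkIndex{
		seed:  maphash.MakeSeed(),
		slots: make([]uint64, indexSlots),
	}
}

// slot picks the slice index for the given key via hash/maphash.
func (ci *chunkIndex) slot(key []byte) uint64 {
	return maphash.Bytes(ci.seed, key) % indexSlots
}

// set records the chunks offset of the (k, v) pair for key.
func (ci *chunkIndex) set(key []byte, offset uint64) {
	ci.slots[ci.slot(key)] = offset
}

// get returns the recorded offset for key (0 means "not set" in this sketch).
func (ci *chunkIndex) get(key []byte) uint64 {
	return ci.slots[ci.slot(key)]
}

func main() {
	ci := newChunkIndex()
	ci.set([]byte(`metric_name{instance="a"}`), 42)
	fmt.Println(ci.get([]byte(`metric_name{instance="a"}`))) // 42
}

Unlike the built-in map, the slice never grows and carries no tombstones, so its memory and GC footprint stay constant no matter how many entries have been overwritten.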

waldoweng (Contributor, Author) commented

@valyala, thanks for the quick response!
I should read the doc you mentioned again and try to find the real culprit; I will report back if I find it.
Closing this issue for now.

valyala added a commit to VictoriaMetrics/fastcache that referenced this issue Nov 24, 2023
…r to reduce map fragmentation

Hopefully, this may help reduce memory usage and GC pressure for long-running processes which use fastcache.

See VictoriaMetrics/VictoriaMetrics#5379
valyala added a commit that referenced this issue Nov 24, 2023
…1.12.2

This should help reducing GC overhead growth at #5379
valyala (Collaborator) commented Nov 24, 2023

@waldoweng, I implemented the first solution mentioned above in VictoriaMetrics/fastcache@959e53c. This change has been added to the VictoriaMetrics repository in commit a7800cd. Could you try building VictoriaMetrics from this commit according to these instructions and verify whether it helps reduce GC overhead growth for long-running VictoriaMetrics instances under your workload?

If you use the cluster version of VictoriaMetrics, then try building and running vmstorage from commit 3d22f98 according to these docs.

valyala added the enhancement and question labels and removed the bug label Nov 24, 2023
waldoweng (Contributor, Author) commented

@valyala, just wanted to let you know that after talking to people on golang-nuts, the issue seems to be confirmed:
https://groups.google.com/g/golang-nuts/c/ECk8RJgdI9Q

But your implementation came so fast!

I'll try to verify this fix and report back when I have some results, thanks!

AndrewChubatiuk pushed a commit to AndrewChubatiuk/VictoriaMetrics that referenced this issue Nov 24, 2023