
Limit number of concurrent compactions #8276

Closed
hpbieker opened this issue Apr 9, 2017 · 22 comments

@hpbieker
Contributor

hpbieker commented Apr 9, 2017

Proposal:
I would like InfluxDB to have a configuration setting to limit the number of concurrent compactions.

Current behavior:
InfluxDB starts as many compactions concurrently as it can. As far as I understand, it is not limited by GOMAXPROCS.

Desired behavior:
The user can limit the number of compactions running at the same time.

Use case:
On my system I had 170 full compactions running concurrently after a restart of InfluxDB, as all of them are basically started at the same time. This caused the machine to run out of memory and start swapping, which made it useless. It might also cause the machine to run out of disk space.

I have 16 cores and 64 GB RAM, with 1500 shards distributed across ~10 databases and 1.7 TB of data.

By limiting the number of concurrent compactions, these situations can be prevented.

@jwilder
Contributor

jwilder commented Apr 9, 2017

Can you attach some profiles when this is occurring? Also, what version are you running?

curl -o block.txt "http://localhost:8086/debug/pprof/block?debug=1" 
curl -o goroutine.txt "http://localhost:8086/debug/pprof/goroutine?debug=1" 
curl -o heap.txt "http://localhost:8086/debug/pprof/heap?debug=1" 
curl -o vars.txt "http://localhost:8086/debug/vars" 
iostat -xd 1 30 > iostat.txt
influx -execute "show shards" > shards.txt
influx -execute "show stats" > stats.txt
influx -execute "show diagnostics" > diagnostics.txt

@hpbieker
Contributor Author

Hi,
Fortunately we were able to "fix" this by ensuring that only ~10 of the shards that needed compacting had write permission, and then restarting and updating the permissions once it was done processing those shards. I am therefore not able to provide these stats.

@hpbieker
Contributor Author

hpbieker commented Apr 10, 2017

Here is a copy of part of the log, which shows that a very large number of compactions are started at the same time:
https://gist.github.com/hpbieker/c2107428fd5dc787b359c0e023397efd

This can easily be reproduced by:

  • Disable full compaction by setting "compact-full-write-cold-duration" to a large value
  • Have ~100 large shards (~2 GB each)
  • Write one or more points to each of the shards
  • Wait for the non full compactions to complete their jobs
  • Then change the "compact-full-write-cold-duration" to "1m" and restart Influx

Influx will then start a full compaction of all 100 shards at once. Each compaction may take a few minutes, so they do not have time to complete before others are started, and the system will start thrashing (100% utilization of the swap media) or you will get an OOM.
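
For reference, "compact-full-write-cold-duration" lives in the [data] section of influxdb.conf. A minimal excerpt covering steps 1 and 5 above (the large value is just an illustrative way of saying "effectively never"):

[data]
  # Step 1: set this very high so cold shards are effectively never fully compacted.
  compact-full-write-cold-duration = "10000h"

  # Step 5: change it to "1m" and restart, so every cold shard becomes eligible at once.
  # compact-full-write-cold-duration = "1m"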

@hpbieker
Contributor Author

hpbieker commented Apr 11, 2017

It looks to me like #7142 would solve this in a more generic way?

@jwilder
Contributor

jwilder commented Apr 12, 2017

What version are you running?

@hpbieker
Contributor Author

InfluxDB starting, version 1.3.0~n201704010800, branch master, commit 15e594f

@hpbieker
Contributor Author

hpbieker commented Apr 12, 2017

In our use case we are doing a backfill into InfluxDB with data from 2008 to 2017. Each file we import might have data for a day or a month, and we are importing them in ascending order (more or less). In some cases we have to import some of the data twice. The shard duration is 1 week, but we do have multiple databases.

I guess it is this backfill combined with a restart of influx before compaction is done that causes the large number of compactions to start at the same time.

@jwilder
Contributor

jwilder commented Apr 12, 2017

@hpbieker For that range of time that you are backfilling, you may want to bump up the shard duration to greater than 1 week to reduce the number of shards you have. 1-3 months might be better. If that data is never going to be removed, even as high as 1 year would be good. It sounds like you may have sparse data as well. The FAQ has other suggestions about config and schema design.

I've run into the many compactions issue you are seeing due to the server being restarted frequently and compactions never completing. We do need to handle this case better when there are many shards.
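
As a rough illustration of that suggestion (the database and retention policy names below are made up, and note that a new shard group duration only applies to shards created after the change):

influx -execute 'ALTER RETENTION POLICY "autogen" ON "mydb" SHARD DURATION 4w'
influx -execute 'SHOW RETENTION POLICIES ON "mydb"'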

@hpbieker
Contributor Author

@jwilder, thank you for your suggestion. I agree that increasing the shard duration to at least 4 weeks will be beneficial for read performance when the query time window is > 1 week. However, it will result in large shards that will have to be recompacted from time to time.

@hpbieker
Contributor Author

We ran into the same problem yesterday when adding one point to each shard in a database using a script. It triggered compaction of all the shards at the same time. It looks like it fails with an OOM if I have more than ~10 full compactions running at the same time.

What consumes the memory in the full compactions? It looks like the memory usage increases as the compaction progresses, but shouldn't Influx free the memory as it finishes with parts of the files? Or does it somehow wait until a compaction is completed to free some of the memory?

I guess I will have the same problem if I do a DROP MEASUREMENT XX, because that will trigger a full compaction of all the shards at the same time.

@hpbieker
Contributor Author

In order to limit the number of compactions, I think the code below in engine.go should be modified to either a) start a limited number of goroutines, or b) add some code to throttle the number of goroutines running. The number should be configurable.

Any suggestions on the best way to do this?

// Apply concurrently compacts all the groups in a compaction strategy.
func (s *compactionStrategy) Apply() {
	start := time.Now()

	var wg sync.WaitGroup
	for i := range s.compactionGroups {
		wg.Add(1)
		go func(groupNum int) {
			defer wg.Done()
			s.compactGroup(groupNum)
		}(i)
	}
	wg.Wait()

	atomic.AddInt64(s.durationStat, time.Since(start).Nanoseconds())
}
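
Not presuming how this should look in engine.go itself, but here is a self-contained sketch of option b): a buffered channel used as a counting semaphore, so that at most a configurable number of compactions run at once. The names maxConcurrent, numGroups and compactGroup below are stand-ins, not the real identifiers:

package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	const maxConcurrent = 4 // would come from a config setting in practice
	const numGroups = 20    // stand-in for len(s.compactionGroups)

	// Buffered channel as a counting semaphore: sends block once all
	// maxConcurrent slots are taken, so at most that many jobs run at once.
	sem := make(chan struct{}, maxConcurrent)

	var wg sync.WaitGroup
	for i := 0; i < numGroups; i++ {
		wg.Add(1)
		go func(groupNum int) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot (blocks if all are in use)
			defer func() { <-sem }() // release the slot when this group is done
			compactGroup(groupNum)
		}(i)
	}
	wg.Wait()
}

// compactGroup is a placeholder for the real compaction work.
func compactGroup(groupNum int) {
	time.Sleep(100 * time.Millisecond)
	fmt.Println("compacted group", groupNum)
}

Option a) would instead be a fixed pool of worker goroutines pulling group indexes from a channel; either way the limit would come from configuration rather than from the number of shards.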

jwilder self-assigned this and unassigned stuartcarnie on Apr 26, 2017
jwilder mentioned this issue on May 2, 2017
@jwilder
Contributor

jwilder commented May 4, 2017

Fixed via #8348
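
If I recall the resulting option correctly, the limit is exposed as max-concurrent-compactions in the [data] section of influxdb.conf; treat the excerpt below as something to verify against your build, with the value only an example:

[data]
  # Maximum number of full and level compactions that can run at one time.
  # 0 (the default) lets InfluxDB derive a limit from the available CPU cores.
  max-concurrent-compactions = 4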

jwilder closed this as completed on May 4, 2017
jwilder added this to the 1.3.0 milestone on May 4, 2017
@hpbieker
Contributor Author

hpbieker commented May 6, 2017

I just want to say that the fix worked very well. This is the startup after I had dropped a measurement, which required recompaction of all my shards in one of the databases. As you can see, the memory consumption is very steady. Thank you!
[image: memory usage graph during the recompaction after the upgrade]

@jwilder
Contributor

jwilder commented May 6, 2017

@hpbieker Great! Do you have a graph of goroutines by chance? It's available in _internal. They should drop significantly as well once the shards go cold and are recompacted.
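
Something like the query below should pull the goroutine counts out of _internal (assuming the default monitor settings; the measurement and field names are from memory, so adjust them if they differ on your build):

influx -database _internal -execute 'SELECT max("NumGoroutine") FROM "runtime" WHERE time > now() - 12h GROUP BY time(1m)'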

@hpbieker
Contributor Author

hpbieker commented May 6, 2017

Hi again @jwilder

I think the first graph below confirms that -- we upgraded at 9:00 today :-) The second graph is the same graph, but only with data after the upgrade.
[images: goroutine count graphs, full history and data after the upgrade only]

@hpbieker
Contributor Author

hpbieker commented May 7, 2017

I do not know if it is related to this commit or an earlier one, but it looks like the current version uses a bit more memory than the previous one. The memory consumption now ends up at around 38.4 GB, whereas it used to be 32.3 GB. As you can see from the graphs, I get a significant jump in various measurements at around 09:00 when we did the upgrade.
May 5th 2017, 15:37:21.000: InfluxDB starting, version 1.3.0, n201704010800, branch master, commit 15e594f
May 6th 2017, 09:13:26.000: InfluxDB starting, version 1.3.0, n201705050800, branch master, commit 0b018ca
May 6th 2017, 16:54:51.000: InfluxDB starting, version 1.3.0, n201705050800, branch master, commit 0b018ca

[image: memory and runtime stats graphs around the May 6 upgrade]

@jwilder
Contributor

jwilder commented May 8, 2017

@hpbieker Would you be able to grab a heap profile?

@hpbieker
Contributor Author

hpbieker commented May 8, 2017

@jwilder Here is the heap:
heap.txt

@jwilder
Contributor

jwilder commented May 8, 2017

@hpbieker #8370 has a change that might reduce the memory usage increase you are seeing. Would you be able to test that PR out to see if memory usage is reduced?

@jwilder
Contributor

jwilder commented May 10, 2017

#8370 is merged and in current nightlies.

@hpbieker
Contributor Author

hpbieker commented May 19, 2017

@jwilder, we did a restart on 16th May at 12:00. I am not sure how to read this, but I see that the heap in use has been reduced while the heap alloc has increased. Also the number of objects has increased. Should I provide some more stats?

May 16th 2017, 12:05:11.000 INFO - InfluxDB starting, version 1.3.0-n201705150800, branch master, commit 3b70086 - influx-log
May 14th 2017, 02:04:21.000 INFO - InfluxDB starting, version 1.3.0-n201705050800, branch master, commit 0b018ca - influx-log

[image: heap in use, heap alloc, and heap object graphs around the May 16 restart]

@jwilder
Contributor

jwilder commented May 19, 2017

@hpbieker Are you running into issues with the current build or just noting the difference here? If you can grab a heap snapshot that might be useful.

curl -o heap.txt "http://localhost:8086/debug/pprof/heap?debug=1"
