
Queries slow down hundreds times after overwriting points #6611

Closed
kub00n opened this issue May 12, 2016 · 1 comment · Fixed by #6668
kub00n commented May 12, 2016

Bug report

System info:
InfluxDB 0.13, InfluxDB 0.12.2
Ubuntu 14.04.1

Steps to reproduce:

  1. Add some measurements (5M points, time range: 5000 s)
...
test_series,tag=tag1 value=2 1462060800001
test_series,tag=tag2 value=3 1462060800002
test_series,tag=tag0 value=4 1462060800003
...
  2. Run a query
  3. Insert exactly the same points again (overwriting the old ones)
  4. Run the query again
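A minimal sketch of the points inserted in step 1, following the sample lines above (tags cycle tag0–tag2, values and millisecond timestamps increase by one; the generator name is my own, not from the original script):

```python
def line_protocol_points(n, start_ms=1462060800000):
    """Generate line-protocol strings matching the samples above."""
    for i in range(n):
        yield "test_series,tag=tag%d value=%d %d" % (i % 3, i + 1, start_ms + i)

# The first few generated points:
# test_series,tag=tag0 value=1 1462060800000
# test_series,tag=tag1 value=2 1462060800001
```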

Expected behavior:
Both queries return the same results in a similar amount of time

Actual behavior:
The results are the same, but the second query (after the overwrite) is almost 500 times slower than the first. Repeating the query after 10 s is a bit faster, but still far slower than the first query.

Additional info:
I wrote a simple Python script to reproduce it: influx_overwrite_bug.py:

==CREATING DATABASE==
==INSERTING SERIES==
* inserting 5000000 points: from 1462060800000ms - to 1462065800000ms
==QUERY==
* query: select * from test_series LIMIT 1
* result: {u'tag': u'tag0', u'value': 1, u'time': 1462060800000}
* duration: 0.0093s
==INSERTING SERIES==
* inserting 5000000 points: from 1462060800000ms - to 1462065800000ms
==QUERY==
* query: select * from test_series LIMIT 1
* result: {u'tag': u'tag0', u'value': 1, u'time': 1462060800000}
* duration: 4.4923s
==SLEEP 10s==
==QUERY==
* query: select * from test_series LIMIT 1
* result: {u'tag': u'tag0', u'value': 1, u'time': 1462060800000}
* duration: 3.1518s
==REMOVING DATABASE==
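The per-query durations in the log above can be captured with an ordinary wall-clock wrapper (a sketch only; the original script is not shown here, and `client.query` below is a hypothetical InfluxDB client call):

```python
import time

def timed(fn):
    """Run fn() and return (result, elapsed seconds), like the
    'duration' lines printed by the script."""
    t0 = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - t0

# Hypothetical usage against an InfluxDB client:
# rows, dt = timed(lambda: client.query("select * from test_series LIMIT 1"))
```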
@jwilder jwilder added this to the 1.0.0 milestone May 13, 2016
jwilder added a commit that referenced this issue May 18, 2016
If there were duplicate points in multiple blocks, we would correctly
dedup the points and mark the regions of the blocks we've read.
Unfortunately, we were not excluding the already-read points as the
cursor moved to points in the later blocks, which could cause points
to be returned twice incorrectly.

Fixes #6611
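The commit message above can be illustrated with a simplified dedup over overlapping blocks (a sketch of the idea only, not the actual TSM cursor code): when the same timestamp appears in several blocks, the later write wins, and a timestamp already emitted must be excluded as the cursor reaches later blocks so no point is returned twice.

```python
def dedup_blocks(blocks):
    """Merge blocks of (timestamp, value) points.
    Later blocks overwrite earlier duplicates, and each timestamp
    is returned exactly once -- the behavior the fix restores."""
    merged = {}
    for block in blocks:        # blocks in write order: later wins
        for ts, val in block:
            merged[ts] = val
    return sorted(merged.items())
```

For example, `dedup_blocks([[(1, "a"), (2, "b")], [(2, "B"), (3, "c")]])` yields each timestamp once, with the overwritten value `"B"` at timestamp 2.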

jwilder commented May 18, 2016

@kub00n Thanks for providing the python script. That made it really easy to reproduce and track down the issue. The perf issue is fixed in master, but your script highlighted a correctness issue with deduplicating overwritten points. See #6668 which will fix that issue.
