Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cxx: Memory leak for anonymous function parameters #3372

Closed
techee opened this issue Apr 30, 2022 · 15 comments · Fixed by #3375
Closed

cxx: Memory leak for anonymous function parameters #3372

techee opened this issue Apr 30, 2022 · 15 comments · Fixed by #3375
Assignees

Comments

@techee
Copy link
Contributor

techee commented Apr 30, 2022

When running a slightly older version of ctags (in Geany) with valgrind, I'm getting this:

==27073== 288 (176 direct, 112 indirect) bytes in 2 blocks are definitely lost in loss record 15,678 of 17,610
==27073==    at 0x48490C0: malloc (vg_replace_malloc.c:381)
==27073==    by 0x4B13233: eMalloc (routines.c:220)
==27073==    by 0x4AB7433: createToken (cxx_token.c:30)
==27073==    by 0x4AEE40F: objPoolGet (objpool.c:66)
==27073==    by 0x4AB7647: cxxTokenCreate (cxx_token.c:78)
==27073==    by 0x4AB7647: cxxTokenCreateAnonymousIdentifier (cxx_token.c:142)
==27073==    by 0x4AB1493: cxxParserTokenChainLooksLikeFunctionParameterList.part.0 (cxx_parser_function.c:2179)
==27073==    by 0x4AB18AB: cxxParserTokenChainLooksLikeFunctionParameterList (cxx_parser_function.c:1934)
==27073==    by 0x4AB18AB: cxxParserLookForFunctionSignatureCheckParenthesisAndIdentifier (cxx_parser_function.c:621)
==27073==    by 0x4AB1C7F: cxxParserLookForFunctionSignature (cxx_parser_function.c:966)
==27073==    by 0x4AAF663: cxxParserAnalyzeOtherStatement (cxx_parser.c:1505)
==27073==    by 0x4AAFEB3: cxxParserParseBlockInternal (cxx_parser_block.c:686)
==27073==    by 0x4AAFEB3: cxxParserParseBlock (cxx_parser_block.c:798)
==27073==    by 0x4AAE1C7: cxxParserMain (cxx_parser.c:1886)
==27073==    by 0x4B04D77: createTagsForFile (parse.c:3737)
==27073==    by 0x4B04D77: createTagsWithFallback1 (parse.c:3857)

This is for function parameter kinds enabled and it seems to be related to the situation when anonymous function parameter is discovered like in void foo(int *). I suspect that normally pIdentifier points to something like token chain that is released by some other code while in this case the token is created and not freed in:

pIdentifier = cxxTokenCreateAnonymousIdentifier(CXXTagKindPARAMETER);

@techee
Copy link
Contributor Author

techee commented Apr 30, 2022

I'm also getting this leak:

==27073== 144 (88 direct, 56 indirect) bytes in 1 blocks are definitely lost in loss record 14,839 of 17,610
==27073==    at 0x48490C0: malloc (vg_replace_malloc.c:381)
==27073==    by 0x4B13233: eMalloc (routines.c:220)
==27073==    by 0x4AB7433: createToken (cxx_token.c:30)
==27073==    by 0x4AEE40F: objPoolGet (objpool.c:66)
==27073==    by 0x4AB37EF: cxxParserParseNextToken (cxx_parser_tokenizer.c:1241)
==27073==    by 0x4AAE23B: cxxParserParseAndCondenseSubchainsUpToOneOf (cxx_parser.c:196)
==27073==    by 0x4AAE41F: cxxParserParseAndCondenseCurrentSubchain (cxx_parser.c:117)
==27073==    by 0x4AAFEE3: cxxParserParseBlockInternal (cxx_parser_block.c:738)
==27073==    by 0x4AAFEE3: cxxParserParseBlock (cxx_parser_block.c:798)
==27073==    by 0x4AB040F: cxxParserParseBlockHandleOpeningBracket (cxx_parser_block.c:167)
==27073==    by 0x4AAFEC7: cxxParserParseBlockInternal (cxx_parser_block.c:717)
==27073==    by 0x4AAFEC7: cxxParserParseBlock (cxx_parser_block.c:798)
==27073==    by 0x4AAE1C7: cxxParserMain (cxx_parser.c:1886)
==27073==    by 0x4B04D77: createTagsForFile (parse.c:3737)
==27073==    by 0x4B04D77: createTagsWithFallback1 (parse.c:3857)

but in this case I don't really know what's going on.

@masatake
Copy link
Member

What kind of input did you give to ctags?

@masatake
Copy link
Member

The first one is not reproduced with the input int foo(int *);.

[jet@living]~/var/ctags-github% cat foo.c                                         
void foo(int *);
[jet@living]~/var/ctags-github%  valgrind ./ctags --kinds-C='*' -o - foo.c        
==102964== Memcheck, a memory error detector
==102964== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==102964== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==102964== Command: ./ctags --kinds-C=* -o - foo.c
==102964== 
__anon0f73801a010d	foo.c	/^void foo(int *);$/;"	z	prototype:foo	typeref:typename:int *	file:
foo	foo.c	/^void foo(int *);$/;"	p	typeref:typename:void	file:
==102964== 
==102964== HEAP SUMMARY:
==102964==     in use at exit: 18,398 bytes in 665 blocks
==102964==   total heap usage: 9,569 allocs, 8,904 frees, 518,762 bytes allocated
==102964== 
==102964== LEAK SUMMARY:
==102964==    definitely lost: 0 bytes in 0 blocks
==102964==    indirectly lost: 0 bytes in 0 blocks
==102964==      possibly lost: 0 bytes in 0 blocks
==102964==    still reachable: 18,398 bytes in 665 blocks
==102964==         suppressed: 0 bytes in 0 blocks
==102964== Rerun with --leak-check=full to see details of leaked memory
==102964== 
==102964== For lists of detected and suppressed errors, rerun with: -s
==102964== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

@techee
Copy link
Contributor Author

techee commented Apr 30, 2022

What kind of input did you give to ctags?

That's a good question - basically all Geany sources, GTK, glib and all of our unit tests. But I haven't explored in detail what exactly is causing it yet.

@techee
Copy link
Contributor Author

techee commented May 1, 2022

The cxxTokenCreateAnonymousIdentifier leak seems to be triggered by this piece of code

G_DEFINE_TYPE_WITH_CODE (GDummyProxyResolver, g_dummy_proxy_resolver, G_TYPE_OBJECT,
			 G_IMPLEMENT_INTERFACE (G_TYPE_PROXY_RESOLVER,
						g_dummy_proxy_resolver_iface_init)
			 _g_io_modules_ensure_extension_points_registered ();
			 g_io_extension_point_implement (G_PROXY_RESOLVER_EXTENSION_POINT_NAME,
							 g_define_type_id,
							 "dummy",
							 -100))

static void
g_dummy_proxy_resolver_finalize (GObject *object)
{
  /* must chain up */
  G_OBJECT_CLASS (g_dummy_proxy_resolver_parent_class)->finalize (object);
}

from
https://gitlab.gnome.org/GNOME/glib/-/blob/main/gio/gdummyproxyresolver.c

@masatake
Copy link
Member

masatake commented May 1, 2022

@techee, thank you.
I shrank the input you showed:

f(a,){

masatake added a commit to masatake/ctags that referenced this issue May 1, 2022
…ken parameter list

Close universal-ctags#3372.

The original code assumed tokens allocated in
cxxParserTokenChainLooksLikeFunctionParameterList() for tagging
anonymous parameters are deleted in its caller side.  This assumption
is true if the parameter list where the anonymous parameters are
defined is valid.

If the parameter list is broken, the tokens are never deleted. This
cause a memory leak as reported in universal-ctags#3372.

With this change, cxxParserTokenChainLooksLikeFunctionParameterList()
releases the tokens if given parameter list is broken.

Signed-off-by: Masatake YAMATO <[email protected]>
masatake added a commit to masatake/ctags that referenced this issue May 1, 2022
…ken parameter list

Close universal-ctags#3372.

The original code assumed tokens allocated in
cxxParserTokenChainLooksLikeFunctionParameterList() for tagging
anonymous parameters are deleted on its caller side.  This assumption
is true if the parameter list where the anonymous parameters are
defined is valid.

If the parameter list is broken, the tokens are never deleted. This
cause a memory leak, as reported in universal-ctags#3372.

With this change, cxxParserTokenChainLooksLikeFunctionParameterList()
releases the tokens if given parameter list is broken.

Signed-off-by: Masatake YAMATO <[email protected]>
@techee
Copy link
Contributor Author

techee commented May 2, 2022

I'm also getting this

==12381== 112 (88 direct, 24 indirect) bytes in 1 blocks are definitely lost in loss record 523 of 540
==12381==    at 0x48490C0: malloc (vg_replace_malloc.c:381)
==12381==    by 0x211FC3: eMalloc (routines.c:220)
==12381==    by 0x16555B: createToken (cxx_token.c:30)
==12381==    by 0x225DF7: objPoolGet (objpool.c:66)
==12381==    by 0x161917: cxxParserParseNextToken (cxx_parser_tokenizer.c:1241)
==12381==    by 0x15C3A7: cxxParserParseAndCondenseSubchainsUpToOneOf (cxx_parser.c:305)
==12381==    by 0x15C3A7: cxxParserParseAndCondenseSubchainsUpToOneOf (cxx_parser.c:187)
==12381==    by 0x15C4F7: cxxParserParseAndCondenseCurrentSubchain (cxx_parser.c:117)
==12381==    by 0x15C37F: cxxParserParseAndCondenseSubchainsUpToOneOf (cxx_parser.c:256)
==12381==    by 0x15C37F: cxxParserParseAndCondenseSubchainsUpToOneOf (cxx_parser.c:187)
==12381==    by 0x15C4F7: cxxParserParseAndCondenseCurrentSubchain (cxx_parser.c:117)
==12381==    by 0x15E00B: cxxParserParseBlockInternal (cxx_parser_block.c:738)
==12381==    by 0x15E00B: cxxParserParseBlock (cxx_parser_block.c:798)
==12381==    by 0x15C29F: cxxParserMain (cxx_parser.c:1914)
==12381==    by 0x13F61F: createTagsForFile (parse.c:3759)
==12381==    by 0x13F61F: createTagsWithFallback1 (parse.c:3907)

for

G_DEFINE_TYPE_WITH_CODE (GBufferedInputStream,
			 g_buffered_input_stream,
			 G_TYPE_FILTER_INPUT_STREAM,
                         G_ADD_PRIVATE (GBufferedInputStream)
			 G_IMPLEMENT_INTERFACE (G_TYPE_SEEKABLE,
						g_buffered_input_stream_seekable_iface_init))

static void
g_buffered_input_stream_class_init (GBufferedInputStreamClass *klass)
{
}

but the code looks very similar and is probably the same leak.

@masatake
Copy link
Member

masatake commented May 2, 2022

Reproduced even with ctags + #3373.

I tried the input shorter: G (f () g ()) h (void) {.

masatake added a commit to masatake/ctags that referenced this issue May 2, 2022
Close universal-ctags#3372.

The original code assumed tokens allocated in
cxxParserTokenChainLooksLikeFunctionParameterList() for tagging
anonymous parameters were deleted on its caller side
(cxxParserEmitFunctionParameterTags()).

However, in some conditions, cxxParserEmitFunctionParameterTags() are
not called. As the allocated tokens are not deleted, as reported in universal-ctags#3372.

To avoid the memory leaking, the deletion cannot depend on
cxxParserEmitFunctionParameterTags(). This change uses per-parsing
trashbox for deleting the anonymous tokens. The objects linked to
the trashbox are destroyed at the end of parsing current binput
file.

Signed-off-by: Masatake YAMATO <[email protected]>
masatake added a commit to masatake/ctags that referenced this issue May 2, 2022
…meters

Close universal-ctags#3372.

The original code assumed tokens allocated in
cxxParserTokenChainLooksLikeFunctionParameterList() for tagging
anonymous parameters were deleted on its caller side
(cxxParserEmitFunctionParameterTags()).

However, in some conditions, cxxParserEmitFunctionParameterTags() are
not called. As results, the allocated tokens are not deleted.

To avoid the memory leaking, the deletion cannot depend on
cxxParserEmitFunctionParameterTags(). This change uses per-parsing
trashbox for deleting the anonymous tokens. The objects linked to
the trashbox are destroyed at the end of parsing current binput
file.

Signed-off-by: Masatake YAMATO <[email protected]>
masatake added a commit to masatake/ctags that referenced this issue May 2, 2022
…meters

Close universal-ctags#3372.

The original code assumed tokens allocated in
cxxParserTokenChainLooksLikeFunctionParameterList() for tagging
anonymous parameters were deleted on its caller side
(cxxParserEmitFunctionParameterTags()).

However, in some conditions, cxxParserEmitFunctionParameterTags() are
not called. As results, the allocated tokens are not deleted.

To avoid the memory leaking, the deletion cannot depend on
cxxParserEmitFunctionParameterTags(). This change uses per-parsing
trashbox for deleting the anonymous tokens. The objects linked to
the trashbox are destroyed at the end of parsing current input
file.

Signed-off-by: Masatake YAMATO <[email protected]>
masatake added a commit to masatake/ctags that referenced this issue May 2, 2022
…meters

Close universal-ctags#3372.

The original code assumed tokens allocated in
cxxParserTokenChainLooksLikeFunctionParameterList() for tagging
anonymous parameters were deleted on its caller side
(cxxParserEmitFunctionParameterTags()).

However, in some conditions, cxxParserEmitFunctionParameterTags() are
not called. As a result, the allocated tokens are not deleted.

To avoid the memory leaking, the deletion cannot depend on
cxxParserEmitFunctionParameterTags(). This change uses per-parsing
trashbox for deleting the anonymous tokens. The objects linked to
the trashbox are destroyed at the end of parsing the current input
file.

Signed-off-by: Masatake YAMATO <[email protected]>
@masatake
Copy link
Member

masatake commented May 2, 2022

I made a new pull request for fixing this issue.

masatake added a commit to masatake/ctags that referenced this issue May 6, 2022
…meters

Partially close universal-ctags#3372.

The original code assumed tokens allocated in
cxxParserTokenChainLooksLikeFunctionParameterList() for tagging
anonymous parameters were deleted on its caller side
(cxxParserEmitFunctionParameterTags()).

However, in some conditions, cxxParserEmitFunctionParameterTags() are
not called. As a result, the allocated tokens are not deleted.

To avoid the memory leaking, the deletion cannot depend on
cxxParserEmitFunctionParameterTags(). This change uses per-parsing
trashbox for deleting the anonymous tokens. The objects linked to
the trashbox are destroyed at the end of parsing the current input
file.

Signed-off-by: Masatake YAMATO <[email protected]>
@masatake
Copy link
Member

masatake commented May 6, 2022

About the example input shown in the initial comment, #3375 fixes the memory leak.
So I will merge #3375.
However, #3375 doesn't fix the memory leaks about the 2nd example input (#3372 (comment)) and its shortened version g (a () b c)) {. The memory leak is reproduced even if --extras=-{anonymous} is given.
So I will keep this open.

masatake added a commit to masatake/ctags that referenced this issue May 6, 2022
…meters

Partially close universal-ctags#3372.

The original code assumed tokens allocated in
cxxParserTokenChainLooksLikeFunctionParameterList() for tagging
anonymous parameters were deleted on its caller side
(cxxParserEmitFunctionParameterTags()).

However, in some conditions, cxxParserEmitFunctionParameterTags() are
not called. As a result, the allocated tokens are not deleted.

To avoid the memory leaking, the deletion cannot depend on
cxxParserEmitFunctionParameterTags(). This change uses per-parsing
trashbox for deleting the anonymous tokens. The objects linked to
the trashbox are destroyed at the end of parsing the current input
file.

Signed-off-by: Masatake YAMATO <[email protected]>
masatake added a commit to masatake/ctags that referenced this issue May 6, 2022
…meters

Partially close universal-ctags#3372.

The original code assumed tokens allocated in
cxxParserTokenChainLooksLikeFunctionParameterList() for tagging
anonymous parameters were deleted on its caller side
(cxxParserEmitFunctionParameterTags()).

However, in some conditions, cxxParserEmitFunctionParameterTags() are
not called. As a result, the allocated tokens are not deleted.

To avoid the memory leaking, the deletion cannot depend on
cxxParserEmitFunctionParameterTags(). This change uses per-parsing
trashbox for deleting the anonymous tokens. The objects linked to
the trashbox are destroyed at the end of parsing the current input
file.

Signed-off-by: Masatake YAMATO <[email protected]>
@masatake masatake reopened this May 7, 2022
techee added a commit to techee/geany that referenced this issue May 13, 2022
…meters

This is a cherry-picked commit

55ba7264c0722c9519c0fdfacdf2363ad1c59e94

from uctags to address a memory leak reported in

universal-ctags/ctags#3372
@masatake masatake added the C/C++ label Jun 19, 2022
@masatake
Copy link
Member

When I tried reproducing the leak today on Fedora36, I couldn't.

$ cat /tmp/foo.c              
G_DEFINE_TYPE_WITH_CODE (GBufferedInputStream,
			 g_buffered_input_stream,
			 G_TYPE_FILTER_INPUT_STREAM,
                         G_ADD_PRIVATE (GBufferedInputStream)
			 G_IMPLEMENT_INTERFACE (G_TYPE_SEEKABLE,
						g_buffered_input_stream_seekable_iface_init))

static void
g_buffered_input_stream_class_init (GBufferedInputStreamClass *klass)
{
}
$ ./ctags --version
Universal Ctags 5.9.0(cbdc5c4b6), Copyright (C) 2015-2022 Universal Ctags Team
Universal Ctags is derived from Exuberant Ctags.
Exuberant Ctags 5.8, Copyright (C) 1996-2009 Darren Hiebert
  Compiled: Jun 20 2022, 00:48:48
  URL: https://ctags.io/
  Optional compiled features: +wildcards, +regex, +iconv, +option-directory, +xpath, +json, +interactive, +sandbox, +yaml, +packcc, +optscript, +pcre2
$ valgrind --leak-check=full ./ctags /tmp/foo.c
==2184768== Memcheck, a memory error detector
==2184768== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==2184768== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==2184768== Command: ./ctags /tmp/foo.c
==2184768== 
==2184768== 
==2184768== HEAP SUMMARY:
==2184768==     in use at exit: 18,398 bytes in 665 blocks
==2184768==   total heap usage: 9,634 allocs, 8,969 frees, 526,786 bytes allocated
==2184768== 
==2184768== LEAK SUMMARY:
==2184768==    definitely lost: 0 bytes in 0 blocks
==2184768==    indirectly lost: 0 bytes in 0 blocks
==2184768==      possibly lost: 0 bytes in 0 blocks
==2184768==    still reachable: 18,398 bytes in 665 blocks
==2184768==         suppressed: 0 bytes in 0 blocks
==2184768== Reachable blocks (those to which a pointer was found) are not shown.
==2184768== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==2184768== 
==2184768== For lists of detected and suppressed errors, rerun with: -s
==2184768== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

@techee
Copy link
Contributor Author

techee commented Jun 20, 2022

Can't reproduce it on Debian 11 either (previously I used Debian 10 but with the latest valgrind). Instead, I'm getting

==8354== Conditional jump or move depends on uninitialised value(s)
==8354==    at 0x15BC98: directiveHash (cpreprocessor.c:1153)
==8354==    by 0x15BC98: handleDirective (cpreprocessor.c:1208)
==8354==    by 0x15BC98: cppGetc (cpreprocessor.c:1871)
==8354==    by 0x162843: cxxParserSkipToNonWhiteSpace (cxx_parser_tokenizer.c:32)
==8354==    by 0x162843: cxxParserParseNextToken (cxx_parser_tokenizer.c:1248)
==8354==    by 0x15D85F: cxxParserParseAndCondenseSubchainsUpToOneOf (cxx_parser.c:196)
==8354==    by 0x15D85F: cxxParserParseUpToOneOf (cxx_parser.c:333)
==8354==    by 0x15D85F: cxxParserParseEnum (cxx_parser.c:898)
==8354==    by 0x15F367: cxxParserParseBlockInternal (cxx_parser_block.c:395)
==8354==    by 0x15F63F: cxxParserParseBlock (cxx_parser_block.c:798)
==8354==    by 0x15CDDB: cxxParserMain (cxx_parser.c:1922)
==8354==    by 0x13E71F: createTagsForFile (parse.c:3759)
==8354==    by 0x13E71F: createTagsWithFallback1 (parse.c:3907)
==8354==    by 0x13EA0B: createTagsWithFallback (parse.c:3998)
==8354==    by 0x13EA0B: parseMio (parse.c:4160)
==8354==    by 0x13EB53: parseFileWithMio (parse.c:4207)
==8354==    by 0x12FCC3: createTagsForEntry (main.c:221)
==8354==    by 0x12FB1F: recurseUsingOpendir (main.c:115)
==8354==    by 0x12FB1F: recurseIntoDirectory (main.c:184)
==8354==    by 0x12FC9B: createTagsForEntry (main.c:215)

but that seems to be like a false positive because I don't see anything wrong with the code. So maybe this issue can be closed.

@masatake
Copy link
Member

masatake commented Jun 20, 2022

Thank you for trying.

What kind of input did you give to ctags for making valgrind report the "Conditional jump or move depends on uninitialised value(s)" ?

@techee
Copy link
Contributor Author

techee commented Jun 20, 2022

It was

G_SLICE=always-malloc G_DEBUG=gc-friendly:resident-modules valgrind --tool=memcheck --leak-check=full --leak-resolution=high --log-file=vgdump --suppressions=/usr/share/glib-2.0/valgrind/glib.supp ctags --kinds-C='*' -R ~/Downloads/glib-2.70.2

with valgrind 3.16.1.

On Debian 10 for which I reported this issue originally it was valgrind 3.19.0 compiled form sources (the shipped valgrind version didn't work correctly on my ARM-based macbook with parallels desktop linux VM). Yeah, I'm making by best not to make bugs reported by me reproducible by anyone else ;-).

@masatake
Copy link
Member

Thank you.
I'll close this one in this time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment