Kaitai_struct compiler has sporadic problems importing kaitai files and compiling (python) code #951

andersalsdu · 2022-02-16T14:13:51Z

We compile Python code and have the impression that the Kaitai_struct compiler does not always work deterministically. Sometimes the compiler claims not to find certain types from imported other kaitai files. We have a main kaitai file which imports six other kaitai files using the toplevel sequencies as subtypes. When compiling again, it then works.

This behaviour occurs from time to time, but its unpredictable.
What is interesting is that the error code says always that different types are not found!

This could have to do with the fact that maybe certain structures are not available at compilation time because the compilation of the subfiles takes place in parallel.

andersalsdu · 2022-07-22T12:17:58Z

Hello,
can you tell me if this topic is still being addressed? 😃

generalmimon · 2022-07-22T13:26:59Z

@andersalsdu:

can you tell me if this topic is still being addressed? 😃

I might look into it sometime, but I'm not sure whether I'll be able to reproduce it.

If you could prepare and share a small reproducible example of .ksy specifications, that would help. It may be easier for you than if I had to do it by blind experimentation, because you can use the specs on which it was happening in the first place as a inspiration and see what's unusual there.

generalmimon · 2022-07-22T13:31:46Z

But generally yes, I know that imports are broken in several aspects (e.g. #295), and I plan to fix them eventually. So this issue may disappear along with it.

andersalsdu · 2022-07-23T15:14:12Z

@generalmimon thank you for your quick reply.
I will see if I can create a small example for you in the next few days.

andersalsdu · 2022-11-16T10:11:58Z

Hello,
I cannot give out my data for reasons of intellectual property. I have tried to create dummy data with the same error, but I have not succeeded. 😞

generalmimon · 2024-03-06T22:04:13Z

I managed to reproduce the issue. Turns out it wasn't that hard to reproduce and seems to happen almost consistently once you reach some (minimum) number of imports. I used the following script to generate 1 main and 50 imported specs:

Python script to generate gen/main.ksy and gen/impXX.ksy test specs

import random
from pathlib import Path

NUM_IMPORTS = 50
SHUFFLE_IMPORTS = True

NUM_DEC_PLACES = len(str(NUM_IMPORTS - 1))

def fmt_i(i):
    return str(i).zfill(NUM_DEC_PLACES)

GEN_DIR = Path('gen')
GEN_DIR.mkdir(exist_ok=True)

import_order = use_order = range(NUM_IMPORTS)

if SHUFFLE_IMPORTS:
    import_order = list(import_order)
    random.shuffle(import_order)

with open(GEN_DIR / 'main.ksy', 'w', encoding='utf-8') as main_f:
    meta = '''\
meta:
  id: main
  imports:
'''
    main_f.write(meta)
    for i in import_order:
        main_f.write(f'    - imp{fmt_i(i)}\n')
    main_f.write('seq:\n')
    for i in use_order:
        main_f.write(f'''\
   - id: f{fmt_i(i)}
     type: imp{fmt_i(i)}
''')

for i in import_order:
    try:
        with open(GEN_DIR / f'imp{fmt_i(i)}.ksy', 'x', encoding='utf-8') as imp_f:
            meta = f'''\
meta:
  id: imp{fmt_i(i)}
'''
            imp_f.write(meta)
    except FileExistsError:
        pass

Usage: python gen-ksy.py (gen-ksy.py is my working name of the script) - it will create a gen/ folder with all the .ksy specs.

Note: Randomizing the order of imports using random.shuffle (and controlled by the SHUFFLE_IMPORTS boolean constant) isn't strictly necessary because the issue is presumably reproducible with any order (even the sequential one). I just wanted to eliminate the possibility that the sequential order is somehow suboptimal for getting race conditions.

The generated .ksy specs are as simple as they can be. The main.ksy file has the following structure:

meta:
  id: main
  imports:
    - imp27
    - imp08
    # ...
seq:
  - id: f00
    type: imp00
  - id: f01
    type: imp01
  # ...

The imported specs have nothing but the /meta/id:

meta:
  id: imp17

Now, if you run this command a few times:

kaitai-struct-compiler --ksc-exceptions --outdir outdir -t python gen/main.ksy

... you should be occasionally able to run into something like:

gen/main.ksy: /seq/24:
	error: unable to find type 'imp24', searching from main

Sometimes, multiple errors are reported at once:

gen/main.ksy: /seq/27:
	error: unable to find type 'imp27', searching from main

gen/main.ksy: /seq/46:
	error: unable to find type 'imp46', searching from main

Overall, the reproduction rate is around 30%. Interestingly enough, increasing the number of imports (i.e. the NUM_IMPORTS constant in the Python script above) doesn't seem to have any significant effect on this rate - 50 imports seems about optimal.

Note that when you add e.g. --verbose all to see what's going on, the reproduction rate goes down dramatically (I'm guessing something like 1% instead of 30% without it), but it's still possible to reproduce. I suppose the output streams are implicitly synchronized in Scala, so it lowers the chance of the race condition (which usually results in some imported types being lost), but doesn't completely eliminate it.

When reproducing this originally reported error, I've also seen java.lang.ArrayIndexOutOfBoundsException thrown from the internal implementation of scala.collection.mutable.HashMap. This had much lower probability than the unable to find type error, it only happened in about 1 in 3000 tries. Nevertheless, it's a more interesting consequence of the race condition than the unable to find type error, because it pinpoints the problem(s) exactly via a stack trace.

Actually, it was possible to get two kinds of java.lang.ArrayIndexOutOfBoundsException:

from the mutable.HashMap.get() call in JavaClassSpecs$.cached() - JavaClassSpecs.scala:62

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 23 out of bounds for length 16
	at scala.collection.mutable.HashMap.get(HashMap.scala:86)
	at io.kaitai.struct.formats.JavaClassSpecs$.cached(JavaClassSpecs.scala:62)
	at io.kaitai.struct.formats.JavaClassSpecs.$anonfun$importRelative$1(JavaClassSpecs.scala:23)
	at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:687)
	at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:467)
	at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1395)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)

This is a consequence of unsynchronized concurrent mutable accesses to these non-thread-safe hash maps (JavaClassSpecs.scala:18-19):

  private val relFiles = mutable.Map[String, ClassSpec]()
  private val absFiles = mutable.Map[String, ClassSpec]()

i.e. here (JavaClassSpecs.scala:71):

          cacheMap(name) = spec

from the mutable.HashMap.contains() call in LoadImports.loadImport() - LoadImports.scala:79

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 42 out of bounds for length 32
	at scala.collection.mutable.HashMap.contains(HashMap.scala:86)
	at io.kaitai.struct.precompile.LoadImports.$anonfun$loadImport$2(LoadImports.scala:87)
	at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:470)
	at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1395)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)

This is a consequence of these unsynchronized concurrent accesses (LoadImports.scala:79-80):

          if (!specs.contains(specName)) {
            specs(specName) = spec

... to specs shared between threads, which is a ClassSpecs instance (LoadImports.scala:15):

class LoadImports(specs: ClassSpecs) {

... and ClassSpecs inherits from mutable.HashMap, which is again non-thread-safe (ClassSpecs.scala:14):

abstract class ClassSpecs(val firstSpec: ClassSpec) extends mutable.HashMap[String, ClassSpec] {

So for both these kinds of hash maps, synchronization needs to be added to solve the problem.

Fixes kaitai-io/kaitai_struct#951 Until now, KSC had occasional problems with resolving imported top-level types, reporting `unable to find type '<imported>'` errors from time to time (see kaitai-io/kaitai_struct#951 for more details and reproduction instructions). It turned out that the importing code ran on multiple threads that were concurrently modifying/reading shared non-thread-safe `HashMap`s without any synchronization, which resulted in race conditions. Most of the time, the resulting `ClassSpecs` object (which is just our small wrapper around `scala.collection.mutable.HashMap`) happened to contain all types that were imported, but when a race condition occurred when the imported types were being added (more specifically, in the `LoadImports.loadImport()` method), usually a few imported types went missing. Another possible observed consequence of this race condition (but much less likely) was that the `!specs.contains(specName)` check in the `loadImport` method failed, i.e. raised a `java.lang.ArrayIndexOutOfBoundsException` from the internal implementation of the `mutable.HashMap.contains()` method, suggesting that the internal fields of the `HashMap` were in an inconsistent state. There was also a small probability for another race condition, this time in the `cached` method of the `JavaClassSpecs` object. This also resulted in a `java.lang.ArrayIndexOutOfBoundsException` exception thrown from within the `mutable.HashMap.get()` method. However, this was referrering to different `HashMap`s than the `ClassSpecs` container, namely in the private `relFiles` and `absFiles` fields internal to `JavaClassSpecs`. I thought it would be best to eliminate the race conditions by switching to some thread-safe counterpart to `mutable.HashMap`. After some googling, it turned out that there are two viable options, [`scala.collection.concurrent.TrieMap`](https://www.scala-lang.org/api/2.13.13/scala/collection/concurrent/TrieMap.html) and [`java.util.concurrent.ConcurrentHashMap`](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html). However, `TrieMap` doesn't seem to be implemented for Scala.js (scala-js/scala-js#4866) and it is `final`, so we cannot use it as a base class of `ClassSpecs` instead of `mutable.HashMap` that we're using now. `ConcurrentHashMap` (despite being in the `java.util.concurrent.*` package, indicating that it would only be available on JVM) surprisingly appears to be available in Scala.js (scala-js/scala-js#1487), so in theory we could use it even in `shared/src/...` code, but I haven't figured out how to make `ClassSpecs` extend from it. Well, it must be added that since Scala 2.13, the inheritance from `HashMap` as `ClassSpecs` does it was deprecated. The unsynchronized hash map accesses in the `JavaClassSpecs.cached()` are in JVM-specific code (`jvm/src/...` folder), so I've just changed the private `relFiles` and `absFiles` fields to use `ConcurrentHashMap`, thus resolving the race conditions here. (`TrieMap` would work too - there's probably no practical difference for our use case.) For the `LoadImports.loadImport()` method, I've just wrapped the accesses to shared `ClassSpecs` in a manual `synchronized` block. According to some on the internet, using `synchronized` is kind of discouraged in favor of using existing thread-safe types or immutable types (Scala generally seems to prefer immutable types, but I can't imagine how they could be used in this case), but I believe it's the easiest way to solve the problem.

Fixes kaitai-io/kaitai_struct#951 Until now, KSC had occasional problems with resolving imported top-level types, reporting `unable to find type '<imported>'` errors from time to time (see kaitai-io/kaitai_struct#951 for more details and reproduction instructions). It turned out that the importing code ran on multiple threads that were concurrently modifying/reading shared non-thread-safe `HashMap`s without any synchronization, which resulted in race conditions. I thought it would be best to eliminate the race conditions by switching to some thread-safe counterpart to `mutable.HashMap`. After some googling, it turned out that there are two viable options, [`scala.collection.concurrent.TrieMap`](https://www.scala-lang.org/api/2.13.13/scala/collection/concurrent/TrieMap.html) and [`java.util.concurrent.ConcurrentHashMap`](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html). However, `TrieMap` doesn't seem to be implemented for Scala.js (scala-js/scala-js#4866) and it is `final`, so we cannot use it as a base class of `ClassSpecs` instead of `mutable.HashMap` that we're using now. `ConcurrentHashMap` (despite being in the `java.util.concurrent.*` package, suggesting that it would only be available on JVM) surprisingly appears to be available in Scala.js (scala-js/scala-js#1487), so in theory we could use it even in `shared/src/...` code, but I haven't figured out how to make `ClassSpecs` extend from it. It must be added that since Scala 2.13, the inheritance from `HashMap` was deprecated, so we'll have to find another way, but for now it works. The unsynchronized hash map accesses in the `JavaClassSpecs.cached()` are in JVM-specific code (`jvm/src/...` folder), so I've just changed the private `relFiles` and `absFiles` fields to use `ConcurrentHashMap`, thus resolving the race conditions here. (`TrieMap` would work too - there's probably no practical difference for our use case.) For the `LoadImports.loadImport()` method, I've just wrapped the accesses to shared `ClassSpecs` in a manual `synchronized` block. According to some on the internet, using `synchronized` is kind of discouraged due to being error-prone in favor of using existing thread-safe types or immutable types (Scala generally seems to prefer immutable types, but I can't imagine how they could be used in this case), but I believe it's the easiest way to solve the problem here.

andersalsdu · 2024-03-12T12:50:32Z

Good work and many thanks. Looking forward to using the fixed version. 👍

generalmimon added the bug found label Feb 19, 2022

generalmimon mentioned this issue Mar 6, 2024

Fix occasional failed imports due to race conditions kaitai-io/kaitai_struct_compiler#270

Merged

GreyCat closed this as completed in kaitai-io/kaitai_struct_compiler#270 Mar 9, 2024

generalmimon added this to the v0.11 milestone Jan 3, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Kaitai_struct compiler has sporadic problems importing kaitai files and compiling (python) code #951

Kaitai_struct compiler has sporadic problems importing kaitai files and compiling (python) code #951

andersalsdu commented Feb 16, 2022 •

edited

Loading

andersalsdu commented Jul 22, 2022

generalmimon commented Jul 22, 2022

generalmimon commented Jul 22, 2022

andersalsdu commented Jul 23, 2022

andersalsdu commented Nov 16, 2022

generalmimon commented Mar 6, 2024 •

edited

Loading

andersalsdu commented Mar 12, 2024

Kaitai_struct compiler has sporadic problems importing kaitai files and compiling (python) code #951

Kaitai_struct compiler has sporadic problems importing kaitai files and compiling (python) code #951

Comments

andersalsdu commented Feb 16, 2022 • edited Loading

andersalsdu commented Jul 22, 2022

generalmimon commented Jul 22, 2022

generalmimon commented Jul 22, 2022

andersalsdu commented Jul 23, 2022

andersalsdu commented Nov 16, 2022

generalmimon commented Mar 6, 2024 • edited Loading

andersalsdu commented Mar 12, 2024

andersalsdu commented Feb 16, 2022 •

edited

Loading

generalmimon commented Mar 6, 2024 •

edited

Loading