-
Notifications
You must be signed in to change notification settings - Fork 198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kaitai_struct compiler has sporadic problems importing kaitai files and compiling (python) code #951
Kaitai_struct compiler has sporadic problems importing kaitai files and compiling (python) code #951
Comments
Hello, |
I might look into it sometime, but I'm not sure whether I'll be able to reproduce it. If you could prepare and share a small reproducible example of |
But generally yes, I know that imports are broken in several aspects (e.g. #295), and I plan to fix them eventually. So this issue may disappear along with it. |
@generalmimon thank you for your quick reply. |
Hello, |
I managed to reproduce the issue. Turns out it wasn't that hard to reproduce and seems to happen almost consistently once you reach some (minimum) number of imports. I used the following script to generate 1 main and 50 imported specs: Python script to generate
|
Fixes kaitai-io/kaitai_struct#951 Until now, KSC had occasional problems with resolving imported top-level types, reporting `unable to find type '<imported>'` errors from time to time (see kaitai-io/kaitai_struct#951 for more details and reproduction instructions). It turned out that the importing code ran on multiple threads that were concurrently modifying/reading shared non-thread-safe `HashMap`s without any synchronization, which resulted in race conditions. Most of the time, the resulting `ClassSpecs` object (which is just our small wrapper around `scala.collection.mutable.HashMap`) happened to contain all types that were imported, but when a race condition occurred when the imported types were being added (more specifically, in the `LoadImports.loadImport()` method), usually a few imported types went missing. Another possible observed consequence of this race condition (but much less likely) was that the `!specs.contains(specName)` check in the `loadImport` method failed, i.e. raised a `java.lang.ArrayIndexOutOfBoundsException` from the internal implementation of the `mutable.HashMap.contains()` method, suggesting that the internal fields of the `HashMap` were in an inconsistent state. There was also a small probability for another race condition, this time in the `cached` method of the `JavaClassSpecs` object. This also resulted in a `java.lang.ArrayIndexOutOfBoundsException` exception thrown from within the `mutable.HashMap.get()` method. However, this was referrering to different `HashMap`s than the `ClassSpecs` container, namely in the private `relFiles` and `absFiles` fields internal to `JavaClassSpecs`. I thought it would be best to eliminate the race conditions by switching to some thread-safe counterpart to `mutable.HashMap`. After some googling, it turned out that there are two viable options, [`scala.collection.concurrent.TrieMap`](https://www.scala-lang.org/api/2.13.13/scala/collection/concurrent/TrieMap.html) and [`java.util.concurrent.ConcurrentHashMap`](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html). However, `TrieMap` doesn't seem to be implemented for Scala.js (scala-js/scala-js#4866) and it is `final`, so we cannot use it as a base class of `ClassSpecs` instead of `mutable.HashMap` that we're using now. `ConcurrentHashMap` (despite being in the `java.util.concurrent.*` package, indicating that it would only be available on JVM) surprisingly appears to be available in Scala.js (scala-js/scala-js#1487), so in theory we could use it even in `shared/src/...` code, but I haven't figured out how to make `ClassSpecs` extend from it. Well, it must be added that since Scala 2.13, the inheritance from `HashMap` as `ClassSpecs` does it was deprecated. The unsynchronized hash map accesses in the `JavaClassSpecs.cached()` are in JVM-specific code (`jvm/src/...` folder), so I've just changed the private `relFiles` and `absFiles` fields to use `ConcurrentHashMap`, thus resolving the race conditions here. (`TrieMap` would work too - there's probably no practical difference for our use case.) For the `LoadImports.loadImport()` method, I've just wrapped the accesses to shared `ClassSpecs` in a manual `synchronized` block. According to some on the internet, using `synchronized` is kind of discouraged in favor of using existing thread-safe types or immutable types (Scala generally seems to prefer immutable types, but I can't imagine how they could be used in this case), but I believe it's the easiest way to solve the problem.
Fixes kaitai-io/kaitai_struct#951 Until now, KSC had occasional problems with resolving imported top-level types, reporting `unable to find type '<imported>'` errors from time to time (see kaitai-io/kaitai_struct#951 for more details and reproduction instructions). It turned out that the importing code ran on multiple threads that were concurrently modifying/reading shared non-thread-safe `HashMap`s without any synchronization, which resulted in race conditions. I thought it would be best to eliminate the race conditions by switching to some thread-safe counterpart to `mutable.HashMap`. After some googling, it turned out that there are two viable options, [`scala.collection.concurrent.TrieMap`](https://www.scala-lang.org/api/2.13.13/scala/collection/concurrent/TrieMap.html) and [`java.util.concurrent.ConcurrentHashMap`](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html). However, `TrieMap` doesn't seem to be implemented for Scala.js (scala-js/scala-js#4866) and it is `final`, so we cannot use it as a base class of `ClassSpecs` instead of `mutable.HashMap` that we're using now. `ConcurrentHashMap` (despite being in the `java.util.concurrent.*` package, suggesting that it would only be available on JVM) surprisingly appears to be available in Scala.js (scala-js/scala-js#1487), so in theory we could use it even in `shared/src/...` code, but I haven't figured out how to make `ClassSpecs` extend from it. It must be added that since Scala 2.13, the inheritance from `HashMap` was deprecated, so we'll have to find another way, but for now it works. The unsynchronized hash map accesses in the `JavaClassSpecs.cached()` are in JVM-specific code (`jvm/src/...` folder), so I've just changed the private `relFiles` and `absFiles` fields to use `ConcurrentHashMap`, thus resolving the race conditions here. (`TrieMap` would work too - there's probably no practical difference for our use case.) For the `LoadImports.loadImport()` method, I've just wrapped the accesses to shared `ClassSpecs` in a manual `synchronized` block. According to some on the internet, using `synchronized` is kind of discouraged due to being error-prone in favor of using existing thread-safe types or immutable types (Scala generally seems to prefer immutable types, but I can't imagine how they could be used in this case), but I believe it's the easiest way to solve the problem here.
Fixes kaitai-io/kaitai_struct#951 Until now, KSC had occasional problems with resolving imported top-level types, reporting `unable to find type '<imported>'` errors from time to time (see kaitai-io/kaitai_struct#951 for more details and reproduction instructions). It turned out that the importing code ran on multiple threads that were concurrently modifying/reading shared non-thread-safe `HashMap`s without any synchronization, which resulted in race conditions. I thought it would be best to eliminate the race conditions by switching to some thread-safe counterpart to `mutable.HashMap`. After some googling, it turned out that there are two viable options, [`scala.collection.concurrent.TrieMap`](https://www.scala-lang.org/api/2.13.13/scala/collection/concurrent/TrieMap.html) and [`java.util.concurrent.ConcurrentHashMap`](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html). However, `TrieMap` doesn't seem to be implemented for Scala.js (scala-js/scala-js#4866) and it is `final`, so we cannot use it as a base class of `ClassSpecs` instead of `mutable.HashMap` that we're using now. `ConcurrentHashMap` (despite being in the `java.util.concurrent.*` package, suggesting that it would only be available on JVM) surprisingly appears to be available in Scala.js (scala-js/scala-js#1487), so in theory we could use it even in `shared/src/...` code, but I haven't figured out how to make `ClassSpecs` extend from it. It must be added that since Scala 2.13, the inheritance from `HashMap` was deprecated, so we'll have to find another way, but for now it works. The unsynchronized hash map accesses in the `JavaClassSpecs.cached()` are in JVM-specific code (`jvm/src/...` folder), so I've just changed the private `relFiles` and `absFiles` fields to use `ConcurrentHashMap`, thus resolving the race conditions here. (`TrieMap` would work too - there's probably no practical difference for our use case.) For the `LoadImports.loadImport()` method, I've just wrapped the accesses to shared `ClassSpecs` in a manual `synchronized` block. According to some on the internet, using `synchronized` is kind of discouraged due to being error-prone in favor of using existing thread-safe types or immutable types (Scala generally seems to prefer immutable types, but I can't imagine how they could be used in this case), but I believe it's the easiest way to solve the problem here.
Fixes kaitai-io/kaitai_struct#951 Until now, KSC had occasional problems with resolving imported top-level types, reporting `unable to find type '<imported>'` errors from time to time (see kaitai-io/kaitai_struct#951 for more details and reproduction instructions). It turned out that the importing code ran on multiple threads that were concurrently modifying/reading shared non-thread-safe `HashMap`s without any synchronization, which resulted in race conditions. I thought it would be best to eliminate the race conditions by switching to some thread-safe counterpart to `mutable.HashMap`. After some googling, it turned out that there are two viable options, [`scala.collection.concurrent.TrieMap`](https://www.scala-lang.org/api/2.13.13/scala/collection/concurrent/TrieMap.html) and [`java.util.concurrent.ConcurrentHashMap`](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html). However, `TrieMap` doesn't seem to be implemented for Scala.js (scala-js/scala-js#4866) and it is `final`, so we cannot use it as a base class of `ClassSpecs` instead of `mutable.HashMap` that we're using now. `ConcurrentHashMap` (despite being in the `java.util.concurrent.*` package, suggesting that it would only be available on JVM) surprisingly appears to be available in Scala.js (scala-js/scala-js#1487), so in theory we could use it even in `shared/src/...` code, but I haven't figured out how to make `ClassSpecs` extend from it. It must be added that since Scala 2.13, the inheritance from `HashMap` was deprecated, so we'll have to find another way, but for now it works. The unsynchronized hash map accesses in the `JavaClassSpecs.cached()` are in JVM-specific code (`jvm/src/...` folder), so I've just changed the private `relFiles` and `absFiles` fields to use `ConcurrentHashMap`, thus resolving the race conditions here. (`TrieMap` would work too - there's probably no practical difference for our use case.) For the `LoadImports.loadImport()` method, I've just wrapped the accesses to shared `ClassSpecs` in a manual `synchronized` block. According to some on the internet, using `synchronized` is kind of discouraged due to being error-prone in favor of using existing thread-safe types or immutable types (Scala generally seems to prefer immutable types, but I can't imagine how they could be used in this case), but I believe it's the easiest way to solve the problem here.
Fixes kaitai-io/kaitai_struct#951 Until now, KSC had occasional problems with resolving imported top-level types, reporting `unable to find type '<imported>'` errors from time to time (see kaitai-io/kaitai_struct#951 for more details and reproduction instructions). It turned out that the importing code ran on multiple threads that were concurrently modifying/reading shared non-thread-safe `HashMap`s without any synchronization, which resulted in race conditions. I thought it would be best to eliminate the race conditions by switching to some thread-safe counterpart to `mutable.HashMap`. After some googling, it turned out that there are two viable options, [`scala.collection.concurrent.TrieMap`](https://www.scala-lang.org/api/2.13.13/scala/collection/concurrent/TrieMap.html) and [`java.util.concurrent.ConcurrentHashMap`](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html). However, `TrieMap` doesn't seem to be implemented for Scala.js (scala-js/scala-js#4866) and it is `final`, so we cannot use it as a base class of `ClassSpecs` instead of `mutable.HashMap` that we're using now. `ConcurrentHashMap` (despite being in the `java.util.concurrent.*` package, suggesting that it would only be available on JVM) surprisingly appears to be available in Scala.js (scala-js/scala-js#1487), so in theory we could use it even in `shared/src/...` code, but I haven't figured out how to make `ClassSpecs` extend from it. It must be added that since Scala 2.13, the inheritance from `HashMap` was deprecated, so we'll have to find another way, but for now it works. The unsynchronized hash map accesses in the `JavaClassSpecs.cached()` are in JVM-specific code (`jvm/src/...` folder), so I've just changed the private `relFiles` and `absFiles` fields to use `ConcurrentHashMap`, thus resolving the race conditions here. (`TrieMap` would work too - there's probably no practical difference for our use case.) For the `LoadImports.loadImport()` method, I've just wrapped the accesses to shared `ClassSpecs` in a manual `synchronized` block. According to some on the internet, using `synchronized` is kind of discouraged due to being error-prone in favor of using existing thread-safe types or immutable types (Scala generally seems to prefer immutable types, but I can't imagine how they could be used in this case), but I believe it's the easiest way to solve the problem here.
Good work and many thanks. Looking forward to using the fixed version. 👍 |
We compile Python code and have the impression that the Kaitai_struct compiler does not always work deterministically. Sometimes the compiler claims not to find certain types from imported other kaitai files. We have a main kaitai file which imports six other kaitai files using the toplevel sequencies as subtypes. When compiling again, it then works.
This behaviour occurs from time to time, but its unpredictable.
What is interesting is that the error code says always that different types are not found!
This could have to do with the fact that maybe certain structures are not available at compilation time because the compilation of the subfiles takes place in parallel.
The text was updated successfully, but these errors were encountered: