Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tsunami wrapper in production: runs indefinitely #98

Closed
MichaelLangbein opened this issue May 3, 2023 · 12 comments · Fixed by #102
Closed

Tsunami wrapper in production: runs indefinitely #98

MichaelLangbein opened this issue May 3, 2023 · 12 comments · Fixed by #102
Assignees

Comments

@MichaelLangbein
Copy link
Contributor

In production, the tsunami wrapper never gets further than the state running.

Here's the docker-log for the tsunami wrapper after receiving an order:

order message handler receiver: {"orderId":1}
2023-05-03 10:19:18,329 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Got new order to handle with wrapper 1
2023-05-03 10:19:18,498 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Added literal input constraint for tsunami mag: 6.85
2023-05-03 10:19:18,499 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Added literal input constraint for tsunami lon: -71.37228
2023-05-03 10:19:18,500 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Added literal input constraint for tsunami lat: -30.92143
2023-05-03 10:19:18,501 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Parsed constraints
2023-05-03 10:19:18,511 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Define the literal constraints for the jobs
2023-05-03 10:19:18,515 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Define the complex constraints for the jobs
2023-05-03 10:19:18,516 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Define the bbox constraints for the jobs
2023-05-03 10:19:18,517 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Process job
2023-05-03 10:19:18,536 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Lookup the process id
2023-05-03 10:19:18,555 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Got the process id 4
2023-05-03 10:19:18,571 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Created one job
2023-05-03 10:19:18,578 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Added literal input for mag: 6.85
2023-05-03 10:19:18,583 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Added literal input for lon: -71.37228
2023-05-03 10:19:18,587 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Added literal input for lat: -30.92143
2023-05-03 10:19:18,591 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Added job to order
2023-05-03 10:19:18,593 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Start mapping to wps inputs
2023-05-03 10:19:18,597 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: retrieve getCapabilities document from  https://riesgos.52north.org/wps (retries: 0)
2023-05-03 10:19:19,782 [tsunami-asyncwrapper_new-order_subscription] INFO  org.n52.geoprocessing.wps.client.WPSClientSession: CONNECT
2023-05-03 10:19:20,787 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: retrieved getCapabilities document from https://riesgos.52north.org/wps (retries: 0)
2023-05-03 10:19:20,787 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Start calling the wps itself under https://riesgos.52north.org/wps/get_tsunamap
2023-05-03 10:19:20,790 [tsunami-asyncwrapper_new-order_subscription] INFO  WPSProcess: retrieve wps process description for get_tsunamap (retries: 0)
2023-05-03 10:19:21,048 [tsunami-asyncwrapper_new-order_subscription] INFO  WPSProcess: retrieved wps process description for get_tsunamap (retries: 0)
2023-05-03 10:19:21,139 [tsunami-asyncwrapper_new-order_subscription] INFO  WPSProcess: <wps:Execute xmlns:wps="http://www.opengis.net/wps/2.0" xmlns:ows="http://www.opengis.net/ows/2.0" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wps/2.0 http://schemas.opengis.net/wps/2.0/wps.xsd" response="document" mode="sync" service="WPS" version="2.0.0"><ows:Identifier>get_tsunamap</ows:Identifier><wps:Input id="lat"><wps:Data mimeType="text/xml"><wps:LiteralValue>-30.92143</wps:LiteralValue></wps:Data></wps:Input><wps:Input id="lon"><wps:Data mimeType="text/xml"><wps:LiteralValue>-71.37228</wps:LiteralValue></wps:Data></wps:Input><wps:Input id="mag"><wps:Data mimeType="text/xml"><wps:LiteralValue>6.85</wps:LiteralValue></wps:Data></wps:Input><wps:Output id="tsunamap" transmission="reference" mimeType="application/xml" schema="http://earthquake.usgs.gov/eqcenter/shakemap" encoding="UTF-8"></wps:Output></wps:Execute>
2023-05-03 10:19:21,142 [tsunami-asyncwrapper_new-order_subscription] INFO  WPSProcess: execute WPS process get_tsunamap (retries: 0)
2023-05-03 10:19:22,721 [tsunami-asyncwrapper_new-order_subscription] INFO  WPSProcess: execute WPS process get_tsunamap (retries: 1)
2023-05-03 10:19:23,875 [tsunami-asyncwrapper_new-order_subscription] INFO  WPSProcess: execute WPS process get_tsunamap (retries: 2)
2023-05-03 10:19:30,727 [pulsar-timer-5-1] INFO  org.apache.pulsar.client.impl.ConsumerStatsRecorderImpl: [new-order] [tsunami-asyncwrapper_new-order_subscription] [ac2a0] Prefetched messages: 1 --- Consume throughput received: 0.02 msgs/s --- 0.00 Mbit/s --- Ack sent rate: 0.00 ack/s --- Failed messages: 0 --- batch messages: 0 ---Failed acks: 0

@MichaelLangbein
Copy link
Contributor Author

Does not work:

<wps:Execute xmlns:wps="http://www.opengis.net/wps/2.0" xmlns:ows="http://www.opengis.net/ows/2.0" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wps/2.0 http://schemas.opengis.net/wps/2.0/wps.xsd" response="document" mode="sync" service="WPS" version="2.0.0">
    <ows:Identifier>get_tsunamap</ows:Identifier>
    <wps:Input id="lat">
        <wps:Data mimeType="text/xml">
            <wps:LiteralValue>-30.92143</wps:LiteralValue>
        </wps:Data>
    </wps:Input>
    <wps:Input id="lon">
        <wps:Data mimeType="text/xml">
            <wps:LiteralValue>-71.37228</wps:LiteralValue>
        </wps:Data>
    </wps:Input>
    <wps:Input id="mag">
        <wps:Data mimeType="text/xml">
            <wps:LiteralValue>6.85</wps:LiteralValue>
        </wps:Data>
    </wps:Input>
    <wps:Output id="tsunamap" transmission="reference" mimeType="application/xml" schema="http://earthquake.usgs.gov/eqcenter/shakemap" encoding="UTF-8"></wps:Output>
</wps:Execute>

Doesn't work, either ... even though it's one of AWIs pre-calculated simulations:

<wps:Execute xmlns:wps="http://www.opengis.net/wps/2.0" xmlns:ows="http://www.opengis.net/ows/2.0" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wps/2.0 http://schemas.opengis.net/wps/2.0/wps.xsd" response="document" mode="sync" service="WPS" version="2.0.0">
    <ows:Identifier>get_tsunamap</ows:Identifier>
    <wps:Input id="lat">
        <wps:Data mimeType="text/xml">
            <wps:LiteralValue>-32.9011</wps:LiteralValue>
        </wps:Data>
    </wps:Input>
    <wps:Input id="lon">
        <wps:Data mimeType="text/xml">
            <wps:LiteralValue>-71.8152</wps:LiteralValue>
        </wps:Data>
    </wps:Input>
    <wps:Input id="mag">
        <wps:Data mimeType="text/xml">
            <wps:LiteralValue>8.5</wps:LiteralValue>
        </wps:Data>
    </wps:Input>
    <wps:Output id="tsunamap" transmission="reference" mimeType="application/xml" schema="http://earthquake.usgs.gov/eqcenter/shakemap" encoding="UTF-8"></wps:Output>
</wps:Execute>

@arnevogt
Copy link
Member

arnevogt commented May 4, 2023

Do we actually have an example execute request that works?

@bpross-52n maybe?

@bpross-52n
Copy link
Member

I just tested this execute request successfully:

<wps:Execute xmlns:wps="http://www.opengis.net/wps/1.0.0" service="WPS" version="1.0.0">
    <p0:Identifier xmlns:p0="http://www.opengis.net/ows/1.1">get_tsunamap</p0:Identifier>
    <wps:DataInputs>
        <wps:Input>
            <p0:Identifier xmlns:p0="http://www.opengis.net/ows/1.1">lon</p0:Identifier>
            <p0:Title xmlns:p0="http://www.opengis.net/ows/1.1">lon</p0:Title>
            <p0:Abstract xmlns:p0="http://www.opengis.net/ows/1.1"></p0:Abstract>
            <wps:Data>
                <wps:LiteralData>-71.8004</wps:LiteralData>
            </wps:Data>
        </wps:Input>
        <wps:Input>
            <p0:Identifier xmlns:p0="http://www.opengis.net/ows/1.1">lat</p0:Identifier>
            <p0:Title xmlns:p0="http://www.opengis.net/ows/1.1">lat</p0:Title>
            <p0:Abstract xmlns:p0="http://www.opengis.net/ows/1.1"></p0:Abstract>
            <wps:Data>
                <wps:LiteralData>-32.9143</wps:LiteralData>
            </wps:Data>
        </wps:Input>
        <wps:Input>
            <p0:Identifier xmlns:p0="http://www.opengis.net/ows/1.1">mag</p0:Identifier>
            <p0:Title xmlns:p0="http://www.opengis.net/ows/1.1">mag</p0:Title>
            <p0:Abstract xmlns:p0="http://www.opengis.net/ows/1.1"></p0:Abstract>
            <wps:Data>
                <wps:LiteralData>8.0</wps:LiteralData>
            </wps:Data>
        </wps:Input>
    </wps:DataInputs>
    <wps:ResponseForm>
        <wps:ResponseDocument storeExecuteResponse="true" status="true">
            <wps:Output mimeType="application/xml" asReference="true">
                <p0:Identifier xmlns:p0="http://www.opengis.net/ows/1.1">tsunamap</p0:Identifier>
            </wps:Output>
        </wps:ResponseDocument>
    </wps:ResponseForm>
</wps:Execute>

@MichaelLangbein MichaelLangbein changed the title Tsunami wrapper in production: runs infinitely Tsunami wrapper in production: runs indefinitely May 4, 2023
@arnevogt
Copy link
Member

I tried to narrow down the reason for this error.
This time it is not the WPS Client Lib. I created a minimal example (see GIST).
The example works. Also the logged WPS execute request (1.0.0) works fine if copied to curl/Postman.

<wps:Execute xmlns:wps="http://www.opengis.net/wps/1.0.0" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wps/1.0.0 http://schemas.opengis.net/wps/1.0.0/wpsAll.xsd" service="WPS" version="1.0.0">
    <ows:Identifier>get_tsunamap</ows:Identifier>
    <wps:DataInputs>
        <wps:Input>
            <ows:Identifier>lat</ows:Identifier>
            <wps:Data>
                <wps:LiteralData>-30.0</wps:LiteralData>
            </wps:Data>
        </wps:Input>
        <wps:Input>
            <ows:Identifier>lon</ows:Identifier>
            <wps:Data>
                <wps:LiteralData>-70.0</wps:LiteralData>
            </wps:Data>
        </wps:Input>
        <wps:Input>
            <ows:Identifier>mag</ows:Identifier>
            <wps:Data>
                <wps:LiteralData>8.5</wps:LiteralData>
            </wps:Data>
        </wps:Input>
    </wps:DataInputs>
    <wps:ResponseForm>
        <wps:ResponseDocument>
            <wps:Output asReference="true" mimeType="application/xml" schema="http://earthquake.usgs.gov/eqcenter/shakemap" encoding="UTF-8">
                <ows:Identifier>tsunamap</ows:Identifier>
            </wps:Output>
        </wps:ResponseDocument>
    </wps:ResponseForm>
</wps:Execute>

The values for lat, lon and mag have to be as in the pre-calculated simulations. Otherwise the WPS responds with the exception message No such file or directory: (-71.8152, -32.9011) that we saw in the WPS log. The first example failed for this reason. The second example and the one Benni posted work actually fine.

If it is not the WPS Client Lib it could still be:

  • we still mix WPS versions somewhere in the wrapper so that the process is not executed using WPS 1.0.0
    • but I cannot find an issue in the source code
  • the UI does not pass the parameters to the wrapper correctly
    • maybe confusing lat, lon
  • we are chasing a phantom and just made a stupid mistake when creating the order

@MichaelLangbein
Copy link
Contributor Author

@arnevogt , @bpross-52n : my second request in this comment didn't work because it used WPS-V2. With WPS-V1 syntax it does indeed work fine.

@arnevogt : did you deploy your changes to production?

@MichaelLangbein
Copy link
Contributor Author

@arnevogt : I have deployed the latest state of main branch to production.

Tsunami wrapper does work fine if the input is an earthquake that tsuami-wps knows about.
However, if it isn't known (i.e if mag < 8), the wrapper still keeps hanging in running state.

Here's the tsunami-wrapper log for an order with mag < 8:

2023-05-15 09:11:09,422 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Got new order to handle with wrapper 3
2023-05-15 09:11:09,425 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Added literal input constraint for tsunami mag: 6.5
2023-05-15 09:11:09,425 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Added literal input constraint for tsunami lon: -69.13947
2023-05-15 09:11:09,425 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Added literal input constraint for tsunami lat: -33.59573
2023-05-15 09:11:09,425 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Parsed constraints
2023-05-15 09:11:09,425 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Define the literal constraints for the jobs
2023-05-15 09:11:09,425 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Define the complex constraints for the jobs
2023-05-15 09:11:09,425 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Define the bbox constraints for the jobs
2023-05-15 09:11:09,426 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Process job
2023-05-15 09:11:09,427 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Lookup the process id
2023-05-15 09:11:09,432 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Got the process id 3
2023-05-15 09:11:09,436 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Created one job
2023-05-15 09:11:09,438 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Added literal input for mag: 6.5
2023-05-15 09:11:09,442 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Added literal input for lon: -69.13947
2023-05-15 09:11:09,447 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Added literal input for lat: -33.59573
2023-05-15 09:11:09,449 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Added job to order
2023-05-15 09:11:09,450 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Start mapping to wps inputs
2023-05-15 09:11:09,450 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: retrieve getCapabilities document from  https://riesgos.52north.org/wps (retries: 0)
2023-05-15 09:11:09,452 [tsunami-asyncwrapper_new-order_subscription] INFO  org.n52.geoprocessing.wps.client.WPSClientSession: CONNECT
2023-05-15 09:11:09,569 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: retrieved getCapabilities document from https://riesgos.52north.org/wps (retries: 0)
2023-05-15 09:11:09,569 [tsunami-asyncwrapper_new-order_subscription] INFO  AbstractWrapper: Start calling the wps itself under https://riesgos.52north.org/wps/get_tsunamap
2023-05-15 09:11:09,569 [tsunami-asyncwrapper_new-order_subscription] INFO  WPSProcess: retrieve wps process description for get_tsunamap (retries: 0)
2023-05-15 09:11:09,666 [tsunami-asyncwrapper_new-order_subscription] INFO  WPSProcess: retrieved wps process description for get_tsunamap (retries: 0)
2023-05-15 09:11:09,670 [tsunami-asyncwrapper_new-order_subscription] INFO  WPSProcess: <wps:Execute xmlns:wps="http://www.opengis.net/wps/1.0.0" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wps/1.0.0 http://schemas.opengis.net/wps/1.0.0/wpsAll.xsd" service="WPS" version="1.0.0"><ows:Identifier>get_tsunamap</ows:Identifier><wps:DataInputs><wps:Input><ows:Identifier>lat</ows:Identifier><wps:Data><wps:LiteralData>-33.59573</wps:LiteralData></wps:Data></wps:Input><wps:Input><ows:Identifier>lon</ows:Identifier><wps:Data><wps:LiteralData>-69.13947</wps:LiteralData></wps:Data></wps:Input><wps:Input><ows:Identifier>mag</ows:Identifier><wps:Data><wps:LiteralData>6.5</wps:LiteralData></wps:Data></wps:Input></wps:DataInputs><wps:ResponseForm><wps:ResponseDocument><wps:Output asReference="true" mimeType="application/xml" schema="http://earthquake.usgs.gov/eqcenter/shakemap" encoding="UTF-8"><ows:Identifier>tsunamap</ows:Identifier></wps:Output></wps:ResponseDocument></wps:ResponseForm></wps:Execute>
2023-05-15 09:11:09,671 [tsunami-asyncwrapper_new-order_subscription] INFO  WPSProcess: execute WPS process get_tsunamap (retries: 0)
2023-05-15 09:11:09,766 [tsunami-asyncwrapper_new-order_subscription] WARN  WPSProcess: wps process returned an exception, retryable: true
org.n52.geoprocessing.wps.client.WPSClientException: Could not decode Inputstream.
	at org.n52.geoprocessing.wps.client.WPSClientSession.parseInputStreamToString(WPSClientSession.java:620)
	at org.n52.geoprocessing.wps.client.WPSClientSession.retrieveResponseOrExceptionReportInpustream(WPSClientSession.java:588)
	at org.n52.geoprocessing.wps.client.WPSClientSession.retrieveDataViaPOST(WPSClientSession.java:659)
	at org.n52.geoprocessing.wps.client.WPSClientSession.retrieveExecuteResponseViaPOST(WPSClientSession.java:690)
	at org.n52.geoprocessing.wps.client.WPSClientSession.execute(WPSClientSession.java:354)
	at org.n52.geoprocessing.wps.client.WPSClientSession.execute(WPSClientSession.java:347)
	at org.n.riesgos.asyncwrapper.process.wps.WPSProcess$runProcess$processOutput$2.invoke(WPSProcess.kt:104)
	at org.n.riesgos.asyncwrapper.process.wps.WPSProcess$runProcess$processOutput$2.invoke(WPSProcess.kt:102)
	at org.n.riesgos.asyncwrapper.utils.RetryUtilsKt.retry(RetryUtils.kt:22)
	at org.n.riesgos.asyncwrapper.process.wps.WPSProcess.runProcess(WPSProcess.kt:102)
	at org.n.riesgos.asyncwrapper.dummy.AbstractWrapper.runOneJob(AbstractWrapper.kt:286)
	at org.n.riesgos.asyncwrapper.dummy.AbstractWrapper.fillConstraintsAndRun(AbstractWrapper.kt:203)
	at org.n.riesgos.asyncwrapper.dummy.AbstractWrapper.run(AbstractWrapper.kt:66)
	at org.n.riesgos.asyncwrapper.events.OrderMessageHandler.handleMessage(OrderMessageHandler.kt:23)
	at org.n.riesgos.asyncwrapper.pulsar.PulsarConsumer.receiveMessages(PulsarConsumer.kt:22)
	at org.n.riesgos.asyncwrapper.pulsar.PulsarConsumer.run(PulsarConsumer.kt:13)
	at java.base/java.lang.Thread.run(Thread.java:831)
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[11,34]
Message: unexpected tag: {http://www.opengis.net/wps/1.0.0}ExceptionReport
	at org.n52.iceland.util.XmlFactories.unexpectedTag(XmlFactories.java:142)
	at org.n52.geoprocessing.wps.client.decoder.stream.GetStatusResponse100Decoder.readExecuteResponse(GetStatusResponse100Decoder.java:101)
	at org.n52.geoprocessing.wps.client.xml.WPSResponseReader.readExecuteResponse(WPSResponseReader.java:75)
	at org.n52.geoprocessing.wps.client.xml.WPSResponseReader.readElement(WPSResponseReader.java:64)
	at org.n52.geoprocessing.wps.client.WPSClientSession.parseInputStreamToString(WPSClientSession.java:618)
	... 16 common frames omitted
2023-05-15 09:11:10,767 [tsunami-asyncwrapper_new-order_subscription] INFO  WPSProcess: execute WPS process get_tsunamap (retries: 1)
2023-05-15 09:11:10,880 [tsunami-asyncwrapper_new-order_subscription] WARN  WPSProcess: wps process returned an exception, retryable: true
org.n52.geoprocessing.wps.client.WPSClientException: Could not decode Inputstream.
	at org.n52.geoprocessing.wps.client.WPSClientSession.parseInputStreamToString(WPSClientSession.java:620)
	at org.n52.geoprocessing.wps.client.WPSClientSession.retrieveResponseOrExceptionReportInpustream(WPSClientSession.java:588)
	at org.n52.geoprocessing.wps.client.WPSClientSession.retrieveDataViaPOST(WPSClientSession.java:659)
	at org.n52.geoprocessing.wps.client.WPSClientSession.retrieveExecuteResponseViaPOST(WPSClientSession.java:690)
	at org.n52.geoprocessing.wps.client.WPSClientSession.execute(WPSClientSession.java:354)
	at org.n52.geoprocessing.wps.client.WPSClientSession.execute(WPSClientSession.java:347)
	at org.n.riesgos.asyncwrapper.process.wps.WPSProcess$runProcess$processOutput$2.invoke(WPSProcess.kt:104)
	at org.n.riesgos.asyncwrapper.process.wps.WPSProcess$runProcess$processOutput$2.invoke(WPSProcess.kt:102)
	at org.n.riesgos.asyncwrapper.utils.RetryUtilsKt.retry(RetryUtils.kt:22)
	at org.n.riesgos.asyncwrapper.process.wps.WPSProcess.runProcess(WPSProcess.kt:102)
	at org.n.riesgos.asyncwrapper.dummy.AbstractWrapper.runOneJob(AbstractWrapper.kt:286)
	at org.n.riesgos.asyncwrapper.dummy.AbstractWrapper.fillConstraintsAndRun(AbstractWrapper.kt:203)
	at org.n.riesgos.asyncwrapper.dummy.AbstractWrapper.run(AbstractWrapper.kt:66)
	at org.n.riesgos.asyncwrapper.events.OrderMessageHandler.handleMessage(OrderMessageHandler.kt:23)
	at org.n.riesgos.asyncwrapper.pulsar.PulsarConsumer.receiveMessages(PulsarConsumer.kt:22)
	at org.n.riesgos.asyncwrapper.pulsar.PulsarConsumer.run(PulsarConsumer.kt:13)
	at java.base/java.lang.Thread.run(Thread.java:831)
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[11,34]
Message: unexpected tag: {http://www.opengis.net/wps/1.0.0}ExceptionReport
	at org.n52.iceland.util.XmlFactories.unexpectedTag(XmlFactories.java:142)
	at org.n52.geoprocessing.wps.client.decoder.stream.GetStatusResponse100Decoder.readExecuteResponse(GetStatusResponse100Decoder.java:101)
	at org.n52.geoprocessing.wps.client.xml.WPSResponseReader.readExecuteResponse(WPSResponseReader.java:75)
	at org.n52.geoprocessing.wps.client.xml.WPSResponseReader.readElement(WPSResponseReader.java:64)
	at org.n52.geoprocessing.wps.client.WPSClientSession.parseInputStreamToString(WPSClientSession.java:618)
	... 16 common frames omitted
2023-05-15 09:11:11,881 [tsunami-asyncwrapper_new-order_subscription] INFO  WPSProcess: execute WPS process get_tsunamap (retries: 2)
2023-05-15 09:11:26,380 [pulsar-timer-5-1] INFO  org.apache.pulsar.client.impl.ConsumerStatsRecorderImpl: [new-order] [tsunami-asyncwrapper_new-order_subscription] [c7ee3] Prefetched messages: 0 --- Consume throughput received: 0.02 msgs/s --- 0.00 Mbit/s --- Ack sent rate: 0.00 ack/s --- Failed messages: 0 --- batch messages: 0 ---Failed acks: 0

@arnevogt
Copy link
Member

@MichaelLangbein
#100 only changed the logging - unless the WPS client has really weird behavior. I will look into the clients code.

Confirm that the wrapper is still stuck if the input values don't match a pre-calculated simulation. The issue itself is related to the PyWPS/get_tsunamap process. From the log you can see that WPS (not the wrapper) is stuck during the third retry. I don't know why the WPS behaves like this, but this is repeatable also with the GIST I wrote.
I think the only way to prevent this state in the wrapper (besides using correct input values) is to implement a timeout.
Maybe it makes sense to create a new, separate issue for that(?)

@MichaelLangbein
Copy link
Contributor Author

Maybe it makes sense to create a new, separate issue for that(?)

I've created an issue at https://gitlab.awi.de/tsunawi/tsunawi/-/issues/2 , but since this is a private repo, I'm keeping this one open.
@bpross-52n : I believe you have write-access to that repo, right?

@arnevogt
Copy link
Member

I also created a new issue regarding the timeout for the wrapper because the case that a WPS process is stuck can (in theory) happen with all processes.

#101

@MichaelLangbein
Copy link
Contributor Author

Confirm that the wrapper is still stuck if the input values don't match a pre-calculated simulation. The issue itself is related to the PyWPS/get_tsunamap process. From the log you can see that WPS (not the wrapper) is stuck during the third retry. I don't know why the WPS behaves like this, but this is repeatable also with the GIST I wrote.

Alright, I can confirm that the wrapper is still stuck if the input values don't match a pre-calculated simulation.

I am, however, not certain that the issue is with PyWPS/get_tsunamap. Simply sending an execute-request with postman several times does not cause py-wps to stop responding. I can reproduce the error only with the production-environment and with your GIST.

@MichaelLangbein
Copy link
Contributor Author

BTW: my intellij debugger gives the following output when running that GIST:

Connected to the target VM, address: '127.0.0.1:53399', transport: 'socket'
Starting ...
2023-05-15 16:38:02,583 [main] INFO  org.n52.geoprocessing.wps.client.WPSClientSession: CONNECT
Connected successfully!
May 15, 2023 4:38:07 PM org.n.riesgos.asyncwrapper.process.wps.WPSProcess$runProcess$processDescription$2 invoke
INFO: retrieve wps process description for get_tsunamap (retries: 0)
May 15, 2023 4:38:07 PM org.n.riesgos.asyncwrapper.process.wps.WPSProcess$runProcess$processDescription$2 invoke
INFO: retrieved wps process description for get_tsunamap (retries: 0)
May 15, 2023 4:38:07 PM org.n.riesgos.asyncwrapper.process.wps.WPSProcess runProcess
INFO: <wps:Execute xmlns:wps="http://www.opengis.net/wps/1.0.0" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wps/1.0.0 http://schemas.opengis.net/wps/1.0.0/wpsAll.xsd" service="WPS" version="1.0.0"><ows:Identifier>get_tsunamap</ows:Identifier><wps:DataInputs><wps:Input><ows:Identifier>lat</ows:Identifier><wps:Data><wps:LiteralData>-32.9011</wps:LiteralData></wps:Data></wps:Input><wps:Input><ows:Identifier>lon</ows:Identifier><wps:Data><wps:LiteralData>-79.8152</wps:LiteralData></wps:Data></wps:Input><wps:Input><ows:Identifier>mag</ows:Identifier><wps:Data><wps:LiteralData>8.5</wps:LiteralData></wps:Data></wps:Input></wps:DataInputs><wps:ResponseForm><wps:ResponseDocument><wps:Output asReference="true" mimeType="application/xml" schema="http://earthquake.usgs.gov/eqcenter/shakemap" encoding="UTF-8"><ows:Identifier>tsunamap</ows:Identifier></wps:Output></wps:ResponseDocument></wps:ResponseForm></wps:Execute>
May 15, 2023 4:38:07 PM org.n.riesgos.asyncwrapper.process.wps.WPSProcess$runProcess$processOutput$2 invoke
INFO: execute WPS process get_tsunamap (retries: 0)
May 15, 2023 4:45:38 PM org.n.riesgos.asyncwrapper.process.wps.WPSProcess isWPSClientException
WARNING: wps process returned an exception, retryable: true
org.n52.geoprocessing.wps.client.WPSClientException: Could not decode Inputstream.
	at org.n52.geoprocessing.wps.client.WPSClientSession.parseInputStreamToString(WPSClientSession.java:620)
	at org.n52.geoprocessing.wps.client.WPSClientSession.retrieveResponseOrExceptionReportInpustream(WPSClientSession.java:588)
	at org.n52.geoprocessing.wps.client.WPSClientSession.retrieveDataViaPOST(WPSClientSession.java:659)
	at org.n52.geoprocessing.wps.client.WPSClientSession.retrieveExecuteResponseViaPOST(WPSClientSession.java:690)
	at org.n52.geoprocessing.wps.client.WPSClientSession.execute(WPSClientSession.java:354)
	at org.n52.geoprocessing.wps.client.WPSClientSession.execute(WPSClientSession.java:347)
	at org.n.riesgos.asyncwrapper.process.wps.WPSProcess$runProcess$processOutput$2.invoke(WPSProcess.kt:104)
	at org.n.riesgos.asyncwrapper.process.wps.WPSProcess$runProcess$processOutput$2.invoke(WPSProcess.kt:102)
	at org.n.riesgos.asyncwrapper.utils.RetryUtilsKt.retry(RetryUtils.kt:22)
	at org.n.riesgos.asyncwrapper.process.wps.WPSProcess.runProcess(WPSProcess.kt:102)
	at ExecService_ws.<init>(ExecService.ws.kts:41)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
	at kotlin.script.experimental.jvm.BasicJvmScriptEvaluator.evalWithConfigAndOtherScriptsResults(BasicJvmScriptEvaluator.kt:105)
	at kotlin.script.experimental.jvm.BasicJvmScriptEvaluator.invoke$suspendImpl(BasicJvmScriptEvaluator.kt:47)
	at kotlin.script.experimental.jvm.BasicJvmScriptEvaluator.invoke(BasicJvmScriptEvaluator.kt)
	at org.jetbrains.kotlin.scripting.compiler.plugin.AbstractScriptEvaluationExtension$doEval$1.invokeSuspend(AbstractScriptEvaluationExtension.kt:156)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlin.coroutines.ContinuationKt.startCoroutine(Continuation.kt:115)
	at kotlin.script.experimental.impl.RunSuspendKt.internalScriptingRunSuspend(runSuspend.kt:19)
	at org.jetbrains.kotlin.scripting.compiler.plugin.AbstractScriptEvaluationExtension.doEval(AbstractScriptEvaluationExtension.kt:135)
	at org.jetbrains.kotlin.scripting.compiler.plugin.AbstractScriptEvaluationExtension.eval(AbstractScriptEvaluationExtension.kt:122)
	at org.jetbrains.kotlin.cli.jvm.K2JVMCompiler.doExecute(K2JVMCompiler.kt:114)
	at org.jetbrains.kotlin.cli.jvm.K2JVMCompiler.doExecute(K2JVMCompiler.kt:53)
	at org.jetbrains.kotlin.cli.common.CLICompiler.execImpl(CLICompiler.kt:100)
	at org.jetbrains.kotlin.cli.common.CLICompiler.execImpl(CLICompiler.kt:46)
	at org.jetbrains.kotlin.cli.common.CLITool.exec(CLITool.kt:101)
	at org.jetbrains.kotlin.cli.common.CLITool.exec(CLITool.kt:79)
	at org.jetbrains.kotlin.cli.common.CLITool.exec(CLITool.kt:43)
	at org.jetbrains.kotlin.cli.common.CLITool$Companion.doMainNoExit(CLITool.kt:180)
	at org.jetbrains.kotlin.cli.common.CLITool$Companion.doMainNoExit$default(CLITool.kt:178)
	at org.jetbrains.kotlin.cli.common.CLITool$Companion.doMain(CLITool.kt:167)
	at org.jetbrains.kotlin.cli.jvm.K2JVMCompiler$Companion.main(K2JVMCompiler.kt:251)
	at org.jetbrains.kotlin.cli.jvm.K2JVMCompiler.main(K2JVMCompiler.kt)
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[11,34]
Message: unexpected tag: {http://www.opengis.net/wps/1.0.0}ExceptionReport
	at org.n52.iceland.util.XmlFactories.unexpectedTag(XmlFactories.java:142)
	at org.n52.geoprocessing.wps.client.decoder.stream.GetStatusResponse100Decoder.readExecuteResponse(GetStatusResponse100Decoder.java:101)
	at org.n52.geoprocessing.wps.client.xml.WPSResponseReader.readExecuteResponse(WPSResponseReader.java:75)
	at org.n52.geoprocessing.wps.client.xml.WPSResponseReader.readElement(WPSResponseReader.java:64)
	at org.n52.geoprocessing.wps.client.WPSClientSession.parseInputStreamToString(WPSClientSession.java:618)
	... 35 more

May 15, 2023 4:45:41 PM org.n.riesgos.asyncwrapper.process.wps.WPSProcess$runProcess$processOutput$2 invoke
INFO: execute WPS process get_tsunamap (retries: 1)
May 15, 2023 4:45:59 PM org.n.riesgos.asyncwrapper.process.wps.WPSProcess isWPSClientException
WARNING: wps process returned an exception, retryable: true
org.n52.geoprocessing.wps.client.WPSClientException: Could not decode Inputstream.
	at org.n52.geoprocessing.wps.client.WPSClientSession.parseInputStreamToString(WPSClientSession.java:620)
	at org.n52.geoprocessing.wps.client.WPSClientSession.retrieveResponseOrExceptionReportInpustream(WPSClientSession.java:588)
	at org.n52.geoprocessing.wps.client.WPSClientSession.retrieveDataViaPOST(WPSClientSession.java:659)
	at org.n52.geoprocessing.wps.client.WPSClientSession.retrieveExecuteResponseViaPOST(WPSClientSession.java:690)
	at org.n52.geoprocessing.wps.client.WPSClientSession.execute(WPSClientSession.java:354)
	at org.n52.geoprocessing.wps.client.WPSClientSession.execute(WPSClientSession.java:347)
	at org.n.riesgos.asyncwrapper.process.wps.WPSProcess$runProcess$processOutput$2.invoke(WPSProcess.kt:104)
	at org.n.riesgos.asyncwrapper.process.wps.WPSProcess$runProcess$processOutput$2.invoke(WPSProcess.kt:102)
	at org.n.riesgos.asyncwrapper.utils.RetryUtilsKt.retry(RetryUtils.kt:22)
	at org.n.riesgos.asyncwrapper.process.wps.WPSProcess.runProcess(WPSProcess.kt:102)
	at ExecService_ws.<init>(ExecService.ws.kts:41)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
	at kotlin.script.experimental.jvm.BasicJvmScriptEvaluator.evalWithConfigAndOtherScriptsResults(BasicJvmScriptEvaluator.kt:105)
	at kotlin.script.experimental.jvm.BasicJvmScriptEvaluator.invoke$suspendImpl(BasicJvmScriptEvaluator.kt:47)
	at kotlin.script.experimental.jvm.BasicJvmScriptEvaluator.invoke(BasicJvmScriptEvaluator.kt)
	at org.jetbrains.kotlin.scripting.compiler.plugin.AbstractScriptEvaluationExtension$doEval$1.invokeSuspend(AbstractScriptEvaluationExtension.kt:156)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlin.coroutines.ContinuationKt.startCoroutine(Continuation.kt:115)
	at kotlin.script.experimental.impl.RunSuspendKt.internalScriptingRunSuspend(runSuspend.kt:19)
	at org.jetbrains.kotlin.scripting.compiler.plugin.AbstractScriptEvaluationExtension.doEval(AbstractScriptEvaluationExtension.kt:135)
	at org.jetbrains.kotlin.scripting.compiler.plugin.AbstractScriptEvaluationExtension.eval(AbstractScriptEvaluationExtension.kt:122)
	at org.jetbrains.kotlin.cli.jvm.K2JVMCompiler.doExecute(K2JVMCompiler.kt:114)
	at org.jetbrains.kotlin.cli.jvm.K2JVMCompiler.doExecute(K2JVMCompiler.kt:53)
	at org.jetbrains.kotlin.cli.common.CLICompiler.execImpl(CLICompiler.kt:100)
	at org.jetbrains.kotlin.cli.common.CLICompiler.execImpl(CLICompiler.kt:46)
	at org.jetbrains.kotlin.cli.common.CLITool.exec(CLITool.kt:101)
	at org.jetbrains.kotlin.cli.common.CLITool.exec(CLITool.kt:79)
	at org.jetbrains.kotlin.cli.common.CLITool.exec(CLITool.kt:43)
	at org.jetbrains.kotlin.cli.common.CLITool$Companion.doMainNoExit(CLITool.kt:180)
	at org.jetbrains.kotlin.cli.common.CLITool$Companion.doMainNoExit$default(CLITool.kt:178)
	at org.jetbrains.kotlin.cli.common.CLITool$Companion.doMain(CLITool.kt:167)
	at org.jetbrains.kotlin.cli.jvm.K2JVMCompiler$Companion.main(K2JVMCompiler.kt:251)
	at org.jetbrains.kotlin.cli.jvm.K2JVMCompiler.main(K2JVMCompiler.kt)
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[11,34]
Message: unexpected tag: {http://www.opengis.net/wps/1.0.0}ExceptionReport
	at org.n52.iceland.util.XmlFactories.unexpectedTag(XmlFactories.java:142)
	at org.n52.geoprocessing.wps.client.decoder.stream.GetStatusResponse100Decoder.readExecuteResponse(GetStatusResponse100Decoder.java:101)
	at org.n52.geoprocessing.wps.client.xml.WPSResponseReader.readExecuteResponse(WPSResponseReader.java:75)
	at org.n52.geoprocessing.wps.client.xml.WPSResponseReader.readElement(WPSResponseReader.java:64)
	at org.n52.geoprocessing.wps.client.WPSClientSession.parseInputStreamToString(WPSClientSession.java:618)
	... 35 more

May 15, 2023 4:46:02 PM org.n.riesgos.asyncwrapper.process.wps.WPSProcess$runProcess$processOutput$2 invoke
INFO: execute WPS process get_tsunamap (retries: 2)
Disconnected from the target VM, address: '127.0.0.1:53399', transport: 'socket'

Process finished with exit code 130 (interrupted by signal 2: SIGINT)

Not sure why there is a ParseError. And even if there is, the app should just throw that error a final time instead of hanging.

@arnevogt
Copy link
Member

@MichaelLangbein
I debugged a little bit into the WPS Client's code.
The client does not expect the nested wps:ExceptionReport tag. Tomorrow I will check if this another issue with the client or the response document is invalid.
Unfortunately that parsing error hides the exception that caused the failure of the WPS process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants