EOFError Stream has been closed #237
Comments
Sounds to me like an error due to the GC collecting the remote end of the stream. Can you give a stack trace?
Good morning, yes - I can give a bit more background here. First, the stack traces:
The exceptions happen to occur in different remote module calls. As mentioned, it's random - not at a deterministic point during execution. The general setup for the connection is (shortened): the server runs inside the IDA-Python environment, which is basically the standard system Python 2.7.12 with rpyc 3.4.4. For completeness, IDA-Python only runs within an existing IDA session, so IDA/IDA-Python is running inside a screen or tmux session.
On the client side: PyPy, rpyc 3.4.4.
And after that point it's just a few million requests per hour to the modules only available in the IDA-Python environment. This worked and still works when using CPython instead of PyPy on the client side with rpyc. So basically, from my understanding, the only thing that changed is the frequency of remote module calls, by a multiple of a few thousand more invocations per minute. The GC on the server side shouldn't be affected, since the server environment remained untouched by the switch to PyPy. I'm still doing more testing on this problem - but if you have some idea, that would be great :) For now, I worked around my problem by catching the EOFError exception, terminating anything IDA-related, restarting the server and resuming the execution. That still gives me an approx. 2000% speedup compared to running the analysis with CPython instead of PyPy.
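That workaround looks roughly like this on the client side (a sketch; `process` and `restart_ida_server` are placeholders for the actual IDA-related calls and the restart logic, and the port is assumed):

```python
import rpyc

def connect():
    # assumed: the server inside IDA listens as a classic rpyc server on localhost
    return rpyc.classic.connect("localhost", 18812)

def run_all(work_items):
    conn = connect()
    for item in work_items:
        try:
            process(conn, item)      # placeholder for the remote ida-python calls
        except EOFError:
            restart_ida_server()     # placeholder: tear down and relaunch IDA + the rpyc server
            conn = connect()
            process(conn, item)      # resume with the item that failed
```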
Hey, first thanks for the stack traces and example code. I have some superficial feedback; hopefully I will be able to take a closer look tonight. This might be related to a history of unresolved EOFErrors: #124, #162, #190 and #217. However, in none of these cases was sufficient information reported. Can you provide a full minimal working example by any chance?
CPython has a garbage collector too, so this doesn't rule out a garbage collection problem. If you do more work (more created objects), the GC typically runs sooner, which would be consistent with your statement that you have more requests per unit of time. But you are right: from the server code it looks like there is always a hard reference to the connection on the stack (you could make extra sure by assigning it first). Could you try changing the GC configuration on the client side and see whether that makes a difference?
When there is a non-deterministic bug associated with high resource usage, it is most of the time related to either multithreading or garbage collection, so I would check these first. BTW: the server is terminated at the same time as the client receives the EOFError, I assume?
Dirty.. ;)
Hello. Thanks for the reply. Yes, the server is terminated at the same time the client receives the EOFError. I am going to try your suggestions next week and report back as soon as I have results. Best regards,
Hello again. For now, I am closing this issue, with the option to reopen it. The config changes to the GC did not solve it; however, in my current tests, after changing some things unrelated to rpyc in the algorithmic part of my code, the error did not occur today. This needs more investigation on my side. We'll see - when I can provide more useful information, I will reopen the issue. Best regards
Hello coldfix. I'm leaving this closed, but I'm going to give some more details: it's GC-related, on the client side (the PyPy instance). Basically, wrapping my client-side code with gc.disable() { ... code ... } gc.enable() solves the problem. So I am assuming it's the PyPy implementation of garbage collection that causes the problem. I'll update this when there is more insight.
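For reference, the wrapping described above amounts to something like this (a sketch; `run_analysis` and `conn` stand in for the actual client-side work and connection):

```python
import gc

gc.disable()             # keep PyPy's GC from running while the rpyc calls are in flight
try:
    run_analysis(conn)   # placeholder for the ida-python calls made over rpyc
finally:
    gc.enable()          # re-enable collection once the rpyc-heavy section is done
```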
So, back - and I think I found the problem. As mentioned before, turning off garbage collection on the client side (PyPy) makes the problem disappear. I started to let rpyc print the stack trace each time the error occurred, and it pointed to protocol.py, line 532:

    res = AsyncResult(weakref.proxy(self))

Changing this to:

    res = AsyncResult(weakref.ref(self))

seems to fix the problem. I cannot tell you why exactly this fixes it, nor do I know whether it has side effects elsewhere. Maybe that helps solving this issue. Best regards,
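For what it's worth, the behavioural difference between the two (a generic illustration using only the standard library, not rpyc internals): a `weakref.proxy` raises `ReferenceError` on any use once the referent is gone, whereas a `weakref.ref` must be called and then simply returns `None`:

```python
import gc
import weakref

class Target(object):
    pass

obj = Target()
proxy = weakref.proxy(obj)
ref = weakref.ref(obj)

del obj
gc.collect()              # on PyPy, weakrefs are only cleared after a collection runs

print(ref())              # -> None once the referent has been collected
try:
    proxy.anything        # any attribute access on a dead proxy raises ReferenceError
except ReferenceError as exc:
    print("dead proxy: %s" % exc)
```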
Hey, thanks for the additional information! I don't fully believe that using weakref.ref will fix all of these problems, but if it helps in your case, I'm all for it. Can you submit a PR implementing this change? It will take me a few days to process, though. Best, Thomas
Going to do it next week. That also leaves some more time to test this thoroughly! Greetings
BTW, I am currently doubting the usefulness of using a weakref for netrefs/asyncs at all. The simplest solution may be to just pass a regular ref, unless you see any problems? This means that netrefs will keep connections alive even if you throw away the reference to the connection itself.
This means that netrefs will now keep connections alive. This may be related to several EOFError issues where the program only keeps a reference to a netref, but not to the connection. It may also be related to PyPy-specific EOFError issues where weakref.proxy may have problems, see #237.
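The pattern in question, in a nutshell (a generic sketch against a classic server; the remote attribute is arbitrary):

```python
import rpyc

def get_remote_environ():
    conn = rpyc.classic.connect("localhost")
    return conn.modules.os.environ   # returns a netref, not a copy

env = get_remote_environ()
# With only a weak reference inside the netref, nothing keeps `conn` alive after
# the function returns; once it is garbage collected the stream closes, and any
# later use of `env` fails with "EOFError: Stream has been closed".
print(env["PATH"])
```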
This release brings a few minor backward incompatibilities, so be sure to read on before upgrading. However, fear not: the ones that are most likely relevant to you have a relatively simple migration path.

Backward Incompatibilities
^^^^^^^^^^^^^^^^^^^^^^^^^^

* ``classic.teleport_function`` now executes the function in the connection's namespace by default. To get the old behaviour, use ``teleport_function(conn, func, conn.modules[func.__module__].__dict__)`` instead.
* Changed the signature of ``Service.on_connect`` and ``on_disconnect``, adding the connection as an argument.
* Changed the signature of ``Service.__init__``, removing the connection argument.
* The connection is no longer stored as ``self._conn``. (This allows services that serve multiple clients using the same service object, see `#198`_.)
* ``SlaveService`` is now split into two asymmetric classes: ``SlaveService`` and ``MasterService``. The slave exposes functionality to the master but can no longer access remote objects on the master (`#232`_, `#248`_). If you were previously using ``SlaveService``, you may experience problems when feeding the slave with netrefs to objects on the master. In this case, do any of the following:

  * use ``ClassicService`` (acts exactly like the old ``SlaveService``)
  * use ``SlaveService`` with a ``config`` that allows attribute access etc.
  * use ``rpyc.utils.deliver`` to feed copies rather than netrefs to the slave

* ``RegistryServer.on_service_removed`` is once again called whenever a service instance is removed, making it symmetric to ``on_service_added`` (`#238`_). This reverts PR `#173`_ on issue `#172`_.
* Removed module ``rpyc.experimental.splitbrain``. It's too confusing and undocumented for me and I won't be developing it, so better to remove it altogether. (It's still available in the ``splitbrain`` branch.)
* Removed module ``rpyc.experimental.retunnel``. Seemingly unused anywhere, no documentation, no clue what this is about.
* ``bin/rpyc_classic.py`` will bind to ``127.0.0.1`` instead of ``0.0.0.0`` by default.
* ``SlaveService`` no longer serves exposed attributes (i.e., it now uses ``allow_exposed_attrs=False``).
* Exposed attributes no longer hide plain attributes if one otherwise has the required permissions to access the plain attribute (`#165`_).

.. _#165: #165
.. _#172: #172
.. _#173: #173
.. _#198: #198
.. _#232: #232
.. _#238: #238
.. _#248: #248

What else is new
^^^^^^^^^^^^^^^^

* teleported functions will now be defined by default in the globals dict
* Can now explicitly specify globals for teleported functions
* Can now use streams as context managers
* keep a hard reference to the connection in netrefs; may fix some ``EOFError`` issues, in particular on Jython (`#237`_)
* handle synchronous and asynchronous requests uniformly
* fix deadlock with connections talking to each other from multiple threads (`#270`_)
* handle timeouts cumulatively
* fix possible performance bug in ``Win32PipeStream.poll`` (oversleeping)
* use readthedocs theme for documentation (`#269`_)
* actually time out sync requests (`#264`_)
* clarify documentation concerning exceptions in ``Connection.ping`` (`#265`_)
* fix ``__hash__`` for netrefs (`#267`_, `#268`_)
* rename ``async`` module to ``async_`` for py37 compatibility (`#253`_)
* fix ``deliver()`` from IronPython to CPython2 (`#251`_)
* fix brine string handling in py2 IronPython (`#251`_)
* add a gevent_ server. For now, this requires using ``gevent.monkey.patch_all()`` before importing rpyc. Client connections can already be made without further changes to rpyc, just using gevent's monkey patching (`#146`_)
* add function ``rpyc.lib.spawn`` to spawn daemon threads
* fix several bugs in ``bin/rpycd.py`` that crashed this script on startup (`#231`_)
* fix problem with MongoDB, or more generally any remote objects that have a *catch-all* ``__getattr__`` (`#165`_)
* fix bug when copying remote numpy arrays (`#236`_)
* added ``rpyc.utils.helpers.classpartial`` to bind arguments to services (`#244`_); see the sketch after these notes
* can now pass services optionally as an instance or a class (could previously only pass a class, `#244`_)
* The service is now charged with setting up the connection, doing so in ``Service._connect``. This allows using custom protocols by e.g. subclassing ``Connection``. More discussion and related features in `#239`_-`#247`_.
* a service can now easily override protocol handlers, by updating ``conn._HANDLERS`` in ``_connect`` or ``on_connect``. For example: ``conn._HANDLERS[HANDLE_GETATTR] = self._handle_getattr``.
* most protocol handlers (``Connection._handle_XXX``) now directly get the object rather than its ID as the first argument. This makes overriding individual handlers feel much more high-level, and it turns out that this fixes two long-standing issues (`#137`_, `#153`_)
* fix bug with proxying context managers (`#228`_)
* expose server classes from the ``rpyc`` top-level module
* fix logger issue on Jython

.. _#137: #137
.. _#146: #146
.. _#153: #153
.. _#165: #165
.. _#228: #228
.. _#231: #231
.. _#236: #236
.. _#237: #237
.. _#239: #239
.. _#244: #244
.. _#247: #247
.. _#251: #251
.. _#253: #253
.. _#264: #264
.. _#265: #265
.. _#267: #267
.. _#268: #268
.. _#269: #269
.. _#270: #270
.. _gevent: http://www.gevent.org/
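As a rough illustration of the ``classpartial`` item above, a service with constructor arguments can now be set up along these lines (a sketch; the service name, its argument and the port are made up for this example):

```python
import rpyc
from rpyc.utils.helpers import classpartial
from rpyc.utils.server import ThreadedServer

class DatabaseService(rpyc.Service):
    # hypothetical service that needs a constructor argument
    def __init__(self, db_url):
        self.db_url = db_url

    def exposed_get_url(self):
        return self.db_url

# classpartial binds the argument; the server then instantiates the service as needed
service = classpartial(DatabaseService, "sqlite:///tmp/demo.db")
ThreadedServer(service, port=18861).start()
```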
Hello rpyc community,
I am here with more of a question than an issue. The general problem is that the error I am facing occurs quite randomly, in a setup that is not reproducible here.
The short description: I'm using a OneShotServer in an application. In more detail, the OneShotServer is running inside the IDA-Pro Python environment (Python 2.7, rpyc 3.4.4) and exports the common IDA modules, and the client application uses those exported modules. Single-threaded, only one connection, over localhost (Python 2.7, rpyc 3.4.4).
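For context, the setup is roughly the following (a shortened sketch, not the actual code; the port and the specific IDA calls are assumptions):

```python
# server side -- runs inside the IDA-Python environment (CPython 2.7, rpyc 3.4.4)
from rpyc.core.service import SlaveService
from rpyc.utils.server import OneShotServer

# a classic/slave service gives the single client access to the server's modules
OneShotServer(SlaveService, hostname="localhost", port=18812).start()
```

```python
# client side -- runs under PyPy (rpyc 3.4.4)
import rpyc

conn = rpyc.classic.connect("localhost", 18812)
idautils = conn.modules.idautils       # remote IDA module, proxied over the connection
for func_ea in idautils.Functions():   # every access here is a synchronous request
    pass                               # ... analysis happens here
```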
This process has worked so far and still does. However, I recently switched to PyPy, and since that change, at random points (sometimes after 5 minutes, sometimes after 10 or 15), an "EOFError: Stream has been closed" exception suddenly occurs - according to the stack trace, somewhere along the execution path of sync_request. It is completely random, at different module calls and at different times.
So, basically the only thing that changed during the transition to PyPy is that the requests reaching the OneShotServer are coming in faster - approximately a thousand times faster, or even more. So we're talking about several million RPC invocations per hour, done over localhost.
The question now is: how does the OneShotServer behave under such loads, can this be the source of the "EOFError: Stream has been closed" error, and if yes, what could I possibly do about it besides adding a "sleep" on the client side?
Some thoughts on this are very much appreciated.
Best Regards,
Andreas