Discussion:
[gevent] Gevent C-API
Pan
2017-09-27 13:36:53 UTC
Permalink
Hello,

First off, thank you for the awesome library.

I have a question regarding use of gevent via a C-API, specifically in
Cython-generated code. I have seen the pxd definitions for cares and libev
in the project, but it does not look like the gevent API itself is exposed
via Cython.

The use case I have for this is with gevent being used as an IO library for
a C library (libssh2) that is in turn used via a Python wrapper extension
(ssh2-python <https://github.com/ParallelSSH/ssh2-python>). The C library
natively supports non-blocking mode, and gevent is used in Python space to
connect the Python C library wrapper with an event loop and co-operative
sockets. So far so good; see the client that uses gevent here
<https://github.com/ParallelSSH/parallel-ssh/blob/libssh2/pssh/ssh2_client.py>.

Since the underlying library is in C, the library also uses native threads
via gevent's native thread pool to offload non network related blocking
calls to threads. However, when it comes to waiting for network I/O, those
threads have to acquire the GIL in order to call gevent's co-operative
select.

For example, when reading from the network and writing to a local file, the
file write is implemented in C, releases the GIL and runs in a native
thread, while the network read has to call gevent.select and therefore has
to hold the GIL
<https://github.com/ParallelSSH/parallel-ssh/blob/libssh2/pssh/native/ssh2.pyx#L125-L131>.
This, however, blocks other threads from running.
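
A minimal sketch of the pattern described above, using only the standard library as a stand-in: `select.select` takes the place of gevent's co-operative select, and a `ThreadPoolExecutor` takes the place of gevent's native thread pool. The names `write_file` and `copy_socket_to_file` are hypothetical and are not taken from parallel-ssh or ssh2-python.

```python
import os
import select
import socket
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_file(path, data):
    # Stand-in for the C file writer: the underlying OS write releases the GIL.
    with open(path, "ab") as fh:
        fh.write(data)
    return len(data)


def copy_socket_to_file(sock, path, pool, timeout=5.0):
    """Wait for readability in the calling thread, hand writes to the pool."""
    pending = []
    while True:
        # In the gevent version this wait would be gevent.select.select.
        readable, _, _ = select.select([sock], [], [], timeout)
        if not readable:
            break  # timed out with nothing to read
        chunk = sock.recv(4096)
        if not chunk:
            break  # peer closed the connection
        pending.append(pool.submit(write_file, path, chunk))
    # Block until every queued write has finished and total the bytes written.
    return sum(future.result() for future in pending)


if __name__ == "__main__":
    sender, receiver = socket.socketpair()
    sender.sendall(b"hello world")
    sender.close()
    fd, path = tempfile.mkstemp()
    os.close(fd)
    # A single worker keeps the chunks in the order they were received.
    with ThreadPoolExecutor(max_workers=1) as pool:
        print(copy_socket_to_file(receiver, path, pool, timeout=1.0))
    receiver.close()
    os.remove(path)
```

The wait for network readiness stays in the calling thread, which is the part that needs the GIL when the select call is a Python function; only the file write is offloaded.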

This is what I would like to avoid by interfacing with a C-API. If gevent's
select calls were done in C space, without the GIL, the rest of the
application could continue processing. The library is a parallel SSH client
so this is quite common.

Is this at all possible? If not, would running separate hubs in multiple
native threads achieve the same result, i.e. not blocking other native
threads from running due to holding the GIL? Or would those hubs also need
the GIL, effectively serialising them?

I am aware of gipc, but as the library being used is native code, multiple
processes with IPC are not a good fit for this use case.

Another use case I have in mind is scaling gevent past a single core with
multiple hubs in native threads that do not hold the GIL. This is feasible
in this case because a native library is used, so the GIL will not
serialise everything (GIL serialisation being the problem gipc aims to
solve).

Thank you for reading.

Pan
--
You received this message because you are subscribed to the Google Groups "gevent: coroutine-based Python network library" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gevent+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Grady Player
2017-09-27 14:13:14 UTC
Permalink
So I am not a maintainer of this project or anything, just a user, but:

If you are dealing with native C stuff I would just do the work on real threads and avoid gevent. If your program is well structured enough to keep all of the Python stuff siloed, it might be possible.

You generally shouldn't worry about the GIL in CPython until you have done some profiling to see if there is really an issue.

People often use Cython to get around the GIL altogether, so I am not sure whether or not you are already running with nogil...

Just some ramblings...

Grady


Pan
2017-09-27 16:31:42 UTC
Permalink
Thanks for the reply.

Yes, in general I would agree. However, I am already using Cython code as a
wrapper to the C library, which I've also written, and native code for file
reading/writing, both of which release the GIL. I am also already using
native threads to call blocking native code functions.

The problem is the GIL *does* need to be held while calling gevent.select,
as that is a Python function, and holding it blocks all other threads
(please correct me if gevent.select does in fact release the GIL after it
is called). The select is co-operative, so other greenlets can still run,
but the library also has native threads spawned by Python code, and those
are blocked by the GIL. That is what I would like to avoid.
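
For comparison, and not as a statement about gevent: CPython's own `select.select` is implemented in C and releases the GIL around the system call, so a thread blocked in it does not stop other Python threads. The sketch below (the function name is hypothetical) counts how much work a second thread gets done during such a wait. Whether a thread waiting inside gevent.select behaves the same way depends on what the hub's event loop does while it is idle, which is the open question here.

```python
import select
import socket
import threading


def count_while_selecting(wait_seconds=0.2):
    """Return (iterations, readable): how many loop iterations a second
    Python thread completes while this thread is blocked in select.select,
    and the (empty) list of readable sockets once the wait times out."""
    reader, writer = socket.socketpair()
    iterations = 0
    stop = threading.Event()

    def spin():
        # Pure Python work that needs the GIL for every iteration.
        nonlocal iterations
        while not stop.is_set():
            iterations += 1

    worker = threading.Thread(target=spin)
    worker.start()
    try:
        # Nothing is ever sent on `writer`, so this waits for the full timeout.
        readable, _, _ = select.select([reader], [], [], wait_seconds)
    finally:
        stop.set()
        worker.join()
        reader.close()
        writer.close()
    return iterations, readable


if __name__ == "__main__":
    print(count_while_selecting())
```

A non-zero iteration count shows the spinning thread made progress during the wait, which would not be possible if the waiting thread kept the GIL for the whole call.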

There is no 'issue' as such, just more performance to be had if it is
feasible to use gevent's select sans GIL. The other use case I am
interested in is, as I mentioned, scaling gevent on multiple cores with
threads without resorting to multiprocessing and IPC.

This last use case in particular applies to all Python wrappers of native
libraries that want to use gevent for non-blocking IO. Not a very common
use case I would guess, but an interesting one nonetheless.

Both these use cases would benefit the library in terms of performance and
scaling. It has been profiled and load tested already.

Avoiding gevent entirely would mean interfacing with libev directly, as I
want to keep the library non-blocking, and duplicating a lot of the work
that gevent does, which I'd rather avoid. If it came down to that being the
only option I'd probably just accept it as 'not possible' for the time
being.

Thank you for the input though.
Grady Player
2017-09-27 17:07:38 UTC
Permalink
| Yes, in general I would agree, however, I am already using Cython code as a wrapper to the C library, which I've also written, and for native code for file reading/writing, both of which release the GIL. Am also already using native threads to call blocking native code functions.

Native IO on the same thread will block gevent anyway, because the native code has no way to yield to gevent. On a different thread it doesn't matter.

Generally, when you are using gevent you are using it to avoid the overhead of thread switching, so you are running Python on one real thread and the GIL is irrelevant.

Also, holding the GIL doesn't prevent native code from using select().

Running on multiple threads with multiple cores generally doesn't help performance with gevent unless you are running multiple Python processes with gipc or something.
It is possible (I believe) to carefully write multiple Python threads, each with its own gevent greenlets. In that case the GIL would be relevant, but you would have to be pretty careful that your application logic knew what was going on.
Pan
2017-09-28 09:27:03 UTC
Permalink
| native io on the same thread will block gevent anyway

Native IO is run in a separate native thread.

| also holding the GIL doesn't prevent native code from using select

Native code has to hold the GIL to call gevent.select, as it is a Python
function. Holding the GIL blocks Python code from running. When Python code
is also responsible for interacting with native code (no GIL) and spawning
native IO handling threads (no GIL), it becomes a bottleneck.

| running on multiple threads with multiple cores generally doesn't help
| performance with gevent unless you are running multiple python processes
| and gipc or something

That is true for Python code but not for native code. When using Python
wrappers to native libraries which do not hold the GIL, it is not true.
This is, again, the use case here.
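
As a rough illustration of that claim, using only the standard library rather than libssh2: hashlib is documented to release the GIL while hashing buffers larger than 2047 bytes, so plain Python threads calling into it can run on several cores at once. `digest` and `digest_all` are hypothetical names chosen for this sketch.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(blob):
    # The hashing itself runs in C with the GIL released for large buffers,
    # so several of these calls can execute at the same time.
    return hashlib.sha256(blob).hexdigest()


def digest_all(blobs, workers=4):
    """Hash every blob on a pool of native threads, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(digest, blobs))


if __name__ == "__main__":
    # Four 1 MB buffers, each well above the 2047-byte threshold.
    blobs = [bytes([i]) * 1_000_000 for i in range(4)]
    for hexdigest in digest_all(blobs):
        print(hexdigest)
```

The Python glue between calls still needs the GIL, so how well this scales depends on how much of the total time is spent inside the native call.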

It is technically feasible to scale gevent on multiple *native threads*
with a hub per thread *as long as* the code being executed in greenlets
does not hold the GIL or, in other words, is native code and not Python
code. This is, again, the use case here.

Whether that is practically feasible with the current API is what I would
like to find out.

| it is possible (I believe) to carefully write multiple python threads
| each with its own gevent greenlets

The library is already able to do that. To do it effectively, either each
individual hub thread has to release the GIL (otherwise the threads become
serialised), or it has to be possible to call gevent's co-operative calls
via a C-API while not holding the GIL.

In short, the only thing running in greenlets that needs the GIL and cannot
run in multiple threads at once is gevent's select.

The question is simply whether or not there is a way to call gevent.select
via a C-API *or* whether or not multiple native threads with their own hub
each release the GIL when running the event loop.