Given this simple Python script:
#!/usr/bin/python
from subprocess import *
proc = Popen(["/bin/sh", "-c", "for i in `seq 1 10`; do echo $i; sleep 1; done"], stdout=PIPE, bufsize=1)
a = proc.stdout.__iter__()
print a.next()
it is interesting to notice that a.next()
takes 10 seconds to run: the first line is not returned until the whole shell loop has finished. I tried
with bufsize=1
(line buffered) and bufsize=0
(no buffering), and the behaviour is the same in both cases.
I tried:
sh -c 'for i in `seq 1 10`; do echo $i; sleep 1; done' | cat
sh -c 'for i in `seq 1 10`; do echo $i; sleep 1; done' | less
sh -c 'for i in `seq 1 10`; do echo $i; sleep 1; done' > /tmp/test
In all three cases the output appears one line at a time, so the buffering must be happening inside Python, not in the shell or the pipe.
Using readline()
instead of the iterator works as expected, printing the first line immediately:
#!/usr/bin/python
from subprocess import *
proc = Popen(["/bin/sh", "-c", "for i in `seq 1 10`; do echo $i; sleep 1; done"], stdout=PIPE, bufsize=1)
print proc.stdout.readline()
Therefore Python 2 does hidden read-ahead buffering when using __iter__()
on a file object: the file iterator reads ahead in large chunks for performance, so it does not yield the first line until that chunk is filled (here, not until the subprocess exits and the pipe hits EOF). Good to know.
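As an aside (not in the original, and assuming a Python 3 environment): Python 3's io module does not have this hidden read-ahead, so iterating the pipe directly yields lines as soon as they arrive. A minimal sketch, with the loop shortened from the original 10-second example:

```python
import subprocess
import time

# A shell loop that emits one line every 0.2 s.
proc = subprocess.Popen(
    ["/bin/sh", "-c", "for i in 1 2 3; do echo $i; sleep 0.2; done"],
    stdout=subprocess.PIPE,
)
start = time.time()
first = next(iter(proc.stdout))  # in Python 3 this does not block until EOF
elapsed = time.time() - start
proc.wait()
```

Here first is b"1\n" and elapsed is well under the 0.6 s the loop takes overall.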
Here is how to get an iterator that does not introduce unwanted buffering:
def iterlines(fd):
    while True:
        line = fd.readline()
        if len(line) == 0:
            break
        yield line
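To check that iterlines() really avoids the read-ahead, you can time how long the first line takes to arrive (shown here in Python 3 syntax, where a.next() becomes next(a); the sleep is shortened from the original example):

```python
import subprocess
import time

def iterlines(fd):
    # Plain readline() loop: no hidden read-ahead buffer.
    while True:
        line = fd.readline()
        if len(line) == 0:
            break
        yield line

proc = subprocess.Popen(
    ["/bin/sh", "-c", "for i in 1 2 3; do echo $i; sleep 0.2; done"],
    stdout=subprocess.PIPE,
)
start = time.time()
first = next(iterlines(proc.stdout))
elapsed = time.time() - start
proc.wait()
```

The first line comes back almost immediately instead of after the subprocess exits.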