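I'm trying to use Sailfish, which takes multiple fastq files as arguments, in a ruffus pipeline. I execute Sailfish using the subprocess module in Python, but <() in the subprocess call doesn't work even when I set shell=True. This is the command I want to execute from Python: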
sailfish quant [options] -1 <(cat sample1a.fastq sample1b.fastq) -2 <(cat sample2a.fastq sample2b.fastq) -o [output_file]
or (preferably):
sailfish quant [options] -1 <(gunzip -c sample1a.fastq.gz sample1b.fastq.gz) -2 <(gunzip -c sample2a.fastq.gz sample2b.fastq.gz) -o [output_file]
In general:
someprogram <(someprocess) <(someprocess)
How would I do this in Python? Is subprocess the right approach?
Solution
To emulate bash process substitution:
#!/usr/bin/env python
from subprocess import check_call

check_call('someprogram <(someprocess) <(anotherprocess)',
           shell=True, executable='/bin/bash')
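The reason shell=True alone is not enough is that subprocess runs the command through /bin/sh by default, and /bin/sh is often not bash (for example, it is dash on Debian and Ubuntu), so the <(...) syntax is rejected. Here is a minimal sketch that can be run as-is, assuming bash is installed at /bin/bash; `cat` and `printf` are only stand-ins for someprogram and someprocess:

```python
from subprocess import check_output

# `cat` stands in for someprogram; the two `printf` calls stand in for
# someprocess and anotherprocess (illustrative placeholders only).
command = "cat <(printf 'first\\n') <(printf 'second\\n')"

# executable='/bin/bash' makes bash, rather than the default /bin/sh,
# interpret the command, so <(...) is understood.
output = check_output(command, shell=True, executable='/bin/bash')
print(output.decode())
```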
In pure Python, you could use named pipes:
#!/usr/bin/env python
from subprocess import Popen

with named_pipes(n=2) as paths:
    someprogram = Popen(['someprogram'] + paths)
    processes = []
    for path, command in zip(paths, ['someprocess', 'anotherprocess']):
        with open(path, 'wb', 0) as pipe:
            processes.append(Popen(command, stdout=pipe, close_fds=True))
    for p in [someprogram] + processes:
        p.wait()
where named_pipes(n) is:
import os
import shutil
import tempfile
from contextlib import contextmanager

@contextmanager
def named_pipes(n=1):
    dirname = tempfile.mkdtemp()
    try:
        paths = [os.path.join(dirname, 'named_pipe' + str(i)) for i in range(n)]
        for path in paths:
            os.mkfifo(path)
        yield paths
    finally:
        shutil.rmtree(dirname)
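To see the named-pipe approach end to end, here is a self-contained sketch that combines the two pieces. It assumes a POSIX system with Python 3; `cat` and `printf` are stand-ins for someprogram and the producer processes, and the names `producers`, `consumer`, and `writers` are chosen only for this illustration:

```python
import os
import shutil
import tempfile
from contextlib import contextmanager
from subprocess import PIPE, Popen


@contextmanager
def named_pipes(n=1):
    """Create n FIFOs in a fresh temporary directory; remove it on exit."""
    dirname = tempfile.mkdtemp()
    try:
        paths = [os.path.join(dirname, 'named_pipe' + str(i)) for i in range(n)]
        for path in paths:
            os.mkfifo(path)
        yield paths
    finally:
        shutil.rmtree(dirname)


# Stand-in producers (hypothetical, for illustration only).
producers = [['printf', 'first\\n'], ['printf', 'second\\n']]

with named_pipes(n=len(producers)) as paths:
    # Start the consumer first: opening a FIFO for writing blocks until
    # some process has opened it for reading.
    consumer = Popen(['cat'] + paths, stdout=PIPE)
    writers = []
    for path, command in zip(paths, producers):
        with open(path, 'wb', 0) as pipe:
            writers.append(Popen(command, stdout=pipe))
    output = consumer.communicate()[0]  # read everything, then wait
    for p in writers:
        p.wait()

print(output.decode())
```

The consumer is started before any pipe is opened for writing because the parent's open() call would otherwise block forever waiting for a reader.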
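Another, preferable way to implement bash process substitution (it does not need to create a named entry on disk) is to use /dev/fd/N filenames, if they are available, as suggested by @Dunes. On FreeBSD, fdescfs(5) (/dev/fd/#) creates entries for all file descriptors opened by the process. To test availability, run: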
$ test -r /dev/fd/3 3</dev/null && echo /dev/fd is available
If it fails, try to symlink /dev/fd to proc(5), as is done on some Linux systems:
$ ln -s /proc/self/fd /dev/fd
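The same availability check can be done from Python before choosing between the two approaches. This is only a sketch; `dev_fd_available` is a hypothetical helper name, not part of any library:

```python
import os


def dev_fd_available():
    """Return True if an open descriptor is readable via /dev/fd/N."""
    fd = os.open(os.devnull, os.O_RDONLY)
    try:
        # Mirrors `test -r /dev/fd/N` from the shell check.
        return os.access('/dev/fd/%d' % fd, os.R_OK)
    finally:
        os.close(fd)


print(dev_fd_available())
```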
Here is a /dev/fd-based implementation of the bash command someprogram <(someprocess) <(anotherprocess):
#!/usr/bin/env python3
from contextlib import ExitStack
from subprocess import CalledProcessError, Popen, PIPE

def kill(process):
    if process.poll() is None:  # still running
        process.kill()

with ExitStack() as stack:  # for proper cleanup
    processes = []
    for command in [['someprocess'], ['anotherprocess']]:  # start child processes
        processes.append(stack.enter_context(Popen(command, stdout=PIPE)))
        stack.callback(kill, processes[-1])  # kill on someprogram exit

    fds = [p.stdout.fileno() for p in processes]
    someprogram = stack.enter_context(
        Popen(['someprogram'] + ['/dev/fd/%d' % fd for fd in fds], pass_fds=fds))
    for p in processes:  # close pipes in the parent
        p.stdout.close()
# exit stack: wait for processes
if someprogram.returncode != 0:  # errors shouldn't go unnoticed
    raise CalledProcessError(someprogram.returncode, someprogram.args)
Note: on my Ubuntu machine, the subprocess code only works on Python 3.4 or later, even though pass_fds has been available since Python 3.2.
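To relate the /dev/fd approach to the gzipped inputs from the question, here is a sketch that wraps the same logic in a function and feeds the output of `gunzip -c` to a consumer. It assumes Python 3.4 or later, an available /dev/fd, and `gunzip` on the PATH. `run_with_substitution` is a hypothetical helper name, `cat` merely stands in for the real program, and the file names and contents are made up for the demonstration:

```python
import gzip
import os
import tempfile
from contextlib import ExitStack
from subprocess import PIPE, Popen


def kill(process):
    if process.poll() is None:  # still running
        process.kill()


def run_with_substitution(program, commands, **kwargs):
    """Run program with one /dev/fd/N argument appended per command."""
    with ExitStack() as stack:
        producers = []
        for command in commands:
            producers.append(stack.enter_context(Popen(command, stdout=PIPE)))
            stack.callback(kill, producers[-1])  # kill once the consumer exits
        fds = [p.stdout.fileno() for p in producers]
        consumer = stack.enter_context(
            Popen(program + ['/dev/fd/%d' % fd for fd in fds],
                  pass_fds=fds, **kwargs))
        for p in producers:
            p.stdout.close()  # the parent no longer needs its copies
        output = consumer.communicate()[0]  # None unless stdout=PIPE is given
    return consumer.returncode, output


with tempfile.TemporaryDirectory() as tmp:
    parts = []
    for name, text in [('a.fastq.gz', b'@read1\n'), ('b.fastq.gz', b'@read2\n')]:
        path = os.path.join(tmp, name)
        with gzip.open(path, 'wb') as f:
            f.write(text)
        parts.append(path)
    # Roughly equivalent to: cat <(gunzip -c a.fastq.gz b.fastq.gz)
    returncode, output = run_with_substitution(
        ['cat'], [['gunzip', '-c'] + parts], stdout=PIPE)

print(output.decode())
```

This helper appends the /dev/fd paths at the end of the argument list. A program that expects a flag before each path would need its argument list assembled differently.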