batch processing - automatic loading of directory files into a python script

- June 15, 2015

I have 125 files in a directory on a Linux machine. I have a script called annotate.py that takes in a single file and adds a features column. I am able to put in the filename of one of the 125 files and run the annotate.py script, but that is not effective programming.

All 125 files have a similar format in terms of column names and column numbers. Can someone please tell me how I can run annotate.py on all 125 files?

annotate.py merges two files on the chromosome and position columns. input_file1 stands for the 125 files, which are read in one at a time and merged with input_file2. The output should be a separate file for each input, each named after the original input_file1.

#!/usr/bin/python
# python snp_search.py input_file1 input_file2
import numpy as np
import pandas as pd

# input_file1: no header row, so name the columns by hand
snp_f = pd.read_table('input_file1.txt', sep="\t", header=None)
snp_f.columns = ['chr', 'pos']
# input_file2: first row is the header (header=0; pandas rejects header=True)
lsnp_f = pd.read_table('input2_snpsearch.txt', sep="\t", header=0)
lsnp_f.columns = ['snpid', 'chr', 'pos']
# keep only the rows whose chr and pos appear in both files
final_snp = pd.merge(snp_f, lsnp_f, on=['chr', 'pos'])
final_snp.to_csv('input_file1_annotated.txt', index=False, sep='\t')

Please help! Thanks!

The os module is your friend: http://docs.python.org/2/library/os.html. The basic idea is to import os and use os.listdir() to get a list of the files in the directory you're interested in. Something like the following should work.

import numpy as np
import pandas as pd
import os

input_file2 = 'input2_snpsearch.txt'  # name as given in the question
input_dir = './'  # or any other path
files = os.listdir(input_dir)  # listdir gives only the file names

# You don't want to merge input_file2 with itself, so
# in case it's in the same directory as the other files,
# filter it out.
files_of_interest = (f for f in files if f != input_file2)

for f in files_of_interest:
    full_name = os.path.join(input_dir, f)  # necessary if input_dir is not './'
    snp_f = pd.read_table(full_name, sep="\t", header=None)  # input_file1
    snp_f.columns = ['chr', 'pos']
    # header=0: first row is the header (pandas rejects header=True)
    lsnp_f = pd.read_table(input_file2, sep="\t", header=0)  # input_file2
    lsnp_f.columns = ['snpid', 'chr', 'pos']
    final_snp = pd.merge(snp_f, lsnp_f, on=['chr', 'pos'])
    new_fname = f.split('.')[0] + '_annotated.txt'
    final_snp.to_csv(os.path.join(input_dir, new_fname), index=False, sep='\t')
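One thing to watch with the loop above: os.listdir() returns everything in the directory, so a second run would also pick up the *_annotated.txt files written by the first run, and f.split('.')[0] cuts a name like sample.2.txt down to sample. Here is a minimal sketch of a stricter file selection, assuming the inputs all end in .txt (as input_file1.txt in the question suggests). The helper names find_inputs and output_path are made up for illustration; they are not part of annotate.py.

```python
import glob
import os


def find_inputs(input_dir, reference_name, suffix="_annotated"):
    """Return sorted paths of the .txt files in input_dir that still need annotating.

    Skips the reference file (input_file2) and any file whose name already
    ends in the suffix, so a re-run does not annotate earlier outputs.
    """
    paths = []
    for path in sorted(glob.glob(os.path.join(input_dir, "*.txt"))):
        name = os.path.basename(path)
        stem, _ext = os.path.splitext(name)
        if name == reference_name or stem.endswith(suffix):
            continue
        paths.append(path)
    return paths


def output_path(path, suffix="_annotated"):
    """Build the output path next to the input: a.b.txt -> a.b_annotated.txt."""
    stem, ext = os.path.splitext(path)  # splits off only the last extension
    return stem + suffix + ext
```

To use it, replace the os.listdir() and files_of_interest lines with for full_name in find_inputs(input_dir, input_file2): and write the result to output_path(full_name). The read and merge steps stay the same.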
