
Python nuclear translocation analysis - from ImageJ

by 르미르미 2023. 10. 28.

 

 

Image j nuclear translocation measurement

Using a 96-well plate, analyzing wells B02 to G11, fld1~9 per well. Red analysis - nucleus and cytosol // loop over the letters B to G (b,c,d,e,f,g = 66,67,68,69,70,71) for (charCode = 66; charCode

arumyworld.tistory.com

 

Data names

B - 02-1 ~ G - 11-9 (.csv)

Subgroups

B - 02 ~ B - 11 through G - 02 ~ G - 11
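
As a minimal sketch of how such a name breaks down, the helper below splits a file name into subgroup (well), field number, and channel. The function name `parse_name` and the example file name `B - 02-1-Green.csv` are assumptions for illustration; the channel suffix is inferred from the `Green`/`Red` check in the sorting code of this post.

```python
def parse_name(filename):
    """Split a name like 'B - 02-1-Green.csv' into (well, field, channel)."""
    parts = filename.split('-')
    row = parts[0].strip()                     # plate row letter, e.g. 'B'
    col = parts[1].strip()                     # plate column, e.g. '02'
    field = parts[2].strip()                   # field number within the well
    channel = parts[3].split('.')[0].strip()   # 'Green' or 'Red'
    return f"{row}-{col}", field, channel

print(parse_name('B - 02-1-Green.csv'))  # ('B-02', '1', 'Green')
```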

 

Check the nuclear translocation of the red signal (nucleus, cytosol)

Green labels only the cells that express the protein
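
The red readout used in this post is the background-corrected nucleus signal divided by the background-corrected cytosol (band) signal. A rough sketch of that arithmetic for single cells follows; the function name `translocation_ratio` is hypothetical and the intensities are made-up values, with 680 standing in for the red background.

```python
def translocation_ratio(nucleus, band, red_background):
    """Background-corrected nucleus/cytosol ratio for one cell."""
    return (nucleus - red_background) / (band - red_background)

# Made-up (nucleus, band) intensities, for illustration only
cells = [(1680, 1180), (980, 1280)]
ratios = [translocation_ratio(n, b, 680) for n, b in cells]
print(ratios)  # [2.0, 0.5]
```

A ratio above 1 means the corrected red signal is stronger in the nucleus than in the surrounding cytosol band.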

# Clear files left over from a previous run (-f: no error if none exist)
!rm -f *.xlsx
!rm -f *.csv

## Upload files
from google.colab import files
uploaded = files.upload()

 

Input

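Green_background=1895  # subtracted from the green 'Mean' values
Red_background=680     # subtracted from the red 'Nucleus' and 'Band' values
green_threshold=400    # corrected green above this counts as green-positive
ratio_threshold=1      # nucleus/cytosol ratio above this counts as ratio-positive
File_name='231028'     # name of the exported .xlsx file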

 

Data sorting

import pandas as pd
import numpy as np

n, c, r, g={}, {}, {}, {}

namelist2=[]
df={}

for fn in uploaded.keys():
  parts=fn.split('-')
  # e.g. 'B - 02-1-Green.csv' -> 'B-02-1' (row, column, field)
  name='-'.join(parts[0:3]).replace(" ", "")
  namelist2.append(name)
  channel=parts[3].split('.')[0]
  if channel=='Green':
    g[name]=pd.read_csv(fn)['Mean']-Green_background
  if channel=='Red':
    red=pd.read_csv(fn)  # read the Red file only once
    n[name]=red['Nucleus']-Red_background
    c[name]=red['Band']-Red_background

namelist=set(namelist2)

for k in namelist:
  r[k]=n[k]/c[k]
  df[k]=pd.concat([g[k],n[k], c[k], r[k]], axis=1)
  df[k].columns=['Green','Nucleus', 'Cytosol', 'Ratio']

 

(Optional) Green-positive cells only

for k in namelist:
  df[k]=df[k][df[k]['Green'] > green_threshold].reset_index(drop=True)
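
To make the effect of this gating concrete, here is a small sketch in plain Python rather than pandas. The helper name `green_positive` and the cell values are invented for illustration; the thresholds 400 and 1 mirror `green_threshold` and `ratio_threshold` from the Input section.

```python
def green_positive(cells, green_threshold):
    """Keep cells whose background-corrected green signal exceeds the threshold."""
    return [cell for cell in cells if cell['Green'] > green_threshold]

# Made-up cells, for illustration only
cells = [
    {'Green': 950, 'Ratio': 2.0},
    {'Green': 120, 'Ratio': 1.6},   # dropped: green below threshold
    {'Green': 610, 'Ratio': 0.5},
]
kept = green_positive(cells, 400)

# Percentage of the remaining cells with ratio above 1 (NaN if none remain)
if kept:
    percent_positive = 100 * sum(c['Ratio'] > 1 for c in kept) / len(kept)
else:
    percent_positive = float('nan')
print(len(kept), percent_positive)  # 2 50.0
```

Note that the second cell has a high ratio but is excluded, so skipping this optional step would change the percentage reported later.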

 

Organize all data --> Excel export

variables = ['Nucleus', 'Green', 'Cytosol', 'Ratio', 'Nucleus_Mean', 'Green_Mean', 'Cytosol_Mean', 'Ratio_Mean', 'Number', 'Ratio_Positive']
result={}
result_df={}
result_s={}
new_series={}
dff={}

for v in variables:
  result[v]={}
for k in namelist:
  # per-cell values for each field
  for v in variables[0:4]:
    result[v][k]=df[k][v]
  # per-field means of Nucleus, Green and Cytosol
  for v in variables[0:3]:
    result[v+'_Mean'][k]=df[k][v].mean()
  # mean ratio, multiplied by 100
  result['Ratio_Mean'][k]=df[k]['Ratio'].mean()*100
  # number of cells in the field
  result['Number'][k]=len(df[k])
  # percentage of cells with ratio above ratio_threshold (NaN if the field has no cells)
  if len(df[k]) > 0 :
    result['Ratio_Positive'][k]=(df[k]['Ratio'] > ratio_threshold).mean()*100
  else:
    result['Ratio_Positive'][k]=np.nan

# Sort in order
sorted_dict={}
for v in variables:
  sorted_keys = sorted(result[v].keys())
  sorted_dict[v] = {key: result[v][key] for key in sorted_keys}

# Convert to DataFrames
for v in variables[0:4]:
  result_df[v]=pd.concat(sorted_dict[v], axis=1)
for v in variables[4:]:
  result_s[v]=pd.Series(sorted_dict[v])

  # Select the entries that belong to each well and re-index them by field number
  new_series[v]={}
  for k in namelist:
    new_k=k.split('-')[0]+'-'+k.split('-')[1]
    # match 'B-02-' rather than 'B-02' so that only this well's fields are selected
    selected_result_s = result_s[v][result_s[v].index.str.startswith(new_k+'-') ]
    new_indices=selected_result_s.index.str.split('-').str[2]
    new_series[v][new_k]=pd.Series(selected_result_s.values, index=new_indices)


# Set of well names such as 'B-02' (not used below)
group={k.split('-')[0]+'-'+k.split('-')[1] for k in namelist}

for v in variables[4:]:
  dff[v]=pd.DataFrame(new_series[v])
  result_df[v] = dff[v].sort_index(axis=1)

with pd.ExcelWriter(File_name+'.xlsx') as writer:
  for v in variables:
    result_df[v].to_excel(writer, sheet_name=v)
files.download(File_name+'.xlsx')
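
For the summary sheets (Nucleus_Mean through Ratio_Positive), the code above arranges one value per field into a table with one column per well and one row per field number. The sketch below shows the same regrouping with plain dictionaries; `to_well_table` is a hypothetical helper and the counts are made up.

```python
from collections import defaultdict

def to_well_table(per_field):
    """Turn {'B-02-1': value, ...} into {well: {field: value}}, wells sorted."""
    table = defaultdict(dict)
    for name, value in per_field.items():
        row, col, field = name.split('-')
        table[f"{row}-{col}"][field] = value
    return {well: table[well] for well in sorted(table)}

# Made-up cell counts per field, for illustration only
counts = {'B-03-1': 9, 'B-02-2': 15, 'B-02-1': 12}
table = to_well_table(counts)
print(list(table))  # ['B-02', 'B-03']
```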

 
