Load the expression data

Use read.csv for CSV files and read.table for tab-delimited files.

In [1]:
ppm <- read.csv("ppm.csv", as.is = TRUE)
head(ppm)
A data.frame: 6 × 6
  reference_id  Fe_control_1  Fe_left_3  Fe_right_4  Fe_starv_5  annotation
  <chr>         <dbl>         <dbl>      <dbl>       <dbl>       <chr>
1 AT1G01010.1   18.914394     22.617745  29.5624070  21.329631   NAC domain containing protein 1
2 AT1G01020.1   13.331731      7.539248   8.7370676   9.057788   Arv1-like protein
3 AT1G01020.2   12.165205      6.954307   8.4976959   8.181228   Arv1-like protein
4 AT1G01030.1    1.333173      2.794721   0.9574869   1.363538   AP2/B3-like transcriptional factor family protein
5 AT1G01040.1   55.659978     50.629953  51.9436624  51.522259   dicer-like 1
6 AT1G01040.2   53.826865     49.330083  50.5074321  49.379556   dicer-like 1
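
As a minimal sketch of how the two readers differ, the snippet below writes the same small table once as CSV and once as tab-delimited text, then reads each file back. The toy values and the temporary file names are invented for illustration; only read.csv and read.table come from the text above.

```r
# Toy table (hypothetical values, not the real ppm.csv)
toy <- data.frame(
  reference_id = c("AT1G01010.1", "AT1G01020.1"),
  Fe_control_1 = c(18.9, 13.3),
  annotation   = c("NAC domain containing protein 1", "Arv1-like protein")
)

csv_file <- tempfile(fileext = ".csv")
tsv_file <- tempfile(fileext = ".txt")
write.csv(toy, csv_file, row.names = FALSE)
write.table(toy, tsv_file, sep = "\t", row.names = FALSE, quote = FALSE)

# read.csv assumes a header line and comma separators by default
from_csv <- read.csv(csv_file, as.is = TRUE)

# read.table assumes no header and splits on any whitespace by default;
# sep = "\t" keeps annotations that contain spaces in a single column
from_tsv <- read.table(tsv_file, header = TRUE, sep = "\t", as.is = TRUE)

# Compare the two results
all.equal(from_csv, from_tsv)
```

Stating sep = "\t" explicitly matters mainly when a column, such as the annotation here, contains spaces.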

Load the length data

Load the file exported from the database. This one is tab-delimited, so we use read.table. The file was produced with the following command:

psql -h gerbera pothos -AtF $'\t' -c "select reference_id, length(sequence) from ref_ath10" >length.txt

In [2]:
len <- read.table("length.txt", as.is = TRUE)
head(len)
A data.frame: 6 × 2
  V1           V2
  <chr>        <int>
1 ATMG00130.1   366
2 ATCG00860.1  6885
3 ATMG00516.1   318
4 ATCG00670.1   591
5 ATCG01020.1   159
6 ATMG00900.1   771
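
Because length.txt has no header line, read.table assigns the default column names V1 and V2. If more descriptive names are preferred, one option is the col.names argument. The sketch below uses two rows copied from the output above; the names "reference_id" and "length" are assumptions chosen for illustration, and the rest of this notebook keeps the default names.

```r
# Two rows from the output above, tab-separated as in length.txt
txt <- "ATMG00130.1\t366\nATCG00860.1\t6885"

# text = lets read.table parse a string instead of a file;
# the column names given here are illustrative, not from the original file
len_named <- read.table(text = txt, sep = "\t", as.is = TRUE,
                        col.names = c("reference_id", "length"))
str(len_named)
```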

Merge the tables

Merge the two tables using the gene ID (reference_id) as the key.

In [3]:
d <- merge(ppm, len, by.x = 1, by.y = 1)
head(d)
A data.frame: 6 × 7
  reference_id  Fe_control_1  Fe_left_3  Fe_right_4  Fe_starv_5  annotation                                         V2
  <chr>         <dbl>         <dbl>      <dbl>       <dbl>       <chr>                                              <int>
1 AT1G01010.1   18.914394     22.617745  29.5624070  21.329631   NAC domain containing protein 1                    1688
2 AT1G01020.1   13.331731      7.539248   8.7370676   9.057788   Arv1-like protein                                  1623
3 AT1G01020.2   12.165205      6.954307   8.4976959   8.181228   Arv1-like protein                                  1085
4 AT1G01030.1    1.333173      2.794721   0.9574869   1.363538   AP2/B3-like transcriptional factor family protein  1905
5 AT1G01040.1   55.659978     50.629953  51.9436624  51.522259   dicer-like 1                                       6251
6 AT1G01040.2   53.826865     49.330083  50.5074321  49.379556   dicer-like 1                                       5877
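
By default merge keeps only the rows whose key appears in both tables, so an ID with no entry in the length table would be dropped without warning. The following sketch, using invented toy IDs and values, shows one way to check for such IDs and, if desired, keep them with all.x = TRUE.

```r
# Toy tables (hypothetical IDs and values)
ppm_toy <- data.frame(reference_id = c("A.1", "B.1", "C.1"),
                      sample_1 = c(10, 20, 30),
                      stringsAsFactors = FALSE)
len_toy <- data.frame(V1 = c("A.1", "B.1"),
                      V2 = c(1000L, 2000L),
                      stringsAsFactors = FALSE)

# Default merge is an inner join: C.1 has no length, so it is dropped
inner <- merge(ppm_toy, len_toy, by.x = 1, by.y = 1)
nrow(inner)

# IDs present in the expression table but missing from the length table
missing_ids <- setdiff(ppm_toy$reference_id, len_toy$V1)
missing_ids

# all.x = TRUE keeps every expression row and fills the missing length with NA
left <- merge(ppm_toy, len_toy, by.x = 1, by.y = 1, all.x = TRUE)
left
```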

Divide by length

In [4]:
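expr_cols <- 2:(ncol(d) - 2)  # skip the ID (first) and the annotation and length (last two)
d[, expr_cols] <- d[, expr_cols] / d[, ncol(d)] * 1000  # per 1,000 bases of sequence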
head(d)
A data.frame: 6 × 7
  reference_id  Fe_control_1  Fe_left_3  Fe_right_4  Fe_starv_5  annotation                                         V2
  <chr>         <dbl>         <dbl>      <dbl>       <dbl>       <chr>                                              <int>
1 AT1G01010.1   11.2052096    13.399138  17.5132743  12.636037   NAC domain containing protein 1                    1688
2 AT1G01020.1    8.2142522     4.645255   5.3832826   5.580892   Arv1-like protein                                  1623
3 AT1G01020.2   11.2121703     6.409499   7.8319778   7.540303   Arv1-like protein                                  1085
4 AT1G01030.1    0.6998284     1.467045   0.5026178   0.715768   AP2/B3-like transcriptional factor family protein  1905
5 AT1G01040.1    8.9041718     8.099497   8.3096564   8.242243   dicer-like 1                                       6251
6 AT1G01040.2    9.1589017     8.393752   8.5940841   8.402171   dicer-like 1                                       5877
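
The calculation in this step is value / length * 1000, that is, expression per 1,000 bases of sequence. Assuming the ppm values are counts per million, this would presumably give the per-kilobase-per-million figure that the output file name refers to. As a small worked check, the sketch below recomputes the Fe_control_1 column for the first two rows of the merged table shown earlier.

```r
# First two rows of the merged table (Fe_control_1 and V2 only),
# copied from the displayed, rounded values
toy <- data.frame(reference_id = c("AT1G01010.1", "AT1G01020.1"),
                  Fe_control_1 = c(18.914394, 13.331731),
                  V2 = c(1688L, 1623L))

# Element-wise division: each value is divided by the length in the same row
per_kb <- toy$Fe_control_1 / toy$V2 * 1000
round(per_kb, 4)  # 11.2052 8.2143
```

These agree with the first two Fe_control_1 values in the output above, up to the rounding of the displayed inputs.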

Write to a file

In [5]:
write.csv(d[, -ncol(d)], "fpkm.csv", row.names = FALSE)  # drop the length column (last) before writing
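
To confirm that the length column is left out of the written file, a quick round trip can help. The sketch below uses an invented toy table and a temporary file rather than the real fpkm.csv.

```r
# Toy version of the merged table (hypothetical values)
d_toy <- data.frame(reference_id = c("A.1", "B.1"),
                    sample_1 = c(11.2, 8.2),
                    annotation = c("protein a", "protein b"),
                    V2 = c(1688L, 1623L))

out_file <- tempfile(fileext = ".csv")

# A negative index drops the last column (the length), as in the cell above
write.csv(d_toy[, -ncol(d_toy)], out_file, row.names = FALSE)

# Read the file back and list its columns
back <- read.csv(out_file, as.is = TRUE)
names(back)
```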