我正在尝试使用panda筛选数据
df_c = pd.read_csv (r'C:\Users\User\Documents\Research\output-mutations.csv')
df_c.drop_duplicates(subset = ["FILENAME", "CHAIN", "MUTATION_CODE"],
keep = False, inplace = True)
df_c.to_csv (r'C:\Users\User\Documents\Research\output-mutations-concise.csv')```
现在,我只有这些了。我正在尝试删除其中一列中的重复项,并打印上一列中的所有匹配项。我有一个我想做的例子,但我不知道从哪里开始,以及使用熊猫的命令。我试过了.drop命令
1条回答网友
1楼 ·
发布于 2023-03-07 00:11:12
您可以使用groupby
和apply
代替drop
df = df.groupby(by=[ "FILENAME", "CHAIN"],as_index=False).apply(lambda x: ";".join(x["MUTATION_CODE"]))
df.columns = ["FILENAME", "CHAIN", "MUTATION_CODE"]
print(df)
itme.xyz
[…]that will be the end of this article. Here youll uncover some websites that we feel you will value, just click the hyperlinks over[…]