I generated a Python dictionary of all the duplicate images in a folder. The Python dict now contains values in the following format:
{
"image_1.jpg": ['image_xyz.jpg', 'image_abc.jpg'],
"image_xyz.jpg": ["image_1.jpg", "image_abc.jpg"],
"image_abc.jpg": ["image_xyz.jpg","image_1.jpg"],
"image_2.jpg": ["image_3.jpg"],
"image_3.jpg": ["image_2.jpg"],
"image_5.jpg": []
}
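So every key-value pair effectively appears at least twice in the dictionary, once under each member of a duplicate group. Keys with no duplicates have an empty list. Is there a way to remove all of these redundant key-value pairs, so that the dictionary looks like this: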
{
"image_1.jpg": ['image_xyz.jpg', 'image_abc.jpg'],
"image_2.jpg": ["image_3.jpg"],
"image_5.jpg": []
}
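I tried using a list to first store all the values of the key-value pairs and then delete them from the dictionary, but that emptied the whole dictionary.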
source = {
"image_1.jpg": ['image_xyz.jpg', 'image_abc.jpg'],
"image_xyz.jpg": ["image_1.jpg", "image_abc.jpg"],
"image_abc.jpg": ["image_xyz.jpg","image_1.jpg"],
"image_2.jpg": ["image_3.jpg"],
"image_3.jpg": ["image_2.jpg"],
"image_5.jpg": []
}
dest = dict()
for k, v in source.items():
    ok = True
    for k1, v1 in dest.items():
        # skip k if it is already listed as a duplicate of a kept key
        if k in v1:
            ok = False
    if ok:
        dest[k] = v
print(dest)  # New filtered dict
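The nested loop above rescans `dest` for every key, so its cost grows quadratically with the number of images. A minimal sketch of the same idea using a set of already-covered names follows. It assumes, as in the example data, that each member of a duplicate group lists all the other members, and it relies on dictionaries keeping insertion order (Python 3.7+). The names `keep_first_of_each_group` and `covered` are made up for illustration.

```python
def keep_first_of_each_group(source):
    """Keep only the first key of every duplicate group (hypothetical helper)."""
    covered = set()  # names already listed as a duplicate of a kept key
    dest = {}
    for name, duplicates in source.items():
        if name in covered:
            continue  # this image already appears under an earlier key
        dest[name] = duplicates
        covered.update(duplicates)
    return dest


source = {
    "image_1.jpg": ["image_xyz.jpg", "image_abc.jpg"],
    "image_xyz.jpg": ["image_1.jpg", "image_abc.jpg"],
    "image_abc.jpg": ["image_xyz.jpg", "image_1.jpg"],
    "image_2.jpg": ["image_3.jpg"],
    "image_3.jpg": ["image_2.jpg"],
    "image_5.jpg": [],
}
filtered = keep_first_of_each_group(source)
print(filtered)
```

Which key survives for each group depends only on the order in which the keys were inserted into `source`.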
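I usually use this approach when removing duplicates from lists:
First put all the values, including the keys, into a matrix / 2D list, so that the first three entries, shown here: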
{
"image_1.jpg": ['image_xyz.jpg', 'image_abc.jpg'],
"image_xyz.jpg": ["image_1.jpg", "image_abc.jpg"],
"image_abc.jpg": ["image_xyz.jpg","image_1.jpg"],
}
become:
List=[
["image_1.jpg","image_xyz.jpg","image_abc.jpg"],
["image_xyz.jpg","image_1.jpg","image_abc.jpg"],
["image_abc.jpg","image_xyz.jpg","image_1.jpg"]
]
Make sure the keys are all at index 0, so that you can save them:
keys=[x[0] for x in List]
Then sort each inner list:
sorted_list=[sorted(x) for x in List]
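To see why the sorting step matters: rows of the same duplicate group hold the same names in different orders, and list equality is order-sensitive, so the rows only compare equal once each one is put into a canonical order. A small sketch, assuming every file name carries its `.jpg` extension; the variable name `rows` is just illustrative.

```python
rows = [
    ["image_1.jpg", "image_xyz.jpg", "image_abc.jpg"],
    ["image_xyz.jpg", "image_1.jpg", "image_abc.jpg"],
    ["image_abc.jpg", "image_xyz.jpg", "image_1.jpg"],
]

# Unsorted, the rows differ because list comparison respects element order.
rows_equal_unsorted = rows[0] == rows[1]

# Sorted, every row of the same duplicate group becomes the same list.
sorted_rows = [sorted(row) for row in rows]
rows_equal_sorted = sorted_rows[0] == sorted_rows[1] == sorted_rows[2]

print(rows_equal_unsorted, rows_equal_sorted)  # False True
```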
Then simply compare the sorted lists against each other and drop any list that equals another one:
deduped = []
for group in sorted_list:
    # build a new list instead of deleting from sorted_list while iterating over it
    if group not in deduped:  # keep only the first copy of each group
        deduped.append(group)
Now the duplicates are gone, and you still have the keys, so you can convert the list back into a dictionary if you like.
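Converting back to a dictionary could look like the sketch below. It is one possible way to finish the approach, not the only one: it builds the sorted group for each key, keeps the first key of every distinct group, and stores that key's original duplicate list as its value. The function name `dedupe_groups` is hypothetical, and the sketch assumes every member of a group lists all the others.

```python
def dedupe_groups(source):
    """Collapse mutually duplicate entries to one entry per group (illustrative)."""
    seen_groups = set()
    result = {}
    for key, duplicates in source.items():
        # The key plus its duplicates, in canonical (sorted) order.
        group = tuple(sorted([key, *duplicates]))
        if group in seen_groups:
            continue  # an earlier key already represents this group
        seen_groups.add(group)
        result[key] = list(duplicates)
    return result


source = {
    "image_1.jpg": ["image_xyz.jpg", "image_abc.jpg"],
    "image_xyz.jpg": ["image_1.jpg", "image_abc.jpg"],
    "image_abc.jpg": ["image_xyz.jpg", "image_1.jpg"],
    "image_2.jpg": ["image_3.jpg"],
    "image_3.jpg": ["image_2.jpg"],
    "image_5.jpg": [],
}
collapsed = dedupe_groups(source)
print(collapsed)
```

Tuples are used for the groups because lists are not hashable and so cannot be stored in a set.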
Full code (if needed):
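List = [
    ["image_1.jpg", "image_xyz.jpg", "image_abc.jpg"],
    ["image_xyz.jpg", "image_1.jpg", "image_abc.jpg"],
    ["image_abc.jpg", "image_xyz.jpg", "image_1.jpg"],
]
keys = [x[0] for x in List]  # the first element of each row is the original key
sorted_list = [sorted(x) for x in List]  # rows of the same group become identical
deduped = []
for group in sorted_list:
    # build a new list instead of deleting from sorted_list while iterating over it
    if group not in deduped:  # keep only the first copy of each group
        deduped.append(group)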