Python Notes: Converting Chinese Labels to English Labels in the Annotations XML Files with Python!

2021/4/2 22:42:22

Contents

  • 1. VOC-format XML labels
  • 2. Code

1. VOC-format XML labels
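
For reference, a Pascal VOC annotation file stores each labelled object in an `<object>` element, and that element's `<name>` child holds the class label. The sketch below parses a small hand-written sample and reads the labels with ElementTree. The file name, image size, and box coordinates are made up for illustration; only the two label strings are taken from the mapping used in the code section.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal VOC-style annotation; all values are illustrative only.
SAMPLE_XML = """<annotation>
    <folder>VOC2007</folder>
    <filename>example.jpg</filename>
    <size>
        <width>640</width>
        <height>480</height>
        <depth>3</depth>
    </size>
    <object>
        <name>23可回收物/易拉罐</name>
        <bndbox>
            <xmin>10</xmin>
            <ymin>20</ymin>
            <xmax>110</xmax>
            <ymax>220</ymax>
        </bndbox>
    </object>
    <object>
        <name>37有害垃圾/电池</name>
        <bndbox>
            <xmin>200</xmin>
            <ymin>50</ymin>
            <xmax>260</xmax>
            <ymax>120</ymax>
        </bndbox>
    </object>
</annotation>"""

root = ET.fromstring(SAMPLE_XML)
# "object/name" selects the <name> child of every <object> directly under the root.
labels = [name.text for name in root.findall("object/name")]
print(labels)
```

The `<name>` text is the only part of the file that needs to change when translating labels; the bounding boxes and image metadata stay as they are.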

2. Code

# encoding:utf-8
import os
import xml.etree.ElementTree as ET

count = 0        # number of <name> labels that were replaced
list_xml = []    # distinct original labels, in the order first seen
# Maps each Chinese label to its English replacement.
# Named label_map so that it does not shadow the built-in dict.
label_map = {
    "0其他垃圾/塑料快餐盒": "1 snack box",
    "1其他垃圾/塑料": "1 stained plastic",
    "2其他垃圾/烟蒂": "1 cigarette butts",
    "3其他垃圾/牙签": "1 toothpick",
    "4其他垃圾/盘子": "1 basin",
    "5其他垃圾/木筷子": "1 chopstick",
    "6厨余垃圾/厨余垃圾": "2 Kitchen waste",
    "7厨余垃圾/骨头": "2 bone",
    "8厨余垃圾/果皮": "2 peel",
    "9厨余垃圾/果肉": "2 fruit pulp",
    "10厨余垃圾/茶叶渣": "2 tea leaves",
    "11厨余垃圾/菜叶": "2 vegetable leaf",
    "12厨余垃圾/蛋壳": "2 eggshell",
    "13厨余垃圾/鱼骨": "2 fishbone",
    "14可回收物/充电宝": "3 charge pal",
    "15可回收物/包": "3 package",
    "16可回收物/塑料化妆品": "3 cosmetics bottles",
    "17可回收物/塑料玩具": "3 plastic toys",
    "18可回收物/盘子": "3 bowl tub",
    "19可回收物/塑料晾衣架": "3 hangers",
    "20可回收物/快递纸带": "3 courier bags",
    "21可回收物/电线": "3 wire",
    "22可回收物/旧衣服": "3 old clothes",
    "23可回收物/易拉罐": "3 cans",
    "24可回收物/旧枕头": "3 pillow",
    "25可回收物/毛绒玩具": "3 plush toys",
    "26可回收物/洗发水瓶": "3 shampoo bottle",
    "27可回收物/玻璃杯": "3 glass",
    "28可回收物/鞋子": "3 leather shoes",
    "29可回收物/砧板": "3 chopping block",
    "30可回收物/纸盒": "3 carton",
    "31可回收物/调料瓶": "3 spice bottles",
    "32可回收物/玻璃瓶": "3 glass bottle",
    "33可回收物/金属食品罐": "3 food can",
    "34可回收物/铁锅": "3 pan",
    "35可回收物/食用油桶": "3 oil drum",
    "36可回收物/塑料饮料瓶": "3 bottles",
    "37有害垃圾/电池": "4 cell",
    "38有害垃圾/软膏": "4 ointment",
    "39有害垃圾/过期药物": "4 expired drugs",
}

# Raw strings keep the Windows backslashes from being read as escape sequences.
openPath = r"F:\data-2019-12-30-垃圾分类\VOCdevkit\VOC2007\Annotations_Chinese"
savePath = r"F:\data-2019-12-30-垃圾分类\VOCdevkit\VOC2007\Annotations_Englist"
os.makedirs(savePath, exist_ok=True)    # create the output directory if it is missing
fileList = os.listdir(openPath)         # names of all entries in openPath
for fileName in fileList:               # walk through the directory listing
    if fileName.endswith(".xml"):       # only process XML files
        print("filename=:", fileName)
        tree = ET.parse(os.path.join(openPath, fileName))
        root = tree.getroot()
        print("root-tag=:", root.tag)
        for child in root:              # first level below the root
            if child.tag == "object":   # one <object> per annotated item
                print(child.tag)
                for sub in child:
                    if sub.tag == "name":
                        print("tag:", sub.tag, "; text:", sub.text)
                        if sub.text not in list_xml:
                            list_xml.append(sub.text)
                        if sub.text in label_map:
                            sub.text = label_map[sub.text]
                            print(sub.text)
                            count = count + 1
        # Save the translated annotation under the same file name in savePath.
        # Comment this line out for a dry run that only prints the labels.
        tree.write(os.path.join(savePath, fileName), encoding="utf-8")
    print("=" * 20)

print(count)            # total number of labels replaced
for i in list_xml:      # every distinct label found in the source files
    print(i)
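
The same replacement step can also be written as a small function, which makes it easy to try on an in-memory document before touching any files. This is only a sketch: `translate_labels`, `demo_map`, and `demo_root` are hypothetical names that do not appear in the script above, and `findall("object/name")` is used in place of the nested loops because it selects the same elements.

```python
import xml.etree.ElementTree as ET


def translate_labels(root, mapping):
    """Rewrite <object>/<name> texts in place.

    Returns (replaced_count, unmapped_labels), where unmapped_labels lists
    each distinct label that has no entry in mapping.
    """
    replaced = 0
    unmapped = []
    for name in root.findall("object/name"):
        if name.text in mapping:
            name.text = mapping[name.text]
            replaced += 1
        elif name.text not in unmapped:
            unmapped.append(name.text)
    return replaced, unmapped


# Two entries copied from the mapping in the script above.
demo_map = {"23可回收物/易拉罐": "3 cans", "37有害垃圾/电池": "4 cell"}

# Hypothetical in-memory annotation with one known and one unknown label.
demo_root = ET.fromstring(
    "<annotation>"
    "<object><name>23可回收物/易拉罐</name></object>"
    "<object><name>unknown label</name></object>"
    "</annotation>"
)

replaced, unmapped = translate_labels(demo_root, demo_map)
print(replaced, unmapped)
print(ET.tostring(demo_root, encoding="unicode"))
```

Returning the unmapped labels helps catch typos or missing entries in the mapping, since those labels would otherwise stay in Chinese without any warning.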
