Saving a file with BeautifulSoup

I recently used bs4 to process XML files and ran into a problem I had never had to think about when writing crawlers:

Modify the tree parsed from an XML file, then save the changes back to the original XML file.

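I kept searching the BeautifulSoup documentation for a library function to do this, but all it actually takes is ordinary file reading and writing: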

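from bs4 import BeautifulSoup

# Parse the existing XML file (the 'xml' parser requires lxml)
with open('test.xml', encoding='utf-8') as f:
    soup = BeautifulSoup(f, 'xml')

# Build the new element and append it to the root <orderlist>
add = BeautifulSoup("<a>Foo</a>", 'xml')
soup.orderlist.append(add)
print(soup.prettify())

# Overwrite the original file with the modified tree
with open('test.xml', 'w', encoding='utf-8') as f:
    f.write(str(soup))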
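
The same read, modify, write round trip can also be wrapped in a small helper. This is only a sketch, not something from the BeautifulSoup manual: the function name `append_element` and the temporary file are made up for illustration, and it builds the new element with `soup.new_tag` instead of parsing a second fragment. It assumes bs4 and lxml are installed.

```python
import os
import tempfile

from bs4 import BeautifulSoup  # the 'xml' parser also needs lxml installed


def append_element(path, tag_name, text):
    """Hypothetical helper: append <tag_name>text</tag_name> to the root and save in place."""
    with open(path, encoding="utf-8") as f:
        soup = BeautifulSoup(f, "xml")

    root = soup.find(True)  # first tag in the document, i.e. the root element
    new_tag = soup.new_tag(tag_name)
    new_tag.string = text
    root.append(new_tag)

    # Saving is nothing more than writing the serialized tree back out.
    with open(path, "w", encoding="utf-8") as f:
        f.write(str(soup))


# Try it on a throwaway file so the real test.xml is left alone.
sample = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    "<orderlist><order><customer>姓名1</customer></order></orderlist>"
)
fd, tmp_path = tempfile.mkstemp(suffix=".xml")
os.close(fd)
with open(tmp_path, "w", encoding="utf-8") as f:
    f.write(sample)

append_element(tmp_path, "a", "Foo")

# Read the file back to confirm the change was actually saved.
with open(tmp_path, encoding="utf-8") as f:
    saved = f.read()
os.remove(tmp_path)
print(saved)
```

Passing `encoding="utf-8"` on both the read and the write keeps the Chinese text intact on systems whose default encoding is not UTF-8.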

Here is a simple XML file to experiment with:

<?xml version="1.0" encoding="utf-8"?>
<orderlist>
<order>
<customer>姓名1</customer>
<phone>电话1</phone>
<address>地址1</address>
<count>点餐次数1</count>
</order>
<order>
<customer>姓名2</customer>
<phone>电话2</phone>
<address>地址2</address>
<count>点餐次数2</count>
</order>
</orderlist>