I have the following code to clean up a log file and extract the XML from it (the log file is not well formatted and has no root element), and then parse it and perform other functions. The cleanup works, but the XML parser throws an error for some XML data that contains special characters. My code is below:
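import xml.etree.ElementTree as ET

with open(log_file, 'r') as fr, open('XMLinLog2.xml', 'w') as fw:
    # Wrap the extracted XML fragments in a single root element
    fw.write("<document>\n")
    for line in fr:
        # Keep only the lines that look like XML
        if line.strip().startswith('<'):
            fw.write(' ' + line)
    fw.write("\n</document>")

# --- Parsing Log files after cleanup ---
doc = ET.parse('XMLinLog2.xml')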
The XML data in the log file that triggers the error is:
(1) Ops Désactivée 23:59 and (2) [ mono @ 90° >> +1
After cleanup, these appear in the output file as Ops D?sactiv?e 23:59 and [ mono @ 90? >> +1 respectively.
So I figured out that the ? character is causing the issue.
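A minimal sketch of what may be going on, assuming the cleaned file is written in a legacy one-byte encoding such as cp1252 (the usual default for open() on Western European Windows setups) while the parser, seeing no XML declaration, expects UTF-8. The helper name try_parse is hypothetical and only serves the illustration:

```python
import xml.etree.ElementTree as ET

# The same element that appears in the log, as a Python string.
line = "<Line>Ops Désactivée 23:59</Line>"

def try_parse(data: bytes) -> str:
    """Return the element text, or the parser's error message."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError as exc:
        return f"ParseError: {exc}"

# Assumption: the cleaned file ended up in a one-byte encoding (cp1252 here).
# The lone byte 0xE9 for "é" is not a valid UTF-8 sequence.
legacy_result = try_parse(line.encode("cp1252"))

# The same text encoded as UTF-8 is what the parser expects by default.
utf8_result = try_parse(line.encode("utf-8"))

print(legacy_result)
print(utf8_result)
```

If this assumption holds, the ? shown in the cleaned file would be the viewer's placeholder for a byte it cannot decode, rather than a literal question mark in the data.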
Questions:
- How do I deal with this error?
- If I need to print that data, how can I print it correctly? I don't want to print ?, because I assume it will throw an error whenever French text containing é comes in.
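One possible way to approach both points is to state the encodings explicitly instead of relying on the platform default. The sketch below is not a definitive fix: the function name extract_xml, the source_encoding parameter, and the guess that the log is cp1252 are all assumptions, and the sample text is a shortened copy of the log excerpt in this post.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def extract_xml(log_path, xml_path, source_encoding="cp1252"):
    """Copy XML-looking lines into a UTF-8 file wrapped in one root element.

    source_encoding is an assumption; use whatever the log really is.
    """
    with open(log_path, "r", encoding=source_encoding) as fr, \
         open(xml_path, "w", encoding="utf-8") as fw:
        # Declare the encoding so the parser does not have to guess.
        fw.write('<?xml version="1.0" encoding="UTF-8"?>\n<document>\n')
        for line in fr:
            if line.strip().startswith("<"):
                fw.write(" " + line)
        fw.write("\n</document>")

# Shortened stand-in for the real log file.
sample = (
    "2020-08-03 15:59:54.635 (72 ,Effective Commit) Info Sending:\n"
    "<U_DisplayCommand>\n"
    "<LineTextFrench>\n"
    "<Line>Ops Désactivée 23:59</Line>\n"
    "<Line>[ mono @ 90° >> +1</Line>\n"
    "</LineTextFrench>\n"
    "</U_DisplayCommand>\n"
)

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "sample.log")
    xml_path = os.path.join(tmp, "XMLinLog2.xml")
    # Write the sample in the assumed legacy encoding of the log.
    with open(log_path, "w", encoding="cp1252") as f:
        f.write(sample)
    extract_xml(log_path, xml_path)
    doc = ET.parse(xml_path)
    texts = [el.text for el in doc.iter("Line")]

print(texts)
```

Once parsed, the element text is an ordinary Python str, so é and ° print as themselves when the console encoding supports them. If printing raises UnicodeEncodeError, calling sys.stdout.reconfigure(encoding="utf-8") (Python 3.7+) or setting the PYTHONIOENCODING environment variable may help.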
Full error here:
Traceback (most recent call last):
  File "C:/Users/PycharmProjects/IMSS_TestHarness/Libraries/try.py", line 23, in <module>
    doc = ET.parse('XMLinLog2.xml')
  File "C:\Users\AppData\Local\Programs\Python\Python38-32\lib\xml\etree\ElementTree.py", line 1202, in parse
    tree.parse(source, parser)
  File "C:\Users\AppData\Local\Programs\Python\Python38-32\lib\xml\etree\ElementTree.py", line 595, in parse
    self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 3299, column 22
Process finished with exit code 1
Log file:
1. 2020-08-03 15:59:54.635 (72 ,Effective Commit) Info Sending:
<U_DisplayCommand>
<DestinationId>5035</DestinationId>
<DisplayId>1</DisplayId>
<LineTextEnglish>
<Line>Ops Disabled 23:59 N</Line>
</LineTextEnglish>
<LineTextFrench>
**<Line>Ops Désactivée 23:59</Line>**
</LineTextFrench>
</U_DisplayCommand>
<U_DisplayCommand>
<DestinationId>5085</DestinationId>
<DisplayId>1</DisplayId>
<LineTextEnglish>
<Line>Vaudreuil-Dori P123A</Line>
<Line>[ mono @ 90° >> +1</Line>
</LineTextEnglish>
<LineTextFrench>
<Line>Vaudreuil-Dori P123A</Line>
<Line>[ mono @ 90° >> +1</Line>
</LineTextFrench>
</U_DisplayCommand>
Thanks in advance.