Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Why might Python break down halfway through a loop? TypeError: __getitem__

The Goal

I have a directory with 65 .txt files. I parse them one by one and save the output to 65 corresponding .txt files. I then plan to concatenate the outputs, though I'm not sure whether jumping straight to that step would help find a solution here.

The Problem

I am receiving:

TypeError: 'NoneType' object has no attribute '__getitem__'

and have seen two similar threads:

TypeError: 'NoneType' object has no attribute '__getitem__'

Python: TypeError: 'NoneType' object has no attribute '__getitem__'

My problem seems somewhat strange, however, because the script does get through about ten input files, parsing each one and writing its output file, before the error appears. The files are all similar: HTML source code from a website (different pages of the same site, so the same basic HTML structure).

Here is the function where the error occurs, on the last line of this snippet:

def parse(elTree):
    desired_value = elTree.xpath('my_very_long_xpath')
    desired_value = [x.get('title')[8:] for x in desired_value]

I do have a few more variants of this: I am actually parsing for about 5 to 6 different desired_values. All of this runs inside a larger loop in which each file is read and passed to the parse function, and the output is then written to a new file.

What I have tried

I removed the file where I initially got the error, but the same error occurred at the next file. I did the same again, removing two files, but still got the error.

I introduced a time.sleep(3) between files in case that would let things run more smoothly. My thinking was that some buffer used by the process might be wiped while it was still being read, leaving no file to parse; a similar thing can happen within a loop in C. Unfortunately, sleeping for 3 seconds (and then adding sleeps at various other points) didn't help: the code fails at exactly the same point.

According to the documentation, a TypeError arises when a function is applied to an object of inappropriate type, so how can it occur after the code has run correctly 10 or 11 times? Here is more official information regarding the __getitem__ method.

As the code does work well otherwise, I haven't included the rest, but if someone suspects it may originate from somewhere else, with good reason, then I will add more of the code.

I have inspected the contents of the .txt files for those that worked and those where it failed and the xpaths work in both, the contents are there to be found and parsed.

I used the code on 30 copies of the same file, which did execute successfully, so there must be subtle differences in the HTML code, which my parser is not recognizing.


1 Reply


TypeError: 'NoneType' object has no attribute '__getitem__' means that you attempted to index or slice None, as in mylist[2], instead of a list or similar object. The internal call to the object's __getitem__ method failed because None, whose type is NoneType, defines no such method.
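
A minimal illustration of that failure in plain Python, with no parsing involved. The helper name slice_title and the sample string are made up for this sketch. The wording of the message depends on the interpreter: Python 2 reports the missing __getitem__ attribute, while Python 3 says the object is not subscriptable. The exception type is TypeError in both.

```python
def slice_title(value):
    """Return value[8:], or the TypeError message if value cannot be sliced."""
    try:
        return value[8:]
    except TypeError as exc:
        return "TypeError: %s" % exc


ok = slice_title("prefix: Some page")  # a string supports slicing
failed = slice_title(None)             # None does not

print(ok)
print(failed)
```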

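The problem is in x.get('title')[8:]: for at least one element x, the get() method didn't find any key (attribute) called 'title', so it returned None. You then try to slice that result with [8:]. Had get() returned a string or similar object, the slice would work fine, but not so with None. This is also why the loop can succeed for ten files and then fail: the error depends on the data in each file, not on how many files have been processed.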
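
To see which elements trigger this, it may help to list the results of get('title') before slicing them. The sketch below is only an illustration under assumptions: it uses the standard library's xml.etree.ElementTree as a stand-in for the parser in the question (its elements also have a get() method that returns None for a missing attribute), and the markup in snippet is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the second <a> element has no title attribute.
snippet = '<div><a title="prefix: First">1</a><a>2</a></div>'
root = ET.fromstring(snippet)
links = root.findall('a')

# get() returns None when the attribute is absent instead of raising.
titles = [a.get('title') for a in links]

# Positions of the elements that would make [8:] fail.
missing = [i for i, t in enumerate(titles) if t is None]

print(titles)
print(missing)
```

Running the same check on the real xpath results for a failing file should show which matched elements lack a title.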

I recommend introducing some kind of error handling:

try:
    desired_value = [x.get('title')[8:] for x in desired_value]
except TypeError:
    # get('title') returned None for at least one element
    return

You will have to correct and expand this stub to make it behave in a way that's appropriate for your program. Maybe instead of a return statement you'll need to define some kind of default desired_value or something.
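
One possible way to expand the stub, sketched under assumptions: rather than catching TypeError around the whole list comprehension, which discards every value from a file as soon as one element lacks a title, check each element individually. The names extract_titles, prefix_len and default are hypothetical, and xml.etree.ElementTree again stands in for the parser used in the question.

```python
import xml.etree.ElementTree as ET


def extract_titles(elements, prefix_len=8, default=None):
    """Slice the 'title' attribute of each element, tolerating missing ones.

    Elements without a title are skipped, unless a default is given,
    in which case the default is stored in their place.
    """
    results = []
    for el in elements:
        title = el.get('title')
        if title is None:
            if default is not None:
                results.append(default)
            continue
        results.append(title[prefix_len:])
    return results


# Hypothetical markup: the middle <a> element has no title attribute.
root = ET.fromstring(
    '<div><a title="prefix: First">1</a><a>2</a>'
    '<a title="prefix: Third">3</a></div>'
)
elements = root.findall('a')

skipped = extract_titles(elements)             # drop elements without a title
filled = extract_titles(elements, default='')  # keep a placeholder instead

print(skipped)
print(filled)
```

Whether to skip or to fill with a placeholder depends on whether the 5 to 6 desired values per file need to stay aligned with each other.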

