It's not about the compiler doing a static analysis based on unrelated branches when compiling to bytecode; it's much simpler.
Python has a rule for distinguishing global, closure, and local variables: every name that is assigned to in a function (including parameters, which are assigned to implicitly) is a local variable, unless it is declared in a `global` or `nonlocal` statement. This is explained in Naming and binding and the subsequent sections of the reference documentation.
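A minimal sketch of that rule in action. The names here (`counter` and the four functions) are hypothetical and not taken from the question:

```python
counter = 10  # module-level (global) name, made up for this sketch

def reads_global():
    # No assignment to counter in this function, so the name is global.
    return counter + 1

def assigns_local():
    # The assignment makes counter local for the whole function body,
    # so reading it on the right-hand side fails with UnboundLocalError.
    counter = counter + 1
    return counter

def declares_global():
    global counter  # opts out of the "assigned to means local" rule
    counter = counter + 1
    return counter

def parameter_is_local(counter):
    # Parameters are assigned to implicitly, so they are local as well.
    counter = counter + 1
    return counter
```

Calling `assigns_local()` should raise `UnboundLocalError`, while the other three run normally; only `declares_global()` changes the module-level `counter`.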
This isn't about keeping the interpreter simple; it's about keeping the rule simple enough that it's usually intuitive to human readers, and can easily be worked out by humans when it isn't. (That's especially important for cases like this one: the behavior can't be intuitive everywhere, so Python keeps the rule simple enough that, once you learn it, cases like this are still obvious. But you do have to learn the rule before that's true. And, of course, most people learn the rule by being surprised by it the first time…)
Even with an optimizer smart enough to completely remove any bytecode related to `if False: ord=None`, `ord` must still be a local variable by the rules of the language semantics.
So: there's an `ord =` in your function, therefore all references to `ord` are references to a local variable, not to any global, nonlocal, or builtin that happens to have the same name, and therefore your code raises an `UnboundLocalError`.
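As a rough reconstruction of the situation (the function names `shadowed` and `untouched` are assumptions for illustration, not the question's actual code), the following sketch shows the dead assignment changing how `ord` is resolved:

```python
def shadowed():
    if False:
        ord = None  # never executed, but it still makes ord a local name
    return ord("a")  # the local ord has no value yet at this point

def untouched():
    return ord("a")  # no assignment anywhere, so the builtin ord is used

error = None
try:
    shadowed()
except UnboundLocalError as exc:
    error = exc  # the exact message text varies between Python versions

print(type(error).__name__)
print(untouched())
```

The two functions differ only in the unreachable assignment, which is enough to turn a working builtin lookup into an error.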
Many people get by without knowing the actual rule, and instead use an even simpler rule: a variable is
- Local if it possibly can be, otherwise
- Enclosing if it possibly can be, otherwise
- Global if it's in globals, otherwise
- Builtin if it's in builtins, otherwise
- an error
While this works for most cases, it can be misleading in some, like this one. A language with LEGB scoping done Lisp-style would see that `ord` isn't in the local namespace, and therefore fall back to the global or builtin, but Python doesn't do that. You could say that `ord` is in the local namespace but bound to a special "undefined" value, and that's actually close to what happens under the covers; but that's not what the rules of Python say, and, while it may be more intuitive for simple cases, it's harder to reason through.
If you're curious how this works under the covers:
In CPython, the compiler scans your function to find all assignments with an identifier as a target, and stores those names in an array, leaving out any that are declared global or nonlocal. This array ends up as your code object's `co_varnames` (parameters come first), so let's say your `ord` is `co_varnames[1]`. Every use of that variable then gets compiled to a `LOAD_FAST 1` or `STORE_FAST 1`, instead of a `LOAD_NAME` or `STORE_GLOBAL` or other operation. When interpreted, that `LOAD_FAST 1` just pushes slot 1 of the frame's fast-locals array onto the stack. (In the C source of older CPython versions that array is `f_localsplus`; it is not the `f_locals` mapping you see from Python.) The array starts off full of NULL pointers instead of pointers to Python objects, and if a `LOAD_FAST` loads a NULL pointer, it raises `UnboundLocalError`. Recent CPython versions spell the checking variant `LOAD_FAST_CHECK`, but the idea is the same.
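If you want to look at this yourself, the sketch below uses the standard `dis` module on two hypothetical functions (`shadowed` and `untouched` are made-up names). Exact opcode names differ between CPython versions, so treat the printed opcodes as illustrative rather than definitive:

```python
import dis

def shadowed(text):
    if False:
        ord = None  # dead code, but ord still becomes a local slot
    return ord(text)

def untouched(text):
    return ord(text)

# Parameters come first in co_varnames, followed by the other locals.
print(shadowed.__code__.co_varnames)
print(untouched.__code__.co_varnames)

def ops_touching_ord(func):
    # Collect the names of all instructions whose argument is the name ord.
    return {ins.opname for ins in dis.get_instructions(func)
            if ins.argval == "ord"}

shadowed_ops = ops_touching_ord(shadowed)
untouched_ops = ops_touching_ord(untouched)
print(shadowed_ops)   # some LOAD_FAST variant, depending on the version
print(untouched_ops)  # a global/builtin lookup instead
```

Running `dis.dis(shadowed)` prints the full listing if you want to see the slot index next to each instruction.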