Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


comparison - What happens when you compare two strings in Python

When comparing strings in Python, e.g.

if "Hello" == "Hello":
    pass  # execute certain code

I am curious about the code that actually compares the strings. If I were to compare these in C, I would just compare each character and break as soon as one character doesn't match. I'm wondering exactly how Python compares two strings like this, i.e. at what point it stops, and whether this comparison differs from the approach described above in any way other than taking fewer lines of code.


1 Reply


I'm going to assume you are using CPython here, the standard Python.org implementation. Under the hood, the Python string type is implemented in C, so yes, testing if two strings are equal is done exactly like you'd do it in C.
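For reference, the character-by-character approach described in the question could be sketched in Python roughly as follows. This is only an illustration of the idea; naive_equal is a hypothetical helper name, not something CPython defines or calls.

```python
def naive_equal(a: str, b: str) -> bool:
    """Compare two strings one character at a time (illustrative only)."""
    # Strings of different lengths can never be equal.
    if len(a) != len(b):
        return False
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            return False  # stop at the first mismatching character
    return True


print(naive_equal("Hello", "Hello"))  # True
print(naive_equal("Hello", "Help!"))  # False
```

The result always agrees with the built-in == operator; the difference is that == does the same work in C rather than in a Python-level loop.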

It uses the memcmp() function to test whether the two str objects contain the same data; see the unicode_compare_eq function defined in unicodeobject.c:

static int
unicode_compare_eq(PyObject *str1, PyObject *str2)
{
    int kind;
    void *data1, *data2;
    Py_ssize_t len;
    int cmp;

    len = PyUnicode_GET_LENGTH(str1);
    if (PyUnicode_GET_LENGTH(str2) != len)
        return 0;
    kind = PyUnicode_KIND(str1);
    if (PyUnicode_KIND(str2) != kind)
        return 0;
    data1 = PyUnicode_DATA(str1);
    data2 = PyUnicode_DATA(str2);

    cmp = memcmp(data1, data2, len * kind);
    return (cmp == 0);
}

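This function is only called if str1 and str2 are not the same object (that's an easy and cheap thing to test, and an object is always equal to itself). It first checks that the two objects have the same length and store the same kind of data. String objects use a flexible storage implementation to save memory: each string is stored with 1, 2, or 4 bytes per character, using the smallest width that fits its highest code point, so two strings with different storage kinds can't be equal. Only when both checks pass does memcmp() compare the raw buffers, len * kind bytes in total.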
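The checks described above can be mimicked in pure Python. This is a rough sketch under stated assumptions, not CPython's actual code: storage_kind, raw_bytes, and sketch_compare_eq are hypothetical names, the fixed-width encodings only stand in for the internal buffers, and little-endian byte order is picked arbitrarily (CPython uses the machine's native order).

```python
def storage_kind(s: str) -> int:
    """Bytes per character a flexible string representation would need: 1, 2 or 4."""
    highest = max(map(ord, s), default=0)
    if highest < 0x100:
        return 1
    if highest < 0x10000:
        return 2
    return 4


def raw_bytes(s: str) -> bytes:
    """Fixed-width encoding standing in for the internal buffer (an assumption)."""
    kind = storage_kind(s)
    if kind == 1:
        return s.encode("latin-1")
    codec = "utf-16-le" if kind == 2 else "utf-32-le"
    # surrogatepass lets lone surrogates through, which str objects may contain
    return s.encode(codec, errors="surrogatepass")


def sketch_compare_eq(a: str, b: str) -> bool:
    if a is b:                                # identity shortcut (done by the caller in CPython)
        return True
    if len(a) != len(b):                      # different lengths: cannot be equal
        return False
    if storage_kind(a) != storage_kind(b):    # different storage width: cannot be equal
        return False
    return raw_bytes(a) == raw_bytes(b)       # stands in for memcmp()


print(storage_kind("Hello"), storage_kind("€"), storage_kind("😀"))  # 1 2 4
print(sketch_compare_eq("Hello", "Hello"))  # True
print(sketch_compare_eq("Hello", "Hellp"))  # False
```

The length and kind checks cost almost nothing, so most unequal strings are rejected before any character data is read.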

Other Python implementations, such as Jython or IronPython, may use different techniques, but they will come down to much the same thing.

