Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c - I need a mix of strtok and strtok_single

I have the following string that I am trying to parse for variables.

char data[]="to=myself@gmail.com&cc=youself@gmail.com&title=&content=how are you?&signature=best regards.";

I started with strtok and the following code

char *to=parsePostData("to",data);

char* parsePostData(char s[],char t[])
{
  char *postVal;
  char *pch;
  char tCpy[512];//Make a copy. Otherwise, strtok works on the char pointer, and original char array gets modified/ corrupted.
  strcpy(tCpy,t);
  pch = strtok (tCpy,"=&");
  while (pch != NULL)
  {
      if(strcmp(pch,s)==0) {
            pch= strtok (NULL, "&");
                return pch;          
      }else{
        pch = strtok (NULL, "=&");  
      }
  }      
}

This works fine, except when it comes to consecutive delimiters such as the one after "title". So I found this custom strtok_single implementation, from the question "Need to know when no data appears between two token separators using strtok()":

char * strtok_single (char * str, char const * delims)
{
  static char  * src = NULL;
  char  *  p,  * ret = 0;

  if (str != NULL)
    src = str;

  if (src == NULL)
    return NULL;

  if ((p = strpbrk (src, delims)) != NULL) {
    *p  = 0;
    ret = src;
    src = ++p;
  }

  return ret;
}

But with this, the problem is I cannot get "signature", as there is no & delimiter after that.

How can I get a mix of these two, so that I don't miss the last variable and can still handle consecutive delimiters?


1 Reply


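There are two bugs lurking here. The first is in strtok_single(): called repeatedly, it never returns the last segment (the value after the = that follows signature), unlike strtok(). When strpbrk() finds no further delimiter, the function returns NULL instead of the remaining text.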

When that's fixed, there is still a problem with the code in parsePostData(); it returns a pointer to an automatic variable. The copy of the string must be handled differently; the simplest way (which is consistent with using strtok() rather than strtok_r() or strtok_s()) is to make the tCpy variable static.
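One consequence of the static buffer is that every call overwrites it, so a returned pointer is only valid until the next call. As a rough sketch of an alternative (not part of the original answer), the caller can supply the output buffer instead. The name get_post_value() and its parameters are made up for illustration, and the function scans with strcspn() rather than any strtok() variant, so data is left unmodified.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original answer): copies the value for
 * `key` out of `data` into the caller-supplied buffer `out` of size `outsz`.
 * Returns `out` on success, or NULL if the key is absent or the value does
 * not fit. `data` is never modified. */
char *get_post_value(const char *key, const char *data,
                     char *out, size_t outsz)
{
    size_t keylen = strlen(key);
    const char *p = data;

    while (*p != '\0')
    {
        size_t fieldlen = strcspn(p, "&");   /* length of one "name=value" field */

        if (fieldlen > keylen && strncmp(p, key, keylen) == 0 && p[keylen] == '=')
        {
            size_t vallen = fieldlen - keylen - 1;
            if (vallen >= outsz)
                return NULL;                 /* no room for value plus terminator */
            memcpy(out, p + keylen + 1, vallen);
            out[vallen] = '\0';
            return out;
        }

        p += fieldlen;
        if (*p == '&')
            p++;                             /* step over the field separator */
    }
    return NULL;
}
```

With the question's data, a call such as get_post_value("title", data, buf, sizeof buf) yields an empty string in buf, and the results for different keys can be kept side by side because each one lives in its own buffer.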

Test program emt.c

This is a composite program that shows the problems and also a set of fixes. It applies different 'splitter' functions — functions with the same signature as strtok() — to the data. It demonstrates the bug in strtok_single() and that strtok_fixed() fixes that bug. It demonstrates that the code in parsePostData() works correctly when it is fixed and strtok_fixed() is used.

#include <stdio.h>
#include <string.h>

/* Function pointer for strtok, strtok_single, strtok_fixed */
typedef char *(*Splitter)(char *str, const char *delims);

/* strtok_single - as quoted in SO 30294129 (from SO 8705844) */
static char *strtok_single(char *str, char const *delims)
{
    static char  *src = NULL;
    char  *p,  *ret = 0;

    if (str != NULL)
        src = str;

    if (src == NULL)
        return NULL;

    if ((p = strpbrk(src, delims)) != NULL)
    {
        *p  = 0;
        ret = src;
        src = ++p;
    }

    return ret;
}

/* strtok_fixed - fixed variation of strtok_single */
static char *strtok_fixed(char *str, char const *delims)
{
    static char  *src = NULL;
    char  *p,  *ret = 0;

    if (str != NULL)
        src = str;

    if (src == NULL || *src == '\0')    // Fix 1
        return NULL;

    ret = src;                          // Fix 2
    if ((p = strpbrk(src, delims)) != NULL)
    {
        *p  = 0;
        //ret = src;                    // Unnecessary
        src = ++p;
    }
    else
        src += strlen(src);

    return ret;
}

/* Raw test of splitter functions */
static void parsePostData1(const char *s, const char *t, Splitter splitter)
{
    static char tCpy[512];
    strcpy(tCpy, t);
    char *pch = splitter(tCpy, "=&");
    while (pch != NULL)
    {
        printf("  [%s]\n", pch);
        if (strcmp(pch, s) == 0)
            printf("matches %s\n", s);
        pch = splitter(NULL, "=&");
    }
}

/* Fixed version of parsePostData() from SO 30294129 */
static char *parsePostData2(const char *s, const char *t, Splitter splitter)
{
    static char tCpy[512];
    strcpy(tCpy, t);
    char *pch = splitter(tCpy, "=&");
    while (pch != NULL)
    {
        if (strcmp(pch, s) == 0)
        {
            pch = splitter(NULL, "&");
            return pch;
        }
        else
        {
            pch = splitter(NULL, "=&");
        }
    }
    return NULL;
}

/* Composite test program */
int main(void)
{
    char data[] = "to=myself@gmail.com&cc=youself@gmail.com&title=&content=how are you?&signature=best regards.";
    char *tags[] = { "to", "cc", "title", "content", "signature" };
    enum { NUM_TAGS = sizeof(tags) / sizeof(tags[0]) };

    printf("\nCompare variants on strtok()\n");
    {
        int i = NUM_TAGS - 1;
        printf("strtok():\n");
        parsePostData1(tags[i], data, strtok);
        printf("strtok_single():\n");
        parsePostData1(tags[i], data, strtok_single);
        printf("strtok_fixed():\n");
        parsePostData1(tags[i], data, strtok_fixed);
    }

    printf("\nCompare variants on strtok()\n");
    for (int i = 0; i < NUM_TAGS; i++)
    {
        char *value1 = parsePostData2(tags[i], data, strtok);
        /* Passing NULL to %s is undefined, so print "(null)" explicitly */
        printf("strtok: [%s] = [%s]\n", tags[i], value1 ? value1 : "(null)");
        char *value2 = parsePostData2(tags[i], data, strtok_single);
        printf("single: [%s] = [%s]\n", tags[i], value2 ? value2 : "(null)");
        char *value3 = parsePostData2(tags[i], data, strtok_fixed);
        printf("fixed:  [%s] = [%s]\n", tags[i], value3 ? value3 : "(null)");
    }

    return 0;
}

Example output from emt

Compare variants on strtok()
strtok():
  [to]
  [myself@gmail.com]
  [cc]
  [youself@gmail.com]
  [title]
  [content]
  [how are you?]
  [signature]
matches signature
  [best regards.]
strtok_single():
  [to]
  [myself@gmail.com]
  [cc]
  [youself@gmail.com]
  [title]
  []
  [content]
  [how are you?]
  [signature]
matches signature
strtok_fixed():
  [to]
  [myself@gmail.com]
  [cc]
  [youself@gmail.com]
  [title]
  []
  [content]
  [how are you?]
  [signature]
matches signature
  [best regards.]

And:

Compare variants on strtok()
✓ strtok: [to] = [myself@gmail.com]
✓ single: [to] = [myself@gmail.com]
✓ fixed:  [to] = [myself@gmail.com]
✓ strtok: [cc] = [youself@gmail.com]
✓ single: [cc] = [youself@gmail.com]
✓ fixed:  [cc] = [youself@gmail.com]
✕ strtok: [title] = [content=how are you?]
✓ single: [title] = []
✓ fixed:  [title] = []
✓ strtok: [content] = [how are you?]
✓ single: [content] = [how are you?]
✓ fixed:  [content] = [how are you?]
✓ strtok: [signature] = [best regards.]
✕ single: [signature] = [(null)]
✓ fixed:  [signature] = [best regards.]

The correct (✓ = U+2713) and incorrect (✕ = U+2715) marks were added manually when posting the answer; the program does not print them.

Observe how only the lines tagged 'fixed' contain exactly what is wanted each time around.
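If the hidden static state is a concern, the same logic can be written in the style of strtok_r(), with the scan position held by the caller. This is only a sketch: strtok_fixed_r() is a hypothetical name, not a standard or POSIX function, and it keeps the behaviour of strtok_fixed(), including the fact that an empty field at the very end of the string is not reported.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reentrant variant of strtok_fixed(): identical splitting
 * logic, but the position is stored in *saveptr instead of a static. */
char *strtok_fixed_r(char *str, const char *delims, char **saveptr)
{
    char *src = (str != NULL) ? str : *saveptr;

    *saveptr = src;
    if (src == NULL || *src == '\0')
        return NULL;                      /* nothing left to split */

    char *ret = src;                      /* the token always starts here */
    char *p = strpbrk(src, delims);
    if (p != NULL)
    {
        *p = '\0';                        /* terminate the token at the delimiter */
        *saveptr = p + 1;                 /* resume just past it next time */
    }
    else
        *saveptr = src + strlen(src);     /* last token: park on the terminator */

    return ret;
}
```

The first call passes the string to split and the address of a char * variable; later calls pass NULL as the first argument and the same address, exactly as with strtok_r().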

