Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
423 views
in Technique[技术] by (71.8m points)

c - method for expand a-z to abc...xyz form

Hi:) what i'm trying to do is write a simple program to expand from shortest entry

for example

a-z or 0-9 or a-b-c or a-z0-9

to longest write

for example

abc...xyz or 0123456789 or abc or abcdefghijklmnouprstwxyz0123456789

1-st examle shortest entry = 1-st example result which should give:)

so far i write something like this and it's work only for letters from a to z:

expand(char s[])
{
   int i,n,c;
   n=c=0;
   int len = strlen(s);
   for(i = 1;s[i] > '0' && s[i]<= '9' || s[i] >= 'a' && s[i] <= 'z' || s[i]=='-';i++)
   {
      /*c = s[i-1];
      g = s[i];
      n = s[i+1];*/
      if( s[0] == '-')
          printf("%c",s[0]);
      else if(s[i] == '-')
      {
         if(s[i-1]<s[i+1])
         {
             while(s[i-1] <= s[i+1])
             {
                 printf("%c", s[i-1]);
                 s[i-1]++;
             }
         }
         else if(s[i-1] == s[i+1])
             printf("%c",s[i]);
         else if(s[i+1] != '-')
             printf("%c",s[i]);
         else if(s[i-1] != '-')
             printf("%c",s[i]);
    }
    else if(s[i] == s[i+1])
    {

      while(s[i] == s[i+1])
      {
           printf("%c",s[i]);
           s[i]++;
      }
    }
    else if( s[len] == '-')
         printf("%c",s[len]);
 }

}

but now i'm stuck:(

any ideas what should i check to my program work correctly?

Edit1: @Andrew Kozak (1) abcd (2) 01234

Thanks for advance:)

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Here is a C version (in about 38 effective lines) that satisfies the same test as my earlier C++ version.

The full test program including your test cases, mine and some torture test can be seen live on http://ideone.com/sXM7b#info_3915048

Rationale

I'm pretty sure I'm overstating the requirements, but

  • this should be an excellent example of how to do parsing in a robust fashion
    • use states in an explicit fashion
    • validate input (!)
      • this version doesn't assume a-c-b can't happen
      • It also doesn't choke or even fail on simple input like 'Hello World' (or (char*) 0)
  • it shows how you can avoid printf("%c", c) each char without using extraneous functions.
  • I put in some comments as to explain what happens why, but overall you'll find that the code is much more legible anyways, by

    • staying away from too many short-named variables
    • avoiding complicated conditionals with un-transparent indexers
    • avoiding the whole string length business: We only need max lookahead of 2 characters, and *it=='-' or predicate(*it) will just return false if it is the null character. Shortcut evaluation prevents us from accessing past-the-end input characters
  • ONE caveat: I haven't implemented a proper check for output buffer overrun (the capacity is hardcoded at 2048 chars). I'll leave it as the proverbial exercise for the reader

Last but not least, the reason I did this:

  • It will allow me to compare raw performance of the C++ version and this C version, now that they perform equivalent functions. Right now, I fully expect the C version to outperform the C++ by some factor (let's guess: 4x?) but, again, let's just see what suprises the GNU compilers have in store for us. More later Update turns out I wasn't far off: github (code + results)

Pure C Implementation

Without further ado, the implementation, including the testcase:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int alpha_range(char c) { return (c>='a') && (c<='z'); }
int digit_range(char c) { return (c>='0') && (c<='9'); }

char* expand(const char* s)
{
    char buf[2048];

    const char* in  = s;
          char* out = buf;

    // parser state
    int (*predicate)(char) = 0; // either: NULL (free state), alpha_range (in alphabetic range), digit_range (in digit range)
    char lower=0,upper=0;       // tracks lower and upper bound of character ranges in the range parsing states

    // init
    *out = 0;

    while (*in)
    {
        if (!predicate)
        {
            // free parsing state
            if (alpha_range(*in) && (in[1] == '-') && alpha_range(in[2]))
            {
                lower = upper = *in++;
                predicate = &alpha_range;
            }
            else if (digit_range(*in) && (in[1] == '-') && digit_range(in[2]))
            {
                lower = upper = *in++;
                predicate = &digit_range;
            } 
            else *out++ = *in;
        } else
        { 
            // in a range
            if (*in < lower) lower = *in;
            if (*in > upper) upper = *in;

            if (in[1] == '-' && predicate(in[2])) 
                in++; // more coming
            else
            {
                // end of range mode, dump expansion
                char c;
                for (c=lower; c<=upper; *out++ = c++);
                predicate = 0;
            }
        }
        in++;
    }

    *out = 0; // null-terminate buf
    return strdup(buf);
}

void dotest(const char* const input)
{
    char* ex = expand(input);
    printf("input : '%s'
output: '%s'

", input, ex);

    if (ex)
        free(ex);
}

int main (int argc, char *argv[])
{
    dotest("a-z or 0-9 or a-b-c or a-z0-9"); // from the original post
    dotest("This is some e-z test in 5-7 steps; this works: a-b-c. This works too: b-k-c-e. Likewise 8-4-6"); // from my C++ answer
    dotest("-x-s a-9 9- a-k-9 9-a-c-7-3"); // assorted torture tests

    return 0;
}

Test output:

input : 'a-z or 0-9 or a-b-c or a-z0-9'
output: 'abcdefghijklmnopqrstuvwxyz or 0123456789 or abc or abcdefghijklmnopqrstuvwxyz0123456789'

input : 'This is some e-z test in 5-7 steps; this works: a-b-c. This works too: b-k-c-e. Likewise 8-4-6'
output: 'This is some efghijklmnopqrstuvwxyz test in 567 steps; this works: abc. This works too: bcdefghijk. Likewise 45678'

input : '-x-s a-9 9- a-k-9 9-a-c-7-3'
output: '-stuvwx a-9 9- abcdefghijk-9 9-abc-34567'


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...