Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


ruby - Case-insensitive Array#include?

I want to know the best way to make the String#include? method ignore case. Currently I'm doing the following. Any suggestions? Thanks!

a = "abcDE"
b = "CD"
result = a.downcase.include? b.downcase

Edit: What about Array#include?, where all elements of the array are strings?


1 Reply


Summary

If you are only going to test a single word against an array, or if the contents of your array change frequently, the fastest answer is Aaron's:

array.any?{ |s| s.casecmp(mystr)==0 }
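
As a minimal sketch of the casecmp approach on made-up data (the names words and query are illustrative, not from the original answers): casecmp returns 0 when two strings are equal ignoring case, but it only folds ASCII letters. Assuming Ruby 2.4 or later, casecmp? returns true or false directly and applies Unicode case folding.

```ruby
# Hypothetical sample data, for illustration only.
words = ["Apple", "BANANA", "cherry"]
query = "banana"

# casecmp returns 0 when the strings are equal ignoring ASCII case.
found_ascii = words.any? { |s| s.casecmp(query) == 0 }

# casecmp? (Ruby 2.4+) returns true/false and uses Unicode case folding.
found_unicode = words.any? { |s| s.casecmp?(query) }

# A word that is not in the array.
missing = words.any? { |s| s.casecmp?("durian") }

# Non-ASCII letters: casecmp does not fold them, casecmp? does.
umlaut_ascii = "\u00C4PFEL".casecmp("\u00E4pfel") == 0
umlaut_fold  = "\u00C4PFEL".casecmp?("\u00E4pfel")

puts found_ascii    # true
puts found_unicode  # true
puts missing        # false
puts umlaut_ascii   # false
puts umlaut_fold    # true
```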

If you are going to test many words against a static array, it's far better to use a variation of farnoy's answer: create a copy of your array that has all-lowercase versions of your words, and use include?. (This assumes that you can spare the memory to create a mutated copy of your array.)

# Do this once, or each time the array changes
downcased = array.map(&:downcase)

# Test lowercase words against that array
downcased.include?( mystr.downcase )

Even better, create a Set from your array.

require 'set'

# Do this once, or each time the array changes
downcased = Set.new(array.map(&:downcase))

# Test lowercase words against that array
downcased.include?( mystr.downcase )
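
If the array does change occasionally, one way to keep the downcased Set in sync is to hide it behind a small wrapper. The sketch below is only an illustration: the class name CaseInsensitiveLookup and its methods are hypothetical and do not come from Ruby or from the answers discussed here.

```ruby
require 'set'

# Hypothetical helper; not part of Ruby or of the original answers.
class CaseInsensitiveLookup
  def initialize(strings)
    # Build the downcased Set once, up front.
    @downcased = Set.new(strings.map(&:downcase))
  end

  # Keep the Set in sync when a new string is added.
  def add(string)
    @downcased << string.downcase
    self
  end

  # Case-insensitive membership test against the precomputed Set.
  def include?(string)
    @downcased.include?(string.downcase)
  end
end

lookup = CaseInsensitiveLookup.new(%w[Apple BANANA cherry])
first  = lookup.include?("banana")
second = lookup.include?("Durian")
lookup.add("Durian")
third  = lookup.include?("DURIAN")

puts first   # true
puts second  # false
puts third   # true
```

The design trade-off is the one stated above: extra memory for the downcased copy in exchange for cheap lookups.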

My original answer below is a very poor performer and generally not appropriate.

Benchmarks

Following are benchmarks for looking for 1,000 words with random casing in an array of slightly over 100,000 words, where 500 of the words will be found and 500 will not.

  • The 'regex' test is my answer here, using any?.
  • The 'casecmp' test is Aaron's answer, using any? from my comment.
  • The 'downarray' test is farnoy's answer, re-creating a new downcased array for each of the 1,000 tests.
  • The 'downonce' test is farnoy's answer, but pre-creating the lookup array once only.
  • The 'set_once' test is creating a Set from the array of downcased strings, once before testing.

                user     system      total        real
regex      18.710000   0.020000  18.730000 ( 18.725266)
casecmp     5.160000   0.000000   5.160000 (  5.155496)
downarray  16.760000   0.030000  16.790000 ( 16.809063)
downonce    0.650000   0.000000   0.650000 (  0.643165)
set_once    0.040000   0.000000   0.040000 (  0.038955)

If you can create a single downcased copy of your array once to perform many lookups against, farnoy's answer is the best (assuming you must use an array). If you can create a Set, though, do that.

If you like, examine the benchmarking code.
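
The original benchmarking code is not reproduced on this page. The following is a rough sketch of how such a comparison could be set up with Ruby's standard Benchmark module. It is not the author's script: the data is randomly generated and far smaller than the 100,000-word list described above, and all variable names are assumptions, so the timings will not match the table.

```ruby
require 'benchmark'
require 'set'

# Hypothetical data: 5,000 random lowercase 8-letter "words".
letters = ('a'..'z').to_a
array   = Array.new(5_000) { Array.new(8) { letters.sample }.join }

# 50 queries taken from the array with the casing flipped (always found),
# and 50 nine-letter queries that cannot match any 8-letter word.
hits    = array.sample(50).map(&:swapcase)
misses  = Array.new(50) { Array.new(9) { letters.sample }.join.upcase }
queries = hits + misses

# Precomputed lookup structures for the 'downonce' and 'set_once' tests.
downonce = array.map(&:downcase)
set_once = Set.new(downonce)

Benchmark.bm(10) do |x|
  x.report('regex') do
    queries.each do |q|
      re = /\A#{Regexp.escape(q)}\z/i
      array.any? { |s| s =~ re }
    end
  end
  x.report('casecmp')   { queries.each { |q| array.any? { |s| s.casecmp(q) == 0 } } }
  x.report('downarray') { queries.each { |q| array.map(&:downcase).include?(q.downcase) } }
  x.report('downonce')  { queries.each { |q| downonce.include?(q.downcase) } }
  x.report('set_once')  { queries.each { |q| set_once.include?(q.downcase) } }
end
```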


Original Answer

I (originally said that I) would personally create a case-insensitive regex (for a string literal) and use that:

re = /\A#{Regexp.escape(str)}\z/i # Match exactly this string, no substrings
all = array.grep(re)              # Find all matching strings…
any = array.any?{ |s| s =~ re }   #  …or see if any matching string is present

Using any? can be slightly faster than grep as it can exit the loop as soon as it finds a single match.
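
To illustrate why the pattern is anchored, here is a small sketch on made-up data (the sample array and the variable names are assumptions). With \A and \z the regex must match the whole string; without them it also matches substrings.

```ruby
# Hypothetical sample data, for illustration only.
array = ["Apple", "pineapple", "APPLE", "cherry"]
str   = "apple"

# Anchored: \A and \z pin the match to the start and end of the string.
anchored   = /\A#{Regexp.escape(str)}\z/i
# Unanchored: matches anywhere inside the string.
unanchored = /#{Regexp.escape(str)}/i

exact   = array.grep(anchored)
partial = array.grep(unanchored)
present = array.any? { |s| s =~ anchored }

p exact    # ["Apple", "APPLE"]
p partial  # ["Apple", "pineapple", "APPLE"]
p present  # true
```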

