Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


algorithm - Detect repeated tuples [f(i,j-1), f(i,j), f(i,j+1)] in a text file using Java

I'm looking for a small code snippet that finds the line (or lines) in a file that contain unacceptable entries and alerts the user, but I could not find one.

For example, the file contains the following:

myFile.txt:

Field1,Field2,Field3,Field4,Field5,Field6,Field7
a,b,a,d,e,f,g
h,i,h,i,h,ff,f27
f31,f32,f33,f34,f35,f36,f37
f41,f42,f43,f44,f45,f46,f47
f51,f52,f53,f54,f55,f56,f57
f61,f62,a,b,a,f66,f67
f71,f72,f73,f74,f75,f76,f77
f81,f82,f83,f84,f85,f86,f87
f91,f92,f93,f94,f95,f96,f97
f101,f102,f103,f104,f105,f106,f107
f111,f112,f113,f114,f115,f116,f117
f121,f122,f123,f124,f125,f126,f127
f131,f132,f133,f134,f135,f136,f137
f141,f142,f143,f144,f145,f146,f147
f151,f152,f153,f154,f155,f156,f157
f161,a,b,a,f165,f166,f167
i,h,ff,f174,f175,f176,f177
f181,f182,f183,f184,f185,f186,f187
f191,f192,f193,f194,f195,f196,f197
f201,f202,f203,f204,f205,f206,f207
f211,f212,f213,f214,f215,f216,f217
f221,f222,f223,f224,f225,f226,f227
f231,f232,f233,f234,f235,f236,f237
f241,f242,f243,f244,f245,f246,f247
f251,f252,f253,f254,f255,f256,f257
f261,f262,f263,f264,f265,f266,f267
f271,f272,f273,f274,f275,f276,f277
f281,f282,f283,i,h,ff,f287
fn1,fn2,fn3,fn4,fn5,fn6,fn7
f301,f302,f303,f304,f305,f306,f307

All values in the text file are treated as strings.

Unacceptable entries

An unacceptable entry is a field f(i,j) whose tuple [f(i,j-1), f(i,j), f(i,j+1)] already occurs elsewhere in the text file, before or after it. In other words, for a target field X, check whether X together with the field on its left (XL) and the field on its right (XR) matches a tuple seen earlier in the file. If it does, the output should say that field X on that line number is problematic because the tuple [XL,X,XR] is already defined on an earlier line number.

The output should display:
- all the lines involved in a conflict: the earlier line (the first occurrence, which is accepted when the file is read) and the problematic lines (which follow it and would therefore be ignored);
- the row number of the accepted first occurrence of the tuple;
- the row numbers of any rejected occurrences, which would be ignored;
- the tuples [XL,X,XR] that cause the problem.

Example:

Field1;Field2;Field3;Field4;Field5;Field6;Field7<--------Headers
a;b;a;d;e;f;g
h;i;h;i;h;ff;f27
f31;f32;f33;f34;f35;f36;f37
f41;f42;f43;f44;f45;f46;f47
f51;f52;f53;f54;f55;f56;f57
f61;f62;a;b;a;f66;f67
............................
f161;a;b;a;f165;f166;f167
i;h;ff;f174;f175;f176;f177
...........................
f281;f282;f283;i;h;ff;f287
fn1;fn2;fn3;fn4;fn5;fn6;fn7

It should display:

[a;b;a], accepted on line 1 but rejected on lines: 6,16
Line accepted is : a;b;a;d;e;f;g
Line(s) rejected are: f61;f62;a;b;a;f66;f67
                      f161;a;b;a;f165;f166;f167

[h;i;h], Not accepted at all. rejected on lines: 2 
Line accepted is: empty
Lines rejected :  h;i;h;i;h;ff;f27

[i;h;ff],Not accepted at all. rejected on lines: 2,17,28
Line accepted is: empty
Lines rejected :
             h;i;h;i;h;ff;f27
             i;h;ff;f174;f175;f176;f177
             f281;f282;f283;i;h;ff;f287

N.B.: "Not accepted at all" is displayed when the list of accepted lines is empty, i.e. when the problem occurs on the same line.

Any advice or help is welcome.

Update

I gave an answer.

Thank You very much.


1 Reply


This is sort of the point of objects. You should create an object model that reflects the things you are working with.

So first you would create a class, something like this:

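import java.util.Objects;

// Immutable value object holding the seven fields of one row.
public class SeptTuple {
  public final String field1, field2, field3, field4, field5, field6, field7;

  public SeptTuple(String f1, String f2, String f3, String f4,
                   String f5, String f6, String f7) {
    field1 = f1;
    field2 = f2;
    field3 = f3;
    field4 = f4;
    field5 = f5;
    field6 = f6;
    field7 = f7;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof SeptTuple))
      return false;

    // Two tuples are equal when all seven fields are equal (null-safe).
    SeptTuple s = (SeptTuple) o;
    return Objects.equals(field1, s.field1) && Objects.equals(field2, s.field2)
        && Objects.equals(field3, s.field3) && Objects.equals(field4, s.field4)
        && Objects.equals(field5, s.field5) && Objects.equals(field6, s.field6)
        && Objects.equals(field7, s.field7);
  }

  @Override
  public int hashCode() {
    // If two objects are equal, they must return the same hash code
    return Objects.hash(field1, field2, field3, field4, field5, field6, field7);
  }
}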

Once you have that, finding duplicates is as easy as:

Map<SeptTuple, SeptTuple> map = new HashMap<>();
....
// If already set, map will return the old value on put
SeptTuple temp = map.put(newSeptTuple, newSeptTuple);
if(temp != null) {
   // handle clash
}
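
To report which row a duplicate clashes with, the map value can hold the row number of the first occurrence instead of the tuple itself. The following is only a minimal sketch under assumptions: the class name RowDuplicates and the method findRepeats are hypothetical, and a List<String> of the split fields stands in for the seven-field class because lists already compare by value. It uses Map.putIfAbsent so that the first row stays in the map.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper: finds rows that repeat an earlier row in full.
class RowDuplicates {

    /**
     * Maps each 1-based row number that repeats an earlier row
     * to the row number of that row's first occurrence.
     */
    static Map<Integer, Integer> findRepeats(List<String> lines, String separator) {
        // key: the fields of a row, value: row number where they were first seen
        Map<List<String>, Integer> firstSeen = new HashMap<>();
        Map<Integer, Integer> repeats = new LinkedHashMap<>();
        for (int i = 0; i < lines.size(); i++) {
            // limit -1 keeps trailing empty fields; the separator is quoted so it is taken literally
            String[] fields = lines.get(i).split(Pattern.quote(separator), -1);
            List<String> key = Arrays.asList(fields);
            // putIfAbsent returns the existing value and leaves it untouched
            Integer earlier = firstSeen.putIfAbsent(key, i + 1);
            if (earlier != null) {
                repeats.put(i + 1, earlier);
            }
        }
        return repeats;
    }
}
```

Unlike put, putIfAbsent does not overwrite the stored value, so every later clash is reported against the first row that was read.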

If you need to find equal parts in subsets of each row, then break this solution down into as many objects as you need to accurately represent each element of the tuple. (You will need to make 3 classes to represent each part of your tuple.)
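
For the three-field tuples in the question, one way to apply this idea is to slide a window of three adjacent fields over each row and remember every row in which each triple appears. The sketch below is an assumption-laden illustration, not a definitive implementation: the names TripleConflicts, Conflict and find are hypothetical, and the acceptance rule is inferred from the question's expected output (a row is treated as rejected as soon as it contains a triple that was already seen, whether in an earlier row or earlier in the same row). A three-element List<String> is used as the map key in place of a dedicated class, since lists compare by value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: reports triples [XL, X, XR] that occur more than once.
class TripleConflicts {

    /** One repeated triple; acceptedRow is -1 when no row is accepted. */
    static final class Conflict {
        final List<String> triple;
        final int acceptedRow;
        final List<Integer> rejectedRows;

        Conflict(List<String> triple, int acceptedRow, List<Integer> rejectedRows) {
            this.triple = triple;
            this.acceptedRow = acceptedRow;
            this.rejectedRows = rejectedRows;
        }
    }

    /** rows holds the already-split data rows (header excluded); row numbers are 1-based. */
    static List<Conflict> find(List<String[]> rows) {
        // triple -> row number of every occurrence, in reading order
        Map<List<String>, List<Integer>> occurrences = new LinkedHashMap<>();
        // rows that contain at least one triple seen before (assumed rejection rule)
        Set<Integer> rejected = new HashSet<>();

        for (int r = 0; r < rows.size(); r++) {
            String[] f = rows.get(r);
            for (int j = 1; j + 1 < f.length; j++) {
                List<String> triple = Arrays.asList(f[j - 1], f[j], f[j + 1]);
                List<Integer> seen = occurrences.computeIfAbsent(triple, k -> new ArrayList<>());
                if (!seen.isEmpty()) {
                    rejected.add(r + 1);
                }
                seen.add(r + 1);
            }
        }

        List<Conflict> conflicts = new ArrayList<>();
        for (Map.Entry<List<String>, List<Integer>> e : occurrences.entrySet()) {
            List<Integer> rowsOfTriple = e.getValue();
            if (rowsOfTriple.size() < 2) {
                continue; // the triple occurs only once, so there is no conflict
            }
            Integer first = rowsOfTriple.get(0);
            // the first occurrence is accepted only if its row was not itself rejected
            int accepted = rejected.contains(first) ? -1 : first;
            Set<Integer> rejectedRows = new LinkedHashSet<>(rowsOfTriple);
            if (accepted != -1) {
                rejectedRows.remove(first);
            }
            conflicts.add(new Conflict(e.getKey(), accepted, new ArrayList<>(rejectedRows)));
        }
        return conflicts;
    }
}
```

To use it, skip the header line, split each remaining line on the file's separator (for example line.split(";", -1)), and pass the resulting arrays to find. An acceptedRow of -1 corresponds to the "Not accepted at all" case in the question.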

