Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.2k views
in Technique[技术] by (71.8m points)

regex - Oracle SQL: the insert query with REGEXP_SUBSTR expression is very long ( split string )

I must insert into table 2 fields (first the Primary key(about the articles) and the second concerns their size(of these articles).

In source envrionnement, i have into table, the primary key(TK Articles) and a concatenation of a size into second field. However, i must insert into target table, the TK Articles and the several size of the Artcles. For example,

Source:

ART        SIZE**                                      
1        |  28/30   
2        |  30/32   
3        | Size 10/Size 12/Size 14/Size 14

Target:

ART         Size
1        |   28  
1        |   30                   
2        |   30            
2        |   32             
3        |  Size 10         
3        |  Size 12       
3        |  Size 14      
3        |  Size 16

The difficulty is to know how many '/' is included in the field?

I have made a query

SELECT ART,
       REGEXP_SUBSTR(SIZE,'[^/]+',1,level)
FROM TABLLE
CONNECT BY REGEXP_SUBSTR(SIZE,'[^/]+',1,level) IS NOT NULL;

the select transaction works and display results in 46 seconds. But the TABLE have 100 000 lines and the insert transaction is too long and doesn't work.

Somebody can help me on this point?

Thanks & Regards

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Regular expressions are very expensive to compute. If there is a need to process a large number of rows, personally I would go with a stored procedure - pipelined table function:

-- table with 100000 rows
create table Tb_SplitStr(col1, col2) as
  select level
       , 'Size 10/Size 12/Size 14/Size 14/Size 15/Size 16/Size 17'
   from dual
  connect by level <= 100000 
  1. PL/SQL package:

    create or replace package Split_Pkg as
      type T_StrList is table of varchar2(1000);
      function Str_Split(
         p_str in varchar2,
         p_dlm in varchar2
      ) return T_StrList pipelined;
    end;
    
    create or replace package body Split_Pkg as
      function Str_Split(
         p_str in varchar2,
         p_dlm in varchar2
      ) return T_StrList pipelined
      is
         l_src_str  varchar2(1000) default p_str;
         l_dlm_pos  number;
      begin
         while l_src_str is not null
         loop
            l_dlm_pos := instr(l_src_str, p_dlm);
            case
              when l_dlm_pos = 0
              then pipe row (l_src_str);
                   l_src_str := '';
              else pipe row(substr(l_src_str, 1, l_dlm_pos - 1));
                   l_src_str := substr(l_src_str, l_dlm_pos + 1);
            end case;
         end loop;
         return;
      end; 
    end;
    
  2. SQL Query with regexp functions:

    with ocrs(ocr) as(
       select level
         from ( select max(regexp_count(col2, '[^/]+')) as mx
                  from tb_splitStr) t
       connect by level <= t.mx
    )
    select count(regexp_substr(s.col2, '[^/]+', 1, o.ocr)) as res
      from tb_splitStr s
       cross join ocrs o
    

Result:

-- SQL with regexp
SQL> with ocrs(ocr) as(
  2    select level
  3     from ( select max(regexp_count(col2, '[^/]+')) as mx
  4              from tb_splitStr) t
  5    connect by level <= t.mx
  6  )
  7  select count(regexp_substr(s.col2, '[^/]+', 1, o.ocr)) as res
  8    from tb_splitStr s
  9     cross join ocrs o
 10  ;

Res
------------------------------
                        700000
Executed in 4.093 seconds

SQL> /

Res
------------------------------
                        700000
Executed in 3.812 seconds



--Query with pipelined table function  
SQL> select count(*)
  2    from Tb_SplitStr s
  3    cross join table(split_pkg.Str_Split(s.col2, '/'))
  4  ;

 COUNT(*)
----------
    700000
Executed in 2.469 seconds

SQL> /

COUNT(*)
----------
    700000
Executed in 2.406 seconds

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...