我一再需要编写复杂的SQL查询,用于即席查询和内置于应用程序中的查询,其中大部分查询只是重复“代码”.
用传统的编程语言编写这样的可憎之处会让你陷入困境,但我(我)还没有找到任何可靠的技术来阻止SQL查询代码的重复.
编辑:1,我要感谢那些为我原来的例子提供了极好改进的回答者.但是,这个问题不是我的例子.这是关于SQL查询的重复性.因此,到目前为止,答案(JackP,Leigh)可以很好地证明您可以通过编写更好的查询来减少重复性.然而,即便如此,你仍然会面临一些显然无法删除的重复性:这总是让我厌倦了sql.在“传统”编程语言中,我可以进行很多重构以最大限度地减少代码中的重复性,但是使用sql似乎没有(?)工具允许这样做,除了编写一个重复性较低的语句之外.
请注意,我已经删除了Oracle标记,因为我真的很感兴趣是否没有数据库或脚本语言允许更多内容.
这是今天拼凑在一起的一颗宝石.它基本上报告了单个表的一组列中的差异.请浏览以下代码,尤其是最后的大查询.我会在下面继续.
-- -- Create Table to test queries -- CREATE TABLE TEST_ATTRIBS ( id NUMBER PRIMARY KEY,name VARCHAR2(300) UNIQUE,attr1 VARCHAR2(2000),attr2 VARCHAR2(2000),attr3 INTEGER,attr4 NUMBER,attr5 VARCHAR2(2000) ); -- -- insert some test data -- insert into TEST_ATTRIBS values ( 1,'Alfred','a','Foobar',33,44,'e'); insert into TEST_ATTRIBS values ( 2,'Batman','b',66,'e'); insert into TEST_ATTRIBS values ( 3,'Chris','c',99,'e'); insert into TEST_ATTRIBS values ( 4,'Dorothee','d','e'); insert into TEST_ATTRIBS values ( 5,'Emilia','e','Barfoo','e'); insert into TEST_ATTRIBS values ( 6,'Francis','f','e'); insert into TEST_ATTRIBS values ( 7,'Gustav','g','e'); insert into TEST_ATTRIBS values ( 8,'Homer','h','e'); insert into TEST_ATTRIBS values ( 9,'Ingrid','i','e'); insert into TEST_ATTRIBS values (10,'Jason','j','Bob','e'); insert into TEST_ATTRIBS values (12,'Konrad','k','e'); insert into TEST_ATTRIBS values (13,'Lucas','l','e'); insert into TEST_ATTRIBS values (14,'DUP_Alfred','FOOBAR','e'); insert into TEST_ATTRIBS values (15,'DUP_Chris','e'); insert into TEST_ATTRIBS values (16,'DUP_Dorothee','e'); insert into TEST_ATTRIBS values (17,'DUP_Gustav','X','e'); insert into TEST_ATTRIBS values (18,'DUP_Homer','e'); insert into TEST_ATTRIBS values (19,'DUP_Ingrid','Y','foo','e'); insert into TEST_ATTRIBS values (20,'Martha','m',88,'f'); -- Create comparison view CREATE OR REPLACE VIEW TA_SELFCMP as select t1.id as id_1,t2.id as id_2,t1.name as name,t2.name as name_dup,t1.attr1 as attr1_1,t1.attr2 as attr2_1,t1.attr3 as attr3_1,t1.attr4 as attr4_1,t1.attr5 as attr5_1,t2.attr1 as attr1_2,t2.attr2 as attr2_2,t2.attr3 as attr3_2,t2.attr4 as attr4_2,t2.attr5 as attr5_2 from TEST_ATTRIBS t1,TEST_ATTRIBS t2 where t1.id <> t2.id and t1.name <> t2.name and t1.name = REPLACE(t2.name,'DUP_','') ; -- NOTE THIS PIECE OF HORRIBLE CODE REPETITION -- -- Create comparison report -- compare 1st attribute select 'attr1' as Different,id_1,id_2,name,name_dup,CAST(attr1_1 AS VARCHAR2(2000)) as Val1,CAST(attr1_2 AS VARCHAR2(2000)) as Val2 from TA_SELFCMP where attr1_1 <> attr1_2 or (attr1_1 is null and attr1_2 is not null) or (attr1_1 is not null and attr1_2 is null) union -- compare 2nd attribute select 'attr2' as Different,CAST(attr2_1 AS VARCHAR2(2000)) as Val1,CAST(attr2_2 AS VARCHAR2(2000)) as Val2 from TA_SELFCMP where attr2_1 <> attr2_2 or (attr2_1 is null and attr2_2 is not null) or (attr2_1 is not null and attr2_2 is null) union -- compare 3rd attribute select 'attr3' as Different,CAST(attr3_1 AS VARCHAR2(2000)) as Val1,CAST(attr3_2 AS VARCHAR2(2000)) as Val2 from TA_SELFCMP where attr3_1 <> attr3_2 or (attr3_1 is null and attr3_2 is not null) or (attr3_1 is not null and attr3_2 is null) union -- compare 4th attribute select 'attr4' as Different,CAST(attr4_1 AS VARCHAR2(2000)) as Val1,CAST(attr4_2 AS VARCHAR2(2000)) as Val2 from TA_SELFCMP where attr4_1 <> attr4_2 or (attr4_1 is null and attr4_2 is not null) or (attr4_1 is not null and attr4_2 is null) union -- compare 5th attribute select 'attr5' as Different,CAST(attr5_1 AS VARCHAR2(2000)) as Val1,CAST(attr5_2 AS VARCHAR2(2000)) as Val2 from TA_SELFCMP where attr5_1 <> attr5_2 or (attr5_1 is null and attr5_2 is not null) or (attr5_1 is not null and attr5_2 is null) ;
如您所见,生成“差异报告”的查询使用相同的sql SELECT块5次(很容易就是42次!).这让我觉得绝对是脑死亡(我可以这么说,毕竟我写了代码),但我还没有找到任何好的解决方案.
>如果这是一些实际应用程序代码中的查询,我可以编写一个函数,将此查询拼凑为一个字符串,然后我将查询作为字符串执行.
> – >构建字符串对于测试和维护来说是可怕且可怕的.如果“应用程序代码”是用PL / sql这样的语言编写的,那就会感觉很糟糕.
>或者,如果从PL / sql或类似用途中使用,我猜有一些程序手段可以使这个查询更易于维护.
> – >将可以在单个查询中表达的内容展开到程序步骤中以防止代码重复也感觉不对.
>如果这个查询需要作为数据库中的视图,那么 – 据我所知 – 除了实际维护我上面发布的视图定义之外别无他法. (!!?)
> – >我实际上不得不对一个2页的视图定义进行一些维护,一旦离语句不远.显然,在此视图中更改任何内容都需要对视图定义进行正则表达式文本搜索,以确定是否在另一行中使用了相同的子语句以及是否需要在那里进行更改.
因此,正如标题所述 – 有什么技术可以防止写这种可憎的事情?
解决方法
> t1.name<>如果t1.name = REPLACE(t2.name,’DUP_’,”),则t2.name始终为true – 您可以删除前者
>通常你想要联盟. union表示union all然后删除重复项.它可能在这种情况下没有区别,但总是使用union all是一个好习惯,除非你明确想要删除任何重复项.
>如果您愿意在转换为varchar之后进行数值比较,则可能需要考虑以下因素:
create view test_attribs_cast as select id,attr1,attr2,cast(attr3 as varchar(2000)) as attr3,cast(attr4 as varchar(2000)) as attr4,attr5 from test_attribs; create view test_attribs_unpivot as select id,1 as attr#,attr1 as attr from test_attribs_cast union all select id,2,attr2 from test_attribs_cast union all select id,3,attr3 from test_attribs_cast union all select id,4,attr4 from test_attribs_cast union all select id,5,attr5 from test_attribs_cast; select 'attr'||t1.attr# as different,t1.id as id_1,t1.name,t1.attr as val1,t2.attr as val2 from test_attribs_unpivot t1 join test_attribs_unpivot t2 on( t1.id<>t2.id and t1.name = replace(t2.name,'') and t1.attr#=t2.attr# ) where t1.attr<>t2.attr or (t1.attr is null and t2.attr is not null) or (t1.attr is not null and t2.attr is null);
第二种观点是一种不透明的操作 – 如果你至少11g你可以用unpivot
clause更简洁地做到这一点 – 见here的一个例子
>我说如果你能在sql中做到这一点,不要沿着程序路线走下去,但……
>尽管您在测试和维护时提到了问题,但动态sql可能值得考虑
– 编辑 –
为了回答问题的更一般方面,有一些技术可以减少sql中的重复,包括:
>意见 – 你知道那一个:)
> Common Table Expressions(见here)
>数据库的个别功能,如解码(请参阅Leigh关于如何减少重复的答案),window functions和hierarchical/recursive查询等等
但是你不能直接将OO思想引入sql世界 – 在许多情况下,如果查询是可读的并且写得很好,重复就很好了,并且为了避免重复而采用动态sql(例如)是不明智的.
包括Leigh建议的更改和CTE而不是视图的最终查询可能如下所示:
with t as ( select id,attr#,decode(attr#,1,attr3,attr4,attr5) attr from test_attribs cross join (select rownum attr# from dual connect by rownum<=5)) select 'attr'||t1.attr# as different,t2.attr as val2 from t t1 join test_attribs_unpivot t2 on( t1.id<>t2.id and t1.name = replace(t2.name,'') and t1.attr#=t2.attr# ) where t1.attr<>t2.attr or (t1.attr is null and t2.attr is not null) or (t1.attr is not null and t2.attr is null);