以下是tSQL查询:
- DECLARE @table TABLE(data VARCHAR(20))
- INSERT INTO @table VALUES ('not duplicate row')
- INSERT INTO @table VALUES ('duplicate row')
- INSERT INTO @table VALUES ('duplicate row')
- INSERT INTO @table VALUES ('second duplicate row')
- INSERT INTO @table VALUES ('second duplicate row')
- SELECT data
- INTO #duplicates
- FROM @table
- GROUP BY data
- HAVING COUNT(*) > 1
- -- delete all rows that are duplicated
- DELETE FROM @table
- FROM @table o INNER JOIN #duplicates d
- ON d.data = o.data
- -- insert one row for every duplicate set
- INSERT INTO @table(data)
- SELECT data
- FROM #duplicates
我理解它在做什么,但逻辑的最后一部分(在每个重复集合后插入一行)没有意义.我们有一组代码–delete所有重复的行,那就删除了重复项,那么最后一部分是什么?
谢谢
解决方法
Where we have the set of code for –delete all rows that are duplicated,that gets rid of the duplicates so what’s the part of the last section?
首先,它删除所有曾经重复的行.也就是说,所有行,原始也是.在上面的例子中,DELETE之后只有一行(‘不重复行’)将保留在表中.其他所有四行都将被删除.
最好的方法是:
- WITH q AS (
- SELECT data,ROW_NUMBER() OVER (PARTITION BY data ORDER BY data) AS rn
- FROM @table
- )
- DELETE
- FROM q
- WHERE rn > 1