我們經常會碰到需要刪除數據表中重復記錄,下面我來總結了幾種能刪除重復記錄並助相對來講效率是非常不錯的,有需要了解的朋友可進入參考。
創建一個表用來存放,要刪除的記錄的id信息:
代碼如下
復制代碼
CREATE TABLE `tmp_ids` (
`id` int(11),
`name` char(20)
) ENGINE=MyISAM;
如果要刪除的記錄不多的話,可以把這個表創建成內存表形式:
代碼如下
復制代碼
CREATE TABLE `tmp_ids` (
`id` int(11),
`name` char(20)
) ENGINE=HEAP;
然後在test表中刪除重復記錄:
代碼如下
復制代碼
insert into tmp_ids select min(id),name from test group by name having count(*)>1 order by null;
delete a.* from test a,tmp_ids b where b.name=a.name and a.id>b.id;
truncate table tmp_ids;
方法二
復制無重復記錄到新表格,刪除舊表格,然後重命名新表格為舊表名稱。
代碼如下
復制代碼
mysql> select * from duplicate where id in(select min(id) from duplicate group by name);
+----+-------+
| id | name |
+----+-------+
| 1 | wang |
| 3 | wdang |
| 5 | wdand |
| 6 | wddda |
+----+-------+
4 rows in set (0.01 sec)
mysql> create table duplica select * from duplicate where id in(select min(id) from duplicate group by name);
Query OK, 4 rows affected (0.02 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> drop table duplicate;
Query OK, 0 rows affected (0.01 sec)
mysql> alter table duplica rename to duplicate;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from duplicate;
+----+-------+
| id | name |
+----+-------+
| 1 | wang |
| 3 | wdang |
| 5 | wdand |
| 6 | wddda |
+----+-------+
4 rows in set (0.00 sec)
mysql> alter table duplicate modify id int(2) not null primary key auto_increment;
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
後來想了一個語句搞定了:
代碼如下
復制代碼
mysql> use mysql
Database changed
mysql> select * from duplicate;
+----+-------+
| id | name |
+----+-------+
| 1 | wang |
| 3 | wdang |
| 5 | wdand |
| 6 | wddda |
| 2 | wang |
| 4 | wdang |
+----+-------+
6 rows in set (0.00 sec)
mysql> delete duplicate as a from duplicate as a,
-> (
-> select * from duplicate group by name having count(1)>1) as b
-> where a.name=b.name and a.id > b.id;
Query OK, 2 rows affected (0.00 sec)
mysql> select * from duplicate;
+----+-------+
| id | name |
+----+-------+
| 1 | wang |
| 3 | wdang |
| 5 | wdand |
| 6 | wddda |
+----+-------+
4 rows in set (0.00 sec)
保留ID最小的記錄。
第1種:
代碼如下
復制代碼
delete from %s where goodsurl in (select goodsrul as gurl1 from %s
#group by grul1 having count(gurl1)>1)rs1 and id not in (select min(id)as id2 from %s
#group by goodsurl having count(goodsurl)>1)rs2"%(a,a,a)
第2種:這種方法不使用子集,但是我不知道怎麼把rs1,rs2裡面的元組嵌入到SQL語句中,
代碼如下
復制代碼
exeSql = "select min(id) from %s group by goodsurl havingcount(goodsurl)>1)"%(a,)
cur.execute(exeSql)
rs1 = cur.fetchall()
exeSql = "select goodsurl from %s group by goodsurl havingcount(goodsurl)>1"%(a,)
cur.execute(exeSql)
rs2 = cur.fetchall()
exeSql = "delete from %s where goodsurl in %s and id not in %"%(a,rs2,rs1)
cur.execute(exeSql)