כיצד למחוק שורות כפולות ב-SQL

בסעיף זה, אנו לומדים דרכים שונות למחוק שורות כפולות MySQL ו-Oracle . אם ה SQL הטבלה מכילה שורות כפולות, אז עלינו להסיר את השורות הכפולות.

הכנת נתונים לדוגמה

הסקריפט יוצר את הטבלה בשם אנשי קשר .

 DROP TABLE IF EXISTS contacts; CREATE TABLE contacts ( id INT PRIMARY KEY AUTO_INCREMENT, first_name VARCHAR(30) NOT NULL, last_name VARCHAR(25) NOT NULL, email VARCHAR(210) NOT NULL, age VARCHAR(22) NOT NULL );

בטבלה לעיל, הכנסנו את הנתונים הבאים.

 INSERT INTO contacts (first_name,last_name,email,age) VALUES (&apos;Kavin&apos;,&apos;Peterson&apos;,&apos;[email protected]&apos;,&apos;21&apos;), (&apos;Nick&apos;,&apos;Jonas&apos;,&apos;[email protected]&apos;,&apos;18&apos;), (&apos;Peter&apos;,&apos;Heaven&apos;,&apos;[email protected]&apos;,&apos;23&apos;), (&apos;Michal&apos;,&apos;Jackson&apos;,&apos;[email protected]&apos;,&apos;22&apos;), (&apos;Sean&apos;,&apos;Bean&apos;,&apos;[email protected]&apos;,&apos;23&apos;), (&apos;Tom &apos;,&apos;Baker&apos;,&apos;[email protected]&apos;,&apos;20&apos;), (&apos;Ben&apos;,&apos;Barnes&apos;,&apos;[email protected]&apos;,&apos;17&apos;), (&apos;Mischa &apos;,&apos;Barton&apos;,&apos;[email protected]&apos;,&apos;18&apos;), (&apos;Sean&apos;,&apos;Bean&apos;,&apos;[email protected]&apos;,&apos;16&apos;), (&apos;Eliza&apos;,&apos;Bennett&apos;,&apos;[email protected]&apos;,&apos;25&apos;), (&apos;Michal&apos;,&apos;Krane&apos;,&apos;[email protected]&apos;,&apos;25&apos;), (&apos;Peter&apos;,&apos;Heaven&apos;,&apos;[email protected]&apos;,&apos;20&apos;), (&apos;Brian&apos;,&apos;Blessed&apos;,&apos;[email protected]&apos;,&apos;20&apos;); (&apos;Kavin&apos;,&apos;Peterson&apos;,&apos;[email protected]&apos;,&apos;30&apos;),

אנו מבצעים את הסקריפט כדי ליצור מחדש נתוני בדיקה לאחר ביצוע א לִמְחוֹק הַצהָרָה .

השאילתה מחזירה נתונים מטבלת אנשי הקשר:

 SELECT * FROM contacts ORDER BY email;

תְעוּדַת זֶהוּת	שם פרטי	שם משפחה	אימייל	גיל
7	בן	בארנס	[מוגן באימייל]	עשרים ואחת
13	בריאן	בָּרוּך	[מוגן באימייל]	18
10	אליזה	בנט	[מוגן באימייל]	23
1	קאווין	פיטרסון	[מוגן באימייל]	22
14	קאווין	פיטרסון	[מוגן באימייל]	23
8	מישה	ברטון	[מוגן באימייל]	עשרים
אחד עשר	מיכאל	ברזים	[מוגן באימייל]	17
4	מיכאל	ג'קסון	[מוגן באימייל]	18
2	ניק	ג'ונאס	[מוגן באימייל]	16
3	פיטר	גן העדן	[מוגן באימייל]	25
12	פיטר	גן העדן	[מוגן באימייל]	25
5	שון	אפונה	[מוגן באימייל]	עשרים
9	שון	אפונה	[מוגן באימייל]	עשרים
6	טום	אוֹפֶה	[מוגן באימייל]	30

שאילתת SQL הבאה מחזירה את הודעות האימייל הכפולות מטבלת אנשי הקשר:

 SELECT email, COUNT(email) FROM contacts GROUP BY email HAVING COUNT (email) &gt; 1;

אימייל	COUNT (אימייל)
[מוגן באימייל]	2
[מוגן באימייל]	2
[מוגן באימייל]	2

יש לנו שלוש שורות עם לְשַׁכְפֵּל מיילים.

נוסחת מייסון

(א) מחק שורות כפולות עם הצהרת DELETE JOIN

 DELETE t1 FROM contacts t1 INNERJOIN contacts t2 WHERE t1.id <t2.id and t1.email="t2.email;" < pre> <p> <strong>Output:</strong> </p> <pre> Query OK, three rows affected (0.10 sec) </pre> <p>Three rows had been deleted. We execute the query, given below to finds the <strong>duplicate emails</strong> from the table.</p> <pre> SELECT email, COUNT (email) FROM contacts GROUP BY email HAVING COUNT (email) &gt; 1; </pre> <p>The query returns the empty set. To verify the data from the contacts table, execute the following SQL query:</p> <pre> SELECT * FROM contacts; </pre> <br> <table class="table"> <tr> <td>id</td> <td>first_name</td> <td>last_name</td> <td>Email</td> <td>age</td> </tr> <tr> <td>7</td> <td>Ben</td> <td>Barnes</td> <td> [email protected] </td> <td>21</td> </tr> <tr> <td>13</td> <td>Brian</td> <td>Blessed</td> <td> [email protected] </td> <td>18</td> </tr> <tr> <td>10</td> <td>Eliza</td> <td>Bennett</td> <td> [email protected] </td> <td>23</td> </tr> <tr> <td>1</td> <td>Kavin</td> <td>Peterson</td> <td> [email protected] </td> <td>22</td> </tr> <tr> <td>8</td> <td>Mischa</td> <td>Barton</td> <td> [email protected] </td> <td>20</td> </tr> <tr> <td>11</td> <td>Micha</td> <td>Krane</td> <td> [email protected] </td> <td>17</td> </tr> <tr> <td>4</td> <td>Michal</td> <td>Jackson</td> <td> [email protected] </td> <td>18</td> </tr> <tr> <td>2</td> <td>Nick</td> <td>Jonas</td> <td> [email protected] </td> <td>16</td> </tr> <tr> <td>3</td> <td>Peter</td> <td>Heaven</td> <td> [email protected] </td> <td>25</td> </tr> <tr> <td>5</td> <td>Sean</td> <td>Bean</td> <td> [email protected] </td> <td>20</td> </tr> <tr> <td>6</td> <td>Tom</td> <td>Baker</td> <td> [email protected] </td> <td>30</td> </tr> </table> <p>The rows <strong>id&apos;s 9, 12, and 14</strong> have been deleted. We use the below statement to delete the duplicate rows:</p> <p>Execute the script for <strong>creating</strong> the contact.</p> <pre> DELETE c1 FROM contacts c1 INNERJ OIN contacts c2 WHERE c1.id &gt; c2.id AND c1.email = c2.email; </pre> <br> <table class="table"> <tr> <td>id</td> <td>first_name</td> <td>last_name</td> <td>email</td> <td>age</td> </tr> <tr> <td>1</td> <td>Ben</td> <td>Barnes</td> <td> [email protected] </td> <td>21</td> </tr> <tr> <td>2</td> <td> <strong>Kavin</strong> </td> <td> <strong>Peterson</strong></td> <td> <strong> [email protected] </strong> </td> <td> <strong>22</strong> </td> </tr> <tr> <td>3</td> <td>Brian</td> <td>Blessed</td> <td> [email protected] </td> <td>18</td> </tr> <tr> <td>4</td> <td>Nick</td> <td>Jonas</td> <td> [email protected] </td> <td>16</td> </tr> <tr> <td>5</td> <td>Michal</td> <td>Krane</td> <td> [email protected] </td> <td>17</td> </tr> <tr> <td>6</td> <td>Eliza</td> <td>Bennett</td> <td> [email protected] </td> <td>23</td> </tr> <tr> <td>7</td> <td>Michal</td> <td>Jackson</td> <td> [email protected] </td> <td>18</td> </tr> <tr> <td>8</td> <td> <strong>Sean</strong> </td> <td> <strong>Bean</strong> </td> <td> <strong> [email protected] </strong> </td> <td> <strong>20</strong> </td> </tr> <tr> <td>9</td> <td>Mischa</td> <td>Barton</td> <td> [email protected] </td> <td>20</td> </tr> <tr> <td>10</td> <td> <strong>Peter</strong> </td> <td> <strong>Heaven</strong> </td> <td> <strong> [email protected] </strong> </td> <td> <strong>25</strong> </td> </tr> <tr> <td>11</td> <td>Tom</td> <td>Baker</td> <td> [email protected] </td> <td>30</td> </tr> </table> <h2>(B) Delete duplicate rows using an intermediate table</h2> <p>To delete a duplicate row by using the intermediate table, follow the steps given below:</p> <p> <strong>Step 1</strong> . Create a new table <strong>structure</strong> , same as the real table:</p> <pre> CREATE TABLE source_copy LIKE source; </pre> <p> <strong>Step 2</strong> . Insert the distinct rows from the original schedule of the database:</p> <pre> INSERT INTO source_copy SELECT * FROM source GROUP BY col; </pre> <p> <strong>Step 3</strong> . Drop the original table and rename the immediate table to the original one.</p> <pre> DROP TABLE source; ALTER TABLE source_copy RENAME TO source; </pre> <p>For example, the following statements delete the <strong>rows</strong> with <strong>duplicate</strong> emails from the contacts table:</p> <pre> -- step 1 CREATE TABLE contacts_temp LIKE contacts; -- step 2 INSERT INTO contacts_temp SELECT * FROM contacts GROUP BY email; -- step 3 DROP TABLE contacts; ALTER TABLE contacts_temp RENAME TO contacts; </pre> <h2>(C) Delete duplicate rows using the ROW_NUMBER() Function</h2> <h4>Note: The ROW_NUMBER() function has been supported since MySQL version 8.02, so we should check our MySQL version before using the function.</h4> <p>The following statement uses the <strong>ROW_NUMBER ()</strong> to assign a sequential integer to every row. If the email is duplicate, the row will higher than one.</p> <pre> SELECT id, email, ROW_NUMBER() OVER (PARTITION BY email ORDER BY email ) AS row_num FROM contacts; </pre> <p>The following SQL query returns <strong>id list</strong> of the duplicate rows:</p> <pre> SELECT id FROM (SELECT id, ROW_NUMBER() OVER ( PARTITION BY email ORDER BY email) AS row_num FROM contacts ) t WHERE row_num&gt; 1; </pre> <p> <strong>Output:</strong> </p> <table class="table"> <tr> <td>id</td> </tr> <tr> <td>9</td> </tr> <tr> <td>12</td> </tr> <tr> <td>14</td> </tr> </table> <h2>Delete Duplicate Records in Oracle</h2> <p>When we found the duplicate records in the table, we had to delete the unwanted copies to keep our data clean and unique. If a table has duplicate rows, we can delete it by using the <strong>DELETE</strong> statement.</p> <p>In the case, we have a column, which is not the part of <strong>group</strong> used to <strong>evaluate</strong> the <strong>duplicate</strong> records in the table.</p> <p>Consider the table given below:</p> <table class="table"> <tr> <td>VEGETABLE_ID</td> <td>VEGETABLE_NAME</td> <td>COLOR</td> </tr> <tr> <td>01</td> <td>Potato</td> <td>Brown</td> </tr> <tr> <td>02</td> <td>Potato</td> <td>Brown</td> </tr> <tr> <td>03</td> <td>Onion</td> <td>Red</td> </tr> <tr> <td>04</td> <td>Onion</td> <td>Red</td> </tr> <tr> <td>05</td> <td>Onion</td> <td>Red</td> </tr> <tr> <td>06</td> <td>Pumpkin</td> <td>Green</td> </tr> <tr> <td>07</td> <td>Pumpkin</td> <td>Yellow</td> </tr> </table> <br> <pre> -- create the vegetable table CREATE TABLE vegetables ( VEGETABLE_ID NUMBER generated BY DEFAULT AS ID ENTITY, VEGETABLE_NAME VARCHAR2(100), color VARCHAR2(20), PRIMARY KEY (VEGETABLE_ID) ); </pre> <br> <pre> -- insert sample rows INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES(&apos;Potato&apos;,&apos;Brown&apos;); INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES(&apos;Potato&apos;,&apos;Brown&apos;); INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES(&apos;Onion&apos;,&apos;Red&apos;); INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES(&apos;Onion&apos;,&apos;Red&apos;); INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES(&apos;Onion&apos;,&apos;Red&apos;); INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES(&apos;Pumpkin&apos;,&apos;Green&apos;); INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES(&apos;Pumpkin&apos;,&apos;Yellow&apos;); </pre> <br> <pre> -- query data from the vegetable table SELECT * FROM vegetables; </pre> <p>Suppose, we want to keep the row with the highest <strong>VEGETABLE_ID</strong> and delete all other copies.</p> <pre> SELECT MAX (VEGETABLE_ID) FROM vegetables GROUP BY VEGETABLE_NAME, color ORDER BY MAX(VEGETABLE_ID); </pre> <br> <table class="table"> <tr> <td>MAX(VEGETABLE_ID)</td> </tr> <tr> <td>2</td> </tr> <tr> <td>5</td> </tr> <tr> <td>6</td> </tr> <tr> <td>7</td> </tr> </table> <p>We use the <strong>DELETE</strong> statement to delete the rows whose values in the <strong>VEGETABLE_ID COLUMN</strong> are not the <strong>highest</strong> .</p> <pre> DELETE FROM vegetables WHERE VEGETABLE_IDNOTIN ( SELECT MAX(VEGETABLE_ID) FROM vegetables GROUP BY VEGETABLE_NAME, color ); </pre> <p>Three rows have been deleted.</p> <pre> SELECT *FROM vegetables; </pre> <br> <table class="table"> <tr> <td>VEGETABLE_ID</td> <td>VEGETABLE_NAME</td> <td>COLOR</td> </tr> <tr> <td> <strong>02</strong> </td> <td>Potato</td> <td>Brown</td> </tr> <tr> <td> <strong>05</strong> </td> <td>Onion</td> <td>Red</td> </tr> <tr> <td> <strong>06</strong> </td> <td>Pumpkin</td> <td>Green</td> </tr> <tr> <td> <strong>07</strong> </td> <td><pumpkin td> <td>Yellow</td> </pumpkin></td></tr> </table> <p>If we want to keep the row with the lowest id, use the <strong>MIN()</strong> function instead of the <strong>MAX()</strong> function.</p> <pre> DELETE FROM vegetables WHERE VEGETABLE_IDNOTIN ( SELECT MIN(VEGETABLE_ID) FROM vegetables GROUP BY VEGETABLE_NAME, color ); </pre> <p>The above method works if we have a column that is not part of the group for evaluating duplicate. If all values in the columns have copies, then we cannot use the <strong>VEGETABLE_ID</strong> column.</p> <p>Let&apos;s drop and create the <strong>vegetable</strong> table with a new structure.</p> <pre> DROP TABLE vegetables; CREATE TABLE vegetables ( VEGETABLE_ID NUMBER, VEGETABLE_NAME VARCHAR2(100), Color VARCHAR2(20) ); </pre> <br> <pre> INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(1,&apos;Potato&apos;,&apos;Brown&apos;); INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(1, &apos;Potato&apos;,&apos;Brown&apos;); INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color)VALUES(2,&apos;Onion&apos;,&apos;Red&apos;); INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color)VALUES(2,&apos;Onion&apos;,&apos;Red&apos;); INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(2,&apos;Onion&apos;,&apos;Red&apos;); INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(3,&apos;Pumpkin&apos;,&apos;Green&apos;); INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(&apos;4,Pumpkin&apos;,&apos;Yellow&apos;); SELECT * FROM vegetables; </pre> <br> <table class="table"> <tr> <td>VEGETABLE_ID</td> <td>VEGETABLE_NAME</td> <td>COLOR</td> </tr> <tr> <td>01</td> <td>Potato</td> <td>Brown</td> </tr> <tr> <td>01</td> <td>Potato</td> <td>Brown</td> </tr> <tr> <td>02</td> <td>Onion</td> <td>Red</td> </tr> <tr> <td>02</td> <td>Onion</td> <td>Red</td> </tr> <tr> <td>02</td> <td>Onion</td> <td>Red</td> </tr> <tr> <td>03</td> <td>Pumpkin</td> <td>Green</td> </tr> <tr> <td>04</td> <td>Pumpkin</td> <td>Yellow</td> </tr> </table> <p>In the vegetable table, the values in all columns <strong>VEGETABLE_ID, VEGETABLE_NAME</strong> , and color have been copied.</p> <p>We can use the <strong>rowid</strong> , a locator that specifies where Oracle stores the row. Because the <strong>rowid</strong> is unique so that we can use it to remove the duplicates rows.</p> <pre> DELETE FROM Vegetables WHERE rowed NOT IN ( SELECT MIN(rowid) FROM vegetables GROUP BY VEGETABLE_ID, VEGETABLE_NAME, color ); </pre> <p>The query verifies the deletion operation:</p> <pre> SELECT * FROM vegetables; </pre> <br> <table class="table"> <tr> <td>VEGETABLE_ID</td> <td>VEGETABLE_NAME</td> <td>COLOR</td> </tr> <tr> <td>01</td> <td>Potato</td> <td>Brown</td> </tr> <tr> <td>02</td> <td>Onion</td> <td>Red</td> </tr> <tr> <td>03</td> <td>Pumpkin</td> <td>Green</td> </tr> <tr> <td>04</td> <td>Pumpkin</td> <td>Yellow</td> </tr> </table> <hr></t2.id>

שלוש שורות נמחקו. אנו מבצעים את השאילתה, שניתנה להלן כדי למצוא את אימיילים כפולים מהשולחן.

 SELECT email, COUNT (email) FROM contacts GROUP BY email HAVING COUNT (email) &gt; 1;

השאילתה מחזירה את הסט הריק. כדי לאמת את הנתונים מטבלת אנשי הקשר, בצע את שאילתת SQL הבאה:

 SELECT * FROM contacts;

תְעוּדַת זֶהוּת	שם פרטי	שם משפחה	אימייל	גיל
7	בן	בארנס	[מוגן באימייל]	עשרים ואחת
13	בריאן	בָּרוּך	[מוגן באימייל]	18
10	אליזה	בנט	[מוגן באימייל]	23
1	קאווין	פיטרסון	[מוגן באימייל]	22
8	מישה	ברטון	[מוגן באימייל]	עשרים
אחד עשר	מיכאל	ברזים	[מוגן באימייל]	17
4	מיכאל	ג'קסון	[מוגן באימייל]	18
2	ניק	ג'ונאס	[מוגן באימייל]	16
3	פיטר	גן העדן	[מוגן באימייל]	25
5	שון	אפונה	[מוגן באימייל]	עשרים
6	טום	אוֹפֶה	[מוגן באימייל]	30

השורות זיהויים 9, 12 ו-14 נמחקו. אנו משתמשים במשפט שלהלן כדי למחוק את השורות הכפולות:

בצע את הסקריפט עבור יצירה איש הקשר.

 DELETE c1 FROM contacts c1 INNERJ OIN contacts c2 WHERE c1.id &gt; c2.id AND c1.email = c2.email;

תְעוּדַת זֶהוּת	שם פרטי	שם משפחה	אימייל	גיל
1	בן	בארנס	[מוגן באימייל]	עשרים ואחת
2	קאווין	פיטרסון	[מוגן באימייל]	22
3	בריאן	בָּרוּך	[מוגן באימייל]	18
4	ניק	ג'ונאס	[מוגן באימייל]	16
5	מיכאל	ברזים	[מוגן באימייל]	17
6	אליזה	בנט	[מוגן באימייל]	23
7	מיכאל	ג'קסון	[מוגן באימייל]	18
8	שון	אפונה	[מוגן באימייל]	עשרים
9	מישה	ברטון	[מוגן באימייל]	עשרים
10	פיטר	גן העדן	[מוגן באימייל]	25
אחד עשר	טום	אוֹפֶה	[מוגן באימייל]	30

(ב) מחק שורות כפולות באמצעות טבלת ביניים

כדי למחוק שורה כפולה באמצעות טבלת הביניים, בצע את השלבים המפורטים להלן:

שלב 1 . צור טבלה חדשה מִבְנֶה , זהה לטבלה האמיתית:

 CREATE TABLE source_copy LIKE source;

שלב 2 . הכנס את השורות הנבדלות מלוח הזמנים המקורי של מסד הנתונים:

js base64 פענוח

 INSERT INTO source_copy SELECT * FROM source GROUP BY col;

שלב 3 . שחרר את הטבלה המקורית ושנה את שם הטבלה המיידית לטבלה המקורית.

 DROP TABLE source; ALTER TABLE source_copy RENAME TO source;

לדוגמה, ההצהרות הבאות מוחקות את שורות עם לְשַׁכְפֵּל מיילים מטבלת אנשי הקשר:

 -- step 1 CREATE TABLE contacts_temp LIKE contacts; -- step 2 INSERT INTO contacts_temp SELECT * FROM contacts GROUP BY email; -- step 3 DROP TABLE contacts; ALTER TABLE contacts_temp RENAME TO contacts;

(ג) מחק שורות כפולות באמצעות הפונקציה ROW_NUMBER()

הערה: הפונקציה ROW_NUMBER() נתמכת מאז גירסת MySQL 8.02, לכן עלינו לבדוק את גירסת ה-MySQL שלנו לפני השימוש בפונקציה.

ההצהרה הבאה משתמשת ב- ROW_NUMBER () כדי להקצות מספר שלם רציף לכל שורה. אם האימייל משוכפל, השורה תהיה גבוהה מאחת.

 SELECT id, email, ROW_NUMBER() OVER (PARTITION BY email ORDER BY email ) AS row_num FROM contacts;

שאילתת SQL הבאה מחזירה רשימת תעודות זהות מבין השורות הכפולות:

 SELECT id FROM (SELECT id, ROW_NUMBER() OVER ( PARTITION BY email ORDER BY email) AS row_num FROM contacts ) t WHERE row_num&gt; 1;

תְפוּקָה:

תְעוּדַת זֶהוּת

מחק רשומות כפולות באורקל

כשמצאנו את הרשומות הכפולות בטבלה, נאלצנו למחוק את העותקים הלא רצויים כדי לשמור על הנתונים שלנו נקיים וייחודיים. אם לטבלה יש שורות כפולות, נוכל למחוק אותה באמצעות ה- לִמְחוֹק הַצהָרָה.

במקרה, יש לנו עמודה, שאינה חלק ממנה קְבוּצָה היה להעריך ה לְשַׁכְפֵּל רשומות בטבלה.

גיל פיט דיווידסון

שקול את הטבלה המופיעה להלן:

VEGETABLE_ID	VEGETABLE_NAME	צֶבַע
01	תפוח אדמה	חום
02	תפוח אדמה	חום
03	בצל	אָדוֹם
04	בצל	אָדוֹם
05	בצל	אָדוֹם
06	דלעת	ירוק
07	דלעת	צהוב

 -- create the vegetable table CREATE TABLE vegetables ( VEGETABLE_ID NUMBER generated BY DEFAULT AS ID ENTITY, VEGETABLE_NAME VARCHAR2(100), color VARCHAR2(20), PRIMARY KEY (VEGETABLE_ID) );

 -- insert sample rows INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES(&apos;Potato&apos;,&apos;Brown&apos;); INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES(&apos;Potato&apos;,&apos;Brown&apos;); INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES(&apos;Onion&apos;,&apos;Red&apos;); INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES(&apos;Onion&apos;,&apos;Red&apos;); INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES(&apos;Onion&apos;,&apos;Red&apos;); INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES(&apos;Pumpkin&apos;,&apos;Green&apos;); INSERT INTO vegetables (VEGETABLE_NAME,color) VALUES(&apos;Pumpkin&apos;,&apos;Yellow&apos;);

 -- query data from the vegetable table SELECT * FROM vegetables;

נניח שאנו רוצים לשמור על השורה עם הגבוהה ביותר VEGETABLE_ID ולמחוק את כל שאר העותקים.

 SELECT MAX (VEGETABLE_ID) FROM vegetables GROUP BY VEGETABLE_NAME, color ORDER BY MAX(VEGETABLE_ID);

MAX(VEGETABLE_ID)

אנו משתמשים ב- לִמְחוֹק הצהרה למחיקת השורות שהערכים שלהן ב- עמודה VEGETABLE_ID אינם ה הֲכִי גָבוֹהַ .

charat מחרוזת java

 DELETE FROM vegetables WHERE VEGETABLE_IDNOTIN ( SELECT MAX(VEGETABLE_ID) FROM vegetables GROUP BY VEGETABLE_NAME, color );

שלוש שורות נמחקו.

 SELECT *FROM vegetables;

VEGETABLE_ID	VEGETABLE_NAME	צֶבַע
02	תפוח אדמה	חום
05	בצל	אָדוֹם
06	דלעת	ירוק
07		צהוב

אם אנחנו רוצים לשמור את השורה עם המזהה הנמוך ביותר, השתמש ב- MIN() פונקציה במקום ה MAX() פוּנקצִיָה.

 DELETE FROM vegetables WHERE VEGETABLE_IDNOTIN ( SELECT MIN(VEGETABLE_ID) FROM vegetables GROUP BY VEGETABLE_NAME, color );

השיטה לעיל עובדת אם יש לנו עמודה שאינה חלק מהקבוצה להערכת כפילויות. אם לכל הערכים בעמודות יש עותקים, אז לא נוכל להשתמש ב- VEGETABLE_ID טור.

בואו נשאיר וניצור את ירקות שולחן עם מבנה חדש.

 DROP TABLE vegetables; CREATE TABLE vegetables ( VEGETABLE_ID NUMBER, VEGETABLE_NAME VARCHAR2(100), Color VARCHAR2(20) );

 INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(1,&apos;Potato&apos;,&apos;Brown&apos;); INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(1, &apos;Potato&apos;,&apos;Brown&apos;); INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color)VALUES(2,&apos;Onion&apos;,&apos;Red&apos;); INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color)VALUES(2,&apos;Onion&apos;,&apos;Red&apos;); INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(2,&apos;Onion&apos;,&apos;Red&apos;); INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(3,&apos;Pumpkin&apos;,&apos;Green&apos;); INSERT INTO vegetables (VEGETABLE_ID,VEGETABLE_NAME,color) VALUES(&apos;4,Pumpkin&apos;,&apos;Yellow&apos;); SELECT * FROM vegetables;

VEGETABLE_ID	VEGETABLE_NAME	צֶבַע
01	תפוח אדמה	חום
01	תפוח אדמה	חום
02	בצל	אָדוֹם
02	בצל	אָדוֹם
02	בצל	אָדוֹם
03	דלעת	ירוק
04	דלעת	צהוב

בטבלת הירקות, הערכים בכל העמודות VEGETABLE_ID, VEGETABLE_NAME , והצבע הועתקו.

אנחנו יכולים להשתמש ב rowid , איתור המציין היכן אורקל מאחסנת את השורה. בגלל ה rowid הוא ייחודי כך שנוכל להשתמש בו כדי להסיר את השורות הכפולות.

 DELETE FROM Vegetables WHERE rowed NOT IN ( SELECT MIN(rowid) FROM vegetables GROUP BY VEGETABLE_ID, VEGETABLE_NAME, color );

השאילתה מאמתת את פעולת המחיקה:

 SELECT * FROM vegetables;

VEGETABLE_ID	VEGETABLE_NAME	צֶבַע
01	תפוח אדמה	חום
02	בצל	אָדוֹם
03	דלעת	ירוק
04	דלעת	צהוב

TechCodeview

כיצד למחוק שורות כפולות ב- SQL?