Question:

Why does my code insert duplicate entries into my MySQL table?

Grrr... I've got my program (almost) working correctly, but I don't know why when my code imports from CSV it adds to the table instead of replacing the old information ... I get lots of duplicate entries. Help?

Thanks in advance, and here's the code:

<?php
include "connect.php";

if(isset($_POST['submit']))
{
    $filename=$_POST['filename'];
    $handle = fopen("$filename", "r");
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE)
    {
        $import="REPLACE into info(alpha,beta,c,delta,alex) values('$data[0]','$data[1]','$data[2]',...
        mysql_query($import) or die(mysql_error());
    }
    fclose($handle);
    print "Import done";
}
else
{
    print "<form action='import.php' method='post'>";
    print "Type file name to import:<br>";
    print "<input type='text' name='filename' size='20'><br>";
    print "<input type='submit' name='submit' value='submit'></form>";
}
?>

2 ANSWERS


  1. I will answer your question first, but please see the serious note below about the SQL injection vulnerability in your code!

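    REPLACE does not guarantee that the row is unique.  It means "INSERT, unless the new row conflicts with an existing row on a PRIMARY KEY or UNIQUE index, in which case delete the existing row and insert this one in its place."  For REPLACE to do anything other than behave like a plain INSERT, the table must have a primary key or a unique index.  Without one, there is nothing for the new row to conflict with, so every import simply appends rows.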

    The purpose of REPLACE is to resolve errors arising from duplicate entries in indices that must contain only unique values, not to avoid duplication.
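
    To make the difference concrete, here is a minimal sketch of REPLACE with and without a unique index. It is only an illustration under assumptions: it uses PDO with an in-memory SQLite database so that it runs without a server (SQLite accepts REPLACE INTO with the same conflict-on-unique-index behaviour), it treats alpha as the column that should be unique, and countAfterTwoReplaces() is a hypothetical helper name. On a real MySQL table the index would be added with something like ALTER TABLE info ADD UNIQUE INDEX info_alpha_unique (alpha).

    ```php
    <?php
    // Assumption: in-memory SQLite stands in for MySQL so the sketch is runnable.
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    // Hypothetical helper: rebuild the table, REPLACE the same key twice,
    // and report how many rows the table ends up with.
    function countAfterTwoReplaces(PDO $pdo, bool $withUniqueIndex): int
    {
        $pdo->exec('DROP TABLE IF EXISTS info');
        $pdo->exec('CREATE TABLE info (alpha TEXT, beta TEXT)');
        if ($withUniqueIndex) {
            // Assumption: alpha is the column that identifies a row.
            $pdo->exec('CREATE UNIQUE INDEX info_alpha_unique ON info (alpha)');
        }

        $stmt = $pdo->prepare('REPLACE INTO info (alpha, beta) VALUES (?, ?)');
        $stmt->execute(['a1', 'old']);
        $stmt->execute(['a1', 'new']); // same alpha, different beta

        return (int) $pdo->query('SELECT COUNT(*) FROM info')->fetchColumn();
    }

    $withoutIndex = countAfterTwoReplaces($pdo, false); // both rows are kept
    $withIndex    = countAfterTwoReplaces($pdo, true);  // second row replaces the first
    ```

    Without the index the second REPLACE has nothing to conflict with, which matches the duplicate rows described in the question.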

    If you are concerned about duplication in a specific column and want it to contain only unique values, a unique index combined with REPLACE could be an appropriate solution.  If instead you want to ensure that no two rows are exactly identical (zero duplication), here's one way: concatenate and hash each row's values, keep the hashes in an array, and use in_array() to check for duplicates before inserting.

    Here's an example:

    <?php
    // pass the hash array by reference so we can insert
    // a new hash if we need to
    function isDuplicate(&$hashArray, $rowArray)
    {
        // join all the values into one string; the separator keeps
        // rows like ('ab', 'c') and ('a', 'bc') from colliding
        $row = implode("\x1F", $rowArray);

        // convert the row to a hash
        // this reduces the memory used to track the rows
        $rowHash = sha1($row);

        // check if we already have this hash in the array
        $found = in_array($rowHash, $hashArray);

        // the hash wasn't found, so it's new and unique
        // put it in the array
        if (!$found) {
            $hashArray[] = $rowHash;
        }

        // return true if the hash is a duplicate
        // return false if the hash is unique
        return $found;
    }
    ?>
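
    For large files, note that in_array() scans the whole hash list on every call. A possible variant, sketched here as an assumption rather than as part of the answer above, stores each hash as an array key so that the lookup is a single isset(). The function name uniqueRows() and the "\x1F" separator are illustrative choices.

    ```php
    <?php
    // Hypothetical variant: filter a list of CSV rows down to the unique ones.
    function uniqueRows(iterable $rows): array
    {
        $seen = [];    // hash => true, used only for fast lookups
        $unique = [];  // rows in first-seen order

        foreach ($rows as $row) {
            // The separator keeps ('ab', '') and ('a', 'b') from hashing alike.
            $hash = sha1(implode("\x1F", $row));
            if (isset($seen[$hash])) {
                continue; // exact duplicate of an earlier row
            }
            $seen[$hash] = true;
            $unique[] = $row;
        }

        return $unique;
    }

    $rows = [['a', 'b'], ['a', 'b'], ['ab', '']];
    $result = uniqueRows($rows);
    ```

    This only removes duplicates within a single import; rows already in the table from an earlier run still need a unique index or a cleared table.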

    SQL injection vulnerability: NEVER, NEVER, NEVER put data directly into a query without first protecting yourself from attack.  If any field in the CSV contained an SQL injection payload, your entire database could be destroyed or compromised.  Not to mention that your insert will fail if the data contains an unescaped special character, such as a single quote.

    Use mysql_real_escape_string() to escape each value before putting it into the query.  Better still, consider using prepared statements (mysqli or PDO), which make managing this very simple.
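
    As a rough sketch of the prepared-statement route, the import loop could look something like the following. The table and column names come from the question; everything else is an assumption for illustration: importRows() is a hypothetical helper, and the connection uses in-memory SQLite so the example runs without a server (with MySQL the DSN would be of the form 'mysql:host=...;dbname=...'). Note also that the old mysql_* functions were removed in PHP 7, so new code needs mysqli or PDO in any case.

    ```php
    <?php
    // Assumption: in-memory SQLite stands in for the real MySQL connection.
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE info (alpha TEXT, beta TEXT, c TEXT, delta TEXT, alex TEXT)');

    // Hypothetical helper: insert each 5-field row through one prepared statement.
    function importRows(PDO $pdo, iterable $rows): int
    {
        $stmt = $pdo->prepare(
            'REPLACE INTO info (alpha, beta, c, delta, alex) VALUES (?, ?, ?, ?, ?)'
        );

        $count = 0;
        foreach ($rows as $row) {
            if (count($row) !== 5) {
                continue; // skip malformed lines instead of building a broken query
            }
            // Values are bound as data, never spliced into the SQL text.
            $stmt->execute(array_values($row));
            $count++;
        }

        return $count;
    }

    $rows = [
        ["O'Brien", 'b', 'c', 'd', 'e'],                  // quote that would break a naive query
        ["x'); DROP TABLE info; --", 'b', 'c', 'd', 'e'], // injection attempt, stored as plain text
    ];
    $imported = importRows($pdo, $rows);
    $stored = $pdo->query('SELECT alpha FROM info ORDER BY rowid')->fetchAll(PDO::FETCH_COLUMN);
    ```

    In a real import the rows would come from fgetcsv() in a loop, as in the question, rather than from a hard-coded array.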


  2. I would simply like to re-emphasize what Fergus has said.

    You have a deadly security hole!  Never ever ever ever EVER do ANYTHING with input without filtering it first!

    Keep in mind that even if you had JavaScript validation in place, or the data was coming from a select box, you would still need to watch out: somebody could very easily send a request to your server from a page other than the one you intended the data to come from.

    This also goes for your file access.  What if somebody chose to open sensitive files such as connect.php or .htaccess?
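
    One possible way to narrow the file access, sketched under assumptions: only accept a bare file name, require a .csv extension, and only open files that exist inside a dedicated import directory. The function name resolveImportPath() and the idea of a separate import directory are hypothetical, not something from the original code.

    ```php
    <?php
    // Hypothetical helper: map user input to a CSV file inside $baseDir, or null.
    function resolveImportPath(string $baseDir, string $userInput): ?string
    {
        // Drop any directory components such as "../" from the input.
        $name = basename($userInput);

        // Only allow CSV files; this rejects connect.php and .htaccess.
        if (strtolower(pathinfo($name, PATHINFO_EXTENSION)) !== 'csv') {
            return null;
        }

        $path = rtrim($baseDir, '/\\') . DIRECTORY_SEPARATOR . $name;

        // Refuse anything that is not an existing regular file.
        return is_file($path) ? $path : null;
    }
    ```

    The import script would then call fopen() only when this returns a path, and show an error otherwise.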

    Good luck with your web development.
