On 3 Aug 2005 06:42:46 -0700,
si**********@kingshc.nhs.uk wrote:
Before I go into specifics, this is my problem. I've have a table
that gets updated with large amounts of data on a monthly basis.
Sometimes (rarely) identical rows of data are on one months import that
already exist from the previous month. I can identify these rows from a
combination of two fields (sampleID and testname).
My question is this. Would it be an appropriate 'fix' if I created a
new 'formula' field on the table comprising of a concatentation of
these two fields and then made that an index field (no duplicates)? My
guess would be that if we then tried to import a record with a sampleID
and testname that already existed, then the import for that record
would simply fail.
Would this work? Is there a better way? My background is more with
Access so apologies if I'm not using the right terminology.
Hi Simon,
Yes, this might work - though (as Simon Hayes already pointed out) it
depends on the used method to import the data if it would refuse only
the duplicated row(s) or roll back the entire operation.
However, there is no need to add this 'formula' column. You can just as
easy define a PRIMARY KEY or UNIQUE constraint on a combination of two
(or more) columns. So in your case, the syntax might be something like:
ALTER TABLE MyTable
ADD CONSTRAINT NoDuplicates UNIQUE (sampleID, testname)
I would definitely add this constraint, if I were yoou. But I would not
rely on the database's error detection and handlig to prevent the
duplicates. I think of constraints as a safety belt - a good driver
makes sure it's never needed; the belt is only there because even the
best driver sometimes makes a mistake. But you don't go crashing into a
brick wall on purpose, because the safety belt will keep you safe.
So in short:
- Incorporate Simon Hayes' suggestion into your code to prevent
insertion of duplicates;
- Add a UNIQUE or PRIMARY KLEY constraint on (sampleID, testname) to
make sure that duplicates will enver be entered by any means;
- Don't add an extra column to store the concatenation of sampleID +
testname, since you don't need it for this purpose.
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)