Mysql Auto Increment - still cant get it

Name

Jason
Jason
karl
karl
karl
greg


could someone give me help with the query for finding this
 

Name         AutoInc
Jason         1
Jason         2
Jason         3
karl             1
karl             2
greg           1

Thanks

My code so far is as follows,
I need to concat surname+firstname as 'CAT'

And then give a count to all the duplicate 'CAT' feilds (so i can eventually only choose the one with the most info attatched to it.)

Then i need to select only the 'CAT' with the HIGHEST YTDfigures.

I have posted this question 4 times, but ultimately, i'm not getting anywhere!

THe rest of the code is just trying to format the database and clean it up.
Thanks for any help


SELECT       storename      as   'BASE STORE',

             concat(substring(typename,1),' (',
                    substring(typecode,1),')')

                            as   'CUSTOMER TYPE',
             custcode       as   'CUSTOMER CODE',

             concat(substring(surname,1),'',
                    substring(firstname,1))  as 'CAT',

             @inc:=@inc +1 as count,


             surname        as   'SURNAME',
             firstname      as   'FIRSTNAME',

             concat(substring(address1,1),'    ',
                    substring(address2,1),'    ',
                    substring(address3,1),'    ',
                    substring(postcode,1,4))

                            as   'ADDRESS',
#       address1    as   'Address',
#       address2    as   'Suburb',
#       address3    as   'State',
#       postcode    as   'Post Code'
       Phone1,
       Phone2,
       Phone3,
       Emailaddress,

      mtdpurchases,
      ytdpurchases,


         tempF1,
         tempF2,
         tempF3
#        filesearch1

        FROM customers, (select @inc:=0) Count

        Group by cat,count desc;

ALTER TABLE customers ADD grp ENUM('catGrp') not null;





#1) UPDATE PHONE FEILDS;
UPDATE customers SET phone1 = replace(replace(replace(phone1, ' ',''),'-',''),'.','');
UPDATE customers SET phone2 = replace(replace(replace(phone2, ' ',''),'-',''),'.','');
UPDATE customers SET phone3 = replace(replace(replace(phone3, ' ',''),'-',''),'.','');

UPDATE customers SET phone1 ='' where character_length(phone1)<7;
UPDATE customers SET phone2 ='' where character_length(phone2)<7;
UPDATE customers SET phone3 ='' where character_length(phone3)<7;

UPDATE customers SET phone1 ='' where phone1 = 'DO NOT CONTACT';
UPDATE customers SET phone2 ='' where phone2 = 'DO NOT CONTACT';
UPDATE customers SET phone3 ='' where phone3 = 'DO NOT CONTACT';

UPDATE customers SET phone1 = ''
where
phone1 not like '%0%' and
phone1 not like '%1%' and
phone1 not like '%2%' and
phone1 not like '%3%' and
phone1 not like '%4%' and
phone1 not like '%5%' and
phone1 not like '%6%' and
phone1 not like '%7%' and
phone1 not like '%8%' and
phone1 not like '%9%';


#1A) CREATE/DROP Tempphone feilds;
                                 ALTER TABLE customers DROP filesearch1;
                                 ALTER TABLE customers DROP tempF1;
                                 AlTER TABLE customers DROP tempF2;
                                 AlTER TABLE customers DROP tempF3;

                                 ALTER TABLE customers ADD filesearch1 text not null;
                                 ALTER TABLE customers ADD tempF1 text(30);
                                 ALTER TABLE customers ADD tempF2 text(30);
                                 ALTER TABLE customers ADD tempF3 text(30);

#1B) COPYAWAY;
                                 UPDATE customers SET tempF1 = phone1;
                                 UPDATE customers SET tempF2 = phone2;
                                 UPDATE customers SET tempF3 = phone3;
#1C) CLEAR DOWN;
                                 UPDATE customers SET Phone1 ='';
                                 UPDATE customers SET phone2 ='';
                                 UPDATE customers SET phone3 ='';


#1D) INSERT INTO CORRECT FEILDS;
UPDATE customers SET phone1 = tempF1 where tempF1 LIKE '04%';
UPDATE customers SET phone1 = tempF2 where tempF2 LIKE '04%' and tempF2 <> phone1;
UPDATE customers SET phone1 = tempF3 where tempF3 LIKE '04%' and tempF3<>phone1;

#1E) INSERT FROM ALL TEMP fEILDS into COLUMN 3 where = MOBILE
UPDATE customers SET phone2 = tempF1 where tempF1 NOT like '04%';
UPDATE customers SET phone2 = tempF2 where tempF2 NOT like '04%';
UPDATE customers SET phone2 = tempF3 where tempF3 NOT like '04%';



# Insert any email address into the emailaddress feild from the tempPhone Feilds
UPDATE customers SET emailaddress = tempF1 where tempF1 like '%@%';
UPDATE customers SET emailaddress = tempF2 where tempF2 like '%@%';
UPDATE customers SET emailaddress = tempF3 where tempF3 like '%@%';



#1F) DROP all TEMP Phone Feilds(remember to remove from select)

            ALTER TABLE customers DROP filesearch1;
                 ALTER TABLE customers DROP tempF1;
                         AlTER TABLE customers DROP tempF2;
                             AlTER TABLE customers DROP tempF3;


#1G) If phone1 is blank make Phone2, phone 1;
UPDATE customers SET phone1 = phone2 where phone1 ='';

#1H) if phone1 is now doubled up, remove from phone2;
UPDATE customers SET phone2='' where phone1=phone2;


#2)EMAIL ADDRESSES) Update the EMAILADDRESS FEILD FROM any address FEILD;
UPDATE customers SET emailaddress = address1 where address1 like '%@%';
UPDATE customers SET emailaddress = address2 where address2 like '%@%';
UPDATE customers SET emailaddress = address3 where address3 like '%@%';

#2A) clear addresses where they dont contain email addresses
UPDATE customers SET address1='' where address1 like '%@%';
UPDATE customers SET address2='' where address2 like '%@%';
UPDATE customers SET address3='' where address3 like '%@%';

#2A) clear phones where they dont contain email addresses
UPDATE customers SET phone1='' where phone1 like '%@%';
UPDATE customers SET phone2='' where phone2 like '%@%';
UPDATE customers SET phone3='' where phone3 like '%@%';

UPDATE customers SET address1 = ''
where
address1 not like '%0%' and
address1 not like '%1%' and
address1 not like '%2%' and
address1 not like '%3%' and
address1 not like '%4%' and
address1 not like '%5%' and
address1 not like '%6%' and
address1 not like '%7%' and
address1 not like '%8%' and
address1 not like '%9%' and
address1 not like '%CHEQUES%' and
address1 not like '%LAYBYS%' and
address1 not like 'CO' and
address1 not like 'C/O';


UPDATE customers SET address1= replace(address1,'NO TIME','');
UPDATE customers SET address2= replace(address2,'NO TIME','');
UPDATE customers SET address3= replace(address2,'NO TIME','');
UPDATE customers SET address1= replace(address1,'DID NOT HAVE TIME','');
UPDATE customers SET address1= replace(address1,'DID NOT WANT TO LEAVE ADDRESS','');
UPDATE customers SET address1= replace(address1,'DID NOT WANT TO LEAVE DETAILS','');
UPDATE customers SET address1= replace(address1,'NO TIME FOR DETAILS','');
UPDATE customers SET address1= replace(address1,'ADDRESS UNKNOWN','');
UPDATE customers SET address1= replace(address1,'NO ADDRESS GIVEN','');
UPDATE customers SET address1= replace(address1,'DID NOT WANT TO LEAVE ADD','');
UPDATE customers SET address1= replace(address1,'NO ADDRESS AT PRESENT','');
UPDATE customers SET address1= replace(address1,'LIVES OVER SEAS','');
UPDATE customers SET address1= replace(address1,'DETAILS TO BE ENTERED','');
UPDATE customers SET address1= replace(address1,'TO BE ENTERED','');
UPDATE customers SET address1= replace(address1,'OVERSEAS','');

DELETE FROM customers where address1 like '%DEAD%';
DELETE FROM customers where address2 like '%DEAD%';
DELETE FROM customers where address3 like '%DEAD%';
DELETE FROM customers where address1 like '%DECEASED%';
DELETE FROM customers where address2 like '%DECEASED%';
DELETE FROM customers where address3 like '%DECEASED%';
DELETE FROM customers where address1 like 'DO NOT USE THIS CODE';
DELETE FROM customers where address1 like 'DETAILS ALREADY EXIST';
DELETE FROM customers where address1 like 'TO BE DELETED';
DELETE FROM customers where address1 like 'DETAILS ALREADY ENTERED';
DELETE FROM customers where address1 like 'DETAILS ARE UNDER';
DELETE FROM customers where address1 like 'TEST';
DELETE FROM customers where address1 like 'WITCHERY';
DELETE FROM customers where address1 like 'WITCHERY EMPLOYEE';
DELETE FROM customers where address1 like 'WITCHERY STAFF';

DELETE FROM customers where address2 like 'NO FURTHER PURCHASES';

UPDATE customers SET address1='' where character_length(address1)<2;
UPDATE customers SET address2='' where character_length(address2)<2;
UPDATE customers SET address3='' where character_length(address3)<2;

UPDATE customers SET address1='' where address1='DO NOT CONTACT';
UPDATE customers SET address2='' where address2='DO NOT CONTACT';
UPDATE customers SET address3='' where address3='DO NOT CONTACT';

#3) POSTCODE;
UPDATE customers SET postcode='' where character_length(postcode)<4;


#4) TITLES from first and last name;
UPDATE customers SET custaccname = replace(custaccname, 'MR','');
UPDATE customers SET custaccname = replace(custaccname, 'MRS','');
UPDATE customers SET custaccname = replace(custaccname, 'MRS.','');
UPDATE customers SET custaccname = replace(custaccname, 'MS','');
UPDATE customers SET custaccname = replace(custaccname, 'M S','');
UPDATE customers SET custaccname = replace(custaccname, 'MISS','');
UPDATE customers SET custaccname = replace(custaccname, 'MIZZ','');
UPDATE customers SET custaccname = replace(custaccname, 'MIS','');
UPDATE customers SET custaccname = replace(custaccname, 'DR','');
UPDATE customers SET custaccname = replace(custaccname, 'SIR','');
UPDATE customers SET custaccname = replace(custaccname, 'NA','');

UPDATE customers SET custaccname = replace(firstname, 'MR','');
UPDATE customers SET custaccname = replace(firstname, 'MRS','');
UPDATE customers SET custaccname = replace(firstname, 'MRS.','');
UPDATE customers SET custaccname = replace(firstname, 'MS','');
UPDATE customers SET custaccname = replace(firstname, 'M S','');
UPDATE customers SET custaccname = replace(firstname, 'MISS','');
UPDATE customers SET custaccname = replace(firstname, 'MIZZ','');
UPDATE customers SET custaccname = replace(firstname, 'MIS','');
UPDATE customers SET custaccname = replace(firstname, 'DR','');
UPDATE customers SET custaccname = replace(firstname, 'SIR','');
UPDATE customers SET custaccname = replace(firstname, 'NA','');

UPDATE customers SET custaccname = replace(surname, 'MR','');
UPDATE customers SET custaccname = replace(surname, 'MRS','');
UPDATE customers SET custaccname = replace(surname, 'MRS.','');
UPDATE customers SET custaccname = replace(surname, 'MS','');
UPDATE customers SET custaccname = replace(surname, 'M S','');
UPDATE customers SET custaccname = replace(surname, 'MISS','');
UPDATE customers SET custaccname = replace(surname, 'MIZZ','');
UPDATE customers SET custaccname = replace(surname, 'MIS','');
UPDATE customers SET custaccname = replace(surname, 'DR','');
UPDATE customers SET custaccname = replace(surname, 'SIR','');
UPDATE customers SET custaccname = replace(surname, 'NA','');

Open in new window

LVL 8
kingjelyAsked:
Who is Participating?
 
Dave BaldwinFixer of ProblemsCommented:
No, you would never get it done by hand.  I would use that to write a program(s) that uses that code in several variations to find duplicates.  You might also use several different sorting orders (ORDER BY) on different fields also.  I would start with a small sample like only last names that start with A to see what I was going to be dealing with.  When you find duplicates but with different customer codes, you have to decide what to do.  Do you merge the data?  Which customer code do you keep?
Google for "MySQL remove duplicates" will return quite a few links.  Most deal with the situation of exact duplicate rows which isn't your situation but they'll give you a lot to work with.
0
 
dqmqCommented:
>I have posted this question 4 times, but ultimately, i'm not getting anywhere!
With all due respect, your question is incomprehensible.

Take a careful look at the first part.  "Jason" occurs twice in the table, but you want it to show up three times in the query for no explicble reason.

And it gets worse, after that. I'm sorry to be so blunt, but if you expect constructive help you need to state your problem more clearly.



0
 
kingjelyAuthor Commented:

Sorry I missed a word, this is my first problem.

Name

Jason  <======================
Jason
Jason
karl
karl
karl
greg


I need
 

Name         AutoInc
Jason         1
Jason         2
Jason         3
karl             1
karl             2
greg           1


Okay i will try a 6th time, Its frustratingly obvisouly, it's not that easy to explain is it.
Thanks for being so blunt.
0
Improve Your Query Performance Tuning

In this FREE six-day email course, you'll learn from Janis Griffin, Database Performance Evangelist. She'll teach 12 steps that you can use to optimize your queries as much as possible and see measurable results in your work. Get started today!

 
dqmqCommented:
Now we have 3 Karl's to start and end up with only 2. :>).   But I think I get your point.  

Does your table have a primary key?

You say that you want to select the row with the highest count
You also say you want to selec the row with the  highest YTD figures
Which is it?  
0
 
arnoldCommented:
post:
show create table <tablename>

What you are pointing out has no meaning because you have not provided a context to your question.

What is the data that you have? What is the structure of the database/tables that you have.
What is it you want to get out of the data that you have?

Does your show create table <tablename>
Include an autoincrement column?

If I read your code, it seems that you want to use an @inc variable that will increment, to what end?

Are you trying to create a stored procedure to output this data?
0
 
kingjelyAuthor Commented:

From the start.

1) I have a 85,000 record customer database.

2) I need to find all the duplicate customer codes that have been entered over 14 stores.

3) I have concat(firstname+surname) to give me unique customers.
 (even tho there is a custcode, which does this, it chooses concat(firstname,1,4 + (surname,1,4)
BUT as there are so many occurances, of where the (first4 + the second 4) is replicated, the system adds a 1, 2 3 on the end.

eg
JASOBARB
JASOBARB1
JASOBARB2
JASOBARB3

Yet, JASOBARB may be

firstname   surname
JASON     BARBIE
JASON     BARKER
JASON     BARBER
JASON     BARBIE

If that makes sence so far.
in summary, the way the system creates custcode(their customer code) it replicates if it already exists;

Or if the same person is entered 2 times or more, at 1 or more shops.

As i have mentioned, I have concat firstname+surname.

NOW, i want to GROUP the NEWLY CONCATINATED FIELD.

eg

concatname       grpid
JASON                  1
JASON                  2
JASON                  3
KARL                    1
KARL                    2
GREG                    1

From here, i then want to select the concatname feild in the customer database, which has the
HIGHEST YTD(year to date figures) as to Keep that customercode, in that grpId

Then the out put would only show concatname, with the highest YTD figures.

I posted all my code, as I am struggling with asking my question as it is not EASY, and although im not Stupid, i though i would post it all because there are many smarter people in the world, Then myself that may beable to help and see what im trying to achive.

Thanks for any positive feedback/help
0
 
kingjelyAuthor Commented:

Here are a few more things i have to include, (that is the rest of my code, that i sent)

The fields are;

·        Customer Type. (i.e. VIP, retail or Staff)
·         Customer Code.
·         Surname.
·         First Name.
·         Address. (Street number or P.O. Box, Street, Suburb, State, Postcode.
·         Phone 1.
·         Phone 2. (Mobile)
·         Email

 
Things to do;
1)  Remove ANY EXISTING titles(MR,MISS,MRS ect) from the data.
2)  phone1 or phone2 or phone 3 (remove, /."\ " ")
3)  IF address1 or address2 or address3 or phone1 or phone2 or phone3 have "DO NOT CONTACT" replace address1, addess2, address3 with 'blank'

4)  convert to 'blank' IF  address1, address2, adderss3, phone1, phone2, phone3  Does NOT Contain "CHEQUES", "LAYBYS" 0r any number.

5)  convert to 'blank' if address1 address2 address3 contains 'deceased'

6) compare custcodes to see which has most data.

7) combine records.

0
 
kingjelyAuthor Commented:

My whole issue is number

6) above.

They have come back and said, they want the customercode, not only with the most columns filled in, but with the highest YTD figues.

i hope i have now explained it so it makes some sense.

Thankyou

0
 
arnoldCommented:
now the picture is getting clearer of what you want to do which is having a set of 85,000 records you would like to transform the data into a new db/table structure.

The transition table the new autoincrement column will increment for every new record that is added.
0
 
kingjelyAuthor Commented:
Hi yea, I didn't want to go into what i was actually doing, because no one really wants to deal with a question 5 pages long, so i was just trying to give examples of where i was stuck.(unsuccessfully!)

Yes, I am modifying a customer database basically to consolidate it. Also to remove unnecessary data, and then format the data and get it into the correct feilds.

I am stuck at the point where, the client wants duplicate customer code removed.

First i need to find unique customers,
Then make a judgement which customer entry has the 'Most data entered' (which i have not worked out yet how to do)
Then, and this is what i need help with, grouping the unique customercodes, and then out of the group find each record containing the highest YTD values.

eg

concatname       grpid     YTD
JASON                  1        50.00
JASON                  2        100.00
JASON                  3        150.00
KARL                    1         0.00
KARL                    2         75.00
GREG                    1         0.00

to return

JASON                  3        150.00
KARL                    2         75.00
GREG                    1         0.00
0
 
Dave BaldwinFixer of ProblemsCommented:
The first thing I would do is make a backup copy of the database.  Auto-increment has a specific meaning in MySQL but I don't think it's what you mean.  That may be a major point of confusion.  

For finding duplicate customers (not ids), you can use their whole name and address and get better results than concatenating names.  Even full names can generate duplicates that way.  There are about 300 people in California with my first and last name and 1000's in the US.  But only 1 at my address.  A long time ago, I had to write a program (in Basic!) to eliminate dupes from a mailing list.  Exact dupes were pretty easy.  Inexact ones took a lot more effort.  I think I ended up going column by column finding both duplicate names and then addresses.  Hand editting was the last step.  But I didn't have 85,000 entries either.
0
 
Dave BaldwinFixer of ProblemsCommented:
You were typing while I was.  grpid and YTD represent the same info but YTD is the actual data.
0
 
kingjelyAuthor Commented:
Hi Dave, Yea you understand what I'm doing, that's cool !!

Concat firstname+surname+address is a good idea, there are just so many variables, I'm just gona have to do my best.

One of the main things also, Is that the customer has been entered 2 or more times, with the spelling of the name and/or address wrong by a few letters ect, so many variables with out going through like you said manually.


Thanks for your help.
0
 
flytox06Commented:
kingjely,

there's no better unique identifier than email, if not NULL use it, if NULL, use other fields.

For your problem of selecting max values, I suggest you to read carefully what follows :

http://jan.kneschke.de/projects/mysql/groupwise-max/

This is a quite complete list of how to retrieve rows with max values.

Hope that helps. If not, please paste an other example of what you got and what you want as a result.

Good luck.
0
 
kingjelyAuthor Commented:
Hi Flytox

thanks for this,

Unfortunately there are only about 8,000 email addresses stored in the email address feild ( and in other columns) so that still leaves me over 75,000 nulls.

I'll definately have a read so thanks, and i'll get back .
0
 
Dave BaldwinFixer of ProblemsCommented:
Don't concatenate.  MySQL will allow you to match on more than one field at a time.  The code below or something like it will let you keep changing the selection as you need to check different combinations.  Make sure your character set is set to CI (case insensitive).  You don't want to go thru all combinations of case.
SELECT * FROM Customers WHERE fname = 'joe' AND lname = 'jones' AND street = '123 Maple' AND city = "Hometown" AND state = "Indiana" AND zip = "12345" ORDER BY zip;

Open in new window

0
 
arnoldCommented:
You could use the group by as a means to determine if there are users that have duplicates

I would think that prior to reprocessing the data you come up with the new database structure:
i.e. can a user have multiple roles i.e. VIP as well as a retail and staff?
can the user have multiple addresses i.e. billing address, shipping address, alternate address?
Similarly for the phones, emails.
Once you have that structure you can then go a record at a time and transition the data into the new table set.  The other thing is you can maintain the link of the current data with the new
i.e. Mr Jay Jones and jay jones being one and the same, you can then link/transition/merge their data in the new db structure.
Presumably 10% of the 85,000 records might be duplicative. You would likely need to review it to make sure that the items that appear to be identical truly are.
You could have the same person that moved from one address to another and thus is now showing up twice.
0
 
kingjelyAuthor Commented:
HI dave, I have only just been able to see the attached code for some reason, sorry.

So i would have to select each person and each feild to make them unique, is what you are saying?

Unfortunately 85,000 select statements like this, would take me weeks, or am i misinterpreting what you are saying?

thanks
Jas
0
 
kingjelyAuthor Commented:
Cool,

Thanks for your ideas and help Dave.

I'll have a look now

Cheers.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.