Solved

only use a mysql query which can be run 1000 times and only put new rows in table if the title,body changes

Posted on 2013-11-29
Medium Priority
357 Views
Last Modified: 2014-03-07
Users have a title and body on their social networking website.
Think of Myspace, for example, where some user (user1) has not edited their title and body in 2 years. So user1's title,body from 2 years ago is the same as user1's current title,body.
I do not want many rows for user1 because user1 is unchanged.
I do not want to have to prune duplicate values after the fact.

columns:
id (int), profile_id (varchar), timestamp, title (varchar), body (varchar). If the title,body is different, save a new title,body row.


I want historical values, so if the values are different, insert a new row; do not update.


Please use a MySQL query,
because I have another question about triggers, stored procedures, and PHP.

I want to test this one query in the MySQL Workbench query editor.
Question by:rgb192
26 Comments
 
LVL 50

Expert Comment

by:Lowfatspread
ID: 39685578
it isn't very clear what you are asking us for...

it isn't very clear what you are trying to achieve...

please provide some sample data to clarify the scenario...

i.e. what do you mean by duplicate...
     what is the key to your table...
     (how is the history maintained/related?)
     you mention 2 years of unchanged data, but don't accurately explain how that is to be identified...

     e.g. if the user hasn't changed the data... the natural assumption is that only a single row will exist...

if the user has cyclically returned to a "body" which is the same as a previous one... how should that be treated?

  e.g.  July 2011  bay city rollers fan
        aug 2011   cure fan
        aug 2012   bay city rollers fan

What is the significance of the process being run 1 thousand times?
 

Author Comment

by:rgb192
ID: 39687068
i.e. what do you mean by duplicate...
same title/body

what is the key to your table...
I have not created the table yet

(how is the history maintained/related?)
I do not know

you mention 2 years of unchanged data, but don't accurately explain how that is to be identified...
Going to the website page (myspace.com) and gathering the title and body many times over 2 years

e.g. if the user hasn't changed the data... the natural assumption is that only a single row will exist...

if the user has cyclically returned to a "body" which is the same as a previous one... how should that be treated?
  e.g.  July 2011  bay city rollers fan
        aug 2011   cure fan
        aug 2012   bay city rollers fan
I have not thought about that yet; that will be a future question. I am only comparing against the current title/body row.

What is the significance of the process being run 1 thousand times?
This may be run every hour for two years.
 

Author Comment

by:rgb192
ID: 39703214
Did I answer your questions properly?

 
LVL 27

Expert Comment

by:skullnobrains
ID: 39726548
use id,body,title as the primary key or unique index of the history table

run insert queries on that table: if the data already exists, nothing will be inserted and the query will fail with an explicit error

you can use "insert ignore" so you can run a single insert query for multiple rows: in that case you won't be able to tell which rows were ignored or inserted unless you run an additional select, but mysql_affected_rows will reflect the number of rows that were actually inserted
 

Author Comment

by:rgb192
ID: 39727948
Insert ignore where the profile_id, title, body combination is not already present, because two users may have the same title and body.
What is the query for that?
 
LVL 27

Expert Comment

by:skullnobrains
ID: 39728683
nothing related to the query

just create a primary key or unique index with the combination of fields that should never be duplicated

if you create a unique index on the 3 columns specified, 2 users with the same body and title are allowed as long as they have different ids
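For illustration only, assuming a userText table exists with a unique index on (profile_id, title, body); the values are placeholders:

-- both rows are accepted: title and body are identical,
-- but the profile_id differs, so the unique key is not violated
INSERT IGNORE INTO userText (profile_id, title, body)
VALUES ('user1', 'hello', 'my body');

INSERT IGNORE INTO userText (profile_id, title, body)
VALUES ('user2', 'hello', 'my body');

-- this repeat for user1 is silently skipped as a duplicate
INSERT IGNORE INTO userText (profile_id, title, body)
VALUES ('user1', 'hello', 'my body');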
 

Author Comment

by:rgb192
ID: 39738147
is this the proper syntax?

CREATE UNIQUE INDEX index_name
ON table_name (column1_name, column2_name,column3_name)


CREATE UNIQUE INDEX duplicationPrevention
ON userText ( profile_id, title, body)


and if this were in the bottom of a create table query

 UNIQUE INDEX duplicationPrevention ( `profile_id`, `title`, `body`)
 
LVL 27

Expert Comment

by:skullnobrains
ID: 39738268
yes. create this index and run your insert ignore queries. you can use now() for the timestamp. batch-inserting multiple rows will work as expected. if you want the timestamp to be updated even when the data has not changed, add "on duplicate key update timestamp=VALUES(timestamp)"
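For example, a sketch assuming the userText table above and the timestamp column from the original column list:

-- duplicates are skipped; NOW() supplies the timestamp
INSERT IGNORE INTO userText (profile_id, title, body, `timestamp`)
VALUES ('user1', 'my title', 'my body', NOW());

-- optional variant: refresh the timestamp even when title/body are unchanged
INSERT INTO userText (profile_id, title, body, `timestamp`)
VALUES ('user1', 'my title', 'my body', NOW())
ON DUPLICATE KEY UPDATE `timestamp` = VALUES(`timestamp`);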
 

Author Comment

by:rgb192
ID: 39750587
I do not want ON DUPLICATE KEY UPDATE.
If there is new data, then it should be a new inserted row, so I can see historical changes.

I do not understand what the insert statement should be.
 
LVL 27

Expert Comment

by:skullnobrains
ID: 39750918
then forget about the on duplicate key clause if you don't need it.

the insert query is just your existing insert query with an extra IGNORE keyword:

insert IGNORE into tbl_name (col1,col2,col3) VALUES ...

design the insert exactly the same way you would if you wanted the duplicate rows to actually be inserted.

it is the job of mysql to detect that the row is a duplicate, using the unique index or primary key you defined.

the ignore keyword makes mysql ignore the duplicate rows without throwing an sql error. this is more convenient when you insert rows one by one because it makes the error handling easier. it is mandatory if you insert multiple rows at once, because mysql would stop inserting after the first duplicate if you did a regular insert (at least with mysql's default config)
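A sketch of both forms, again assuming the userText table and placeholder values:

-- single row: without IGNORE this would raise a duplicate-key error
-- if ('user1', 'title a', 'body a') already exists
INSERT IGNORE INTO userText (profile_id, title, body, `timestamp`)
VALUES ('user1', 'title a', 'body a', NOW());

-- multiple rows at once: duplicates are skipped and the remaining rows
-- are still inserted; mysql_affected_rows() reports how many went in
INSERT IGNORE INTO userText (profile_id, title, body, `timestamp`)
VALUES ('user1', 'title a', 'body a', NOW()),
       ('user2', 'title b', 'body b', NOW());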
 

Author Comment

by:rgb192
ID: 39773657
insert IGNORE into tbl_name (col1,col2,col3) VALUES ...

insert IGNORE into tbl_name (profile_id,title,body) VALUES ...

would my unique key work for this query?

CREATE UNIQUE INDEX duplicationPrevention
ON userText ( profile_id, title, body)
 
LVL 27

Expert Comment

by:skullnobrains
ID: 39775194
yes. give it a try
 

Author Comment

by:rgb192
ID: 39782712
CREATE TABLE `history` (
  `history_id` int(11) NOT NULL,
  `profile_id` varchar(20) default NULL,
  `title` varchar(100) default NULL,
  `body` varchar(2000) default NULL,
  PRIMARY KEY (`history_id`),
  UNIQUE INDEX duplicationPrevention (`profile_id`, `title`, `body`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;



Error Code: 1071. Specified key was too long; max key length is 1000 bytes



but
CREATE TABLE `history` (
  `history_id` int(11) NOT NULL,
  `profile_id` varchar(20) default NULL,
  `title` varchar(100) default NULL,
  `body` varchar(200) default NULL,
  PRIMARY KEY (`history_id`),
  UNIQUE INDEX duplicationPrevention (`profile_id`, `title`, `body`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;



works


however the body field requires 2000 characters.
 

Author Comment

by:rgb192
ID: 39782809
correction, I found the maximum length of every field
drop table history;

CREATE TABLE `history` (
  `history_id` int(11) NOT NULL,
  `profile_id` varchar(16) default NULL,
  `title` varchar(50) default NULL,
  `body` varchar(550) default NULL,
  PRIMARY KEY (`history_id`),
  UNIQUE INDEX duplicationPrevention (`profile_id`, `title`, `body`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;



Error Code: 1071. Specified key was too long; max key length is 1000 bytes
 
LVL 27

Expert Comment

by:skullnobrains
ID: 39798669
this is a limitation in mysql: key lengths can't be bigger with this engine and mysql version.

note that using a multibyte encoding such as UTF-8 makes the byte length of your columns bigger, so it is not surprising you hit the 1000-byte limit with 16+50+550 characters: utf8 reserves up to 3 bytes per character, so the index would need up to 3 x 616 = 1848 bytes.

i don't know exactly the limits of each build+engine. current innodb limits are above 3k, so one option is to upgrade to a newer version. i guess this can also be changed at compile time.

---

a simple workaround could be to store the md5 of the body in a separate column and use that column in the index.

i'd recommend this solution since it is easy to implement. you can use mysql itself to calculate the md5 with the builtin MD5() function:

- create the necessary column (let's call it body_hash)
- update table set body_hash=MD5(body)
- change your insert query so it populates the body_hash column as well
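A sketch of those three steps; the CHAR(32) type is an assumption (MD5() always returns 32 hex characters), and the statements assume the history table already exists without the oversized unique index:

-- 1. add the hash column
ALTER TABLE history ADD COLUMN body_hash CHAR(32) DEFAULT NULL;

-- 2. back-fill the hash for any rows that already exist
UPDATE history SET body_hash = MD5(body);

-- 3. populate the hash on every new insert as well
INSERT IGNORE INTO history (history_id, profile_id, title, body, body_hash)
VALUES (1, 'user1', 'my title', 'a long body ...', MD5('a long body ...'));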
 

Author Comment

by:rgb192
ID: 39809194
could you elaborate or show a link to an md5 example?

Because I think when I insert into the original table there should be an automatic insert into a second table (what is this called)?
And then deletes from the first table should delete from the second table. (What is this called?)
 
LVL 27

Expert Comment

by:skullnobrains
ID: 39811755
could you elaborate or show a link to an md5 example?

i'll be very happy to answer specific questions related to that matter, or possibly modify code that you provide if it is reasonably clean, but i don't have time to do your work since i'm pretty busy with mine already.

why don't you try to follow the 3 simple steps i gave? i'll gladly help if you get stuck along the way.
i forgot to mention the implicit fourth step: change the unique index to (profile_id, title, body_hash), which mysql will allow since the body_hash column is much smaller than the initial body column

Because I think when I insert into the original table there should be an automatic insert into a second table (what is this called)?
And then deletes from the first table should delete from the second table. (What is this called?)

no idea what you are talking about. you never mentioned a second table before and you don't need one.
 

Author Comment

by:rgb192
ID: 39816941
are these the 3 steps?

- create the necessary column (let's call it body_hash)
- update table set body_hash=MD5(body)
- change your insert query so it populates the body_hash column as well

this is an internal system.
is there a way to design this without md5,
maybe using the id instead of an md5 hash?
 
LVL 27

Accepted Solution

by:
skullnobrains earned 2000 total points
ID: 39823805
other options :

- you can give a shorter length to the text column containing the body if you are sure the texts will be shorter.

- if you expect english text, you can use ASCII instead of UTF-8, which you don't need in that case

- you can declare the body as a non-multibyte charset (latin1 should do, for example) even though you actually store utf-8 inside that column. beware that this is a dirty hack, and each accented or other multibyte character in your input will reduce the available length of the text container.
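A sketch of the third option (the dirty hack noted above); the table name is a placeholder and the column sizes follow the earlier CREATE TABLE attempt:

-- the single-byte body column counts 550 bytes against the 1000-byte
-- MyISAM key limit instead of 3 x 550, so the full unique index now fits
CREATE TABLE history_latin1 (
  history_id int(11) NOT NULL,
  profile_id varchar(16) default NULL,
  title      varchar(50) default NULL,
  body       varchar(550) CHARACTER SET latin1 default NULL,
  PRIMARY KEY (history_id),
  UNIQUE INDEX duplicationPrevention (profile_id, title, body)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;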

----

the md5 way is better.

- create the necessary column (let's call it body_hash)
- update table set body_hash=MD5(body)
- create a unique index on ( profile_id, title, body_hash)
- change your insert query so it also populates the body_hash column as well
<< insert into ...  (...,body) VALUES (.....,$BODY)
>> insert into ... (...,body,body_hash) VALUES (....,$BODY,MD5($BODY))

you can also compute the md5 beforehand in your code so you don't send the body to mysql twice
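Putting the above together, a sketch of the end state; the AUTO_INCREMENT on history_id and the CHAR(32) hash type are assumptions, and the values are placeholders:

-- final shape: the unique index covers the fixed-size hash instead of the
-- full body, so the 1000-byte key limit is no longer an issue
CREATE TABLE history (
  history_id int(11) NOT NULL AUTO_INCREMENT,
  profile_id varchar(16) DEFAULT NULL,
  title      varchar(50) DEFAULT NULL,
  body       varchar(2000) DEFAULT NULL,
  body_hash  char(32) DEFAULT NULL,        -- MD5(body)
  PRIMARY KEY (history_id),
  UNIQUE INDEX duplicationPrevention (profile_id, title, body_hash)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

-- the hash can be computed by mysql itself...
INSERT IGNORE INTO history (profile_id, title, body, body_hash)
VALUES ('user1', 'my title', 'the full body text ...', MD5('the full body text ...'));

-- ...or computed beforehand in the application, so the body is only sent once
-- ('placeholder-md5-hex' stands in for the precomputed 32-character hash)
INSERT IGNORE INTO history (profile_id, title, body, body_hash)
VALUES ('user1', 'my title', 'the full body text ...', 'placeholder-md5-hex');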

maybe using the id instead of an md5 hash?

no way. ids are unrelated to the body's content so you cannot expect to detect changes in the body using them. if this is not obvious to you, then you are way over your head and you definitely should focus on understanding rather than solving... or hire someone.
 

Author Comment

by:rgb192
ID: 39857883
no way. ids are unrelated to the body's content so you cannot expect to detect changes in the body using them. if this is not obvious to you, then you are way over your head and you definitely should focus on understanding rather than solving... or hire someone.

I am hiring someone. I just don't know what to tell him.
Could you give an example of how the id is not related to the body and only the md5 is?
 
LVL 27

Expert Comment

by:skullnobrains
ID: 39870728
you are supposed to generate the content of the md5 field from the content of the body field. that is how they are related.

as to why the id is not, i can't answer, as it is pretty much the same as answering "why is the number of oranges in my basket unrelated to the color of the apples?"
 
LVL 27

Expert Comment

by:skullnobrains
ID: 39870738
if you hire someone and he's having a hard time, maybe let him read and post in this thread
 

Author Closing Comment

by:rgb192
ID: 39911416
I need to understand a tutorial, then I will learn more about this process. Thanks.
 
LVL 27

Expert Comment

by:skullnobrains
ID: 39911928
ok. feel free to ask questions in this thread along the way. i may need a few days to answer depending on the moment you ask, though, cause i'm not always around
 

Author Comment

by:rgb192
ID: 39913210
Thanks. This question requires me to study tutorials, so I think it is above my skill level.
