Solved

only use a mysql query which can be run 1000 times and only put new rows in table if the title,body changes

Posted on 2013-11-29
26
326 Views
Last Modified: 2014-03-07
users have a title and body in their social networking website
Think of myspace for example where some users1 has not edited their title and body 2 years.  So user1 title,body 2 years ago is the same as user1 title,body current.
I do not want many rows of user1 because user1 is unchanged.
I do not want to prune duplicate values.

columns:
 id (int),profile_id (varchar), timestamp, title (varchar), body (varchar).  If body,title is different save a new title,body


I want historical values so if values are different, insert do not update


Please use mysql query.
because I have another question of trigger and stored procedure and php

I want to test this one query in mysql workbench query editor.
0
Comment
Question by:rgb192
  • 13
  • 11
26 Comments
 
LVL 50

Expert Comment

by:Lowfatspread
Comment Utility
it isn't very clear what you are asking us for...

it isn't very clear what you are trying to achieve..

please provide some sample data to clarify the scenario...

ie.. what do you mean by duplicate...
       what is the key to your table....
         (how is the History maintained/related?)
      you mention 2 years of unchanged data , but don't accurately reflect how that is to be identified....

    e.g. if the user hasn't changed the data ... the natural assumption is that only a single row will exist....
         

 if the user has cyclicly return to a "body" which is the same as a previous one ... how should that be treated?

  e.g.   July 2011  bay city rollers fan
            aug 2011  cure fan
           aug 2012  bay city rollers fan

What is the significance of the process being run 1 thousand times?
0
 

Author Comment

by:rgb192
Comment Utility
ie.. what do you mean by duplicate...
same title/body
       what is the key to your table....
I have not created the table yet
         (how is the History maintained/related?)
I do not know
      you mention 2 years of unchanged data , but don't accurately reflect how that is to be identified....
Going to the website page (myspace.com) and gathering title and body many times for 2 years

    e.g. if the user hasn't changed the data ... the natural assumption is that only a single row will exist....
         

 if the user has cyclicly return to a "body" which is the same as a previous one ... how should that be treated?
  e.g.   July 2011  bay city rollers fan
            aug 2011  cure fan
           aug 2012  bay city rollers fan

I have not thought about that yet and will be a future question.  Only comparing to current title/body row.



What is the significance of the process being run 1 thousand times?
This may be run every hour for two years
0
 

Author Comment

by:rgb192
Comment Utility
Did I answer your questions properly?
0
 
LVL 26

Expert Comment

by:skullnobrains
Comment Utility
use id,body,title as the primary key or unique index of the history table

run insert queries on that table : if the data already exists, nothing will be inserted and the query will fail with an explicit error

you can use "insert ignore" so you can run a single insert query for multiple rows : in that case you won't be able to tell which rows were ignored or inserted unless you run an additional select but the mysql_affected_rows will reflect the number of rows that were actually inserted
0
 

Author Comment

by:rgb192
Comment Utility
Insert ignore where not the same profile_id, title, body. Because two users may have the same title and body.
What is the query for that?
0
 
LVL 26

Expert Comment

by:skullnobrains
Comment Utility
nothing related to the query

just create a primary key or unique index with the combination of fields that should never be duplicated

if you create a unique index on the 3 keys specified, 2 users with the same body and title are allowed as long as they have different ids
0
 

Author Comment

by:rgb192
Comment Utility
is this the proper syntax?

CREATE UNIQUE INDEX index_name
ON table_name (column1_name, column2_name,column3_name)


CREATE UNIQUE INDEX duplicationPrevention
ON userText ( profile_id, title, body)


and if this were in the bottom of a create table query

 UNIQUE INDEX duplicationPrevention ( `profile_id`, `title`, `body`)
0
 
LVL 26

Expert Comment

by:skullnobrains
Comment Utility
yes. create this index and run your insert ignore queries. you can use now() for the timestamp. batch-inserting multiple rows will work as expected. if you want the timestamp to be updated even when the data has not changed, add "on duplicate key update timestamp=VALUES(timestamp)"
0
 

Author Comment

by:rgb192
Comment Utility
I do not want duplicate key update
if there is new data, then it is a new inserted row, so I can see historical changes

I do not understand what the insert statement should be
0
 
LVL 26

Expert Comment

by:skullnobrains
Comment Utility
then forget about the on duplicate key if you don't need it.

the insert query is just your existing insert query with an extra ignore keyword :

insert IGNORE into tbl_name (col1,col2,col3) VALUES ...

design the insert exactly in the same way you would if you wanted the duplicated rows to actually be inserted.

it is the job of mysql to detect that the row is a duplicate using the unique index or primary key you defined.

the ignore keyword makes mysql ignore the duplicate rows without throwing an sql error. this is more convenient when you insert rows one by one because it makes the error handling easier. it is mandatory if you insert multiple rows at once because mysql would stop inserting after the first duplicate if you did a regular insert (at least with mysql's default config)
0
 

Author Comment

by:rgb192
Comment Utility
insert IGNORE into tbl_name (col1,col2,col3) VALUES ...

insert IGNORE into tbl_name (profile_id,title,body) VALUES ...

would my unique key work for this query?

CREATE UNIQUE INDEX duplicationPrevention
ON userText ( profile_id, title, body)
0
 
LVL 26

Expert Comment

by:skullnobrains
Comment Utility
yes. give it a try
0
Top 6 Sources for Identifying Threat Actor TTPs

Understanding your enemy is essential. These six sources will help you identify the most popular threat actor tactics, techniques, and procedures (TTPs).

 

Author Comment

by:rgb192
Comment Utility
CREATE TABLE `history`
 ( 
`history_id` int(11) NOT NULL, 
`profile_id` varchar(20) default NULL,
 `title` varchar(100) default NULL, 
`body` varchar(2000) default NULL,
 PRIMARY KEY (`history_id`),
UNIQUE INDEX duplicationPrevention ( profile_id, title, body)
 )
 ENGINE=MyISAM DEFAULT CHARSET=utf8;

Open in new window


Error Code: 1071. Specified key was too long; max key length is 1000 bytes



but
CREATE TABLE `history`
 ( 
`history_id` int(11) NOT NULL, 
`profile_id` varchar(20) default NULL,
 `title` varchar(100) default NULL, 
`body` varchar(200) default NULL,
 PRIMARY KEY (`history_id`),
UNIQUE INDEX duplicationPrevention ( profile_id, title, body)
 )
 ENGINE=MyISAM DEFAULT CHARSET=utf8;

Open in new window


works


however the body field requires 2000 characters.
0
 

Author Comment

by:rgb192
Comment Utility
correction, I found the maximum length of every field
drop table history;
CREATE TABLE `history`
 ( 
`history_id` int(11) NOT NULL, 
`profile_id` varchar(16) default NULL,
 `title` varchar(50) default NULL, 
`body` varchar(550) default NULL,
 PRIMARY KEY (`history_id`),
UNIQUE INDEX duplicationPrevention ( profile_id, title, body)
 )
 ENGINE=MyISAM DEFAULT CHARSET=utf8;

Open in new window


Error Code: 1071. Specified key was too long; max key length is 1000 bytes
0
 
LVL 26

Expert Comment

by:skullnobrains
Comment Utility
this is a limitation in mysql : key lengths can't be bigger with this engine and mysql version.

note that using multibyte encodings such as UTF-8 makes the length of your columns bigger so it is not surprising you hit a 1000 limit with 550+50+16

i don't know exactly the limits of each build+engine. current innodb limits are above 3k so one option is to upgrade to a newer version. i guess this can also be done at compile time.

---

a simple workaround could be to store the md5 of the body in a separate column and use it in the index.

i'd recommend this solution since it is easy to implement. you can use mysql itself to calculate the md5 using the builtin MD5() function.

- create the necessary column (let's call it body_hash)
- update table set body_hash=MD5(body)
- change your insert query so it also populates the body_hash column as well
0
 

Author Comment

by:rgb192
Comment Utility
could you elaborate or show a link of md5 example.

Because I think  when I insert into original table automatic insert into second table (what is this called)?
And then deletes from first table delete second table. (What is this called)
0
 
LVL 26

Expert Comment

by:skullnobrains
Comment Utility
could you elaborate or show a link of md5 example.

i'll be very happy to answer specific questions related to that matter, or possibly modify code that you provide if it is reasonably clean, but i don't have time to do your work since i'm pretty busy with mine already.

why don't you try and follow the 3 simple steps i gave ? i'll gladly help if you get stuck along the way.
i forgot to mention the implicit fourth step : change the unique index to ( profile_id, title, body_hash) which mysql will allow since the body_hash column will be much smaller than the initial body column

Because I think  when I insert into original table automatic insert into second table (what is this called)?
And then deletes from first table delete second table. (What is this called)

no idea what you are talking about. you never mentioned a second table before and you don't need one.
0
 

Author Comment

by:rgb192
Comment Utility
are these the 3 steps

- create the necessary column (let's call it body_hash)
- update table set body_hash=MD5(body)
- change your insert query so it also populates the body_hash column as well


this is an internal system.
is there a way to design without md5

maybe using
id instead of md5 id
0
 
LVL 26

Accepted Solution

by:
skullnobrains earned 500 total points
Comment Utility
other options :

- you can give a shorter length to the text column containing the body if you are sure the length of the texts will be shorter.

- if you expect english text, you can use ASCII instead of UTF-8 which you don't need

- you can declare the body as non-multibyte charset (latin1 should do for example) even though you actually store utf-8 inside that column. beware that this is a dirty hack, and that each accentuated or other weird character in your input will reduce the available length of the text container.

----

the md5 way is better.

- create the necessary column (let's call it body_hash)
- update table set body_hash=MD5(body)
- create a unique index on ( profile_id, title, body_hash)
- change your insert query so it also populates the body_hash column as well
<< insert into ...  (...,body) VALUES (.....,$BODY)
>> insert into ... (...,body,body_hash) VALUES (....,$BODY,MD5($BODY))

you can also compute the md5 beforehand in your code so you don't send the body to mysql twice

maybe using
id instead of md5 id

no way. ids are unrelated to the body's content so you cannot expect to detect changes in the body using them. if this is not obvious to you, then you are way over your head and you definitely should focus on understanding rather than solving... or hire someone.
0
 

Author Comment

by:rgb192
Comment Utility
no way. ids are unrelated to the body's content so you cannot expect to detect changes in the body using them. if this is not obvious to you, then you are way over your head and you definitely should focus on understanding rather than solving... or hire someone.

I am hiring someone.  I just dont know what to tell him.
Could you give an example of how id is not body and only md5 is body.
0
 
LVL 26

Expert Comment

by:skullnobrains
Comment Utility
your are supposes to generate the content of the md5 field using the content of the body field. that is how they are related.

as to why the id is not, i can't answer as it is pretty much the same as answering to "why do the number of oranges in my basket is unrelated to the color of the apples"
0
 
LVL 26

Expert Comment

by:skullnobrains
Comment Utility
if you hire someone and he's having a hard time, maybe let him read and post in this thread
0
 

Author Closing Comment

by:rgb192
Comment Utility
I need to understand a tutorial, then I will learn more about this process. Thanks.
0
 
LVL 26

Expert Comment

by:skullnobrains
Comment Utility
ok. feel free to ask questions in this thread along the way. i may need a few days to answer depending on the moment you ask, though, cause i'm not always around
0
 

Author Comment

by:rgb192
Comment Utility
Thanks. This question requires me to study tutorials so i think above my skill level.
0

Featured Post

How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

Join & Write a Comment

A lot of articles have been written on splitting mysqldump and grabbing the required tables. A long while back, when Shlomi (http://code.openark.org/blog/mysql/on-restoring-a-single-table-from-mysqldump) had suggested a “sed” way, I actually shell …
More Fun with XML and MySQL – Parsing Delimited String with a Single SQL Statement Are you ready for another of my SQL tidbits?  Hopefully so, as in this adventure, I will be covering a topic that comes up a lot which is parsing a comma (or other…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…
This video shows how to remove a single email address from the Outlook 2010 Auto Suggestion memory. NOTE: For Outlook 2016 and 2013 perform the exact same steps. Open a new email: Click the New email button in Outlook. Start typing the address: …

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now