Sam K asked:

SQLite max query

Hello experts,
I was wondering if there is a better method of building SQL search parameters than the one I have used.

In my C# application, which connects to an SQLite database, I provide a single-column datagrid for users to enter values. My original intention was small to medium scale queries with quick results; as I am sure everyone has found out, a developer's expectations and what a user actually does are often very different.
The program currently opens a connection to the DB and loops through the values, checking each one with a scalar query; if no scalar value is returned, it records that the entered value doesn't exist, continues to the end of the loop, closes the connection, and alerts the user to the erroneous value(s).
It then opens a new connection to the DB and builds an SQL query string from a list, adding a new item for each non-blank entry in the datagrid. Because the values exist in two separate tables (for unfortunate legacy reasons), views are created and destroyed within the statement.
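For illustration, a minimal sketch of that validation loop, assuming the System.Data.SQLite provider; the table and column names here are placeholders, not the real schema or the actual program code:

    // Hypothetical sketch of the validation pass: one connection, one scalar check
    // per entered value. "table2" and "Column1" are placeholder names.
    using System.Collections.Generic;
    using System.Data.SQLite;

    static List<string> FindMissingValues(string connectionString, IEnumerable<string> values)
    {
        var missing = new List<string>();
        using (var conn = new SQLiteConnection(connectionString))
        {
            conn.Open();
            using (var check = new SQLiteCommand(
                "SELECT 1 FROM table2 WHERE Column1 LIKE @v LIMIT 1", conn))
            {
                var p = check.Parameters.Add("@v", System.Data.DbType.String);
                foreach (var value in values)
                {
                    p.Value = "%" + value + "%";
                    // ExecuteScalar returns null when no matching row exists.
                    if (check.ExecuteScalar() == null)
                        missing.Add(value);
                }
            }
        }
        return missing; // caller alerts the user to the erroneous value(s)
    }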

A sample query for 8 items would look like the following:

CREATE VIEW view1 AS SELECT * FROM table1, table2;
CREATE VIEW view2 AS SELECT * FROM table3, table4;
CREATE VIEW collection AS SELECT * FROM view2
UNION ALL
SELECT * FROM view1;
SELECT * FROM collection WHERE collection.Column1 LIKE "%value1%"
or collection.Column1 like "%value2%"
or collection.Column1 like "%value3%"
or collection.Column1 like "%value4%"
or collection.Column1 like "%value5%"
or collection.Column1 like "%value6%"
or collection.Column1 like "%value7%"
or collection.Column1 like "%value8%"
DROP VIEW view1;
DROP VIEW view2;
DROP VIEW collection;

Whilst the processing time for this example in Navicat is 0.304 seconds, when users enter hundreds or even thousands of items the program becomes slow and unresponsive, and it turns out that the SQLite engine can only handle a maximum of around 1,000 such terms per query.
For each value in the datagrid an "or collection ... like value" clause is created in a list, which is then joined together using List.ToArray(), i.e. string s1 = string.Join(string.Empty, list.ToArray()).
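As described, the construction is roughly the following (a hypothetical reconstruction, not the actual code; the first list item carries the SELECT ... WHERE clause and each later item its own leading "or"):

    // Hypothetical reconstruction of the query-building step described above;
    // "values" stands in for the non-blank datagrid entries.
    using System.Collections.Generic;

    static string BuildSearchQuery(IEnumerable<string> values)
    {
        var list = new List<string>();
        foreach (var value in values)
        {
            if (string.IsNullOrWhiteSpace(value))
                continue;
            list.Add(list.Count == 0
                ? $"SELECT * FROM collection WHERE collection.Column1 LIKE \"%{value}%\"\n"
                : $"or collection.Column1 like \"%{value}%\"\n");
        }
        // Joined without a separator, as described; each item supplies its own "or".
        return string.Join(string.Empty, list.ToArray());
    }

Each datagrid entry adds one more OR term, so the statement text grows linearly and eventually runs into the per-query limit mentioned above.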

Is there a way to make this part of the program more robust and efficient?
Thank you
Tags: Databases, .NET Programming, C#, SQL

PortletPaul

Do you literally use that script? (or has it been "simplified" for our consumption?)

Why do you use "SELECT * FROM table1, table2" or "FROM table3, table4"?
Do you know that creates a "Cartesian product" (all rows of table1 multiplied by all rows of table2)?

Why create views and then drop them?

I think you really need to provide sample data (per source table, not your views) and then an expected result.

btw: if you have simplified your real SQL, I would suggest just giving us the real deal.
Sam K (Asker)
Hi Paul,
I have adjusted the names of the views, tables and the values for readability, otherwise yes that is a copy and paste of the script.
The Cartesian join was created as an identifier. What I mean is that, in dealing with the legacy data and the new data, an identifier was required: table1 and table3 each contain a single value, let's call them "database1" and "database2", so all the data in view1 ends up with "database1" as its first column value.
Re-reading the script, I should have made collection.Column1 into collection.Column2, as that is where the value actually is, but that is just semantics.
Creating the views allowed me to grab the data out of the two tables as a single combined table; dropping the views after the result was generated seemed to be good code cleanup.
If, for instance, a user inputs 100 values into the datagrid, the script grows by 100 'or collection ...' clauses (100 actually works OK), but obviously the more values that are input, the longer the query becomes and the slower the program operates.
The existing legacy data is 50,000 rows by 100 columns and, strangely enough, it is not static: the data is received from a 3rd party and, whilst it should never change, they are prone to making small, impromptu changes. That is the main reason for the table separation.
Olaf Doschke

>the contents of table1 and table3 is a single value
Do you mean they have one column each? Or do you mean they have one record each?
The core problem is still that you combine every record of table1 with every record of table2, and every record of table3 with every record of table4. As users can search for hundreds of values, I assume you have hundreds or rather thousands of records in each table, and that yields millions of possible combinations. You would rather want to LEFT JOIN the tables on some key.

Bye, Olaf.
ASKER CERTIFIED SOLUTION
PortletPaul

Sam K (Asker)
Thank you for the idea of putting the search terms into a separate table; it was so obvious once you pointed it out.
The expense of running the prepending Cartesian join on both tables and then the union was the same in MySQL and SQLite, both returning 0.006 s, which I see as an acceptable cost.

I have altered the program to create a new table from C# and then, in a single transaction, add all the values from the datagridview.
A search is then done comparing the contents of the search table against the unified view.

SELECT * FROM collection cl WHERE EXISTS
(SELECT 1 FROM  searchTable st where st.searchValues = cl.ID);
DELETE FROM searchTable;

The search outputs the results and then deletes the contents of the search table so it is empty for the next search.
I just ran a query on 4k entries and it took 6 seconds; whilst that is an unfortunate delay, it is a vast improvement on the previous process, which would (a) take a very, very long time and (b) crash the SQLite engine with that number of terms.
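For reference, a minimal sketch of the revised flow described above, assuming the System.Data.SQLite provider; apart from searchTable/searchValues (taken from the SQL above), the names are placeholders rather than the actual program code:

    // Hypothetical sketch: load the datagrid values into the search table inside one
    // transaction, run the EXISTS query against the unified view, then clear the table.
    using System.Collections.Generic;
    using System.Data.SQLite;

    static void RunSearch(string connectionString, IEnumerable<string> values)
    {
        using (var conn = new SQLiteConnection(connectionString))
        {
            conn.Open();

            // Insert every non-blank datagrid value in a single transaction.
            using (var tx = conn.BeginTransaction())
            using (var insert = new SQLiteCommand(
                "INSERT INTO searchTable (searchValues) VALUES (@v)", conn, tx))
            {
                var p = insert.Parameters.Add("@v", System.Data.DbType.String);
                foreach (var value in values)
                {
                    if (string.IsNullOrWhiteSpace(value)) continue;
                    p.Value = value;
                    insert.ExecuteNonQuery();
                }
                tx.Commit();
            }

            // Compare the search table against the unified view.
            using (var select = new SQLiteCommand(
                "SELECT * FROM collection cl WHERE EXISTS " +
                "(SELECT 1 FROM searchTable st WHERE st.searchValues = cl.ID)", conn))
            using (var reader = select.ExecuteReader())
            {
                while (reader.Read())
                {
                    // ... hand each matching row back to the UI ...
                }
            }

            // Empty the search table ready for the next search.
            using (var clear = new SQLiteCommand("DELETE FROM searchTable", conn))
            {
                clear.ExecuteNonQuery();
            }
        }
    }

Assuming the compared columns are indexed, the EXISTS lookup should stay cheap even as the value list grows.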
PortletPaul

Thanks for the feedback, good to know you have a workable solution. cheers, Paul