I have the following table

CompanyId  int(2)          Primary key
CustomerId char(20)     Primary key
SaleDate        DateTime
SalesAmount   Decimal(12,2)
SalesStatus      char(1)

This table has more than 8989998788 records

I am using the below where condition to fetch the  records

select customerId , SaleDate,SalesAmount from  Customer_sales
where CustomerId >'12345'
and CustomerId <'879999'

Should I have to include the CompanyId   in the where condition since it is a composite primary key ?   Will this improve the query performance ?

And,  Does where clause in the same order how the composite primary key builded ?
 (ie)  where CompanyId =12 and CustomerId >'12345'
Varshini SAsked:
Who is Participating?
PortletPaulConnect With a Mentor freelancerCommented:
>>"The third link is about Oracle."

I see MySQL, Oracle, PostgreSQL, and SQL Server provided in that reference (i.e. the discussion is generic about index design - not queries) and it contains the following (emphasis added):
Full Table Scan

The database does not use the index because it cannot use single columns from a concatenated index arbitrarily. A closer look at the index structure makes this clear.


The ordering of a two-column index is therefore like the ordering of a telephone directory: it is first sorted by surname, then by first name. That means that a two-column index does not support searching on the second column alone; that would be like searching a telephone directory by first name.
It then goes on to make this assertion:
Even though the two-index solution delivers very good select performance as well, the single-index solution is preferable. It not only saves storage space, but also the maintenance overhead for the second index. The fewer indexes a table has, the better the insert, delete and update performance.

So that article is arguing that you wouldn't need both employees_pk and subsidiary_id in an index and suggests singular field indexes would be preferred. However this is not relevant for this question as CompanyId, CustomerId form the primary key and (in SQL Server) they will form the clustered index, and hence both fields should be used in the where clause to take advantage of that index.  If only wanting to filter by CustomerId then an additional index is needed as already discussed by others.

The article does not appear to make any claim as the sequence required within a where clause.
lcohanConnect With a Mentor Database AnalystCommented:
"Should I have to include the CompanyId   in the where condition since it is a composite primary key ?   Will this improve the query performance ?"

Only if you put the CompanyId first in the where clause like you did above.
CompanyId =12 and CustomerId >'12345'

where CompanyId =12 and CustomerId between '12345' and '879999'

As much costly as it may look I would also try to add an index on CustomerId then try the
just the where CustomerId between '12345' and '879999' to see the difference.
Dale BurrellConnect With a Mentor DirectorCommented:
Whether you include the CompanyId depends on the requirements for the data you are trying to obtain.

It may improve performance, anything the reduces the number of records returned will usually be quicker. But if the query is running slow a good index will probably do more for you.

There is no order in a where clause - all parts are equal and will be executed in whatever order SQL Server sees fit.
What Kind of Coding Program is Right for You?

There are many ways to learn to code these days. From coding bootcamps like Flatiron School to online courses to totally free beginner resources. The best way to learn to code depends on many factors, but the most important one is you. See what course is best for you.

Kyle AbrahamsConnect With a Mentor Senior .Net DeveloperCommented:

You need to include both columns unless customerId is first in the key.

The actual order (companyID = 12 and customer >'12345') vs (customer>'12345' and companyID = 12) shouldn't matter.
If you have queries that do not use the CompanyId but only CustomerId, consider a normal non-clustered index on just CustomerId. The system is able to figure it out if you rearrange the order in the WHERE clause if you do want to include CompanyId.

You have a bizarrely high number of records.  I assume 9 trillion is an exaggeration.

If you have an excessive number of records, like a few hundred million or more, consider looking into partitioning the table on CompanyId.

If there are a lot of inserts and updates, be sure to schedule a regular check of the index and rebuild as needed.
lcohanConnect With a Mentor Database AnalystCommented:
"There is no order in a where clause - all parts are equal and will be executed in whatever order SQL Server sees fit. "

I found that In SQL (unfortunately) comparing to other databases the order of columns in WHERE must match the INDEX columns order for best performance.




Also I see you use the quotes around the Customer - isn't it that a INT type column rather than text?

Auch... I just noticed that CustomerId is char(20) in PK and that's definitely not helping performance...so in this case if you can add the CompanyId prior to Customer that should definitely help.

I bet if you check the query plan with that structure and have the single quotes around the customer id as a number you will see implicit cast that SQL has to do to match the data type.
magarityConnect With a Mentor Commented:
In the links you've given, the first one is a guy who says he's heard people claim order matters but he's unable to reproduce it or notice that himself. The second is a forum with a lot of people saying that's been hashed over already and proven not the case. The third link is about Oracle.
CustomerId is listed in the table definition in the original question as a character string 20 so it needs to be in quotes.
Scott PletcherConnect With a Mentor Senior DBACommented:
For an index lookup to be effective, you must include the first column in the index.  Other index columns, in order, should follow whenever available.

For your specific example, if the clustering key is:
( CompanyId, CustomerId )
then you must provide the CompanyId, or your query will have to scan the entire table (unless an appropriate nonclustered index is available, of course).

As an analogy, think of a phone book, which is effectively clustered on ( last_name, first_name ).  If I provide you a last name, you can do an effective lookup.  But you can't quickly tell me everyone with a first name of John.  An RDBMS (SQL / Oracle / whatever) index is roughly the same restriction.
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.