Create an account

Very important

  • To access the important data of the forums, you must be active in each forum and especially in the leaks and database leaks section, send data and after sending the data and activity, data and important content will be opened and visible for you.
  • You will only see chat messages from people who are at or below your level.
  • More than 500,000 database leaks and millions of account leaks are waiting for you, so access and view with more activity.
  • Many important data are inactive and inaccessible for you, so open them with activity. (This will be done automatically)


Thread Rating:
  • 640 Vote(s) - 3.43 Average
  • 1
  • 2
  • 3
  • 4
  • 5
When should I use CROSS APPLY over INNER JOIN?

#1
What is the main purpose of using [CROSS APPLY][1]?

I have read (vaguely, through posts on the Internet) that `cross apply` can be more efficient when selecting over large data sets if you are partitioning. (Paging comes to mind)

I also know that `CROSS APPLY` doesn't require a UDF as the right-table.

In most `INNER JOIN` queries (one-to-many relationships), I could rewrite them to use `CROSS APPLY`, but they always give me equivalent execution plans.

Can anyone give me a good example of when `CROSS APPLY` makes a difference in those cases where `INNER JOIN` will work as well?


---

**Edit:**

Here's a trivial example, where the execution plans are exactly the same. (Show me one where they differ and where `cross apply` is faster/more efficient)

create table Company (
companyId int identity(1,1)
, companyName varchar(100)
, zipcode varchar(10)
, constraint PK_Company primary key (companyId)
)
GO

create table Person (
personId int identity(1,1)
, personName varchar(100)
, companyId int
, constraint FK_Person_CompanyId foreign key (companyId) references dbo.Company(companyId)
, constraint PK_Person primary key (personId)
)
GO

insert Company
select 'ABC Company', '19808' union
select 'XYZ Company', '08534' union
select '123 Company', '10016'


insert Person
select 'Alan', 1 union
select 'Bobby', 1 union
select 'Chris', 1 union
select 'Xavier', 2 union
select 'Yoshi', 2 union
select 'Zambrano', 2 union
select 'Player 1', 3 union
select 'Player 2', 3 union
select 'Player 3', 3


/* using CROSS APPLY */
select *
from Person p
cross apply (
select *
from Company c
where p.companyid = c.companyId
) Czip

/* the equivalent query using INNER JOIN */
select *
from Person p
inner join Company c on p.companyid = c.companyId


[1]:

[To see links please register here]

Reply

#2
I guess it should be readability ;)

CROSS APPLY will be somewhat unique for people reading to tell them that a UDF is being used which will be applied to each row from the table on the left.

Ofcourse, there are other limitations where a CROSS APPLY is better used than JOIN which other friends have posted above.
Reply

#3
Well I am not sure if this qualifies as a reason to use Cross Apply versus Inner Join, but this query was answered for me in a Forum Post using Cross Apply, so I am not sure if there is an equalivent method using Inner Join:

Create PROCEDURE [dbo].[Message_FindHighestMatches]

-- Declare the Topical Neighborhood
@TopicalNeighborhood nchar(255)

AS
BEGIN

-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON

Create table #temp
(
MessageID int,
Subjects nchar(255),
SubjectsCount int
)

Insert into #temp Select MessageID, Subjects, SubjectsCount From Message

Select Top 20 MessageID, Subjects, SubjectsCount,
(t.cnt * 100)/t3.inputvalues as MatchPercentage

From #temp

cross apply (select count(*) as cnt from dbo.Split(Subjects,',') as t1
join dbo.Split(@TopicalNeighborhood,',') as t2
on t1.value = t2.value) as t
cross apply (select count(*) as inputValues from dbo.Split(@TopicalNeighborhood,',')) as t3

Order By MatchPercentage desc

drop table #temp

END
Reply

#4
Cross apply works well with an XML field as well. If you wish to select node values in combination with other fields.

For example, if you have a table containing some xml

> <root>
> <subnode1>
> <some_node value="1" />
> <some_node value="2" />
> <some_node value="3" />
> <some_node value="4" />
> </subnode1>
> </root>

Using the query

SELECT
id as [xt_id]
,xmlfield.value('(/root/@attribute)[1]', 'varchar(50)') root_attribute_value
,node_attribute_value = [some_node].value('@value', 'int')
,lt.lt_name
FROM dbo.table_with_xml xt
CROSS APPLY xmlfield.nodes('/root/subnode1/some_node') as g ([some_node])
LEFT OUTER JOIN dbo.lookup_table lt
ON [some_node].value('@value', 'int') = lt.lt_id


Will return a result

xt_id root_attribute_value node_attribute_value lt_name
----------------------------------------------------------------------
1 test1 1 Benefits
1 test1 4 FINRPTCOMPANY

Reply

#5
> Can anyone give me a good example of when CROSS APPLY makes a difference in those cases where INNER JOIN will work as well?

See the article in my blog for detailed performance comparison:

* [**`INNER JOIN` vs. `CROSS APPLY`**][1]

`CROSS APPLY` works better on things that have no simple `JOIN` condition.

This one selects `3` last records from `t2` for each record from `t1`:

SELECT t1.*, t2o.*
FROM t1
CROSS APPLY
(
SELECT TOP 3 *
FROM t2
WHERE t2.t1_id = t1.id
ORDER BY
t2.rank DESC
) t2o

It cannot be easily formulated with an `INNER JOIN` condition.

You could probably do something like that using `CTE`'s and window function:

WITH t2o AS
(
SELECT t2.*, ROW_NUMBER() OVER (PARTITION BY t1_id ORDER BY rank) AS rn
FROM t2
)
SELECT t1.*, t2o.*
FROM t1
INNER JOIN
t2o
ON t2o.t1_id = t1.id
AND t2o.rn <= 3

, but this is less readable and probably less efficient.


**Update:**

Just checked.

`master` is a table of about `20,000,000` records with a `PRIMARY KEY` on `id`.

This query:

WITH q AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rn
FROM master
),
t AS
(
SELECT 1 AS id
UNION ALL
SELECT 2
)
SELECT *
FROM t
JOIN q
ON q.rn <= t.id

runs for almost `30` seconds, while this one:

WITH t AS
(
SELECT 1 AS id
UNION ALL
SELECT 2
)
SELECT *
FROM t
CROSS APPLY
(
SELECT TOP (t.id) m.*
FROM master m
ORDER BY
id
) q

is instant.


[1]:

[To see links please register here]

Reply

#6
Cross apply can be used to replace subquery's where you need a column of the subquery

subquery

select * from person p where
p.companyId in(select c.companyId from company c where c.companyname like '%yyy%')

here i won't be able to select the columns of company table
so, using cross apply

select P.*,T.CompanyName
from Person p
cross apply (
select *
from Company C
where p.companyid = c.companyId and c.CompanyName like '%yyy%'
) T
Reply

#7
This is perhaps an old question, but I still love the power of CROSS APPLY to simplify the re-use of logic and to provide a "chaining" mechanism for results.

I've provided a SQL Fiddle below which shows a simple example of how you can use CROSS APPLY to perform complex logical operations on your data set without things getting at all messy. It's not hard to extrapolate from here more complex calculations.

[To see links please register here]

Reply

#8
`cross apply` sometimes enables you to do things that you cannot do with `inner join`.

Example (a syntax error):

select F.* from sys.objects O
inner join dbo.myTableFun(O.name) F
on F.schema_id= O.schema_id

This is a **syntax error**, because, when used with `inner join`, table functions can only take *variables or constants* as parameters. (I.e., the table function parameter cannot depend on another table's column.)

However:

select F.* from sys.objects O
cross apply ( select * from dbo.myTableFun(O.name) ) F
where F.schema_id= O.schema_id

This is legal.

**Edit:**
Or alternatively, shorter syntax: (by ErikE)

select F.* from sys.objects O
cross apply dbo.myTableFun(O.name) F
where F.schema_id= O.schema_id

**Edit:**

Note:
Informix 12.10 xC2+ has [Lateral Derived Tables][1] and Postgresql (9.3+) has [Lateral Subqueries][2] which can be used to a similar effect.


[1]:

[To see links please register here]

[2]:

[To see links please register here]

Reply

#9
It seems to me that CROSS APPLY can fill a certain gap when working with calculated fields in complex/nested queries, and make them simpler and more readable.

Simple example: you have a DoB and you want to present multiple age-related fields that will also rely on other data sources (such as employment), like Age, AgeGroup, AgeAtHiring, MinimumRetirementDate, etc. for use in your end-user application (Excel PivotTables, for example).

Options are limited and rarely elegant:

- JOIN subqueries cannot introduce new values in the dataset based on data in the parent query (it must stand on its own).

- UDFs are neat, but slow as they tend to prevent parallel operations. And being a separate entity can be a good (less code) or a bad (where is the code) thing.

- Junction tables. Sometimes they can work, but soon enough you're joining subqueries with tons of UNIONs. Big mess.

- Create yet another single-purpose view, assuming your calculations don't require data obtained mid-way through your main query.

- Intermediary tables. Yes... that usually works, and often a good option as they can be indexed and fast, but performance can also drop due to to UPDATE statements not being parallel and not allowing to cascade formulas (reuse results) to update several fields within the same statement. And sometimes you'd just prefer to do things in one pass.

- Nesting queries. Yes at any point you can put parenthesis on your entire query and use it as a subquery upon which you can manipulate source data and calculated fields alike. But you can only do this so much before it gets ugly. Very ugly.

- Repeating code. What is the greatest value of 3 long (CASE...ELSE...END) statements? That's gonna be readable!

- Tell your clients to calculate the damn things themselves.

Did I miss something? Probably, so feel free to comment. But hey, CROSS APPLY is like a godsend in such situations: you just add a simple `CROSS APPLY (select tbl.value + 1 as someFormula) as crossTbl` and voilà! Your new field is now ready for use practically like it had always been there in your source data.

Values introduced through CROSS APPLY can...

- be used to create one or multiple calculated fields without adding performance, complexity or readability issues to the mix
- like with JOINs, several subsequent CROSS APPLY statements can refer to themselves: `CROSS APPLY (select crossTbl.someFormula + 1 as someMoreFormula) as crossTbl2`
- you can use values introduced by a CROSS APPLY in subsequent JOIN conditions
- As a bonus, there's the Table-valued function aspect

Dang, there's nothing they can't do!
Reply

#10
The essence of the APPLY operator is to allow correlation between left and right side of the operator in the FROM clause.

In contrast to JOIN, the correlation between inputs is not allowed.

Speaking about correlation in APPLY operator, I mean on the right hand side we can put:

- a derived table - as a correlated subquery with an alias
- a table valued function - a conceptual view with parameters, where the parameter can refer to the left side

Both can return multiple columns and rows.
Reply



Forum Jump:


Users browsing this thread:
3 Guest(s)

©0Day  2016 - 2023 | All Rights Reserved.  Made with    for the community. Connected through