join - SSIS Merge Join component wrote 0 rows

Question

First of all, thanks to the community for the amount of information on the site, helped me a lot with C# and SSIS. The second thing is that i'm not very good with english, so please be patient, if you don't understand something, please ask, i'll try to make it better.

I got 2 OLEDB connection source from different databases, both tables got a column with an ID that I use as a Join Key. In RUT CRUZADOS, the ID its a float datatype, while in the other source (CTACTE AÑO PAS) I don't know which type of data it is (I can't open the database with sql server, i can only do SELECT operations).

When I combine them in the Merge,it doesn't return me any mistake, but when I run the program, this happens.

[SSIS.Pipeline] Information: "component "CARGOS ABONOS" (239)" wrote 0 rows.

In Microsoft Access, the "Inner Join" returns like 4 Millions of rows. I think the problem its the metadata but i dont know how to use the "Data Conversion". Can someone help me please.

Thank you all

score 3 · Accepted Answer

You can view the data types, at least as far as SSIS is concerned by double clicking on connector lines. In the Data Flow Path Editor that pops up, the Metadata tab will describe the column types.

That said, it doesn't matter because the Merge Join transformation is only going to allow you to merge data of the same type.

A Merge Join requires the source system data to be sorted. This is accomplished by either adding sort components into the stream (not recommended as this is an asynchronous transform that eats all your memory and kills your performance) or by explicitly sorting in your source systems and then marking them as sorted in the Advanced tab.

Since I don't see a Sort, that leads me to believe the sort is done in the source systems. Or, the sorts are not done there but someone has marked the output as sorted. There must be explicit ORDER BY clauses in those source queries. Sometimes, SQL Server will return data in the same order but unless there is an ORDER BY, it cannot be guaranteed. (I wish I could use the flash tag to emphasis the last point).

Future readers, if you have a sort in both systems and they are both sorted on the same column, then you need to examine collations. Case Insensitive is a different beast than Case Sensitive and a sort on an ASCII based system yields a different sort than one using EBCIDIC for mixed alpha-numeric like I once had...

As the source data type appears to be floats, then sorting is not the likely culprit. The realization is dawning on me, instead of sort issues, you likely have an uglier and more insidious comparison issue. Floating point numbers are approximations. 1=1 but 1.00000000000(etc) may or may not be equal to 1.0000000000(etc)1

Do you actually need the decimal places to make the match? If not, casting to an integer in both (and sorting on the CAST'ed value) systems should make these matches work. If there are decimal places that matter, then you're going to need to cast that into an exact numeric type (and pray that they both convert in the same way). The fact that Access does it leads me to believe Integer data type will be your salvation.

join - SSIS Merge Join component wrote 0 rows

1 に答える 1

Related

Reference