This is something I have faced a couple of times while adding logic to my mappings, and I have made the same mistake every time. That is why I decided to share it with other developers: avoid using the '!=' (Not Equal) operator in your expressions as much as you can.

Let me quickly put down the real issue I faced using the '!=' operator. It turned out to be an interesting learning experience in understanding how your data is really processed by Informatica's Integration Service.

I had a requirement where I had to populate two targets using two different conditions based on a single column. As an illustration, take the records below:

SID_No  Group_Nm  Amount
1       A         1000
2       B         1500
3       A         2000
4       NULL      1200
5       NULL      1000
6       C         2000
7       C         3000

Now my need for the day is to populate the two targets as follows:

Target_A -> where Group_Nm = 'A'
Target_B -> where Group_Nm != 'A' and Group_Nm != 'B' (expecting all records other than those in Target_A and those in group 'B')

Following that logic, I straight away wrote my expression as below, expecting it to tag all Group 'A' records as Flag=1 and all remaining records with a group name other than 'B' as Flag=2.

IIF(Group_Nm = 'A', 1,
IIF(Group_Nm != 'B', 2, 0))

I was expecting:
SIDs 1 and 3 to be tagged as Flag=1
SIDs 4, 5, 6 and 7 to be tagged as Flag=2

What I actually got in my targets was:
SIDs 1 and 3 tagged as Flag=1, flowing to Target_A
SIDs 6 and 7 tagged as Flag=2, flowing to Target_B

So it was quite clear that the issue was with the records having Group_Nm as NULL; the expression was not handling NULL values as expected. The root cause was the second IIF condition checking Group_Nm != 'B'. Scanning the left-over records 2, 4, 5, 6 and 7, it found only 6 and 7 satisfying the condition. For records 4 and 5, the comparison NULL != 'B' does not evaluate to TRUE (any comparison against NULL returns NULL), so those rows fell through to the default and were flagged as '0'.

Having understood this, I quickly changed the expression to the one below:

IIF(Group_Nm = 'A', 1,
IIF(Group_Nm = 'B', 0, 2))
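
Alternatively, if you do want to keep the '!=' check, you could handle the NULL rows explicitly with ISNULL(); a minimal sketch of that variant:

IIF(Group_Nm = 'A', 1,
IIF(ISNULL(Group_Nm) OR Group_Nm != 'B', 2, 0))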

Either way, with the NULL rows handled, I finally got my results as expected, with the right records flowing to their respective targets. I hope this gave you a good understanding. Please drop in your questions or suggestions, if any.


This was an unexpected bug we found in our code involving a mapping variable, and I could not figure out why the variable value was not being picked up as expected.

The issue came up when I was trying to pass a value from a port to the variable and use it in the workflow. When I checked at the session level, the variable value was different, whereas the same port value flowing to the target looked correct.

After some repeated runs and tests, I was able to catch what had really gone wrong. I had defined a mapping variable, and my only options for the Aggregation setting were 'MAX' or 'MIN', so I went with 'MAX'.

Then came assigning the port value to the variable with SETVARIABLE(). I assumed the value would simply be assigned to the variable on every run. I was completely wrong about that.
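
For context, the assignment was done in an output port of an Expression transformation, roughly along these lines (the port names here are just placeholders for illustration):

OUT_AMOUNT = SETVARIABLE($$Var, AMOUNT)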

As described in the Informatica help content:

 “At the end of a successful session, the PowerCenter Integration Service compares the final current value of the variable to the start value of the variable. Based on the aggregate type of the variable, it saves a final current value to the repository. Unless overridden, it uses the saved value as the initial value of the variable for the next session run.”

Breaking this down for easier understanding: the variable $$Var is first assigned its default value. Then, on a successful run, when we try setting a new value, say the default was $$Var=10 and the new value is 15 with aggregation type 'MAX', the Integration Service compares the new value with the old one, finds the new value greater, and saves $$Var=15.

From that point on, $$Var will not change or be overwritten until a value greater than 15 comes along. That was exactly my issue: the new values were not being assigned as expected with SETVARIABLE().

My solution was to reset the variable to a minimum value before the session starts. The simplest way is to override it through the parameter file (.prm file), so I set it as $$Var=0. Now the incoming value is always the greater one, and the variable always gets the latest value 🙂
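
A minimal sketch of the parameter file entry, where the folder, workflow and session names are placeholders for your own:

[MyFolder.WF:wf_MyWorkflow.ST:s_MySession]
$$Var=0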

Hope this makes for a good understanding. Please put down your questions and comments, if any. Thanks!!

Hello friends, time for some learning. This time I have taken up Informatica variables and parameters, which I am very fond of 🙂

This came up in a development effort my team was involved in, where we needed to pass a value from one task/session to another: a value produced in one session had to be checked and used further along in the process.


For instance, my requirement was to count the records from session_1 and then continue counting the records in session_2 with the same counter (incrementing from where the session_1 count ended), ending up with the total record count. Something like: session_1 processed 100 records, so session_2 has to start counting from 101 onwards. Looks simple, but it is quite a nail-biting and challenging experience when you try to get it done in Informatica!!

Then we came across the 'Pre-Session/Post-Session Variable Assignment' feature in Informatica, which is all about assigning variable values between a session and its parent worklet/workflow and vice versa. Interesting, right? If it is not clear yet, let's go through a step-by-step illustration for a better understanding.

1. All we need to do is define a counter variable/parameter $$v1_Count for session_1; the session then runs and populates $$v1_Count as a usual mapping variable/parameter does.

2. Next, assign this value to a common variable that acts as an intermediary between our two sessions. This is the parent worklet/workflow variable, defined like any other variable under the Edit Workflow/Worklet options. In my case I have the workflow variable $$wf_v_Count defined.

3. Now, in session_1, go to the Components tab and edit the "Post-session variable assignment" option (it can be on success or on failure):

(i) Post-session on success variable assignment
(ii) Post-session on failure variable assignment

and I define it as – "Parent_Workflow/Worklet_Variable = Mapping_Variable/Parameter"

i.e., $$wf_v_Count = $$v1_Count

With this, the $$v1_Count value is available to any task, session or worklet under the same parent, which can pick it up and use it for processing.

4. In our case, for the next session_2, go to the Components tab -> "Pre-session variable assignment" and define an assignment from $$wf_v_Count to that session's mapping variable/parameter $$v2_Count.

So we define it as “Mapping_Variable/Parameter = Parent_Workflow/Worklet_Variable”

i.e., $$v2_Count = $$wf_v_Count

The same workflow variable can be used anywhere in the parent workflow/worklet, in decision tasks or in link conditions. So we have now passed the value from one session to the other, and it can be processed further from there. Hope this was an interesting and helpful piece for you.
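
For example, a link condition between session_1 and session_2 could check the workflow variable directly; a minimal sketch, where the threshold of 100 is just an illustrative value:

$$wf_v_Count > 100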

Do post your comments on any corrections or queries with respect to this. Share this if it can help someone.

Hello friends..

I have been meaning to get to this article for a long time, but kept getting pulled away by other commitments. Finally, it is time to share an interesting topic with all the Informatica nerds.

This is not so much a functional or logical trick I am sharing; it is simply about how I made use of a Sorter without ever meaning to sort anything!

Confused?? OK, without much ado, let's get on with it..

I needed to compare two result sets from the same source (a SQL table) and then update that same table, as part of a data clean-up activity I was involved in. I finished my mapping design and went ahead and ran the session. To my surprise, the two source queries were competing with each other while the insert/update was running on the same target table, putting a lot of pressure on the database; the process hung and the data load literally halted.


Then a bulb glowed (a bit late, though).. the thought was: why can't I hold the data back at one point and open the tap only once all of it has arrived? But how?

As we all know, 'The Integration Service passes all incoming data into the Sorter transformation before it performs the sort operation.' I simply made use of this behaviour and, as you might have guessed by now, placed two Sorters on the two incoming source flows before the join. Only once both sources are fully read (each with its own SQ SQL override) do the downstream transformations run, and finally the data is written back to the same SQL table.

Hope this comes to your rescue too when you find yourself stuck at a similar point!!

Share and help others learn with joy 🙂

When I first came across the Normalizer, all I thought it did was take multiple values spread across several columns in one row and multiply/normalise the source into several rows. Say I have a source table tracking each customer's instalment on a monthly basis, one column per month, and my requirement is to write these out as separate transactions to the target; normalising through this transformation comes in handy here.

All we might know –

So my assumption was that the limit of this transformation is just normalising de-normalised data, but I then got to know that we can make good use of some more of its features. The Normalizer tab in the transformation is where we define the ports that pass through, and in doing so we also define the 'Occurs' value (the occurrence count); this is where we specify the column that has to be normalized from multiple columns into rows. In our instance, I define the monthly amount column with Occurs set to 12.

Understanding the GCID and GK ports –

Now my Normalizer has 12 input ports, and the single output port for them produces 12 rows with different amount values while the other columns repeat the same values. I also need to keep track of the months, so I would like one extra column, say Month_Name, carrying the month. How do I do this? Quite simple: in the Ports tab, along with the 12 Month_Amount ports, a new port is auto-created as 'GCID_(the recurring column name)'. It holds an integer value (the Generated Column ID), cycling through 1 to 12 for each amount, corresponding to the month.

Now all we have to do is pull this port into the next Expression transformation and write an expression that derives the month name from the GCID value, as sketched below. There is also another port created by default in the Normalizer for the multiple-occurring column, GK_{column_name}. This is the Generated Key column, which can be used if you need a unique key value.
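
A minimal sketch of such an expression for a Month_Name output port, assuming the GCID port is named GCID_Month_Amount:

DECODE(GCID_Month_Amount,
1, 'January',
2, 'February',
3, 'March',
4, 'April',
5, 'May',
6, 'June',
7, 'July',
8, 'August',
9, 'September',
10, 'October',
11, 'November',
12, 'December',
'Unknown')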

VSAM and the Pipeline Type:

The Normalizer I have used so far is only the pipeline Normalizer, as in the example we just discussed. There is also a type known as the VSAM Normalizer, which acts like a Source Qualifier for a COBOL source (a COBOL VSAM source). The VSAM COBOL source gives de-normalised data, which is then normalised through the VSAM Normalizer. The VSAM Normalizer does not allow you to edit the ports, and its Normalizer tab is read-only. A VSAM Normalizer transformation has one input port for a multiple-occurring column, unlike the pipeline Normalizer, which has multiple input ports for the multiple-occurring value.

I have not yet had a chance to get my hands wet with the VSAM type; I hope to some day soon and shall update with many more such experiences. Until then, take care and happy learning 🙂

When I was in need of a guide to help me get a bulk import/export done, I was stuck: I could not find one good article that covered it end to end. It was all in bits and pieces, hard to pull together in one place. So let me take it up and put it down here; hope this helps with a better understanding.


We had some data in a flat file populated by the ETL (Informatica) tool, and now I had to load that data into my SQL Server database table as a bulk load. For bulk loading, the BCP utility comes in handy, allowing bulk loads from a data file into a SQL table and vice versa.

BCP has both import and export modes to move data into and out of a SQL table. The standard BCP command syntax is as follows:

bcp {db_name.schema_name.Table_Name} in/out {File_Path\file_name.txt} -S {server_name} -t 'field_terminator' -T

Here we have:
-S -> server name
-t -> field terminator (for example "\t" for tab-delimited, "," for comma-delimited)
in/out -> 'in' to import from the data file into the SQL table, 'out' to export from the SQL table to a data file
-T -> use a trusted connection (for a non-trusted connection, specify '-U' user_name and '-P' password)
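
To put it all together, a hypothetical import of a comma-delimited file might look like the command below (the database, table, file and server names are placeholders; note the added -c switch for character-mode data, which keeps bcp from prompting for each column's format):

bcp SalesDB.dbo.Customer in "C:\data\customer.txt" -S MYSQLSERVER -c -t "," -T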

This simple bcp command can help you load data faster and in an elegant way.

There is one add-on feature, a format file, that you can define for the bcp command. It makes the mapping between the table and the data file much tighter and provides a flexible way of writing data files that requires little or no editing to comply with other data formats.

We need to create the format file before loading the data with the bcp command. The syntax is as follows:

bcp {db_name.schema_name.Table_Name} format nul -c -f {Format_File_Path\file_name.fmt} -S {server_name} -T

format nul -f -> defines the path and name of the format file to be created (a .fmt or .xml file)
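
Continuing the same hypothetical example, you could first generate the format file and then run the load with -f (instead of -c and -t), so the file layout is driven entirely by the format file:

bcp SalesDB.dbo.Customer format nul -c -f "C:\data\customer.fmt" -S MYSQLSERVER -T
bcp SalesDB.dbo.Customer in "C:\data\customer.txt" -f "C:\data\customer.fmt" -S MYSQLSERVER -T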

The format file can be either XML or non-XML; here I go with a non-XML format file, which has a structure that looks like the one below.

[Image: sample non-XML (.fmt) format file structure]
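
In case the screenshot is not visible, here is a rough sketch of a non-XML format file for a hypothetical three-column table (the version number on the first line and the lengths/collations depend on your SQL Server release and table definition):

12.0
3
1       SQLCHAR       0       12      ","       1     SID_No       ""
2       SQLCHAR       0       50      ","       2     Group_Nm     SQL_Latin1_General_CP1_CI_AS
3       SQLCHAR       0       30      "\r\n"    3     Amount       ""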

Hope this helps you get your bulk loads going between a data file and a SQL table. Learn and share 🙂

I was just curious to write about this, something not everyone may be aware of. It can be really useful in finding the real bug in your Informatica code. This one small thing in Informatica has a lot to say. When an error is thrown while running your Informatica session and a row is written to the bad file, you might wonder how the bad file can help, or why we even need a bad file just to see a rejected row in it!!

The combination of the session log and the reject (bad) file can be a life saver for any Informatica developer. The reject file is saved at the path $PMBadFileDir (by default, a bad-file directory under your Informatica root folder).

So, for all those who are not aware, and for those who are aware but are missing some clarity on this: when a row is rejected from being inserted into the target and lands in the bad file, it is mostly due to a data type mismatch, a NULL value being inserted into a non-nullable target column, or other such data issues. How do we track what is what here??

That is where the bad file comes into the picture to help us out. There are two kinds of indicators in a bad file: the row indicator and the column indicator. The row indicator describes the whole row, while the column indicator describes each column value, telling whether the data is valid or invalid.

Tip: for clear and easy reading of the bad file, save it as a CSV file and open it.

Row Indicator:

The first column in the reject file is the row indicator, which tells whether the row was marked for insert, update, delete or reject. It is basically a flag reflecting the update strategy of the data row. When the commit type of the session is configured as "User-defined", the row indicator also tells whether the transaction was rolled back due to a non-fatal error, or whether the committed transaction failed.

[Image: row indicator values in the reject file]
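
For quick reference, the commonly documented row indicator values are listed below (do confirm against the help content of your PowerCenter version):

0 -> Insert
1 -> Update
2 -> Delete
3 -> Reject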

Column Indicator:

Next comes the column indicator: each column value in the bad file is accompanied by its own column indicator. It describes the column data that was being inserted/updated into the target and may have been rejected. The column indicator can take several values, as shown below.

[Image: column indicator values in the reject file]
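
Again for quick reference, the commonly documented column indicator values are (confirm against your version's help content):

D -> Valid data
O -> Overflowed numeric data
N -> Null value
T -> Truncated string data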

 

Hope this helps. Keep sharing, as knowledge acquired has its true value only when it is shared 🙂
