| Author | 
		  Message
		 | 
		
		  | nk | 
		  
		    
			  
				 Posted: Thu Aug 30, 2012 5:57 am    Post subject: processing csv file | 
				     | 
			   
			 
		   | 
		
		
		   Novice
 
 Joined: 05 Jul 2012 Posts: 19
  
  | 
		  
		    
			  
				I'm processing a CSV file and converting it into TXT format. To eliminate duplicate entries based on a comparison of certain fields, I need to create a record after mapping all the fields, and then check that record for duplicate entries.
 
 
I've done the mapping part but don't have any idea how to proceed further. | 
			   
			 
		   | 
		
		
		
		
		  | lancelotlinc | 
		  
		    
			  
				 Posted: Thu Aug 30, 2012 6:00 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Jedi Knight
 
 Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA 
  | 
		  
		    
		   | 
		
		
		
		
		  | mqjeff | 
		  
		    
			  
				 Posted: Thu Aug 30, 2012 6:07 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		   Grand Master
 
 Joined: 25 Jun 2008 Posts: 17447
  
  | 
		  
		    
			  
				Clearly you need to write code that deletes duplicates. 
 
 
This means you need to know how to identify a given record, and then determine if that record is a duplicate of another record or not.
 
 
If your records are sorted, you can tell that the current record is a duplicate because it is the same as the previous record. | 
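To illustrate the sorted-records idea, here is a minimal sketch in Python. It is meant only to show the logic, not as ESQL for a Compute node; the function name and the sample values are made up for the example.

```python
def drop_adjacent_duplicates(sorted_records, key=lambda record: record):
    """Keep the first record of each run of equal keys in already-sorted input."""
    kept = []
    for record in sorted_records:
        # In sorted input, a duplicate always sits right after its twin,
        # so comparing with the last record kept is enough.
        if kept and key(kept[-1]) == key(record):
            continue
        kept.append(record)
    return kept


# Hypothetical sample data, sorted first so that duplicates become adjacent.
records = sorted(["B,2", "A,1", "B,2", "C,3", "A,1"])
print(drop_adjacent_duplicates(records))  # ['A,1', 'B,2', 'C,3']
```

The `key` argument is there because "duplicate" usually means "same values in certain fields", not necessarily an identical record.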
			   
			 
		   | 
		
		
		
		
		  | kimbert | 
		  
		    
			  
				 Posted: Thu Aug 30, 2012 6:10 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Jedi Council
 
 Joined: 29 Jul 2003 Posts: 5543 Location: Southampton 
  | 
		  
		    
			  
				@nk: The normal way to process CSV in Message Broker depends on the version that you are using:
 
- v6/v7 : Use the MRM parser
 
- v8 : Use the DFDL parser. There is a CSV wizard that will generate the correct DFDL schema for you. 
 
Either way, once you have a message tree you can do whatever you like with it. 
 
I did not understand your description of the de-duplication logic. If you want help with that part then you will need to explain the requirements in more detail.
 
 
 
@lancelotlinc: Why is a Java Compute node the correct answer? | 
			   
			 
		   | 
		
		
		
		
		  | lancelotlinc | 
		  
		    
			  
				 Posted: Thu Aug 30, 2012 6:17 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Jedi Knight
 
 Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA 
  | 
		  
		    
			  
				
   
	| kimbert wrote: | 
   
  
	| @lancelotlinc: Why is a Java Compute node the correct answer? | 
   
 
 
 
It's not the only answer. 
 
 
In response to:
 
 
   
	| nk wrote: | 
   
  
	| [I] don't [have] any idea how to proceed further. | 
   
 
 
 
It's a suggested way to move forward. | 
			   
			 
		   | 
		
		
		
		
		  | mqjeff | 
		  
		    
			  
				 Posted: Thu Aug 30, 2012 6:19 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		   Grand Master
 
 Joined: 25 Jun 2008 Posts: 17447
  
  | 
		  
		    
			  
				
   
	| lancelotlinc wrote: | 
   
  
	
   
	| kimbert wrote: | 
   
  
	| @lancelotlinc: Why is a Java Compute node the correct answer? | 
   
 
 
 
It's not the only answer. 
 
 
In response to:
 
 
   
	| nk wrote: | 
   
  
	| [I] don't [have] any idea how to proceed further. | 
   
 
 
 
It's a suggested way to move forward. | 
   
 
 
 
I agree, there's no *correct* answer to "How do I move forward". 
 
 
There's a process to follow, not an answer to be provided. | 
			   
			 
		   | 
		
		
		
		
		  | lancelotlinc | 
		  
		    
			  
				 Posted: Thu Aug 30, 2012 6:24 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Jedi Knight
 
 Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA 
  | 
		  
		    
			  
				
   
	| mqjeff wrote: | 
   
  
	| There's a process to follow, not an answer to be provided. | 
   
 
 
 
Which is really at the heart of the OP's dilemma. Postulate a possible routine that would de-duplicate, code it, test it, and modify the code based on the test results. | 
			   
			 
		   | 
		
		
		
		
		  | nathanw | 
		  
		    
			  
				 Posted: Thu Aug 30, 2012 6:28 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Knight
 
 Joined: 14 Jul 2004 Posts: 550
  
  | 
		  
		    
			  
				A million monkeys + a million typewriters = the works of Shakespeare.
 
 
The way forward is mainly determined by the developer's area of expertise. | 
			   
			 
		   | 
		
		
		
		
		  | nk | 
		  
		    
			  
				 Posted: Thu Aug 30, 2012 11:01 pm    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		   Novice
 
 Joined: 05 Jul 2012 Posts: 19
  
  | 
		  
		    
			  
				@lancelotlinc: I have to use only a Compute node. 
 
@kimbert: I'm using V7 with the MRM parser.
 
For example, the TXT records I get after mapping are:
 
 
10 TSI SRQ  10   A  2012-09-122012-01-02  2   SHZ  SPQ
 
20 API SPQ  12   B  2012-12-122012-01-20  3   SHZ  SLQ
 
30 TST LTQ  21   L   2012-02-122012-01-31  4   SHZ  SHZ
 
40 TLI RNQ  55   D   2012-08-122012-08-05  5  SHZ  TPM
 
 
In the above, the 3rd row has its last two columns equal (SHZ  SHZ), so I need to skip that row. The subsequent row 4 should then have the value 4 in the preceding column instead of 5.
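The rule described above could be sketched roughly like this. It is Python used purely to show the logic, not ESQL for a Compute node. The sketch assumes whitespace-separated fields, that the counter is the third field from the end, and that a row is dropped when its last two fields are equal; the function name is hypothetical.

```python
def skip_and_renumber(rows):
    """Drop rows whose last two fields are equal and renumber the counter field."""
    kept = []
    next_seq = None
    for row in rows:
        fields = row.split()
        if next_seq is None:
            # Assumption: numbering continues from the counter of the first row.
            next_seq = int(fields[-3])
        if fields[-1] == fields[-2]:
            continue  # last two columns match, so the row is skipped
        fields[-3] = str(next_seq)  # close the gap left by skipped rows
        next_seq += 1
        kept.append(fields)
    return kept


rows = [
    "10 TSI SRQ  10   A  2012-09-122012-01-02  2   SHZ  SPQ",
    "20 API SPQ  12   B  2012-12-122012-01-20  3   SHZ  SLQ",
    "30 TST LTQ  21   L   2012-02-122012-01-31  4   SHZ  SHZ",
    "40 TLI RNQ  55   D   2012-08-122012-08-05  5  SHZ  TPM",
]
for fields in skip_and_renumber(rows):
    print(fields[0], fields[-3], fields[-2], fields[-1])
```

With the four sample rows, the row starting with 30 is dropped and the row starting with 40 ends up with counter 4. Note that splitting on whitespace loses the original column widths, so a fixed-width output would need to be re-padded.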
 
 
 
@mqjeff: How do I sort the MRM record? | 
			   
			 
		   | 
		
		
		
		
		  | mqsiuser | 
		  
		    
			  
				 Posted: Fri Aug 31, 2012 12:05 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Yatiri
 
 Joined: 15 Apr 2008 Posts: 637 Location: Germany 
  | 
		  
		    
			  
				
   
	| nk wrote: | 
   
  
	| @mqjeff : How to sort the MRM record? | 
   
 
 
You parse the records (as kimbert explained).
 
 
From here on you work on the resulting logical (type) tree, which is relatively independent of the parser (e.g. MRM or DFDL):
 
 
I am providing a quicksort that you can use to pre-process the records and then (easily) remove duplicates. That will produce sorted output, though, which probably doesn't matter but also isn't what you really want.
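To make the trade-off concrete, here is a small Python sketch that contrasts the two approaches. It is only an illustration of the logic, not ESQL; it uses Python's built-in `sorted` in place of a quicksort, and the function names and sample rows are invented for the example.

```python
def dedupe_sorted(records, key):
    """Sort first, then drop neighbours with the same key (output is sorted)."""
    result = []
    for record in sorted(records, key=key):
        if not result or key(result[-1]) != key(record):
            result.append(record)
    return result


def dedupe_keep_order(records, key):
    """Remember the keys already seen (output keeps the input order)."""
    seen = set()
    result = []
    for record in records:
        k = key(record)
        if k not in seen:
            seen.add(k)
            result.append(record)
    return result


def first_field(row):
    return row[0]


rows = [("30", "TST"), ("10", "TSI"), ("30", "TST"), ("20", "API")]
print(dedupe_sorted(rows, first_field))
# [('10', 'TSI'), ('20', 'API'), ('30', 'TST')]
print(dedupe_keep_order(rows, first_field))
# [('30', 'TST'), ('10', 'TSI'), ('20', 'API')]
```

The second variant needs extra memory for the set of keys, but it leaves the surviving records in their original order.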
 
 
There are at least 3 ways (that I see) to remove duplicates.
 
 
You should probably try the "move ref where" function(s) first.
 
 
Of course, you might need a different, custom solution. | 
			   
			 
		   | 
		
		