Showing posts with label Talend. Show all posts

Sunday, February 05, 2017

Extracting HL7 data using Talend and storing in Cassandra

HL7 is a set of international standards used by clinical and healthcare providers to exchange information. This video explores working with HL7 data: extracting a few sample fields from it and saving them to Cassandra, one of the popular NoSQL databases.
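To give a feel for what "extracting a few sample fields" means outside of Talend, here is a minimal plain-Java sketch. It assumes a pipe-delimited HL7 v2 message; the sample message, the class name `Hl7FieldSketch`, and the column names in the map are made up for illustration and are not taken from the video.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class Hl7FieldSketch {

    // Made-up sample message for illustration only, not real patient data.
    static final String SAMPLE =
        "MSH|^~\\&|SENDER|FACILITY|RECEIVER|DEST|20170205120000||ADT^A01|MSG0001|P|2.3\r"
        + "PID|1||12345^^^HOSP^MR||DOE^JOHN||19800101|M\r";

    // Returns the fields of the first segment with the given ID, or an empty array.
    static String[] segmentFields(String message, String segmentId) {
        // HL7 v2 separates segments with carriage returns; tolerate \n as well.
        for (String segment : message.split("\r\n|\r|\n")) {
            String[] fields = segment.split("\\|", -1);
            if (fields[0].equals(segmentId)) {
                return fields;
            }
        }
        return new String[0];
    }

    // Returns component componentNo (1-based) of field fieldNo, or "" if absent.
    // For non-MSH segments the array index equals the HL7 field number.
    static String component(String[] fields, int fieldNo, int componentNo) {
        if (fieldNo >= fields.length) {
            return "";
        }
        String[] parts = fields[fieldNo].split("\\^", -1);
        return componentNo <= parts.length ? parts[componentNo - 1] : "";
    }

    // Picks a few PID fields: PID-3 (identifier), PID-5 (name), PID-7 (birth date).
    static Map<String, String> extractPatient(String message) {
        String[] pid = segmentFields(message, "PID");
        Map<String, String> row = new LinkedHashMap<>();
        row.put("patient_id", component(pid, 3, 1));
        row.put("family_name", component(pid, 5, 1));
        row.put("given_name", component(pid, 5, 2));
        row.put("birth_date", component(pid, 7, 1));
        return row;
    }

    public static void main(String[] args) {
        System.out.println(extractPatient(SAMPLE));
    }
}
```

Each entry in the resulting map corresponds to one column that could then be written to a Cassandra table. In the video that step is handled by Talend components rather than hand-written code.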

Tuesday, January 17, 2017

Talend Project - Child jobs, Functions and Variables

This concluding part of the Talend project video series explores creating sub/child jobs, passing data between the parent and child jobs, and different strategies for sharing that data.
Code in subroutines
// Code from https://www.youtube.com/watch?v=a7-HUU4js9E
package routines;

public class formatEmails {

    // Ratings are letter grades, so a higher character (e.g. 'C' vs 'A') means a worse rating.
    public static String formatEmail(char oldRating, char newRating, String typeOfRating) {
        String fEmailText = "";
        if (newRating > oldRating) {
            fEmailText = fEmailText + typeOfRating + " rating reduced from " + Character.toString(oldRating) + " to " + Character.toString(newRating) + "<br>";
        }
        if (newRating < oldRating) {
            fEmailText = fEmailText + typeOfRating + " rating improved from " + Character.toString(oldRating) + " to " + Character.toString(newRating) + "<br>";
        }
        return fEmailText;
    }
}


It also explores subroutines/functions, which help you reduce the amount of code and modularize it.
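As a rough illustration of why the routine helps, the sketch below calls one comparison function for all four rating types instead of repeating the same if-blocks four times. It is a standalone approximation, not the job's actual code: the class name `EmailBodyDemo`, the `buildBody` helper, the `<br>` separator, and the sample ratings are all assumptions made for this example.

```java
class EmailBodyDemo {

    // Same comparison idea as the routine: letter grades sort alphabetically,
    // so a larger char ('C' > 'A') is treated as a worse rating.
    static String formatEmail(char oldRating, char newRating, String typeOfRating) {
        if (newRating > oldRating) {
            return typeOfRating + " rating reduced from " + oldRating + " to " + newRating + "<br>";
        }
        if (newRating < oldRating) {
            return typeOfRating + " rating improved from " + oldRating + " to " + newRating + "<br>";
        }
        return ""; // unchanged ratings add nothing to the email
    }

    // Hypothetical helper: oldRatings and newRatings hold one letter per rating type,
    // in the order Growth, Momentum, Value, VGM.
    static String buildBody(String stock, String oldRatings, String newRatings) {
        String[] types = {"Growth", "Momentum", "Value", "VGM"};
        StringBuilder body = new StringBuilder("Ratings for : " + stock + "<br>");
        for (int i = 0; i < types.length; i++) {
            body.append(formatEmail(oldRatings.charAt(i), newRatings.charAt(i), types[i]));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Growth goes A -> B (reduced), Value goes C -> A (improved), the rest are unchanged.
        System.out.println(buildBody("XYZ", "ABCB", "BBAB"));
    }
}
```

Only the ratings that changed end up in the body, which keeps the email short when most scores are stable.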



Check out the detailed video - https://www.youtube.com/watch?v=a7-HUU4js9E

tJavaRow component code
/* -- Code from https://www.youtube.com/watch?v=a7-HUU4js9E ---- */
String wholepage;
String ratings;
wholepage = input_row.document.toString();
int pos = wholepage.indexOf("composite_val");
// Take the 250 characters starting at the first rating, then strip brackets, quotes and the extra VGM class name
ratings = wholepage.substring(pos, pos + 250).replaceAll("[\\[\\]\"]", "").replaceAll(" \n", " ").replaceAll(" composite_val_vgm", "");
String[] splitratings = ratings.split("composite_val>");
int i = 0;
context.EmailText = context.EmailText + "<br><br>Ratings for : " + context.stock + "<br>";

// Each non-empty piece starts with one rating letter, in the order Growth, Momentum, Value, VGM
for (String eachratingrow : splitratings) {
    if (eachratingrow.length() > 0) {
        if (i == 0) {
            output_row.z_growth_rating = Character.toString(eachratingrow.charAt(0));
            context.EmailText = context.EmailText + formatEmails.formatEmail(input_row.growth_rating.charAt(0), eachratingrow.charAt(0), "Growth");
        }
        if (i == 1) {
            output_row.z_momentum_rating = Character.toString(eachratingrow.charAt(0));
            context.EmailText = context.EmailText + formatEmails.formatEmail(input_row.momentum_rating.charAt(0), eachratingrow.charAt(0), "Momentum");
        }
        if (i == 2) {
            output_row.z_value_rating = Character.toString(eachratingrow.charAt(0));
            context.EmailText = context.EmailText + formatEmails.formatEmail(input_row.value_rating.charAt(0), eachratingrow.charAt(0), "Value");
        }
        if (i == 3) {
            output_row.z_vgm_rating = Character.toString(eachratingrow.charAt(0));
            context.EmailText = context.EmailText + formatEmails.formatEmail(input_row.vgm_rating.charAt(0), eachratingrow.charAt(0), "VGM");
        }
        i++;
    }
}
output_row.EmailText = context.EmailText;
/* - End of Code from https://www.youtube.com/watch?v=a7-HUU4js9E --*/

Thursday, January 05, 2017

Talend Project - Send mail (tSendMail component)

Talend's tSendMail component can be used to send HTML-formatted emails. This video demonstrates building HTML-formatted text using the project we have been working on.

This is the code that we used in the tJavaRow component to extract the ratings and create an email body that lists any changes compared to the previous ratings. Please follow the video to get a complete understanding.

/* -- Code from https://www.youtube.com/channel/UCT3bqK2QL93j-IFYFYbvjWQ ---- */
String wholepage;
String ratings;
wholepage = input_row.document.toString();
int pos = wholepage.indexOf("composite_val");
ratings = wholepage.substring(pos, pos + 250).replaceAll("[\\[\\]\"]", "").replaceAll(" \n", " ").replaceAll(" composite_val_vgm", "");
//output_row.document = ratings;
String allratingsonly = "";
String[] splitratings = ratings.split("composite_val>");
int i = 0;
String EmailText = "Ratings for : " + context.stock + "<br>";
for (String eachratingrow : splitratings) {
    if (eachratingrow.length() > 0) {
        allratingsonly = allratingsonly + ";" + eachratingrow.charAt(0) + "";
        //allratingsonly=allratingsonly+eachratingrow+"**;";
        // Ratings are letter grades, so a higher character means a worse rating
        if (i == 0) {
            output_row.z_growth_rating = Character.toString(eachratingrow.charAt(0));
            if (eachratingrow.charAt(0) > input_row.growth_rating.charAt(0)) { EmailText = EmailText + "Growth rating reduced from " + input_row.growth_rating + " To " + eachratingrow.charAt(0) + "<br>"; }
            if (eachratingrow.charAt(0) < input_row.growth_rating.charAt(0)) { EmailText = EmailText + "Growth rating improved from " + input_row.growth_rating + " To " + eachratingrow.charAt(0) + "<br>"; }
        }
        if (i == 1) {
            output_row.z_momentum_rating = Character.toString(eachratingrow.charAt(0));
            if (eachratingrow.charAt(0) > input_row.momentum_rating.charAt(0)) { EmailText = EmailText + "momentum rating reduced from " + input_row.momentum_rating + " To " + eachratingrow.charAt(0) + "<br>"; }
            if (eachratingrow.charAt(0) < input_row.momentum_rating.charAt(0)) { EmailText = EmailText + "momentum rating improved from " + input_row.momentum_rating + " To " + eachratingrow.charAt(0) + "<br>"; }
        }
        if (i == 2) {
            output_row.z_value_rating = Character.toString(eachratingrow.charAt(0));
            if (eachratingrow.charAt(0) > input_row.value_rating.charAt(0)) { EmailText = EmailText + "value rating reduced from " + input_row.value_rating + " To " + eachratingrow.charAt(0) + "<br>"; }
            if (eachratingrow.charAt(0) < input_row.value_rating.charAt(0)) { EmailText = EmailText + "value rating improved from " + input_row.value_rating + " To " + eachratingrow.charAt(0) + "<br>"; }
        }
        if (i == 3) {
            output_row.z_vgm_rating = Character.toString(eachratingrow.charAt(0));
            if (eachratingrow.charAt(0) > input_row.vgm_rating.charAt(0)) { EmailText = EmailText + "vgm rating reduced from " + input_row.vgm_rating + " To " + eachratingrow.charAt(0) + "<br>"; }
            if (eachratingrow.charAt(0) < input_row.vgm_rating.charAt(0)) { EmailText = EmailText + "vgm rating improved from " + input_row.vgm_rating + " To " + eachratingrow.charAt(0) + "<br>"; }
        }
        i++;
    }
}
output_row.EmailText = EmailText;
/* - End of Code from https://www.youtube.com/channel/UCT3bqK2QL93j-IFYFYbvjWQ --*/



If you haven't followed this project from the beginning, here is the first post about it on this blog - http://sanjaykattimani.blogspot.com/2016/12/talend-project-to-parse-webpage-zackscom.html

Saturday, December 17, 2016

Talend project to parse a webpage (Zacks.com)

Created another interesting Talend project over the weekend. This Talend job parses a zacks.com webpage to extract the Zacks scores and then converts them to rows that can be used in other components. The tHTMLParse component used to parse the website is available in Talend's Exchange (marketplace) for free. String manipulation consumed the majority of my time on this project. I intend to extend this project in the future.


Here is the code that goes in the tJavaRow component. It extracts only the ratings from the whole page and returns them as a string separated by semicolons.

/* -- Code from https://www.youtube.com/channel/UCT3bqK2QL93j-IFYFYbvjWQ ---- */ 

String wholepage; 
String ratings; 
wholepage=input_row.document.toString(); 
int pos=wholepage.indexOf("composite_val"); 
ratings=wholepage.substring(pos,pos+250).replaceAll("[\\[\\]\"]", "").replaceAll(" \n", " ").replaceAll(" composite_val_vgm",""); 
//output_row.document = ratings; 
String allratingsonly=""; 
String[] splitratings = ratings.split("composite_val>"); 
for (String eachratingrow : splitratings) 
   if (eachratingrow.length()>0)
   { allratingsonly=allratingsonly+";"+ eachratingrow.charAt(0)+"";     //allratingsonly=allratingsonly+eachratingrow+"**;"; 
    } 
output_row.document=allratingsonly;
/* - End of Code from https://www.youtube.com/channel/UCT3bqK2QL93j-IFYFYbvjWQ --*/
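To try the same string manipulation outside Talend, here is a standalone sketch of the extraction. The HTML fragment in `SAMPLE_PAGE` is hypothetical (the real zacks.com markup may differ and may have changed), and the class and method names are made up for this example. Unlike the component code, it guards against pages shorter than 250 characters and does not emit a leading semicolon.

```java
class RatingExtractSketch {

    // Hypothetical fragment; the actual page structure is an assumption.
    static final String SAMPLE_PAGE =
        "<p>Style Scores</p>"
        + "<span class=\"composite_val\">A</span> Growth | "
        + "<span class=\"composite_val\">C</span> Momentum | "
        + "<span class=\"composite_val\">B</span> Value | "
        + "<span class=\"composite_val composite_val_vgm\">B</span> VGM";

    // Returns the rating letters found after the first "composite_val", joined by ';'.
    static String extractRatings(String wholepage) {
        int pos = wholepage.indexOf("composite_val");
        if (pos < 0) {
            return ""; // no ratings on this page
        }
        // Look at no more than 250 characters, without running past the end of the page.
        int end = Math.min(pos + 250, wholepage.length());
        String ratings = wholepage.substring(pos, end)
            .replaceAll("[\\[\\]\"]", "")          // drop brackets and quotes
            .replaceAll(" composite_val_vgm", ""); // make the VGM span look like the others

        StringBuilder out = new StringBuilder();
        // After the cleanup every rating letter directly follows "composite_val>".
        for (String piece : ratings.split("composite_val>")) {
            if (piece.length() > 0) {
                if (out.length() > 0) {
                    out.append(';');
                }
                out.append(piece.charAt(0));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(extractRatings(SAMPLE_PAGE));
    }
}
```

The fixed 250-character window is carried over from the component code. It works only as long as all four scores sit close together in the page source, so a change in the site's layout would likely need a different approach.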