Showing posts with label ETL. Show all posts

Tuesday, January 17, 2017

Talend Project - Child jobs, Functions and Variables

This concluding part of the Talend project video covers creating sub (child) jobs, passing data between a parent job and its child job, and different strategies for sharing that data.
Code in subroutines
// Code from https://www.youtube.com/watch?v=a7-HUU4js9E
package routines;

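public class formatEmails {

    // Returns one line of email text describing how a rating changed,
    // or an empty string if the rating did not change.
    // A higher character value (e.g. 'C' after 'B') is treated as a reduction.
    public static String formatEmail(char oldRating, char newRating, String typeOfRating) {
        String fEmailText = "";
        if (newRating > oldRating) {
            fEmailText = fEmailText + typeOfRating + " rating reduced from " + Character.toString(oldRating) + " to " + Character.toString(newRating) + "\n"; // line break after each entry
        }
        if (newRating < oldRating) {
            fEmailText = fEmailText + typeOfRating + " rating improved from " + Character.toString(oldRating) + " to " + Character.toString(newRating) + "\n"; // line break after each entry
        }
        return fEmailText;
    }
}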


The video also explores subroutines (functions), which help you reduce duplicated code and make the job more modular.
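To try the routine outside Talend, here is a small standalone sketch. It copies the logic of formatEmail into a hypothetical class named FormatEmailDemo and assumes a plain newline as the separator after each entry; neither the class name nor the separator comes from the video.

```java
// Standalone sketch, not part of the Talend job.
// FormatEmailDemo is a hypothetical name; "\n" as the separator is an assumption.
class FormatEmailDemo {

    // Same comparison logic as the routine: a higher letter means a worse rating.
    static String formatEmail(char oldRating, char newRating, String typeOfRating) {
        String text = "";
        if (newRating > oldRating) {
            text = text + typeOfRating + " rating reduced from " + oldRating + " to " + newRating + "\n";
        }
        if (newRating < oldRating) {
            text = text + typeOfRating + " rating improved from " + oldRating + " to " + newRating + "\n";
        }
        return text; // empty when the rating is unchanged
    }

    public static void main(String[] args) {
        System.out.print(formatEmail('A', 'B', "Growth")); // Growth rating reduced from A to B
        System.out.print(formatEmail('C', 'A', "Value"));  // Value rating improved from C to A
        System.out.print(formatEmail('B', 'B', "VGM"));    // prints nothing
    }
}
```

Because the routine returns an empty string for an unchanged rating, the caller can append the result to the email text unconditionally.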



Check out the detailed video - https://www.youtube.com/watch?v=a7-HUU4js9E

tJavaRow component code
/* -- Code from https://www.youtube.com/watch?v=a7-HUU4js9E ---- */ 
String wholepage;
String ratings;
wholepage = input_row.document.toString();
int pos = wholepage.indexOf("composite_val");
// Take a 250-character window starting at the first rating marker, then strip
// brackets, quotes and the extra VGM class name so every rating looks the same
ratings = wholepage.substring(pos, pos + 250).replaceAll("[\\[\\]\"]", "").replaceAll(" \n", " ").replaceAll(" composite_val_vgm", "");
String[] splitratings = ratings.split("composite_val>");
int i = 0;
// Start a new block in the email for this stock
context.EmailText = context.EmailText + "\n\nRatings for : " + context.stock + "\n";

for (String eachratingrow : splitratings) {
    if (eachratingrow.length() > 0) {
        // The first character of each piece is the rating letter
        if (i == 0) {
            output_row.z_growth_rating = Character.toString(eachratingrow.charAt(0));
            context.EmailText = context.EmailText + formatEmails.formatEmail(input_row.growth_rating.charAt(0), eachratingrow.charAt(0), "Growth");
        }
        if (i == 1) {
            output_row.z_momentum_rating = Character.toString(eachratingrow.charAt(0));
            context.EmailText = context.EmailText + formatEmails.formatEmail(input_row.momentum_rating.charAt(0), eachratingrow.charAt(0), "Momentum");
        }
        if (i == 2) {
            output_row.z_value_rating = Character.toString(eachratingrow.charAt(0));
            context.EmailText = context.EmailText + formatEmails.formatEmail(input_row.value_rating.charAt(0), eachratingrow.charAt(0), "Value");
        }
        if (i == 3) {
            output_row.z_vgm_rating = Character.toString(eachratingrow.charAt(0));
            context.EmailText = context.EmailText + formatEmails.formatEmail(input_row.vgm_rating.charAt(0), eachratingrow.charAt(0), "VGM");
        }
        i++;
    }
}
output_row.EmailText = context.EmailText;
  /* - End of Code from https://www.youtube.com/watch?v=a7-HUU4js9E --*/

Saturday, December 17, 2016

Talend project to parse a webpage (Zacks.com)

I created another interesting Talend project over the weekend. This Talend job parses a zacks.com webpage to extract the Zacks scores and then converts them to rows that can be used in other components. The tHTMLParse component used to parse the website is available for free in Talend's Exchange (marketplace). String manipulation consumed the majority of my time on this project. I intend to extend this project in the future.


Here is the code that goes in the tJavaRow component. It extracts only the ratings from the whole page and returns them as a single string, with the ratings separated by semicolons.

/* -- Code from https://www.youtube.com/channel/UCT3bqK2QL93j-IFYFYbvjWQ ---- */ 

String wholepage; 
String ratings; 
wholepage=input_row.document.toString(); 
int pos=wholepage.indexOf("composite_val"); 
ratings=wholepage.substring(pos,pos+250).replaceAll("[\\[\\]\"]", "").replaceAll(" \n", " ").replaceAll(" composite_val_vgm",""); 
//output_row.document = ratings; 
String allratingsonly=""; 
String[] splitratings = ratings.split("composite_val>"); 
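for (String eachratingrow : splitratings) {
    if (eachratingrow.length() > 0) {
        // Append the first character of each piece (the rating letter), prefixed with a semicolon
        allratingsonly = allratingsonly + ";" + eachratingrow.charAt(0) + "";
        //allratingsonly=allratingsonly+eachratingrow+"**;";
    }
}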
output_row.document=allratingsonly;
/* - End of Code from https://www.youtube.com/channel/UCT3bqK2QL93j-IFYFYbvjWQ --*/
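
The same string manipulation can be tried outside Talend with a standalone sketch like the one below. The class name RatingsExtractorSketch and the sample markup are hypothetical: the sample only imitates the shape the code above expects (a "composite_val" class followed by ">" and the rating letter) and is not copied from zacks.com. The sketch also adds two guards that the tJavaRow code does not have, for a missing marker and for a page shorter than the 250-character window.

```java
// Standalone sketch of the extraction steps; not the tJavaRow code itself.
// RatingsExtractorSketch and the sample markup in main are hypothetical.
class RatingsExtractorSketch {

    static String extractRatings(String wholepage) {
        int pos = wholepage.indexOf("composite_val");
        if (pos < 0) {
            return ""; // added guard: marker not found
        }
        // added guard: do not read past the end of a short page
        int end = Math.min(pos + 250, wholepage.length());
        String ratings = wholepage.substring(pos, end)
                .replaceAll("[\\[\\]\"]", "")          // drop brackets and quotes
                .replaceAll(" \n", " ")
                .replaceAll(" composite_val_vgm", ""); // make the VGM entry look like the others
        StringBuilder allRatingsOnly = new StringBuilder();
        for (String piece : ratings.split("composite_val>")) {
            if (piece.length() > 0) {
                // the first character after the marker is the rating letter
                allRatingsOnly.append(';').append(piece.charAt(0));
            }
        }
        return allRatingsOnly.toString();
    }

    public static void main(String[] args) {
        // Hypothetical markup, shaped like what the code above expects
        String samplePage = "<p><span class=\"composite_val\">A</span> | "
                + "<span class=\"composite_val\">B</span> | "
                + "<span class=\"composite_val\">C</span> | "
                + "<span class=\"composite_val composite_val_vgm\">D</span></p>";
        System.out.println(extractRatings(samplePage)); // prints ;A;B;C;D
    }
}
```

Note that the result starts with a semicolon, as in the original code, so a downstream split on ";" yields an empty first element.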