
Pollyanna Document Classifier

Published on: Mar 4, 2016
Published in: Technology      Education      
Source: www.slideshare.net


Transcripts - Pollyanna Document Classifier

  • 1. Pollyanna<br />A machine learning system for classifying product pages on the Internet<br />
  • 2. What is Pollyanna?<br />Pollyanna is a Machine Learning System that uses ‘Supervised Learning’ techniques to associate words and categories quantitatively, based on the examples in the training set.<br />The training system is programmed to interpret the association between words and categories using theories in probability and statistics.<br />It applies this training knowledge to classify documents, based on the text they contain, using a ‘linear classifier’ function.<br />
  • 3. What does Pollyanna do?<br />It reads the text in the product pages of Internet merchants and retailers,<br />Quantitatively associates the words in the title, meta and body tags with the product categories in its taxonomy, and<br />Predicts the top 3 categories to which the products in the product page may belong<br />
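As a rough illustration of the reading step described in slide 3, the Python sketch below collects lowercase word tokens from the title, meta and body parts of a page using only the standard library. The class name `PageTextExtractor`, the tokenizing regular expression and the sample page are assumptions made for illustration. The architecture slide mentions Perl, and the slides do not show how the actual system parses pages.

```python
import re
from html.parser import HTMLParser


class PageTextExtractor(HTMLParser):
    """Collect word tokens from <title>, <meta content="..."> and <body> text."""

    def __init__(self):
        super().__init__()
        self.words = {"title": [], "meta": [], "body": []}
        self._open = []  # currently open <title>/<body> tags

    def _add(self, field, text):
        # Hypothetical tokenizer: lowercase runs of letters, digits, apostrophes.
        self.words[field].extend(re.findall(r"[a-z0-9']+", text.lower()))

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            content = dict(attrs).get("content")
            if content:
                self._add("meta", content)
        elif tag in ("title", "body"):
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag in self._open:
            self._open.remove(tag)

    def handle_data(self, data):
        if "title" in self._open:
            self._add("title", data)
        elif "body" in self._open:
            self._add("body", data)


# Hypothetical product page used only to exercise the sketch.
page = (
    "<html><head><title>Oxford Shoes</title>"
    '<meta name="description" content="Leather shoes"></head>'
    "<body><p>Classic Oxford</p></body></html>"
)
extractor = PageTextExtractor()
extractor.feed(page)
print(extractor.words)
```

Keeping the three fields separate is a design choice of this sketch. It would allow title, meta and body words to be weighted differently, although the slides do not say whether Pollyanna does so.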
  • 4. The Context<br />What is Pollyanna’s business context?<br />
  • 5. The Comparison Shopping Engine (CSE) Eco-system<br />
  • 6. The Process<br />Comparison Shopping Engine<br />Internet Shopper<br />Internet Retailer<br />
  • 7. Sample Product Taxonomy for Classification<br />
  • 8. Classification<br />Classification of Retailer’s offers is a critical process for most Comparison Shopping Sites<br />Classification enables a focused search for a product within a specific product category <br />
  • 9. Efficiency of existing classification methods<br />Current classification algorithms in the comparison shopping space are approximately 65% accurate<br />About 10% of merchant offers are manually classified<br />About 10% of merchant offers are always misclassified<br />
  • 10. Problem Definition<br />How can merchant/retailer offers be classified accurately at the lowest cost?<br />
  • 11. The Solution<br />The Pollyanna System<br />
  • 12. A fresh perspective of the process and inputs<br />
  • 13. A new viewpoint on support vectors in a machine learning system<br />
  • 14. Pollyanna’s current one-dimensional relationship analysis<br />
  • 15. Example of the one-dimensional relationship<br />
  • 16. Conditional Probability<br />Conditional probability is the probability of some event A, given the occurrence of some other event B. Conditional probability is written as P(A|B), and is read as "the probability of A, given B".<br />Bayes' theorem provides the equation for conditional probability, which can be stated as:<br />$P(A \mid B) = \dfrac{P(B \mid A)\,P(A)}{P(B)}$<br />which can also be written as:<br />$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$<br />
  • 17. Conditional Probability<br />Data from Pollyanna<br />In this example CP = 195/823<br /> CP = 0.2369380316<br />
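The arithmetic in slide 17 can be reproduced with a few lines of Python. The function name and the reading of the two counts (195 as the number of documents in which both events occur, 823 as the number in which the conditioning event occurs) are assumptions; the slide gives only the fraction.

```python
def conditional_probability(joint_count, condition_count):
    """Estimate P(A|B) as count(A and B) / count(B)."""
    if condition_count <= 0:
        raise ValueError("condition_count must be positive")
    return joint_count / condition_count


# Counts taken from slide 17; what they count is an assumption.
cp = conditional_probability(195, 823)
print(cp)
```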
  • 18. Risk Ratio<br />Example from cohort studies in medicine.<br />
    | | Congestive Heart Failure | No CHF | Total |
    |---|---|---|---|
    | High Systolic BP | 400 | 1100 | 1500 |
    | Normal BP | 400 | 2600 | 3000 |
  • 19. Risk Ratio<br />Data from Pollyanna<br />
    | | Men’s Shoes | Not Men’s Shoes | Total |
    |---|---|---|---|
    | Document contains “Oxford” | 738 | 808 | 1546 |
    | Does not contain “Oxford” | 29689 | 415988 | 445677 |
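The risk ratio behind slides 18 and 19 is the risk in the exposed group divided by the risk in the unexposed group. The sketch below applies that standard definition to both sets of counts. The function name and the choice of which row counts as "exposed" (high systolic BP, or containing the word “Oxford”) are assumptions, and the slides do not state whether Pollyanna uses this exact quantity as a weight.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Slide 18: congestive heart failure by blood pressure group.
rr_chf = risk_ratio(400, 1500, 400, 3000)

# Slide 19: "Men's Shoes" by presence of the word "Oxford".
rr_oxford = risk_ratio(738, 1546, 29689, 445677)

print(rr_chf, rr_oxford)
```

For the cohort example the ratio works out to 2, meaning the high systolic BP group has twice the risk. For the Pollyanna data it is a little over 7, meaning a document containing “Oxford” is far more likely to be in Men’s Shoes than one that does not.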
  • 20. Pollyanna is a Linear Classifier<br />If the input feature vector to the classifier is a real vector x, then the output score is<br />$y = f(\mathbf{w} \cdot \mathbf{x}) = f\left(\sum_j w_j x_j\right)$<br />where w is a real vector of weights and f is a function that converts the scalar product of the two vectors into the desired output.<br />
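A minimal sketch of such a linear classifier, assuming sparse word features stored in dictionaries and an identity function for f, might look like the following. The category names, the weight values and the helper names `linear_score` and `top_categories` are hypothetical. Only the idea of scoring with a weighted sum and returning the top 3 categories comes from the slides.

```python
def linear_score(weights, features, f=lambda s: s):
    """Return f(w . x) for sparse vectors stored as dicts keyed by word."""
    dot = sum(w * features.get(word, 0.0) for word, w in weights.items())
    return f(dot)


def top_categories(category_weights, features, k=3):
    """Score every category and return the k highest-scoring names."""
    scores = {
        category: linear_score(weights, features)
        for category, weights in category_weights.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical per-category weight vectors (not taken from the slides).
category_weights = {
    "Men's Shoes": {"oxford": 7.0, "leather": 2.0},
    "Women's Shoes": {"leather": 2.0, "heel": 5.0},
    "Books": {"oxford": 1.5, "dictionary": 6.0},
    "Laptops": {"battery": 4.0},
}

# Feature vector for a page containing the words "oxford" and "leather".
features = {"oxford": 1.0, "leather": 1.0}
print(top_categories(category_weights, features))
```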
  • 21. Solution Statement<br />Pollyanna is a Machine Learning System that uses new processes, inputs and statistical theories to provide highly accurate automated classification (87% ± 3%).<br />Other classification algorithms in the E-Commerce space depend on retailers’ data feeds, are less accurate (approximately 65%) and must be supported by manual classification.<br />We have assembled a highly accurate, cost-effective classification system that does not require ongoing manual support.<br />
  • 22. Pollyanna Demo<br />
  • 23. Architecture<br />[Architecture diagram; labels: Internet Cloud, Training Module, Front End Tool, User/Client, Perl]<br />
  • 24. Pollyanna can be applied to predictive analytics in online payment fraud<br />If the following conditions are met:<br />
  • 25. The problem must be clearly defined in terms of:<br />Input: type of data (integer, string, floating point, Boolean) and file type (XML, delimited, database)<br />Process: human intelligence and any other methods or procedures required for arriving at a decision, prediction or forecast<br />Output: all possible decisions/outcomes<br />Examples: bucketing a transaction into a fraud risk category; forecasting fraud losses on completed transactions<br />
  • 26. Historical data should be available<br />Reliable data: data is sufficiently complete and error free<br />Valid data: data actually represents what you think is being measured<br />Sufficient data: data is adequate to support the outcome of the process or the decision<br />Spatial data<br />Time series data<br />
  • 27. Data should yield a binomial probability distribution for each attribute<br />Example<br />A key attribute of an online transaction is whether the “IP” address and the physical address of the credit card holder are located in the same country.<br />Two outcomes are possible for this attribute:<br />The “IP” address and the physical address are located geographically in the same country<br />The “IP” address and the physical address are not located geographically in the same country<br />Continued in the next slide<br />
  • 28. Data should yield a binomial probability distribution for each attribute<br />Example – Continued from previous slide<br />The machine learning system is supplied with 200 online payment transactions received in the previous year.<br />The machine learning system should be able to determine, for each possible outcome, the number of Yes and No events observed.<br />Example: For the outcome “The IP address and the physical address are located geographically in the same country” – 20 Yes and 180 No<br />
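The Yes/No tally described in this example can be sketched as follows. The record layout, the attribute name `same_country` and the helper `yes_no_counts` are assumptions, and the 200 records are synthetic data shaped to match the slide's 20 Yes and 180 No.

```python
from collections import Counter


def yes_no_counts(transactions, attribute):
    """Count how often a Boolean attribute is observed as Yes (True) or No (False)."""
    counts = Counter(bool(t[attribute]) for t in transactions)
    return {"Yes": counts[True], "No": counts[False]}


# Synthetic records matching the slide's example: 20 Yes and 180 No.
transactions = [{"same_country": True}] * 20 + [{"same_country": False}] * 180

counts = yes_no_counts(transactions, "same_country")
# Observed proportion of Yes events, the p of the binomial distribution.
p_yes = counts["Yes"] / len(transactions)
print(counts, p_yes)
```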
  • 29. How will the support vector be calculated in the context of online payment transactions?<br />Illustration with a hypothetical case<br />
  • 30. Support Vector Computation Example<br />To simplify the problem let us say that every transaction has to be bucketed into one of the two classes:<br />A genuine transaction<br />A fraudulent transaction<br />The training module’s goal is to calculate the relationship - between an attribute of a transaction and each of the classes mentioned above - which is the ‘Support Vector’<br />
  • 31. Support Vector Computation Example<br />The training module is supplied with 200 sample transactions (historical data) representing the population<br />Of the 200 transactions 20 are fraudulent and 180 are genuine<br />A key attribute of the transaction is: The IP address and the physical address of the credit card holder are not located geographically in the same country. Of the 200 transactions 40 had the above attribute and 160 did not have the above attribute. Let us call the above attribute ‘X’.<br />The training module will analyze the data and arrive at the following matrix:<br />
  • 32. Support Vector Computation Example<br />Association between Fraudulent Transaction and Attribute ‘X’<br />
  • 33. Support Vector Computation Example<br />Applying a synthesis of theories in probability and statistics, the support vector is calculated as 4.040816.<br />The support vector is a measure of the relationship between a Fraudulent Transaction and the attribute: “The IP address and the physical address of the credit card holder are not located geographically in the same country”.<br />
  • 34. How will the machine learning system forecast fraud loss?<br />Illustration with a hypothetical case<br />
  • 35. Forecasting Fraud Loss<br />The problem:<br />To forecast the value of losses on all fraudulent credit card payment transactions that have been successfully executed in a given month<br />There are two steps to doing this:<br />Step 1: Determine whether each transaction is fraudulent or not based on attributes of the transaction<br />Step 2: Sum the values of the fraudulent transactions to arrive at the forecast of loss for that month<br />
  • 36. Forecasting Fraud Loss – Step 1: Determining whether a transaction is fraudulent or not<br />Let us hypothetically say that there are two outcomes for each transaction: it is either a Fraudulent transaction or a Genuine transaction.<br />For each outcome the following linear function is applied:<br />$y = f(\mathbf{w} \cdot \mathbf{x})$<br />Refer to slide 20 for a brief explanation of the function<br />
  • 37. Forecasting Fraud Loss – Step 1: Determining whether a transaction is fraudulent or not<br />The linear function is applied to the observed attributes of the transaction (X vector), weighted by the Support Vector (W vector) calculated in the training module<br />For our example there are 2 outcomes for each transaction – Fraudulent or Genuine<br />For every transaction, the linear function gives a value for each outcome, and the prediction favors the outcome with the higher value<br />
  • 38. Forecasting Fraud Loss – Step 2: Summing the values of all fraudulent transactions<br />Step 1 is performed on each transaction in a given period<br />The values of fraudulent transactions are totaled to arrive at the forecast of losses due to fraud in a given period<br />
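The two steps can be put together in a short sketch. Everything here is hypothetical except the value 4.040816, which is the weight quoted in slide 33 for the country-mismatch attribute. The other weights, the feature names, the transaction amounts and the function names are assumptions chosen only to make the example run.

```python
# Hypothetical weight vectors per outcome; only 4.040816 comes from slide 33.
WEIGHTS = {
    "fraudulent": {"ip_country_mismatch": 4.040816, "ip_country_match": 0.2},
    "genuine": {"ip_country_mismatch": 0.5, "ip_country_match": 1.1},
}


def score(weights, features):
    """Linear function w . x over sparse feature dicts (f is the identity)."""
    return sum(w * features.get(name, 0.0) for name, w in weights.items())


def predict(features):
    """Step 1: return the outcome whose linear score is higher."""
    return max(WEIGHTS, key=lambda outcome: score(WEIGHTS[outcome], features))


def forecast_fraud_loss(transactions):
    """Step 2: sum the amounts of transactions predicted as fraudulent."""
    return sum(
        t["amount"] for t in transactions
        if predict(t["features"]) == "fraudulent"
    )


# Hypothetical transactions for one period.
transactions = [
    {"amount": 100.0, "features": {"ip_country_mismatch": 1.0}},
    {"amount": 250.0, "features": {"ip_country_match": 1.0}},
    {"amount": 40.0, "features": {"ip_country_mismatch": 1.0}},
]
print(forecast_fraud_loss(transactions))
```

With these made-up weights the first and third transactions score higher as fraudulent, so the forecast is the sum of their amounts.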
  • 39. All the details contained in the examples above are imaginary. They serve only for the purpose of understanding the system and its application to the field of fraud analytics<br />
  • 40. Benefits of adopting Pollyanna<br />
  • 41. Benefits of adopting Pollyanna<br />For a process that is currently supported by human intelligence, Pollyanna may deliver cost savings of 40% to 80% by reducing the human resources required<br />For a process that is already automated or uses machine intelligence, Pollyanna may improve efficiency or accuracy by 10% to 25%<br />
  • 42. “Any sufficiently advanced technology is indistinguishable from magic”<br />Sir Arthur C. Clarke<br />
  • 43. Thank You<br />
  • 44. Contact Details<br />PG Vijay (Consultant – Machine Learning Systems)<br />Mobile: +91 98418 21167<br />E-Mail: vijaypg@gmail.com<br />LinkedIn Public Profile: http://www.linkedin.com/in/machinelearning<br />