Saturday 4 April 2015

Loading Salesforce Data into Analytics Cloud

Loading Salesforce Data into Analytics Cloud

Introduction

In my earlier post on Analytics Cloud (aka Salesforce Wave) I showed how to load data from a CSV file. Another common use case is to load Salesforce data, such as Cases and Opportunities, for example to see how performance has improved (or otherwise!) over time.

Setting Up

To get started, access the Data Monitor page by clicking the gear icon:

Screen Shot 2015 04 04 at 16 44 04

this takes you to the Dataflow View. The Analytics Cloud provides a Default Salesforce Dataflow that you can download from the drop down on the right hand side of the page:

Screen Shot 2015 04 04 at 16 47 23

This will download a Dataflow Definition file in JSON format. The default data flow is documented in the Analytics Cloud Implementation Guide, so I won’t go into further detail on that here. Instead I’ll walk through a Dataflow Definition that I use to bring the case data from my Salesforce instance. The full JSON file can be downloaded from the Related section below.

Loading Case Data

The following JSON stanza pulls the Case records and creates an interim dataset called “Extract_Cases” - the dataset name is important as will be explained later:

    "Extract_Cases":{
        "action":"sfdcDigest",
        "parameters":{
            "object":"Case",
            "fields":[
                {
                    "name":"CaseNumber"
                },
                {
                    "name":"ContactId"
                },
                {
                    "name":"AccountId"
                },
                {
                    "name":"Type"
                },
                {
                    "name":"Hours_Worked__c"
                },
                {
                    "name":"Subject"
                },
                {
                    "name":"Status"
                },
                {
                    "name":"Description"
                },
                {
                    "name":"CreatedDate"
                },
                {
                    "name":"ClosedDate"
                }
            ]
        }
    }

this pulls all of the case data, but the account and contact information is only available as Salesforce IDs, which isn’t particularly helpful if I want to investigate performance for specific customers, so I need to augment the dataset with the Account and Contact details. The first thing I need to do is extract the Account and Contact data to interim datasets:

    "Extract_Contacts":{
        "action":"sfdcDigest",
        "parameters":{
            "object":"Contact",
            "fields":[
                {
                    "name":"Id"
                },
                {
                    "name":"Name"
                }
            ]
        }
    },
    "Extract_Accounts":{
        "action":"sfdcDigest",
        "parameters":{
            "object":"Account",
            "fields":[
                {
                    "name":"Id"
                },
                {
                    "name":"Name"
                }
            ]
        }
    }

 Once I have the Account/Contact datasets, I can augment the interim Case dataset, first with the contact name. Note that I have to specify the “Extract_Cases” interim cases dataset as the “left” parameter value and the “Extract_Contacts” dataset as the “right”. the left/right_key parameters define the fields from each dataset that tie the contacts to the cases, The results of the transformation are stored in the interim dataset “Transform_Augment_CaseWithContactDetails" :

    "Transform_Augment_CaseWithContactDetails":{
        "action":"augment",
        "parameters":{
            "left":"Extract_Cases",
            "left_key": [ "ContactId" ],
            "relationship": "CaseContact",
            "right":"Extract_Contacts",
            "right_key": [ "Id" ],
            "right_select":[
                "Name"
            ]
        }
    }

Next, I augment the Account name - note that the “left” parameter dataset value is “Transform_Augment_CaseWithContactDetails” - the dataset created after augmenting with the contact name.  If I use the original “Extract_Cases” dataset I discard the contact information I’ve worked so hard to add. This is something I have real trouble remembering for some reason!

    "Transform_Augment_CaseWithAccountDetails":{
        "action":"augment",
        "parameters":{
            "left":"Transform_Augment_CaseWithContactDetails",
            "left_key": [ "AccountId" ],
            "relationship": "CaseAccount",
            "right":"Extract_Accounts",
            "right_key": [ "Id" ],
            "right_select":[
                "Name"
            ]
        }
    }

Now that the Account/Contact information has been added to the Cases, I can register the fully augmented dat set so that it can be used in the Analytics Cloud: 

    "Register_Dataset_ClosedCases":{
        "action":"sfdcRegister",
        "parameters":{
            "alias":"AllCases",
            "name":"AllCases",
            "source":"Transform_Augment_CaseWithAccountDetails"
        }
    }

I then upload the Dataflow via the same drop down that I downloaded it from:

Screen Shot 2015 04 04 at 17 20 42

and then choose the Start option from the same menu to execute the load:

Screen Shot 2015 04 04 at 17 27 10

Once the dataset is loaded, I receive an email notification and I can then start exploring it, for example to see the average days worked for cases grouped by customer:

Screen Shot 2015 04 04 at 17 23 43

(I can only show the BrightGen figure as this dataset contains real customer data!)

Related

2 comments:

  1. Hi Bob,

    I am trying to link into a Wave Dashboard. I have hacked a Wave Dashboard by altering the lens JSON so that I can enter PACIFIC (in step 1) and my dashboard show that analytics data (from step 2 using step 1 selection) - and then I can select NORTHWEST and my dashboard shows that data, etc. It's quite slick with very cool animations. Now my challenge is to pass PACIFIC, NORTHWEST, and other data through the URI. It looks like Wave takes queries and partial queries (maybe SAQL) from the server based on the embedded javascript. I hacked around for a couple hours but couldn't figure it out. I figured maybe you would know. If you do then any help would be most appreciated.

    Thanks a lot,
    jason

    ReplyDelete
  2. Hi BOB,

    Any guess how to load dataset by code...There is load call in Saql but not able to make the post body...any help will be appreciated..

    ReplyDelete