How to Parse UTM Parameters Used in Emails

Recently I worked on a project where the requirement was to extract UTM parameters used in triggered emails for some audit purposes. These are the tracking parameters that are appended to the URLs when the links are clicked in the emails and they are used to understand and attribute the web traffic generated from the CRM emails.

For example, some common parameters will be the source, content block name, campaign name, email name etc. Source will always be email, campaign name will be the name of the campaign (Black Friday, Summer Sale, Boxing Day etc.), email name to understand which email generated the higher traffic and revenue and content block names to understand how different sections within the email performed. 

The approach that I took was to:
  1. First, identify all the triggered sends in the account using WSproxy. This provided a list of all active triggered emails in the account along with important details like name, customer key, JobId, BatchId, Subscriberkey etc.
  2. Second, download the click tracking extract data
  3. Join the click-tracking extract table with the Triggered email table
  4. Once all the data was combined into one data extension - All the required information was available and now the task was to clean up the LinkContent field to parse the UTM parameters
Say, for instance, the URL was 
https://ujjwaltiwari.ca/?ab_u=%%customer_id%%&ab_ecmp=SummerOffers&utm_campaign=CivicHoliday&utm_content=Herobanner&utm_source=EmailTrigger

The objective was to break this URL into several columns for ab_u, ab_ecmp, utm_campaign, utm_content, utm_source etc.

Initially, we tried exporting the data in a spreadsheet and using formulas to clean it but it was not simple and was not working as per our expectation. 

Next, we tried using a combination of SubString and CharIndex SQL functions in Query Studio but since the data was not consistent, this method was also not working.

Finally, we tried using SSJS activity to pull the data from the source data extension and append it to the target data extension with UTM parameter values being added to different columns or fields. 

The SSJS code mentioned below uses the Row Retrieve function and while it worked beautifully the limitation with it is that it only processes the first 2500 records no matter how many rows are there in the source data extension. To overcome this limitation, we added a boolean field processed in the source data extension and modified the code so that it processes only records for which the value of the boolean field is false and once it processes a row, it updates the boolean field (processed) to true. 

To automate this process I created an automation with this SSJS activity as the first step, then created a Query activity as the second step to update the source data extension and delete rows with the processed field value as false. Now I duplicated the automation so that there were two automation with the same two activities. I added a script activity as the third step in both the automation to invoke each other. This way both automation kept running and all the required data was processed in around 3-4 hours.

You may have to modify the below code a bit as per your needs and depending on the kind of data you are trying to extract.

JS Code to be Used in Script Activity in Automation
<script runat="server">
Platform.Load("core", "1.1.1");

try {
    var getUrlParams = function(url) {
        var urlParts = String(url).split(/\?/);
        if (urlParts.length < 2) {
            return {}; // Return an empty object if no URL parameters
        }
        var queryString = urlParts.slice(1).join("").split(/\#/);
        var queryStringParts = queryString[0].split(/\&/);

        var i;
        var params = {};
        for (i = 0; i < queryStringParts.length; i++) {
            var param = queryStringParts[i].split(/=/);
            params[param[0]] = param[1];
        }

        return params;
    };

    var sourceDe = DataExtension.Init("YourSourceDataExtension"); // CustomerKey of the source DE
    var targetDe = DataExtension.Init("YourTargetDataExtension"); // CustomerKey of the target DE
    var filter = { Property: "processed", SimpleOperator: "equals", Value: "False" };
    var rows = sourceDe.Rows.Retrieve(filter);
    var i, row, urlParams;

    for (i = 0; i < rows.length; i++) {
        row = rows[i];
        urlParams = getUrlParams(row["LinkContent"]);
        if (urlParams && Object.keys(urlParams).length > 0) {
            // Add data to the target DE
            targetDe.Rows.Add({
                "utm_source": urlParams["utm_source"],
                "utm_medium": urlParams["utm_medium"],
                "utm_campaign": urlParams["utm_campaign"],
                "ab_u": urlParams["ab_u"],
                "utm_content": urlParams["utm_content"],
                "ab_ecmp": urlParams["ab_ecmp"],
                "Name": row["Name"], 
                "TriggeredSendCustomerKey": row["TriggeredSendCustomerKey"],
                "EmailSubject": row["EmailSubject"], 
                "LinkName": row["LinkName"],
                "LinkContent": row["LinkContent"],
                "clickeventdate": row["clickeventdate"],
                "SubscriberKey": row["SubscriberKey"],
                "URL": row["URL"],
                "Email": row["Email"],
                "JobID": row["JobID"],
                "BatchID": row["BatchID"],
                "SubscriberID": row["SubscriberID"],
                "ID": row["ID"]
            });

            // Update the 'processed' field in the source DE
            sourceDe.Rows.Update({ "processed": "True" }, ["ID"], [row["ID"]]);
        } else {
            // Log an error if URL parameters cannot be extracted
            Platform.Response.Write("Error: Unable to extract URL parameters for row ID " + row["ID"] + "\n");
        }
    }
} catch (e) {
    // Log any other errors
    Platform.Response.Write("Error: " + String(e) + "\n");
}

Comments