expressor: From Data to Discovery - An internet retailer story
Posted 01-31-2012 at 01:20 PM
The Intro
Generally speaking, an internet retailer sells goods online to a consumer (B2C) or to a business (B2B). Together with the online processes of marketing, delivering, servicing and paying for products and services, this is known as eCommerce. To sustain a successful e-business, internet retailers look for easier and more cost-effective ways to analyze the vast amount of data they have. From products and inventory to sales and logistics, data needs to be stored, cleansed, aggregated and reviewed so the business can answer the question: "What is going on?"
The Setup
While working with an internet clothing retailer, I demonstrated how expressor can quickly create an automated Dataflow to perform operations on their data. The Dataflow decomposes a product's inventory SKU into its individual product characteristic codes and then summarizes those results. The resulting aggregates provide the retailer with additional metrics and dimensions that are used to track inventory and sales performance.
The Problem
In their current environment, the inventory codes are returned from a popular inventory web service that is invoked manually. The web service has a few methods that allow merchants to stay up to date on the status of inventory in the "warehouse(s)". However, there was no way to track the individual characteristics of a product, such as style, material, description and size. Each inventory product code is a concatenation of 12 alphanumeric characters that represent individual attributes of the product. For example, code 1121CPBLAK1X can be broken down into Style, Material, Description and Size.
Their first attempt was to investigate PHP and Perl scripts. They assumed they could use them to invoke the web service, extract the inventory data to a staging file and then process the file. They also wanted to create aggregates (sum, count, etc.) of the individual characteristics they collected. Searching the web for ways to do this, they stumbled upon some scripts on a developer forum. The scripts gave them an initial template, but they struggled to adapt it to their environment because they lacked the appropriate skill sets.
"We should have been worried when the developer's post signed off with 'Happy Coding!'" the prospect chuckled. "It was more cumbersome than we thought. Even if we got it to work… to maintain it would be counterproductive," he added, referring to the onboarding of new products and the changing format of the codes available from the vendors.
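To make the decomposition concrete, here is a minimal Python sketch of splitting the example code into its four attributes. It is not the retailer's script or expressor Datascript. The field widths (4, 2, 4, 2) are an assumption chosen only so that 1121CPBLAK1X divides cleanly; the article states that the code is 12 characters long and names the four attributes, but not where each one starts and ends. The names `ProductCode` and `decompose_sku` are hypothetical.

```python
from typing import NamedTuple

SKU_LENGTH = 12

# ASSUMPTION: field widths are illustrative. The source only says the
# 12-character code encodes Style, Material, Description and Size.
FIELD_WIDTHS = (("style", 4), ("material", 2), ("description", 4), ("size", 2))


class ProductCode(NamedTuple):
    style: str
    material: str
    description: str
    size: str


def decompose_sku(sku: str) -> ProductCode:
    """Split a fixed-width inventory code into its attribute codes."""
    sku = sku.strip().upper()
    if len(sku) != SKU_LENGTH or not sku.isalnum():
        raise ValueError(f"expected {SKU_LENGTH} alphanumeric characters, got {sku!r}")
    parts = []
    position = 0
    for _name, width in FIELD_WIDTHS:
        # Slice out the next field and advance past it.
        parts.append(sku[position:position + width])
        position += width
    return ProductCode(*parts)


print(decompose_sku("1121CPBLAK1X"))
# ProductCode(style='1121', material='CP', description='BLAK', size='1X')
```

Keeping the layout in one table (`FIELD_WIDTHS`) rather than in hard-coded slices is one way to soften the maintenance concern the prospect raised: if a vendor changes the code format, only the table changes.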
The Objective
Their new objective was to find a "tool" that would let them automate access to this data and reduce coding and manual intervention. They also wanted a shareable, reusable repository of "ETL" objects that other developers could apply to other projects, reducing the time it takes to modify existing applications or create new ones. This was a perfect fit for expressor thanks to its ease of use, semantic-based data integration and template-based reusable architecture.
The Process - (not really a tutorial)
(Please note: since I do not have a web retailer merchant account, I needed to simulate the merchant's web service response using expressor Datascript. This provides the results as they would be processed by expressor.) You can see an example of the live result to be processed here: http://michaeltarallo.com/Amazon/aws.php. A data file or RDBMS table could also have been easily substituted as an input source. You can download the complete solution from the expressor solutions section here. (available shortly)
In my prototype, I used an expressor Dataflow to simulate and invoke the web service with a Read Custom operator. I then used the Filter, Transform, Sort, Copy, Aggregate and Funnel operators to perform "operations" on the attributes and data. These operators let me control data flow based on business rules, decompose the codes, sort the individual characteristics, aggregate the metrics and dimensions, and merge the results into a single file. The expressor Dataflow, which includes the execution of the web service, can be automated without having to stage the data, which was another improvement the prospect wanted.
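For readers who do not have expressor at hand, the sequence of operators described above can be approximated in plain Python. This is only a rough sketch of the flow of data, not the expressor Dataflow itself and not Datascript. The sample records, the quantity field, the validity rule used for filtering and the field layout are all assumptions made for illustration, as are the function names.

```python
from collections import Counter

# ASSUMPTION: stand-in for the simulated web service response,
# as (inventory code, quantity on hand) pairs.
SIMULATED_RESPONSE = [
    ("1121CPBLAK1X", 4),
    ("1121CPBLAK2X", 1),
    ("2040WLNAVY0M", 7),
    ("BAD-CODE", 3),
]

# ASSUMPTION: illustrative (start, end) positions for each attribute.
FIELDS = {"style": (0, 4), "material": (4, 6), "description": (6, 10), "size": (10, 12)}


def read_inventory():
    """Read step: returns the simulated response without staging a file."""
    return list(SIMULATED_RESPONSE)


def is_valid(record):
    """Filter step: a hypothetical business rule that drops malformed codes."""
    sku, quantity = record
    return len(sku) == 12 and sku.isalnum() and quantity >= 0


def transform(record):
    """Transform step: decompose the code into its attribute codes."""
    sku, quantity = record
    row = {name: sku[start:end] for name, (start, end) in FIELDS.items()}
    row["quantity"] = quantity
    return row


def aggregate(rows, dimension):
    """Aggregate step: sum the quantity for each value of one attribute."""
    totals = Counter()
    for row in rows:
        totals[row[dimension]] += row["quantity"]
    return [(dimension, value, total) for value, total in sorted(totals.items())]


def run_dataflow():
    rows = [transform(record) for record in read_inventory() if is_valid(record)]
    rows.sort(key=lambda row: (row["style"], row["size"]))  # Sort step
    merged = []
    for dimension in FIELDS:
        # Copy step: every aggregate branch reads the same rows.
        # Funnel step: branch outputs are appended to one result set.
        merged.extend(aggregate(rows, dimension))
    return merged


for dimension, value, total in run_dataflow():
    print(f"{dimension}\t{value}\t{total}")
```

With the sample data above, the malformed record is filtered out and the merged result holds one total per attribute value, for example a total of 5 units for style 1121.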
Read more... at http://michaeltarallo.com/2012/02/ex...etailer-story/
This article was originally posted at Turning raw data into actionable information









