Sunday, December 8, 2013

Simple dependency management for dependent Salesforce objects

Introduction

Salesforce programmers know it is sometimes difficult to save multiple objects with dependencies on each other in the right order and with as little effort as possible.  The "Collecting Parameter" pattern is an easy way to do this, and this article will show you how to use it in your own code.

Unit of Work

In June 2013, FinancialForce's CTO, Andrew Fawcett, wrote his Unit Of Work article, explaining how a dependency mechanism might be implemented to simplify saving multiple objects with dependencies between them.

The problem is a common one for Salesforce programmers--the need to create master and detail objects simultaneously.  Programmers must save the master objects first before their IDs can be set in the detail objects.

An example might be an invoice and its line-items.  To save any InvoiceLine__c, its master object, an Invoice__c, must be saved first.

To solve this problem, Xede uses a pattern popularized by Kent Beck in his 1996 book, Smalltalk Best Practice Patterns, called Collecting Parameter.  For those unfamiliar with the Smalltalk programming language, it can be briefly described as the first object-oriented language where everything is an object.  Numbers, messages, classes, stack frames--everything.  In 1980 (nearly 34 years ago) it already supported continuable exceptions and lambda expressions.  Lest I gush too much about it, I'll say only that nearly all object-oriented languages owe their best features to Smalltalk and their worst features to either trying to improve on it or ignoring prior art.

Returning to the subject at hand--dependent saves--Xede would create two classes to wrap the objects: Invoice and InvoiceLine.  Each instance of Invoice will aggregate within it the InvoiceLine instances belonging to it.

The code might look something like this.

// create an invoice and add some lines to it
Invoice anInvoice = new Invoice(anInvoiceNumber, anInvoiceDate, aCustomer);
...
// adding details is relatively simple
anInvoice.add(anInvoiceLine);
anInvoice.add(anotherInvoiceLine);
anInvoice.save();

So now our Invoice has two detail instances inside it.  Keeping true to the OO principles of data-hiding and loose coupling, we can safely ignore how these instances store their sobject variables: Invoice's Invoice__c and InvoiceLine's InvoiceLine__c.  But without knowing how they store their sobjects, how can we save the master and the detail records with the minimum of two DMLs--one to save the master and another to save the details?

We do it using a collecting parameter.

Collecting Parameter

A collecting parameter is basically a collection of like-things that cooperating classes add to.  Imagine a basket that might get passed to attendees at a charity event.  Each person receiving the collection basket may or may not add cash or checks to it.  In both programming and charity fundraisers it is better manners to let each person add to the basket themselves than to have an usher reach into strangers' pockets and remove cash.  The latter should be regarded as criminal--if not at charity events then in programming.

For programmers such a thing violates data-hiding; not all classes keep their wallets in the same pocket (variable), some may use money clips rather than wallets, some use purses (collection types), and some may have cash while others have checks or coins.  Writing code that rummages through each class's data looking for cash is nearly impossible--even with reflection.  In the end it all gets deposited into a single bank account.

Let's first look at the saveTo() methods of Invoice and InvoiceLine.  They are the simplest.

public with sharing class Invoice extends XedeObject {
    public Id getId() { return sobjectData.id; }

    public override void saveTo(list<sobject> aList, list<XedeObject> dependentList)
    {
        aList.add(sobjectData);

        // each line decides for itself whether it's ready to be saved
        for (InvoiceLine each : lines)
            each.saveTo(aList, dependentList);
    }

    Invoice__c sobjectData;
    list<InvoiceLine> lines;
}

Invoice knows where it keeps its own reference to Invoice__c (cohesion), so when it comes time to save it simply adds its sobject to the list of sobjects to be saved.  After that, it also knows where it keeps its own list of invoice lines and so calls saveTo() on each of them.

public with sharing class InvoiceLine extends XedeObject {
    public override void saveTo(list<sobject> aList, list<XedeObject> dependentList) {
        if (sobjectData.parent__c != null)  // if I already have my parent's id I can be saved
            aList.add(sobjectData);

        else if (parent.getId() != null) {  // else if my parent has an id, copy it and I can be saved
            sobjectData.parent__c = parent.getId();
            aList.add(sobjectData);
        }

        else
            dependentList.add(this); // I can't be saved until my parent is saved
    }

    Invoice parent;
    InvoiceLine__c sobjectData;
}

InvoiceLine's implementation is nearly as simple as Invoice's, but subtly different.

Basically, if the InvoiceLine already has its parent's id, or can get its parent's id, it adds its sobject data to the list to be saved.  If it can't get its parent's id, it must wait its turn and adds itself to the dependent list.

Readers may wonder why Invoice doesn't decide for itself whether to save its children.  Invoice could skip sending saveTo() to its children when it doesn't have an id, but whether or not its children should be saved is not its decision--it's theirs.  They may have other criteria that must be met before they can be saved.  They may have two master relationships and be waiting for both.  They may have rows to delete before they can be saved, or may have detail records of their own with criteria independent of whether Invoice has an id.  Whatever the reason, the rule is that each object decides for itself whether it's ready to save, just as it's each person's decision whether and how much money to put into the collection basket.
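To make the two-masters case concrete, here is a minimal sketch of what such a detail class might look like.  The Payment and Remittance names and fields below are hypothetical--they aren't part of the invoice example--and Remittance is assumed to wrap its own sobject and answer getId() the same way Invoice does.

public with sharing class Payment extends XedeObject {
    public override void saveTo(list<sobject> aList, list<XedeObject> dependentList) {
        // wait until *both* masters can supply their ids before joining the save list
        if (invoice.getId() != null && remittance.getId() != null) {
            sobjectData.invoice__c = invoice.getId();
            sobjectData.remittance__c = remittance.getId();
            aList.add(sobjectData);
        }

        else
            dependentList.add(this); // not ready yet--try again on the next pass
    }

    Invoice invoice;
    Remittance remittance;
    Payment__c sobjectData;
}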

In the example below, save() passes two collection baskets: one collects sobjects, and the other collects instances of classes whose sobjects aren't ready for saving--yet.  save() loops over both lists until they're empty, and in this way is able to handle arbitrary levels of dependencies with the minimum number of DML statements.

Let's look at the base class' (XedeObject) implementation of save().

public virtual class XedeObject {
    public virtual void save() {
        list<XedeObject> objectList = new list<XedeObject>();
        list<XedeObject> dependentList = new list<XedeObject> { this };

        do {
            list<sobject> aList = new list<sobject>();
            list<sobject> updateList = new list<sobject>();
            list<sobject> insertList = new list<sobject>();

            objectList = new list<XedeObject>(dependentList);
            dependentList.clear();

            for (XedeObject each : objectList)
                each.saveTo(aList, dependentList);

            for (sobject each : aList) {
                if (each.id == null)
                    insertList.add(each);
                else
                    updateList.add(each);
            }

            try {
                update updateList;
                insert insertList;
            } catch (DMLException dmlex)  {
                XedeException.Raise('Error adding or updating object : {0}', dmlex.getMessage());
            }
        } while (dependentList.isEmpty() == false);
    }

    public virtual void saveTo(list<sobject> anSobjectList, list<XedeObject> aDependentList)
    {
        subclassMethodError();
    }
}

To understand how this code works you need to be familiar with subclassing.  Essentially, the classes Invoice and InvoiceLine are both subclasses of XedeObject, which means they inherit all of XedeObject's functionality.  Though neither Invoice nor InvoiceLine implements save(), both will understand the message because they've inherited its implementation from XedeObject.

The best way to understand what save() does is to walk through "anInvoice.save()."

anInvoice.save() executes XedeObject's save() method because Invoice doesn't have one of its own (remember, it's a subclass of XedeObject).  save() begins by adding its own instance to dependentList.  Then it loops over the dependent list, sending saveTo() to each instance and collecting newly dependent objects in the dependent list.

After collecting all the objects it either updates or inserts them, then returns to the top of the loop if the dependent list isn't empty and restarts the process.

When the dependent list is empty there's nothing left to do, and the method falls off the bottom, returning to the caller.

XedeObject also implements saveTo(), but its implementation throws an exception.  XedeObject's subclasses ought to implement saveTo() themselves if they intend to participate in dependency saves.  If they don't intend to, there's no need to override saveTo().
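Neither subclassMethodError() nor XedeException appears in the article's listings.  A minimal sketch of what they might look like, assuming XedeException is a custom exception class with a convenience raiser:

public virtual class XedeException extends Exception {
    // format the message and throw in one call
    public static void Raise(String aFormat, String anArgument) {
        throw new XedeException(String.format(aFormat, new list<String> { anArgument }));
    }
}

And inside XedeObject, subclassMethodError() might be nothing more than:

protected void subclassMethodError() {
    XedeException.Raise('A subclass should have overridden this method: {0}', 'saveTo');
}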

One of our recent projects was a loan servicing system.  Each loan could have multiple obligors, and each obligor could have multiple addresses.  The system could be given multiple loans at a time to create, and with each batch of loans a log record was recorded.  We had an apiResponse object with a list of loans.  When we called anApiResponse.save(), its saveTo() sent saveTo() to each of its loans, each loan sent saveTo() to each of its obligors, and each obligor sent saveTo() to each of its addresses, before apiResponse sent saveTo() to its log class.
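The loan-servicing classes aren't shown here, but the top of that cascade might have looked something like this sketch (the class and variable names below are guesses based on the description above):

public with sharing class ApiResponse extends XedeObject {
    public override void saveTo(list<sobject> aList, list<XedeObject> dependentList) {
        // each loan forwards saveTo() to its obligors,
        // and each obligor forwards it to its addresses
        for (Loan each : loans)
            each.saveTo(aList, dependentList);

        log.saveTo(aList, dependentList);
    }

    list<Loan> loans;
    LogEntry log;
}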

In the end, ApiResponse saved the loans, obligors, addresses, and log records with three DML statements--all without anything much more complicated than each class implementing saveTo().

Some programmers may argue that interfaces could have accomplished the same feat without subclassing, but in this case that is not true.  Interfaces don't provide method implementations.  Had we used interfaces, every object would have been required to implement save() itself.
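For contrast, a sketch of what an interface version might look like.  Notice there's nowhere to put the shared looping logic of save(), so every implementer would have to repeat it:

public interface Saveable {
    void saveTo(list<sobject> aList, list<Saveable> dependentList);
    void save();    // no inherited implementation--every class writes its own loop
}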

Still to do

As useful as save()/saveTo() has proved to be, I can think of a few improvements I'd like to make to it.

First, I'd like to add a delete list.  Some of our operations include deletes, and rather than having each object do its own deletes I'd prefer to collect them into a single list and delete them all at once.
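This hasn't been implemented, but one direction might simply be a third basket.  A hypothetical sketch:

// hypothetical: a third collecting parameter for rows to delete
public virtual void saveTo(list<sobject> anSobjectList, list<XedeObject> aDependentList, list<sobject> aDeleteList)
{
    subclassMethodError();
}

save() would then issue a single delete per pass, alongside the update and insert.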

Next, the exception handling around the update and insert should be improved.  DmlException carries lots of valuable information we could log or include in our own exception.
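For example, DmlException can report detail for each failing row.  A sketch of a richer catch block, replacing the one in save() above:

try {
    update updateList;
    insert insertList;
} catch (DMLException dmlex) {
    // DMLException knows how many rows failed, which ones, and why
    for (Integer i = 0; i < dmlex.getNumDml(); i++)
        system.debug(LoggingLevel.ERROR, 'Row ' + dmlex.getDmlIndex(i)
            + ' failed with status ' + dmlex.getDmlStatusCode(i)
            + ': ' + dmlex.getDmlMessage(i));

    XedeException.Raise('Error adding or updating object : {0}', dmlex.getMessage());
}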

Third, I would love to map the DML exceptions to the objects that added the failing rows to the list.  save() could then collect all the DML exceptions and send them to the objects responsible for adding those rows.

Coming up

  • XedeObject implements other useful methods we tend to use across many of our projects.  Implementing them once in XedeObject and deploying it to each of our customers' orgs saves time and money and improves consistency across all our projects.  One of these is coalesce().  There are many others.
  • Curl scripts for exercising Salesforce REST services.
  • Using staticresources as a source for unit-test data.

Friday, March 22, 2013

A better way to generate XML on Salesforce using VisualForce

There are easier ways to generate XML on Salesforce than either the Dom library or the XmlStreamWriter class.  If you've used either, perhaps you'll recognize the code below.

public static void DomExample()
{
    Dom.Document doc = new Dom.Document();
    
    Dom.Xmlnode rootNode = doc.createRootElement('response', null, null);

    list<Account> accountList = [ 
        select  id, name, 
                (select id, name, email from Contacts) 
          from  Account 
    ];
          
    for (Account eachAccount : accountList) {
        Dom.Xmlnode accountNode = rootNode.addChildElement('Account', null, null);
        accountNode.setAttribute('id', eachAccount.Id);
        accountNode.setAttribute('name', eachAccount.Name);
        
        for (Contact eachContact : eachAccount.Contacts) {
            Dom.Xmlnode contactNode = accountNode.addChildElement('Contact', null, null);
            contactNode.setAttribute('id', eachContact.Id);
            contactNode.setAttribute('name', eachContact.Name);
            contactNode.setAttribute('email', eachContact.Email);
        }
    }
    
    system.debug(doc.toXmlString());            
}

Or maybe this example.

public static void StreamExample()
{
    XmlStreamWriter writer = new XmlStreamWriter();
    
    writer.writeStartDocument('utf-8', '1.0');        
    writer.writeStartElement(null, 'response', null);
    
    list<Account> accountList = [ 
        select  id, name, 
                (select id, name, email from Contacts) 
          from  Account 
    ];
          
    for (Account eachAccount : accountList) {
        writer.writeStartElement(null, 'Account', null);
        writer.writeAttribute(null, null, 'id', eachAccount.Id);
        writer.writeAttribute(null, null, 'name', eachAccount.Name);        

        for (Contact eachContact : eachAccount.Contacts) {
            writer.writeStartElement(null, 'Contact', null);
            
            writer.writeAttribute(null, null, 'id', eachContact.Id);
            writer.writeAttribute(null, null, 'name', eachContact.Name);
            writer.writeAttribute(null, null, 'email', eachContact.Email);
            
            writer.writeEndElement();
        }
        
        writer.writeEndElement();
    }
    
    writer.writeEndElement();
    
    system.debug(writer.getXmlString());
    
    writer.close();            
}

But wouldn't you rather write something like this?

public static void PageExample()
{
    PageReference aPage = Page.AccountContactsXML;
    aPage.setRedirect(true);
    system.debug(aPage.getContent().toString());
}

Let's take a look at what makes creating the XML possible with so few lines of Apex.

Rather than build our XML using Apex code, we can type it directly into a Visualforce page--provided we strip off all of VF's page accessories using apex:page attributes.


<apex:page StandardController="Account" recordSetVar="Accounts" contentType="text/xml" showHeader="false" sidebar="false" cache="false">
<?xml version="1.0" encoding="UTF-8" ?>
<response>
<apex:repeat value="{!Accounts}" var="eachAccount" >
    <Account id="{!eachAccount.id}" name="{!eachAccount.name}">
    <apex:repeat value="{!eachAccount.contacts}" var="eachContact">
        <Contact id="{!eachContact.id}" name="{!eachContact.name}" email="{!eachContact.email}"/>
    </apex:repeat>
    </Account>
</apex:repeat>
</response>
</apex:page>

The secret that makes this code work is setting the page's API version to 19.0 inside its metadata.  That is the only thing that allows the <?xml ?> processing instruction to appear at the top without the Visualforce compiler throwing Conniptions (a subclass of Exception).
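For readers who haven't edited page metadata before, the page's accompanying -meta.xml file might look something like this sketch (the label is simply whatever you named the page):

<?xml version="1.0" encoding="UTF-8"?>
<ApexPage xmlns="http://soap.sforce.com/2006/04/metadata">
    <apiVersion>19.0</apiVersion>
    <label>AccountContactsXML</label>
</ApexPage>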

Depending on how much XML you need to generate, another advantage of the Visualforce version is how few script statements are required to produce it.

Number of code statements: 4 out of 200000

Our Dom and Stream examples require 28 and 37 script statements respectively--and that's in a developer org with only three accounts and three contacts.  Additionally, the Page example is only 18 lines including both the .page and the .cls, whereas the Dom and Stream examples are 27 and 38 lines respectively (coincidence?).

But what happens when we add billing and shipping addresses (and two more contacts)?

Our page example's Apex code doesn't change, but its page does.

<apex:page StandardController="Account" recordSetVar="Accounts" contentType="text/xml" showHeader="false" sidebar="false" cache="false">
<?xml version="1.0" encoding="UTF-8" ?>
<response>
<apex:repeat value="{!Accounts}" var="eachAccount" >    
    <Account id="{!eachAccount.id}" name="{!eachAccount.name}">
        <apex:outputPanel rendered="{!!IsBlank(eachAccount.billingStreet)}" layout="none">
            <Address type="Billing">
                <Street>{!eachAccount.billingStreet}</Street>
                <City>{!eachAccount.billingCity}</City>
                <State>{!eachAccount.billingState}</State>
                <PostalCode>{!eachAccount.billingPostalCode}</PostalCode>
                <Country>{!eachAccount.billingCountry}</Country>
            </Address>        
        </apex:outputPanel>        
        <apex:outputPanel rendered="{!!IsBlank(eachAccount.shippingStreet)}" layout="none">            
            <Address type="Shipping">
                <Street>{!eachAccount.shippingStreet}</Street>
                <City>{!eachAccount.shippingCity}</City>
                <State>{!eachAccount.shippingState}</State>
                <PostalCode>{!eachAccount.shippingPostalCode}</PostalCode>
                <Country>{!eachAccount.shippingCountry}</Country>
            </Address>
        </apex:outputPanel>
        <apex:repeat value="{!eachAccount.contacts}" var="eachContact">
            <Contact id="{!eachContact.id}" name="{!eachContact.name}" email="{!eachContact.email}"/>
        </apex:repeat>
    </Account>
</apex:repeat>
</response>
</apex:page>

We've added sections for both the billing and shipping addresses, with conditional rendering in case either doesn't exist.  In addition to our six lines of Apex (PageExample() above) we've added 18 new lines to the earlier 18 for a total of 36 lines.  The best part is, even with the extra XML being generated, our Page example will still only consume 4 script statements of the already-insufficient 200,000.

How do our Dom and Stream examples fare?  Both are pasted together below into a single code section.

public static void DomExample()
{
    Dom.Document doc = new Dom.Document();        
    
    Dom.Xmlnode rootNode = doc.createRootElement('response', null, null);

    list<Account> accountList = [ 
        select  id, name, 
                billingStreet, billingCity,
                billingState, billingPostalCode,
                billingCountry,
                shippingStreet, shippingCity,
                shippingState, shippingPostalCode,
                shippingCountry,
                (select id, name, email from Contacts) 
          from  Account ];
          
    for (Account eachAccount : accountList) {
        Dom.Xmlnode accountNode = rootNode.addChildElement('Account', null, null);
        accountNode.setAttribute('id', eachAccount.Id);
        accountNode.setAttribute('name', eachAccount.Name);
        
        if (String.IsNotBlank(eachAccount.billingStreet)) {
            Dom.Xmlnode addressNode = accountNode.addChildElement('Address', null, null);
            addressNode.setAttribute('type', 'Billing');
            addressNode.addChildElement('Street', null, null).addTextNode(eachAccount.billingStreet);
            addressNode.addChildElement('City', null, null).addTextNode(eachAccount.billingCity);
            addressNode.addChildElement('State', null, null).addTextNode(eachAccount.billingState);
            addressNode.addChildElement('PostalCode', null, null).addTextNode(eachAccount.billingPostalCode);
            addressNode.addChildElement('Country', null, null).addTextNode(eachAccount.billingCountry);                
        }
        
        if (String.IsNotBlank(eachAccount.ShippingStreet)) {                
            Dom.Xmlnode addressNode = accountNode.addChildElement('Address', null, null);
            addressNode.setAttribute('type', 'Shipping');
            addressNode.addChildElement('Street', null, null).addTextNode(eachAccount.shippingStreet);
            addressNode.addChildElement('City', null, null).addTextNode(eachAccount.shippingCity);
            addressNode.addChildElement('State', null, null).addTextNode(eachAccount.shippingState);
            addressNode.addChildElement('PostalCode', null, null).addTextNode(eachAccount.shippingPostalCode);
            addressNode.addChildElement('Country', null, null).addTextNode(eachAccount.shippingCountry);                
        }
        
        for (Contact eachContact : eachAccount.Contacts) {
            Dom.Xmlnode contactNode = accountNode.addChildElement('Contact', null, null);
            contactNode.setAttribute('id', eachContact.Id);
            contactNode.setAttribute('name', eachContact.Name);
            contactNode.setAttribute('email', eachContact.Email);
        }
    }
    
    system.debug(doc.toXmlString());            
}

public static void StreamExample()
{
    XmlStreamWriter writer = new XmlStreamWriter();
    
    writer.writeStartDocument('utf-8', '1.0');        
    writer.writeStartElement(null, 'response', null);
    
    list<Account> accountList = [ 
        select  id, name, 
                billingStreet, billingCity,
                billingState, billingPostalCode,
                billingCountry,
                shippingStreet, shippingCity,
                shippingState, shippingPostalCode,
                shippingCountry,
                (select id, name, email from Contacts) 
          from  Account ];
          
    for (Account eachAccount : accountList) {
        writer.writeStartElement(null, 'Account', null);
        writer.writeAttribute(null, null, 'id', eachAccount.Id);
        writer.writeAttribute(null, null, 'name', eachAccount.Name);
        
        if (String.IsNotBlank(eachAccount.billingStreet)) {
            writer.writeStartElement(null, 'Address', null);
            writer.writeAttribute(null, null, 'type', 'Billing');                
            
            writer.writeStartElement(null, 'Street', null);
            writer.writeCharacters(eachAccount.billingStreet);
            writer.writeEndElement();
            
            writer.writeStartElement(null, 'City', null);
            writer.writeCharacters(eachAccount.billingCity);
            writer.writeEndElement();
            
            writer.writeStartElement(null, 'State', null);
            writer.writeCharacters(eachAccount.billingState);
            writer.writeEndElement();
            
            writer.writeStartElement(null, 'PostalCode', null);
            writer.writeCharacters(eachAccount.billingPostalCode);
            writer.writeEndElement();
            
            writer.writeStartElement(null, 'Country', null);
            writer.writeCharacters(eachAccount.billingCountry);
            writer.writeEndElement();

            writer.writeEndElement();                
        }
        
        if (String.IsNotBlank(eachAccount.shippingStreet)) {
            writer.writeStartElement(null, 'Address', null);
            writer.writeAttribute(null, null, 'type', 'Shipping');                
            
            writer.writeStartElement(null, 'Street', null);
            writer.writeCharacters(eachAccount.shippingStreet);
            writer.writeEndElement();
            
            writer.writeStartElement(null, 'City', null);
            writer.writeCharacters(eachAccount.shippingCity);
            writer.writeEndElement();
            
            writer.writeStartElement(null, 'State', null);
            writer.writeCharacters(eachAccount.shippingState);
            writer.writeEndElement();
            
            writer.writeStartElement(null, 'PostalCode', null);
            writer.writeCharacters(eachAccount.shippingPostalCode);
            writer.writeEndElement();
            
            writer.writeStartElement(null, 'Country', null);
            writer.writeCharacters(eachAccount.shippingCountry);
            writer.writeEndElement();

            writer.writeEndElement();                
        }

        for (Contact eachContact : eachAccount.Contacts) {
            writer.writeStartElement(null, 'Contact', null);
            
            writer.writeAttribute(null, null, 'id', eachContact.Id);
            writer.writeAttribute(null, null, 'name', eachContact.Name);
            writer.writeAttribute(null, null, 'email', eachContact.Email);
            
            writer.writeEndElement();
        }
        
        writer.writeEndElement();
    }
    
    writer.writeEndElement();
    
    system.debug(writer.getXmlString());
    
    writer.close();            
}

Our Dom example is 52 lines and takes 60 script statements, and our Stream example has ballooned to 96 lines and 104 script statements on our tiny data set.  For anyone keeping track, PageExample() has 30% fewer lines than DomExample() and 63% fewer lines than StreamExample().  Most importantly, no matter how much data is involved, PageExample will only ever use 4 script statements, while the other two grow with the data, since each new row requires more than one script statement to generate.

Caveats and disclaimers

  • The page above is about as basic as I could come up with.  It stands alone and requires no controllers.  Readers should be able to paste it directly into their development orgs and see what they get (don't forget to set the API version to 19.0).
  • So basic a page doesn't take into account the ordering of the data.  If the XML data needs to be in a specific order, a controller would be required to return an ordered list to the page using a SOQL "order by" clause.
  • Though this technique is great for generating XML, it can't consume XML.  That's probably obvious to programmers but is important to point out for any management types that may visit.
  • XSL stylesheets can easily be referenced from the XML page by adding <?xml-stylesheet type="text/xsl" href="..."?> after the <?xml ?> instruction.  The same can be done with PageExample() and StreamExample(), but the Dom classes don't allow adding processing instructions that I know of.
  • It's impossible to use getContent() inside test methods. 

Note: This article was originally published March 22, 2013 at it.toolbox.com on Anything Worth Doing.

Tuesday, April 13, 2010

Inside-out design : Parts I and II



The topic of bottom-up vs. top-down design has accumulated a lot of baggage since both descriptions of system design were first introduced in the 1970s.  Both are perhaps well understood--or at least many assume they understand them.  This series of articles introduces the terms inside-out and outside-in to help readers visualize a three-dimensional design (an onion's layers would be a good example) rather than a two-dimensional design similar to tree rings.

Business software is discovered, not invented. Arguments that computer technology has fundamentally changed business, or even invented it, are exaggerated. The business of banking remains much the same as it was 150 years ago, deposits and loans. Insurance remains much the same, pay smaller amounts now for the promise to cover expenses later. Retailing, logistics, and drafting are also mostly unchanged.

If computers haven’t invented these businesses what can we truthfully assert they have done? We can assert that they’ve helped make humans, both individually and collectively, super human. The way in which software has incrementally accomplished this feat can be described as from the inside-out. This article will elaborate on what inside-out design is, use it as a model for how new software projects should be designed and developed, and describe how inside-out design (IOD) avoids the many shortcomings of alternative approaches.

“We can make him better than he was before. Better, stronger, faster.” 
Introduction to The Six Million Dollar Man

Though banking may be a complicated business, its basic activities are simple. Customers “deposit” money at the bank and are paid interest. Banks pay a lower interest rate on deposits than they earn lending money to other customers.

Perfect Memory

Our first step at creating a super-human banker is to improve their memory--regardless of age.  How many customers, account balances, and interest rates can a human remember perfectly on their own?  Whatever that number is, a banker that can remember 100 times more will be more profitable, and a banker that can remember 100 times more than that, more profitable still.  A banker with perfect memory is limited only by their productivity and efficiency--but we'll address that later.

Perfect memory is what databases provide the banker. A database is capable of remembering, perfectly, the name of every customer, their address, phone number, accounts, account balances, transaction history, and even their relationships to other customers and their accounts.

This is the core of our inside-out design. The business already existed—all we did was discover and record the banking schema into a database.

If nothing else is done, our banker is already better off than before.  Without any additional features the possibilities are nearly endless.  Anything that can be stored in the database is stored perfectly.  Any number or type of account and any number or type of transaction can be perfectly stored and perfectly retrieved.

Much more can be written of the benefits of relational databases, and indeed much already has--not least of which are the RDB's basis in relational set theory, referential integrity, and normalization.

But even with mathematically provably correct data, perfect memory can still be tarnished with imperfect manipulation. The next layer will enhance the first with perfect execution.

Perfect Execution

With perfect memory our banker will never forget your name or account balance.  They simply record each of your transactions in their database.

If this were a relational database our banker could use SQL.  Using SQL they can find your account number using your name or phone number:

SELECT @ACCOUNT_NUMBER = ACCOUNT_NUMBER
  FROM CUSTOMER
  JOIN ACCOUNT
    ON ACCOUNT.OWNER_KEY = CUSTOMER.CUSTOMER_KEY
 WHERE CUSTOMER.PHONE_NUMBER = "248 555 2960"

Once they have your account number they can enter the transaction:

INSERT INTO TRAN_HISTORY (ACCOUNT_NUMBER, TRANSACTION, AMOUNT)
VALUES (@ACCOUNT_NUMBER, "DEPOSIT", $100.00)


Depending on how “perfect” their database is, and how many accounts the customer has, or whether they recently bounced a check and must pay an NSF fee, or how accounts feed the general ledger, more SQL will likely be required to keep everything “perfect.”

So even though the banker can remember perfectly what was done they have difficulty remembering how to do it.

Most contemporary relational databases provide a mechanism for building SQL macros or functions called stored procedures.  Stored procedures extend the syntax of SQL and provide a mechanism for storing the function inside the database itself.  In this manner an RDB may hide the details of its schema, as much for its own benefit as our banker's.  Additionally, invoking stored procedures is simpler than typing all the SQL each time, making it easier for more bankers to use the database even if they must still learn some syntax.

If SQL is the lowest-level language for manipulating relational database tables--a 1st generation language--stored procedures can be thought of as a less low-level, or 2nd generation, language.  Using stored procedures, our example above may be simplified:

EXEC ACCOUNT_DEPOSIT("248 555 2960", $100.00)

How ACCOUNT_DEPOSIT is implemented is hidden both by virtue and necessity. By virtue because bankers don’t have to remember all the details of an account deposit, and by necessity because such an interface is required to provide perfect execution—the database is always updated consistently no matter who invokes the procedure. Additionally, the procedure is free to change its implementation without affecting bankers as long as the order, type, and number of the procedure’s arguments are unchanged.

The reasons for the procedure’s change are also hidden from the procedure’s users. Its implementation may have changed because of new features or schema change. Regardless the reason, the procedure’s consumers benefit by its improved implementation without needing to change what they already know and the processes they’ve already documented.

It’s worth noting that an RDB that provides stored procedures is very much like an object in a traditional object-oriented point-of-view. Just as objects implement publicly-accessible methods to hide their implementation our banking RDB schema implements publicly-accessible procedures to hide its implementation.

Our banking database's stored procedures define its Application Programming Interface.  Any user can use the stored procedures to effect perfect transactions.

It’s important to pause here and contemplate an important inside-out feature. Any user can use the stored procedures to affect perfect transactions. One banker may be a teller, another may be an ATM, or a Point-of-Service terminal, or still another may be a web page.

Even though our implementation requires that applications (tellers, ATMs, POSs, etc.) have access to our database, no other technical hurdle is erected.  Any programming language that provides a library to access our RDB is capable of executing perfect transactions.  In this sense, the surface area of our system has been increased.  We've simultaneously improved our system's integrity and increased its utility to other languages and applications.

Outside-in designs approach this differently.  It is all too commonplace for applications to be designed from the outside-in--the user interface designed first and the supporting infrastructure afterwards.  The result, though possibly to the user's liking, is only as capable as it will ever be.  It has only a single interface, and its supporting mechanisms implement only that interface's required features.  It has little surface area.

So now our banker has perfect memory and perfect execution. In the next article we’ll explore inside-out’s next super-human enhancement—ubiquity.

Tuesday, December 2, 2008

If it's not in Bugzilla, it doesn't exist



There are many ways to manage projects. Just because I understand time estimates are important doesn't mean I have to like or believe them.

An alternative to time-lines and resources estimates is to manage development, enhancements and fixes with little more than a defect tracking system. At InStream we used Bugzilla.

Using Bugzilla or any defect tracking tool as a substitute for project management software may not work for everybody, but it worked well for us. Below I'll describe why and how we used it.

As the development team at InStream grew larger and end-user requests more frequent, we did what most companies do--created a technology steering committee to track and prioritize enhancements and fixes according to the priorities of the business.  We had a board with 3x5 cards on it we filled out with each request, and we put them into buckets on the board describing what might be done one week out, two weeks out, plus a when-we-get-to-it-we'll-get-to-it category.

The committee consisted of the COO, the CTO (myself), the development staff, the QA manager, the CCO (chief credit officer), and some of our end-users.

A project manager was appointed whose job was to organize the cards after our meetings into a software package that tracked the requests and the progress made on them, and to prepare for the next meeting with updates for the entire committee.

A funny thing happened over the next few weeks.  It turns out our development staff was so quick at implementing features and fixing bugs that the steering committee was unable to keep up with the progress.  More time was spent trying to keep "The Project" updated and current than was required to enhance the software.

The developers had recently started using Bugzilla to organize themselves and give me insight into what they were doing during the day.  We were using it so well, in fact, that we proposed disbanding the committee in favor of relying on Bugzilla--with a few usage guidelines.

Rule Number One

Whether it was a bug, feature request, or fix, I had a simple rule for all our users and developers: If it's not in Bugzilla it doesn't exist.

For end-users it meant that everything they wanted the system to do, or anything they thought needed fixing, or anything they thought could look better or perform faster had to be entered into the system--by them.

Users couldn't complain about a bug they hadn't reported.  They couldn't be waiting for a feature they hadn't asked for.  By entering the bug themselves, users took ownership of the bug's reporting, its description, and ultimately (and this is important) its closing.  A bug wasn't closed until the user confirmed the fix in production.

A side-benefit of using Bugzilla is that it also became our working requirements tool.  Users would describe what they thought they needed, developers would ask questions about it, users would clarify, developers would confirm, and the end result was a complete audit trail of a design requirement, followed from definition, implementation, and deployment through end-user acceptance.

Does your project management software do that?

For developers it meant they didn't work on anything that didn't exist in Bugzilla even if they had to enter it themselves.

One of the benefits of a defect tracking system over project management is the ability to create tasks (incidents, bugs, items, whatever you want to call them) to document what it is your developers do all day. Bugzilla was then able to report who opened items, who worked on them, who checked-in the fixes, and when the items were resolved.

As a manager I discovered it was more valuable to monitor the velocity of my staff's productivity than the time they spent being productive.  As the system's original developer (since kicked out of coding by my staff) I discovered I could use Bugzilla as a way to program through my staff: instead of writing Smalltalk or PHP, I only needed to describe what I wanted and it would find its way into the code base.

Making Bugzilla easy for end-users meant relieving them of having to answer all the questions Bugzilla asks.  We agreed that end-users were only responsible for the description and for prioritizing requests so engineering had an idea how important each was to them.

Each new bug would go through triage, usually by a developer. It was the developer's responsibility to figure out which product the bug related to, which category, and what the bug's severity was.

And because Bugzilla copies bug owners on everything that happens to their requests, our end-users never had to ask whether something was being worked on or what its status was.  They received email updates every time a bug's status changed and learned to get excited when they saw the CVS COMMIT messages recorded to their requests.

Engineering and QA shared the responsibility of determining which fixes would be included in which releases. We delivered to production both hot fixes and releases.

Hot fixes consisted of bug fixes and enhancements with minimal or isolated impact to the database that could be moved into production with few or no side effects.  Hot fixes could occur daily, and it was not unusual for cosmetic or low-impact bugs to be corrected the same day.

Full releases were reserved for database changes that impacted either many systems or our posting programs.  Since protecting the database was our production rule #1, we were careful that database changes and the posting programs were well tested before releasing them into production.

Thursday, May 1, 2008

The next big thing

Joel Spolsky is the president of Fog Creek Software and a frequent commentator on the software development industry.  His latest article, Architecture Astronauts, criticizes Microsoft's continued re-invention of something no one seems to want.

Read Joel's article to get the full comic effect, but here's a pertinent excerpt:
When did the first sync web sites start coming out? 1999? There were a million versions. xdrive, mydrive, idrive, youdrive, wealldrive for ice cream. Nobody cared then and nobody cares now, because synchronizing files is just not a killer application. I'm sorry. It seems like it should be. But it's not.
A killer application would certainly be the next big thing.  If you're unsure what a killer application is, think of the first word processor, spreadsheet, or database program.  Some of you may not appreciate the impact a killer application can have on the world because the last one was Tim Berners-Lee's introduction of the World Wide Web in 1991--17 years ago!

As it relates to "the next big thing," or what users really want, two things popped into my mind immediately after reading Joel's essay.  The first is my frustration with needing a different user ID for every website that requires registration.  As if to add insult to injury, when I went to comment on Joel's essay on Reddit I had to create Yet Another Account Profile (YAAP).  I was reminded of the second while reading other users' comments and noticing how poorly discussion forums are implemented as web applications.

There are many companies and portals that pretend to provide single sign-on.  The idea is that users create a single account, including user ID and password, and are automatically credentialed for multiple applications across the internet.  The problem I see with the current approach is two-fold.  First, I don't trust many companies to be the guardians of my "official" profile, due to my suspicion of their ulterior motives.  Will my profile information be sold?  Will it be harvested by advertising companies?  What will the company or their "partners" do with the information about other sites I authenticate to using their credentials?

Microsoft Passport wanted to be a single sign-on for the internet, but Microsoft had already demonstrated their contempt for users by making it so difficult to verify the authenticity of my Windows license when simply upgrading my computer--much less throwing it out and replacing it with a new one.  Even Microsoft seems to have acknowledged Passport's reputation by dropping it.  Of course, not willing to let go of control completely, they re-invented it as Windows Live.

Do you really want to trust Microsoft with your profile after their Orwellian Windows Genuine Advantage patch? 

There are entities I might be willing to trust.  First is the US Post Office.  We already trust them to deliver our mail, first class and bulk, desirable or not, and best of all--everything is brought to my doorstep by a uniformed representative of the United States Government.

Perhaps out of necessity, I also trust my bank.  Even if it is out of necessity, my credit union hasn't given me cause to believe they want to own me.  Instead, my credit union (and bank before that) actually trusts me with their money for my credit card, car loan, mortgage, and home equity LOC.

It's a place to start, anyway. OK, two places to start. 

I'll discuss the next thing in the next article, which I'm thinking of calling "The next big thing should stop ruining the last good thing."

Tuesday, July 10, 2007

Bad Idea : Outsourcing Intellectual Property

A familiar echo

A colleague of mine has a theory about why Vista requires 2GB of RAM and a late-model CPU to run satisfactorily.  He believes this is likely the first edition of Microsoft's flagship operating system primarily developed in India rather than Redmond, Washington.

Except for press reports of Microsoft's huge investments in China and India, and their outsourcing of development to those countries, I'm unaware of precisely what is being outsourced and what measures Microsoft has taken to ensure a quality product--quality being measured not just in bugs and resilience to breakdown, but in the qualities experienced programmers know can exist in the code itself: the economy of expression, elegant algorithms, brilliant structures and modularization.  Unless Microsoft releases Vista's source code, which I think unlikely, we'll never know for sure whether Vista has the hidden qualities Paul Graham describes in his essay, Hackers and Painters.

The shot heard round the boardroom

Our development team was asked by one of our largest investors to visit another company he owned and analyze their software, development methodologies, and testing procedures.  No greater compliment could have been paid us.  The company in question was on the verge of signing a large contract with the potential for significant revenue growth--and pressure on the existing software platform.  Company directors were anxious about the deal because the software was showing significant signs of stress.  When we visited, there were over 800 bugs listed as critical.  Among them were reports that took too long to be usable, customers who could see other customers' data, and broken invoicing.

We'll skip the messy details, but there are some red flags that predicted their problems. To protect the innocent and guilty alike we'll call the company Newco.

The good

Newco had a great start. Their innovative web-delivered service was easy to learn and use. They didn't need the overhead of a sales staff because the service was self-enrolled. Membership included newsletters with helpful articles both on using the system and advice from industry professionals. Additionally, because the service required only an internet connection it was priced competitively and easily won business from other providers.

The bad

Curiously, Newco's management had no previous experience in either their product's industry or software development. They created the service and attracted quality investors, but that was pretty much the end of their most valuable contributions.

Neither Newco nor their directors realized they were in the software business.  True, the service wasn't software related, but the entirety of Newco's intellectual property was invested in the software.  The danger of not knowing what business you're in is loss of focus.  In this case the loss of focus wasn't a mere distraction; it was complete misdirection.  Instead of jealously guarding and nurturing the thing that defined their company--the software--their attentions were elsewhere.  From the beginning, software development was an expense to be minimized rather than aggressively invested in.

The ugly

Newco's management was filled with large-company escapees who approached a small company's software development the same way a large company might: as simple project management.  All they had to do was find inexpensive labor, describe the requirements, agree on delivery dates, and hold the developer to them.

Their CTOs either weren't experienced at developing software or weren't given the opportunity.  The last CTO had no experience writing or designing software (or in Newco's industry) but instead had many years of experience managing projects at a large IT consulting firm.

They peddled their IP for development to outside contractors across three countries and two continents--none of them domestic.  This isn't an indictment of the quality available from overseas developers, but evidence of how far away, geographically and culturally, they dispatched their company's jewels.  All the while, they had no in-house technical expertise to measure or critique the software's design or engineering.

Ultimately, Newco lost complete control of the software: its design, its host operating system, the database, development tools, infrastructure tools, language, and issue tracking.  In short, they'd lost their ability to be self-determining and had become completely dependent on other parties for their survival.  By the time we arrived, their own intellectual property was completely foreign to them, both literally and figuratively.

The clever bookend

Which brings us back to Redmond.  If my colleague's suspicions are true, what might that say about the business Microsoft is in?  It may be they're perfectly capable of managing off-shore development with greater competence than Newco possessed.  Or it may indicate a significant change of direction for Microsoft--demonstrating it's no longer in the software development business as much as it is in another business, perhaps the patent and property protection business.

Microsoft is certainly a large company--perhaps one of the largest.  It has certainly exercised its marketing, legal, and acquisition might, with the financial resources to back them up.  And now that its head is turned toward activities unrelated to the actual exercise of writing its own software, an opportunity has been created for other companies--ones focused on writing their own software and jealously guarding it--to establish a beach-head that wouldn't have been imaginable not too many years ago.

Can you say Google?

Newco was eventually sold at a discount to a competitor for the only thing it possessed worth paying for--its customer list.

Monday, June 18, 2007

Databases as Objects: My schema is a class

In my previous article I wrote that the database is the biggest object in my system. If that is the case, I should be able to test the concept against the Gang of Four's Design Patterns to see how the idea holds up. 

But before doing that I need to define, in database terms, what classes are and what their instances may look like. 

In OO terms, a class is a template that defines what its instances look like.  Cincom's VisualWorks Smalltalk's Date class defines two instance variables, day and year.  Given those two instance variables, any instance of the Date class can keep track of a date.

My database has a schema. That schema can be executed as a sequence of data definition language (DDL) statements to create a new instance. In addition to our production database we have multiple other instances created with the same schema our developers and quality analysts use to test the system. 

Part of a class's template defines its instance methods: which operations it supports, and what behaviors a user of any of the class's instances can expect to be available.  Inside a class hierarchy, classes inherit the behavior of their superclasses--the classes from which they derive their base behavior.  A class can add new behavior or override inherited behavior to create an object with unique capabilities not available in any of its ancestors.

Before I extend any of my database's behaviors, it, too, has default behaviors.  At the lowest level I can use SQL statements to introspect and interact with my database in all kinds of low-level ways.  On their own, these low-level behaviors know nothing of my application or its unique abilities and requirements.  Like a class, though, I can add new behavior or even override default behavior using stored procedures and views to provide unique capabilities that would otherwise be unavailable or impractical.

In the world of Sybase, every database inherits the attributes and behavior of a database named Model. 

Model
  |
Efinnet

By itself, this is beginning to look like a class tree--though a very shallow one.  Something's belonging to a tree isn't more or less probable based on the tree's depth (or lack of it).  In fact, many OO designers advocate for shallower hierarchies.  In either respect, our database fits right in.

We already talked about instance variables and methods, but what are some of the other OO-ish things my database can do? 

Persistence - One of its most important features is its ability to persist itself on disk and maintain its integrity.  The entire state of my system is preserved and maintained inside my database object.

Introspection - My database can tell me things about itself, its variables, and its methods.

Composition - My database is composed of other objects called tables. Some of the tables were inherited from its superclass, others were added to extend its functionality.

Singleton - Instances of my database exist as singletons. For each instance of my system one, and exactly one, instance of my database exists to preserve and protect the state of my system. 

Messages - The only way I can communicate with it is by sending messages to it.  I cannot (and care not to) manipulate its data directly at a low level (disk) because that would risk its integrity--not in a referential way, but at a disk-level consistency way.

Extendability - I can extend my database's schema to define new variables (tables) and behaviors (procedures).  Even better, I can apply the new schema to its instances.

It's amazing it took me 20+ years to recognize the similarities between objects and databases. But now that I'm confident my database is an instance of my schema and in other important respects is in fact an object (singleton) of its own, I can start visiting various of the GoF's patterns to see how well they apply. 