Tuesday, April 13, 2010

Inside-out design : Parts I and II



The topic of bottom-up vs. top-down design has accumulated a lot of baggage since the two descriptions of system design were first introduced in the 1970s. Perhaps both are well understood; more likely, many assume they understand both. This series of articles introduces the terms inside-out and outside-in to help readers visualize a three-dimensional design (an onion's layers are a good example) rather than a two-dimensional design resembling tree rings.

Business software is discovered, not invented. Arguments that computer technology has fundamentally changed business, or even invented it, are exaggerated. The business of banking remains much the same as it was 150 years ago: deposits and loans. Insurance remains much the same: pay smaller amounts now for the promise of covering expenses later. Retailing, logistics, and drafting are also mostly unchanged.

If computers haven't invented these businesses, what can we truthfully assert they have done? We can assert that they've helped make humans, both individually and collectively, super-human. The way in which software has incrementally accomplished this feat can be described as from the inside out. This article will elaborate on what inside-out design is, use it as a model for how new software projects should be designed and developed, and describe how inside-out design (IOD) avoids the many shortcomings of alternative approaches.

“We can make him better than he was before. Better, stronger, faster.” 
Introduction to The Six Million Dollar Man

Though banking may be a complicated business, its basic activities are simple. Customers “deposit” money at the bank and are paid interest. Banks pay a lower interest rate on deposits than they earn lending money to other customers.

Perfect Memory

Our first step in creating a super-human banker is to improve their memory, regardless of age. How many customers, account balances, and interest rates for each can a human remember perfectly on their own? Whatever that number is, a banker who can remember 100 times more will be more profitable, and a banker who can remember 100 times more than that, more profitable still. A banker with perfect memory is limited only by their productivity and efficiency, but we'll address that later.

Perfect memory is what databases provide the banker. A database is capable of remembering, perfectly, the name of every customer, their address, phone number, accounts, account balances, transaction history, and even their relationships to other customers and their accounts.

This is the core of our inside-out design. The business already existed—all we did was discover and record the banking schema into a database.
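
To make that concrete, here is a minimal sketch of what such a discovered schema might look like. The CUSTOMER, ACCOUNT, and TRAN_HISTORY names match the SQL examples that follow; the column types and the NAME, BALANCE, TRAN_TYPE, and POSTED_AT columns are assumptions added only for illustration.

CREATE TABLE CUSTOMER (
    CUSTOMER_KEY    INT          NOT NULL PRIMARY KEY,
    NAME            VARCHAR(80)  NOT NULL,
    PHONE_NUMBER    VARCHAR(20)  NULL
)

CREATE TABLE ACCOUNT (
    ACCOUNT_NUMBER  INT          NOT NULL PRIMARY KEY,
    OWNER_KEY       INT          NOT NULL REFERENCES CUSTOMER (CUSTOMER_KEY),
    BALANCE         MONEY        NOT NULL DEFAULT 0
)

CREATE TABLE TRAN_HISTORY (
    ACCOUNT_NUMBER  INT          NOT NULL REFERENCES ACCOUNT (ACCOUNT_NUMBER),
    TRAN_TYPE       VARCHAR(20)  NOT NULL,   -- 'DEPOSIT', 'WITHDRAWAL', 'NSF FEE', ...
    AMOUNT          MONEY        NOT NULL,
    POSTED_AT       DATETIME     NOT NULL DEFAULT GETDATE()
)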

If nothing else is done, our banker may already be better off than they were before. Without any additional features the possibilities are nearly endless. Anything that can be stored in the database is stored perfectly. Any number or type of account and any number or type of transaction can be perfectly stored and perfectly retrieved.

Much more can be written about the benefits of relational databases, and indeed much already has been, not least their basis in relational set theory, referential integrity, and normalization.

But even with mathematically provably correct data, perfect memory can still be tarnished by imperfect manipulation. The next layer will enhance the first with perfect execution.

Perfect Execution

With perfect memory our banker will never forget your name or account balance. They simply record each of your transactions in their database.
If this were a relational database, our banker could use SQL. Using SQL they can find your account number from your name or phone number:

DECLARE @ACCOUNT_NUMBER INT

SELECT @ACCOUNT_NUMBER = ACCOUNT.ACCOUNT_NUMBER
  FROM CUSTOMER
  JOIN ACCOUNT
    ON ACCOUNT.OWNER_KEY = CUSTOMER.CUSTOMER_KEY
 WHERE CUSTOMER.PHONE_NUMBER = '248 555 2960'

Once they have your account number, they can enter the transaction:

INSERT INTO TRAN_HISTORY (ACCOUNT_NUMBER, TRAN_TYPE, AMOUNT)
VALUES (@ACCOUNT_NUMBER, 'DEPOSIT', $100.00)


Depending on how “perfect” their database is, how many accounts the customer has, whether they recently bounced a check and must pay an NSF fee, and how accounts feed the general ledger, more SQL will likely be required to keep everything “perfect.”

So even though the banker can remember perfectly what was done, they may have difficulty remembering how to do it.

Most contemporary relational databases provide a mechanism for building SQL macros or functions called stored procedures. Stored procedures extend the syntax of SQL and provide a mechanism for storing the function inside the database itself. In this manner an RDB may hide the details of its schema, as much for its own benefit as our banker's. Additionally, invoking stored procedures is simpler than typing all the SQL each time, making it easier for more bankers to use the database even if they must still learn some syntax.

If SQL is the lowest-level language for manipulating relational database tables (a first-generation language), stored procedures can be thought of as a less low-level, or second-generation, language. Using a stored procedure, the example above may be simplified:

EXEC ACCOUNT_DEPOSIT '248 555 2960', $100.00

How ACCOUNT_DEPOSIT is implemented is hidden both by virtue and necessity. By virtue because bankers don’t have to remember all the details of an account deposit, and by necessity because such an interface is required to provide perfect execution—the database is always updated consistently no matter who invokes the procedure. Additionally, the procedure is free to change its implementation without affecting bankers as long as the order, type, and number of the procedure’s arguments are unchanged.
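
For illustration only, here is one way ACCOUNT_DEPOSIT might be implemented behind that interface. This is a sketch that assumes the hypothetical tables sketched earlier and ignores NSF fees, general-ledger postings, missing accounts, and error handling; the real procedure would handle all of those.

CREATE PROCEDURE ACCOUNT_DEPOSIT
    @PHONE_NUMBER  VARCHAR(20),
    @AMOUNT        MONEY
AS
BEGIN
    DECLARE @ACCOUNT_NUMBER INT

    -- Find the account the same way the hand-typed SQL did
    SELECT @ACCOUNT_NUMBER = ACCOUNT.ACCOUNT_NUMBER
      FROM CUSTOMER
      JOIN ACCOUNT
        ON ACCOUNT.OWNER_KEY = CUSTOMER.CUSTOMER_KEY
     WHERE CUSTOMER.PHONE_NUMBER = @PHONE_NUMBER

    -- Record the deposit and update the balance as a single unit of work
    BEGIN TRANSACTION

    INSERT INTO TRAN_HISTORY (ACCOUNT_NUMBER, TRAN_TYPE, AMOUNT)
    VALUES (@ACCOUNT_NUMBER, 'DEPOSIT', @AMOUNT)

    UPDATE ACCOUNT
       SET BALANCE = BALANCE + @AMOUNT
     WHERE ACCOUNT_NUMBER = @ACCOUNT_NUMBER

    COMMIT TRANSACTION
END

The teller only ever sees the one-line EXEC; the consistency rules live in one place.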

The reasons for the procedure's changes are also hidden from the procedure's users. Its implementation may have changed because of new features or a schema change. Regardless of the reason, the procedure's consumers benefit from its improved implementation without needing to change what they already know or the processes they've already documented.

It's worth noting that an RDB that provides stored procedures is very much like an object from a traditional object-oriented point of view. Just as objects implement publicly-accessible methods to hide their implementation, our banking RDB schema implements publicly-accessible procedures to hide its implementation.

Our banking database's stored procedures define its Application Programming Interface. Any user can use the stored procedures to effect perfect transactions.

It's important to pause here and contemplate a key inside-out feature. Any user can use the stored procedures to effect perfect transactions. One banker may be a teller, another may be an ATM or a Point-of-Service terminal, and still another may be a web page.

Even though our implementation requires that applications (tellers, ATMs, POSs, etc.) have access to our database, no other technical hurdle is erected. Any programming language that provides a library to access our RDB is capable of executing perfect transactions. In this sense, the surface area of our system has been increased. We've simultaneously improved our system's integrity while increasing its utility to other languages and applications.

Outside-in designs may approach this differently. It is too commonplace for applications to be designed from the outside-in—designing the user interface first and the supporting infrastructure afterwards. The result, though possibly to the user’s liking, is only as capable as it will ever be. It has only a single interface and its supporting mechanisms implement only that interface's required features. It has little surface area.

So now our banker has perfect memory and perfect execution. In the next article we’ll explore inside-out’s next super-human enhancement—ubiquity.

Tuesday, December 2, 2008

If it's not in Bugzilla, it doesn't exist



There are many ways to manage projects. Just because I understand time estimates are important doesn't mean I have to like or believe them.

An alternative to timelines and resource estimates is to manage development, enhancements, and fixes with little more than a defect tracking system. At InStream we used Bugzilla.

Using Bugzilla or any defect tracking tool as a substitute for project management software may not work for everybody, but it worked well for us. Below I'll describe why and how we used it.

As the development team at InStream grew larger and end-user requests more frequent, we did what most companies do--create a technology steering committee to track and prioritize enhancements and fixes so they more closely matched the priorities of our business. We had a board of 3x5 cards, one filled out for each request, sorted into buckets describing what might be done one week out, two weeks out, and a when-we-get-to-it-we'll-get-to-it category.

The committee consisted of the COO, the CTO (myself), the development staff, the QA manager, the CCO (chief credit officer), and some of our end-users.

A project manager was appointed whose job was to organize the cards after our meetings into a software package that tracked the requests and the progress on them, and to prepare updates for the entire committee before the next meeting.

A funny thing happened over the next few weeks. It turned out our development staff was so quick at implementing features and fixing bugs that the steering committee wasn't able to keep up with the progress. More time was spent trying to keep "The Project" updated and current than was required to enhance the software.

The developers had recently started using Bugzilla to organize themselves and give me insight into what they were doing during the day. We were using it so well, in fact, that we proposed disbanding the committee in favor of relying on Bugzilla--with a few usage guidelines.

Rule Number One

Whether it was a bug, feature request, or fix, I had a simple rule for all our users and developers: If it's not in Bugzilla it doesn't exist.

For end-users it meant that everything they wanted the system to do, or anything they thought needed fixing, or anything they thought could look better or perform faster had to be entered into the system--by them.

Users couldn't complain about a bug they hadn't reported. They couldn't be waiting for a feature they hadn't asked for. By entering the bug themselves users took ownership of the bug's reporting, its description, and ultimately (and this is important) its closing. A bug wasn't closed until the user confirmed it in production.

A side-benefit of using Bugzilla is that it also became our working requirements tool. Users would describe what they thought they needed, developers would ask questions about it, users would clarify, developers would confirm, and the end result was a complete audit trail of a design requirement, followed from definition through implementation and deployment to end-user acceptance.

Does your project management software do that?

For developers it meant they didn't work on anything that didn't exist in Bugzilla even if they had to enter it themselves.

One of the benefits of a defect tracking system over project management software is the ability to create tasks (incidents, bugs, items, whatever you want to call them) to document what it is your developers do all day. Bugzilla was then able to report who opened items, who worked on them, who checked in the fixes, and when the items were resolved.

As a manager I found it more valuable to monitor the velocity of my staff's productivity than the time they spent being productive. As the system's original developer (but kicked out of coding by my staff) I discovered I could use Bugzilla as a way to program through my staff, except instead of writing Smalltalk or PHP I only needed to describe what I wanted the system to do and it would find its way into the code base.

Making Bugzilla easy for end-users meant relieving them of having to answer all the questions Bugzilla asks. We agreed that end-users were responsible only for describing and prioritizing requests, so engineering had an idea how important each was to them.

Each new bug would go through triage, usually by a developer. It was the developer's responsibility to figure out which product the bug related to, which category, and what the bug's severity was.

And because Bugzilla copies bug owners on everything that happens to their requests, our end-users never had to ask if something was being worked on or what its status was. They received email updates every time a bug's status changed and learned to get excited when they saw the CVS COMMIT messages recorded to their requests.

Engineering and QA shared the responsibility of determining which fixes would be included in which releases. We delivered to production both hot fixes and releases.

Hot fixes consisted of bug fixes and enhancements with minimal or isolated impact on the database that could be moved into production with few or no side effects. Hot fixes could occur daily, and it was not unusual for cosmetic or low-impact bugs to be corrected same-day.

Full releases were reserved for database changes that impacted either many systems or our posting programs. Since protecting the database was our production rule #1, we were careful that database changes and the posting programs were well tested before releasing them into production.

Thursday, May 1, 2008

The next big thing

Joel Spolsky is the president of Fog Creek Software and a frequent commentator on the software development industry. His latest article, Architecture Astronauts, criticizes Microsoft's continued re-invention of something no one seems to want.

Read Joel's article to get the full comic effect, but here's a pertinent excerpt:
When did the first sync web sites start coming out? 1999? There were a million versions. xdrive, mydrive, idrive, youdrive, wealldrive for ice cream. Nobody cared then and nobody cares now, because synchronizing files is just not a killer application. I'm sorry. It seems like it should be. But it's not.
A killer application would certainly be the next big thing. If you're unsure what a killer application is, think of the first word processor, spreadsheet, or database program. Some of you may not appreciate the impact a killer application can have on the world because the last killer application was Tim Berners-Lee's introduction of the World Wide Web in 1991--17 years ago!

As it relates to "the next big thing" or what users really want, two things popped into my mind immediately after reading Joel's essay. The first is my frustration with needing a different user ID for every website that requires registration. As if to add insult to injury, when I went to comment on Joel's essay on Reddit I had to create Yet Another Account Profile (YAAP). I was reminded of the second while reading other users' comments and noticing how poorly discussion forums are implemented as web applications.

There are many companies and portals that pretend to provide single sign-on. The idea is that users create a single account, including user ID and password, and are automatically credentialed for multiple applications across the internet. The problem I see with the current approach is two-fold. First, I don't trust many companies to be the guardians of my "official" profile, due to my suspicion of their ulterior motives. Will my profile information be sold? Will it be harvested by advertising companies? What will the company or their "partners" do with the information about other sites I authenticate to using their credentials?

Microsoft Passport wanted to be a single sign-on for the internet, but Microsoft had already demonstrated their contempt for users by making it so difficult to verify the authenticity of my Windows license when simply upgrading my computer--much less when throwing it out and replacing it with a new one. Even Microsoft seems to have acknowledged Passport's reputation by dropping it. Of course, not willing to let go of control completely, they re-invented it as Windows Live.

Do you really want to trust Microsoft with your profile after their Orwellian Windows Genuine Advantage patch? 

There are entities I might be willing to trust. The first is the US Post Office. We already trust them to deliver our mail, first class and bulk, desirable or not, and best of all--everything is brought to my doorstep by a uniformed representative of the United States Government.

Perhaps out of necessity, I also trust my bank. Even if it is out of necessity, my credit union hasn't given me cause to believe they want to own me. Instead, my credit union (and bank before that) actually trust me with their money for my credit card, car loan, mortgage, and home equity LOC. 

It's a place to start, anyway. OK, two places to start. 

I'll discuss the next thing in the next article, which I'm thinking of calling "The next big thing should stop ruining the last good thing."

Tuesday, July 10, 2007

Bad Idea : Outsourcing Intellectual Property

A familiar echo

A colleague of mine has a theory about why Vista requires 2GB of RAM and a late-model CPU to run satisfactorily. He believes this is likely the first edition of Microsoft's flagship operating system developed primarily in India rather than Redmond, Washington.

Except for press reports of Microsoft's huge investments in China and India, and their outsourcing of development to those countries, I'm unaware of precisely what is being outsourced and what measures Microsoft has taken to ensure a quality product. Quality measured not just in bugs and resilience to breakdown, but in the quality experienced programmers know can exist in the code itself. The economy of expression. Elegant algorithms. Brilliant structures and modularization. Unless Microsoft releases Vista's source code, which I think unlikely, we'll never know for sure whether Vista has the hidden qualities Paul Graham describes in his essay, Hackers and Painters.

The shot heard round the boardroom

Our development team was asked by one of our largest investors to visit another company he owned and analyze their software, development methodologies, and testing procedures. No greater compliment could have been paid us. The company in question was on the verge of signing a large contract that had the potential for significant revenue growth and pressure on the existing software platform. Company directors were anxious about the deal because the software was showing significant signs of stress. When we visited, there were over 800 bugs listed as critical. Among them: reports that took too long to be usable, customers who could see other customers' data, and broken invoicing.

We'll skip the messy details, but there are some red flags that predicted their problems. To protect the innocent and guilty alike we'll call the company Newco.

The good

Newco had a great start. Their innovative web-delivered service was easy to learn and use. They didn't need the overhead of a sales staff because the service was self-enrolled. Membership included newsletters with helpful articles both on using the system and advice from industry professionals. Additionally, because the service required only an internet connection it was priced competitively and easily won business from other providers.

The bad

Curiously, Newco's management had no previous experience in either their product's industry or software development. They created the service and attracted quality investors, but that was pretty much the end of their most valuable contributions.

Neither Newco nor its directors realized they were in the software business. True, the service itself wasn't software-related, but the entirety of Newco's intellectual property was invested in the software. The danger of not knowing what business you're in is loss of focus. In this case the loss of focus wasn't a mere distraction, it was completely misdirected. Instead of jealously guarding and nurturing that which defined their company, the software, their attentions were elsewhere. From the beginning, software development was an expense to be minimized rather than aggressively invested in.

The ugly

Newco's management was filled with large-company escapees who approached a small company's software development the same way a large company might: as simple project management. All they had to do was find inexpensive labor, describe the requirements, agree on delivery dates, and hold the developer to them.

Their CTOs either were not experienced at developing software or weren't given the opportunity. The last CTO had no experience writing or designing software (or in Newco's industry) but instead had many years of experience managing projects at a large IT consulting firm.

They farmed out development of their IP to outside contractors across three countries and two continents--none of them domestic. This isn't an indictment of the quality available from overseas developers, but evidence of how far away, geographically and culturally, they dispatched their company's jewels. All the while they had no in-house technical expertise to measure or critique the software's design or engineering.

Ultimately, Newco lost complete control of the software: its design, its host operating system, the database, development tools, infrastructure tools, language, and issue tracking. In short, they'd lost their ability to be self-determining and had become completely dependent on other parties for their survival. By the time we arrived their own intellectual property was completely foreign to them, both literally and figuratively.

The clever bookend

Which brings us back to Redmond. If my colleague's suspicions are true, what might that say about the business Microsoft is in? It may be they're perfectly capable of managing off-shore development with greater competence than Newco possessed. Or it may indicate a significant change of direction for Microsoft--demonstrating it's no longer in the software development business as much as it is in another business, perhaps the patent and property protection business?

Microsoft is certainly a large company. Perhaps one of the largest. It has certainly exercised its marketing, legal, and acquisition might and expertise, with the financial resources to back them up. And now its head is turned toward activities unrelated to the actual exercise of writing its own software, which has created an opportunity for other companies--companies focused on writing their own software and jealously guarding it--to establish a beach-head that wouldn't have been imaginable not too many years ago.

Can you say Google?

Newco was eventually sold at a discount to a competitor for the only thing it possessed worth paying for--its customer list.

Monday, June 18, 2007

Databases as Objects: My schema is a class

In my previous article I wrote that the database is the biggest object in my system. If that is the case, I should be able to test the concept against the Gang of Four's Design Patterns to see how the idea holds up. 

But before doing that I need to define, in database terms, what classes are and what their instances may look like. 

In OO terms, a class is a template that defines what its instances look like. The Date class in Cincom's VisualWorks Smalltalk defines two instance variables, day and year. Given those two instance variables, any Date instance can keep track of a date.

My database has a schema. That schema can be executed as a sequence of data definition language (DDL) statements to create a new instance. In addition to our production database we have multiple other instances, created from the same schema, that our developers and quality analysts use to test the system.
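
In Sybase terms the idea might look something like this. The database names are made up for illustration, and the comment stands in for whatever DDL script actually defines the schema:

-- Stamp out two empty instances
CREATE DATABASE efinnet_prod
CREATE DATABASE efinnet_qa
GO

-- Run the same DDL against each to get two instances of the same "class"
USE efinnet_prod
GO
-- CREATE TABLE ..., CREATE PROCEDURE ..., etc.

USE efinnet_qa
GO
-- same DDL script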

Part of a class' template defines its instances' methods: which operations do they support? What behaviors can a user of any of a class' instances expect to be available? Inside a class hierarchy, classes inherit the behavior of their superclasses--the classes from which they derive their base behavior. A class can add new behavior or override inherited behavior to create an object with unique capabilities not available in any of its ancestors.

Even before I extend any of my database's behaviors, it has default behaviors of its own. At the lowest level I can use SQL statements to introspect and interact with my database in all kinds of low-level ways. On their own, these low-level behaviors know nothing of my application or its unique abilities and requirements. Like a class, though, I can add new behavior or even override default behavior using stored procedures and views, providing unique capabilities that would otherwise be unavailable or impractical.
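
For example, in Sybase the built-in behavior includes introspection through the system catalog, and new behavior can be layered on with a view. The table and column names below are only illustrative, borrowed from the banking example earlier on this page:

-- Introspection: ask the database for its own user tables ('U' = user table)
SELECT name
  FROM sysobjects
 WHERE type = 'U'

-- Added behavior: a view joining customers to their account balances
CREATE VIEW CUSTOMER_BALANCES AS
SELECT CUSTOMER.NAME, ACCOUNT.ACCOUNT_NUMBER, ACCOUNT.BALANCE
  FROM CUSTOMER
  JOIN ACCOUNT
    ON ACCOUNT.OWNER_KEY = CUSTOMER.CUSTOMER_KEY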

In the world of Sybase, every database inherits the attributes and behavior of a database named Model. 

Model
  |
  Efinnet

By itself, this is beginning to look like a class tree--though a very shallow one. Whether something belongs to a tree isn't any more or less probable based on the tree's depth (or lack of it). In fact, many OO designers advocate for shallower hierarchies. In either respect, our database fits right in.

We already talked about instance variables and methods, but what are some of the other OO-ish things my database can do? 

Persistence - One of its most important features is its ability to persist itself on disk and maintain its integrity. The entire state of my system is preserved and maintained inside my database object.

Introspection - My database can tell me things about itself, its variables, and its methods.

Composition - My database is composed of other objects called tables. Some of the tables were inherited from its superclass, others were added to extend its functionality.

Singleton - Instances of my database exist as singletons. For each instance of my system one, and exactly one, instance of my database exists to preserve and protect the state of my system. 

Messages - The only way I can communicate with it is by sending messages to it. I cannot (and care not to) manipulate its data directly at a low level (on disk) because that would risk its integrity--not referential integrity, but disk-level consistency.

Extendability - I can extend my database's schema to define new variables (tables) and behaviors (procedures). Even better, I can apply the new schema to its instances, as sketched below.
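
A minimal sketch of that extendability, again with illustrative names:

-- A new "instance variable": add a column to an existing table
ALTER TABLE ACCOUNT ADD OVERDRAFT_LIMIT MONEY NULL

-- A new "method": a procedure that uses it
CREATE PROCEDURE ACCOUNT_SET_OVERDRAFT
    @ACCOUNT_NUMBER  INT,
    @LIMIT           MONEY
AS
    UPDATE ACCOUNT
       SET OVERDRAFT_LIMIT = @LIMIT
     WHERE ACCOUNT_NUMBER = @ACCOUNT_NUMBER

Run the same script against each database instance and every one of them picks up the new variable and behavior, just as every instance of a class picks up a new method.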

It's amazing it took me 20+ years to recognize the similarities between objects and databases. But now that I'm confident my database is an instance of my schema and in other important respects is in fact an object (singleton) of its own, I can start visiting various of the GoF's patterns to see how well they apply. 

Monday, June 11, 2007

I remember my first time...

A recent ACM Queue Advice Column by Kode Vicious, called Advice to a Newbie, asked:
Do you remember the first time? The first time when, after struggling with a piece of code, you felt not only "I can do this!" but also "I love doing this!"
I still remember that rush. It was addictive. When it happened I decided what I wanted to be when I grew up: a computer programmer.

In 1983 I was a senior at Troy High School, in Michigan. I was taking a computer programming elective at the same time I was taking trigonometry. We were learning BASIC on Apple IIe computers. Our final assignment was to write a graphic animation of something. Anything. Mine was influenced both by being a high-school student and by America's 1981 return to space with NASA's shuttle program.

Using BASIC and the IIe's low-resolution graphics (pixels that seemed the size of a Tic-Tac) I simulated the launch of the Space Shuttle Columbian (did I mention I was in high school?). My rendering of the shuttle was as good as it could have been, considering the resolution, and included a 10-second count-down, smoke, flames, and a lift-off off the top of the screen. After that the shuttle was flying right-to-left, with the appearance of motion provided by stars in the background moving left-to-right. The loops were gigantic. Inside the loops the program made sure the stars disappeared behind the shuttle and reappeared at the appropriate time.

Then the pièce de résistance, a PacMan moved across the screen and gobbled the shuttle into nothingness.

I got an A.

But better than that, I triumphed over the task using BASIC and geometry. The loops moving the stars non-destructively behind the shuttle were nothing compared to the routines to open and close the PacMan's mouth as it moved across the screen. I remember how impressed my parents pretended to be when I showed them the print-out of the code.

I also remember how slow the program ran. It seemed as though everything was happening under water. I could almost make out each line of the PacMan's mouth closing, drawing yellow then black again to open it, as it devoured the space ship.

But then something amazing happened.

Our teacher, Mr. Ralph Nutter, who was my older brother's math teacher and swim coach a few years earlier, demonstrated all our projects in front of the entire class--but now they were compiled into machine language. The lift-off was smooth and the screen almost looked as though it were on fire. Most importantly, my PacMan moved across the screen so smoothly and cleanly the jagged resolution was invisible, and it seemed to race over the shuttle so gloriously I could hear the game's music playing inside my head.

And I was hooked.

That was 24 years ago and, to this day, it remains one of the biggest life-changing events of my life. Almost everything that's happened to me since turned on what happened that last May in 1983, 4th hour, in the closing days of my last year in school.

Wednesday, June 6, 2007

The database is the biggest object in my system

After posting a link to my paper, The TOA of Tom, a couple of interesting discussions occurred inside comp.object. While responding to Bryce Jacobs, a better way of describing what we're doing came to me. It's buried in this excerpt:

In fact, after inspecting multiple C APIs for LDAP, iCAL, and other libraries it appears it's not even foreign to C programmers. Often a structure is used to hold state information for a connection to a resource, but the format of that information isn't exposed to API users except through the API. Even when a method may only be returning a value from the structure, API programmers know that level of indirection affords them flexibility inside the structure without negatively impacting all the API's users.

So a common pattern is used by both C and OO programmers. What my paper is promoting (and I'll try to do a better job explaining) is that the same pattern be applied to how OO programs access the DB.


Essentially, my paper on transaction processing encourages thinking of the database as one big object with all the rules for data hiding and interfaces OO programmers are already acquainted with. 

Why shouldn't applications have embedded SQL? Because it's the same as accessing the private data members of an object. It shouldn't be done. OO programmers know the correct way to interface with an object is to use its method interface--not to attempt direct manipulation of the object's data. OO programmers' attempts to violate that rule are what cause so much frustration mapping the application's data graph into a relational database's tables, rows, and columns. Those things belong to the DB--not to the application.
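
To make the contrast concrete, borrowing the illustrative names from the banking example earlier on this page, the difference is between reaching into the tables directly and asking the database-object to do the work:

-- Embedded SQL: the application manipulates the object's private data directly
UPDATE ACCOUNT
   SET BALANCE = BALANCE + $100.00
 WHERE ACCOUNT_NUMBER = 1001
-- ...and every caller must remember the transaction history, fees, GL postings, ...

-- Method interface: the schema and the rules stay hidden behind the procedure
EXEC ACCOUNT_DEPOSIT '248 555 2960', $100.00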

Now, OO programmers and system designers can return to their favorite Patterns books and reevaluate the lessons from a new perspective. Should make some interesting reading. 