Just because you can do something doesn't mean you should...

Monday, April 12, 2010

Die Flash Die

(yeah, yeah, not the proper type of title, but not a "regular" post either, also I'll throw in that these are my personal thoughts and do not represent those of the company I work for :-) (did you know someone once tried to get me fired when I worked at Symantec for suggesting that they should not ask other people on Usenet to do their homework for them?!))

Recently, as you probably know, Apple released the beta version of the iPhone SDK with new terms stating that, in essence, for non web-based apps, you could only use C/C++/Objective-C.  This shuts out the upcoming Adobe product that compiles Flash to an iPhone app, as well as other things like MonoTouch.

For years I searched for a language that I liked that was cross platform:

- C sort of is, as long as you are careful about sizes of data types.
- C++ is the same as C, but less so back when I was using it, since the compilers were all over the map with their support.  Also C++ is a pretty large language and you get to remember a lot of special cases.  I don't like special cases.
- Java was as close as I could get, and I have been using it since 1.0 Alpha 2, back in 1995 (so coming up on 15 years).

Doing iPhone development you really need to learn Objective-C.  Every time I looked at Objective-C I would shudder.  It is NOT a natural extension of C.  C++ is a natural extension of C.  Objective-C is some whacked out thing.  That being said I sat down with a book or two or three and learned what I needed to.

Back in school we had a C++ course, sadly the instructor didn't know the language (was too new) and he focused on teaching us some pretty useless things.  Once I finished school I taught myself C++ and did a pretty decent job learning what I needed to.  When Java came out it took me about 1 day to learn the syntax, a week to get comfortable in the language, and about a month to understand the libraries (I would say it would take me 6 months to do all of that if I started now since Java has gotten a lot larger).

Objective-C took me about 1 week until I no longer needed to look at the book to write each line of code.  Fortunately for me I was exposed to Smalltalk in school and Objective-C has a lot of Smalltalk ideas.  Once I clued into that it was easier.

I have never looked at developing anything in Flash.  I don't want to develop anything in Flash.  I have Flash disabled in my browser because the last thing I care to see is moving advertisements.  Most Flash games I have seen are "neat" but not "wow cool!".  And there are some sites out there, such as the Lego site, that I cannot see most of the things on when using my iPhone since it is Flash based (sucks when you have an 8 year old and you cannot go to the Lego site!).

So Flash is used for:
1) website navigation when it doesn't add value, and companies are too lazy to develop something people without Flash can see
2) advertisements
3) simple games
4) videos

Why on earth does Flash exist then?  The websites would be better off in HTML.  I never look at ads (since they are blocked), if I want to play a game I do it for an  immersive experience, and for Videos, well there are better alternatives than Flash (see HTML5).

More importantly, do I want those people able to easily write things for anything I am on?  My answer is Hell No!

Also, there are technical merits for Apple to do what they are doing (though I am sure developer locking is what they are really after):

- say the iPhone didn't have a camera.  So Flash/MonoTouch/Whatever doesn't have anything for a camera.
- now a new iPhone is released with a camera and Apple adds a number of camera APIs.
- Flash/MonoTouch/whatever are slow to support the camera.

Now you have the case where other companies are controlling what can happen on Apple's platform.  That doesn't make for a good user experience.


If you are a Flash developer and you are complaining about Apple because you are too lazy to go and learn C/C++/Objective-C this is what I hear:



To the Adobe Flash Evangelist, as the former Symantec Java Evangelist, instead of telling Apple to "Go screw [itself]" I think you should do this instead:




(And, yes, I know the videos are in Flash, and no, the irony is not lost on me...)

Thursday, March 18, 2010

Another Brick in the Wall: Part 1 (Software Design, Part 1)

A while ago Judy asked me for the names of books I would recommend for programmers.  One of them isn't your typical book, and for me it is more of a guiding philosophy than anything.  It is the ISO OSI or the International Organization for Standardization Open System Interconnection Reference Model, also known as the OSI Seven Layer Model (yeah, I know the acronym and the words don't match.... google it!)

The brief background on it is that a long time ago (the '70s) in a galaxy far far away (well not too far actually, given it all happened on Earth) the ISO wanted to come up with a standard way of dealing with computer networks.  To do this they came up with seven layers (remember up above it is called the Seven Layer Model) that could be swapped out with different implementations.

The seven layers are (top to bottom):

Application - What the user interacts with
Presentation - data formatting, like encryption
Session - computer to computer handling, such as remote procedure calls
Transport - data transfer between computers
Network - routing between networks
Datalink - low level data transfer, including things like error detection
Physical - the actual device that sends the data

While this model isn't really used in practice (TCP/IP is instead, which only has 4 layers) the concept is very powerful.  The whole idea is that you can swap out any given layer with a different implementation and it will not change any of the other layers.  For example you could swap out the encryption (presentation layer) without having to make any changes to the layers around it (Application or Session).  The reason why that works is that each layer is only allowed to talk to the immediate neighbours (for example Transport can only talk to Session and Network).
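To make the "swap a layer" idea concrete, here is a minimal sketch in Java.  None of these names come from a real networking stack: Presentation, PlainPresentation, Base64Presentation, and Application are all made up for illustration, and Base64 is only standing in for a real encryption scheme.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical "presentation" layer: the only thing the layer above it knows about.
interface Presentation
{
    String encode(String text);

    String decode(String data);
}

// One implementation: pass the data through untouched.
final class PlainPresentation
    implements Presentation
{
    public String encode(final String text)
    {
        return (text);
    }

    public String decode(final String data)
    {
        return (data);
    }
}

// A drop-in replacement: Base64 stands in for a real encryption scheme (assumption).
final class Base64Presentation
    implements Presentation
{
    public String encode(final String text)
    {
        return (Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8)));
    }

    public String decode(final String data)
    {
        return (new String(Base64.getDecoder().decode(data), StandardCharsets.UTF_8));
    }
}

// The "application" layer talks only to its immediate neighbour.
final class Application
{
    private final Presentation presentation;

    Application(final Presentation presentation)
    {
        this.presentation = presentation;
    }

    // Sends a message down one layer and reads it back up (a loopback, to keep the sketch small).
    String roundTrip(final String message)
    {
        final String wire;

        wire = presentation.encode(message);

        return (presentation.decode(wire));
    }
}
```

Application never changes: new Application(new PlainPresentation()) and new Application(new Base64Presentation()) both hand back the same message, which is the whole point of only talking to your neighbour through a well defined interface.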

So if we don't use something from the '70s then why on earth am I writing about it?  Well, IMNSHO, it is the best way to write software, and the principles that it embodies are very powerful.

A common way to talk about Object Oriented programming a few years ago was the idea of "Software Integrated Circuits" which borrowed ideas from hardware where you could have off the shelf components that you just plug together to make software.  To do that, each piece (class) must have a well defined interface (API - essentially the public methods).  That sort of sounds like something the OSI model would need doesn't it?

The next step is to group the classes into functionality, which makes the layers.  For example, in the current application we are writing we have two sets of "layer" classes:

- Database
- DataSource

The Database class takes care of all of the database access.  It has some sub-packages that it talks to that do the low level work while it provides a nice abstraction.  Any code that wants to talk to the database must go through the Database class and not make direct use of any of the classes in the subpackage.

The exact same idea is used for the DataSource class which takes care of the network communication with the servers we are pulling data from.  Each REST API call we make to a server has a single method in the DataSource class.

To make things testable the Database and DataSource types are interfaces.  This means we can substitute the "NetworkedDataSource" for the "DatabaseDataSource" and pull all of the data from a database instead of from live servers for testing (put the expected data into the database and have the DatabaseDataSource return that data instead of actually making a network connection to a server - you cannot test changing data!).
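A rough sketch of that substitution, under some loud assumptions: fetchItem is a made-up REST call, ItemDescriber is a made-up caller, and a Map plays the part of the database so the example runs on its own.  This is not our actual code, just the shape of it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: one method per REST call (fetchItem is invented for this sketch).
interface DataSource
{
    String fetchItem(String id);
}

// Test stand-in for the DatabaseDataSource idea; a Map plays the part of the database.
final class DatabaseDataSource
    implements DataSource
{
    private final Map<String, String> rows = new HashMap<String, String>();

    // Put the expected data in before the test runs.
    void put(final String id, final String json)
    {
        rows.put(id, json);
    }

    public String fetchItem(final String id)
    {
        final String json;

        json = rows.get(id);

        if(json == null)
        {
            throw new IllegalArgumentException("no canned data for: " + id);
        }

        return (json);
    }
}

// The code under test only ever sees the interface, never a concrete class.
final class ItemDescriber
{
    private final DataSource source;

    ItemDescriber(final DataSource source)
    {
        this.source = source;
    }

    String describe(final String id)
    {
        return (id + " -> " + source.fetchItem(id));
    }
}
```

A NetworkedDataSource would implement the same interface and make the real connection; ItemDescriber would not know or care which one it was handed.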

On the iPhone Apple provides the Core Data API to abstract the database.  It is great!!!!  Unfortunately every book and sample I have found mixes the Core Data code in with the rest of the application.  Same with the network code. 


We, however, have a Database and a Network class on the iPhone and anything that wants to interact with those pieces has to go through those classes.  The end result is that the code is easier to follow, easier to maintain, and less error prone.

So remember kids, when you are writing code, try to isolate as much as possible into layers of functionality where you "hide" most of the work behind a few well defined methods rather than having stuff strewn all over the code.

The Final Cut (If you are deleting code you are doing something right!)

A long time ago I once worked with someone who said something along the lines of "If you are deleting code you are doing something right".  Of course one might wonder why you should write something that needs to be deleted... but if you are not doing that, you are probably doing something wrong.

As you go through your code you should always keep an eye out for similarities - and once you find them get rid of them.  The way I look at it is if your code isn't all the same you are doing something wrong, and once you notice it is the same if you don't make it all different you are doing something wrong (what?!).

Take the example that I just did in our code:

- Connect to server X and request a JSON result
- Parse the JSON result into a class (we use the Gson library from Google for that)
- Store the information for further processing

We also, for now, save some statistics such as the size and amount of time it took to get the result from the server so we can see if there are some patterns in what makes certain requests slower than others (once you have a lot of data you would be surprised what you can find out).

Each connection to a server is handled pretty much the same way: make a URL, connect to it, read the resulting stream.  Each time we parse the JSON stream it is the same (except that the class we store into changes).  The way the data is stored is the same.  The way information is stored in the database is the same.

Take any request cycle above for any website (we have about 5 requests that we make on 3 different sites) and the code should be pretty much the same.  If you were to blur the code (we will call this the "Blurry Code Stage") you would say that the code is all the same - the only difference would be some string constants, a class name, and, perhaps, some variable names. 

It makes perfect sense that code that has to do something similar to something else looks pretty much the same.  If it doesn't look the same then you have a big problem.  Code that does the same sort of thing in a different way is harder to understand (now you need to understand 2 or more ways instead of just one) and you will probably have way more bugs in the code.


Once you are at the "Blurry Code Stage" you should realize that you have done the right thing: you have made the code the same.

Once you are at the "Blurry Code Stage" you should realize that you have done the wrong thing: you have made the code the same.

(WHAT?!????!!!!one!!!!!111?????)

The "Blurry Code Stage" is a turning point; you have done the right thing by making a lot of code follow the same pattern, but at the same time the pattern repeats time and time again which means you can consolidate the code.   Consolidating means deleting code, and "If you are deleting code you are doing something right".  You should strive to get your code to the "Blurry Code Stage" and then strive to remove as much code as possible.

The above example in our code probably winds up being about 100 lines in total (no, I am not going to go check the code and count the lines!).  We have 5 requests.  That means there are about 500 lines when there only needs to be about 100 (that will actually go up a bit because you probably need to introduce some variables and probably some additional logic, so let's say 125 lines instead of 500).
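Here is roughly what the consolidated request cycle could look like.  This is a sketch and not the code from our project: RequestCycle and StreamOpener are names I made up for it, and the parser is passed in as a plain function so the example does not need Gson to run.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Function;

// Hypothetical: the part that differs per request, how to get at the stream.
interface StreamOpener
{
    InputStream open()
        throws IOException;
}

// Hypothetical helper holding the steps that were the same in every request.
final class RequestCycle
{
    private RequestCycle()
    {
    }

    // connect, read the stream, parse it, store the result
    static <T> T run(final StreamOpener opener,
                     final Function<String, T> parser,
                     final List<T> store)
    {
        final String body;
        final T result;

        try(InputStream in = opener.open())
        {
            body = readAll(in);
        }
        catch(final IOException ex)
        {
            throw new UncheckedIOException(ex);
        }

        result = parser.apply(body);
        store.add(result);

        return (result);
    }

    // Reads the whole stream into a String, assuming UTF-8.
    private static String readAll(final InputStream in)
        throws IOException
    {
        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        final byte[] buffer = new byte[4096];
        int count;

        while((count = in.read(buffer)) != -1)
        {
            out.write(buffer, 0, count);
        }

        return (new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

With something like this each request shrinks to the bits that actually differ: an opener along the lines of () -> new URL(address).openStream(), and a parser along the lines of body -> gson.fromJson(body, SomeResult.class), where address and SomeResult are placeholders.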

If you are lucky you will notice after the 3rd one that the code is similar and you should refactor it.  For example, I caught it after the 2nd one and managed to move all of the database code into one place.  That meant that the other three requests automagically had the database code work with no further effort on my part.  I noticed that the website request was the same after the 4th one and took about 120 lines of code.

Last night I noticed that I really have two ways of parsing the JSON data.  I also noticed that I didn't write the code the same way, so I really have three different ways of writing it (and in reality there are four ways I have written that code).  Grrrrrrr!  So how to fix it?  First up - parse the JSON data the same way.  That will get me down to three ways.  Next is to get three ways down to two... which will involve adding some new methods, some of which will do nothing (if they do nothing why add them?!, keep reading, I'll get to it).

Doing all of this in an Object Oriented language (I am using Java) is pretty easy. For example:

public class Add
{
    public int add(final int a, final  int b)
    { 
        if(a < 0)
        {
            throw new  IllegalArgumentException("a must be >= 0, was: " + a);
        } 

        if(b < 0)
        {
            throw new  IllegalArgumentException("b must be >= 0, was: " + b);
        }

        return (a + b);
    }
}

public class Subtract
{
    public int subtract(final int a, final  int b)
    { 
        if(a < 0)
        {
            throw new  IllegalArgumentException("a must be >= 0, was: " + a);
        } 

        if(b < 0)
        {
            throw new  IllegalArgumentException("b must be >= 0, was: " + b);
        }

        return (a - b);
    }
}

Now blur your eyes:

public class Add/Subtract
{
    public int add/subtract(final int a, final  int b)
    { 
        if(a < 0)
        {
            throw new  IllegalArgumentException("a must be >= 0, was: " + a);
        } 
         
        if(b < 0)
        {
            throw new  IllegalArgumentException("b must be >= 0, was: " + b);
        }

        return (a +/- b);
    }
}

You see it is only Add/Subtract, add/subtract, and +/- that are different.  We can easily fix that like this:

public abstract class Operation
{
    public final int perform(final int a, final  int b)
    {
        final int result;

        if(a < 0)
        {
            throw new  IllegalArgumentException("a must be >= 0, was: " + a);
        } 

        if(b < 0)
        {
            throw new  IllegalArgumentException("b must be >= 0, was: " + b);
        }

        result = doPerform(a, b); 

        return (result);
    }
   
    // subclasses supply the actual operation
    protected abstract int doPerform(int a, int b);
}

public class Add
    extends Operation
{
    protected int doPerform(final int a, final int b)
    {
        return (a + b); 
    }
}

public class Subtract
    extends Operation
{
    protected int doPerform(final int a, final int b)
    {
        return (a - b); 
    }
}

Now if you want to add a new operation you just have to extend the Operation class and then override the doPerform method to do the right thing.

In the case of divide we might want to do something special when dividing by zero...

public class Divide
    extends Operation
{
    protected int doPerform(final int a, final int b)
    {
        // do not want a divide by zero error, but also don't necessarily want to handle the special case here... 
        return (a / b); 
    }

    protected void checkNumbers(final int dividend, final int divisor)
    {
        if(divisor == 0)
        {
            throw new IllegalArgumentException("divisor cannot be 0"); 
        }
    }
}

Now we need some way to call it:

public abstract class Operation
{
    public final int perform(final int a, final  int b)
    {
        final int result;

        if(a < 0)
        {
            throw new  IllegalArgumentException("a must be >= 0, was: " + a);
        } 

        if(b < 0)
        {
            throw new  IllegalArgumentException("b must be >= 0, was: " + b);
        }

        checkNumbers(a, b); 
        result = doPerform(a, b);

        return (result);
    }
    
    protected abstract int doPerform(int a, int b);
 
    protected void checkNumbers(final int a, final int b)
    {
        // add and subtract don't do anything, they just inherit this empty method 
    }
}

Great... but now we don't pass the "Blurry Code Test" since the a < 0 and b < 0 are pretty close to the divisor == 0... so I'd do:

public abstract class Operation
{
    public final int perform(final int a, final  int b)
    {
        final int result;

        checkNumbers(a, b); 
        result = doPerform(a, b); 

        return (result);
    }

    protected abstract int doPerform(int a, int b);

    protected void checkNumbers(final int a, final int b)
    {
        if(a < 0)
        {
            throw new  IllegalArgumentException("a must be >= 0, was: " + a);
        } 
    
        if(b < 0)
        {
            throw new  IllegalArgumentException("b must be >= 0, was: " + b);
        }
    }
}

public class Divide
    extends Operation
{
    protected int doPerform(final int a, final int b)
    {
        // do not want a divide by zero error, but also don't  necessarily want to handle the special case here... 

        return (a / b); 
    }

    protected void checkNumbers(final int dividend, final int divisor)
    {
        // could do it this way, or use a doCheckNumbers like the other way 
        super.checkNumbers(dividend, divisor);

        if(divisor == 0)
        {
            throw new IllegalArgumentException("divisor cannot be  0"); 
        }
    }
}

Add and Subtract remain unchanged.

Some newer programmers (and probably some older programmers) don't feel comfortable with this because now you have to jump around the code to follow it. Short answer: get over it. Your methods should be small, and pretty easy to understand.

The benefit is that you can add new operations (multiply for example) without having to do a lot of work. If that work consists of copy/paste then you are at the "Blurry Code Stage" and you really should stop and fix the code instead of continuing on.
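To show how little work a new operation takes, here is multiply.  The Operation class is repeated (without the public modifier, so it all fits in one file) only so the sketch compiles on its own; Multiply is the new part, and it is my example rather than something from the real code.

```java
// Same shape as the final Operation class above, repeated so this compiles by itself.
abstract class Operation
{
    public final int perform(final int a, final int b)
    {
        final int result;

        checkNumbers(a, b);
        result = doPerform(a, b);

        return (result);
    }

    // subclasses supply the actual operation
    protected abstract int doPerform(int a, int b);

    protected void checkNumbers(final int a, final int b)
    {
        if(a < 0)
        {
            throw new IllegalArgumentException("a must be >= 0, was: " + a);
        }

        if(b < 0)
        {
            throw new IllegalArgumentException("b must be >= 0, was: " + b);
        }
    }
}

// The entire cost of the new operation: one small class.
class Multiply
    extends Operation
{
    @Override
    protected int doPerform(final int a, final int b)
    {
        return (a * b);
    }
}
```

The argument checking comes along for free, so new Multiply().perform(3, 4) gives 12 and new Multiply().perform(-1, 4) throws the same IllegalArgumentException that add and subtract do.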

In my case I want to add a sixth call to a website and I'd rather do that in about 10 lines rather than 100. Is it slower to change the code than just add the new one? In the short term yes it is slower, but in the long term you get a huge pay off in terms of time when coding something new or when fixing a bug (you can fix the bug in one place not many).

While I swapped between this blog and the code I wound up doing the following:

1) getting it down to one single way of parsing instead of 4
2) normalizing the JSON I am parsing (I used to jump through some hoops to only parse certain bits, now all of the returned JSON is parsed in all cases)
3) fixed a semi-related bit of ugliness in the code - sometimes I used the raw JSON results, other times I used extra classes that augmented the results.  Now I always use augmented results.

The last two are important because they make the code consistent (consistent code is probably the most important thing) and may lead me to more blurry code later.

Thursday, February 11, 2010

Thinking is bad / It is hard work being lazy.

 It has been a while since I wrote anything on here... been too busy.

I just checked and it has been exactly one month since the first check-in of the code to subversion.  Given that I was hoping for this to take two weeks you might think I'd be disappointed, but we changed the focus a bit, I had a few things to learn, and there was some refactoring of code to make it better.  The refactoring is one of those things where it sucks up time now, but in the end it saves so much time.  As I used to tell my students "Thinking is bad" - and "It is hard work being lazy".  The refactoring made the code much simpler to understand (less thinking) and it took a chunk of time to do the refactoring but I have an easier time adding new features (I get to be lazy).

Speaking of thinking and lazy - I HATE programming books.  For a beginner they show you all sorts of bad habits, for someone who is more advanced they make you figure out what they are actually trying to do because you have to fight your way through all the bad habits.  A while back Judy asked me for a book list... I think I am going to take some time in the next while to do a blog on each book and why I like it.  If you ever write a book or sample code do the world a favour and make sure that you write nice, clean code, instead of something that tries to hide real world complexity.

The server is very stable now - no memory leaks, runs for days without any issues (before I stop it to put in some changes), and it is very easy to add in new things.  The client is underway on the iPhone. 

I have been doing my best to keep the number of messages between the client and the server low.  I thought I could get by with one type of message, but in order to keep the overall data transfer small I wound up with two required messages and an optional one.  Out of the two required messages one is, strictly speaking, optional but in reality I am sure it will be sent as often as the required one.  What this does is make the client a little slow on startup (but not too slow compared to the competition), but once it is up and running it is fast since there are no more data transfers required for the most common things.  I'll have to implement one more message (another optional one).  I also managed to get the size of the messages down very small.  Best case it is 10x smaller than the first way I did it, and on average it should be at least 5 times smaller I think.

Now that the client is underway I am in the position of doing some work on the client, then hopping back over and making changes to the server (like adding new messages as needed).  I got part way setting up the database on OS X, I should go back to that so I can do the server coding under OS X instead of Windoze.  It is annoying to reboot, make the server changes, then reboot again to test the changes.  I could use two machines I suppose, but that smacks of effort :-)

I found a handful of beta testers to try out the app once I get it a bit more functional.  The only comment so far was that the colour would have to change for one of them to test it (I liked that colour!).  If that is someone's biggest complaint I'll be happy!

Tuesday, February 9, 2010

Olympic Torch Run

Took a 5 minute break to go watch the torch pass by the local pub... you will notice that here in New Westminster it is more of a walk than a run... :-)

Friday, January 29, 2010

It Would Be So Nice

After about 18 hours of coding yesterday I feel like this... (http://xkcd.com/695)

Tuesday, January 19, 2010

Signs of Life (Servers up, client is underway... oh and network bandwidth)

Ah the life of a programmer.  Nothing is ever "done".  I have a few "TODO" comments in the server code, some are simple (like this variable has no state so should not be a local - trying to keep things fast) others are more complex (this needs to be refactored out into a few other classes).  None of them are too difficult, but need to prioritize :-)

Recently I learned that EclipseLink (the JPA provider that I am using for the database access) has some not-fun behaviours when you deal with two or more EntityManagers - there is a cache for each EntityManager, which means that changes made to the database through one are not visible in the other... sigh.  After some digging around I found I had to add a line to my persistence.xml: "" which fixes that issue.

I took the time to make a number of changes to the database and the architecture... now things move much faster (coding wise).  The code is safer, easier to understand, and not slow.  I used JMeter to do a load test on the server... the good news is that it is network bound (meaning I am limited by how much data I can push out the Ethernet card) instead of CPU or disk bound.  Basically my code is faster than the network.  That puts me into the next part of optimization - reducing the network traffic.  I am now figuring out ways that I can send less data per request.  If the way I am thinking of doing it works then I should be able to at least double, if not triple or quadruple the number of concurrent connections.

When you (or your company) are paying for the bandwidth you need to be extra careful to keep the data you are sending out small!  I cannot imagine the bill YouTube has for network bandwidth every day let alone every month  - just did a search: "To serve up all these streams, the company has to pay for a broadband connection capable of hurtling data at the equivalent of 30 million megabits-per-second—about 6 million times as fast as your home Internet connection. All this bandwidth costs Google $360 million a year, the analysts estimate."

Wow... ok now my ~2500 bytes per request doesn't seem so bad :-) (but I still want to get it WAY smaller).
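The reason smaller responses mean more connections when you are network bound is just division.  Here is a back-of-the-envelope sketch; the 100 Mbit/s link is an assumption for illustration (not a measurement of my server), BandwidthEstimate is a made-up name, and the sum ignores HTTP headers and TCP overhead.

```java
// Hypothetical helper: rough ceiling on throughput when the network is the bottleneck.
final class BandwidthEstimate
{
    private BandwidthEstimate()
    {
    }

    // Whole responses per second that fit down the link, ignoring protocol overhead.
    static long maxResponsesPerSecond(final long linkBitsPerSecond,
                                      final long bytesPerResponse)
    {
        if(linkBitsPerSecond <= 0)
        {
            throw new IllegalArgumentException("linkBitsPerSecond must be > 0, was: " + linkBitsPerSecond);
        }

        if(bytesPerResponse <= 0)
        {
            throw new IllegalArgumentException("bytesPerResponse must be > 0, was: " + bytesPerResponse);
        }

        return ((linkBitsPerSecond / 8) / bytesPerResponse);
    }
}

final class BandwidthDemo
{
    public static void main(final String[] argv)
    {
        final long link = 100_000_000L; // assumed 100 Mbit/s link

        // 2500 bytes per response, then half that, then a quarter
        System.out.println(BandwidthEstimate.maxResponsesPerSecond(link, 2500));
        System.out.println(BandwidthEstimate.maxResponsesPerSecond(link, 1250));
        System.out.println(BandwidthEstimate.maxResponsesPerSecond(link, 625));
    }
}
```

On that assumed link 2500 bytes works out to 5000 responses a second, 1250 bytes to 10000, and 625 bytes to 20000, so halving the data doubles the ceiling.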

The bandwidth issue isn't just on the company side either, clients have to pay for their Internet access too.  When you are talking about a mobile device, such as the iPhone, you really want to reduce the number of times the app has to hit the network and the amount of data it has to send and request.  I'd say it should take more effort to design the data to be small than it does to ensure that the web service is fast enough to push the data out.  A client won't notice an extra 10ms spent on the server but they will notice an extra 2 seconds with the app fiddling around on the network.  5 seconds and they will probably never start the app again.

The server ran for a number of days without a hiccup (after I fixed a couple of minor issues).  I just added a new source of data which needed a database change and a service restart - so off it'll go for another few days.  While that happens I'll be working on the iPhone client.  I think I'll save working on the data reduction until Judy is fine with the functionality and the UI on the client.