Caching, virtual bits don’t rot

October 28th, 2014 No comments

This is very common so I just wanted to address it because it annoys me.

Time based caching is a last resort, not a default go-to.  Virtual pages don’t turn yellow over time, data in a cache doesn’t start to slowly rot away. Caching isn’t a stop-gap solution against bad performance, it’s a layer or multiple layers in your application that you have to think about.

Proper caching strategies can improve the performance of your applications by a metric ton. The reason is obvious: instead of doing something every single time, you only do it when it’s needed. No more, no less.

But more often than I’d like I see caching thrown in as a stop-gap solution: some part of the application couldn’t scale well enough, so some caching is thrown in around it and set to refresh every 5 minutes or every 10 hours or every 24 hours or what have you. It’s ugly and it’s setting you up for technical debt.

Caching should be a holistic solution. Applications have (spaghetti legacy code notwithstanding) natural separators between certain parts: the database model, some remote API, your controllers, etc. These are natural places to add a caching layer. More importantly, by adding caching in these places you can ensure that neither side of the code overly depends on the caching. Compare that to slapping a few lines of caching code around some bits of code but not others. That just adds complexity, potentially creates unexpected behaviour, and probably makes proper cache warming impossible.

Now that we have this thin caching layer, instead of setting a time to live and calling it a day, actually take a step back and try to get it to cache for as long as possible. Data doesn’t rot, and cached HTML output doesn’t turn yellow. What you want is independent invalidation.

For the sake of having an example let’s say we have added some caching to our database model and when getting a User object from our repository we actually return a cached version instead of doing a database query. And we won’t invalidate that cache until the User object actually changes. We can detect when it changes by simply triggering the cache invalidation when the User object gets saved with changes.

You want to have the caching as a separate service though, not tightly integrated with your object model. If at some point you want to do a bulk change on the Users in your database, you want to be able to invalidate them all again and, perhaps more importantly, apply cache warming so that the new users get put back into the cache even before the application actually needs them. Because nothing is worse than taking the “the first user to visit the page will trigger it” approach to things.
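A minimal sketch of that idea, assuming an in-memory cache and a stand-in store instead of a real database. All class names here (ArrayCache, UserStore, CachedUserRepository) are hypothetical and only illustrate the shape: no time to live, invalidation on save, and invalidation plus warming that can be triggered from outside the model.

```php
<?php

// Hypothetical names throughout; none of these classes come from a framework.

/** Minimal in-memory cache without a time to live; entries stay until invalidated. */
class ArrayCache
{
    private $items = array();

    public function has($key) { return array_key_exists($key, $this->items); }
    public function get($key) { return $this->items[$key]; }
    public function set($key, $value) { $this->items[$key] = $value; }
    public function delete($key) { unset($this->items[$key]); }
}

/** Stand-in for the database-backed repository; counts queries for the demo. */
class UserStore
{
    public $queries = 0;
    private $rows;

    public function __construct(array $rows) { $this->rows = $rows; }

    public function find($id)
    {
        $this->queries++;
        return $this->rows[$id];
    }

    public function save($id, array $user) { $this->rows[$id] = $user; }
}

/** Thin caching layer that sits between the application and the store. */
class CachedUserRepository
{
    private $store;
    private $cache;

    public function __construct(UserStore $store, ArrayCache $cache)
    {
        $this->store = $store;
        $this->cache = $cache;
    }

    public function find($id)
    {
        $key = 'user.' . $id;
        if (!$this->cache->has($key)) {
            $this->cache->set($key, $this->store->find($id));
        }
        return $this->cache->get($key);
    }

    public function save($id, array $user)
    {
        $this->store->save($id, $user);
        $this->invalidate(array($id)); // invalidate only because the data changed
    }

    /** Callable from outside the model, for example after a bulk update. */
    public function invalidate(array $ids)
    {
        foreach ($ids as $id) {
            $this->cache->delete('user.' . $id);
        }
    }

    /** Cache warming: repopulate before any visitor asks for the data. */
    public function warm(array $ids)
    {
        foreach ($ids as $id) {
            $this->find($id);
        }
    }
}

$store = new UserStore(array(1 => array('name' => 'Alice'), 2 => array('name' => 'Bob')));
$users = new CachedUserRepository($store, new ArrayCache());

$users->find(1);
$users->find(1);                 // served from the cache, no second query
$queriesAfterReads = $store->queries;

$users->save(1, array('name' => 'Alice B.'));
$users->invalidate(array(1, 2)); // e.g. after a bulk change
$users->warm(array(1, 2));       // refill before the application needs it
$nameAfterSave = $users->find(1)['name'];
```

The cached entry lives until something actually changes, and the warm-up runs as part of the change instead of on the first unlucky page view.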

Another caching optimization step you can take is looking at the data and extracting data that isn’t dependent on each other into separate entities. The point here isn’t normalization, or necessarily looking at cohesion. It’s about cache strategy. Say a User entity has a counter that keeps track of how often the user has logged in. That means you’d have to invalidate the cache each time the user logs in, not exactly a perfect world.

So what you can do is extract that counter into its own entity, link it to the owning user and make it a property. Now don’t get me wrong, I’m not necessarily talking about moving tables about in your database, just the internal object representation of the data. So before, the User model had perhaps an integer loginCounter property, and now it has a LoginCounter loginCounter property. The LoginCounter can be retrieved and saved by itself without disturbing the User entity, even though they might live in the same table in the database. Objects aren’t tables and all that jazz.
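As a rough sketch, assuming the cache is a plain array and the cache keys follow a made-up `user.<id>` scheme: the classes, the recordLogin() helper and the key names are all hypothetical. The point is only that a login rewrites the counter’s entry and leaves the User’s entry alone.

```php
<?php

// Hypothetical classes and key names, for illustration only.

/** Changes on every login; cached under its own key. */
class LoginCounter
{
    private $userId;
    private $count;

    public function __construct($userId, $count = 0)
    {
        $this->userId = $userId;
        $this->count = $count;
    }

    public function increment() { $this->count++; }
    public function getCount() { return $this->count; }
    public function getCacheKey() { return 'user.' . $this->userId . '.logins'; }
}

/** Changes rarely; its cache entry survives every login. */
class User
{
    private $id;
    private $name;
    private $loginCounter;

    public function __construct($id, $name, LoginCounter $loginCounter)
    {
        $this->id = $id;
        $this->name = $name;
        $this->loginCounter = $loginCounter;
    }

    public function getName() { return $this->name; }
    public function getLoginCounter() { return $this->loginCounter; }
    public function getCacheKey() { return 'user.' . $this->id; }
}

/** Record a login and return the cache keys that had to be rewritten. */
function recordLogin(User $user, array &$cache)
{
    $counter = $user->getLoginCounter();
    $counter->increment();
    $cache[$counter->getCacheKey()] = $counter->getCount();

    return array($counter->getCacheKey());
}

$user = new User(7, 'Alice', new LoginCounter(7, 41));
$cache = array(
    'user.7'        => array('id' => 7, 'name' => 'Alice'), // cached User data, without the counter
    'user.7.logins' => 41,
);

$touched = recordLogin($user, $cache);
```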

Now there are unfortunately valid places where you might want time based caching: situations where no matter how you slice it, it’s just a very expensive operation. In those situations it’s perfectly valid to just have a cronjob or job queue or whatever solution and defer the entire thing to manage performance.

Anyway, if some part of your application is underperforming, take a step back instead of slapping some caching around it and calling it a day.

 

 

Categories: software development

some thoughts on proof of concepts

May 28th, 2014 No comments

This post is way too long, so here’s the TL;DR
“Don’t be afraid to write concept code while designing your project. Make sure the overall architecture is sound when you do. When starting to implement, revisit the concept code, refactor the shit out of it. Don’t be afraid to throw away large chunks of it, code is cheap, it’s the underlying ideas that you want.”

If you prefer rambling, then by all means read on.

For the last couple of weeks I’ve been working on a rather large new project with a bunch of specific non-standard needs. The part I’ve mostly been working on is only a small part of a much larger whole that my team is working on.

For the most part I’ve been writing documentation, defining how things should work, and collaborating with my team to make sure all the pieces still work together. This also entailed a LOT of R&D. Simply from experience I generally have a good idea how to solve a given problem, but I feel it’s often worthwhile just to quickly implement it to make sure it actually works.

This in turn means identifying both critical functionality (functionality without which an entire facet of the project wouldn’t work anymore) and high-risk functionality: solutions I thought up for problems where I’m not sure they would actually work in practice.

Creating proof of concept code is invaluable not only for testing theories and assumptions, but also for recognizing problems and getting better insight into how to solve them. Simply code fast and dirty if you have to; the exercise is to get a feel for the solution, not to win an award for the most elegant code ever written. Do take a moment to make sure your inputs and outputs are well done though. If it for instance needs dependency injection, add dependency injection, or at least make sure it *could* work with DI. This will save work later, and makes you consider the overall architecture of your project. For instance, if some functionality needs a session, but in your architecture it was supposed to be stateless, solve that. If you can’t make it work, then the solution doesn’t work, even if it would work if you simply hardcoded a few bits now in a quick & dirty way. The code can be dirty, but the architecture should be sound.

As a concrete example of discovering hidden problems, one of the wishes of the project was to implement HTML5 pushState technology in combination with client side template rendering. The benefits are obvious: a more responsive experience for the user and less data transferred by the server. Win/win.

I had a little proof of concept working in my sandbox branch, and a few days later, while tackling one of the other features, ESI (Edge Side Includes) support, things broke. As a requirement of the pushState stuff we wanted to maintain only one set of templates for both the back-end and the client side. Not a big problem. But when you introduce ESI to cut out parts of your template to become essentially their own actions, you inherently break client side rendering of templates.

Of course there are various solutions to this problem. But I dare say I wouldn’t have discovered the problem had I not spent some time making quick & dirty implementations.

Now, after about 2 months of pouring out design documents, diagrams, and a fairly complete technical design, we come to the part where we actually have to start building the damn thing.

A key rule that I’m sure everyone will know is to throw away your proof of concept code. And I fully agree with that. But with an asterisk attached to it. I think it’s sort of generally understood but perhaps interesting to point out, that you shouldn’t actually throw away your concept code. You simply shouldn’t USE it. Don’t copy paste, hit F5 and if it doesn’t segfault call it a day.

What you should do is revisit it. The code served a purpose, it solved a problem, and the ideas it represents are probably still correct, especially if you took the time to make sure it made sense within the larger architecture. Write unit tests to test the functionality it adds, and define all the edge cases you can think of. It might be that everything is green across the board when you are done, but more likely than not some corner cases, or functionality that you didn’t end up adding to the concept code, will fail.

Now simply start fixing the code, and be as destructive as you feel you need to be. Perhaps there’s some fancy design pattern in there that looked brilliant at the time and looks like the worst thing ever now; just yank it out and give it a think to implement things better. Add all those input validations and missing functionality, clean up the code, rethink method names and variable names, remove code smells, take out hard coded things, etc.

You have your unit tests to tell you everything is still working as it should, and if you feel you refactored yourself into a big scary pit, a simple revert will give you another shot.

Chances are that at the end a fair portion of your initial code got changed, maybe even everything. Or maybe you had some pretty good ideas first time around and you only needed some tweaks here and there.

But the important thing is that you didn’t start from scratch. You didn’t need to spend time thinking about how to solve the problem. You could immediately spend time considering whether your solution was correct, without necessarily still being in love with it (a dangerous thing), and spend time polishing and making the code better. This works especially well when there is some sizable chunk of time between when you wrote the concept code and when you revisit it: you can immediately identify those “WTF” parts of your code.

Currently I’m doing the exciting job of writing task/feature tickets, and from the half a dozen concepts I’ve made I’ve already identified 2 that will more than likely end up in the project with only some light refactoring. Then there are another 2 concepts with which I’m just really not happy; in the back of my mind I’m already thinking of how to re-implement them, and I wouldn’t be surprised if I end up rewriting most of them.

And that’s Ok too. Proof of concepts allow you to make mistakes and learn from them. You’ve already tackled a problem once, and now you are allowed to do it again. Meanwhile if the back of your brain is anything like mine you’ve already been thinking about the not-quite-elegant solutions you’ve made and have been thinking of better ways to solve those problems.

Also don’t be afraid to revisit a concept again during documentation; sometimes inspiration just strikes. I had a bunch of code that added functionality to Twig, and it was bugging me to no end. It wasn’t nice, it wasn’t elegant, it wasn’t correct. Then one day, while writing about something else entirely, the back of my brain dumped the solution for my problem, and I was able to throw away the entire mess and quite literally replace it with 15 lines of code, of which only 3 actually interacted with Twig.

So to end this rant, don’t be afraid to write concept code while designing your project or when adding an extensive feature. Just make sure the overall architecture is sound when you are done. Then don’t be afraid to revisit that concept code, and make use of the lessons and ideas it represents. Also don’t be afraid to throw away large chunks of it, code is cheap, it’s the underlying ideas that take time to build.

Categories: PHP, software development

Custom Symfony2 CLI output

October 13th, 2013 No comments

Just a quick little post for my future self.

I’ve recently been working on a little hobby project involving some intensive CLI stuff with Symfony2. I felt the output handler was lacking though. It was a small thing, but I really wanted a prefix for each output line with the time and the time difference since the last message. Simple stuff to see how long certain steps took in the process I was working with.

I did some googling and it was actually rather easy to add with symfony2. You simply have to add a custom ConsoleOutput which extends the normal one. This is what the one I made looks like.

<?php

namespace testPrj\ProcessingBundle\Component;
use Symfony\Component\Console\Output\ConsoleOutput;

use Symfony\Component\Console\Formatter\OutputFormatterInterface;
use Symfony\Component\Console\Output\ConsoleOutputInterface;

class ConsoleTimeStampOutput extends ConsoleOutput implements ConsoleOutputInterface
{
    // Time of the previous message, used to compute the difference.
    protected $lastTime = 0;

    public function __construct($verbosity = self::VERBOSITY_NORMAL, $decorated = null, OutputFormatterInterface $formatter = null)
    {
        $this->lastTime = microtime(true);
        // Pass the received arguments on instead of resetting them to the defaults.
        parent::__construct($verbosity, $decorated, $formatter);
    }

    protected function doWrite($message, $newline)
    {
        $message = $this->addTimeStamp($message);
        parent::doWrite($message, $newline);
    }

    // Prefix the message with the current time and the seconds since the last message.
    protected function addTimeStamp($message)
    {
        $now = microtime(true);
        $diff = number_format($now - $this->lastTime, 5);
        $message = "[".date('H:i:s')."][{$diff}] " . $message;

        $this->lastTime = $now;
        return $message;
    }
}

Then in the CLI file app/console I added.

// Custom Output handler
use testPrj\ProcessingBundle\Component\ConsoleTimeStampOutput;

$output = new ConsoleTimeStampOutput();
...
$application->run($input, $output);

And that was that.

Now when I print output via $output->writeln() it comes out as

[20:09:30][0.00299] Pulling till 33778503  - 2013-10-12
[20:13:34][244.00970] [20:13:34] http://example.org :: Cache:False
[20:13:34][0.38880] 33649011 - 2013-10-05
[20:13:34][0.00005] Memory: 19.962341308594MB
[20:13:34][0.00003] fnd records:158
[20:13:34][0.00003] new records:42
[20:13:34][0.00003] sent records: 200
[20:13:35][0.28439] Done

Categories: PHP, software development

the incredibly lazy guide to installing mod_pagespeed

November 12th, 2010 No comments

You hate reading? You want to try out mod_pagespeed? You run an Ubuntu or other Debian based server? Well then, just follow these steps.

  1. get the binary package based on your architecture. (to check which one, run “uname -m”. If it says x86_64, then you have a 64bit server)
    • 64 bit.
      wget https://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_amd64.deb
    • 32 bit.
      wget https://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb
  2. install the package (substitute amd64.deb with i386.deb if you don’t have a 64bit version)
    sudo dpkg -i mod-pagespeed-beta_current_amd64.deb
  3. open up the following file with your favorite editor
    /etc/apache2/mods-available/pagespeed.conf
  4. add all the cool features you want, I currently run this. (line 47 in the file, but it doesn’t really matter)
    ModPagespeedEnableFilters collapse_whitespace,elide_attributes
    ModPagespeedEnableFilters combine_css,rewrite_css,move_css_to_head,inline_css
    ModPagespeedEnableFilters rewrite_javascript,inline_javascript
    ModPagespeedEnableFilters rewrite_images,insert_img_dimensions
    ModPagespeedEnableFilters extend_cache
    ModPagespeedEnableFilters remove_quotes,remove_comments
  5. restart apache
    sudo service apache2 restart
  6. done.

I haven’t fully looked into mod_pagespeed and all its filters and the implications thereof myself, but I always like following these kinds of lazy quick guides to start poking around instead of actually reading something for a change. So I figured I should just make one as well.

phing + dbdeploy website deployment

November 8th, 2010 1 comment

I recently had a project with xs2theworld to help create the mobile websites for intel asia. Because this project was quite important and I wanted to step up my game I created a proper deployment strategy. No more sweaty palms while running custom scripts, pressing svn up or switching symlinks. I wanted a fully automated deployment. A deployment I could test, run and always get the same result.

Because I’ve been hearing about phing and dbdeploy from dragonbe and harrieverveer I looked into them. They ended up being excellent tools to reach my goal.

Phing is a deployment tool in which you can create a deployment “script” written in an Ant-like XML syntax.

<copy todir="${buildDir}" >
  <fileset dir="${projRoot}">
    <include name="**" />
  </fileset>
</copy>

DBdeploy is a tool that compares your patches to your database and creates a forward SQL patch and a backwards SQL patch, aggregating your SQL patches in the forward file and the undo statements in the backwards file.

In this blog post I will highlight some of the things I did.

I created an actual ‘build’ stage, where all the website elements were processed and copied into a separate build directory. The reason for this was two-fold. Firstly I wanted to be able to check the result of a ‘build’ without it being deployed, especially the SQL patches that dbdeploy generated. Secondly, I wanted to only copy those files that were needed for the site to run. So no .git directory, no SQL patches directory, etc.
This has really been a great choice. Because of the separate build stage I’ve had at least two instances in which I caught a problem before deployment, saving me from the embarrassment of having to make quick fixes while the site was in offline mode.

I created separate phing property files for different environments (production, staging, development). This, combined with a simple wrapper script that called phing, resulted in a very pleasant way of deploying by just issuing a command like “./deploy build development” or “./deploy rollout production” and the inevitable “./deploy rollback production”. Much better than “phing -Denvironment=staging build”. Property files are basically ini files that contain key/value pairs that can be referenced from within your phing build file.
Then in phing you can say “<property file="deploy/${environment}.properties" />” and it will read the property file. Please note that “${environment}” refers to a variable, which in my case was set when calling phing (the -Denvironment= part).
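Such a wrapper could look roughly like the sketch below. This is not the original script: the function name buildPhingCommand() and the whitelists are assumptions, and only the action names, environment names and the resulting phing command line are taken from the text above.

```php
<?php

// Hypothetical wrapper; only the action and environment names come from the post.

/** Translate "./deploy <action> <environment>" into the matching phing call. */
function buildPhingCommand($action, $environment)
{
    $actions = array('build', 'rollout', 'rollback');
    $environments = array('production', 'staging', 'development');

    // Whitelisting keeps typos (and anything nasty) away from the shell.
    if (!in_array($action, $actions, true)) {
        throw new InvalidArgumentException("Unknown action: $action");
    }
    if (!in_array($environment, $environments, true)) {
        throw new InvalidArgumentException("Unknown environment: $environment");
    }

    return "phing -Denvironment=$environment $action";
}

// As a CLI script this would end with something like:
//   passthru(buildPhingCommand($argv[1], $argv[2]), $exitCode);
//   exit($exitCode);
$command = buildPhingCommand('build', 'development');

// An unknown environment is rejected before phing is ever called.
try {
    buildPhingCommand('rollout', 'prod');
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```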

Dbdeploy is a piece of software that can read your SQL patch files, compare them to the database and create a single SQL file you can run to update your database. Unfortunately dbdeploy is very fussy about the separator you use between your patch and your undo patch. Yes, undo patch. At some point you want to rollback a deployment and at that time you really do not want to find out that you can’t because the new table structure breaks the old code.
It only takes very little time to create undo statements and you will avoid excruciating minutes of frantically applying changes manually when things break.
Also when creating undo statements be sure to set them in the reverse sequence of your normal sql patch statements.
e.g.

ALTER TABLE `myrecords`  ADD `rank` int NOT NULL;
RENAME TABLE `myrecords`  TO `myrecord`;

-- //@UNDO

RENAME TABLE `myrecord` TO `myrecords`;
ALTER TABLE `myrecords` DROP `rank`;

Also, that is how you should write the undo separator. Exactly like that. If you don’t, dbdeploy will simply add the undo section to your deployment SQL file as well, which is very much unwanted.
Also be sure to always, ALWAYS, ALWAYS! run both your forward SQL patch as well as your backwards SQL patch to make sure it works. Preferably not on production.
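To see why the exact separator matters, here is a small illustration. It is not dbdeploy’s actual parser, just a hypothetical splitPatch() function that looks for the literal separator string the same way a simple implementation might. Without an exact match, the undo statements end up in the forward SQL.

```php
<?php

// Illustration only: this is NOT dbdeploy's real code, just a sketch of the idea.

/** Split a patch into its forward ("do") and backward ("undo") parts. */
function splitPatch($sql, $separator = '-- //@UNDO')
{
    $position = strpos($sql, $separator);
    if ($position === false) {
        // No exact match: everything, undo statements included, counts as forward SQL.
        return array('do' => trim($sql), 'undo' => '');
    }

    return array(
        'do'   => trim(substr($sql, 0, $position)),
        'undo' => trim(substr($sql, $position + strlen($separator))),
    );
}

$patch = "ALTER TABLE `myrecords` ADD `rank` int NOT NULL;\n"
       . "-- //@UNDO\n"
       . "ALTER TABLE `myrecords` DROP `rank`;\n";

$good = splitPatch($patch);

// A slightly different separator is just an ordinary SQL comment.
$bad = splitPatch(str_replace('-- //@UNDO', '-- @undo', $patch));
```

With the sloppy separator the DROP statement lands in the forward patch, so the column would be added and immediately dropped again.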

That’s about it I guess. There are many wonderful guides that will explain how to use phing and dbdeploy in detail, which is the reason I didn’t. I just wanted to pass along some things I used and thought worked nicely.
I would like to point people who want to read more about phing to the following blog posts that helped me heaps:
– Diving into Phing I http://groups.drupal.org/node/4363
– Diving into Phing II http://groups.drupal.org/node/5400
– Phing Build File http://sean.gravener.net/blog/web-design/phing-build-file/134/
– How To: Simple database migrations with Phing and DbDeploy (*this was a bit outdated*) http://www.davedevelopment.co.uk/2008/04/14/how-to-simple-database-migrations-with-phing-and-dbdeploy/

most important though.
http://phing.info
http://dbdeploy.com/

traveling elephpant

June 30th, 2010 5 comments

So a little while back ibuildings had this fun contest to build a path finding program that would solve a traveling salesman like problem. The constraints were pretty simple: you got a CSV file with latitude/longitude locations, you started at a certain location and you had to end at a certain location. The application should then find the shortest route that touched all locations.
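To give a feel for the problem, here is a small sketch. It is not the contest entry: it uses a plain nearest-neighbour heuristic, which is quick but does not guarantee the shortest route, and it assumes the first point is the start and the last point is the end. The function names and the coordinates are made up.

```php
<?php

// Not the submitted solution: a nearest-neighbour heuristic on made-up data.

/** Great-circle distance in kilometres between two [lat, lon] points (haversine). */
function distanceKm(array $a, array $b)
{
    $earthRadiusKm = 6371.0;
    $dLat = deg2rad($b[0] - $a[0]);
    $dLon = deg2rad($b[1] - $a[1]);
    $h = pow(sin($dLat / 2), 2)
       + cos(deg2rad($a[0])) * cos(deg2rad($b[0])) * pow(sin($dLon / 2), 2);

    return 2 * $earthRadiusKm * asin(sqrt($h));
}

/** Visit every point, starting at the first one and ending at the last one. */
function nearestNeighbourRoute(array $points)
{
    if (count($points) < 2) {
        return $points;
    }

    $end = array_pop($points);
    $route = array(array_shift($points));

    // Repeatedly hop to the closest location that has not been visited yet.
    while ($points) {
        $current = end($route);
        $bestKey = null;
        $bestDistance = INF;
        foreach ($points as $key => $candidate) {
            $d = distanceKm($current, $candidate);
            if ($d < $bestDistance) {
                $bestDistance = $d;
                $bestKey = $key;
            }
        }
        $route[] = $points[$bestKey];
        unset($points[$bestKey]);
    }

    $route[] = $end;
    return $route;
}

// Five made-up points on the equator; the first is the start, the last is the end.
$route = nearestNeighbourRoute(array(
    array(0, 0), array(0, 3), array(0, 1), array(0, 2), array(0, 4),
));
```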

Now I’ll be the first to say that PHP is really not the language for that. A few years back I wrote a pathfinding tool for use with a game called EVE Online in which I calculated certain trade routes based on data you could export from the game. After seeing PHP’s performance I switched to Python and more or less cut the processing time in half, if not more. Mainly because Python has specific array-like types and PHP just has generic arrays, which matters a lot with large data sets. Perhaps also because my pathfinding-fu was still rather weak :)

However, back to now and the ibuildings challenge. The challenge would be rated on 3 criteria: speed, lines of code and code complexity. Personally I couldn’t care less about the latter two and focused purely on speed. In the weeks that followed I had a great time comparing execution times with Remi and Andries. I think this was also key to keep diving into it and tweaking it until it was as fast as I could get it.

Sadly, my submission actually had an off-by-one bug in it which more or less instantly disqualified my entry. Yes, bit of a bummer, but such is life.

Now I had already decided to publish my code after the contest, however sadly I never really found the time to type this up. So a bit late but here is the code for my solution for the ibuildings elephpant challenge.

Below is the submitted code, and a link to download the code and original test data file so you can try it out for yourself.

download the code – Just unpack and run via php contest.php elephpant_landmarks.csv

-edit-

Ok scratch the code, I’m having some trouble with getting it to play nice. Just download the tar.gz and view the code in your favorite editor.


running ubuntu on a vaio BZ series laptop

January 18th, 2010 4 comments

I recently purchased a Sony Vaio VGN-BZ31VT. To be short, everything works as far as I know and care.

Specs

CPU: Intel® Core™2 Duo processor P8700 @ 2.53 GHz
mem: DDR2 SDRAM (2 x 2 GB)
graphics: Mobile Intel® Graphics Media Accelerator 4500MHD

Wifi: intel wifi link 5100
audio: intel HD audio
ethernet: Intel 82567 Gigabit
bluetooth: ?  2.0 + EDR

what works

Well all the basics seem to work, the special function keys on the keyboard, the mouse pad, the screen, wifi and ethernet port.
But most importantly, suspend and hibernate also work. All of this out of the box, just install and go.

Incidentally, this CPU also supports Intel VT. It is off by default, but you can easily enable it in the BIOS. If like me you use virtual machines a lot, it is rather nice to have. I haven’t done any real tests to see if it is faster, but at least it’s there.

Not tested

I haven’t tested bluetooth, don’t need it.

What doesn’t work

The laptop also has a fingerprint scanner, which with some tinkering can be used. It’s not so much a problem of hardware support it seems, but more that there isn’t a mainstream way of integrating fingerprint scanners with the security system in Linux. The solution I read about needs you to install some fingerprint scanning software and load that as a module in PAM. Too much work for too little gain for my taste, but if you really want it, then you can get it to work (probably).

conclusion

I wanted a no-nonsense development laptop with lots of memory and preferably virtualization support in the CPU, it should also work under linux with minimal fuss and suspend working was a must have.
Mission successful it seems.

Seeing as there is very little recent user experience info about this laptop out there at the moment, I figured I should write this little blog post, if only to give people the peace of mind that you can safely buy this laptop for running Linux.

Categories: hardware

user settings cookie

January 17th, 2010 No comments

Sometimes in applications you will have certain user settings that you want to apply, even when the user is not logged in. Take for instance these examples:

  • “welcome back <name>” msg on return.
  • You have a portal type page where the user can control what content is shown where
  • You want to track where the user was when he last visited the site, perhaps to offer him the option to return to there.

I recently needed some functionality like that, so I’ve created an object that can help me with that.

I thought about it for a moment and created a singleton settings object for me to call upon to set and retrieve certain settings. Now I have to warn you that there is a small problem with singletons, if you use unit testing it can be difficult to control the behaviour of singletons over multiple tests. So be wary of this when you are running unit tests.

I also wrap all data in a separate array. This isn’t really necessary, but it makes handling the data a lot easier. If you wanted you could also add some sort of encryption to the cookie data so that users couldn’t easily tamper with it.
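A hypothetical sketch of such an object is shown below. The class name UserSettings, the cookie name and the JSON encoding are assumptions and not the original implementation. The reset() method is one way to soften the unit testing problem mentioned above.

```php
<?php

// Hypothetical sketch; class name, cookie name and JSON encoding are assumptions.

class UserSettings
{
    const COOKIE_NAME = 'user_settings';

    private static $instance = null;
    private $data = array(); // all settings wrapped in one array

    private function __construct($rawCookie)
    {
        $decoded = json_decode((string) $rawCookie, true);
        // Cookie data is user input: fall back to an empty set if it is not valid.
        $this->data = is_array($decoded) ? $decoded : array();
    }

    public static function getInstance()
    {
        if (self::$instance === null) {
            $raw = isset($_COOKIE[self::COOKIE_NAME]) ? $_COOKIE[self::COOKIE_NAME] : '';
            self::$instance = new self($raw);
        }
        return self::$instance;
    }

    /** Lets unit tests start from a clean slate. */
    public static function reset() { self::$instance = null; }

    public function get($key, $default = null)
    {
        return array_key_exists($key, $this->data) ? $this->data[$key] : $default;
    }

    public function set($key, $value) { $this->data[$key] = $value; }

    public function encode() { return json_encode($this->data); }

    /** Must run before any output is sent, like every setcookie() call. */
    public function save()
    {
        return setcookie(self::COOKIE_NAME, $this->encode(), time() + 365 * 24 * 3600, '/');
    }
}

$_COOKIE[UserSettings::COOKIE_NAME] = '{"name":"Alice"}'; // simulate a returning visitor
$settings = UserSettings::getInstance();
$settings->set('lastPage', '/blog');
$greeting = 'welcome back ' . $settings->get('name', 'stranger');
$encoded = $settings->encode();
```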

easy and simple transparency effect using GIF

October 3rd, 2009 1 comment

Transparency in HTML/CSS is largely an already solved problem: recent browsers all seem to handle PNG transparency pretty well and there are scripts that will make sure older browsers handle it as well.

However, I wanted to make a post about a little technique I rarely see used, which I think is quite genius in its simplicity. Whenever you want to create a semi-transparent surface you create a gif file that contains a simple pattern of transparent and opaque pixels, as in the example on the right. The white you see in the chequered image is of course transparent.

So let’s demonstrate how this effect actually looks.

[image: transparency examples]

Now as you can see the effect itself is very specific, and different backgrounds have different outcomes for the effect. Which might not fit every design. Another disadvantage is that it can only be used to show a 50% transparency effect. There might be pixel patterns that will give you a different distribution but I’ve never seen them.

The biggest advantage however is that you don’t need any fancy CSS or javascript or PNG, which in certain specific cases can be a big plus. It’s more of a hack on your eyes/brain than on the browser :)

Categories: design

Simple design rules for webdevelopers

September 4th, 2009 No comments

Designers. Can’t live with them, can’t live without them. More often than not, developers will have to work together with designers to create a website. Most of the time that means the designer will create a design and some HTML & CSS, which the developer will then integrate and adapt to fit into his software to actually make the site work.

The problem however is; Developers normally didn’t go to design school and many of them have the artistic ability of your average garden rock. I’m certainly no exception to that. However, if you learn a few basic guidelines and rules, you can make the life of your designer buddy a lot happier by not screwing up his design.

So here are just a few general hints and tips to explain what is important when adding something to an existing design or when integrating it into the actual software of the site.

Aligning stuff

Stuff needs to be aligned, both horizontally and vertically. It must be aligned “visually” instead of accurately, which means that if you look at it, it should look aligned. That could mean two pixels to the right or left of the point where two blocks were actually aligned.

However, as a rule of thumb you can pretty much align them accurately. The reason for aligning stuff is that it is visually pleasing for us to look at.

[image: misaligned]

[image: aligned]

In a few cases the designer might actually want to purposefully have a few elements be misaligned, so if you’re doubting, just look at his designs or simply ask him.

Whitespace

Where saying less is more. Whitespace, as the name suggests, is the empty space around objects. As with alignment, everything should have a little bit of white space and for the most part, the amount of whitespace should be the same.

Whitespace brings some calm into a design. A design will look far less crowded with ample use. Also, whitespace will often be used to emphasise certain elements within a design. This is where the amount of whitespace will differ. Headers for instance will often have more whitespace around them than paragraphs.

It is also always a good idea to make sure there is a lot of whitespace around important elements for your website. When there is a lot of whitespace around an element in comparison to other elements on a page, your average human will read that element first.

If you want to know a lot more about whitespace, read this article from A List Apart that deals exclusively with whitespace.

colours

Colours are not only pretty, but also important. Often companies and brands will have very specific colours associated with them. Coca Cola red, UPS gold & brown, etc. Now designers could probably bore you for hours about colour theory and all that stuff. However, what’s important for us developers to know is to never introduce new colours. A designer will have chosen a small select group of colours for use in the design.

You will have one or two base colours, and an accent colour. For instance, for the design of this site, the base colours are light grey and white, and the accent colour is blue, and perhaps black.

What this means is that if I should add some new element, it should be one of those colours. Not purple simply because I like purple. Your best bet, depending on the size of the new element, would be to use the accent colour. The use of the accent colour is to basically add some spice to a design and use it to draw the attention of viewers.

When using colours, be precise. You might think, blue is blue is blue. But if your designer used #7c95e7 in his design, then be sure to use #7c95e7 as well. You could even ask your designer to write down the used colours for you in HEX.

ask your designer

Above are just a few general tips to watch for, but the best advice I can give is to simply ask your designer when in doubt, or even just to verify. Because basically, it is as if you created the software for the site and then some amateur takes it and starts modifying it in small ways. I think most developers would at least like to steer how those modifications are made, and explain a bit of the thinking behind how it was written.

I have to say though, not all designers will like explaining it to you. But just remember, you’re not doing the above to please the designer, you’re doing the above to please the client, to make a better website. If that means having to bother your designer for 5 minutes, then so be it.