First, a short intro – I’ve been thinking about what to do with all the ideas I keep coming up with, and I’d like to try posting blog entries under a “Concepts” category. I’ll accept comments and will write additions to each concept there as well. We’ll see what it’ll be like ;)
I think all the key components to this technology already exist!
First we need to know what to pre-load, and this is where the Google Analytics API comes in with its Content / ga:nextPagePath dimension, which gives us the probabilities for the pages users visit next.
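As a sketch (not a real integration, and with made-up numbers), the rows a GA API query with dimensions=ga:nextPagePath and metrics=ga:pageviews would return for a single source page can be turned into transition probabilities like this:

```javascript
// Assumed row shape: [nextPagePath, pageviews-as-string], as GA APIs
// return them. Converts raw pageview counts into probabilities.
function nextPageProbabilities(rows) {
  var total = rows.reduce(function (sum, row) {
    return sum + parseInt(row[1], 10);
  }, 0);
  var probs = {};
  rows.forEach(function (row) {
    probs[row[0]] = parseInt(row[1], 10) / total;
  });
  return probs;
}

// Example with invented numbers:
var probs = nextPageProbabilities([
  ['/pricing', '300'],
  ['/docs', '100']
]);
// probs['/pricing'] === 0.75, probs['/docs'] === 0.25
```

The pre-loader would then only bother with the few next pages whose probability crosses some threshold.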
Now that we have the page URLs, we need to figure out which assets to load from those pages, and that can be solved by running headless Firefox with Firebug and the NetExport extension configured to auto-fire HAR packages at beacons on the server.
A HAR file lists all the assets along with all kinds of useful information about them, so the tool can make arbitrarily complex decisions about which URLs to pre-load, from a simple “download all JS and CSS files” to “only download small assets that have far-future expiration set” and so on (this can be the key to a company’s secret ingredient that is hard to replicate). This step can be done on a periodic basis, since running it in real time is just unrealistic.
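One possible picking policy, sketched against the standard HAR 1.2 structure (log.entries[], each entry carrying request.url, response.bodySize and response.headers); the thresholds are arbitrary and just for illustration:

```javascript
// Keep only small assets that are cacheable for a long time:
// bodySize under maxBytes AND Cache-Control max-age >= minMaxAgeSeconds.
function pickAssets(har, maxBytes, minMaxAgeSeconds) {
  return har.log.entries.filter(function (entry) {
    if (entry.response.bodySize > maxBytes) return false;
    var cc = entry.response.headers.filter(function (h) {
      return h.name.toLowerCase() === 'cache-control';
    })[0];
    if (!cc) return false;
    var m = /max-age=(\d+)/.exec(cc.value);
    return m !== null && parseInt(m[1], 10) >= minMaxAgeSeconds;
  }).map(function (entry) {
    return entry.request.url;
  });
}
```

Swapping this function out is exactly where the “secret ingredient” decisions would live.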
The last piece is probably the most trivial: an actual script tag that asynchronously loads code from the 3rd-party server with the current page’s URL as a parameter, which in turn post-loads all the assets into hidden image objects or something similar to prevent asset execution (for JS and CSS).
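The served code might look roughly like this (a sketch only; the host name is made up, and the image factory is injectable so the logic can be exercised outside a browser):

```javascript
// The site owner would drop in something like:
//   <script async src="//preload.example.com/p.js?url=CURRENT_PAGE_URL"></script>
// (preload.example.com is a hypothetical host.) The script served back
// fetches each predicted asset into the browser cache via an Image object,
// which downloads JS/CSS without executing or applying it.
function preloadAssets(assetUrls, createImage) {
  createImage = createImage || function () { return new Image(); };
  return assetUrls.map(function (url) {
    var img = createImage();
    img.src = url; // triggers the download, warming the HTTP cache
    return img;
  });
}
```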
So, all the user will have to provide is the site’s homepage URL and approval of the GA data import through OAuth. After that, the data will be periodically re-synced and the site re-crawled for constant improvement.
Some additional calls on the pages (e.g. one at the top of the page and one at the bottom) can measure load times to close the feedback loop for the asset-picking optimization algorithm.
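A minimal sketch of that measurement, assuming a hypothetical '/t' beacon endpoint; the clock and the beacon sender are injectable so the timing logic itself is testable:

```javascript
// Call loadTimer() at the top of the page, timer.stop() at the bottom.
// stop() computes the elapsed milliseconds and reports them to the
// collector, which feeds the asset-picking algorithm.
function loadTimer(now) {
  now = now || function () { return Date.now(); };
  var start = now();
  return {
    stop: function (sendBeacon) {
      var elapsedMs = now() - start;
      if (sendBeacon) sendBeacon('/t?load_ms=' + elapsedMs); // '/t' is made up
      return elapsedMs;
    }
  };
}
```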
It could be a great service provided by Google Analytics itself, e.g. “for $50/mo we will not only measure your site, but speed it up as well”: they already have the data and their tags on people’s pages; the only things left are crawling and data crunching, which they already do quite successfully.
Anyone want to hack it together? Please post your comments! ;)