Everyone Loves a Prequel | Data and the Web

Yesterday, Kirix Strata™ received a nice write-up from The Register. Unfortunately, we had some rough sailing in the morning with the ensuing web traffic. We apologize to anyone who had to suffer through worse-than-dial-up speeds while downloading the beta. The problem has been fixed, so hopefully it won't occur again.

The article hinted at the origins of Strata, and I thought it might be useful to fill out the story a little more. Thankfully, this prequel does not involve midichlorians.

We introduced Strata at LinuxWorld a couple years ago as a “dynamic database” — sort of a cross between a desktop database and a spreadsheet — that made it really easy to use, manipulate and analyze structured data. In addition to its ease of use, Strata also had tremendous data capacity and speed, bringing the difficult world of databases a step closer to those who would otherwise shiver at the sight of SQL. These traits actually helped it win the LinuxWorld “Best in Show” award for desktop/productivity/business applications.

Unfortunately, a couple of issues limited its mass appeal. The first was connectivity: users needed to import all their data into the project. This isn't a problem if the data is static, historical data, but it becomes a bigger problem if the data requires regular updating. The second was repeatability: users couldn't easily replicate logic without performing a set of manual steps. So, for any type of repeated analysis, daily use could become burdensome.

These two areas became the primary focus of the new and improved Strata. We wanted 1) to enable people to work with data outside of a Strata “project” and 2) to provide a way to code their logic into scripts that could be run on a regular basis. For the former, we added the ability to open up files and manipulate them directly, like a CSV file on your desktop or a MySQL table on a server. For the latter, we implemented a scripting language (ECMAScript) with both a database and interface API, enabling developers to create a repeatable process (in embedded SQL) that could easily be deployed as an extension.
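To make the "repeatable process" idea concrete, here is a minimal sketch in Python rather than in Strata's own ECMAScript API, which isn't shown in this post. It assumes a small CSV with hypothetical `region` and `amount` columns, loads it into an in-memory SQLite table, and runs one embedded SQL query. The function names `load_csv` and `daily_report` are made up for illustration.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text):
    """Load CSV text (header row first) into a fresh SQLite table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{name}"' for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)


def daily_report(csv_text):
    """Re-runnable analysis: total the (hypothetical) amount column per region."""
    conn = sqlite3.connect(":memory:")
    load_csv(conn, "sales", csv_text)
    query = (
        "SELECT region, SUM(CAST(amount AS REAL)) "
        "FROM sales GROUP BY region ORDER BY region"
    )
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Made-up sample data standing in for a CSV file on the desktop.
SAMPLE = "region,amount\nEast,10.5\nWest,4\nEast,2.5\n"

if __name__ == "__main__":
    for region, total in daily_report(SAMPLE):
        print(region, total)
```

Because the logic lives in one function, the same analysis can be rerun whenever the underlying file changes, with no manual steps to repeat.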

But then, as we looked at data accessibility and how to connect to various data sources, we started thinking about the web as a database. Although the web contains large amounts of information in HTML, quite a bit of information can be interpreted in a structured way with just a little bit of work. And with other data available as CSVs, RSS feeds, or through APIs, we thought it would also be useful to allow users to access some of these web-based data resources more directly.
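As a rough illustration of reading HTML "in a structured way," the sketch below uses Python's standard `html.parser` module to turn a simple HTML table into a list of records. This is not how Strata does it. The class name `TableRows`, the helper `html_table_to_records`, and the sample markup are all hypothetical, and the code assumes a single well-formed table whose first row holds the column names.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_records(html):
    """Treat the first row as column names and return one dict per data row."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Made-up markup standing in for a table found on a web page.
SAMPLE_HTML = """
<table>
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Widget</td><td>2.50</td></tr>
  <tr><td>Gadget</td><td>7.00</td></tr>
</table>
"""

if __name__ == "__main__":
    for record in html_table_to_records(SAMPLE_HTML):
        print(record)
```

Real pages are messier (nested tables, merged cells, missing headers), which is where the "little bit of work" comes in.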

That meant we needed to embed a browser, and after investigating several options, we settled on Mozilla's Gecko layout engine. Of course, the real trick was not just to let people browse web pages, but to let them interact with the content in a more data-oriented manner — a “trick” we're still exploring, implementing and refining. So, in many ways, Strata is a bit more of a “Data Interactor” than a “Data Browser.” But, I suppose, the former doesn't roll off the tongue as nicely…

So, in the end, this beta version of Strata really builds on a history of database power and analytics. And now we've got a chance to see what happens when we apply these things to the web too.

About

Data and the Web is a blog by Kirix about accessing and working with data, wherever it is located. We have a particular fondness for data usability, ad hoc analysis, mashups, web APIs and, of course, playing around with our data browser.