Data and the Web: The Official Kirix Weblog - Part 6


Kirix Strata Beta 2 Now Available

August 17th, 2007

Hi everyone, we've now got a new beta iteration up and available for download.

Proxy Configuration Screenshot

Strata Beta 2 includes the following new features and bug fixes:

  • Added proxy configuration support (see screenshot above) as requested here. To adjust your settings, go to Tools>Options>Internet Tab.
  • Fixed the overly-eager numeric auto-sensing for things such as IP addresses (12.233.132.33) and English Premier League scores (1-2) first reported here.
  • Fixed the web page source view issue as reported here. Note that you can now toggle between views by using the “View” menu, by clicking the “Toggle View” icon on the toolbar, or via the right-click menu on web pages.
  • Fixed the Ubuntu installation issues as reported in many places.
  • Fixed the MySQL DateTime issue reported here.
  • Also fixed the MySQL off-by-one issue as reported here.
  • Fixed the column break issue reported here.
  • Fixed a whole bunch of additional nickel and dime issues such as adding new hot keys, fixing some menu issues, and cleaning up some scripting bugs.
  • We also have updated our documentation related to Strata's scripting. Still a lot more to go, but it's something we'll be building on in the coming weeks.
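
For the curious, here is a rough sketch of the idea behind the numeric auto-sensing fix mentioned above. It is not Strata's actual code; it is a small Python illustration, and the function names `looks_numeric` and `sense_column_type` are hypothetical. The point is that a value only counts as a number if the whole string is a plain number, so an IP address or a match score stays as text.

```python
import re

# A plain number: optional sign, digits, and at most one decimal point.
# (Illustrative rule only; real type detection would handle more formats.)
_NUMERIC = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def looks_numeric(value: str) -> bool:
    """Return True only if the entire string is a plain number."""
    return _NUMERIC.fullmatch(value.strip()) is not None


def sense_column_type(values):
    """Call a column numeric only if every non-empty value is a plain number."""
    non_empty = [v for v in values if v.strip()]
    if non_empty and all(looks_numeric(v) for v in non_empty):
        return "numeric"
    return "text"
```

With this rule, a column of values like `12.233.132.33` or `1-2` is sensed as text, while a column of values like `42` and `-3.5` is sensed as numeric.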

We've also upgraded our build process, so hopefully we'll be able to turn around new beta iterations more frequently from here on out. Thanks to everyone who has contributed to this beta effort (and, as an aside, has earned a free license when Strata is released), whether via the forum, the bug report form or support emails. Please keep ‘em coming!

Horizontal Tab Groups Make Bug Entry Fun!

August 10th, 2007

Bug Entry is Fun! (screenshot)

Thanks for all the bug reports this week; we're working hard to sort them out.

We're hoping to have a new beta for everyone early next week. In addition to a lot of nickel and dime fixes, we'll definitely be adding a configuration page for proxy settings, our most requested feature.

Have a good weekend!

Everyone Loves a Prequel

August 8th, 2007

Yesterday, Kirix Strata™ received a nice write-up from The Register. Unfortunately, we had a little rough sailing in the morning with the ensuing web traffic. We apologize to anyone who had to suffer through worse-than-dial-up speeds while downloading the beta. The problem has been fixed, so hopefully it won't occur again.

The article hinted at the origins of Strata and I thought it might be useful to fill this story out a little bit more. Thankfully, this prequel does not involve midichlorians.

We introduced Strata at LinuxWorld a couple years ago as a “dynamic database” — sort of a cross between a desktop database and a spreadsheet — that made it really easy to use, manipulate and analyze structured data. In addition to its ease of use, Strata also had tremendous data capacity and speed, bringing the difficult world of databases a step closer to those who would otherwise shiver at the sight of SQL. These traits actually helped it win the LinuxWorld “Best in Show” award for desktop/productivity/business applications.

Unfortunately, there were a couple issues that limited its mass appeal. The first issue was connectivity: users needed to import all their data into the project. This isn't a problem if the data is static, historical data, but it becomes a bigger problem if the data requires regular updating. The second problem was repeatability: users couldn't easily replicate logic without performing a set of manual steps. So, unfortunately, for any type of repeated analysis, use on a daily basis could become burdensome.

These two areas became the primary focus of the new and improved Strata. We wanted 1) to enable people to work with data outside of a Strata “project” and 2) to provide a way to code their logic into scripts that could be run on a regular basis. For the former, we added the ability to open up files and manipulate them directly, like a CSV file on your desktop or a MySQL table on a server. For the latter, we implemented a scripting language (ECMAScript) with both a database and interface API, enabling developers to create a repeatable process (in embedded SQL) that could easily be deployed as an extension.
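
To make the "repeatable process" idea a bit more concrete, here is a minimal sketch of a script that loads CSV data and runs an embedded SQL query over it. This is not Strata's scripting API (which is ECMAScript-based); it is plain Python with the standard `sqlite3` and `csv` modules, and the `sales` table, its columns and the `run_report` function are made up for illustration.

```python
import csv
import io
import sqlite3


def run_report(csv_text: str):
    """Load CSV text into an in-memory table and total the amount per region."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales (region, amount) VALUES (?, ?)",
        [(r["region"], float(r["amount"])) for r in rows],
    )

    # The embedded SQL is the repeatable logic: rerun it whenever the data changes.
    result = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data standing in for a CSV file on your desktop.
SAMPLE = "region,amount\nEast,10\nWest,5\nEast,2.5\n"
```

Calling `run_report(SAMPLE)` returns one total per region. Because the logic lives in a script rather than in a series of manual steps, the same analysis can be rerun against tomorrow's file without redoing any of the work.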

But then, as we looked at data accessibility and how to connect to various data sources, we started thinking about the web as a database. Although the web contains large amounts of information in HTML, quite a bit of information can be interpreted in a structured way with just a little bit of work. And with other data available as CSVs, RSS feeds, or through APIs, we thought it would also be useful to allow users to access some of these web-based data resources more directly.
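
As a small illustration of what "a little bit of work" can look like, the sketch below pulls the rows out of an HTML table using only Python's standard `html.parser` module. It is an assumption-laden example, not how Strata does it: the `TableExtractor` class and `extract_rows` function are hypothetical, and it ignores nested tables and other real-world messiness.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # discard any cell left unclosed


def extract_rows(html: str):
    """Return the table rows found in an HTML string as lists of cell text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Once the page's table is a list of rows, it can be sorted, filtered or loaded into a database like any other structured data.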

That meant we needed to embed a browser, and after investigating several options, we settled on Mozilla's Gecko layout engine. Of course, the real trick was not just to let people browse web pages, but to let them interact with the content in a more data-oriented manner — a “trick” we're still exploring, implementing and refining. So, in many ways, Strata is a bit more of a “Data Interactor” than a “Data Browser.” But, I suppose, the former doesn't roll off the tongue as nicely…

So, in the end, this beta version of Strata really builds on a history of database power and analytics. And now we've got a chance to see what happens when we apply these things to the web too.

Situational Integration

July 31st, 2007

ProgrammableWeb has a nice write-up today about some of the challenges in the mashup tools market. It included a link to an excellent write-up of mashup platforms by Dion Hinchcliffe of ZDNet. Dion writes:

Mashups could theoretically allow business users to move — when appropriate — from their current so-called "end-user development tools" such as Microsoft Excel that are highly isolated and poorly integrated to much more deeply integrated models that are more Web-based and hence more open, collaborative, reusable, shareable, and in general make better use of existing sources of content and functionality. Remember, business workers still spend a significant amount of time manually integrating together the data in their ever increasing number of business applications. Tools that could let thousands of workers solve their situational software integration problems on the spot themselves, instead of waiting (sometimes forever) for IT to provide a solution, is indeed a potent vision.

We agree.

We've seen time and again how business users need to integrate and work with data from different sources — although usually only with data internal to the company. However, as the web provides more and more useful information, people will also want to include external data as well. And, if normal people can do this on their own without much IT support, the potential for increased productivity and efficiency, not to mention new discovery, skyrockets.

We're currently exploring some of these possibilities with our recently-released beta of Kirix Strata™. What makes Strata unique is its ability to work seamlessly with data wherever it's located — whether a back-end database like Oracle or an Excel file on your desktop or a website, with or without an API. Much of our work is still cooking in our labs, but we'll be providing some concrete examples shortly. Stay tuned!

Spreadsheets, Ltd.

July 25th, 2007

A friend of mine uses Microsoft Excel quite a bit and recently asked me what Kirix Strata™ can do that Excel can't. This is a very reasonable question to ask.

In fact, as an avid spreadsheet user myself, Excel lets me do all kinds of great things with data like creating budgets or putting together various lists. I can use formulas to create instant calculations and change data on a whim to perform what-if scenarios. Excel even gives me a few “database” tools to use, like sorting and filtering.

However, the strength of a spreadsheet lies in its ability to handle unstructured data really well. When I create a budget, I'm happy to mingle a column heading, my data points and a sum/total in the same column — and Excel is delighted to let me do it (or, at least, so suggests Clippy). It is cell-based, so you can place data wherever you'd like without any concern.

The trouble comes when you start dealing with larger amounts of structured data. We've seen this issue a lot, particularly when working with corporate clients. Excel is the most familiar tool for ad hoc calculations, but when something comes up where a user is presented with 20,000 records (or millions), it gets a little more dicey. Often the only option is to start working with a desktop database like Access. Unfortunately, a desktop database can often be a bit too complex for someone who just wants to quickly use their data like they would with a spreadsheet.

This is where Strata can really help. At its core, it was built to solve the problem of data usability. Basically, we're trying to give people the ability to handle structured data really easily, wherever they may encounter it.

Strata will happily take the tens of thousands or tens of millions of records and let you create calculations instantly across the entire column. Or, just like Excel, you can sort or filter your data, but do so across the entire data set with a single click. Of course, there are plenty of more “database” things you can do too (relationships, queries, reports, scripting, etc.), but the key is being able to quickly and easily use the data however you wish.

A pretty classic business issue came up in a forum post today. In this situation, Greg was trying to identify duplicate inventory items in a 63,000 record file. He created a calculation to remove some “noise” from the data, then he grouped it together and found out which ones were duplicated. From there, he could take the results and remove the duplicated records from the original database to prevent future processing errors.
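
The same three steps (clean up the key, group, keep the groups with more than one member) can be sketched in a few lines of Python. The forum post doesn't spell out what the "noise" was, so the cleanup rule here (drop punctuation and spaces, ignore case) is purely an assumption, as are the `normalize` and `find_duplicates` names.

```python
import re
from collections import defaultdict


def normalize(item_code: str) -> str:
    """Remove 'noise': keep only letters and digits, then uppercase.

    This rule is an assumption for illustration, not the one from the post.
    """
    return re.sub(r"[^A-Za-z0-9]", "", item_code).upper()


def find_duplicates(item_codes):
    """Group codes by their cleaned-up key and keep groups with 2+ members."""
    groups = defaultdict(list)
    for code in item_codes:
        groups[normalize(code)].append(code)
    return {key: codes for key, codes in groups.items() if len(codes) > 1}
```

For example, `find_duplicates(["ab-100", "AB 100", "cd-200"])` groups the first two codes together under the key `AB100` and leaves `cd-200` out, since it has no duplicate.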

This process would have taken all of a couple minutes to perform. With a spreadsheet, however, this would have been much more cumbersome because of the file size (it would barely fit in most versions of Excel) and the need to copy a formula over 63,000 rows. I'm actually not sure if Excel could handle the grouping function in the same way.

Excel is an excellent tool for unstructured data, but it just wasn't designed for the rigors of handling structured data. One of the many things Strata offers is an easy transition for folks needing to analyze larger amounts of structured data.

Do you have any data issues that seem to be pushing the scope of your spreadsheet? Let us know, we'd be happy to help.

Tagging the World's publicdata

July 17th, 2007

There's a surprising amount of publicly available data on the web: government statistics, economic information, sports data, etc. And lots of it is in good ol' fashioned CSV files ripe for analysis.

Jon Udell has recently begun tracking this kind of data using del.icio.us and has asked anyone who is so inclined to follow along. All you have to do to join in the fun is tag your bookmark publicdata.

With Kirix Strata™, we've been interested in identifying public data sources as well and have been jotting bookmarks down as we've come across them. We're quite pleased to finally have a useful, publicly available place to put them:

kirixstrata/publicdata

We've only added a few to start with, but you'll see more added in the coming weeks.

Got any good publicdata to share?

The Birth of a Data Browser

July 17th, 2007

Well, it took a lot more blood, sweat and tears than we expected, but we're really excited to announce our first public beta release of Kirix Strata™, the data browser.

And what, pray tell, is a “data browser”?

Well, Strata is a specialty browser that lets you access and manipulate data from pretty much anywhere on the web. For instance, Strata will let you grab HTML tables or RSS Feeds or even open up CSV files directly from a URL (wow, that's a lot of acronyms).

Then when you've got the data in a table, you can do all sorts of ad hoc analysis. You can create calculations or sort and filter or create queries and reports — similar to the kinds of things you might do with a desktop database or a spreadsheet. In addition to web data, you can still work with data from your desktop or in a database system like Oracle or MySQL Enterprise.

And for those more technically inclined, Strata also includes an implementation of ECMAScript, so anyone familiar with JavaScript should feel right at home. The nice thing about the scripting is that it also includes bindings for SQL and HTTP, which can make for a lot of fun when connecting to Web APIs, creating “desktop mashups” or building extensions. And to boot, it runs on both Windows and Linux (at this moment, only Ubuntu is supported officially).

We also just want to give a quick shout out to the excellent folks at wxWidgets (we use their GUI library) and Mozilla (Strata incorporates the Gecko engine) — without which, Strata would only be a mere twinkle in our eye.

So, without further ado, check out the Kirix Strata introduction video:

Play Video

(And here's an embeddable YouTube version…)

and then

Download and try out the data browser for yourself

We hope you enjoy it!

About

Data and the Web is a blog by Kirix about accessing and working with data, wherever it is located. We have a particular fondness for data usability, ad hoc analysis, mashups, web APIs and, of course, playing around with our data browser.