Data and the Web

Archive for July, 2009

Further Sunlight on Government Data

Monday, July 20th, 2009

In a previous post, we discussed some of the interesting things the US government is doing to make its data more widely available, culminating in the Data.gov website.  This website is now up and running and has definitely made some progress since we last discussed it.

Data.gov is broken down into three main catalogs:

  1. Raw Data Catalog (with data files available in XML, CSV, KML, etc.)
  2. Tools Catalog (list of tools built to work with various open data sets)
  3. Geodata Catalog (links to Federal geospatial data)

They’ve also tried to make it easier to search for data sets.  Like video search, data set search relies heavily on items being tagged with good, meaningful descriptions and related metadata.  It’s a hard nut to crack.  For example, government agencies tend to release data sets on an annual basis, so you’ll have, say, 5 different data sets (and counting) for the “Public Libraries Survey” from 2004 through 2008.  If your search terms aren’t specific enough, these repetitious items tend to clutter up the search results.  As Data.gov continues to add more data sets, hopefully they can refine this area further.
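To make the clutter problem a little more concrete, here is a rough sketch of one way annual releases could be collapsed into a single search result.  This is purely our own illustration and not how Data.gov's search works.  The function names (`group_by_series`, `summarize`) and the sample titles are hypothetical, and the sketch assumes the release year appears as a four-digit number in each title.

```python
import re
from collections import defaultdict

# Assumption: the release year shows up in the title as a 4-digit number.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def group_by_series(titles):
    """Group data set titles that differ only by their release year."""
    groups = defaultdict(list)
    for title in titles:
        match = YEAR.search(title)
        # Drop the year and tidy whitespace/punctuation to get a series name.
        name = " ".join(YEAR.sub("", title).split()).strip(" ,-")
        if match:
            groups[name].append(int(match.group()))
        else:
            groups[name]  # keep the title even when it has no year
    return {name: sorted(years) for name, years in groups.items()}


def summarize(groups):
    """Turn each group into one line suitable for a search result list."""
    lines = []
    for name, years in groups.items():
        if len(years) > 1:
            lines.append(f"{name} ({years[0]}-{years[-1]}, {len(years)} releases)")
        elif years:
            lines.append(f"{name} ({years[0]})")
        else:
            lines.append(name)
    return lines


# Hypothetical search results, modeled on the example in the text.
results = [f"Public Libraries Survey {year}" for year in range(2004, 2009)]
results.append("Some Other Data Set")

for line in summarize(group_by_series(results)):
    print(line)
```

With something like this, the five “Public Libraries Survey” entries would show up as one line with a year range rather than five near-identical hits.  Real catalogs would need better metadata than a year in the title, which is exactly why good descriptions matter so much.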

But, then again, maybe they won’t have to.  The folks at Sunlight Labs, whose mission is to build technology that makes government more transparent and accountable, have recently announced a project called The National Data Catalog.  It will be a tool that aims to take the Data.gov concept and improve upon it.  From the announcement:

“We think we can add value on top of things like Data.gov and the municipal data catalogs by autonomously bringing them into one system, manually curating and adding other data sources and providing features that, well, Government just can’t do. There’ll be community participation so that people can submit their own data sources, and we’ll also catalog non-commercial data that is derivative of government data like OpenSecrets. We’ll make it so that people can create their own documentation for much of the undocumented data that government puts out and link to external projects that work with the data being provided.”

This should be interesting to watch.  As the Sunlight folks say in a later post, they are not out to replicate Data.gov, but to stand on its shoulders (similar to how, say, Weather.com relies on and improves upon the National Weather Service).  Given the nature of the beast, data sets need to be described really well in order to be both searchable and useful.  Hopefully the community aspect, in particular, can help give this data more utility.  If any tech-savvy folks are interested in either following the project or contributing code, here’s the project page.

A Wee Bit of Housekeeping…

Friday, July 17th, 2009

We haven’t been doing much regular blogging lately, but we’re hoping this will change in the coming weeks.

In the meantime, we’ve recently done some housekeeping on our website, so if you haven’t visited recently we’d encourage you to do so. We’ve updated many pages with new content, but here are two sections in particular that we’d steer you toward:

  • Examples Section.  This is a long overdue section that puts together some quick examples of how Kirix Strata™ can be applied to common data problems.  The section is still a work in progress with more videos still to be produced.  However, we expect what we have now will prove useful to new and old Strata users alike.  Check it out.
  • Video Tutorials and Archive.  We’ve done a bunch of different videos and screencasts over the past year or so, but they’ve been posted all over our website.  This new section wrangles all of the videos together in one place for posterity.  The feature tutorials, in particular, are worth viewing as they help give a more comprehensive look at how to use specific features in Strata.  Take a look.

So, in a nod to The Matrix, where one cannot be told what it is, but one must see for oneself, we’ve tried to make some high-quality video documentation available.  Stay tuned for more to come.  Enjoy!

wxWebConnect: Open-source Browser Library for wxWidgets

Wednesday, July 8th, 2009

This is somewhat outside the scope of this particular blog, but I thought I’d pass along the news that we just released another open-source library for wxWidgets users. This one is called wxWebConnect, and it enables developers to quickly integrate advanced web browser capabilities into their wxWidgets applications.

Basically, it wraps up functionality exposed by the Mozilla Foundation’s Gecko engine (XULRunner) into a set of user-friendly classes to: embed browser controls, search web content, print web pages, interact with the DOM, implement custom content handling for different MIME types, issue POST calls using the current browser state, etc. Notably, with this library you can also embed all of your favorite Firefox browser plug-ins into your application. We’ve also gone out of our way to make sure that getting a browser control up and running in your application is as easy as possible.

More information can be found at the wxWebConnect project page. Also, feel free to view some screenshots and a short video demonstration. If you’re a wxWidgets developer, give it a whirl and let us know what you think.

About

Data and the Web is a blog by Kirix about accessing and working with data, wherever it is located. We have a particular fondness for data usability, ad hoc analysis, mashups, web APIs and, of course, playing around with our data browser.