Kirix Strata Blog

Archive for the ‘dates’ Category

Excel Date Conversion (Days from 1900)

Thursday, February 4th, 2010

Dates tend to always be tricky things to convert, since there are so many different formats used in different software packages.  Today we received a support request about converting Excel dates.  This should have been straightforward, as Excel simply provides the number of days since January 1, 1900.  So, for example:

20    - January 20, 1900
35981 - July 5, 1998
39341 - September 16, 2007

So, as we saw with the JD Edwards Conversion, we should simply need to use the following formula to convert it:

date(1900,1,1) + fieldname - 1

However, it turns out that Excel has a date bug from its very early days, due to an even earlier date bug in Lotus 1-2-3:

“When Lotus 1-2-3 was first released, the program assumed that the year 1900 was a leap year, even though it actually was not a leap year. This made it easier for the program to handle leap years and caused no harm to almost all date calculations in Lotus 1-2-3.

When Microsoft Multiplan and Microsoft Excel were released, they also assumed that 1900 was a leap year. This assumption allowed Microsoft Multiplan and Microsoft Excel to use the same serial date system used by Lotus 1-2-3 and provide greater compatibility with Lotus 1-2-3. Treating 1900 as a leap year also made it easier for users to move worksheets from one program to the other.”

So, in order to properly convert, we need to account for this extra day.  And, we get the final formula:

date(1900,1,1) + fieldname - 2

For further information on dates, please take a look at the other posts we’ve done on this topic in the past.

JD Edwards Date Conversions (CYYDDD)

Thursday, April 16th, 2009

In the last post, we talked about Julian dates and Unix Timestamp conversion.  We recently received a support request regarding another type of date conversion and, since it might be applicable to others, we thought it would be a good idea to show how you can convert it in Strata.  The format is C-YY-DDD, which is used by JD Edwards software: the century after 1900, the year in that century and the Julian date within that year.  Here are a few examples:

105031 - January 31, 2005
107263 - September 20, 2007
98129 - May 9, 1998

To convert the dates we first want to get the string into the proper year and then we’ll add the Julian date to get the proper day/month (which will also take into account leap years).  So, let’s take the example “107263″ (we’ll assume that it is in Numeric format in your field).  Here is the full expression you can use in a calculated field:

date(1900+(field1/1000),1,1)+val(right(str(field1),3))-1

Getting the Year

date(1900+(field/1000),1,1)

The first part of the formula takes the string and divides by 1000, which, when rounded, will provide the first three digits (”107).  It then adds 1900 to the 107 to give us “2007″.  Then the normal date conversion applies and we end up with “01/01/2007.”

Getting the Day and Month

val(right(str(field1),3))-1

Here we are adding the Julian days for the year to the date we created above “01/01/2007″ — so, if we added “1″ to this date, we would get 01/02/2007, and so on.  Here we are using the RIGHT() function to pull out the 3 digits on the right side of the field, in this case “263.”  Because it was numeric, we needed to first convert it to a string using the STR() function and then we converted the result back to a numeric value using the VAL() function.  This gives us “09/21/2007.”  Because we started at “1″ (January 1st) instead of “0″, we simply need to subtract 1 from our date to obtain the correct date of “09/20/2007.”

NOTE:  If your JDE date was a character field instead of a numeric field to begin with, you could change the structure or just convert manually in your expression with the VAL() function as follows:

date(1900+(val(field3)/1000),1,1)+val(right(str(val(field3)),3))-1

If anyone has other date conversions they’re having trouble with, please let us know and we’ll see if we can help.

Converting Julian dates and Unix timestamps

Thursday, March 26th, 2009

Converting date formats can be a tricky thing.  Two of the more cryptic formats that come up in our support questions relate to Julian dates and Unix timestamps.  Here’s how you convert them in Strata:

Julian Dates

A Julian date is the number of days since January 1, 4713 BC GMT.  To convert a Julian date to the current date, you can simply use DATE(0,0,0) + <number>, where <number> is the Julian date.  For for example, if the number appears as “2454917″, the formula in Strata would be:

date(0,0,0) +  2454917  [the result would be March, 26, 2009]

Unix Timestamps

A unix timestamp is the number of seconds since 1970.  To convert a unix timestamp to the current date, use DATE(<number>), where <number> is the number of milliseconds since 1970.  If the timestamp is given in seconds, multiply by 1000:  DATE(<number>*1000).  So, for example, if you the number appears as “1237657172″, the formula in Strata would be:

date(1238088021 * 1000)  [the result would be March, 26, 2009]

As a side note,  sometimes one might mistake a Julian or Unix date for a different custom format.  For instance, dates could be stored in the format YYDDD or YYYYDDD where YY/YYYY is the year, and DDD is the number of days since the beginning of that year.  So 2001003 would be January 3, 2001

For further information on Strata date conversion, please see this help page.

Checking Date Ranges Prior to Analyzing New Data Sets

Thursday, June 26th, 2008

Date Range Extension ImageSeasoned data analysts know that one of the first things you need to do with a new, unfamiliar data set, is to run some basic tests to determine what kind of animal you’re working with.  This is particularly important when working with larger data sets that may be amalgamated from multiple systems or appended together from archived files.

One of these tests is a date range check.  So, for example, if a client has shipped you all the data from the first 6 months of 2007, you want to make sure you actually have a full, complete 6 months of data to work with.  In fact, you’d like to see something like this:

12/2006 -  43 records
01/2007 - 255 records
02/2007 - 249 records
03/2007 - 265 records
04/2007 - 287 records
05/2007 - 259 records
06/2007 - 263 records
07/2007 -  53 records

The outlying dates on the end (12/2006 and 7/2007) do provide some comfort that the data set is truly complete.  However, it is surprising how often you’ll actually see something like this:

01/1999 - 196 records
12/2006 -  43 records
01/2007 - 255 records
02/2007 - 249 records
03/2007 -  96 records
04/2007 - 287 records
05/2007 - 259 records
06/2007 - 263 records
07/2007 -  53 records

This second example is a dirtier data set; there is a strange, high-count outlier from 1999 and we also see that there was a significant drop in the record count during March 2007.

Before you actually start performing your analysis, you’d want to investigate the items from 1999, which could just be empty records that can be ignored or, worse, could be something wrong with the formatting of these records.  The precipitous drop in March 2007 is a little more worrisome.  Was it because sales dipped drastically that month or was it because there was an error when the IT department appended this data set together?

Whatever the cause, it’s better to get your data in order and make sure you have a complete set before jumping into your analysis and providing that client with incorrect or skewed results.  In order to help you to do this, we’ve created a simple date range analysis extension.  Running this utility on a new data set from the get-go can save you a lot of time and hassle later on.

You can install the date range analysis extension and learn how to use it here.  Got some other data utilities you’d like in your toolkit?  Let us know.