Tuesday, September 29, 2015

Next

Next

Next is one of the most useful ideas that mathematics ever invented.  We often start a list without thinking about how large it might become. Often we will implicitly make a choice,  three digits mean no more than 1000 members, 4 digits mean 10,000.  The nine digit social security number runs out of space at only 1,000,000,000
people, and with a population of 350,000,000 we are starting to get close.

If you are naming things using sequence numbers, next is simple. Just add 1 to the last number you used.  If you are at 5392 then the next one is 5393.  Simple.  You never run out. They are all different.

But what if you want more information.  Suppose that the number is stamped on the object you are numbering.  When you see the object at some future time, you want to know when it was numbered without looking it up in your records.  Well, if you want to more information in the number, it will have to be longer.  for example,  2014-07-23-5393.  5393 was numbered on July 23, 2014.
So our simple sequence number has expanded by 8 digits, and you can now tell when it was made.  The dashes make it easier to read, 201407235393 is hard to read.  And next year 2015110127281 will
be harder to read.  (2015-11-01-27281)

How can we make this numbering system better?  Let us look at several options with two things in mind.  1:  It should be easy for the user to understand the numbers.  2:  Fewer characters is better.

If we leave the dashes out, the string of numbers becomes shorter, easier to write, BUT harder to understand.  In this case, the punctuation wins.  The dashes are important to understanding.   what about the year?  Will you be numbering these things in the year 3247?  Well, you will not, but will someone be doing it? What about 2147, will someone be doing this in a hundred years?   So can we drop the first two characters?
         2014-07-23-5393 => 14-07-23-5393 ?

A casual user seeing this for the first time might identify 2014 as a year.  And from that, decode the whole sequence number.  Is that a good thing.  Do you want just anybody to identify the date?  That takes the whole matter up a level.  A question for management.

Similarly, the month identifier, 07.  If we use O for October, N for November and D for December, we can drop the zero. 1 for January,  6 for June, 7 for July.  Two characters become one character.  We add two characters to the character set.
           2014-07-23-5393 => 14-7-23-5393.

For the day, we could do a similar thing, but adding 22 characters
would be very confusing.  Hard to use.

The 5393 is next.  If you know the final number, 5393, you can look up all the other information. It uniquely identifies the object it is assigned to.  Suppose that instead of using 5393 we use the number since the day started, and further suppose that we never have as many as 100 new things in a day.  so if 5393 is the sixth thing numbered on July 23, 2014 we can assign 14-7-23-06 to the object.

So we have assigned an unique identifier to objects that arise somehow no more than 99 on one day between 1/1/2000 and 12/31/2099.  And the date the identifier was assigned is easy to read.  And 7 characters are used.

Unfortunately,  a normal sorting program will put 15-D-07-22 before 15-N-07-22; November comes before December.  Bummer!


No comments:

Post a Comment