Work the Shell - What Day Is That Date in the Past?

In a previous article, we started a script that worked backward from a day and month date and figured out the most recent year—including possibly the current year—that would match that date occurring on that particular day. For example, April 1st as a Friday was most recently in this year, 2011, but April 1st as a Tuesday? When did that last occur?

To make things interesting, our script is focused on tapping in to one of the unsung utilities of Linux, cal, and parsing its output to identify a day for a given date.

As is typical with a shell script, much of the work so far has been involved in normalizing the input data so that what we hand to the cal program will work and be understood by the program.

The bigger challenge, however, was to figure out whether a possible date could be in the current year. Since the program always is looking backward, it needs to know the current date to compare. That is, I'm writing this on April 3, 2011. If I check for the most recent April 1 being a Friday, it should say 2011, but if I check for the most recent May 1 being a Sunday, it should not suggest 2011. That's in the future and isn't a valid answer.

That's all shown in my previous column, so let's get on to something new: figuring out how to parse the cal output.

Parsing cal Calendars

For any given month and year, cal produces output similar to this:


    August 2008
Su Mo Tu We Th Fr Sa
                1  2
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31

Let's say we're looking for August 3rd. To search for it in this output, we need to specify that there should not be a digit before or after the date. This is doable with a simple regex:


$ cal aug 2008 | grep -e '[^0-9]3[^0-9]'
 3  4  5  6  7  8  9

(As you'll learn later, this is insufficient as a regular expression. If you're really paying attention, you're already suspecting it's going to end up being a bit more complicated.)

Now, we need to figure out which field on that line holds the match.

awk to the Rescue

The basic approach is to have awk use a for loop to step through each field on the lines that match the pattern:


{ for (i=1;i<=NF;i++) if ($i~/regex/) print i}

We could use this with the grep statement above, but let's save a command by letting awk do the conditional test too:


$ cal aug 2008 | awk '/regex/ { for (i=1;i<=NF;i++)
  if ($i~/regex/) print i }'

To test this, let's use a regular expression that tests for the 5th day of the month:


[^0-9]5[^0-9]

This kind of works, but there's a problem. If we search for the 10th, because it appears at the very beginning of the line, it doesn't match the regular expression fragment [^0-9]10. The solution means our regex becomes more complicated, but here it is—one that works for the situation where it's possibly either the beginning of the line or the end of the line:


[^0-9]10[^0-9]|^10[^0-9]|[^0-9]10$

The | is a logical "or", so the regex now matches any of three cases: the earlier expression; the day at the beginning of the line (the ^ by itself) followed by not-a-digit; or the day preceded by not-a-digit at the end of the line (the $ notation).
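
If you'd rather not retype that expression for every day, a small helper can build it. The sketch below is one way to do it, not the script's eventual code: day_regex is a made-up function name, and the sample rows are pasted from the August 2008 calendar rather than read from cal.

```shell
#!/bin/sh
# day_regex DAY: print an extended regex that matches DAY as a whole number,
# whether it sits mid-line, at the start of a line, or at the end.
# (day_regex is a hypothetical helper name, not part of the original script.)
day_regex() {
    printf '[^0-9]%s[^0-9]|^%s[^0-9]|[^0-9]%s$' "$1" "$1" "$1"
}

# Two sample rows copied from the August 2008 calendar.
rows='10 11 12 13 14 15 16
17 18 19 20 21 22 23'

# grep needs -E for the | alternation; awk patterns understand it natively.
printf '%s\n' "$rows" | grep -E "$(day_regex 10)"   # prints the row starting with 10
printf '%s\n' "$rows" | grep -E "$(day_regex 23)"   # prints the row ending with 23
```

One case these three alternatives still miss is a row that holds nothing but the number, such as a final 31 with no trailing blanks. Whether cal pads that row varies between implementations, so it's worth checking on your own system.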

Fortunately, we're writing a script so we won't have to type this in more than once. Just as well!

There's another wrinkle in this output. We need to know not only in what field the matching number appears, but also how many fields total are on the matching line. Why? Otherwise, the 2nd falling on a Monday (field 2 of a full seven-field week) would look exactly like the August 2008 case above, where the 2nd falls on a Saturday (field 2 of a two-field week).
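
To see the ambiguity for yourself, compare two hand-typed first weeks. This is only an illustration: show_second is a hypothetical helper, and the second row is a made-up month that begins on a Sunday.

```shell
#!/bin/sh
# show_second: print the second field of a calendar row plus the row's
# field count (hypothetical helper, for illustration only).
show_second() {
    awk '{ print "field 2 is " $2 ", NF=" NF }'
}

# First week of August 2008: the 2nd is a Saturday.
printf '                1  2\n' | show_second    # field 2 is 2, NF=2

# First week of a month that begins on a Sunday: the 2nd is a Monday.
printf ' 1  2  3  4  5  6  7\n' | show_second    # field 2 is 2, NF=7
```

The field number is 2 both times; only NF tells the two weeks apart.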

Here's our test script fragment, so far:


expr="[^0-9]${day}[^0-9]|^${day}[^0-9]|[^0-9]${day}\$"
cal aug 2008 | awk "/$expr/ { print \$0 }"

Notice that we need to use double quotes so that the variable $day is expanded, and then $expr is also expanded, which means that we also need to escape the $0 in this test.

That's not what we want though. The awk statement needs to be more sophisticated, because we want to know the matching field number (for example, day of week 1–7) along with the total number of fields in the matching line. Ready?


expr="[^0-9]${day}[^0-9]|^${day}[^0-9]|[^0-9]${day}\$"
cal aug 2008 | awk "/$expr/ { for (i=1;i<=NF;i++) {
     if (\$i~/${day}/) { print \"i=\"i\", NF=\"NF }}}"

The double quotes add a tiny bit of complication, but really, this is just a complicated script.
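
One way to sidestep the quoting is awk's -v option, which hands the shell value to awk as a variable so the program can stay in single quotes. What follows is a sketch of that alternative, not the column's script: match_day is a hypothetical name, the NR > 2 test assumes cal prints exactly two header lines, and the calendar comes from a here-document so the example doesn't depend on cal being installed.

```shell
#!/bin/sh
# match_day DAY: read cal-style output on stdin and report where DAY sits,
# as "i=<field>, NF=<fields on that row>".
# (match_day is a hypothetical name; -v is a standard awk option.)
match_day() {
    awk -v day="$1" '
        NR > 2 {                    # assumed: line 1 is the title, line 2 the weekday names
            for (i = 1; i <= NF; i++)
                if ($i == day)      # compare the whole field, so 2 never matches 12 or 20
                    print "i=" i ", NF=" NF
        }'
}

# With cal installed, the equivalent would be: cal aug 2008 | match_day 19
match_day 19 <<'EOF'
    August 2008
Su Mo Tu We Th Fr Sa
                1  2
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31
EOF
```

Because each field is compared as a whole, this version needs no line-matching regex at all, at the cost of looping over every row.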

The output, against our August 2008 calendar, looks like this:


$ sh match.sh 2
i=2, NF=2
$ sh match.sh 10
i=1, NF=7
$ sh match.sh 19
i=3, NF=7

That all makes sense. The next challenge is to figure out what day of the week we've matched for a given day and number of days in the week. Remember, day #1 on a three-day week is Thursday, while day #1 in a seven-day week is Sunday. Confusing, eh?

Day Of Week as an Array

The fast way to calculate this is to, well, pre-calculate it by creating a bunch of arrays. Like this:


if NF=1 days=[Sat]
if NF=2 days=[Fri,Sat]
if NF=3 days=[Thu,Fri,Sat]

and so on. There's a formula at play here, but more important, there's a pattern: (7-NF)+i is consistent. So day #1 on a three-day week is (7-3)+1 = 5 = Thursday, while day #1 on a 7-day week is (7-7)+1 = 1 = Sunday.

Let's double-check: in Aug 2008, Aug 1 is (7-2)+1 = 6 = Friday, Aug 4 is (7-7)+2 = 2 = Monday, and Aug 31 is (7-1)+1 = 7 = Saturday.

Uh-oh, that last one's wrong: August 31 is a Sunday. That shows we need to differentiate between the first week of the month, where the days are right-aligned (as it were!), and the last week of the month, where they're left-aligned.
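
One possible way to handle both alignments, offered as a sketch rather than as where the script ends up, rests on two observations: only the first row of a cal month is right-aligned, and that row can only contain days 1 through 7. The function name weekday_number is hypothetical.

```shell
#!/bin/sh
# weekday_number DAY I NF: turn awk's field position I on a row of NF fields
# into a weekday number (1 = Sunday ... 7 = Saturday).
# Assumption: days 1-7 sit on the right-aligned first row or on the full
# second row; every later row is left-aligned and starts on Sunday.
weekday_number() {
    day=$1 i=$2 nf=$3
    if [ "$day" -le 7 ]; then
        echo $(( (7 - nf) + i ))    # shift right by the number of missing fields
    else
        echo "$i"                   # left-aligned row: the field number is the weekday
    fi
}

weekday_number 1 1 2     # Aug 1, 2008:  6 (Friday)
weekday_number 4 2 7     # Aug 4, 2008:  2 (Monday)
weekday_number 31 1 1    # Aug 31, 2008: 1 (Sunday)
```

For a full seven-field row the shift is zero, so the two branches agree wherever both could apply.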

Ah, another nuance. Crikey, this is rather tricky to write, isn't it?

Next time, we'll continue to build the script. Meanwhile, experiment with awk and regular expressions and see if you can find a more streamlined solution.

______________________

Dave Taylor has been hacking shell scripts for over thirty years. Really. He's the author of the popular "Wicked Cool Shell Scripts" and can be found on Twitter as @DaveTaylor and more generally at www.DaveTaylorOnline.com.

Comments

Cal? Really?

Marcelo (the real one)'s picture

At some point I seriously considered cancelling my LJ subscription just to stop funding Dave Taylor's column, but the quality of the rest of the magazine won at the end. I'm happy to support the "there's more than one way to do it" and "why? because I can!" mantras, but at some point you need to at least run this by an editor who's able to say "dude, you are overcomplicating this way too much, you ought to teach people good habits!"

What's the most recent year when December 25th falls on a Tuesday, including this year?

Proof of concept, coded on the command line:

y=`date +%Y` ; while true ; do case `date +%u --date="December 25, $y"` in 2) echo $y ; break ;; esac ; let y=y-1 ; done

Just in case the previous line is obliterated by the comment sanitizer: http://pastebin.com/3kQ385iQ

wiki?

Anonymous's picture

do we really need a wiki to calculate dates?

As others before me said...

estani's picture

As others before me said... this is the most complicated way of solving a simple task... use
date -d +

man date

(parsing the output of cal?! Really...)

I was doing something useful, got bored and made a space shooter

bolt's picture

If anyone wants to check out a space shooter written in Bash 3 / Bash 4, I just made one...

Straightforward answer using date

Magic Banana's picture

As the previous comments already mentioned, you should read 'info date' and understand its power. Here is a straightforward answer to the problem (notice that you pretend that it is hard "to figure out whether a possible date could be in the current year"... well, that is "my" fourth line, which compares UNIX times):

#!/bin/sh

dow=`date -d $1 +%a`
shift
count=0
if [ `date +%s` -lt `date -d "$*" +%s` ]
then
count=1
fi
until [ `date -d "$count year ago $*" +%a` = $dow ]
do
count=`expr $count + 1`
done
date -d "$count year ago" +%Y

Executions:
$ ./whatyear.sh THURSDAY february 24 # Notice the free formatting
2011
$ ./whatyear.sh Tue Mar 27 # true this year... but in the future
2007

date's power

Magic Banana's picture

As the previous comments already mentioned, you should really read 'info date' and understand its power. Here is my take on your problem (notice that you pretend that it is hard "to figure out whether a possible date could be in the current year"... well, that is my second line, which uses the UNIX time):

#!/bin/sh

count=0
if [ `date +%s` -lt `date -d "$1" +%s` ]
then
count=1
fi
until [ `LC_TIME=C date -d "$count year ago ${1#* }" +%a` = ${1%% *} ]
do
count=`expr $count + 1`
done
date -d "$count year ago" +%Y

Execution:
$ ./whatyear.sh "Thu Feb 24"
2011
$ ./whatyear.sh "Tue Mar 27"
2007

Test script

J. E. Aneiros's picture

I wrote this script to test my idea; it seems to work:


#!/bin/bash

for i in {1..31}
do
wd=$(cal 8 2008 | grep "\b${i}\b" | sed 's/[ ][ ][ ]\|[ ][ ]\|[ ]/|/g; s/^|//; s/|/\n/g' | cat -n | grep "\b${i}\b$" | tr -s [:space:] ' ' | cut -d ' ' -f 2)
echo $i, $wd
done

The result:

1, 6
2, 7
3, 1
4, 2
5, 3
6, 4
7, 5
8, 6
9, 7
10, 1
11, 2
12, 3
13, 4
14, 5
15, 6
16, 7
17, 1
18, 2
19, 3
20, 4
21, 5
22, 6
23, 7
24, 1
25, 2
26, 3
27, 4
28, 5
29, 6
30, 7
31, 1

\b Word boundary

J. E. Aneiros's picture

I think you are not using all the power of RE here:

[^0-9]5[^0-9]

You should use \b5\b, which is simpler and more precise, I think.

Alternative date +%u

Ajay Ramasehan's picture

Hello,

I was reading through the man page for date and came across date +%u, which returns the day of the week as a number from 1 to 7 (Monday being 1), so we could use it like this:

monthinput and dateinput are the month and day of the month to check for, and dayinput is the day of the week to check for (from 1 to 7).

First determine the starting year to check, either the current year or the previous year, then loop:

while [ 1 ]
do
    day=`date +%u -d "$monthinput $dateinput $startyear"`
    if [ $day -eq $dayinput ] ; then
        echo "Found match in year $startyear"
        break
    else
        startyear=$(($startyear-1))
    fi
done

Make short and simply

Billerey Eric's picture

date -d 2008-08-07 +%A

Just use something like for i

The Adminblogger's picture

Just use something like

for i in {2012..2000}; do date -d "1 Apr $i" ; done

and grep the output might be the easier version of your script :-)

Really nice :-) Something

Marcelo (the real one)'s picture

Really nice :-)

Something like this gets closer to a solution to the problem as posted:

When was April 1st a Monday?

for i in {2012..2000}; do date -d "1 Apr $i" "+%u %Y" ; done | grep '^1 ' | cut -d' ' -f2

Well that's one way to do it

Deltaray's picture

Here is another way. This will show the last 5 times (including a future date of this year) April 1st happened on a Sunday.

n=0;for y in $( seq $(date +%Y) -1 0 );do [[ $( date -d "Apr 1 $y" +%A ) == "Sunday" ]]&&echo $y&&true $[n++];[[ $n -ge 5 ]]&&break;done

You could probably easily turn that into a function that takes two args and excludes future dates.
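
A sketch of such a function follows. It's one reading of the suggestion rather than Deltaray's code: last_year_on and its optional third argument are hypothetical, weekdays are given as date +%u numbers (1 = Monday through 7 = Sunday), and it assumes GNU date for the -d option.

```shell
#!/bin/sh
# last_year_on "MONTH DAY" DOW [TODAY]
# Print the most recent year, up to and including TODAY's, in which
# MONTH DAY fell on weekday DOW (date +%u numbering: 1 = Monday ... 7 = Sunday).
# TODAY (YYYY-MM-DD) defaults to the current date; passing it explicitly
# makes the result repeatable. Assumes GNU date (-d).
last_year_on() {
    today=${3:-$(date +%Y-%m-%d)}
    limit=$(date -d "$today" +%Y%m%d) || return 1
    # Reject input date cannot parse (2000 is a leap year, so "Feb 29" passes).
    date -d "$1 2000" >/dev/null 2>&1 || return 1
    y=${today%%-*}
    while [ "$y" -gt 1900 ]; do
        ymd=$(date -d "$1 $y" +%Y%m%d 2>/dev/null)
        dow=$(date -d "$1 $y" +%u 2>/dev/null)
        # Skip years where the date doesn't exist, is still in the future,
        # or lands on a different weekday.
        if [ -n "$ymd" ] && [ "$ymd" -le "$limit" ] && [ "$dow" = "$2" ]; then
            echo "$y"
            return 0
        fi
        y=$((y - 1))
    done
    return 1
}

# The article's examples, with "today" pinned to April 3, 2011:
last_year_on "Apr 1" 5 2011-04-03    # Friday: 2011
last_year_on "May 1" 7 2011-04-03    # Sunday: 2005, since May 1, 2011 is still ahead
```

Comparing dates as YYYYMMDD integers keeps the future-date check to a single numeric test.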
