tel2 asked:

Linux 'sort' order: cron vs command line

Hi Experts,

If I run this Bash script:
#!/bin/bash

echo "Unsorted:"
cat sort_test11
od -bc sort_test11
LC_ALL=""
echo "LC_All=$LC_ALL"
/bin/sort sort_test11
LC_ALL=C
echo "LC_All=$LC_ALL"
/bin/sort sort_test11
LC_ALL=en_US.utf8
echo "LC_All=$LC_ALL"
/bin/sort sort_test11


From the command line I get this output:

Unsorted
z
a
_
A
Z
0000000 172 012 141 012 137 012 101 012 132 012
          z  \n   a  \n   _  \n   A  \n   Z  \n
0000012
LC_ALL=
A
Z
_
a
z
LC_ALL=C
A
Z
_
a
z
LC_ALL=en_US.utf8
_
a
A
z
Z

But from cron I get this output:

Unsorted
z
a
_
A
Z
0000000 172 012 141 012 137 012 101 012 132 012
          z  \n   a  \n   _  \n   A  \n   Z  \n
0000012
LC_ALL=
_
a
A
z
Z
LC_ALL=C
_
a
A
z
Z
LC_ALL=en_US.utf8
_
a
A
z
Z

As you can see, the command line and cron sort outputs differ when LC_ALL is empty or 'C'.  All sorting in the cron job appears to be case-insensitive, whereas the command line run is case-insensitive only when LC_ALL=en_US.utf8.

Questions:
Q1. Why does sorting differ when the same script is run from cron and from the command line?

Q2. How can I get case-sensitive sorting from the cron job?

I haven't found anything useful in "man sort" on this platform yet, nor on Google.

Here's the version of sort:
$ /bin/sort --version
sort (GNU coreutils) 8.4
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Mike Haertel and Paul Eggert.

And the version of CentOS:
$ uname -a
Linux <servername>.<domainname>.com 2.6.32-358.11.1.el6.x86_64 #1 SMP Wed Jun 12 03:34:52 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Thanks.
tel2
ASKER CERTIFIED SOLUTION
John Pope

This solution is only available to members of Experts Exchange.

tel2

ASKER

Thanks for that, popesy.

> "Have you tried setting the LC_ALL var in the /etc/crontab?"
No, but I can't imagine how that could help if it doesn't work in the script.  But see below.

> "Also, shouldn't you be 'export'ing the LC_ALL var to set it?"
Good point!  I had tried this in some of my tests, but obviously not the right tests.  Here's the new cron output:
sort_test11
z
a
_
A
Z
0000000 172 012 141 012 137 012 101 012 132 012
          z  \n   a  \n   _  \n   A  \n   Z  \n
0000012
export LC_ALL=
_
a
A
z
Z
export LC_ALL=C
A
Z
_
a
z
export LC_ALL=en_US.utf8
_
a
A
z
Z
As you can see, we have success.  With LC_ALL=C exported, the sort is now case-sensitive!

Exporting LC_ALL did not change the script's output when run from the command line.
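
For anyone landing here with the same problem, the fix above can be sketched in two forms. This is a minimal illustration, not the exact script from the question: the temporary file and the variable names prefix_sorted and export_sorted are made up for the example, and plain "sort" is used instead of /bin/sort.

```shell
#!/bin/bash
# Recreate the five-line test file from the question in a temporary file.
tmpfile=$(mktemp)
printf 'z\na\n_\nA\nZ\n' > "$tmpfile"

# Option 1: prefix assignment. LC_ALL=C is placed in the environment of this
# single sort command only; nothing else in the script is affected.
prefix_sorted=$(LC_ALL=C sort "$tmpfile")

# Option 2: export once. Every command started later in the script inherits
# LC_ALL=C from the environment.
export LC_ALL=C
export_sorted=$(sort "$tmpfile")

echo "prefix form:"
echo "$prefix_sorted"
echo "export form:"
echo "$export_sorted"

rm -f "$tmpfile"
```

Both forms should give the byte-order result (A, Z, _, a, z) whether the script runs from cron or from a terminal. The prefix form is handy when only one command needs the C locale.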

But that raises a (possibly just academic) question:
Q3. Why is exporting LC_ALL not required in the script when it's run from the command line?  Could it be that LC_ALL is somehow already exported in my command line environment?  How can I tell?  The command "export -p | grep LC" gives me no output.
If I run "locale" from cron I get this output:
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
If I run it from the command line I get this:
LANG=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
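
The two listings differ in LANG: cron has LANG=en_US.UTF-8, the command line has it empty, which leaves every category at POSIX. When LC_ALL is not actually in the environment, sort falls back to LC_COLLATE and then LANG, so that difference alone can change the order. The sketch below tries to imitate the two environments with "env -i" (which starts a command with an empty environment). It assumes the en_US.UTF-8 locale is installed; the variable names are illustrative, and it does not claim to account for every line of the earlier output.

```shell
#!/bin/bash
# Recreate the five-line test file from the question in a temporary file.
tmpfile=$(mktemp)
printf 'z\na\n_\nA\nZ\n' > "$tmpfile"

# env -i also clears PATH, so resolve sort to a full path first.
sort_bin=$(command -v sort)

# Like the command line listing: no LANG and no LC_* variables at all,
# so sort uses the POSIX (C) collation, i.e. plain byte order.
posix_like=$(env -i "$sort_bin" "$tmpfile")

# Like the cron listing: only LANG is set.
# Assumption: en_US.UTF-8 is installed. If it is not, sort silently falls
# back to the C locale and this output will match the one above.
cron_like=$(env -i LANG=en_US.UTF-8 "$sort_bin" "$tmpfile")

echo "empty environment:"
echo "$posix_like"
echo "LANG=en_US.UTF-8 only:"
echo "$cron_like"

rm -f "$tmpfile"
```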
Hi tel2

For your shell environment, try:

env | grep LC


Cheers, JP
tel2

ASKER

Hi popesy,

Check this out:

$ env | grep LC      -> Nothin'
$ LC_ALL=C
$ env | grep LC      -> Nothin'
$ export LC_ALL=C
$ env | grep LC
LC_ALL=C

As you can see, the only time the grep returned anything was after I manually exported LC_ALL=C.  So I still don't know the answer to Q3, and unless you or someone else has any other ideas soon, I might have to just leave that academic question as one of those unsolved mysteries of the universe, and award points.
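
On the "How can I tell?" part of Q3, here is a small sketch of two ways to check whether a variable is exported: bash's "declare -p", which shows the export attribute as -x, and "printenv", which only sees exported variables. It also shows one general shell behaviour that may be relevant: a variable that is already in the environment when a shell starts stays exported when it is reassigned. Whether that applies to the session in this thread is not confirmed. The names before, after, and inherited are just for the example.

```shell
#!/bin/bash
unset LC_ALL          # known starting state: not set, not exported

LC_ALL=C              # plain assignment: a shell variable only
declare -p LC_ALL     # bash prints: declare -- LC_ALL="C"
if printenv LC_ALL >/dev/null; then before="exported"; else before="not exported"; fi

export LC_ALL         # mark the existing variable for export
declare -p LC_ALL     # bash prints: declare -x LC_ALL="C"
if printenv LC_ALL >/dev/null; then after="exported"; else after="not exported"; fi

# A variable that arrives through the environment keeps its export
# attribute when the child shell reassigns it, with no "export" needed there.
inherited=$(LC_ALL=C sh -c 'LC_ALL=POSIX; printenv LC_ALL')

echo "after plain assignment: $before"
echo "after export:           $after"
echo "inherited, then reassigned, child sees: $inherited"
```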
Hi tel2

It seems that distros differ in this regard.  I've got RHEL and SLES, and the 'locale' output is slightly different on each.  On both, LC_ALL is not set at the command line unless I explicitly export it.

There are a few places to check the default settings:

/etc/sysconfig/language (SLES)

or

/etc/sysconfig/i18n (RH and others, perhaps CentOS?)
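
A quick way to look at whichever of those files exists is sketched below. It only uses the two paths named above; the grep pattern and the counter "checked" are assumptions for the example, and other distros may keep these settings elsewhere.

```shell
#!/bin/bash
# Print locale-related lines from the system-wide defaults files, if present.
checked=0
for f in /etc/sysconfig/i18n /etc/sysconfig/language; do
    checked=$((checked + 1))
    if [ -r "$f" ]; then
        echo "== $f =="
        # Show any line mentioning LANG or an LC_ variable.
        grep -E 'LANG|LC_' "$f"
    else
        echo "$f: not readable on this system"
    fi
done
```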

The LC_* variables control specific locale categories (which is what you want!), so each one can be overridden as required.  I know this doesn't answer your Q3 exactly, but it may give some additional understanding.
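
To illustrate the override order, here is a minimal sketch using the same five-line test data as the question. For collation, LC_ALL takes precedence over LC_COLLATE, which takes precedence over LANG. The temporary file and variable names are made up for the example. The en_US.UTF-8 values are only there to be overridden, so the result should not depend on that locale being installed.

```shell
#!/bin/bash
# Recreate the five-line test file from the question in a temporary file.
tmpfile=$(mktemp)
printf 'z\na\n_\nA\nZ\n' > "$tmpfile"

# env -i clears PATH as well, so resolve sort to a full path first.
sort_bin=$(command -v sort)

# LC_COLLATE overrides LANG for sorting only; other categories such as
# messages and date formats would still follow LANG.
collate_only=$(env -i LANG=en_US.UTF-8 LC_COLLATE=C "$sort_bin" "$tmpfile")

# LC_ALL overrides every LC_* category and LANG.
all_wins=$(env -i LANG=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_ALL=C "$sort_bin" "$tmpfile")

echo "LC_COLLATE=C over LANG:"
echo "$collate_only"
echo "LC_ALL=C over LC_COLLATE and LANG:"
echo "$all_wins"

rm -f "$tmpfile"
```

Setting only LC_COLLATE=C is the narrower choice if a cron job should sort in byte order but keep its other locale settings.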

Cheers, JP.
tel2

ASKER

Thanks for your efforts, JP