numtech


Java daemon under Linux: file name encoding problem with start-stop-daemon

Hello,

We have developed a Java 6 SSLSocket server that runs as a daemon under Debian 5.0, started with start-stop-daemon from an /etc/init.d/ssl-server script.
We face a problem when accessing the filesystem with paths containing non-ASCII characters.
The problem occurs only when the server is started through the daemon wrapper, not when it is started from a user shell.

After some research we found that the encoding of file names inside the JVM is set by the system property "sun.jnu.encoding". In this case, it was initialized to "ANSI_X3.4-1968" instead of "UTF-8".
So we tried to start the server with the command:
java -Dsun.jnu.encoding=UTF-8 -jar server.jar


This changed the system property inside the JVM but had no real effect on the file name encoding (I guess the property is overridden too late, when the FileSystem object is already instantiated).
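
The value ANSI_X3.4-1968 is the name glibc gives to plain ASCII, which is what the C/POSIX locale uses. As a rough check, and assuming a system that ships the `locale` utility, the following sketch prints the codeset the C library derives from a given LANG value. `charmap_for` is a hypothetical helper name, not something from this thread.

```shell
#!/bin/sh
# Print the codeset the C library derives from a given LANG value,
# with LC_ALL and LC_CTYPE cleared so that only LANG is consulted.
# charmap_for is a hypothetical helper name.
charmap_for() {
    (
        unset LC_ALL LC_CTYPE
        LANG=$1
        export LANG
        locale charmap 2>/dev/null
    )
}

charmap_for C             # on glibc this reports ANSI_X3.4-1968, i.e. ASCII
charmap_for fr_FR.UTF-8   # reports UTF-8 only if that locale is installed
```

Running it once from a login shell and once from the init script would show whether the two environments really differ before the JVM is even started.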

Finally we found a workaround: adding a command that sets the locale of the shell instance of the Linux daemon at the top of our /etc/init.d/ssl-server script.
LANG=fr_FR.UTF-8; export LANG;


It solved the problem when our daemon was launched by monit, but not when it was launched by puppetd (which runs as the user "puppet", which has no shell).
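
One way to make the daemon's locale independent of whoever runs the init script is to pass it explicitly on the command that starts the JVM, and to set LC_ALL as well as LANG, since LC_ALL takes precedence over LANG. This is only a sketch: `run_with_locale` is a hypothetical helper, and it has not been confirmed in this thread that it fixes the puppetd case.

```shell
#!/bin/sh
# Hypothetical helper: run a command with an explicit locale, whatever
# LANG/LC_* values the caller (monit, puppetd, cron, ...) handed down.
run_with_locale() {
    locale_name=$1
    shift
    # LC_ALL overrides LANG and every LC_* variable, so set it too.
    env LANG="$locale_name" LC_ALL="$locale_name" "$@"
}

# Possible use in the start function of the init script:
# run_with_locale fr_FR.UTF-8 start-stop-daemon --start --quiet --background \
#     -d $WORKINGDIR --make-pidfile --pidfile $PIDFILE --exec $DAEMON -- $DAEMON_ARGS
```

start-stop-daemon passes its own environment on to the process it starts, so the JVM would see both variables.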

For now we have decided to prevent puppetd from restarting our server: it is just killed and then restarted by monit.
Still, I would like to fully understand the problem and learn from the time it has cost us.

Can someone explain to us what is really happening and what the best practice is?

Best regards,

Renaud
CEHJ

>>It was changing the System.property inside the JVM but had no real effect on the file encoding

You should try the following:

java -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8 -jar server.jar
numtech

ASKER

Well, file.encoding is the encoding used for converting between String and bytes in Writers and Readers, so it affects the content of files.
The property sun.jnu.encoding is the one used for converting String to bytes when resolving java.io.File paths. This is the one we want to set to UTF-8.

But unfortunately, using -Dsun.jnu.encoding=UTF-8 on the command line WILL change the value of System.getProperty("sun.jnu.encoding") but WILL NOT change the default value used to resolve java.io.File paths. This is precisely our problem.
As I said before, I guess this is because the FileSystem object is instantiated before the default properties are overridden with the ones from the command line.

When you change the locale of your shell, it DOES affect both the sun.jnu.encoding property and the java.io.File behavior. This is why we ended up using the export LANG; trick!
ASKER CERTIFIED SOLUTION
Mick Barry
This solution is only available to members.
numtech

ASKER

Thank you "Objects" for answering.
I am happy to hear that, according to you, LANG is the correct solution, but can you elaborate a little? Why does Sun provide the sun.jnu.encoding property if it can only be modified too late in the startup process?
How is the mapping between user environment variables and system properties done at startup?
Why does the LANG export at the top of /etc/init.d/ssl-server solve the problem when executed by monit but not when executed by puppetd?
Is there a user environment variable other than LANG that overrides sun.jnu.encoding during the mapping?
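
On the last question: the JDK source posted later in this thread calls setlocale(LC_CTYPE, ""), and POSIX defines the lookup order for that call as LC_ALL first, then LC_CTYPE, then LANG. So both LC_ALL and LC_CTYPE can override LANG. A minimal sketch of that precedence follows; `effective_ctype` is a hypothetical helper that only mirrors the documented order and does not call the C library.

```shell
#!/bin/sh
# Mirror the lookup order used by setlocale(LC_CTYPE, ""):
# LC_ALL, then LC_CTYPE, then LANG; empty values count as unset.
# effective_ctype is a hypothetical helper name.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # glibc falls back to the "C" locale when nothing is set
        printf '%s\n' "C"
    fi
}

# LANG asks for UTF-8, but an inherited LC_ALL wins:
( unset LC_CTYPE; LC_ALL=C; LANG=fr_FR.UTF-8; effective_ctype ) 2>/dev/null   # prints C
```

If something in the calling chain exports LC_ALL=C, setting LANG alone in the init script would therefore have no effect on the JVM.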
SOLUTION
This solution is only available to members.
numtech

ASKER

Below is the code of our /etc/init.d/ssl-server.
We can observe that running /etc/init.d/ssl-server start via puppetd --test in a root shell echoes "LANG=fr_FR.UTF-8" twice into the test_LANG.txt file, but outputs sun.jnu.encoding=ANSI-xxxx in the Java log.
Running the exact same command via monit echoes "LANG=fr_FR.UTF-8" twice into the test_LANG.txt file and outputs sun.jnu.encoding=UTF-8 in the Java log.

What's happening here?
Can you see a problem with the wrapper (code provided below)?

 
#! /bin/sh
### BEGIN INIT INFO
# Provides:          xxxxx java runtime
# Required-Start:    $remote_fs
# Required-Stop:     $remote_fs
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Example initscript
# Description:       This file should be used to construct scripts to be
#                    placed in /etc/init.d.
### END INIT INFO

# Author: sfqsdfsd <contact@xxxxx.fr>
#
# Please remove the "Author" lines above and replace them
# with your own name if you copy and modify this script.

## Do NOT "set -e"
LANG=fr_FR.UTF-8; export LANG;

#Get runtime
RUNTIME=`ls /usr/local/bin/xxxxx/ssl-server/bin | grep jar-with-dependencies.jar | grep runtime`

# PATH should only include /usr/* if it runs after the mountnfs.sh script
PATH=/sbin:/usr/sbin:/bin:/usr/bin
DESC="xxxxx SSL Service Server"
NAME=java
DAEMON=/usr/bin/$NAME
DAEMON_ARGS=" -Xmx1024M -cp $RUNTIME fr.xxxxx.sync.propagate.server.ServerCLI 3682 /usr/local/bin/xxxxx/ssl-server/truststore.jks /usr/local/bin/xxxxx/ssl-server/keystore.jks /home/datasync-sat /var/log/xxxxx DEBUG"
PIDFILE=/var/run/ssl-server.pid
SCRIPTNAME=/etc/init.d/ssl-server
WORKINGDIR=/usr/local/bin/xxxxx/ssl-server/bin

# Exit if the package is not installed
[ -x "$DAEMON" ] || exit 0

# Read configuration variable file if it is present
[ -r /etc/default/$NAME ] && . /etc/default/$NAME

# Load the VERBOSE setting and other rcS variables
. /lib/init/vars.sh

# Define LSB log_* functions.
# Depend on lsb-base (>= 3.0-6) to ensure that this file is present.
. /lib/lsb/init-functions


# Function that starts the daemon/service
#

do_start()
{

        # Return
        #   0 if daemon has been started
        #   1 if daemon was already running
        #   2 if daemon could not be started

echo $LANG >> test_LANG.txt

        start-stop-daemon --start --quiet --background -d $WORKINGDIR --make-pidfile --pidfile $PIDFILE --exec $DAEMON --test > /dev/null \
                || return 1
        start-stop-daemon --start --quiet --background -d $WORKINGDIR --make-pidfile --pidfile $PIDFILE --exec $DAEMON -- \
                $DAEMON_ARGS

echo $LANG >> test_LANG.txt

#        start-stop-daemon --start -d $WORKINGDIR --make-pidfile --pidfile $PIDFILE --exec $DAEMON -- \
#                $DAEMON_ARGS \
         echo "Starting ssl server with pid: "`cat $PIDFILE`
        # Add code here, if necessary, that waits for the process to be ready
        # to handle requests from services started subsequently which depend
        # on this one.  As a last resort, sleep for some time.

}

#
# Function that stops the daemon/service
#
do_stop()
{
        # Return
        #   0 if daemon has been stopped
        #   1 if daemon was already stopped
        #   2 if daemon could not be stopped
        #   other if a failure occurred

        start-stop-daemon --stop --quiet --retry=TERM/50/KILL/5 --pidfile $PIDFILE --name $NAME
        # Wait for children to finish too if this is a daemon that forks
        # and if the daemon is only ever run from this initscript.
        # If the above conditions are not satisfied then add some other code
        # that waits for the process to drop all resources that could be
        # needed by services started subsequently.  A last resort is to
        # sleep for some time.
#        start-stop-daemon --stop --quiet --oknodo --retry=0/30/KILL/5 --exec $DAEMON
 #       [ "$?" = 2 ] && return 2
        # Many daemons don't delete their pidfiles when they exit.
        echo "Stopping ssl server with pid: "`cat $PIDFILE`
        rm -f $PIDFILE
}

#
# Function that sends a SIGHUP to the daemon/service
#
do_reload() {
        #
        # If the daemon can reload its configuration without
        # restarting (for example, when it is sent a SIGHUP),
        # then implement that here.
        #
        start-stop-daemon --stop --signal 1 --quiet --pidfile $PIDFILE --name $NAME
        return 0
}

do_status() {

        # Count the processes whose command line contains our arguments
        val=`ps ax | grep -v grep | grep -c -F -- "$DAEMON_ARGS"`
        case "$val" in
                0) echo "Runtime is not running" ;;
                1) echo "Runtime is running" ;;
                *) echo "Too many instances of Runtime are running" ;;
        esac
}

case "$1" in
  start)
        [ "$VERBOSE" != no ] && log_daemon_msg "Starting $DESC" "$NAME"
        do_start
       case "$?" in
                0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
                2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
        esac
        ;;
  stop)
        [ "$VERBOSE" != no ] && log_daemon_msg "Stopping $DESC" "$NAME"
        do_stop
        case "$?" in
                0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
                2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
        esac
        ;;
  status)
                do_status
                ;;
  #reload|force-reload)
        #
        # If do_reload() is not implemented then leave this commented out
        # and leave 'force-reload' as an alias for 'restart'.
        #
        #log_daemon_msg "Reloading $DESC" "$NAME"
        #do_reload
        #log_end_msg $?
        #;;
  restart|force-reload)
        #
        # If the "reload" option is implemented then remove the
        # 'force-reload' alias
        #
        log_daemon_msg "Restarting $DESC" "$NAME"
        do_stop
        case "$?" in
          0|1)
                do_start
                case "$?" in
                        0) log_end_msg 0 ;;
                        1) log_end_msg 1 ;; # Old process is still running
                        *) log_end_msg 1 ;; # Failed to start
                esac
                ;;
          *)
                # Failed to stop
                log_end_msg 1
                ;;
        esac
        ;;
  *)
        #echo "Usage: $SCRIPTNAME {start|stop|restart|reload|force-reload}" >&2
        echo "Usage: $SCRIPTNAME {start|stop|restart|force-reload}" >&2
        exit 3
        ;;
esac
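
The two `echo $LANG` lines in do_start only record LANG. If LC_ALL or LC_CTYPE is set by the caller, it takes precedence over LANG and would not show up in test_LANG.txt at all. A small debugging sketch that logs every exported locale variable instead; `dump_locale_env` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical debugging helper for do_start: print every exported locale
# variable, not only LANG, so an LC_ALL or LC_CTYPE override becomes visible.
dump_locale_env() {
    env | grep -E '^(LANG|LANGUAGE|LC_[A-Z_]+)=' | sort
}

# Possible replacement for the two "echo $LANG >> test_LANG.txt" lines:
# dump_locale_env >> test_LANG.txt
```

Comparing the output of a puppetd run with that of a monit run would show whether the two callers hand down different LC_* values.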


SOLUTION
This solution is only available to members.
numtech

ASKER

OK, I will have a look at the source code and get back to you!
numtech

ASKER

OK, I found the place in the HotSpot sources. I can find the lines where LANG is read and where sun.jnu.encoding is set.

- sources jdk 6 23\motif\lib\Xm\XmString.c(56): #define env_variable "LANG"
- sources jdk 6 23\j2se\src\solaris\native\java\lang\java_props_md.c(182): sprops.sun_jnu_encoding = sprops.encoding;

But it is not easy to understand why, in some cases, the LANG variable is not found. I would need a real C++ environment, and I am not fluent in this language.

I think I'll give up, since we have a workaround: always using monit to start the server...

 
/*
 * %W% %E%
 *
 * Copyright (c) 2006, Oracle and/or its affiliates. All rights reserved.
 * ORACLE PROPRIETARY/CONFIDENTIAL. Use is subject to license terms.
 */

#ifdef __linux__
#include <stdio.h>
#include <ctype.h>
#endif
#include <pwd.h>
#include <locale.h>
#ifndef ARCHPROPNAME
#error "The macro ARCHPROPNAME has not been defined"
#endif
#include <sys/utsname.h>	/* For os_name and os_version */
#include <langinfo.h>           /* For nl_langinfo */
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/param.h>
#include <time.h>
#include <errno.h>

#include "locale_str.h"
#include "java_props.h"

#ifdef __linux__
#define CODESET _NL_CTYPE_CODESET_NAME
#else
#ifdef ALT_CODESET_KEY
#define CODESET ALT_CODESET_KEY
#endif
#endif

/* Take an array of string pairs (map of key->value) and a string (key).
 * Examine each pair in the map to see if the first string (key) matches the
 * string.  If so, store the second string of the pair (value) in the value and
 * return 1.  Otherwise do nothing and return 0.  The end of the map is
 * indicated by an empty string at the start of a pair (key of "").
 */
static int
mapLookup(char* map[], const char* key, char** value) {
    int i;
    for (i = 0; strcmp(map[i], ""); i += 2){
        if (!strcmp(key, map[i])){
            *value = map[i + 1];
            return 1;
        }
    }
    return 0;
}

/* This function sets an environment variable using envstring.
 * The format of envstring is "name=value".
 * If the name has already existed, it will append value to the name.
 */
static void
setPathEnvironment(char *envstring)
{
    char name[20], *value, *current;
    
    value = strchr(envstring, '='); /* locate name and value separator */
    
    if (! value)
	return; /* not a valid environment setting */
    
    /* copy first part as environment name */
    strncpy(name, envstring, value - envstring);
    name[value-envstring] = '\0';
    
    value++; /* set value point to value of the envstring */
    
    current = getenv(name);
    if (current) {
	if (! strstr(current, value)) {
	    /* value is not found in current environment, append it */
	    char *temp = malloc(strlen(envstring) + strlen(current) + 2);
        strcpy(temp, name);
        strcat(temp, "=");
        strcat(temp, current);
        strcat(temp, ":");
        strcat(temp, value);
        putenv(temp);
	}
	/* else the value has already been set, do nothing */
    }
    else {
	/* environment variable is not found */
	putenv(envstring);
    }
}

#ifndef P_tmpdir
#define P_tmpdir "/var/tmp"
#endif

/* This function gets called very early, before VM_CALLS are setup.
 * Do not use any of the VM_CALLS entries!!!
 */
java_props_t *
GetJavaProperties(JNIEnv *env)
{
    static java_props_t sprops = {0};
    char *v; /* tmp var */

    if (sprops.user_dir) {
        return &sprops;
    }

    /* tmp dir */
    sprops.tmp_dir = P_tmpdir;

    /* Printing properties */
    sprops.printerJob = "sun.print.PSPrinterJob";

    /* patches/service packs installed */
    sprops.patch_level = "unknown";
    
    /* Java 2D properties */
    sprops.graphics_env = "sun.awt.X11GraphicsEnvironment";
    sprops.awt_toolkit = NULL;

    /* This is used only for debugging of font problems. */
    v = getenv("JAVA2D_FONTPATH");
    sprops.font_dir = v ? v : NULL;

#ifdef SI_ISALIST
    /* supported instruction sets */
    {
        char list[258];
        sysinfo(SI_ISALIST, list, sizeof(list));
        sprops.cpu_isalist = strdup(list);
    }
#else
    sprops.cpu_isalist = NULL;
#endif

    /* endianness of platform */
    {
        unsigned int endianTest = 0xff000000;
        if (((char*)(&endianTest))[0] != 0)
            sprops.cpu_endian = "big";
        else
            sprops.cpu_endian = "little";
    }

    /* os properties */
    {
        struct utsname name;
	uname(&name);
	sprops.os_name = strdup(name.sysname);
	sprops.os_version = strdup(name.release);

        sprops.os_arch = ARCHPROPNAME;

        if (getenv("GNOME_DESKTOP_SESSION_ID") != NULL) {
            sprops.desktop = "gnome";
	}
        else {
            sprops.desktop = NULL;
        }
    }

    /* Determine the language, country, variant, and encoding from the host,
     * and store these in the user.language, user.country, user.variant and
     * file.encoding system properties. */
    {
        char *lc;
        lc = setlocale(LC_CTYPE, "");
#ifndef __linux__
        if (lc == NULL) {
            /*
             * 'lc == null' means system doesn't support user's environment
             * variable's locale.
             */
          setlocale(LC_ALL, "C");
          sprops.language = "en";
          sprops.encoding = "ISO8859-1";
          sprops.sun_jnu_encoding = sprops.encoding;
        } else {
#else
        if (lc == NULL || !strcmp(lc, "C") || !strcmp(lc, "POSIX")) {
            lc = "en_US";
        }
        {
#endif

            /*
             * locale string format in Solaris is
             * <language name>_<country name>.<encoding name>@<variant name>
             * <country name>, <encoding name>, and <variant name> are optional.
             */
            char temp[64];
            char *language = NULL, *country = NULL, *variant = NULL,
                 *encoding = NULL;
            char *std_language = NULL, *std_country = NULL, *std_variant = NULL,
                 *std_encoding = NULL;
            char *p, encoding_variant[64];
            int i, found;

#ifndef __linux__
            /*
             * Workaround for Solaris bug 4201684: Xlib doesn't like @euro
             * locales. Since we don't depend on the libc @euro behavior,
             * we just remove the qualifier.
             * On Linux, the bug doesn't occur; on the other hand, @euro
             * is needed there because it's a shortcut that also determines
             * the encoding - without it, we wouldn't get ISO-8859-15.
             * Therefore, this code section is Solaris-specific.
             */
	    lc = strdup(lc);	/* keep a copy, setlocale trashes original. */
            strcpy(temp, lc);
	    p = strstr(temp, "@euro");
	    if (p != NULL) 
		*p = '\0';
            setlocale(LC_ALL, temp);
#endif

            strcpy(temp, lc);

            /* Parse the language, country, encoding, and variant from the
             * locale.  Any of the elements may be missing, but they must occur
             * in the order language_country.encoding@variant, and must be
             * preceded by their delimiter (except for language).
             *
             * If the locale name (without .encoding@variant, if any) matches
             * any of the names in the locale_aliases list, map it to the
             * corresponding full locale name.  Most of the entries in the
             * locale_aliases list are locales that include a language name but
             * no country name, and this facility is used to map each language
             * to a default country if that's possible.  It's also used to map
             * the Solaris locale aliases to their proper Java locale IDs.
             */
            if ((p = strchr(temp, '.')) != NULL) {
                strcpy(encoding_variant, p); /* Copy the leading '.' */
                *p = '\0';
            } else if ((p = strchr(temp, '@')) != NULL) {
                 strcpy(encoding_variant, p); /* Copy the leading '@' */
                 *p = '\0';
            } else {
                *encoding_variant = '\0';
            }
            
            if (mapLookup(locale_aliases, temp, &p)) {
                strcpy(temp, p);
            }
            
            language = temp;
            if ((country = strchr(temp, '_')) != NULL) {
                *country++ = '\0';
            }
            
            p = encoding_variant;
            if ((encoding = strchr(p, '.')) != NULL) {
                p[encoding++ - p] = '\0';
                p = encoding;
            }
            if ((variant = strchr(p, '@')) != NULL) {
                p[variant++ - p] = '\0';
            }

            /* Normalize the language name */
            std_language = "en";
            if (language != NULL) {
                mapLookup(language_names, language, &std_language);
            }
            sprops.language = std_language;

            /* Normalize the country name */
            if (country != NULL) {
                std_country = country;
                mapLookup(country_names, country, &std_country);
                sprops.country = strdup(std_country);
            }

            /* Normalize the variant name.  Note that we only use
             * variants listed in the mapping array; others are ignored. */
            if (variant != NULL) {
                mapLookup(variant_names, variant, &std_variant);
                sprops.variant = std_variant;
            }

            /* Normalize the encoding name.  Note that we IGNORE the string
             * 'encoding' extracted from the locale name above.  Instead, we use the
             * more reliable method of calling nl_langinfo(CODESET).  This function
             * returns an empty string if no encoding is set for the given locale
             * (e.g., the C or POSIX locales); we use the default ISO 8859-1
             * converter for such locales.
	     */

	    /* OK, not so reliable - nl_langinfo() gives wrong answers on
	     * Euro locales, in particular. */
	    if (strcmp(p, "ISO8859-15") == 0)
		p = "ISO8859-15";
	    else		
                p = nl_langinfo(CODESET);

	    /* Convert the bare "646" used on Solaris to a proper IANA name */
	    if (strcmp(p, "646") == 0)
		p = "ISO646-US";

	    /* return same result nl_langinfo would return for en_UK,
	     * in order to use optimizations. */
            std_encoding = (*p != '\0') ? p : "ISO8859-1";


#ifdef __linux__
	    /* 
	     * Remap the encoding string to a different value for japanese
	     * locales on linux so that customized converters are used instead
	     * of the default converter for "EUC-JP". The customized converters
	     * omit support for the JIS0212 encoding which is not supported by
	     * the variant of "EUC-JP" encoding used on linux
	     */
	    if (strcmp(p, "EUC-JP") == 0) {
		std_encoding = "EUC-JP-LINUX";
	    }
#else
            /* For Solaris use customized vendor defined character 
             * customized EUC-JP converter
             */
            if (strcmp(p,"eucJP") == 0) {
                std_encoding = "eucJP-open"; 
            }
#endif
#ifndef __linux__
	    /* 
	     * Remap the encoding string to Big5_Solaris which augments
	     * the default converter for Solaris Big5 locales to include
	     * seven additional ideographic characters beyond those included
	     * in the Java "Big5" converter.
	     */
	    if (strcmp(p, "Big5") == 0) {
		    std_encoding = "Big5_Solaris";
	    }
#endif
	    sprops.encoding = std_encoding;
            sprops.sun_jnu_encoding = sprops.encoding;
        }
    }
    
#ifdef __linux__ 
#if __BYTE_ORDER == __LITTLE_ENDIAN
    sprops.unicode_encoding = "UnicodeLittle";
#else
    sprops.unicode_encoding = "UnicodeBig";
#endif
#else
    sprops.unicode_encoding = "UnicodeBig";
#endif

    /* user properties */
    {
        struct passwd *pwent = getpwuid(getuid());
	sprops.user_name = pwent ? strdup(pwent->pw_name) : "?";
	sprops.user_home = pwent ? strdup(pwent->pw_dir) : "?";
    }

    /* User TIMEZONE */
    {
	/*
	 * We defer setting up timezone until it's actually necessary.
	 * Refer to TimeZone.getDefault(). However, the system
	 * property is necessary to be able to be set by the command
	 * line interface -D. Here temporarily set a null string to
	 * timezone.
	 */
	tzset();	/* for compatibility */
	sprops.timezone = "";
    }

    /* Current directory */
    {
        char buf[MAXPATHLEN];
        errno = 0;
        if (getcwd(buf, sizeof(buf))  == NULL)
            JNU_ThrowByName(env, "java/lang/Error", 
             "Properties init: Could not determine current working directory.");
        else
            sprops.user_dir = strdup(buf);
    }

    sprops.file_separator = "/";
    sprops.path_separator = ":";
    sprops.line_separator = "\n";

    /* Append CDE message and resource search path to NLSPATH and
     * XFILESEARCHPATH, in order to pick localized message for 
     * FileSelectionDialog window (Bug 4173641).
     */
    setPathEnvironment("NLSPATH=/usr/dt/lib/nls/msg/%L/%N.cat");
    setPathEnvironment("XFILESEARCHPATH=/usr/dt/app-defaults/%L/Dt");

    return &sprops;
}


numtech

ASKER

After some more research, it appears that puppetd executes the /etc/init.d/ssl-server command via Ruby.
I guess there is a kind of sandbox that prevents the exported LANG variable from being passed from the "Ruby process" running the script to the forked "start-stop-daemon process".
By contrast, the exported LANG is inherited by the forked "start-stop-daemon process" from the "monit process" running the script.
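
This guess could be checked directly on the running JVM. The sketch below is Linux-only and assumes /proc is mounted and readable for the target process; `proc_locale_env` is a hypothetical helper name. It prints the locale variables a process was started with. One possible outcome (an assumption, not confirmed in this thread) is that the puppetd-started JVM shows an LC_ALL value that overrides LANG, rather than a missing LANG.

```shell
#!/bin/sh
# Linux-only sketch: print the locale variables a running process was
# started with, read from /proc/<pid>/environ (NUL-separated entries).
# proc_locale_env is a hypothetical helper name.
proc_locale_env() {
    tr '\000' '\n' < "/proc/$1/environ" | grep -E '^(LANG|LC_[A-Z_]+)='
}

# Possible use with the PID file written by the init script:
# proc_locale_env "$(cat /var/run/ssl-server.pid)"
```

Because the file reflects the environment at exec time, it shows exactly what start-stop-daemon handed to java, independent of what the init script echoed.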

I won't take any more time on this subject.

Thank you all for your help!

Regards

Renaud
numtech

ASKER

I was helped toward an understanding, but no one could fully explain the problem.