Solved

Hunting the bug: sqlite3 and C++

Posted on 2013-11-19
3
385 Views
Last Modified: 2013-11-20
Hi,

I have written a simple program that should export a table from a SQLite 3 database into a CSV text file. I am observing the bug when the code is processing the specific database in the specific directory. I cannod debug it on-line there, nor I am able to find the problem by my eyes. Adding the log commands does not change the behaviour; however, it did not help me to discover the reason. Here is the full source, the description of the problem follows below:
#include <cassert>
#include <iostream>
#include <fstream>
#include <string>
#include <vector>

#include <sqlite3.h>
#include <Windows.h>

using namespace std;

void HelpExit(const std::string & pgmName)
{
    cerr << "Usage: " << pgmName << "input.db output.csv tablename\n";
    exit(1);
}


/*! \brief Conversion of the buffer content from UTF-16 to std::string in cp1250.
*/
std::string WCto1250(const void * psz, int len)
{
    // The len is correct. For zero, the psz can be or nullptr 
    // or zero terminated empty string.
    if (len == 0)
        return "";

    // The needed buffer size...
    int outLen = ::WideCharToMultiByte(1250, 0,               // cp1250
                               static_cast<LPCWCH>(psz), len, // input buffer
                               nullptr, 0,                    // output buffer
                               nullptr, nullptr);             // default char

    // Buffer allocation and initialization -- one byte extra for '\0'.
    vector<char> buf(outLen + 1, '\0');

    // The conversion...
    int n =      ::WideCharToMultiByte(1250, 0,               // cp1250
                               static_cast<LPCWCH>(psz), len, // input buffer
                               &buf[0], outLen,               // output buffer
                               nullptr, nullptr);             // default char

    // There could be some error.
    if (n == 0) {
        DWORD err = GetLastError();
        switch (err) {
        case ERROR_INSUFFICIENT_BUFFER:     cout << "ERROR_INSUFFICIENT_BUFFER"; break;
        case ERROR_INVALID_FLAGS:           cout << "ERROR_INVALID_FLAGS"; break;
        case ERROR_INVALID_PARAMETER:       cout << "ERROR_INVALID_PARAMETER"; break;
        case ERROR_NO_UNICODE_TRANSLATION:  cout << "ERROR_NO_UNICODE_TRANSLATION"; break;
        default:                            cout << "err = " << err << endl;
        }
        return "error";
    }

    // returned as std::string.
    return &buf[0];
}


std::string doubleQuotedEscaped(const std::string & s)
{
    if (s.length() == 0)
        return "\"\"";    // just ""

    // If there is no double quote inside, just add and return.
    string::size_type pos = s.find('"', 0);
    if (pos == string::npos)
        return '"' + s + '"';

    // Replace double quotes by two single quote.
    string result {};        // init - empty string
    for (const char c: s) {
        if (c == '"')
            result += "''";  // the replacement
        else
            result += c;
    }
    return '"' + result + '"';
}


/*! \brief Exports one table from SQLite3 database to the CSV file.

  The CSV file uses double quoting everywhere. The encoding
  is converted to windows-1250. The decimal dot is converted to
  decimal comma.
*/
void dbExport(const std::string & fname_in, 
              const std::string & fname_out,
              const std::string & tablename)
{
    sqlite3 * db;   // database connection
    int rc = sqlite3_open(fname_in.c_str(), &db);
    if (rc) {
        cerr << "Cannot open the database: " << sqlite3_errmsg(db) << "\n";
        sqlite3_close(db);
        return;
    }

    // Build the SQL query and prepare the statement object.
    string sqlSelect = "SELECT * FROM " + tablename;
    sqlite3_stmt * statement {nullptr};

    rc = sqlite3_prepare_v2(db,
                            sqlSelect.c_str(), sqlSelect.size() + 1,
                            &statement, nullptr);
    if (rc) {
        if (rc != SQLITE_EMPTY) {
            cerr << "sqlite3_prepare_v2 error:\n" << sqlite3_errmsg(db) << endl;
        }
        sqlite3_close(db);
        return;
    }

    // The statement object is ready to be used. Some columns must be processed.
    assert(statement != nullptr);
    int column_count = sqlite3_column_count(statement);

    // Flags for what columns must be processed or not.
    // Only the REAL and NUMERIC must be treated the special way.
    vector<bool> isdec(column_count, false);

    for (int i = 0; i < column_count; ++i) {
        const char * p = sqlite3_column_decltype(statement, i);
        string t {p == nullptr ? "" : p};   // nullptr for NUMERIC
        if (t == "REAL" || t == "")         // "" for NUMERIC
            isdec[i] = true;
    }

    // Open the output file.
    ofstream f(fname_out);

    int cnt = 0;  // only for indication

    // Looping through the rows of the table.
    while (true) {
        // The first or the next table row.
        int status = sqlite3_step(statement);

        if (status == SQLITE_ROW) {
            // It is the next row, export the values of the columns
            for (int col = 0; col < column_count; ++col)  {
                
                // Get the value as UTF-16 text, the len is in bytes.
                const void * text = sqlite3_column_text16(statement, col);
                int len = sqlite3_column_bytes16(statement, col);

                // Convert the text to the windows-1250 std::string.
                // If flag is set for the column, replace the decimal dot
                // by decimal comma.
                string val(WCto1250(text, len));
                if (isdec[col]) {
                    string::size_type pos = val.find('.');
                    if (pos != string::npos)
                        val = val.replace(pos, 1, ",");
                }

                if  (col > 0) f << ',';         // separator
                f << doubleQuotedEscaped(val);  // the text with replacements
            }
            f << '\n';
        }
        else if (status == SQLITE_DONE)
        {
            break;
        }
        else
        {
            // Some error...
            cout << "error\n";
            break;
        }

        // Visual indication of the progress.
        if (++cnt % 10000 == 0)
            cout << "." << flush;
    }

    // Finalizing the work with database, closing the exported file.
    sqlite3_finalize(statement);
    sqlite3_close(db);
    f.close();
}


int main(int argc, const char* argv[])
{
    // Store the program name (for error messages).
    string pgmName {"sqlite2txt"};

    if (argc != 4)
        HelpExit(pgmName);

    // Simplified input of arguments. Checking removed for simplicity.
    string fname_in {argv[1]};
    string fname_out {argv[2]};
    string tablename {argv[3]};

    // Do the conversion.
    dbExport(fname_in, fname_out, tablename);
    return 0;
}

Open in new window

The problem manifests in the dbExport function (starts at line 89). When the auxiliary logfile messages were added, they said that everything should be OK until somewhere in the middle of the loop that starts at the line 137. In the specific case, the export terminated quietly in the middle of the element of one row -- part of the element content was written to the output file (it could be that the rest of the file buffer was not flushed). Definitely, it worked when the database file was located in another directory (i.e. different path passed as the argument). This way I guess it is a kind of error with bad memory allocation, but I cannot find it by my eyes.

I had the log messages in both of the else branches of the if -- the messages were flushed to the log file so that I should not miss anything that was logged. Many rows of the table were exported. This way I know it works at least somehow. This way I also know that the column_count variable contained a positive value, and the for-loop did execute its body. It seems that one of the lines of the for-loop body (started at 146) and the called functions contain the bug.

I am ready to add the requested information.

Thanks for your help and experience,
   Petr

P.S. Fortunately, I am almost hairless. This way, the bug will not be the reason to loose my hair ;)
sqlite2txt.zip
0
Comment
Question by:pepr
  • 2
3 Comments
 
LVL 28

Accepted Solution

by:
pepr earned 0 total points
ID: 39662300
The bug was found. The len assigned at the line 147 means the number of bytes in the buffer with text encoded UTF-16. However, the ::WideCharToMultiByte function calls at the lines 29 and 38 expect the number of characters. This way also the function WCto1250 expects the number of characters. On the contrary, it is passed the number of bytes at the line 152. This way, it tried to decode twice as much "characters" from the buffer and failed somehow. I did not search for the exact details of the failure.
0
 
LVL 28

Author Comment

by:pepr
ID: 39665111
Thanks for watching ;)
0

Featured Post

Highfive Gives IT Their Time Back

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

Suggested Solutions

Ever notice how you can't use a new drive in Windows without having Windows assigning a Disk Signature?  Ever have a signature collision problem (especially with Virtual Machines?)  This article is intended to help you understand what's going on and…
Citrix XenApp, Internet Explorer 11 set to Enterprise Mode and using central hosted sites.xml file.
The viewer will learn how to pass data into a function in C++. This is one step further in using functions. Instead of only printing text onto the console, the function will be able to perform calculations with argumentents given by the user.
The viewer will learn how to clear a vector as well as how to detect empty vectors in C++.

757 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

18 Experts available now in Live!

Get 1:1 Help Now