
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 543

Open discussion: Automated marking/grammar of submitted documents/text

My client currently has an online system where students submit essays created in Word, or simply typed out as text in HTML text fields.

My client currently has 600 essays to mark each month and pays a person $25/hour to do it, so he wants to automate the process ASAP.  We suggested switching to multiple-choice questions, but he also has to meet educational requirements, which means essays are still required.

He wants me to create a program that performs some kind of text/grammar matching, so that when the documents or text fields are submitted the program can 'scan' the contents for certain words, phrases, and grammar, and provide feedback or a result.

Now I know how to do simple text matching, but the grammar thing has got me thinking...  As there are a hundred different ways to write the same sentence, I thought I'd ask on EE just to see if anything like this has been attempted before.

So, does anyone have any pointers or advice on how to achieve this?  Throw me your feedback & comments (both negative and positive!) and I'll split points accordingly.  Thanks :-)
3 Solutions
Well, it is possible in principle, because grammar checkers already exist in MS Word and similar programs.  The hard part would be incorporating heuristics, fuzzy logic and artificial intelligence, which makes it a big project in my opinion.  I can't tell you how to do it as such.

If I'd paid for my son to do a history course, and his essays were marked by a computer that gave him a mark for the phrase 'authoritarian rule' in an essay about Henry VIII, via a text search, I'd feel a bit ripped off that it was being marked like that.

But maybe I am naive about how essays are marked in practice.
Rouchie (Author) commented:
>> If I'd paid for my son to do a history course....I'd feel a bit ripped off that it was being marked like that.

I completely agree; however, that decision is out of my hands - we're just being asked about the possibility of implementing it.  I've been researching this all morning and it seems to be quite a hot topic in universities, where lots of theses have been written about its possibility.  I'd like to just say 'no' and be done with it.... :-)
Essay grading probably awards part of the grade based on content and part of the grade based on spelling and grammar.  So far no computer has mastered either, so this is a huge undertaking.

For the spelling and grammar portion, you may want to take a look at using spelling and grammar engines like the one shipped with Microsoft Office as a starting point. Although it is far from 100% accurate, with nearly two decades of development it is getting better.  For the content portion you may want to research how IBM's Watson parses natural language, and note that even with a supercomputer and a multi-million dollar development budget it still wasn't 100% accurate in understanding language.

Although the commercial value of a successful and accurate human-language content and grammar module would be enormous, the investment required is unlikely to be justified by your client's current business case.

If one is grading short responses looking only for key words, such as "Napoleon", "1812", "Russia", the task becomes easier, as a computer may be able to compare the current examinee's response with a database of previously graded responses.
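As a rough illustration of that keyword idea, here is a minimal sketch in Python. It assumes a hand-written keyword list and a tiny in-memory list of previously graded responses (both made up for the example), and it uses Jaccard word overlap as one possible way to find the most similar graded response. It does not understand the text in any way, so a response that merely lists the keywords would score just as well.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of word/number tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_score(response, keywords):
    """Return (fraction of keywords found, list of keywords found)."""
    tokens = tokenize(response)
    hits = [k for k in keywords if k.lower() in tokens]
    return len(hits) / len(keywords), hits

def nearest_graded(response, graded):
    """Return the (text, grade) pair with the highest Jaccard word overlap."""
    tokens = tokenize(response)

    def jaccard(other_text):
        other = tokenize(other_text)
        union = tokens | other
        return len(tokens & other) / len(union) if union else 0.0

    return max(graded, key=lambda pair: jaccard(pair[0]))

# Hypothetical data, for illustration only.
keywords = ["Napoleon", "1812", "Russia"]
graded = [
    ("Napoleon invaded Russia in 1812 and retreated in winter.", 9),
    ("The French army lost a war somewhere in the east.", 3),
]

answer = "In 1812 Napoleon marched his army into Russia."
score, hits = keyword_score(answer, keywords)
closest_text, closest_grade = nearest_graded(answer, graded)
print(f"keyword coverage: {score:.2f}, matched: {hits}")
print(f"closest graded response had grade {closest_grade}")
```

In practice the grade of the closest stored response would only be a suggestion for a human marker to confirm, not a final mark.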

Perhaps your client would be better served by distributing the exam grading to a lower-cost labor pool.  For example, allow remote graders to access completed exams from your client's website and pay them per graded piece.  Have the same paper graded by multiple graders to help detect fraud or incompetence on a grader's part, keep grader merit scores, and adjust the number of times a paper is graded according to how competent the grader is.  Your client may end up with a better result at no more expense.
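The multi-grader bookkeeping could be sketched along these lines. Everything here is an assumption made for illustration: the median as the consensus mark, a tolerance of 10 points, merit scores between 0 and 1 that move in steps of 0.1, and the rule that a trusted grader's paper is graded once while everyone else's is graded three times.

```python
from statistics import median

def consensus(marks):
    """Median of the marks several graders gave to the same paper."""
    return median(marks.values())

def flag_outliers(marks, tolerance=10):
    """Graders whose mark is more than `tolerance` points from the consensus."""
    mid = consensus(marks)
    return sorted(g for g, m in marks.items() if abs(m - mid) > tolerance)

def update_merit(merit, marks, tolerance=10, step=0.1):
    """Raise merit for graders near the consensus, lower it for outliers.

    Unknown graders start at 0.5; scores are clamped to the range [0, 1].
    """
    outliers = set(flag_outliers(marks, tolerance))
    updated = dict(merit)
    for grader in marks:
        delta = -step if grader in outliers else step
        updated[grader] = min(1.0, max(0.0, updated.get(grader, 0.5) + delta))
    return updated

def gradings_needed(grader_merit, trusted=0.8):
    """One grading if the grader is trusted, otherwise three for cross-checking."""
    return 1 if grader_merit >= trusted else 3

# Hypothetical marks for one paper from three remote graders.
marks = {"alice": 72, "bob": 75, "carol": 40}
merit = {"alice": 0.5, "bob": 0.5, "carol": 0.5}

print("consensus mark:", consensus(marks))
print("flagged graders:", flag_outliers(marks))
print("updated merit:", update_merit(merit, marks))
```

The median is used rather than the mean so that a single careless or fraudulent mark does not drag the consensus towards itself.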
I did some research on speech recognition for my university degree, which might be somewhat similar to what you need. It's not an easy topic because, as you said, there are several ways to say the same thing. What's usually done in speech/text applications like this is to use statistical methods (mainly Markov chains) to check grammar. Here's a link to a Google search on the matter:
Another topic that might be important is AI, if you want the system to learn from the input it receives, as language may change slightly over time.
There are a few more things I could mention, but the above is the main thing to start looking at, at least from what I know. I'm not sure what qualifications you have, but I'm sure it's not an easy task, so I wish you the best. Regards!
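To give a feel for the Markov chain idea, here is a minimal sketch of a bigram (first-order Markov) model in Python. It counts word-to-word transitions in a small training corpus and scores a new sentence by its average log-probability per transition, with add-one smoothing so that unseen transitions are penalised rather than ruled out. The three-sentence corpus is invented for the example, and a real system would need a far larger corpus and proper handling of unknown words; treat this as an illustration of the principle, not a grammar checker.

```python
import math
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count word-to-word transitions, with <s> and </s> as boundary markers."""
    counts = defaultdict(Counter)
    vocab = {"</s>"}
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(words[1:])
        for prev, curr in zip(words, words[1:]):
            counts[prev][curr] += 1
    return counts, vocab

def avg_log_prob(sentence, counts, vocab):
    """Average log-probability per transition, using add-one smoothing."""
    words = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, curr in zip(words, words[1:]):
        # Guard the lookups so scoring never adds new keys to the model.
        seen = counts[prev][curr] if prev in counts else 0
        row_total = sum(counts[prev].values()) if prev in counts else 0
        total += math.log((seen + 1) / (row_total + len(vocab)))
    return total / (len(words) - 1)

# Tiny made-up training corpus, for illustration only.
corpus = [
    "the king ruled the country",
    "the king married the queen",
    "the queen ruled the country",
]
counts, vocab = train_bigrams(corpus)

good = avg_log_prob("the queen married the king", counts, vocab)
bad = avg_log_prob("king the the married queen", counts, vocab)
print(f"plausible word order: {good:.3f}")
print(f"scrambled word order: {bad:.3f}")
```

A higher (less negative) score means the word order looks more like the training text. The model only measures how familiar the word sequences are, so an unusual but perfectly grammatical sentence can still score poorly.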
Rouchie (Author) commented:
Thank you.  Those were the responses I was looking for.  "Expensive and inaccurate" are the key points I will communicate back.  The IBM Watson thing was particularly interesting.
