hey,
i have a problem that i've been trying to solve for a long time.
i work for a company that develops community websites. the thing is that over the last few months our forum has been flooded with turkish posts. we want to keep it in english so that everybody can enjoy hanging around in the community.
is it possible to analyze the posts before they're inserted into the database? or, if that's not possible, to analyze them in a batch job at night?
it's a website/database with a VERY high load, so the analysis of the texts should have as little impact as possible.
i want a solution that recognises when a text is turkish and then rejects the post.
i already thought about a database of turkish words, but i think matching every post against that database would take too long. the other problem is that i don't know where to get such a database :-p
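to make the word-list idea a bit more concrete, here is a rough python sketch of what such a check could look like. everything in it is an assumption for illustration: the function name looks_turkish, the tiny stopword list, and the two thresholds are made up and would need tuning on real posts. it keeps the words in an in-memory set instead of a database table, so each word lookup is a constant-time hash check and no extra query is needed per post. note that ç, ö and ü also occur in other languages (german, french), so the character test alone can give false positives.

```python
import re

# hypothetical, hand-picked list of very common turkish words (not a real corpus)
TURKISH_STOPWORDS = frozenset({
    "ve", "bir", "bu", "için", "değil", "ama", "çok",
    "ben", "sen", "var", "yok", "gibi", "daha",
})

# letters that are typical for turkish and rare in english text
TURKISH_CHARS = frozenset("çğıöşüÇĞİÖŞÜ")

# runs of unicode letters (no digits, no underscore)
WORD_RE = re.compile(r"[^\W\d_]+")


def looks_turkish(text, word_ratio=0.2, char_ratio=0.03):
    """Return True if the text is probably turkish.

    word_ratio and char_ratio are guessed thresholds, not measured values.
    """
    words = WORD_RE.findall(text.lower())
    letters = [c for c in text if c.isalpha()]
    if not words or not letters:
        return False
    # share of words that are known turkish stopwords (set lookup per word)
    word_hits = sum(1 for w in words if w in TURKISH_STOPWORDS)
    # share of letters that are turkish-specific characters
    char_hits = sum(1 for c in letters if c in TURKISH_CHARS)
    return (word_hits / len(words) >= word_ratio
            or char_hits / len(letters) >= char_ratio)
```

a check like this could run in the application code right before the insert, e.g. rejecting the post when looks_turkish(post_body) is true, or it could be run over the new rows in a nightly batch job if the insert path has to stay untouched.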
tnx
lee