<div>
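How are the words encoded: UTF-8 or Unicode (UTF-16)? <br />
If Unicode, accept any character whose code point is &lt;= 255. If UTF-8, accept any single byte &lt;= 0x7F, plus any two-byte sequence that starts with C2 or C3, since those decode to code points 128 through 255.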
</div>



<div>
I have the words in a text file that is set to Unicode. &nbsp;Many Thanks!
</div>
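For a file saved as "Unicode" in the Windows sense (UTF-16 little-endian), a minimal sketch of streaming the words might look like the following. The helper name `ReadWords` and the sample file name are hypothetical, and the sketch assumes the file really is UTF-16; a UTF-8 file would need `Encoding.UTF8` instead.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper: streams whitespace-delimited words from a UTF-16 text file.
IEnumerable<string> ReadWords(string path)
{
    // Encoding.Unicode is .NET's name for UTF-16 little-endian.
    // File.ReadLines reads lazily, so the whole file is never held in memory.
    foreach (string line in File.ReadLines(path, Encoding.Unicode))
    {
        // A null separator array makes Split break on any whitespace.
        foreach (string word in line.Split(default(char[]), StringSplitOptions.RemoveEmptyEntries))
        {
            yield return word;
        }
    }
}

// Usage: write a small sample file, then read it back word by word.
string samplePath = Path.Combine(Path.GetTempPath(), "words-sample.txt");
File.WriteAllText(samplePath, "Straße garçon\nhello", Encoding.Unicode);
foreach (string item in ReadWords(samplePath))
{
    Console.WriteLine(item);
}
```

Once the text is in a .NET `string` it is UTF-16 in memory regardless of the file's encoding, so any regular expression applied afterwards works on characters rather than on raw bytes.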



<div>
Since this is .NET, you should be able to list out all of the characters you care about using a character class:<br />
<br />
e.g.<br />
<br />

<pre><code id="code-20-40716119-1">if (Regex.IsMatch(input, &quot;[^a-zA-ZäöüÄÖÜß(french chars here)]&quot;))
{
    // Found offending character
}</code></pre>
<br />
I'm afraid I don't speak or read French, so I don't know which characters it uses. If you know them, you should be able to insert them at the indicated place in the pattern above, without the parentheses.
</div>
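To make the character-class idea concrete, here is a sketch with the French placeholder filled in. The French letter list is an assumption (the accented vowels, ç, and the æ/œ ligatures in both cases), and the helper name `HasOffendingChar` is hypothetical. Digits, punctuation and other keyboard symbols are deliberately left out and would need to be added to the class if they should count as allowed.

```csharp
using System;
using System.Text.RegularExpressions;

// Assumed letter sets; adjust them to match your own data.
const string German = "äöüÄÖÜß";
const string French = "àâæçéèêëîïôœùûÿÀÂÆÇÉÈÊËÎÏÔŒÙÛŸ";

// Negated character class: matches any single character that is NOT allowed.
// Built once and reused, which matters when checking millions of words.
var offending = new Regex("[^a-zA-Z" + German + French + "]", RegexOptions.Compiled);

// Hypothetical helper: true when the word contains a character outside the set.
bool HasOffendingChar(string word) => offending.IsMatch(word);

Console.WriteLine(HasOffendingChar("Straße"));   // False
Console.WriteLine(HasOffendingChar("garçon"));   // False
Console.WriteLine(HasOffendingChar("Москва"));   // True
```

One caveat: if the input stores accents as separate combining marks, a word like "garçon" would be flagged. Calling `word.Normalize(NormalizationForm.FormC)` (from `System.Text`) before the check is one possible way to handle that.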



<div>
<div class="content wysiwyg-content">
I am trying to build a regular expression that tells me whether a piece of text contains any characters outside the English, French and German character sets. &nbsp;More specifically, the allowed characters are those you can type on a standard English, German or French keyboard.<br />
<br />
I have to go through millions of “words”, meaning groups of characters delimited by whitespace.<br />
<br />
Any suggestions would be greatly appreciated.
</div>
</div>



How to create a regular expression in C# that finds characters outside the English, French and German character sets.
