• Status: Solved

Character frequency paragraph

Do you enjoy boring and tedious jobs for which the only pay is a few measly expert points and my eternal gratitude? If so, then:

Please write as short a paragraph as you can, which:

  a) contains at least 2 of every letter in the alphabet, and
  b) is coherent (so gibberish doesn't count).

I've done some myself, but it's taking me ages now and I can't think of anything to write about except physics or programming (how boring I am) - I'd like a mix of topics (films, poets, history, languages, anything that comes to mind really!) - also, I'm running low on words containing x's, z's, q's, etc.

Points will be split equally.


If you're interested in why I want this: I wish to analyse the typing style of different people, and require a text corpus (with plenty of the less frequent letters) for each subject to type up into my software (which will then analyse them).

(For anyone with Java installed, you can use the attached code to count the number of letters used.)

Much appreciated!!
//This Java program will count the number of letters used in your text as you type it...
//Excuse the poor programming
import javax.swing.*;
import java.awt.*;
import java.util.Arrays;
import java.util.Locale;
public class Corpus extends JFrame
{
	private final JTextArea area=new JTextArea(30,30);	// where the paragraph is typed
	private final JTextArea count=new JTextArea(30,10);	// per-letter tallies
	private final int[]adder=new int[26];	// adder[0] counts A ... adder[25] counts Z
	
	public Corpus()
	{
		super();
		setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
		setLayout(new GridLayout(1,2));
		area.setText("Type here...");
		area.setLineWrap(true);
		count.setEditable(false);
		add(new JScrollPane(area));
		add(new JScrollPane(count));
		pack();
		setLocationRelativeTo(null);
		setVisible(true);
		// A Swing timer fires on the event dispatch thread, so update() can
		// read and write the text areas safely (a plain Thread cannot).
		new javax.swing.Timer(100, e -> update()).start();
	}
	
	public void update()
	{
		Arrays.fill(adder,0);
		String T=area.getText().toUpperCase(Locale.ROOT);
		for(int i=0; i<T.length(); i++)
		{
			char C=T.charAt(i);
			if(C>='A' && C<='Z')
			{
				adder[C-'A']++;
			}
		}
		
		// One line per letter; "#" marks letters that already appear at least twice
		StringBuilder S=new StringBuilder();
		for(int i=0; i<adder.length; i++)
		{
			S.append((char)('A'+i)).append('\t').append(adder[i]);
			if(adder[i]>=2)S.append(" #");
			S.append('\n');
		}
		count.setText(S.toString());
	}
	
	public static void main(String[]a)
	{
		// Build the window on the event dispatch thread
		SwingUtilities.invokeLater(() -> new Corpus());
	}
}
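
For anyone who would rather check a finished paragraph without opening the window, the same count can be run from the command line. This is only a minimal sketch: the class name LetterCheck and the method missingLetters are made up for illustration and are not part of the attached program. It reports which letters still appear fewer than the required number of times.

```java
import java.util.Locale;

// Hypothetical helper (not part of the attached Corpus program).
final class LetterCheck {
    // Returns the letters A-Z that occur fewer than `min` times in `text`,
    // in alphabetical order. An empty result means the text qualifies.
    static String missingLetters(String text, int min) {
        int[] counts = new int[26];
        for (char c : text.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c >= 'A' && c <= 'Z') {
                counts[c - 'A']++;
            }
        }
        StringBuilder missing = new StringBuilder();
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] < min) {
                missing.append((char) ('A' + i));
            }
        }
        return missing.toString();
    }

    public static void main(String[] args) {
        // The paragraph is passed as command-line arguments.
        String text = String.join(" ", args);
        String missing = missingLetters(text, 2);
        System.out.println(missing.isEmpty()
                ? "OK: at least 2 of every letter"
                : "Still short of: " + missing);
    }
}
```

Passing the paragraph in quotes as a single argument works as well, since the arguments are simply joined with spaces before counting.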

Asked by: InteractiveMind
4 Solutions
 
ozo Commented:
review of The Yards: Mark Wahlberg, Joaquin Phoenix, Charlize Theron

I sang very well; but he just looked up into my face with a very quizzical expression
 
fhillyer1 Commented:
The reindeer that went into the woods, basically to find comfort and peace, quietly stayed within the boundaries of the farm, until one day it decided to escape the queue of its life. Going into the deep, exhilarated and excited on its new adventure, it was joined by a kangaroo and a koala, and when they reached their quest and the journey was done, they shared a view on which animals can work together to achieve a common purpose. In their adventure they avoided the zoo and some zoologists in the area.



that story will cover your needs
 
Thibault St john Cholmondeley-ffeatherstonehaugh the 2nd Commented:
>I wish to analyse the typing style of different people
It looks like you are interested in the Index of Coincidence:
http://en.wikipedia.org/wiki/Index_of_coincidence
There is a scary looking formula in there, but I built an app a while ago that measured this from a section of writing - much more simply I think. I can't access it at the moment as I have a rather dead looking hard drive, but the essence was to multiply the number of occurrences of a letter by the percentage it appeared in that section of writing, as long as the piece was a decent size.
It's quite useful in cryptology to identify a language even though the rest of the information is blurred. It was also amazingly accurate in identifying an author and even a time period (first half - second half of a century), whether it was a technical journal or a children's story, whether British or American etc - (Differences in .
I'm looking for better references to try to remember how it worked, meanwhile this lengthy banter might go some way to be some text that you can test with. I've tried to include a few rare letters, but I can't guarantee that I have two of each - you should be able to test that with whatever you are doing I hope.
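
As a rough illustration of the measure, here is a minimal sketch of the usual unnormalised index of coincidence: the chance that two letters picked at random from the text (without replacement) are the same letter. It is not a reconstruction of the app described above; the class and method names are made up for the example, and only the letters A-Z are counted.

```java
// Hypothetical example class; names are illustrative only.
final class IndexOfCoincidence {
    // Computes sum(n_i * (n_i - 1)) / (N * (N - 1)), where n_i is the count
    // of letter i and N is the total number of letters A-Z in the text.
    static double indexOfCoincidence(String text) {
        long[] counts = new long[26];
        long total = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = Character.toUpperCase(text.charAt(i));
            if (c >= 'A' && c <= 'Z') {
                counts[c - 'A']++;
                total++;
            }
        }
        if (total < 2) {
            return 0.0; // no pair of letters can be drawn
        }
        long matchingPairs = 0;
        for (long n : counts) {
            matchingPairs += n * (n - 1);
        }
        return (double) matchingPairs / (total * (total - 1));
    }

    public static void main(String[] args) {
        String text = String.join(" ", args);
        System.out.println(indexOfCoincidence(text));
    }
}
```

Multiplying the result by 26 gives the normalised form, in which uniformly random letters score about 1. Short samples give noisy values, which fits the remark above that the piece needs to be a decent size.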
 
 
Thibault St john Cholmondeley-ffeatherstonehaugh the 2nd Commented:
>(Differences in .
oops! sorry, editing and not reading properly.
 
ozo Commented:
see
Francis, Darryl. "Double Pangram Lists". Word Ways, 1978, page 245.
 
InteractiveMind (Author) Commented:
I think I now have enough to close this thread actually.

Thank you fhillyer1, that was fast!

And thank you, ozo, I'd never heard of pangrams!
http://en.wikipedia.org/wiki/List_of_pangrams  =)

RobinD: I had programmed something similar before to identify the language used (I used the Chi-square test to compare the observed letter frequency to that of a large text corpus). With this project however, I'm analysing typing style so as to identify someone based on how they type (rather than what they type).

Thanks everyone!
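
For readers curious about the chi-square comparison mentioned in the reply above, here is a minimal sketch of the statistic itself. It is not the author's program: the class and method names are made up, and the reference proportions must be supplied by the caller (for example, measured from a large corpus). The two-category numbers in main are toy values chosen only to show the call.

```java
// Hypothetical example class; names and sample values are illustrative only.
final class LetterChiSquare {
    // Pearson's chi-square statistic: sum((O - E)^2 / E), where O is the
    // observed count of a category and E = total * expected proportion.
    // A smaller value means the observed counts are closer to the reference.
    static double chiSquare(int[] observed, double[] expectedProportions) {
        if (observed.length != expectedProportions.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        long total = 0;
        for (int n : observed) {
            total += n;
        }
        double statistic = 0.0;
        for (int i = 0; i < observed.length; i++) {
            double expected = total * expectedProportions[i];
            // Simplification: categories with zero expected count are skipped.
            if (expected > 0) {
                double diff = observed[i] - expected;
                statistic += diff * diff / expected;
            }
        }
        return statistic;
    }

    public static void main(String[] args) {
        int[] observed = {15, 5};        // toy counts, not real letter data
        double[] reference = {0.5, 0.5}; // toy reference proportions
        System.out.println(chiSquare(observed, reference));
    }
}
```

To identify a language this way, one would compute the statistic against each candidate language's letter proportions and pick the smallest value.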