Improve company productivity with a Business Account.Sign Up

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 365
  • Last Modified:

Fetching data from a webpage and inserting to excel with Java

Hello,

I need to fetch data from the following url and insert it into excel.

http://apps.who.int/classifications/icd10/browse/2010/en

Manual entry of data is tedious by copy pasting. Was wondering, if I could accomplish the same using a simple java program.

Please find attached herewith a sample excel file. Would appreciate if someone could guide me in the correct direction as to how to proceed with this. The data need to be taken in the manner as shown in the excel.

Any help which could help me accomplish this perfectly the way in which I have demonstrated in excel, would be much appreciated with points surely.

Any queries, please revert.

Best Regards,
Anees
sample-data.xlsx
0
Sreejith22
Asked:
Sreejith22
  • 5
  • 3
1 Solution
 
Dave BaldwinFixer of ProblemsCommented:
The ICD-10 code set contains from 14,400 to 76,000 codes with descriptive text.  http://en.wikipedia.org/wiki/ICD-10   Putting them all in one spreadsheet is going to make the spreadsheet slow to load and cumbersome to use.  In addition, your sample spreadsheet did not include the important descriptive text that tells what the code refers to.

While I'm sure that there is probably a way to capture all that info, you might want to think about how you want to use it.  I would just download that entire web site for local use.  Or maybe even just use it as it is on the web.  It seems to have a pretty good search function that would be better than what you would find in Excel.
0
 
Sreejith22Author Commented:
I followed the link CEHJ provided and it brings me one step closer to what I want.

@DaveBaldwin - For some reason, I need to get this data inside excel badly. With just the three columns I showed and nothing extra I need.

I have attached here the xml which I downloaded from the link which CEHJ gave.

<Class code="A00-A09" kind="block">
		<SuperClass code="I"/>
		<SubClass code="A00"/>
		<SubClass code="A01"/>
		<SubClass code="A02"/>
		<SubClass code="A03"/>
		<SubClass code="A04"/>
		<SubClass code="A05"/>
		<SubClass code="A06"/>
		<SubClass code="A07"/>
		<SubClass code="A08"/>
		<SubClass code="A09"/>
		<Rubric id="D0000002" kind="preferred">
			<Label xml:lang="en" xml:space="default">Intestinal infectious diseases</Label>
		</Rubric>
	</Class>
	<Class code="A00" kind="category">
		<Meta name="MortBCode" value="001"/>
		<Meta name="MortL4Code" value="4-002"/>
		<Meta name="MortL3Code" value="3-003"/>
		<Meta name="MortL2Code" value="2-001"/>
		<Meta name="MortL1Code" value="1-002"/>
		<SuperClass code="A00-A09"/>
		<SubClass code="A00.0"/>
		<SubClass code="A00.1"/>
		<SubClass code="A00.9"/>
		<Rubric id="D0000003" kind="preferred">
			<Label xml:lang="en" xml:space="default">Cholera</Label>
		</Rubric>
	</Class>
	<Class code="A00.0" kind="category">
		<Meta name="MortBCode" value="001"/>
		<Meta name="MortL4Code" value="4-002"/>
		<Meta name="MortL3Code" value="3-003"/>
		<Meta name="MortL2Code" value="2-001"/>
		<Meta name="MortL1Code" value="1-002"/>
		<SuperClass code="A00"/>
		<Rubric id="D0000004" kind="preferred">
			<Label xml:lang="en" xml:space="default">Cholera due to Vibrio cholerae 01, biovar cholerae</Label>
		</Rubric>
		<Rubric id="D0000934" kind="inclusion">
			<Label xml:lang="en" xml:space="default">Classical cholera</Label>
		</Rubric>
	</Class>
	<Class code="A00.1" kind="category">
		<Meta name="MortBCode" value="001"/>
		<Meta name="MortL4Code" value="4-002"/>
		<Meta name="MortL3Code" value="3-003"/>
		<Meta name="MortL2Code" value="2-001"/>
		<Meta name="MortL1Code" value="1-002"/>
		<SuperClass code="A00"/>
		<Rubric id="D0000005" kind="preferred">
			<Label xml:lang="en" xml:space="default">Cholera due to Vibrio cholerae 01, biovar eltor</Label>
		</Rubric>
		<Rubric id="D0000935" kind="inclusion">
			<Label xml:lang="en" xml:space="default">Cholera eltor</Label>
		</Rubric>
	</Class>
	<Class code="A00.9" kind="category">
		<Meta name="MortBCode" value="001"/>
		<Meta name="MortL4Code" value="4-002"/>
		<Meta name="MortL3Code" value="3-003"/>
		<Meta name="MortL2Code" value="2-001"/>
		<Meta name="MortL1Code" value="1-002"/>
		<SuperClass code="A00"/>
		<Rubric id="D0000006" kind="preferred">
			<Label xml:lang="en" xml:space="default">Cholera, unspecified</Label>
		</Rubric>
	</Class>

Open in new window


The above given code is an excerpt from the attached xml. From the attached xml, can someone let me know how I can fetch the the three data as shown in the excel file attached in the question.

Any help is appreciated.
icd102010en.xml
0
Get expert help—faster!

Need expert help—fast? Use the Help Bell for personalized assistance getting answers to your important questions.

 
CEHJCommented:
You can't really. A spreadsheet is not for presenting hierarchical data - it's for presenting flat data. If you want to force it into a spreadsheet, then you need to force the hierarchy flat - and then preferably use csv
0
 
Sreejith22Author Commented:
I do not need to have a hierarchical representation of data in the excel file. Just a code and the corresponding full name, classification and sub-classification.
0
 
CEHJCommented:
You'll need to parse the ClaML  xml. And you're going to need the DTD:

http://clam.eggbird.eu/download/EN14463-2007.dtd
0
 
Sreejith22Author Commented:
can someone please guide me on this with some code. I have no clues as to how to proceed.
0
 
CEHJCommented:
I can't now as i haven't the time. I suggest you look at general xml parsing tutorials. If you want code producing for you on a professional basis, you're welcome to contact me via my profile
0
 
CEHJCommented:
:)
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Get your problem seen by more experts

Be seen. Boost your question’s priority for more expert views and faster solutions

  • 5
  • 3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now